Clustering Algorithms: MBSAS

March 21, 2019. Source: 聚类算法 (Clustering Algorithms)

Algorithm idea:

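Choose a measure of dissimilarity (distance) between two samples, a threshold theta, and the maximum number of clusters q. The first sample forms the first cluster. Each subsequent sample is compared with the clusters found so far: if its distance to the nearest cluster exceeds the threshold and fewer than q clusters exist, it starts a new cluster; otherwise it is assigned to the nearest cluster. MBSAS does this in two passes over the data: the first pass only creates clusters, and the second pass assigns the remaining samples. The distance to a cluster is the distance to the mean (m) of the samples in that cluster, and this mean is updated each time a sample is added: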

$$m_{C_k}^{new}=\frac{\left(n_{C_k}^{new}-1\right)m_{C_k}^{old}+x}{n_{C_k}^{new}}$$
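As a quick check of the update rule above, the following minimal sketch applies the incremental update one sample at a time and compares the result with the mean computed from all samples at once. The toy data and the variable names x, m, n and batch are invented for illustration and are not part of the original listing.

```matlab
% Hypothetical toy data: three 2-D samples stored as columns.
x = [1 3 8; 2 4 6];

m = x(:,1);                        % the cluster starts with its first sample
n = 1;                             % current number of samples in the cluster
for k = 2:size(x,2)
    n = n + 1;                     % cluster size after adding x(:,k)
    m = ((n-1)*m + x(:,k)) / n;    % incremental mean update
end

batch = mean(x, 2);                % mean recomputed from all samples at once
disp(max(abs(m - batch)));         % the two agree up to rounding error
```

The incremental form avoids storing or re-summing the members of a cluster; only the current mean and the cluster size are needed.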

function [bel,m]=MBSAS(X,threshold,q,order)
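%Input:
% :X is an l-by-N matrix; each column of X is one sample
% :threshold is the dissimilarity threshold; a sample farther than this from
%  every existing cluster may start a new cluster
% :q is the maximum number of clusters allowed
% :order is the order in which the columns of X are presented
%  (used only if it has N entries)

%Output:
% :bel(i) is the cluster label of the i-th presented sample
%  (that is, X(:,order(i)) when order is applied)
% :m holds the cluster representatives (means), one column per cluster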
%---------------------------------Ordering the data------------------------
[l,N]=size(X);
if(length(order)==N)
    X=X(:,order);   %present the samples in the requested order
end
%--------------------------------Cluster determining phase-----------------

n_clust=1;
[l,N]=size(X);
bel=zeros(1,N);
bel(1)=n_clust;
m=X(:,1);
for i=2:N
    [m1,m2]=size(m);
    %Determine the closest cluster representative
    %(sum along dimension 1 so that 1-D data, l=1, is handled correctly)
    [s1,s2]=min(sqrt(sum((m-X(:,i)*ones(1,m2)).^2,1)));
    if (s1>threshold)&&(n_clust<q)
        n_clust=n_clust+1;
        bel(i)=n_clust;
        m=[m X(:,i)];
    end
end
[m1,m2]=size(m);%m2 is the number of clusters
%----------------------------Pattern classification phase-------------------
for i=1:N
    if(bel(i)==0)
        %Sum along dimension 1 so that 1-D data, l=1, is handled correctly
        [s1,s2]=min(sqrt(sum((m-X(:,i)*ones(1,m2)).^2,1)));
        bel(i)=s2;
        m(:,s2)=((sum(bel==s2)-1)*m(:,s2)+X(:,i))/sum(bel==s2);
    end
end
end
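A minimal usage sketch, assuming the function above is saved as MBSAS.m on the MATLAB path. The toy data, the threshold and the value of q are invented for illustration and do not come from the original post.

```matlab
% Hypothetical toy data: six 2-D samples (columns) forming two tight groups,
% one near (0,0) and one near (5,5).
X = [0 0.2 5 5.1 0.1 4.9;
     0 0.1 5 5.2 0.3 5.1];

threshold = 2;        % assumed value: well above the spread inside a group
q = 3;                % allow at most three clusters
order = 1:size(X,2);  % present the samples in their stored order

[bel, m] = MBSAS(X, threshold, q, order);

disp(bel);            % for this data the labels come out as 1 1 2 2 1 2
disp(m);              % one column per cluster: the mean of its members
```

Although q is 3, only two clusters are created here, because no sample lies farther than the threshold from both existing clusters. Note that bel follows the presentation order, so with a non-trivial order the labels refer to the reordered samples.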

Drawbacks:

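The clustering result depends on the order in which the samples are presented, and it is highly sensitive to the choice of threshold.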
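Both effects can be seen on a very small example. The sketch below again assumes the function above is available as MBSAS.m; the three sample points and the threshold values are invented for illustration.

```matlab
% Hypothetical toy data: three 2-D samples (columns) lying on a line.
X = [0 1.4 3;
     0 0   0];
q = 3;                                % allow at most three clusters

% Same threshold, two different presentation orders.
belA = MBSAS(X, 2, q, [1 2 3]);       % first sample presented is (0,0)
belB = MBSAS(X, 2, q, [2 1 3]);       % first sample presented is (1.4,0)

% Same order as belA, but a smaller threshold.
belC = MBSAS(X, 1, q, [1 2 3]);

fprintf('order 1-2-3, threshold 2: %d cluster(s)\n', max(belA));
fprintf('order 2-1-3, threshold 2: %d cluster(s)\n', max(belB));
fprintf('order 1-2-3, threshold 1: %d cluster(s)\n', max(belC));
```

Starting from (0,0), the point (3,0) is farther than the threshold and opens a second cluster; starting from (1.4,0), both other points are within the threshold, so a single cluster results. Lowering the threshold to 1 splits the same data into three clusters.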

    Original author: 聚类算法 (Clustering Algorithms)
    Original URL: https://blog.csdn.net/xiepeng1128/article/details/44985117