[ML] K-means Clustering Algorithm

Intro

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. – Wikipedia

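K-means clustering iteratively partitions n samples into k clusters such that:

  • the distances between the centroids of different clusters are sufficiently large
  • the distances between samples within the same cluster are sufficiently small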
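
One common way to make the within-cluster criterion precise is the within-cluster sum of squares. The formula below is a sketch under assumed notation: x_i for a sample and S_j for the set of samples assigned to cluster j are not names used in the original post, while c_j is the centroid of cluster j.

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in S_j} \lVert x_i - c_j \rVert^2,
\qquad
c_j = \frac{1}{|S_j|} \sum_{x_i \in S_j} x_i
```

A smaller J means the samples sit closer to their own centroid. Both the assignment step and the centroid update of the standard algorithm can only keep J the same or lower it.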

Algo


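  • Step 1: Randomly initialize the centroids: pick k samples at random as the centroids c_1, c_2, ..., c_k. This partitions the feature space into k Voronoi cells, that is, k clusters.
  • Step 2: Assign every sample to a cluster: each sample belongs to the cluster of its nearest centroid.
  • Step 3: Once all samples have been assigned, recompute the centroid of each cluster as the mean of its samples.
  • Step 4: Repeat Step 2 and Step 3 until the maximum number of iterations is reached or the change between two consecutive iterations (for example, in the centroid positions) falls below a threshold; then stop iterating and output the result.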
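
The four steps can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the helper `euclidean`, the parameters `max_iter`, `tol`, and `seed`, the use of Euclidean distance, the centroid-shift stopping rule, and the handling of empty clusters are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(samples, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster `samples` (a list of equal-length tuples) into k groups.

    Returns (centroids, labels), where labels[i] is the cluster index
    of samples[i].
    """
    rng = random.Random(seed)

    # Step 1: pick k distinct samples at random as the initial centroids.
    centroids = [tuple(s) for s in rng.sample(samples, k)]
    labels = [0] * len(samples)

    for _ in range(max_iter):
        # Step 2: assign each sample to the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda j: euclidean(s, centroids[j]))
            for s in samples
        ]

        # Step 3: recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members)
                          for d in range(dim))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])

        # Step 4: stop once no centroid moved by more than `tol`.
        shift = max(euclidean(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break

    return centroids, labels
```

For instance, calling `kmeans(points, k=2)` on a list of 2-D tuples returns two centroids and one label per point. Because the initial centroids are random, different values of `seed` may converge to different partitions on data that is less clearly separated.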

Ref

    Original author: 聚类算法
    Original URL: https://blog.csdn.net/baishuo8/article/details/81909295