Article Details

Journal of Computers (indexed in EI, MEDLINE, Scopus)

Title: DP-Kmeans and Beyond: Optimal Clustering with a New Clustering Validity Index
Volume/Issue: 33:5
Authors: Zhu-Juan Ma, Zi-Han Wang, Xiang-Hua Chen, Feng Liu
Pages: 001-017
Keywords: K-means, clustering validity, optimal clustering number, data mining
Publication date: 2022-10
DOI: 10.53106/199115992022103305001

Abstract

The K-means clustering algorithm is widely used in many areas for its high efficiency. However, the performance of the traditional K-means algorithm is very sensitive to the selection of the initial clustering centers. Furthermore, apart from convexly distributed datasets, the traditional K-means algorithm still cannot optimally process many non-convexly distributed datasets or datasets with outliers. To this end, this paper proposes DP-Kmeans, an improved K-means algorithm based on the Density Parameter and center replacement, which can be more accurate than the traditional K-means because it drops the random selection of the initial clustering centers and continuously updates the new centers. Because clustering is unsupervised, neither the number of clusters nor the quality of the data partitions generated by a clustering algorithm can be guaranteed. To evaluate the results of the DP-Kmeans algorithm, this paper proposes SII, a new clustering validity index based on the Sum of the Inner-cluster compactness and the Inter-cluster separateness. Based on the DP-Kmeans algorithm and the SII index, a new method is proposed to determine the optimal number of clusters for different datasets. Experimental results on ten datasets with different distributions demonstrate that the proposed clustering method is more effective than the existing ones.
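
The general workflow the abstract describes (cluster with a non-random initialization, score each partition with a validity index, keep the best-scoring number of clusters) can be illustrated with a minimal sketch. The abstract does not give the formulas for the density parameter or for SII, so everything below is an assumption made for illustration: farthest-point seeding stands in for the density-based initialization, and a simple separation-to-compactness ratio stands in for SII. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def init_centers(points, k):
    # Deterministic farthest-point seeding.
    # Assumption: a stand-in for non-random initialization, NOT the paper's density-parameter rule.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j])) for p in points]

def kmeans(points, k, iters=100):
    """Plain Lloyd iterations starting from the deterministic seeds."""
    centers = init_centers(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the old center for an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers

def validity(points, labels, centers):
    # Illustrative index: smallest center-to-center distance divided by the
    # mean point-to-own-center distance (higher is better).
    # Assumption: a stand-in for a validity index, NOT the paper's SII.
    compact = sum(dist(p, centers[lab]) for p, lab in zip(points, labels)) / len(points)
    separation = min(dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return separation / max(compact, 1e-12)

def choose_k(points, candidates):
    """Return the candidate k whose partition scores highest on the index."""
    return max(candidates, key=lambda k: validity(points, *kmeans(points, k)))

# Toy data: three well-separated square blobs (hypothetical, for illustration only).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]
best_k = choose_k(points, range(2, 6))
print(best_k)
```

On this toy data the ratio peaks at the number of blobs, because too few clusters inflate the compactness term while too many shrink the separation term. Whether SII behaves the same way depends on its actual definition in the paper.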
