This study proposes a hierarchical automatic clustering algorithm based on the N-link average method. The algorithm can discover clusters of arbitrary shape and effectively avoids the chaining effect, which improves clustering accuracy. Compared with methods in the related literature, it also determines the number of clusters automatically with greater precision.
The experiments use synthetic two-dimensional data sets to compare the proposed method with partitional clustering algorithms (k-means and PAM), hierarchical clustering algorithms (Single-link, Complete-link, Group average, and Centroid), and HKC, a two-phase clustering algorithm that combines k-means with hierarchical clustering using the single-linkage agglomerative method. The proposed method produces more accurate clusterings on data sets of arbitrary shape. In addition, on the CHAMELEON data sets, it estimates the number of clusters more precisely than the automatic clustering methods reported in the literature.