Title | Research on Unequal Time Series Clustering for Hot Topics |
---|---|
Volume/Issue | 29:4 |
Authors | Fu-Lian Yin, Bei-Bei Zhang, Jing-Chun Liao, Jian-Bo Liu |
Pages | 122-134 |
Keywords | data acquisition, DTW distance, hierarchical clustering, time series clustering, EI, MEDLINE, Scopus |
Publication date | August 2018 |
DOI | 10.3966/199115992018082904010 |
In traditional research on time series clustering for hot topics, time series are sampled at a coarse granularity of one day, which results in poor timeliness. This paper proposes a fine-grained acquisition scheme for hot topic time series, which makes the time series of topics accurate to T0 hour. In traditional clustering algorithms, the distance calculation discards part of the information in the series in order to handle time series of unequal length. This paper therefore introduces the S-Euc distance (Segmented Euclidean distance) and the S-DTW distance (Segmented Dynamic Time Warping distance), which segment the time series and calculate a total distance over the segments. Compared with the traditional DTW and Euclidean distances, the two methods significantly improve computational speed, silhouette coefficient, and cluster compactness. When the number of clusters is large, the silhouette coefficient of the S-Euc based algorithm is about 8% higher than that of the S-DTW based algorithm. When the number of clusters is small, the silhouette coefficient of S-DTW is about 65% higher than that of S-Euc, at the cost of higher computational complexity.
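The abstract does not spell out how the segmentation is performed, so the following is only an illustrative sketch of the general idea of a segmented distance, not the authors' algorithm. It assumes that each series is cut into the same number of roughly equal segments, that corresponding segments are compared pairwise, and that the per-segment distances are summed. The function names, the `n_segments` and `points_per_segment` parameters, and the linear resampling used in the Euclidean variant are all hypothetical choices made for this example.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW distance with an absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def resample(segment, length):
    """Linearly interpolate a segment onto `length` evenly spaced points."""
    old = np.linspace(0.0, 1.0, num=len(segment))
    new = np.linspace(0.0, 1.0, num=length)
    return np.interp(new, old, segment)

def segmented_distance(x, y, n_segments=4, kind="dtw", points_per_segment=8):
    """Sum of per-segment distances between two series of possibly unequal length.

    Assumption (not from the paper): both series are split into `n_segments`
    roughly equal pieces and the i-th piece of x is compared with the i-th
    piece of y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) < n_segments or len(y) < n_segments:
        raise ValueError("each series needs at least n_segments points")
    total = 0.0
    for sx, sy in zip(np.array_split(x, n_segments), np.array_split(y, n_segments)):
        if kind == "dtw":
            # DTW tolerates segments of different lengths directly.
            total += dtw_distance(sx, sy)
        else:
            # Euclidean needs equal lengths, so resample both segments first.
            rx = resample(sx, points_per_segment)
            ry = resample(sy, points_per_segment)
            total += float(np.linalg.norm(rx - ry))
    return total

# Two hourly series of unequal length (made-up values for illustration).
short = [0, 1, 3, 6, 4, 2, 1, 0]
long_ = [0, 0, 1, 2, 4, 6, 5, 3, 2, 1, 1, 0]
s_dtw = segmented_distance(short, long_, kind="dtw")
s_euc = segmented_distance(short, long_, kind="euc")
```

Under these assumptions, the segmented DTW variant fills several small cost matrices instead of one large one, which is one plausible reason a segmented distance could be faster than full DTW. The resulting pairwise distances could then be fed to a hierarchical clustering routine.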