Title | A Thailand Tourism Web Analysis and Clustering Tool Using a Word Weight Calculation Algorithm |
---|---|
Volume:Issue | 30:2 |
Authors | Chakkrit Snae Namahoot, Desmond Lobo |
Pages | 116-125 |
Keywords | algorithms, metadata, text mining, web clustering, website analysis, EI, MEDLINE, Scopus |
Publication date | 201904 |
DOI | 10.3966/199115992019043002010 |
Searching for tourism material with a search engine typically returns an overload of information. Because most travel sites are not classified, the results are often presented in an uncategorized and incoherent manner. This makes searching difficult, wastes time in extracting the relevant information, and makes information gathering inconvenient even from a single source. In this study, the researchers addressed these issues by developing a Word Weight Calculation Algorithm (WWCA). The algorithm calculates word weights, which are applied not only to the analysis of Thai travel websites but also to their classification into four categories of tourist information: attractions, accommodation, restaurants, and souvenir shops. Parts of the HTML structure of 800 Thai tourist websites were extracted and used for the analysis and classification. The efficiency of the WWCA was measured using the F-measure statistic. The outcomes showed that (1) the content within the HTML body tag alone is sufficient to classify the sites and (2) the WWCA is a good indicator for classifying Thai travel websites.
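The abstract does not give the WWCA's weighting formula, so the following is only a minimal sketch of the general idea: score a page's text against weighted keyword lists for the four categories, pick the highest-scoring category, and evaluate the predictions with the F-measure. The keyword lists, the weights, and the function names (`category_scores`, `classify`, `f_measure`) are all hypothetical and are not taken from the paper. English keywords are used purely for illustration; real Thai text would need a Thai word segmenter in place of the simple regular-expression tokenizer.

```python
import re
from collections import Counter

# Hypothetical keyword weights per category (illustrative only; the actual
# WWCA weights are not described in the abstract).
CATEGORY_KEYWORDS = {
    "attractions": {"temple": 2.0, "beach": 2.0, "museum": 1.5, "park": 1.0},
    "accommodation": {"hotel": 2.0, "resort": 2.0, "room": 1.0, "booking": 1.0},
    "restaurants": {"restaurant": 2.0, "menu": 1.5, "food": 1.0, "dish": 1.0},
    "souvenir shops": {"souvenir": 2.0, "gift": 1.5, "handicraft": 1.5, "shop": 1.0},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def category_scores(text):
    """Sum weight * occurrence count of each keyword, per category."""
    counts = Counter(tokenize(text))
    return {
        category: sum(weight * counts[word] for word, weight in keywords.items())
        for category, keywords in CATEGORY_KEYWORDS.items()
    }


def classify(text):
    """Return the highest-scoring category, or None if no keyword matched."""
    scores = category_scores(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def f_measure(true_labels, predicted_labels, category):
    """F-measure (harmonic mean of precision and recall) for one category."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == category and p == category for t, p in pairs)
    fp = sum(t != category and p == category for t, p in pairs)
    fn = sum(t == category and p != category for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In this sketch, `classify` would be applied to the text extracted from each page's HTML body, and `f_measure` would be computed per category against manually assigned labels.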