HyRead Journal 台灣全文資料庫

文章詳目資料

International Journal of Computational Linguistics And Chinese Language Processing THCI

自然科學/資訊/科技

篇名	A Study on Word Similarity using Context Vector Models
卷期	7:2
作者	Chen, Keh-jiann 、 You, Jia-ming
頁次	037-057
關鍵字	THCI Core
出刊日期	200208

There is a need to measure word similarity when processing natural languages, especially when using generalization, classification, or example -based approaches. Usually, measures of similarity between two words are defined according to the distance between their semantic classes in a semantic taxonomy . The taxonomy approaches are more or less semantic -based that do not consider syntactic similarit ies. However, in real applications, both semantic and syntactic similarities are required and weighted differently. Word similarity based on context vectors is
a mixture of syntactic and semantic similarit ies. In this paper, we propose using only syntactic related co-occurrences as context vectors and adopt information theoretic models to solve the problems of data
sparseness and characteristic precision. The probabilistic distribution of
co-occurrence context features is derived by parsing the contextual environment of each word , and all the context features are adjusted according to their IDF (inverse document frequency) values. The agglomerative clustering algorithm is applied to group similar words according to their similarity values. It turns out that words with similar syntactic categories and semantic classes are grouped together.

本卷期文章目次

關鍵知識WIKI

THCI Core

文章詳目資料

International Journal of Computational Linguistics And Chinese Language Processing THCI

中文摘要

英文摘要

本卷期文章目次

關鍵知識WIKI

相關文獻