Article Details

International Journal of Computational Linguistics And Chinese Language Processing THCI

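Title Semantic Representation of Chinese Compound Nouns, Starting from WordNet (從詞網出發的中文複合名詞的語意表達)
Volume/Issue 8:2
Author 柯淑津
Pages 93-107
Keywords THCI Core
Publication date 2003-08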

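Chinese Abstract (English translation)

WordNet provides rich lexical-semantic information and is therefore very helpful for research in natural language processing. However, the semantic information in Princeton WordNet is presented only in English. To make WordNet's resources usable for Chinese language processing, we attempt to use existing resources such as bilingual dictionaries as a bridge, in the hope of carrying the resources of the English WordNet over to Chinese automatically. After examining the preliminary results of linking the English WordNet to a bilingual dictionary, however, we found that, for reasons including the barriers between the two languages and the tendency of target-language entries in bilingual dictionaries to be explanatory, the Chinese translations corresponding to an English synonym set (synset) are often unstructured Chinese compounds, phrases, or even long sentences rather than independent Chinese words. This conflicts with the requirement that a Chinese WordNet take the word as its basic unit, so this study processes such translations further. The paper has two main goals: first, to identify in a Chinese compound the head word that best represents its meaning, together with several feature words; second, to express these words in the form of semantic concepts. The first part is accomplished through syntactic structure analysis. For the second part, word meanings are represented by HowNet concept features. Converting Chinese words into sense concepts naturally involves ambiguity. To resolve it, we use each word's part of speech and also WordNet hypernym relations to reduce the degree of ambiguity. We experimented on nouns; the results show that sense tagging achieves an applicability rate of 93.5% and an accuracy of 93.8%.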

English Abstract

WordNet provides rich lexical-semantic information and is therefore very helpful in natural language processing research. However, every lexical meaning in Princeton WordNet is presented in English. In this work, we attempt to use a bilingual dictionary as a bridge to map the English WordNet to Chinese automatically. When we examined the preliminary results of linking the English WordNet to the bilingual dictionary, we encountered many barriers between the two languages. As a result of this mapping, the Chinese translations of an English synonym set (synset) are often unstructured Chinese compound words, phrases, or even long sentences rather than independent Chinese words. This conflicts with the aim of a Chinese WordNet, which takes the word as its basic component. This research therefore processes such translations further. The paper has two objectives. First, we identify the core word and the characteristic words of a Chinese compound. Second, we express those words by means of conceptual representations. For the first objective, we use syntactic structure analysis to locate the words. For the second, we use HowNet sememes to represent their lexical meanings. Ambiguity arises when Chinese words are mapped to their lexical meanings. To reduce it, we use each word's part of speech together with WordNet hypernyms. We experimented on nouns, and the results show that sense disambiguation achieves a 93.8% applicability rate and a 93.5% correct rate.
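
The disambiguation step described in the abstract (filter candidate senses by part of speech, then use WordNet hypernyms to narrow the choice) can be illustrated with a minimal Python sketch. This is not the paper's implementation. The `Sense` class, the `disambiguate` function, the overlap-count scoring rule, and all sememe values are assumptions made up for illustration; real HowNet entries and WordNet hypernym chains would take the place of the toy data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One candidate sense of a Chinese word (toy stand-in for a HowNet entry)."""
    pos: str              # part of speech, e.g. "N" or "V"
    sememes: frozenset    # concept features; the values used below are invented


def disambiguate(candidates, target_pos, hypernym_sememes):
    """Return the best candidate sense, or None when no candidate fits.

    Step 1: keep only senses whose part of speech matches the synset's.
    Step 2: prefer the sense sharing the most sememes with those gathered
            from the synset's hypernyms (an assumed scoring rule).
    """
    same_pos = [s for s in candidates if s.pos == target_pos]
    if not same_pos:
        return None
    return max(same_pos, key=lambda s: len(s.sememes & hypernym_sememes))


# Hypothetical data: three invented senses for one ambiguous word.
container = Sense("N", frozenset({"tool", "put"}))
swelling = Sense("N", frozenset({"part", "body"}))
wrap = Sense("V", frozenset({"wrap"}))
candidates = [swelling, container, wrap]

# Sememes assumed to come from the hypernyms of a noun synset such as "bag".
hypernym_sememes = frozenset({"tool", "artifact"})

best = disambiguate(candidates, "N", hypernym_sememes)
print(best)
```

In this sketch a word with no sense of the required part of speech yields `None`, which loosely corresponds to a case the method cannot be applied to. How the paper actually defines its applicability rate is not stated in the abstract.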
