Article Details

International Journal of Computational Linguistics And Chinese Language Processing THCI

Title: Auto-Generation of NVEF Knowledge in Chinese
Volume/Issue: 9:1
Authors: Tsai, Jia-lin; Hsieh, Gladys; Hsu, Wen-lian
Pages: 41-63
Keywords: natural language understanding; HowNet; machine learning; verb-noun collection; THCI Core
Publication date: February 2004

Chinese Abstract

English Abstract

Noun-verb event frame (NVEF) knowledge, in conjunction with an NVEF word-pair identifier [Tsai et al. 2002], constitutes a system that can be used to support natural language processing (NLP) and natural language understanding (NLU). In [Tsai et al. 2002a], we demonstrated that NVEF knowledge can be used effectively to solve the Chinese word-sense disambiguation (WSD) problem with 93.7% accuracy for nouns and verbs. In [Tsai et al. 2002b], we showed that NVEF knowledge can be applied to the Chinese syllable-to-word (STW) conversion problem to achieve 99.66% accuracy for the NVEF-related portions of Chinese sentences. In [Tsai et al. 2002a], we defined a collection of NVEF knowledge as an NVEF word-pair (a meaningful NV word-pair) and its corresponding NVEF sense-pairs. No existing method can find collections of NVEF knowledge in Chinese sentences fully automatically. We propose a method for automatically acquiring large-scale NVEF knowledge without human intervention in order to identify a large, varied range of NVEF-sentences (sentences containing at least one NVEF word-pair). The auto-generation of NVEF knowledge (AUTO-NVEF) includes four major processes: (1) segmentation checking; (2) initial part-of-speech (IPOS) sequence generation; (3) NV knowledge generation; and (4) NVEF knowledge auto-confirmation. Our experimental results show that AUTO-NVEF achieved 98.52% accuracy for news and 96.41% for specific text types, which included research reports, classical literature and modern literature. AUTO-NVEF automatically discovered over 400,000 NVEF word-pairs from the 2001 United Daily News (2001 UDN) corpus. According to our estimation, the NVEF knowledge acquired from 2001 UDN helped to identify 54% of the NVEF-sentences in the Academia Sinica Balanced Corpus (ASBC) and 60% in the 2001 UDN corpus. We plan to expand NVEF knowledge so that it can identify more than 75% of NVEF-sentences in ASBC. We will also apply the acquired NVEF knowledge to support other NLP and NLU research, such as machine translation, shallow parsing, syllable and speech understanding, and text indexing. The auto-generation of bilingual, especially Chinese-English, NVEF knowledge will also be addressed in our future work.

Related Literature