文章詳目資料

International Journal of Computational Linguistics And Chinese Language Processing THCI

  • 加入收藏
  • 下載文章
篇名 Automatic Pronominal Anaphora Resolution in English Texts
卷期 9:1
作者 Liang, TyneWu, Dian-song
頁次 021-039
關鍵字 THCI Core
出刊日期 200402

中文摘要

英文摘要

Anaphora is a common phenomenon in discourses as well as an important research issue in the applications of natural language processing. In this paper, anaphora resolution is achieved by employing WordNet ontology and heuristic rules. The proposed system identifies both intra-sentential and inter-sentential antecedents of anaphors. Information about animacy is obtained by analyzing the hierarchical relations of nouns and verbs in the surrounding context. The identification of animacy entities and pleonastic-it usage in English discourses are employed to promote resolution accuracy. Traditionally, anaphora resolution systems have relied on syntactic, semantic or pragmatic clues to identify the antecedent of an anaphor. Our proposed method makes use of WordNet ontology to identify animate entities as well as essential gender information. In the animacy agreement module, the property is identified by the hypernym relation between entities and their unique beginners defined in WordNet. In addition, the verb of the entity is also an important clue used to reduce the uncertainty. An experiment was conducted using a balanced corpus to resolve the pronominal anaphora phenomenon. The methods proposed in [Lappin and Leass, 94] and [Mitkov, 01] focus on the corpora with only inanimate pronouns such as “it” or “its”. Thus the results of intra-sentential and inter-sentential anaphora distribution are different. In an experiment using Brown corpus, we found that the distribution proportion of intra-sentential anaphora is about 60%. Seven heuristic rules are applied in our system; five of them are preference rules, and two are constraint rules. They are derived from syntactic, semantic, pragmatic conventions and from the analysis of training data. A relative measurement indicates that about 30% of the errors can be eliminated by applying heuristic module.

關鍵知識WIKI

相關文獻