Article Details

International Journal of Computational Linguistics and Chinese Language Processing (THCI)

Title: The Design and Construction of the PolyU Shallow Treebank
Volume/Issue: 10:3
Authors: Xu, Ruifeng; Lu, Qin; Li, Yin; Li, Wanyin
Pages: 397-415
Keywords: Shallow Treebank; Natural Language Processing; Corpus Annotation; Shallow Parsing; THCI Core
Publication date: 2005-09

English Abstract

This paper presents the design and construction of the PolyU Treebank, a manually annotated Chinese shallow treebank. The PolyU Treebank is based on shallow annotation where only partial syntactical structures within sentences are annotated. Guided by the Phrase-Standard Grammar proposed by Peking University, the PolyU Treebank has been designed and constructed to provide a large amount of annotated data containing shallow syntactical information and limited semantic information for use in natural language processing (NLP) research. This paper describes the relevant design principles, annotation guidelines, and implementation issues, including the achievement of high quality annotation through the use of well-designed annotation workflow and effective post-annotation checking tools. Currently, the PolyU Treebank consists of a one-million-word annotated corpus and has been used in a number of NLP research projects with promising results.
