Article Details

International Journal of Computational Linguistics And Chinese Language Processing THCI

Title: The Formosan Language Archive: Linguistic Analysis and Language Processing
Volume/Issue: 10:2
Authors: Zeitoun, Elizabeth; Yu, Ching-hua
Pages: 167-199
Keywords: Formosan languages; language processing; linguistic analysis; corpora; Formosan Language Archive; THCI Core
Publication date: 2005-06

English Abstract

In this paper, we deal with the linguistic analysis approach adopted in the Formosan Language Corpora, one of the three main information databases included in the Formosan Language Archive, and the language processing programs that have been built upon it. We first discuss problems related to the transcription of different language corpora. We then deal with annotation rules and standards. We go on to explain the linguistic identification of clauses, sentences and paragraphs, and the computer programs used to obtain an alignment of words, glosses and sentences in Chinese and English. We finally show how we try to cope with analytic inconsistencies through programming. This paper is a complement to Zeitoun et al. [2003] in which we provided an overview of the whole architecture of the Formosan Language Archive.
