
Languages and International Studies (語文與國際研究)

Title Analyses on the Used Vocabulary in the Corpus of Taiwanese Learner of Japanese (CTLJ): Comparisons between CTLJ and Self-Constructed Natural Corpus
Issue 11
Parallel title 「台灣日語學習者語料庫」(CTLJ)之使用語彙分析-與自然語料庫之比較為本
Author 黃淑妙
Pages 071-096
Keywords Corpus; morphemes; frequency of occurrence; prone-to-error vocabulary
Publication date 2014-06
DOI 10.3966/181147172014060011003

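Chinese Abstract (English translation)

This paper analyzes vocabulary use in the original-text portion of the Corpus of Taiwanese Learner of Japanese (CTLJ). The texts were first segmented with the morphological analyzer MeCab; the parsing errors were then corrected in two rounds over a period of three years before the vocabulary analysis was carried out. To bring out the characteristics of learner vocabulary, the analysis compares CTLJ with a natural corpus built by the author. The results show that the original-text portion of CTLJ contains more than 390,000 morpheme tokens and about 13,000 distinct morphemes. Nouns are the most numerous, at more than 7,400 (57.2%), followed by verbs, at more than 3,100 (24.2%). In addition, comparing CTLJ with the author's natural corpus makes it possible to grasp how learners actually use vocabulary and how error-prone vocabulary is used, providing a reference for learners who want to reinforce their study.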

English Abstract

This paper presents an in-depth analysis of vocabulary use in the Corpus of Taiwanese Learner of Japanese (CTLJ). The method first applies the Japanese morphological analyzer MeCab to segment the vocabulary of the original Japanese writings in CTLJ, and then proceeds with a morpheme-level analysis of errors in grammar and usage, a process that has been repeated twice over the past three years. To highlight the words characteristic of Taiwanese learners' Japanese, CTLJ is compared with a corpus of current Japanese constructed by the author. The results indicate that the original student essays in CTLJ contain more than 390,000 morpheme tokens, representing around 13,000 morpheme types. Nouns number more than 7,400 and account for 57.2% of morpheme types; verbs number more than 3,100 (24.2%). In addition, the comparison between CTLJ and the natural corpus helps instructors grasp how learners actually use vocabulary and reveals which items are particularly prone to error, enabling them to provide apt and systematic instruction.
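
The distinction the abstract draws between morpheme tokens (every occurrence) and morpheme types (distinct morphemes), and the part-of-speech shares computed over types, can be illustrated with a minimal sketch. It assumes the text has already been segmented into (lemma, part-of-speech) pairs, for example by MeCab; the function name `summarize_morphemes`, the sample sentence, and the English POS labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def summarize_morphemes(analyzed):
    """Return (token count, type count, POS share over types).

    `analyzed` is an iterable of (lemma, pos) pairs, one per morpheme
    token, as an analyzer such as MeCab might produce (assumed format).
    """
    tokens = list(analyzed)
    types = set(tokens)  # distinct (lemma, pos) pairs
    # Count each type once per part of speech, not once per occurrence.
    pos_of_types = Counter(pos for _, pos in types)
    share = {pos: n / len(types) for pos, n in pos_of_types.items()}
    return len(tokens), len(types), share


# Hypothetical, hand-segmented sample (not data from CTLJ).
sample = [
    ("私", "noun"), ("は", "particle"), ("本", "noun"), ("を", "particle"),
    ("読む", "verb"), ("私", "noun"), ("は", "particle"), ("映画", "noun"),
    ("を", "particle"), ("見る", "verb"),
]

n_tokens, n_types, pos_share = summarize_morphemes(sample)
print(n_tokens, n_types, pos_share)
```

In this toy sample the ten tokens reduce to seven types, three of which are nouns; the percentages reported in the abstract (57.2% nouns, 24.2% verbs) are shares of this kind, computed over types rather than tokens.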
