Article details

測驗學刊 (Psychological Testing), TSSCI

Title: 利用Google BERT提升中文寫作自動評分之準確率
Volume/Issue: 68:1
Parallel title: Applying Google BERT to Enhance the Correct Rate of Automatic Scoring in Chinese Writing
Authors: 郭伯臣, 李政軒, 黃淇瀅
Pages: 53-74
Keywords: automatic scoring model, Chinese writing, Google BERT model, LSA, TSSCI
Publication date: March 2021

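Chinese abstract (English translation)

Chinese writing tests have been part of large-scale assessments in Taiwan for many years, but essay scores vary with the raters' educational background, experience, and cognition, so developing an automatic scoring model for Chinese writing to assist teachers with grading is extremely important. Many current automatic scoring models use Latent Semantic Analysis (LSA); because the words it represents in a vector space do not take contextual order into account, it is limited for text analysis. This study therefore uses BERT (Bidirectional Encoder Representations from Transformers), the deep learning model for natural language processing proposed by Google in 2018, to build an automatic scoring model for Chinese writing. Google BERT centers on pretraining and fine-tuning; during pretraining, Masked LM (MLM), Next Sentence Prediction (NSP), and Transformer encoding make the model more precise in processing text. The study randomly selected, as its analysis sample, a writing test from a literacy assessment for university students scored on three levels (0, 1, and 2 points), comprising the writing texts of 1,185 students. With the fine-tuned Google BERT model, the overall accuracy between system and expert scores reached 92.07%, better than traditional LSA-based automatic scoring of Chinese writing (overall accuracy with expert scores of 64.73%).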
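The word-order limitation attributed to LSA above can be illustrated with a small sketch. LSA is built on term counts per document, and counts alone cannot distinguish two texts that use the same words in a different order. The snippet below is only an illustration under stated assumptions: the two toy sentences, the whitespace tokenization, and the helper names `bag_of_words` and `cosine` are hypothetical and are not taken from the study, and the SVD step of LSA is omitted because it operates on the same order-free counts.

```python
from collections import Counter


def bag_of_words(text):
    """Count whitespace-separated tokens; word order is discarded."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy sentences (hypothetical): same words, opposite meaning.
s1 = "the dog bit the man"
s2 = "the man bit the dog"

same_counts = bag_of_words(s1) == bag_of_words(s2)
similarity = cosine(bag_of_words(s1), bag_of_words(s2))
print(same_counts, round(similarity, 6))
```

Because both sentences yield identical count vectors, any model that starts from these counts treats them as the same text, which is the kind of restriction a context-aware encoder such as BERT is meant to avoid.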

English abstract

In Taiwan, Chinese writing tests have been used in large-scale assessments for many years. When Chinese writing is graded, scores may differ because of the educational background, experience, and cognition of the grading teachers. Therefore, developing automatic scoring of Chinese writing to assist teachers in grading is extremely important. At present, many automatic scoring models use Latent Semantic Analysis (LSA). Because the vocabulary it represents in a vector space does not take the order of the context into account, it is limited for text analysis. In view of this, this research uses Bidirectional Encoder Representations from Transformers (BERT), proposed by Google in 2018, to build an automatic scoring model for Chinese writing. Google BERT is mainly based on pretraining and fine-tuning. Pretraining includes Masked LM (MLM), Next Sentence Prediction (NSP), and Transformer encoding. After pretraining, a fine-tuning step is applied using specific essays. These two steps make the model more accurate in text processing. This study randomly selected, as its analysis sample, a writing test scored on three levels (0 points, 1 point, and 2 points) from the "Chinese language literacy test for undergraduate", for a total of 1,185 students' writing texts. Compared with the traditional LSA-based automatic scoring method (whose overall accuracy against expert scoring is 64.73%), the fine-tuned Google BERT model performs better: the overall accuracy between system and expert scoring reached 92.07%.
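The overall accuracy reported above compares system scores with expert scores on the three levels (0, 1, 2). A minimal sketch of that comparison follows, assuming accuracy means the proportion of essays where the system score exactly matches the expert score; the abstract does not spell out the formula, so this definition is an assumption. The function name `accuracy` and the eight toy scores are hypothetical and are not data from the study.

```python
def accuracy(system_scores, expert_scores):
    """Proportion of essays where the system score equals the expert score."""
    if len(system_scores) != len(expert_scores):
        raise ValueError("score lists must have the same length")
    if not expert_scores:
        raise ValueError("score lists must not be empty")
    matches = sum(s == e for s, e in zip(system_scores, expert_scores))
    return matches / len(expert_scores)


# Hypothetical toy scores on the three-level scale (0, 1, 2); not study data.
expert = [0, 1, 2, 2, 1, 0, 2, 1]
system = [0, 1, 2, 1, 1, 0, 2, 2]

toy_accuracy = accuracy(system, expert)
print(f"{toy_accuracy:.2%}")
```

Exact agreement treats a one-level miss and a two-level miss the same way; an agreement statistic that accounts for the ordering of the levels would be a different design choice from the one sketched here.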

Related literature