Article Details

測驗學刊 TSSCI

Title 以支援向量機處理題型符號與文字特徵應用於微積分試題難度分類
Volume/Issue 68:2
Parallel title Difficulty Level Classification of Calculus Exam Questions Using SVMs with Descriptive Features of Symbols and Texts
Authors 林泓宏, 蘇家鈜, 張勝麟
Pages 75-100
Keywords support vector machine (支援向量機), text features (文字特徵), calculus (微積分), question difficulty level classification (試題難度分類), question description symbol (題型符號), TSSCI
Publication date 2021-06

Chinese Abstract

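This study establishes "question-type symbol and text features" for calculus exam questions. Through manual induction, symbol features are extracted from each type of calculus question and converted into vector representations. The question feature vectors are then reduced in dimensionality by principal component analysis (PCA) and by linear discriminant analysis (LDA), respectively, to find a feature space that better matches the difficulty distribution of the questions. Finally, a support vector machine with an RBF kernel estimates difficulty from the reduced features and classifies the questions as hard, medium, or easy. As far as the literature review shows, the proposed computational representation of "question-type symbol and text features" is a novel feature-set design among related studies in Taiwan and abroad. The experimental results show that, under 5-fold cross-validation, the highest difficulty classification accuracy on a single test fold is 95%, and the average test accuracy over the 5 folds reaches 90.19%. These results are far above the 33.33% accuracy of random guessing. For three classes, the upper limit of the 95% confidence interval for random guessing is about 42.69%, so the results of the proposed method exceed that limit by about 47.5 percentage points. This indicates that the proposed "question-type symbol and text features" are significantly effective for classifying the difficulty of calculus exam questions.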

English Abstract

This paper proposes a new design of "descriptive symbol and text features" for calculus exam questions. The proposed descriptive features of symbols and texts can be extracted from various calculus questions and are represented as vectors. The high dimensionality of the features extracted from test questions is then reduced by principal component analysis (PCA) or by linear discriminant analysis (LDA) to find a lower-dimensional feature space that better fits the difficulty-level distribution of the test questions. Subsequently, a support vector machine with a radial basis function (RBF) kernel is adopted to categorize calculus questions into three degrees of difficulty, i.e., hard, medium, and easy. To the best of our knowledge, the proposed descriptive feature representation of symbols and texts of mathematical questions is a novel design for difficulty level estimation of calculus exam questions and is rarely seen in previous literature. In our experiments on difficulty level classification with 5-fold cross-validation (CV), the highest classification accuracy on a single test fold is 95%, while the average classification accuracy over the 5 folds is 90.19%. These results are far higher than the 33.33% accuracy of random guessing. For the three categories, the upper limit of the 95% confidence interval for random guessing is about 42.69%, so our average result exceeds that upper limit by about 47.5 percentage points. This validates the significant effectiveness of the proposed descriptive features of symbols and texts of calculus exam questions for automatic difficulty level prediction.
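The comparison against random guessing can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not state how the 42.69% upper limit was computed or how many questions were tested, so the normal-approximation formula and the `n_items` values are assumptions, and the function names `chance_upper_limit` and `margin_points` are hypothetical. Only the 90.19% and 42.69% figures come from the abstract.

```python
import math


def chance_upper_limit(n_items, n_classes=3, z=1.96):
    """Upper limit of a two-sided 95% interval for chance accuracy.

    Assumption: normal approximation to the binomial, with n_items
    test questions and equally likely classes. The abstract does not
    say which method or sample size was actually used.
    """
    p = 1.0 / n_classes                       # chance accuracy, 1/3 for 3 classes
    se = math.sqrt(p * (1.0 - p) / n_items)   # standard error of a proportion
    return p + z * se


def margin_points(accuracy_pct, chance_limit_pct):
    """Difference in percentage points between two percentages."""
    return accuracy_pct - chance_limit_pct


if __name__ == "__main__":
    # Figures reported in the abstract: 90.19% average accuracy and
    # a 42.69% upper limit for random guessing.
    print(round(margin_points(90.19, 42.69), 2))

    # Hypothetical sample sizes, to show how the limit shrinks toward 1/3.
    for n in (50, 100, 200):
        print(n, round(100 * chance_upper_limit(n), 2))
```

The reported margin is a difference of two percentages, which is why it is best read as 47.5 percentage points rather than a relative improvement of 47.5%.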

Related Literature