Article Details

測驗學刊 TSSCI

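Title: Can We Rely Too Much on Testlets? The Influence of the Number of Testlet Items on Parameter Estimation
Volume/Issue: 60:4
Parallel title: 題組在測驗上的使用是否須謹慎?題組的題數對參數估計的影響 [Should testlets be used with caution in testing? The effect of the number of testlet items on parameter estimation]
Authors: 林奕宏, 施慶麟
Pages: 649-680
Keywords: parameter estimation; Rasch testlet model; test design; testlet effect; testlet; 參數估計; 測驗設計; 題組; 題組效果; 羅序題組模式
Publication date: December 2013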

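Chinese Abstract (English translation)

Testlet items are widely used in a variety of testing contexts; however, research has found that testlet effects influence test results to some degree. This study further examines the relationship between the number of testlet items and overall test results, focusing on changes in parameter estimates and test reliability. It comprises an empirical analysis and a simulation study. The empirical analysis, using the English test of Taiwan's 2007 college entrance examination as an example, found significant testlet effects in the test data; the parameter values obtained from the empirical analysis then served as the basis for the simulation study. The simulation results show that the number of testlet items in a test has a significant effect on examinees' ability estimates: as the number of testlet items decreases, the bias, standard error, mean square error, and mean absolute error of the ability estimates also decrease, while test reliability increases; item difficulty is comparatively less affected. In other words, if the purpose of a test is to obtain precise ability estimates, as in entrance examinations, then the use of testlet items, and especially their number, must be controlled with particular care.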

English Abstract

Testlet items are commonly used in testing. However, studies have found that testlet effects have some impact on test results. The purpose of this study is to investigate further the influence of the number of testlet items on the entire test and to observe changes in the parameter estimates as well as in test reliability. This study consists of an empirical analysis and two simulation studies. The English test in Taiwan's 2007 College Entrance Examination was analyzed as an example. The empirical analysis demonstrates non-ignorable testlet effects in the dataset. The parameter estimates obtained from the empirical analysis are then used in the simulation studies. The simulation studies reveal that the total number of testlet items has a significant impact on the person ability estimate: bias, standard error, mean square error, and mean absolute error drop, and the EAP test reliability rises, when fewer testlet items are included in the test. For the item difficulty estimate, the impact is relatively small; only the standard error increases consistently as the number of testlet items increases, while bias, mean square error, and mean absolute error show no consistent pattern. In sum, testlet effects are not beneficial to ability estimation, and this influence undermines test fairness. Other suggestions for test design are provided in the conclusion.

Related Literature