Article Details

Bulletin of Special Education (特殊教育研究學刊), TSSCI

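Title: Development and Application of Aptitude Test Item Banks for Identifying Scientifically Gifted Students (科學資優生鑑定之性向測驗題庫發展及運用)
Volume/Issue: 48:1
Parallel Title: Development of item banks for middle school and high school science aptitude tests
Author: 侯雅齡
Pages: 97-128
Keywords: science aptitude test; test development; gifted students; item response theory; item bank development; item bank; natural science academic aptitude test; TSSCI
Publication Date: March 2023
DOI: 10.6172/BSE.202303_48(1).0004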

Chinese Abstract

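In Taiwan, the identification of students gifted in scientific academic aptitude is hampered by a shortage of standardized science aptitude tests and by the risk of item leakage, both of which threaten the validity of identification results. This study aimed to build a science aptitude test item bank for junior high school and another for senior high school, and to use items from the banks to assemble tests suitable for identifying gifted students. Items were written according to a three-dimensional construct of scientific competency, cognitive process, and subject knowledge; scientific competency comprised four science learning abilities: explaining phenomena scientifically, evaluating and designing scientific inquiry, interpreting data and evidence scientifically, and scientific problem solving. All items underwent rigorous content review, and the banks were developed with item response theory (IRT) using a horizontal equating design, which yielded three parameters for each item: discrimination, difficulty, and guessing. For the junior high school bank, a 140-item calibration bank was first established with 4,663 students nationwide; a 7-item anchor test was then assembled from it for equating subsequently added items, so that different test forms administered to different samples would yield parameters on the same scale as the calibration bank. Forms of 34 items each were then assembled, adding 309 items in total; the IRT parameters indicated good discrimination, medium-to-high difficulty, and reasonable guessing values. For the senior high school bank, a 274-item calibration bank was first established with 3,702 students nationwide, and a 10-item anchor test was assembled from it for equating subsequently added items. Forms of 50 items each were then assembled, adding 412 items in total; the IRT parameters indicated good discrimination, medium-to-high difficulty, and reasonable guessing values. Criterion-related validity was examined using gifted students' school semester grades in natural science, and the results were as expected. Gifted and general students differed significantly on every dimension of the test, indicating good discriminant validity. Finally, in view of practical identification needs, 50 items were selected from the bank to form a test, and percentile rank and normalized T-score norms were established for ninth-grade students to illustrate how the bank can be used.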
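The percentile rank and normalized T-score norms mentioned at the end of this abstract can be illustrated with a minimal Python sketch. The abstract does not describe the norming procedure in detail, so the mid-rank definition of percentile rank, the function names, and the raw scores below are all assumptions made for illustration.

```python
from statistics import NormalDist


def percentile_ranks(raw_scores):
    """Percentile rank of each score: percent scoring below plus half the percent tied.

    This mid-rank definition is an assumption; the study may use another convention.
    """
    n = len(raw_scores)
    ranks = []
    for x in raw_scores:
        below = sum(1 for s in raw_scores if s < x)
        tied = sum(1 for s in raw_scores if s == x)
        ranks.append(100.0 * (below + 0.5 * tied) / n)
    return ranks


def normalized_t(pr):
    """Map a percentile rank (0 < pr < 100) to a normalized T score (mean 50, SD 10)."""
    z = NormalDist().inv_cdf(pr / 100.0)  # normal deviate with the same cumulative proportion
    return 50.0 + 10.0 * z


# Hypothetical raw scores on a 50-item form (not data from the study).
scores = [12, 18, 25, 25, 31, 40]
prs = percentile_ranks(scores)
t_scores = [round(normalized_t(p), 1) for p in prs]
```

Because the T scores are derived from percentile ranks through the inverse normal distribution, they follow a normal shape even when the raw-score distribution is skewed, which is what distinguishes a normalized T score from a linear one.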

English Abstract

Rationale & Purpose: Because gifted students in Taiwan must be evaluated for scientific aptitude every year, measures are needed to reduce the risk of leaked questions. The Programme for International Student Assessment, commissioned by the Organisation for Economic Co-operation and Development, emphasizes literacy in test design. One of the main goals of aptitude tests is to assess students' ability to navigate a rapidly changing society. Measuring scientific aptitude includes evaluating the ability to explain phenomena scientifically, evaluate and design scientific enquiries, solve scientific problems, and interpret scientific data and evidence. Scientific aptitude has considerable explanatory power for understanding the academic potential and learning attitudes of students. This study constructed sustainable item banks (question banks) for the Science Aptitude Test taken by junior and senior high school students. New content can be added to the item banks, and the items we developed were placed on the same scale according to item response theory (IRT) so that they can be compared. Consequently, when gifted students are to be identified, appropriate items can be selected from the item bank to assemble a test, and the item bank can be expanded through proper procedures in the future. Developing an item bank for gifted identification is good practice: it addresses the need for such tools, supports appropriateness, fairness, and rigor throughout the identification process, and saves the cost of developing test items repeatedly. Methods: To write the items for the Science Aptitude Test, we adopted three constructs: content knowledge, cognitive process, and scientific competency. All the items underwent strict content review, and the item banks were developed using horizontal equating based on IRT.
Furthermore, every item had three parameters: parameter A was discrimination, parameter B was difficulty, and parameter C was guessing. To build the Science Aptitude Test item bank for junior high school students, responses from 4,663 students across Taiwan were analyzed to establish a 140-item calibration test. Thereafter, 7 of the 140 items were selected for the anchor test. The anchor test (7 items) was combined with each of several sets of 27 new items to form new test forms. In total, 309 items were added. The three parameters for each item demonstrated good fit. For the high school Science Aptitude Test item bank, 3,702 high school students across the country participated to establish a 274-item calibration test. Ten of the 274 items were selected for the anchor test. The anchor test (10 items) was then combined with each of several sets of 40 new items to form new test forms. In total, 412 items were added. Similarly, the three parameters demonstrated good fit. Results/Findings: The item-writing committee members were selected from the talent database on the basis of their subject expertise to ensure that the test items had high validity. A total of 72 committee members participated in writing items for the scientific aptitude tests. The correctness and appropriateness of items were reviewed by four separate review teams with backgrounds in physics, chemistry, biology, and geology. Statistical tests showed that the parameters for the anchor tests and aptitude tests (discrimination, difficulty, and guessing) all had good fit and supported the unidimensionality of the items. Thereafter, high school students were tested, and the correlation between their scores on aptitude tests drawn from the item bank and their semester scores in natural sciences indicated high criterion-related validity.
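As an illustration of the three item parameters described above, the following is a minimal Python sketch of the three-parameter logistic (3PL) item response function. The abstract does not state the exact parameterization used, so the form below (without the scaling constant D that some formulations include), the function name, and the example values are assumptions.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under a 3PL model.

    theta: examinee ability
    a: discrimination (parameter A), b: difficulty (parameter B),
    c: guessing (parameter C, the lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: moderately discriminating, somewhat hard, four-option guessing floor.
a, b, c = 1.2, 0.5, 0.2
p_at_difficulty = p_correct_3pl(b, a, b, c)      # halfway between c and 1
p_low_ability = p_correct_3pl(-10.0, a, b, c)    # approaches the guessing floor c
```

When ability equals the item difficulty, the probability is midway between the guessing floor and 1; for very low ability it approaches the guessing parameter rather than zero.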
The test information curve indicated that the tests provide the maximum information, and therefore the smallest measurement error, when the student's ability is 1.1 standard deviations above the mean. Moreover, significant differences were observed between gifted students and other students on each dimension of the aptitude test, which further indicated good discriminant validity. Finally, 754 ninth-grade students served as the norm sample to establish percentile rank and normalized T-score norms for a 50-item test assembled from the item bank. Conclusions/Implications: After the literature review and several expert panel discussions, we adopted the three aforementioned constructs for scientific aptitude (content knowledge, cognitive process, and scientific competency). To reflect the 12-year national education system, interdisciplinary test questions were also added in the final round in addition to the four subjects. The test questions in the question bank were written by many test committee members at different times. To ensure the content validity and quality of the test questions, all the questions were first given to the review team, which determined whether they were consistent with the test structure and made appropriate revisions accordingly. After the content knowledge revisions, testing experts reviewed the questions against the principles of item writing. The response data of students from the field-test samples were referenced during revisions. This study used rigorous procedures to ensure that the questions in the question bank were of high quality. Because the Ministry of Education requires that the identification of gifted students be based on national norms, this study first established nationally representative samples for the middle school and high school Science Aptitude Test.
So that future test committees can save labor and other related costs, subsequent testing can be conducted regionally, which is conducive to obtaining the support of schools and expanding the question bank. Those managing the question bank must continually increase the number of test questions. To keep the scores of each test on the same scale and unaffected by the particular test takers and test forms involved, equating between tests was ensured using a common test. To reduce the exposure of the common test questions and improve the confidentiality of the questions in the bank, the researchers selected a set of anchor tests covering each subject from the middle school and high school Science Aptitude Test for use in additional testing and calibration in the future. With the addition of the common test, the two-form anchor test not only considers the validity of the aptitude test constructs but also ensures that the values of the three parameters (discrimination, difficulty, and guessing) are ideal.
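The role of the common (anchor) test in keeping new items on the scale of the calibration bank can be sketched in Python. The abstract does not name the linking method, so the mean-sigma method below is a stand-in for illustration only; the function names and the synthetic anchor difficulties are assumptions, not the study's procedure or data.

```python
from statistics import mean, pstdev


def mean_sigma_constants(b_anchor_base, b_anchor_new):
    """Slope A and intercept B that map the new form's scale onto the base scale,

    computed from the anchor items' difficulty estimates on both scales.
    """
    A = pstdev(b_anchor_base) / pstdev(b_anchor_new)
    B = mean(b_anchor_base) - A * mean(b_anchor_new)
    return A, B


def to_base_scale(item, A, B):
    """Transform one item's (a, b, c) estimates from the new scale to the base scale."""
    a, b, c = item
    return (a / A, A * b + B, c)  # the guessing parameter is unchanged by a linear rescaling


# Synthetic difficulties for 7 anchor items on the calibration (base) scale.
b_base = [-0.8, -0.2, 0.1, 0.5, 0.9, 1.3, 1.8]
# The same items as they might be estimated in a new sample; built from a known
# shift and stretch so the recovered constants can be checked (A = 1.1, B = 0.3).
b_new = [(b - 0.3) / 1.1 for b in b_base]

A, B = mean_sigma_constants(b_base, b_new)
linked_item = to_base_scale((1.0, 0.0, 0.2), A, B)  # a hypothetical newly added item
```

Only the anchor items need to appear on every form; once A and B are estimated from them, every new item on that form can be placed on the calibration scale.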

Related Literature