TY - JOUR
T1 - Essay Selection Methods for Adaptive Rater Monitoring
AU - Wang, Chun
AU - Song, Tian
AU - Wang, Zhuoran
AU - Wolfe, Edward
N1 - Publisher Copyright: © 2016, © The Author(s) 2016.
PY - 2017/1/1
Y1 - 2017/1/1
N2 - Constructed-response items are commonly used in educational and psychological testing, and the answers to those items are typically scored by human raters. In current rater monitoring processes, validity scoring is used to ensure that the scores assigned by raters do not deviate severely from the standards of rating quality. This article proposes an adaptive rater monitoring approach that may improve the efficiency of current rater monitoring practice. Based on the Rasch partial credit model and established developments in multidimensional computerized adaptive testing, two essay selection methods are proposed: the D-optimal method and the Single Fisher information method. Both methods aim to select the most appropriate essays based on what is already known about a rater’s performance. Simulation studies, using a simulated essay bank and a cloned real essay bank, show that the proposed adaptive rater monitoring methods can recover rater parameters with far fewer essay questions. Future challenges and potential solutions are discussed at the end.
KW - Fisher information matrix
KW - Rasch partial credit model
KW - essay selection
KW - interim scoring
UR - http://www.scopus.com/inward/record.url?scp=85002323007&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85002323007&partnerID=8YFLogxK
U2 - 10.1177/0146621616672855
DO - 10.1177/0146621616672855
M3 - Article
C2 - 29881078
AN - SCOPUS:85002323007
SN - 0146-6216
VL - 41
SP - 60
EP - 79
JO - Applied Psychological Measurement
JF - Applied Psychological Measurement
IS - 1
ER -