TY - JOUR

T1 - Deriving Stopping Rules for Multidimensional Computerized Adaptive Testing

AU - Wang, Chun

AU - Chang, Hua Hua

AU - Boughton, Keith A.

PY - 2013/3

Y1 - 2013/3

N2 - Multidimensional computerized adaptive testing (MCAT) is able to provide a vector of ability estimates for each examinee, which could be used to provide a more informative profile of an examinee's performance. The current literature on MCAT focuses on the fixed-length tests, which can generate less accurate results for those examinees whose abilities are quite different from the average difficulty level of the item bank when there are only a limited number of items in the item bank. Therefore, instead of stopping the test with a predetermined fixed test length, the authors use a more informative stopping criterion that is directly related to measurement accuracy. Specifically, this research derives four stopping rules that either quantify the measurement precision of the ability vector (i.e., minimum determinant rule [D-rule], minimum eigenvalue rule [E-rule], and maximum trace rule [T-rule]) or quantify the amount of available information carried by each item (i.e., maximum Kullback-Leibler divergence rule [K-rule]). The simulation results showed that all four stopping rules successfully terminated the test when the mean squared error of ability estimation is within a desired range, regardless of examinees' true abilities. It was found that when using the D-, E-, or T-rule, examinees with extreme abilities tended to have tests that were twice as long as the tests received by examinees with moderate abilities. However, the test length difference with K-rule is not very dramatic, indicating that K-rule may not be very sensitive to measurement precision. In all cases, the cutoff value for each stopping rule needs to be adjusted on a case-by-case basis to find an optimal solution.

AB - Multidimensional computerized adaptive testing (MCAT) is able to provide a vector of ability estimates for each examinee, which could be used to provide a more informative profile of an examinee's performance. The current literature on MCAT focuses on the fixed-length tests, which can generate less accurate results for those examinees whose abilities are quite different from the average difficulty level of the item bank when there are only a limited number of items in the item bank. Therefore, instead of stopping the test with a predetermined fixed test length, the authors use a more informative stopping criterion that is directly related to measurement accuracy. Specifically, this research derives four stopping rules that either quantify the measurement precision of the ability vector (i.e., minimum determinant rule [D-rule], minimum eigenvalue rule [E-rule], and maximum trace rule [T-rule]) or quantify the amount of available information carried by each item (i.e., maximum Kullback-Leibler divergence rule [K-rule]). The simulation results showed that all four stopping rules successfully terminated the test when the mean squared error of ability estimation is within a desired range, regardless of examinees' true abilities. It was found that when using the D-, E-, or T-rule, examinees with extreme abilities tended to have tests that were twice as long as the tests received by examinees with moderate abilities. However, the test length difference with K-rule is not very dramatic, indicating that K-rule may not be very sensitive to measurement precision. In all cases, the cutoff value for each stopping rule needs to be adjusted on a case-by-case basis to find an optimal solution.

KW - Fisher information

KW - Kullback-Leibler information

KW - continuous entropy

KW - multidimensional computerized adaptive testing

UR - http://www.scopus.com/inward/record.url?scp=84874003772&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874003772&partnerID=8YFLogxK

U2 - 10.1177/0146621612463422

DO - 10.1177/0146621612463422

M3 - Article

AN - SCOPUS:84874003772

SN - 0146-6216

VL - 37

SP - 99

EP - 122

JO - Applied Psychological Measurement

JF - Applied Psychological Measurement

IS - 2

ER -