Several methods used to examine differential item functioning (DIF) in Patient-Reported Outcomes Measurement Information System (PROMIS®) measures are presented, including effect size estimation. A summary of factors that may affect DIF detection and challenges encountered in PROMIS DIF analyses, e.g., anchor item selection, is provided. An issue in PROMIS was the potential for inadequately modeled multidimensionality to result in false DIF detection. Section 1 is a presentation of the unidimensional models used by most PROMIS investigators for DIF detection, as well as their multidimensional expansions. Section 2 is an illustration that builds on previous unidimensional analyses of depression and anxiety short-forms to examine DIF detection using a multidimensional item response theory (MIRT) model. The Item Response Theory-Log-likelihood Ratio Test (IRT-LRT) method was used for a real data illustration with gender as the grouping variable. The IRT-LRT DIF detection method is a flexible approach to handle group differences in trait distributions, known as impact in the DIF literature, and was studied with both real data and in simulations to compare the performance of the IRT-LRT method within the unidimensional IRT (UIRT) and MIRT contexts. Additionally, different effect size measures were compared for the data presented in Section 2. A finding from the real data illustration was that using the IRT-LRT method within a MIRT context resulted in more flagged items as compared to using the IRT-LRT method within a UIRT context. The simulations provided some evidence that while unidimensional and multidimensional approaches were similar in terms of Type I error rates, power for DIF detection was greater for the multidimensional approach. Effect size measures presented in Section 1 and applied in Section 2 varied in terms of estimation methods, choice of density function, methods of equating, and anchor item selection. 
Despite these differences, there was considerable consistency in results, especially for the items showing the largest values. Future work is needed to examine DIF detection in the context of polytomous, multidimensional data. PROMIS standards included incorporation of effect size measures in determining salient DIF. Integrated methods for examining effect size measures in the context of IRT-based DIF detection procedures are still in early stages of development.
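As a rough illustration of the model comparison underlying the IRT-LRT method, the sketch below computes a likelihood ratio statistic from two nested models: one in which the studied item's parameters are constrained equal across groups, and one in which they are freed. The log-likelihood values and the number of freed parameters are hypothetical placeholders, not results from the study, and the helper names `chi2_sf` and `lrt_dif` are introduced here for illustration only. Fitting the IRT or MIRT models themselves is assumed to happen elsewhere.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases: df = 1 uses the complementary error function; df = 2 is exponential.
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5
    else:
        q, a = math.exp(-half), 1.0
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def lrt_dif(loglik_constrained, loglik_free, n_freed_params):
    """Likelihood ratio test for one studied item.

    loglik_constrained: log-likelihood with the item's parameters equal across groups.
    loglik_free: log-likelihood with the item's parameters estimated separately by group.
    n_freed_params: number of parameters freed, used as the degrees of freedom.
    """
    g2 = -2.0 * (loglik_constrained - loglik_free)
    return g2, chi2_sf(g2, n_freed_params)


# Hypothetical values (assumptions for illustration, not taken from the study).
g2, p = lrt_dif(loglik_constrained=-5012.4, loglik_free=-5007.1, n_freed_params=2)
print(f"G2 = {g2:.2f}, p = {p:.4f}")
```

A small p-value would flag the item for possible DIF; as the abstract notes, an effect size measure would still be needed to judge whether flagged DIF is salient.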
Bibliographical note
Funding Information:
The Patient-Reported Outcomes Measurement Information System (PROMIS®) “Roadmap Initiative” was funded by the National Institutes of Health in 2004 to improve and standardize the measurement of symptoms and health outcomes by constructing item banks using item response theory (Cella et al.; Reeve et al.). Although the original anxiety and depression item banks were evaluated for DIF using the unidimensional IRT-based methods described earlier (Choi et al.; Teresi et al.), little data existed that permitted evaluation of the performance of PROMIS measures across ethnically diverse groups. The Measuring Your Health (MYHealth; Jensen et al.) study of PROMIS short-form measures in a stratified random sample of 5506 ethnically diverse patients with cancer was thus initiated in 2010 to partially redress this gap.
U01AR057971 (PIs: Potosky, Moinpour), NCI P30CA051008, UL1TR000101 (previously UL1RR031975) from the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, through the Clinical and Translational Science Awards Program (CTSA). Analyses of these data were supported by the Mount Sinai Claude D. Pepper Older Americans Independence Center (National Institute on Aging, 1P30AG028741, Siu) and the Columbia University Alzheimer’s Disease Resource Center for Minority Aging Research (National Institute on Aging, 1P30AG059303, Manly, Luchsinger). This research was also supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health under Award Number R01HD079439 to the Mayo Clinic in Rochester, Minnesota, through subcontracts to the University of Minnesota and the University of Washington. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors thank Katja Ocepek-Welikson, M.Phil., for analytic assistance and Ruoyi Zhu, a doctoral student in the College of Education, University of Washington, for assistance in conducting the simulation study.
© 2021, The Psychometric Society.
- differential item functioning
- effect size estimates
- multidimensional IRT
PubMed: MeSH publication types
- Journal Article
- Research Support, N.I.H., Extramural