An Updated Meta-Analysis of the Interrater Reliability of Supervisory Performance Ratings

You Zhou, Paul R. Sackett, Winny Shen, Adam S. Beatty

Research output: Contribution to journal › Article › peer-review


Given the centrality of the job performance construct to organizational researchers, it is critical to understand the reliability of the most common way it is operationalized in the literature. To this end, we conducted an updated meta-analysis on the interrater reliability of supervisory ratings of job performance (k = 132 independent samples) using a new meta-analytic procedure (i.e., the Morris estimator), which includes both within- and between-study variance in the calculation of study weights. An important benefit of this approach is that it prevents large-sample studies from dominating the results. In this investigation, we also examined different factors that may affect interrater reliability, including job complexity, managerial level, rating purpose, performance measure, and rater perspective. We found a higher interrater reliability estimate (r = .65) compared to previous meta-analyses on the topic, and our results converged with an important, but often neglected, finding from a previous meta-analysis by Conway and Huffcutt (1997), such that interrater reliability varies meaningfully by job type (r = .57 for managerial positions vs. r = .68 for nonmanagerial positions). Given this finding, we advise against the use of an overall grand mean of interrater reliability. Instead, we recommend using job-specific or local reliabilities for making corrections for attenuation.
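The two technical ideas in the abstract, study weights that include between-study variance and corrections for attenuation, can be illustrated with a minimal sketch. This is not the Morris estimator itself: it uses generic inverse-variance random-effects weights, and the between-study variance `tau2`, the three sample values, and the observed validity of .30 are all hypothetical numbers chosen for illustration. Only the reliabilities of .57 and .68 come from the abstract.

```python
import math

def sampling_variance(r, n):
    """Approximate within-study (sampling) variance of a correlation."""
    return (1 - r ** 2) ** 2 / (n - 1)

def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def random_effects_weights(rs, ns, tau2):
    """Generic inverse-variance weights: 1 / (within + between variance).

    tau2 is supplied by the caller here; estimating it (as the Morris
    estimator does) is outside the scope of this sketch.
    """
    return [1.0 / (sampling_variance(r, n) + tau2) for r, n in zip(rs, ns)]

def correct_for_attenuation(r_observed, criterion_reliability):
    """Classic correction for unreliability in the criterion measure."""
    return r_observed / math.sqrt(criterion_reliability)

# Hypothetical samples: one very large study and two small ones.
rs = [0.50, 0.70, 0.65]
ns = [5000, 60, 80]
tau2 = 0.01  # assumed between-study variance, for illustration only

n_weighted = weighted_mean(rs, ns)
re_weights = random_effects_weights(rs, ns, tau2)
re_weighted = weighted_mean(rs, re_weights)

# Share of total weight carried by the largest study under each scheme.
largest_share_n = ns[0] / sum(ns)
largest_share_re = re_weights[0] / sum(re_weights)

print(f"sample-size weighted mean:    {n_weighted:.3f}")
print(f"random-effects weighted mean: {re_weighted:.3f}")
print(f"largest study share (n):      {largest_share_n:.2f}")
print(f"largest study share (RE):     {largest_share_re:.2f}")

# Same hypothetical observed validity, corrected with job-specific reliabilities.
observed = 0.30
print(f"corrected, managerial (.57):    {correct_for_attenuation(observed, 0.57):.3f}")
print(f"corrected, nonmanagerial (.68): {correct_for_attenuation(observed, 0.68):.3f}")
```

Under these assumed values, the large study carries most of the weight when weighting by sample size and a much smaller share once between-study variance enters the weights. The last two lines show why the choice of reliability matters: the same observed correlation is corrected to a larger value when the lower managerial reliability is used.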

Original language: English (US)
Pages (from-to): 949-970
Number of pages: 22
Journal: Journal of Applied Psychology
Issue number: 6
State: Published - Jan 25 2024

Bibliographical note

Publisher Copyright:
© 2024 American Psychological Association


  • interrater reliability
  • job performance
  • meta-analysis
  • supervisory ratings

