How Do Programs Measure Resident Performance? A Multi-Institutional Inventory of General Surgery Assessments

John Luckoski, Danielle Jean, Angela Thelen, Laura Mazer, Brian George, Daniel E. Kendrick

Research output: Contribution to journal, Article, peer-review

11 Scopus citations


OBJECTIVE: To perform an inventory of assessment tools in use at surgical residency programs and their alignment with the Milestone Competencies.
DESIGN: We conducted an inventory of all assessment tools from a sample of general surgery training programs participating in a multi-center study of resident operative development in the United States. Each instrument was categorized using a data extraction tool designed to identify criteria for effective assessment in competency based education and according to which Milestone Competency was being evaluated. Tabulations of each category were then analyzed using descriptive statistics. Interviews with program directors and assessment coordinators were conducted to understand each instrument's intended use within each program.
SETTING: Multi-institutional review of general surgery assessment programs.
PARTICIPANTS: We identified assessment tools used by 10 general surgery programs during the 2019 to 2020 academic year. Programs were selected from a cohort already participating in a separate research study of resident operative development in the United States.
RESULTS: We identified 42 unique assessment tools used. Each program used an average of 7.2 (range 4-13) unique assessment instruments to measure performance, of which only 5 (11.9%) were used by at least 1 other program in our sample. Of all assessments, 59.5% were used monthly or less frequently. The majority (66.7%) of instruments were retrospective global assessments, rather than discrete observed performances. There were 4 (9.5%) instruments with established reliability or validity evidence. Across programs there was also significant variation in the volume of assessment used to evaluate residents, with the median total number of evaluations/trainee across all Milestone Competencies being 217 (IQR 78) per year. Patient care was the most frequently evaluated Milestone Competency.
CONCLUSIONS: General surgical assessment systems predominantly employ non-standardized global assessment tools that lack reliability or validity evidence. This variability makes it challenging to interpret and compare competency standards across programs. A standardized assessment toolkit with established reliability and validity evidence would allow training programs to measure the competence of their trainees more uniformly and understand where improvements in our training system can be made.

Original language: English (US)
Pages (from-to): e189-e195
Journal: Journal of Surgical Education
Issue number: 6
Early online date: Sep 27 2021
State: Published - Nov 1 2021

Bibliographical note

Funding Information:
Funding source: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Publisher Copyright:
© 2021 Association of Program Directors in Surgery


  • Assessment
  • Competency measures
  • Milestones
  • Surgical education

PubMed: MeSH publication types

  • Journal Article
  • Multicenter Study

