
Evaluating fairness and generalizability in models predicting on-time graduation from college applications

  • Stephen Hutt
  • Margo Gardner
  • Angela L. Duckworth
  • Sidney K. D'Mello

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We explore generalizability and fairness across sociodemographic groups for predicting on-time college graduation using a national dataset of 41,359 college applications. Our features include sociodemographics, institutional graduation rates, academic achievement, standardized test scores, engagement in extracurricular activities, and work experiences. We identify five latent classes based on available sociodemographic data and train Random Forest classifiers to predict 4-year graduation. Models trained and tested individually on each class, using a split-half validation method, achieved AUROCs between 0.629 and 0.694. We then evaluate how a model trained on the entire dataset performs on each latent class by performing a slicing analysis, finding a 6 to 10 percent improvement in AUROCs compared to the individual-class models. We explore the fairness of our model by extending the slicing analysis to consider Absolute Between ROC Area (ABROCA), finding similar values for each of our latent classes. We discuss how our results might be used to avoid perpetuating biases inherent in college application data.
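
The slicing analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (roc_points, slice_auroc, abroca), the group labels "A" and "B", and the toy data are all hypothetical, and it assumes only NumPy. Scores from a single model are split by group, AUROC is computed within each slice, and ABROCA is approximated as the area between two groups' ROC curves on a shared false-positive-rate grid.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return (fpr, tpr) for a binary ROC curve, starting at (0, 0).
    Assumes both classes (0 and 1) are present in y_true."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="mergesort")  # descending by score
    y_sorted, s_sorted = y_true[order], scores[order]
    # Keep the last index of each run of tied scores (one threshold per run).
    cut = np.r_[np.nonzero(np.diff(s_sorted))[0], len(s_sorted) - 1]
    tps = np.cumsum(y_sorted)[cut]
    fps = (cut + 1) - tps
    return np.r_[0.0, fps / fps[-1]], np.r_[0.0, tps / tps[-1]]

def trapezoid(y, x):
    """Trapezoidal-rule integral of y over x."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

def slice_auroc(y_true, scores, groups):
    """AUROC of one model's scores, computed separately within each group."""
    y_true, scores, groups = map(np.asarray, (y_true, scores, groups))
    result = {}
    for g in np.unique(groups):
        fpr, tpr = roc_points(y_true[groups == g], scores[groups == g])
        result[str(g)] = trapezoid(tpr, fpr)
    return result

def abroca(y_true, scores, groups, baseline, comparison, n_grid=1001):
    """Approximate area between the ROC curves of two groups.
    Both curves are linearly interpolated onto a common FPR grid,
    which is an approximation at vertical steps of the curve."""
    y_true, scores, groups = map(np.asarray, (y_true, scores, groups))
    grid = np.linspace(0.0, 1.0, n_grid)
    curves = []
    for g in (baseline, comparison):
        fpr, tpr = roc_points(y_true[groups == g], scores[groups == g])
        curves.append(np.interp(grid, fpr, tpr))
    return trapezoid(np.abs(curves[0] - curves[1]), grid)

# Toy data (hypothetical): group "A" is ranked perfectly, group "B" is inverted.
y = np.array([0, 0, 1, 1, 0, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.8, 0.9, 0.9, 0.8, 0.2, 0.1])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

per_group = slice_auroc(y, scores, groups)
gap = abroca(y, scores, groups, baseline="A", comparison="B")
print(per_group, gap)
```

Unlike a simple difference in AUROC, the area between the curves does not let regions where one group's curve is higher cancel against regions where it is lower, so two groups can have equal AUROCs and still show a nonzero ABROCA.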

Original language: English (US)
Title of host publication: EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining
Editors: Collin F. Lynch, Agathe Merceron, Michel Desmarais, Roger Nkambou
Publisher: International Educational Data Mining Society
Pages: 79-88
Number of pages: 10
ISBN (Electronic): 9781733673600
State: Published - 2019
Externally published: Yes
Event: 12th International Conference on Educational Data Mining, EDM 2019 - Montreal, Canada
Duration: Jul 2 2019 to Jul 5 2019

Publication series

Name: EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining

Conference

Conference: 12th International Conference on Educational Data Mining, EDM 2019
Country/Territory: Canada
City: Montreal
Period: 7/2/19 to 7/5/19

Bibliographical note

Publisher Copyright:
© EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining. All rights reserved.

Keywords

  • College applications
  • College success
  • Common App
  • Fairness
  • Generalizability
  • National student clearinghouse
  • Slicing analysis
