Abstract
We explore generalizability and fairness across sociodemographic groups when predicting on-time college graduation from a national dataset of 41,359 college applications. Our features include sociodemographics, institutional graduation rates, academic achievement, standardized test scores, engagement in extracurricular activities, and work experiences. We identify five latent classes based on available sociodemographic data and train Random Forest classifiers to predict 4-year graduation. When trained and tested individually on each class using split-half validation, the models achieve AUROCs between 0.629 and 0.694. We then evaluate how a model trained on the entire dataset performs on each latent class through a slicing analysis, finding a 6 to 10 percent improvement in AUROC compared to the individual-class models. We explore the fairness of our model by extending the slicing analysis to consider Absolute Between-ROC Area (ABROCA), finding similar values for each of our latent classes. We discuss how our results might be used to avoid perpetuating biases inherent in college application data.
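The slicing analysis and the ABROCA comparison mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the function names (`roc_points`, `auroc`, `tpr_at`, `abroca`, `slice_auroc`) are hypothetical, labels are assumed to be 0/1 integers, and ABROCA is approximated numerically as the area between two groups' ROC curves over a grid of false-positive rates.

```python
from bisect import bisect_right


def roc_points(labels, scores):
    """Return the ROC curve as (fpr, tpr) points; tied scores share one point."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("each slice needs at least one positive and one negative")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every example that shares this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def tpr_at(points, fpr):
    """TPR at a given FPR: upper envelope at vertical jumps, linear in between."""
    xs = [x for x, _ in points]
    j = bisect_right(xs, fpr) - 1  # last point whose FPR is <= fpr
    x0, y0 = points[j]
    if j + 1 == len(points):
        return y0
    x1, y1 = points[j + 1]
    return y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)


def abroca(points_a, points_b, steps=1000):
    """Approximate the absolute area between two ROC curves on an FPR grid."""
    gaps = [abs(tpr_at(points_a, k / steps) - tpr_at(points_b, k / steps))
            for k in range(steps + 1)]
    return sum((gaps[k] + gaps[k + 1]) / 2 for k in range(steps)) / steps


def slice_auroc(labels, scores, groups):
    """Slicing analysis: AUROC of one model's scores within each group."""
    by_group = {}
    for y, s, g in zip(labels, scores, groups):
        ys, ss = by_group.setdefault(g, ([], []))
        ys.append(y)
        ss.append(s)
    return {g: auroc(roc_points(ys, ss)) for g, (ys, ss) in by_group.items()}
```

In this sketch, `slice_auroc` would be applied to the predictions of a single model trained on all data, with `groups` holding each applicant's latent class, and `abroca` would compare the ROC curve of one class against that of a chosen reference class. A value near zero indicates that the model discriminates similarly for both groups across all thresholds.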
| Original language | English (US) |
|---|---|
| Title of host publication | EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining |
| Editors | Collin F. Lynch, Agathe Merceron, Michel Desmarais, Roger Nkambou |
| Publisher | International Educational Data Mining Society |
| Pages | 79-88 |
| Number of pages | 10 |
| ISBN (Electronic) | 9781733673600 |
| State | Published - 2019 |
| Externally published | Yes |
| Event | 12th International Conference on Educational Data Mining, EDM 2019 - Montreal, Canada. Duration: Jul 2 2019 → Jul 5 2019 |
Publication series
| Name | EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining |
|---|---|
Conference
| Conference | 12th International Conference on Educational Data Mining, EDM 2019 |
|---|---|
| Country/Territory | Canada |
| City | Montreal |
| Period | 7/2/19 → 7/5/19 |
Bibliographical note
Publisher Copyright: © EDM 2019 - Proceedings of the 12th International Conference on Educational Data Mining. All rights reserved.
Keywords
- College applications
- College success
- Common App
- Fairness
- Generalizability
- National student clearinghouse
- Slicing analysis
Fingerprint
Dive into the research topics of 'Evaluating fairness and generalizability in models predicting on-time graduation from college applications'. Together they form a unique fingerprint.