The Problem of Overfitting

Douglas M. Hawkins

Research output: Contribution to journal › Review article › peer-review

2044 Scopus citations

Abstract

The problem of overfitting in models fitted to quantitative measurements is discussed. Two types of overfitting can be distinguished: using a model that is more flexible than it needs to be, and using a model that includes irrelevant components or predictors. Adding predictors that perform no useful function means that any future use of the regression to make predictions will require measuring and recording those predictors so that their values can be substituted into the model. Adding irrelevant predictors can also make predictions worse, because the coefficients fitted to them add random variation to the subsequent predictions.
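
The last point of the abstract can be illustrated with a small simulation. The sketch below is not taken from the article: the sample sizes, the true coefficient of 2.0, the unit noise variance, and the helper names (solve, fit_ols, prediction_mse, simulate) are all assumptions chosen for illustration. It fits ordinary least squares twice on the same training data, once with only the relevant predictor and once with ten additional pure-noise predictors, and compares the average squared prediction error on fresh data.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def prediction_mse(beta, X, y):
    """Mean squared error of the linear predictions X beta against y."""
    preds = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)

def simulate(n_train=30, n_test=200, n_irrelevant=10, n_reps=200, seed=0):
    """Average out-of-sample MSE for the minimal and the padded model.

    All settings are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)

    def draw(n):
        # Column 0: intercept. Column 1: the one relevant predictor.
        # Remaining columns: noise predictors unrelated to y.
        X = [[1.0] + [rng.gauss(0, 1) for _ in range(1 + n_irrelevant)]
             for _ in range(n)]
        y = [2.0 * row[1] + rng.gauss(0, 1) for row in X]
        return X, y

    total_small = total_big = 0.0
    for _ in range(n_reps):
        X_tr, y_tr = draw(n_train)
        X_te, y_te = draw(n_test)
        small_tr = [row[:2] for row in X_tr]  # intercept + relevant predictor
        small_te = [row[:2] for row in X_te]
        total_small += prediction_mse(fit_ols(small_tr, y_tr), small_te, y_te)
        total_big += prediction_mse(fit_ols(X_tr, y_tr), X_te, y_te)
    return total_small / n_reps, total_big / n_reps

mse_small, mse_big = simulate()
print(f"relevant predictor only:    {mse_small:.3f}")
print(f"with irrelevant predictors: {mse_big:.3f}")
```

Under these assumptions the model padded with noise predictors should show the larger average prediction error, since each extra fitted coefficient is itself a noisy estimate of zero. The size of the gap depends on the ratio of predictors to training cases, so the numbers printed here should not be read as general.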

Original language: English (US)
Pages (from-to): 1-12
Number of pages: 12
Journal: Journal of Chemical Information and Computer Sciences
Volume: 44
Issue number: 1
State: Published - Jan 2004

Bibliographical note

Copyright:
Copyright 2012 Elsevier B.V., All rights reserved.
