Lung cancer risk prediction: Prostate, lung, colorectal and ovarian cancer screening trial models and validation

C. Martin Tammemagi, Paul F. Pinsky, Neil E. Caporaso, Paul A. Kvale, William G. Hocking, Timothy R. Church, Thomas L. Riley, John Commins, Martin M. Oken, Christine D. Berg, Philip C. Prorok

Research output: Contribution to journal › Article › peer-review

232 Scopus citations


Introduction: Identification of individuals at high risk for lung cancer should be of value to individuals, patients, clinicians, and researchers. Existing prediction models have only modest capabilities to classify persons at risk accurately.

Methods: Prospective data from 70962 control subjects in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) were used in models for the general population (model 1) and for a subcohort of ever-smokers (N = 38254) (model 2). Both models included age, socioeconomic status (education), body mass index, family history of lung cancer, chronic obstructive pulmonary disease, recent chest X-ray, smoking status (never, former, or current), pack-years smoked, and smoking duration. Model 2 also included smoking quit-time (time in years since ever-smokers permanently quit smoking). External validation was performed with 44223 PLCO intervention arm participants who completed a supplemental questionnaire and were subsequently followed. Known available risk factors were included in logistic regression models. Bootstrap optimism-corrected estimates of predictive performance were calculated (internal validation). Nonlinear relationships for age, pack-years smoked, smoking duration, and quit-time were modeled using restricted cubic splines. All reported P values are two-sided.

Results: During follow-up (median 9.2 years) of the control arm subjects, 1040 lung cancers occurred. During follow-up of the external validation sample (median 3.0 years), 213 lung cancers occurred. For models 1 and 2, bootstrap optimism-corrected receiver operator characteristic area under the curves were 0.857 and 0.805, and calibration slopes (model-predicted probabilities vs observed probabilities) were 0.987 and 0.979, respectively. In the external validation sample, models 1 and 2 had area under the curves of 0.841 and 0.784, respectively. These models had high discrimination in women, men, whites, and nonwhites.

Conclusion: The PLCO lung cancer risk models demonstrate high discrimination and calibration.
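The bootstrap optimism correction mentioned in the Methods can be illustrated with a minimal sketch. This is not the authors' code: it assumes two synthetic predictors with linear terms only (no restricted cubic splines), a hand-written Newton-Raphson logistic fit with a tiny ridge term for numerical stability, and hypothetical function names (`simulate`, `fit_logistic`, `optimism_corrected_auc`). The optimism is estimated as the average, over bootstrap resamples, of the AUC of the resample-fitted model on its own resample minus its AUC on the original data.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(a, b):
    # Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_logistic(X, y, iters=12, ridge=1e-6):
    # Newton-Raphson fit; an intercept column is prepended to X.
    # The tiny ridge term (an assumption of this sketch) keeps the Hessian invertible.
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [-ridge * b for b in beta]
        hess = [[ridge if i == j else 0.0 for j in range(k)] for i in range(k)]
        for r, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, r)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * r[i]
                for j in range(k):
                    hess[i][j] += w * r[i] * r[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

def linear_predictor(beta, X):
    # Risk score on the log-odds scale for each row of X.
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def auc(scores, labels):
    # Probability that a random case outscores a random non-case (ties count 0.5).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def optimism_corrected_auc(X, y, n_boot=30, seed=0):
    rng = random.Random(seed)
    n = len(y)
    apparent = auc(linear_predictor(fit_logistic(X, y), X), y)
    optimism = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        beta_b = fit_logistic(Xb, yb)
        # AUC on the resample minus AUC of the same model on the original data.
        optimism += auc(linear_predictor(beta_b, Xb), yb) - auc(linear_predictor(beta_b, X), y)
    return apparent, apparent - optimism / n_boot

def simulate(n=300, seed=1):
    # Synthetic data only; the coefficients are arbitrary, not taken from the study.
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(-1.5 + 1.0 * a + 0.5 * b) else 0 for a, b in X]
    return X, y

X, y = simulate()
apparent, corrected = optimism_corrected_auc(X, y)
print(f"apparent AUC = {apparent:.3f}, optimism-corrected AUC = {corrected:.3f}")
```

With only two predictors the estimated optimism is expected to be small; it typically grows as more terms (for example, spline bases) are added relative to the number of events.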

Original language: English (US)
Pages (from-to): 1058-1068
Number of pages: 11
Journal: Journal of the National Cancer Institute
Issue number: 13
State: Published - Jul 6 2011

Bibliographical note

Funding Information:
This research was supported by contracts from the Division of Cancer Prevention, National Cancer Institute, National Institutes of Health, Department of Health and Human Services. The funders did not have any involvement in the design of this ancillary study; the collection, analysis, and interpretation of the data; the writing of the manuscript; or the decision to submit the manuscript for publication.

