Novel strategy for disease risk prediction incorporating predicted gene expression and DNA methylation data: a multi-phased study of prostate cancer

Chong Wu, Jingjing Zhu, Austin King, Xiaoran Tong, Qing Lu, Jong Y. Park, Liang Wang, Guimin Gao, Hong Wen Deng, Yaohua Yang, Karen E. Knudsen, Timothy R. Rebbeck, Jirong Long, Wei Zheng, Wei Pan, David V. Conti, Christopher A. Haiman, Lang Wu

Research output: Contribution to journal, Article, peer-review

3 Scopus citations


Background: DNA methylation and gene expression are known to play important roles in the etiology of human diseases such as prostate cancer (PCa). However, it has not yet been possible to incorporate information on DNA methylation and gene expression into polygenic risk scores (PRSs). Here, we aimed to develop and validate an improved PRS for PCa risk by incorporating genetically predicted gene expression, DNA methylation, and other genomic information using an integrative method.

Methods: Using data from the PRACTICAL consortium, we derived multiple sets of genetic scores, including those based on available single-nucleotide polymorphisms through the widely used methods of pruning and thresholding, LDpred, LDpred-funct, AnnoPred, and EBPRS, as well as PRSs constructed using genetically predicted gene expression and DNA methylation through a revised pruning and thresholding strategy. In the tuning step, using UK Biobank data (1458 prevalent cases and 1467 controls), we selected the PRSs with the best performance. Using an independent set of data from the UK Biobank, we developed an integrative PRS combining information from the individual scores. Finally, in the testing step, we tested the performance of the integrative PRS in another independent set of UK Biobank data comprising incident cases and controls.

Results: Our constructed PRS had improved performance (C statistic: 76.1%) over PRSs constructed by individual benchmark methods (from 69.6% to 74.7%). Furthermore, our new PRS had much higher risk assessment power than family history. The overall net reclassification improvement was 69.0% when adding the PRS to the baseline model, compared with 12.5% when adding family history.

Conclusions: We developed and validated a new PRS that may improve the prediction of the risk of developing PCa. Our method can also be applied to other human diseases to improve risk prediction across multiple outcomes.
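For readers unfamiliar with the quantities named in the Methods and Results, the following is a minimal sketch of how a generic PRS and a C statistic can be computed. It is not the authors' pipeline: the function names, SNP weights, genotypes, and case/control labels are all hypothetical toy values, and the score here is a plain weighted sum of allele dosages rather than any of the specific methods listed above.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def c_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of case-control pairs in which the case has the higher score.

    Ties count as half a concordant pair. For a binary outcome this equals
    the area under the ROC curve.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical toy inputs, for illustration only (not data from the study).
weights = [0.10, 0.25, -0.05]          # assumed per-SNP effect sizes
genotypes = [
    [2, 1, 0],
    [0, 0, 2],
    [1, 2, 1],
    [0, 1, 1],
]
labels = [1, 0, 1, 0]                  # 1 = case, 0 = control

scores = [polygenic_score(g, weights) for g in genotypes]
print(c_statistic(scores, labels))
```

An integrative score in this spirit could be formed by feeding several such per-method scores into a second-stage model, but the exact combination used in the study is not described in the abstract.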

Original language: English (US)
Pages (from-to): 1387-1397
Number of pages: 11
Journal: Cancer Communications
Issue number: 12
Early online date: Sep 14 2021
State: Published - Dec 2021

Bibliographical note

Funding Information:
Lang Wu is supported by the University of Hawaii Cancer Center Seed Grant. Chong Wu is partially supported by NIH R03 AG070669 and the Florida State University Committee on Faculty Research Support Grant. The prostate cancer genome‐wide association analyses are supported by the Canadian Institutes of Health Research, European Commission's Seventh Framework Programme grant agreement (HEALTH‐F2‐2009‐223175), Cancer Research UK Grants (C5047/A7357, C1287/A10118, C1287/A16563, C5047/A3354, C5047/A10692, C16913/A6135), and The National Institute of Health (NIH) Cancer Post‐Cancer GWAS initiative grant (No. 1 U19 CA 148537‐01, the GAME‐ON initiative). We would also like to thank the following for funding support: The Institute of Cancer Research and The Everyman Campaign, The Prostate Cancer Research Foundation, Prostate Research Campaign UK (now PCUK), The Orchid Cancer Appeal, Rosetrees Trust, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI) UK. We are grateful for support of NIHR funding to the NIHR Biomedical Research Centre at The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust. The Prostate Cancer Program of Cancer Council Victoria also acknowledge grant support from The National Health and Medical Research Council, Australia (126402, 209057, 251533, 396414, 450104, 504700, 504702, 504715, 623204, 940394, 614296), VicHealth, Cancer Council Victoria, The Prostate Cancer Foundation of Australia, The Whitten Foundation, PricewaterhouseCoopers, and Tattersall's. EAO, DMK, and EMK acknowledge the Intramural Program of the National Human Genome Research Institute for their support. Genotyping of the OncoArray was funded by the US National Institutes of Health (NIH) [U19 CA 148537 for ELucidating Loci Involved in Prostate cancer SuscEptibility (ELLIPSE) project and X01HG007492 to the Center for Inherited Disease Research (CIDR) under contract number HHSN268201200008I] and by Cancer Research UK grant A8197/A16565.
Additional analytic support was provided by NIH NCI U01 CA188392 (PI: Schumacher). Funding for the iCOGS infrastructure came from: the European Community's Seventh Framework Programme under grant agreement n° 223175 (HEALTH‐F2‐2009‐223175) (COGS), Cancer Research UK (C1287/A10118, C1287/A 10710, C12292/A11174, C1281/A12014, C5047/A8384, C5047/A15007, C5047/A10692, C8197/A16565), the National Institutes of Health (CA128978, CA128813) and Post‐Cancer GWAS initiative (1U19 CA148537, 1U19 CA148065 and 1U19 CA148112 – the GAME‐ON initiative), the Department of Defense (W81XWH‐10‐1‐0341), the Canadian Institutes of Health Research (CIHR) for the CIHR Team in Familial Risks of Breast Cancer, Komen Foundation for the Cure, the Breast Cancer Research Foundation, and the Ovarian Cancer Research Fund. The BPC3 was supported by the U.S. National Institutes of Health, National Cancer Institute (cooperative agreements U01‐CA98233 to D.J.H., U01‐CA98710 to S.M.G., U01‐CA98216 to E.R., and U01‐CA98758 to B.E.H., and Intramural Research Program of NIH/National Cancer Institute, Division of Cancer Epidemiology and Genetics). CAPS GWAS study was supported by the Swedish Cancer Foundation (grant no 09‐0677, 11‐484, 12‐823), the Cancer Risk Prediction Center (CRisP; ), a Linneus Centre (Contract ID 70867902) financed by the Swedish Research Council, Swedish Research Council (grant no K2010‐70X‐20430‐04‐3, 2014‐2269). PEGASUS was supported by the Intramural Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health.

Publisher Copyright:
© 2021 The Authors. Cancer Communications published by John Wiley & Sons Australia, Ltd. on behalf of Sun Yat-sen University Cancer Center


  • integrative models
  • polygenic risk scores
  • predicted DNA methylation
  • predicted gene expression
  • prostate cancer
  • risk prediction

PubMed: MeSH publication types

  • Journal Article
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

