Prediction of cardiovascular outcomes with machine learning techniques

Application to the cardiovascular outcomes in renal atherosclerotic lesions (CORAL) study

Tian Chen, Pamela Brewster, Katherine R. Tuttle, Lance D. Dworkin, William Henrich, Barbara A. Greco, Michael W. Steffes, Sheldon Tobe, Kenneth Jamerson, Karol Pencina, Joseph M. Massaro, Ralph B. D’Agostino, Donald E. Cutlip, Timothy P. Murphy, Christopher J. Cooper, Joseph I. Shapiro

Research output: Contribution to journal › Article

Abstract

Background: Data from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions (CORAL) study were analyzed to test whether machine learning methods could predict the composite endpoint described in the original study.

Methods: We identified 573 CORAL subjects with complete baseline data and a recorded presence or absence of the study's composite endpoint. Several models were applied to these data to predict the composite endpoint, including a generalized linear (logistic-linear) model, a support vector machine, a decision tree, a feed-forward neural network, and a random forest. The subjects were arbitrarily divided into training and testing subsets in an 80%:20% ratio, using various seeds. Prediction models were optimized with the caret package in R.

Results: Of the machine learning techniques tested, the random forest performed best, yielding an area under the receiver operating characteristic (ROC) curve of 68.1% ± 4.2% (mean ± SD) on the testing subset across the ten different seed values used to separate the training and testing subsets. The four most important variables in the random forest were systolic blood pressure (SBP), serum creatinine, glycosylated hemoglobin, and diastolic blood pressure (DBP). Each of these variables was also important in at least some of the other methods. The treatment assignment group was not consistently an important determinant in any of the models.

Conclusion: Prediction of a composite cardiovascular outcome was difficult in the CORAL population, even with machine learning methods. Assignment to either the stenting or the best medical therapy group was not an important predictor of the composite outcome.

Clinical Trial Registration: ClinicalTrials.gov, NCT00081731.
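The evaluation protocol in the abstract (an 80%:20% split repeated over ten seeds, with the ROC area reported as mean ± SD on the testing subset) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic and are not CORAL data, the single-feature mean-difference scorer stands in for the models named above and is not the authors' random forest, and the function names, the seeds 1 to 10, and the 35% event rate are hypothetical. The study itself used the caret package in R.

```python
import random
import statistics


def split_80_20(n_subjects, seed):
    """Shuffle subject indices with a fixed seed and cut them 80%:20%."""
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)
    cut = round(0.8 * n_subjects)
    return indices[:cut], indices[cut:]


def roc_auc(labels, scores):
    """ROC area via the rank (Mann-Whitney) formulation.

    This is the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC area needs both positive and negative cases")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def make_synthetic_cohort(n_subjects, seed):
    """Synthetic stand-in data (NOT CORAL data): one weakly informative feature."""
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n_subjects):
        y = 1 if rng.random() < 0.35 else 0  # hypothetical event rate
        features.append(rng.gauss(0.5 * y, 1.0))
        labels.append(y)
    return features, labels


def fit_mean_difference_scorer(xs, ys):
    """Toy model (assumption): score = feature times the training mean difference."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    direction = statistics.mean(pos) - statistics.mean(neg)
    return lambda x: direction * x


def evaluate_over_seeds(features, labels, seeds):
    """Fit on each training subset and return one testing-subset ROC area per seed."""
    aucs = []
    for seed in seeds:
        train_idx, test_idx = split_80_20(len(labels), seed)
        scorer = fit_mean_difference_scorer(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        test_scores = [scorer(features[i]) for i in test_idx]
        test_labels = [labels[i] for i in test_idx]
        aucs.append(roc_auc(test_labels, test_scores))
    return aucs


features, labels = make_synthetic_cohort(573, seed=2019)
aucs = evaluate_over_seeds(features, labels, seeds=range(1, 11))
print(f"ROC area: {statistics.mean(aucs):.3f} ± {statistics.stdev(aucs):.3f} (mean ± SD)")
```

Reporting the spread across seeds, rather than the ROC area from a single split, shows how much the estimate depends on which subjects happen to land in the testing subset. With a testing subset of about 115 subjects, that dependence can be substantial.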

Original language: English (US)
Pages (from-to): 49-58
Number of pages: 10
Journal: International Journal of Nephrology and Renovascular Disease
Volume: 12
DOI: https://doi.org/10.2147/IJNRD.S194727
State: Published - Jan 1 2019

Fingerprint

Kidney
Seeds
Decision Trees
Glycosylated Hemoglobin A
Group Psychotherapy
Machine Learning
Linear Models
Creatinine
Logistic Models
Clinical Trials
Serum
Population

Keywords

  • Cardiovascular disease
  • Chronic kidney disease
  • Glomerular filtration rate
  • Hypertension
  • Ischemic renal disease
  • Renal artery stenosis

Cite this



Chen, T, Brewster, P, Tuttle, KR, Dworkin, LD, Henrich, W, Greco, BA, Steffes, MW, Tobe, S, Jamerson, K, Pencina, K, Massaro, JM, D’Agostino, RB, Cutlip, DE, Murphy, TP, Cooper, CJ & Shapiro, JI 2019, 'Prediction of cardiovascular outcomes with machine learning techniques: Application to the cardiovascular outcomes in renal atherosclerotic lesions (CORAL) study', International Journal of Nephrology and Renovascular Disease, vol. 12, pp. 49-58. https://doi.org/10.2147/IJNRD.S194727
@article{4dd3da28012a4525b09b4336cee7887b,
title = "Prediction of cardiovascular outcomes with machine learning techniques: Application to the cardiovascular outcomes in renal atherosclerotic lesions (CORAL) study",
abstract = "Background: Data derived from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions (CORAL) study were analyzed in an effort to employ machine learning methods to predict the composite endpoint described in the original study. Methods: We identified 573 CORAL subjects with complete baseline data and the presence or absence of a composite endpoint for the study. These data were subjected to several models including a generalized linear (logistic-linear) model, support vector machine, decision tree, feed-forward neural network, and random forest, in an effort to attempt to predict the composite endpoint. The subjects were arbitrarily divided into training and testing subsets according to an 80{\%}:20{\%} distribution with various seeds. Prediction models were optimized within the CARET package of R. Results: The best performance of the different machine learning techniques was that of the random forest method which yielded a receiver operator curve (ROC) area of 68.1{\%}±4.2{\%} (mean ± SD) on the testing subset with ten different seed values used to separate training and testing subsets. The four most important variables in the random forest method were SBP, serum creatinine, glycosylated hemoglobin, and DBP. Each of these variables was also important in at least some of the other methods. The treatment assignment group was not consistently an important determinant in any of the models. Conclusion: Prediction of a composite cardiovascular outcome was difficult in the CORAL population, even when employing machine learning methods. Assignment to either the stenting or best medical therapy group did not serve as an important predictor of composite outcome. Clinical Trial Registration: ClinicalTrials.gov, NCT00081731.",
keywords = "Cardiovascular disease, Chronic kidney disease, Glomerular filtration rate, Hypertension, Ischemic renal disease, Renal artery stenosis",
author = "Tian Chen and Pamela Brewster and Tuttle, {Katherine R.} and Dworkin, {Lance D.} and William Henrich and Greco, {Barbara A.} and Steffes, {Michael W} and Sheldon Tobe and Kenneth Jamerson and Karol Pencina and Massaro, {Joseph M.} and D’Agostino, {Ralph B.} and Cutlip, {Donald E.} and Murphy, {Timothy P.} and Cooper, {Christopher J.} and Shapiro, {Joseph I.}",
year = "2019",
month = "1",
day = "1",
doi = "10.2147/IJNRD.S194727",
language = "English (US)",
volume = "12",
pages = "49--58",
journal = "International Journal of Nephrology and Renovascular Disease",
issn = "1178-7058",
publisher = "Dove Medical Press Limited",

}
