TY - JOUR
T1 - Routine Laboratory Blood Tests Predict SARS-CoV-2 Infection Using Machine Learning
AU - Yang, He S.
AU - Hou, Yu
AU - Vasovic, Ljiljana V.
AU - Steel, Peter A.D.
AU - Chadburn, Amy
AU - Racine-Brzostek, Sabrina E.
AU - Velu, Priya
AU - Cushing, Melissa M.
AU - Loda, Massimo
AU - Kaushal, Rainu
AU - Zhao, Zhen
AU - Wang, Fei
N1 - Publisher Copyright: © 2020 American Association for Clinical Chemistry. All rights reserved.
PY - 2020/11/1
Y1 - 2020/11/1
N2 - Background: Accurate diagnostic strategies to rapidly identify SARS-CoV-2 positive individuals are urgently needed for the management of patient care and the protection of health care personnel. The predominant diagnostic test is viral RNA detection by RT-PCR from nasopharyngeal swab specimens; however, the results are not promptly obtainable in all patient care locations. Routine laboratory testing, in contrast, is readily available, with a turnaround time (TAT) usually within 1-2 hours. Method: We developed a machine learning model incorporating patient demographic features (age, sex, race) with 27 routine laboratory tests to predict an individual's SARS-CoV-2 infection status. Laboratory test results obtained within 2 days before the release of the SARS-CoV-2 RT-PCR result were used to train a gradient boosting decision tree (GBDT) model on 3,356 SARS-CoV-2 RT-PCR tested patients (1,402 positive and 1,954 negative) evaluated at a metropolitan hospital. Results: The model achieved an area under the receiver operating characteristic curve (AUC) of 0.854 (95% CI: 0.829-0.878). Application of this model to an independent patient dataset from a separate hospital resulted in a comparable AUC (0.838), validating its generalizability. Moreover, our model predicted initial SARS-CoV-2 RT-PCR positivity in 66% of individuals whose RT-PCR result changed from negative to positive within 2 days. Conclusion: This model, which employs routine laboratory test results, offers opportunities for early and rapid identification of high-risk SARS-CoV-2 infected patients before their RT-PCR results are available. It may play an important role in assisting the identification of SARS-CoV-2 infected patients in areas where RT-PCR testing is not accessible due to financial or supply constraints.
KW - COVID-19
KW - SARS-CoV-2
KW - gradient boosted decision tree
KW - machine learning
KW - routine laboratory tests
UR - http://www.scopus.com/inward/record.url?scp=85094981447&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85094981447&partnerID=8YFLogxK
U2 - 10.1093/clinchem/hvaa200
DO - 10.1093/clinchem/hvaa200
M3 - Article
C2 - 32821907
AN - SCOPUS:85094981447
SN - 0009-9147
VL - 66
SP - 1396
EP - 1404
JO - Clinical Chemistry
JF - Clinical Chemistry
IS - 11
ER -