TY - JOUR
T1 - Deep IDA
T2 - A deep learning approach for integrative discriminant analysis of multi-omics data with feature ranking - An application to COVID-19
AU - Wang, Jiuzhou
AU - Safo, Sandra E.
N1 - Publisher Copyright:
© 2024 The Author(s).
PY - 2024
Y1 - 2024
N2 - Motivation: Many diseases are complex heterogeneous conditions that affect multiple organs in the body and depend on the interplay between several factors that include molecular and environmental factors, requiring a holistic approach to better understand disease pathobiology. Most existing methods for integrating data from multiple sources and classifying individuals into one of multiple classes or disease groups have mainly focused on linear relationships despite the complexity of these relationships. On the other hand, methods for nonlinear association and classification studies are limited in their ability to identify variables to aid in our understanding of the complexity of the disease or can be applied to only two data types. Results: We propose Deep Integrative Discriminant Analysis (IDA), a deep learning method to learn complex nonlinear transformations of two or more views such that resulting projections have maximum association and maximum separation. Further, we propose a feature ranking approach based on ensemble learning for interpretable results. We test Deep IDA on both simulated data and two large real-world datasets, including RNA sequencing, metabolomics, and proteomics data pertaining to COVID-19 severity. We identified signatures that better discriminated COVID-19 patient groups, and related to neurological conditions, cancer, and metabolic diseases, corroborating current research findings and heightening the need to study the post sequelae effects of COVID-19 to devise effective treatments and to improve patient care.
AB - Motivation: Many diseases are complex heterogeneous conditions that affect multiple organs in the body and depend on the interplay between several factors that include molecular and environmental factors, requiring a holistic approach to better understand disease pathobiology. Most existing methods for integrating data from multiple sources and classifying individuals into one of multiple classes or disease groups have mainly focused on linear relationships despite the complexity of these relationships. On the other hand, methods for nonlinear association and classification studies are limited in their ability to identify variables to aid in our understanding of the complexity of the disease or can be applied to only two data types. Results: We propose Deep Integrative Discriminant Analysis (IDA), a deep learning method to learn complex nonlinear transformations of two or more views such that resulting projections have maximum association and maximum separation. Further, we propose a feature ranking approach based on ensemble learning for interpretable results. We test Deep IDA on both simulated data and two large real-world datasets, including RNA sequencing, metabolomics, and proteomics data pertaining to COVID-19 severity. We identified signatures that better discriminated COVID-19 patient groups, and related to neurological conditions, cancer, and metabolic diseases, corroborating current research findings and heightening the need to study the post sequelae effects of COVID-19 to devise effective treatments and to improve patient care.
UR - http://www.scopus.com/inward/record.url?scp=85194152536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85194152536&partnerID=8YFLogxK
U2 - 10.1093/bioadv/vbae060
DO - 10.1093/bioadv/vbae060
M3 - Article
C2 - 39027641
AN - SCOPUS:85194152536
SN - 2635-0041
VL - 4
JO - Bioinformatics Advances
JF - Bioinformatics Advances
IS - 1
M1 - vbae060
ER -