Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information

Sandra E. Safo, Shuzhao Li, Qi Long

Research output: Contribution to journal, Article, peer-review

7 Scopus citations

Abstract

Integrative analysis of high-dimensional omics data is becoming increasingly popular. At the same time, incorporating known functional relationships among variables into the analysis of omics data has been shown to help elucidate the mechanisms underlying complex diseases. In this article, our goal is to assess the association between transcriptomic and metabolomic data from a Predictive Health Institute (PHI) study that includes healthy adults at high risk of developing cardiovascular diseases. Adopting a strategy that is both data-driven and knowledge-based, we develop statistical methods for sparse canonical correlation analysis (CCA) that incorporate known biological information. Our proposed methods use prior network structural information among genes and among metabolites to guide the selection of relevant genes and metabolites in sparse CCA, providing insight into the molecular underpinnings of cardiovascular disease. Our simulations demonstrate that the structured sparse CCA methods outperform several existing sparse CCA methods in selecting relevant genes and metabolites when the structural information is informative, and that they are robust to misspecified structural information. Our analysis of the PHI study reveals that a number of gene and metabolic pathways, including some known to be associated with cardiovascular diseases, are enriched in the set of genes and metabolites selected by our proposed approach.
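For readers unfamiliar with sparse CCA, the following is a minimal sketch of the generic idea only: alternating soft-thresholded updates of two weight vectors on the cross-covariance matrix, in the style of a penalized matrix decomposition. It is not the structured method proposed in the article, and it does not use any network information. The function names (`soft_threshold`, `unit_norm`, `sparse_cca`), the thresholding rule (a fraction of the largest absolute entry), and the simulated "gene" and "metabolite" matrices are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, lam):
    """Shrink every entry of v toward zero by lam (lasso-type operator)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def unit_norm(v):
    """Scale v to unit Euclidean length; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def sparse_cca(X, Y, frac_x=0.5, frac_y=0.5, n_iter=50):
    """First pair of sparse canonical vectors (illustrative sketch).

    frac_x and frac_y lie in [0, 1): the threshold is that fraction of the
    largest absolute entry of the current update, so larger values give
    sparser weight vectors. Within-set covariances are treated as identity,
    a common simplification when features far outnumber samples.
    """
    n = X.shape[0]
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = Xs.T @ Ys / (n - 1)  # p x q sample cross-covariance

    # Start from the leading singular vectors of the cross-covariance.
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    a, b = U[:, 0], Vt[0]

    for _ in range(n_iter):
        u = C @ b
        a = unit_norm(soft_threshold(u, frac_x * np.abs(u).max()))
        v = C.T @ a
        b = unit_norm(soft_threshold(v, frac_y * np.abs(v).max()))
    return a, b


# Simulated data (an assumption, not the PHI data): one latent factor drives
# the first 5 "genes" and the first 4 "metabolites"; the rest is pure noise.
rng = np.random.default_rng(0)
n, p, q = 80, 60, 40
z = rng.normal(size=n)
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))
X[:, :5] = z[:, None] + 0.5 * rng.normal(size=(n, 5))
Y[:, :4] = z[:, None] + 0.5 * rng.normal(size=(n, 4))

a, b = sparse_cca(X, Y)
rho = np.corrcoef(X @ a, Y @ b)[0, 1]
print("selected genes:", np.flatnonzero(a))
print("selected metabolites:", np.flatnonzero(b))
print("canonical correlation:", round(float(rho), 3))
```

A structured variant of the kind the abstract describes would additionally bring prior network information among genes and among metabolites into the selection step; how the article does so is not reproduced here.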

Original language: English (US)
Pages (from-to): 300-312
Number of pages: 13
Journal: Biometrics
Volume: 74
Issue number: 1
State: Published - Mar 2018
Externally published: Yes

Keywords

  • Biological information
  • Canonical correlation analysis
  • High dimension
  • Integrative analysis
  • Low sample size
  • Sparsity
  • Structural information
