A novel association test for multiple secondary phenotypes from a case-control GWAS

Debashree Ray, Saonli Basu

Research output: Contribution to journal › Article › peer-review

8 Scopus citations


In the past decade, many genome-wide association studies (GWASs) have been conducted to explore the association of single nucleotide polymorphisms (SNPs) with complex diseases using a case-control design. These GWASs not only collect information on the disease status (primary phenotype, D) and the SNPs (genotypes, X), but also collect extensive data on several risk factors and traits. Recent literature and grant proposals point toward a trend in reusing existing large case-control data for exploring genetic associations of some additional traits (secondary phenotypes, Y) collected during the study. These secondary phenotypes may be correlated, and a proper analysis warrants a multivariate approach. Commonly used multivariate methods are not equipped to properly account for the non-random sampling scheme. Current ad hoc practices include analyses without any adjustment, and analyses with D adjusted as a covariate. Our theoretical and empirical studies suggest that the type I error rate for testing genetic association of secondary traits can be substantially inflated when X as well as Y are associated with D, even when there is no association between X and Y in the underlying (target) population. Whether using D as a covariate helps maintain type I error depends heavily on the disease mechanism and the underlying causal structure (which is often unknown). To avoid grossly incorrect inference, we propose the proportional odds model adjusted for propensity score (POM-PS). It uses a proportional odds logistic regression of X on Y and includes the estimated conditional probability of being diseased as a covariate. We demonstrate the validity and advantage of POM-PS, and compare it with some existing methods in extensive simulation experiments mimicking plausible scenarios of dependency among Y, X, and D. Finally, we use POM-PS to jointly analyze four adiposity traits using a type 2 diabetes (T2D) case-control sample from the population-based Metabolic Syndrome in Men (METSIM) study. Only POM-PS analysis of the T2D case-control sample seems to provide valid association signals.
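
The sampling bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' POM-PS method and is not taken from the paper. It only shows, under assumed parameter values, how case-control sampling can induce an X-Y association that is absent in the population. The allele frequency (0.3), the logistic disease model with intercept -4.0 and effects 0.7 and 1.0, the 1:1 case-control ratio, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def simulate_population(n=300_000, seed=1):
    """Draw (x, y, d) records where X and Y are independent but both raise disease risk."""
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        # Genotype X: minor-allele count (0, 1, 2), assumed allele frequency 0.3.
        x = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        # Secondary phenotype Y: standard normal, generated independently of X.
        y = rng.gauss(0.0, 1.0)
        # Disease D: logistic model with assumed effects of both X and Y.
        risk = 1.0 / (1.0 + math.exp(-(-4.0 + 0.7 * x + 1.0 * y)))
        d = rng.random() < risk
        population.append((x, y, d))
    return population


def case_control_sample(population, seed=2):
    """Keep every case and an equal number of randomly chosen controls."""
    rng = random.Random(seed)
    cases = [r for r in population if r[2]]
    controls = [r for r in population if not r[2]]
    return cases + rng.sample(controls, len(cases))


def pearson(records):
    """Pearson correlation between X and Y over (x, y, d) records."""
    n = len(records)
    mean_x = sum(r[0] for r in records) / n
    mean_y = sum(r[1] for r in records) / n
    sxy = sum((r[0] - mean_x) * (r[1] - mean_y) for r in records)
    sxx = sum((r[0] - mean_x) ** 2 for r in records)
    syy = sum((r[1] - mean_y) ** 2 for r in records)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    pop = simulate_population()
    sample = case_control_sample(pop)
    print("X-Y correlation in the population:     ", round(pearson(pop), 4))
    print("X-Y correlation in case-control sample:", round(pearson(sample), 4))
```

In this setup the population correlation should be close to zero, while the pooled case-control sample should show a clearly nonzero correlation, because cases are enriched for both high X and high Y. An unadjusted test of X against Y in such a sample would therefore tend to reject the null too often.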

Original language: English (US)
Pages (from-to): 413-426
Number of pages: 14
Journal: Genetic epidemiology
Issue number: 5
State: Published - Jul 2017

Bibliographical note

Funding Information:
This research was supported by the NIH grant R01-DA033958, and the Doctoral Dissertation Fellowship of the University of Minnesota Graduate School. This work was carried out in part using computing resources at the Department of Psychology, University of Minnesota, and at the Center for Statistical Genetics, Department of Biostatistics, University of Michigan. We are grateful to Dr. Michael Boehnke and Dr. Markku Laakso for providing access to the METSIM data, and allowing us to analyze it. We thank the referees for a prompt and careful review of our work. The authors declare no conflicts of interest. R software for POM-PS can be found at https://github.com/RayDebashree/POM-PS.


Keywords

  • GWAS
  • case-control design
  • cross-phenotype association
  • joint modeling
  • multiple traits
  • multivariate analysis
  • propensity score
  • proportional odds model
  • secondary traits
  • stratification score

