A Composite Likelihood Approach to Latent Multivariate Gaussian Modeling of SNP Data with Application to Genetic Association Testing

Fang Han, Wei Pan

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Many statistical tests have been proposed for case-control data to detect disease association with multiple single nucleotide polymorphisms (SNPs) in linkage disequilibrium. The main reason for the existence of so many tests is that each test aims to detect one or two aspects of many possible distributional differences between cases and controls, largely due to the lack of a general and yet simple model for discrete genotype data. Here we propose a latent variable model to represent SNP data: the observed SNP data are assumed to be obtained by discretizing a latent multivariate Gaussian variate. Because the latent variate is multivariate Gaussian, its distribution is completely characterized by its mean vector and covariance matrix, in contrast to much more complex forms of a general distribution for discrete multivariate SNP data. We propose a composite likelihood approach for parameter estimation. A direct application of this latent variable model is to association testing with multiple SNPs in a candidate gene or region. In contrast to many existing tests that aim to detect only one or two aspects of many possible distributional differences of discrete SNP data, we can exclusively focus on testing the mean and covariance parameters of the latent Gaussian distributions for cases and controls. Our simulation results demonstrate potential power gains of the proposed approach over some existing methods.
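The discretization idea in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the latent variate is cut into genotypes, so the 0/1/2 minor-allele-count coding, the two per-SNP thresholds, the example mean and covariance values, and the function name `simulate_genotypes` are all hypothetical.

```python
import numpy as np

def simulate_genotypes(mean, cov, thresholds, n, seed=0):
    """Draw n latent Gaussian vectors and discretize each coordinate to 0/1/2.

    Assumption (not from the paper): each SNP has a lower and an upper
    threshold, and its genotype is the number of thresholds the latent
    value exceeds.
    """
    rng = np.random.default_rng(seed)
    latent = rng.multivariate_normal(mean, cov, size=n)  # shape (n, p)
    lower, upper = thresholds  # each of shape (p,), with lower < upper
    return (latent > lower).astype(int) + (latent > upper).astype(int)

# Hypothetical parameters for three SNPs in linkage disequilibrium.
mean = np.zeros(3)
cov = np.array([[1.0, 0.6, 0.3],
                [0.6, 1.0, 0.6],
                [0.3, 0.6, 1.0]])
thresholds = (np.full(3, -0.5), np.full(3, 1.0))

genotypes = simulate_genotypes(mean, cov, thresholds, n=1000)
```

In this sketch, correlation between latent coordinates carries over to correlation between the discrete genotypes, which is how a mean vector and covariance matrix can stand in for the more complex joint distribution of the observed SNP data.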

Original language: English (US)
Pages (from-to): 307-315
Number of pages: 9
Journal: Biometrics
Volume: 68
Issue number: 1
State: Published - Mar 2012

Keywords

  • GWAS
  • Genome-wide association study
  • Latent model
  • Logistic regression
  • Multimarker analysis
  • Multivariate discrete distribution
