Resampling-based tests for Lasso in genome-wide association studies

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Background: Genome-wide association studies seek to detect associations between millions of genetic variants and a trait, typically using univariate regression to test each variant against the phenotype one at a time. Alternatively, Lasso penalized regression allows one to jointly model the relationship between all genetic variants and the phenotype. However, it is unclear how best to conduct inference on the individual Lasso coefficients, especially in high-dimensional settings.

Methods: We consider six methods for testing the Lasso coefficients: two permutation approaches (Lasso-Ayers, Lasso-PL) and one analytic approach (Lasso-AL) that select the penalty parameter to control the type-1 error, a residual bootstrap (Lasso-RB), a modified residual bootstrap (Lasso-MRB), and a permutation test (Lasso-PT). The methods are compared via simulations and an application to the Minnesota Center for Twins and Family Study.

Results: We show that, for finite sample sizes with an increasing number of null predictors, Lasso-RB, Lasso-MRB, and Lasso-PT fail to be viable methods of inference. However, Lasso-PL and Lasso-AL remain fast and powerful tools for conducting inference with the Lasso, even in high dimensions.

Conclusion: Our results suggest that the proposed permutation selection procedure (Lasso-PL) and the analytic selection method (Lasso-AL) are fast and powerful alternatives to the standard univariate analysis in genome-wide association studies.
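The abstract names permutation-based selection of the penalty parameter without describing the mechanics. The sketch below shows one generic way such a selection can work and is not necessarily the procedure used in the paper: permute the phenotype to break any association, record the smallest penalty at which the Lasso selects nothing, and take an upper quantile of those values. The function names `lambda_max` and `permutation_penalty`, the quantile rule, and the simulated data are all assumptions made for illustration. The penalty scale assumes the objective (1/(2n))·||y − b0 − Xb||² + λ·||b||₁ with an unpenalized intercept.

```python
import math
import random
import statistics


def lambda_max(X, y):
    """Smallest penalty at which all Lasso coefficients are exactly zero.

    Assumes the objective (1/(2n))*||y - b0 - X b||^2 + lam*||b||_1 with an
    unpenalized intercept, for which this equals max_j |x_j . (y - mean(y))| / n.
    X is a list of n rows, each a list of p predictor values.
    """
    n = len(y)
    y_bar = statistics.fmean(y)
    resid = [yi - y_bar for yi in y]
    p = len(X[0])
    return max(
        abs(sum(X[i][j] * resid[i] for i in range(n))) / n for j in range(p)
    )


def permutation_penalty(X, y, alpha=0.05, n_perm=200, seed=0):
    """Hypothetical helper: empirical (1 - alpha) quantile of lambda_max
    over phenotype permutations.

    Under the global null, a Lasso fitted at the returned penalty selects
    at least one predictor with probability of roughly alpha.
    """
    rng = random.Random(seed)
    y_perm = list(y)
    null_lams = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks any link between predictors and trait
        null_lams.append(lambda_max(X, y_perm))
    null_lams.sort()
    k = min(n_perm - 1, max(0, math.ceil((1 - alpha) * n_perm) - 1))
    return null_lams[k]


# Illustrative data only: 50 individuals, 20 variants coded 0/1/2, null trait.
_rng = random.Random(42)
X_demo = [[_rng.choice([0, 1, 2]) for _ in range(20)] for _ in range(50)]
y_demo = [_rng.gauss(0.0, 1.0) for _ in range(50)]
print("selected penalty:", permutation_penalty(X_demo, y_demo, n_perm=100))
```

A Lasso fitted at the returned penalty with any solver that uses the same objective scaling would then flag as associated the variants whose coefficients are nonzero.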

Original language: English (US)
Article number: 70
Journal: BMC Genetics
Volume: 18
Issue number: 1
DOI: 10.1186/s12863-017-0533-3
State: Published - Jul 24 2017

Keywords

  • Bootstrap
  • GWAS
  • Lasso
  • Permutation
  • Resampling
  • Testing

Cite this

Resampling-based tests for Lasso in genome-wide association studies. / Arbet, Jaron; McGue, Matt; Chatterjee, Snigdhansu B; Basu, Saonli.

In: BMC Genetics, Vol. 18, No. 1, 70, 24.07.2017.

@article{1798f941629840f392b8a0e5be5a251e,
title = "Resampling-based tests for Lasso in genome-wide association studies",
keywords = "Bootstrap, GWAS, Lasso, Permutation, Resampling, Testing",
author = "Arbet, Jaron and McGue, Matt and Chatterjee, {Snigdhansu B} and Basu, Saonli",
year = "2017",
month = "7",
day = "24",
doi = "10.1186/s12863-017-0533-3",
language = "English (US)",
volume = "18",
journal = "BMC Genetics",
issn = "1471-2156",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Resampling-based tests for Lasso in genome-wide association studies

AU - Arbet, Jaron

AU - McGue, Matt

AU - Chatterjee, Snigdhansu B

AU - Basu, Saonli

PY - 2017/7/24

Y1 - 2017/7/24

AB - Background: Genome-wide association studies seek to detect associations between millions of genetic variants and a trait, typically using univariate regression to test each variant against the phenotype one at a time. Alternatively, Lasso penalized regression allows one to jointly model the relationship between all genetic variants and the phenotype. However, it is unclear how best to conduct inference on the individual Lasso coefficients, especially in high-dimensional settings. Methods: We consider six methods for testing the Lasso coefficients: two permutation approaches (Lasso-Ayers, Lasso-PL) and one analytic approach (Lasso-AL) that select the penalty parameter to control the type-1 error, a residual bootstrap (Lasso-RB), a modified residual bootstrap (Lasso-MRB), and a permutation test (Lasso-PT). The methods are compared via simulations and an application to the Minnesota Center for Twins and Family Study. Results: We show that, for finite sample sizes with an increasing number of null predictors, Lasso-RB, Lasso-MRB, and Lasso-PT fail to be viable methods of inference. However, Lasso-PL and Lasso-AL remain fast and powerful tools for conducting inference with the Lasso, even in high dimensions. Conclusion: Our results suggest that the proposed permutation selection procedure (Lasso-PL) and the analytic selection method (Lasso-AL) are fast and powerful alternatives to the standard univariate analysis in genome-wide association studies.

KW - Bootstrap

KW - GWAS

KW - Lasso

KW - Permutation

KW - Resampling

KW - Testing

UR - http://www.scopus.com/inward/record.url?scp=85025455575&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85025455575&partnerID=8YFLogxK

U2 - 10.1186/s12863-017-0533-3

DO - 10.1186/s12863-017-0533-3

M3 - Article

C2 - 28738830

AN - SCOPUS:85025455575

VL - 18

JO - BMC Genetics

JF - BMC Genetics

SN - 1471-2156

IS - 1

M1 - 70

ER -