Sparse Minimum Discrepancy Approach to Sufficient Dimension Reduction with Simultaneous Variable Selection in Ultrahigh Dimension

Wei Qian, Shanshan Ding, R. D. Cook

Research output: Contribution to journal › Article

Abstract

Sufficient dimension reduction (SDR) is known to be a powerful tool for achieving data reduction and data visualization in regression and classification problems. In this work, we study ultrahigh-dimensional SDR problems and propose solutions under a unified minimum discrepancy approach with regularization. When p grows exponentially with n, consistency results in both central subspace estimation and variable selection are established simultaneously for important SDR methods, including sliced inverse regression (SIR), principal fitted component (PFC), and sliced average variance estimation (SAVE). Special sparse structures of large predictor or error covariance are also considered for potentially better performance. In addition, the proposed approach is equipped with a new algorithm to efficiently solve the regularized objective functions and a new data-driven procedure to determine structural dimension and tuning parameters, without the need to invert a large covariance matrix. Simulations and a real data analysis are offered to demonstrate the promise of our proposal in ultrahigh-dimensional settings. Supplementary materials for this article are available online.
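For intuition about the SDR methods the abstract names, the sketch below implements classical (unregularized) sliced inverse regression. It is not the paper's sparse minimum discrepancy estimator — it assumes n > p and inverts the sample covariance, precisely what the paper's procedure avoids — but it shows the slicing-and-eigendecomposition idea that SIR, SAVE, and PFC build on. All function and variable names here are illustrative.

```python
import numpy as np

def sir_directions(X, y, n_slices=5, d=1):
    """Classical sliced inverse regression: estimate d directions
    spanning (an estimate of) the central subspace."""
    n, p = X.shape
    # Standardize predictors: Z = (X - mean) @ Sigma^{-1/2}
    Xc = X - X.mean(axis=0)
    Sigma = Xc.T @ Xc / n
    evals, evecs = np.linalg.eigh(Sigma)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    # Slice the response by its order and average Z within each slice
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    # Weighted covariance of the slice means: M = sum_h p_h m_h m_h^T
    M = np.zeros((p, p))
    for idx in slices:
        m_h = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m_h, m_h)
    # Top-d eigenvectors of M, mapped back to the original X scale
    _, vecs = np.linalg.eigh(M)
    return inv_sqrt @ vecs[:, -d:]
```

On a toy single-index model y = x'beta + noise, the leading estimated direction aligns (up to sign) with beta. The paper's contribution is to recover such directions, with variable selection, when p grows exponentially with n and Sigma cannot be inverted as above.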

Original language: English (US)
Pages (from-to): 1277-1290
Number of pages: 14
Journal: Journal of the American Statistical Association
Volume: 114
Issue number: 527
DOI: 10.1080/01621459.2018.1497498
State: Published - Jul 3 2019


Keywords

  • Central subspace
  • Inverse regression
  • Principal fitted component
  • Sliced average variance estimation
  • Sliced inverse regression
  • Sparsity

Cite this

Sparse Minimum Discrepancy Approach to Sufficient Dimension Reduction with Simultaneous Variable Selection in Ultrahigh Dimension. / Qian, Wei; Ding, Shanshan; Cook, R. D.

In: Journal of the American Statistical Association, Vol. 114, No. 527, 03.07.2019, p. 1277-1290.

Research output: Contribution to journal › Article

@article{a2b31994238e46e1a22c8a015803ff17,
title = "Sparse Minimum Discrepancy Approach to Sufficient Dimension Reduction with Simultaneous Variable Selection in Ultrahigh Dimension",
abstract = "Sufficient dimension reduction (SDR) is known to be a powerful tool for achieving data reduction and data visualization in regression and classification problems. In this work, we study ultrahigh-dimensional SDR problems and propose solutions under a unified minimum discrepancy approach with regularization. When p grows exponentially with n, consistency results in both central subspace estimation and variable selection are established simultaneously for important SDR methods, including sliced inverse regression (SIR), principal fitted component (PFC), and sliced average variance estimation (SAVE). Special sparse structures of large predictor or error covariance are also considered for potentially better performance. In addition, the proposed approach is equipped with a new algorithm to efficiently solve the regularized objective functions and a new data-driven procedure to determine structural dimension and tuning parameters, without the need to invert a large covariance matrix. Simulations and a real data analysis are offered to demonstrate the promise of our proposal in ultrahigh-dimensional settings. Supplementary materials for this article are available online.",
keywords = "Central subspace, Inverse regression, Principal fitted component, Sliced average variance estimation, Sliced inverse regression, Sparsity",
author = "Wei Qian and Shanshan Ding and Cook, {R. D.}",
year = "2019",
month = "7",
day = "3",
doi = "10.1080/01621459.2018.1497498",
language = "English (US)",
volume = "114",
pages = "1277--1290",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "527",

}

TY - JOUR

T1 - Sparse Minimum Discrepancy Approach to Sufficient Dimension Reduction with Simultaneous Variable Selection in Ultrahigh Dimension

AU - Qian, Wei

AU - Ding, Shanshan

AU - Cook, R. D.

PY - 2019/7/3

Y1 - 2019/7/3

N2 - Sufficient dimension reduction (SDR) is known to be a powerful tool for achieving data reduction and data visualization in regression and classification problems. In this work, we study ultrahigh-dimensional SDR problems and propose solutions under a unified minimum discrepancy approach with regularization. When p grows exponentially with n, consistency results in both central subspace estimation and variable selection are established simultaneously for important SDR methods, including sliced inverse regression (SIR), principal fitted component (PFC), and sliced average variance estimation (SAVE). Special sparse structures of large predictor or error covariance are also considered for potentially better performance. In addition, the proposed approach is equipped with a new algorithm to efficiently solve the regularized objective functions and a new data-driven procedure to determine structural dimension and tuning parameters, without the need to invert a large covariance matrix. Simulations and a real data analysis are offered to demonstrate the promise of our proposal in ultrahigh-dimensional settings. Supplementary materials for this article are available online.

AB - Sufficient dimension reduction (SDR) is known to be a powerful tool for achieving data reduction and data visualization in regression and classification problems. In this work, we study ultrahigh-dimensional SDR problems and propose solutions under a unified minimum discrepancy approach with regularization. When p grows exponentially with n, consistency results in both central subspace estimation and variable selection are established simultaneously for important SDR methods, including sliced inverse regression (SIR), principal fitted component (PFC), and sliced average variance estimation (SAVE). Special sparse structures of large predictor or error covariance are also considered for potentially better performance. In addition, the proposed approach is equipped with a new algorithm to efficiently solve the regularized objective functions and a new data-driven procedure to determine structural dimension and tuning parameters, without the need to invert a large covariance matrix. Simulations and a real data analysis are offered to demonstrate the promise of our proposal in ultrahigh-dimensional settings. Supplementary materials for this article are available online.

KW - Central subspace

KW - Inverse regression

KW - Principal fitted component

KW - Sliced average variance estimation

KW - Sliced inverse regression

KW - Sparsity

UR - http://www.scopus.com/inward/record.url?scp=85055709008&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055709008&partnerID=8YFLogxK

U2 - 10.1080/01621459.2018.1497498

DO - 10.1080/01621459.2018.1497498

M3 - Article

AN - SCOPUS:85055709008

VL - 114

SP - 1277

EP - 1290

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

IS - 527

ER -