Causality-guided feature selection

Alzheimer’s Disease Neuroimaging Initiative

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Identifying meaningful features that drive a phenomenon (response) of interest in complex systems of interconnected factors is a challenging problem. Causal discovery methods have previously been applied to estimate bounds on the causal strengths of factors on a response or to identify meaningful interactions between factors in complex systems, but these approaches have been used only for inferential purposes. In contrast, we posit that interactions between factors with a potential causal association with a given response could be viable candidates not only for hypothesis generation but also for predictive modeling. In this work, we propose a causality-guided feature selection methodology that identifies factors having a potential cause-effect relationship in complex systems, and selects features by clustering them based on their causal strength with respect to the response. To this end, we estimate statistically significant causal effects on the response of factors taking part in potential causal relationships, while addressing associated technical challenges, such as multicollinearity in the data. We validate the proposed methodology for predicting the response in five real-world datasets from the domains of climate science and biology. The selected features show predictive skill and consistent performance across different domains.
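The general idea in the abstract (score each factor by the strength of its effect on the response, then cluster factors by that strength and keep the strong cluster) can be illustrated with a small sketch. This is not the authors' method: the abstract does not specify the estimator or the clustering algorithm. As loudly stated assumptions, the sketch uses ridge-regression coefficients on standardized features as a crude stand-in for causal-effect estimates (the ridge penalty is one common way to cope with multicollinearity) and a one-dimensional 2-means split on the absolute coefficients. All function names and the synthetic data are hypothetical.

```python
import random

def standardize(col):
    """Center a column and scale it to unit (population) standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def ridge_effects(columns, y, lam=1.0):
    """Assumption: ridge coefficients serve as proxy effect strengths."""
    Z = [standardize(c) for c in columns]
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    p = len(Z)
    # Normal equations (Z'Z + lam * I) beta = Z'y
    A = [[sum(a * b for a, b in zip(Z[i], Z[j])) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    rhs = [sum(a * b for a, b in zip(Z[i], yc)) for i in range(p)]
    return solve(A, rhs)

def strong_cluster(strengths, iters=20):
    """1-D 2-means on |strength|; return indices of the stronger cluster."""
    s = [abs(v) for v in strengths]
    lo, hi = min(s), max(s)
    strong = []
    for _ in range(iters):
        strong = [i for i, v in enumerate(s) if abs(v - hi) < abs(v - lo)]
        weak = [i for i in range(len(s)) if i not in strong]
        if not strong or not weak:
            break
        lo = sum(s[i] for i in weak) / len(weak)
        hi = sum(s[i] for i in strong) / len(strong)
    return strong

# Synthetic data (hypothetical): x0 and x1 drive y, x2 is merely
# correlated with x0, and x3 and x4 are pure noise.
rng = random.Random(0)
n = 200
x0 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [a + rng.gauss(0, 0.5) for a in x0]
x3 = [rng.gauss(0, 1) for _ in range(n)]
x4 = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * a + 1.5 * b + rng.gauss(0, 0.1) for a, b in zip(x0, x1)]

effects = ridge_effects([x0, x1, x2, x3, x4], y)
selected = strong_cluster(effects)
print("proxy effect strengths:", [round(e, 3) for e in effects])
print("selected feature indices:", selected)
```

In this toy setup the correlated-but-inert feature x2 receives a small coefficient because the regression adjusts for x0, so the clustering step separates the two true drivers from the rest. A regression coefficient is only a causal effect under strong assumptions (for example, no unmeasured confounding), which is why it is labeled a proxy here.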

Original language: English (US)
Title of host publication: Advanced Data Mining and Applications - 12th International Conference, ADMA 2016, Proceedings
Editors: Jinyan Li, Xue Li, Shuliang Wang, Jianxin Li, Quan Z. Sheng
Number of pages: 15
ISBN (Print): 9783319495859
State: Published - 2016
Event: 12th International Conference on Advanced Data Mining and Applications, ADMA 2016 - Gold Coast, Australia
Duration: Dec 12 2016 to Dec 15 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10086 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 12th International Conference on Advanced Data Mining and Applications, ADMA 2016
City: Gold Coast

Bibliographical note

Funding Information:
This material is based upon work supported in part by the Laboratory for Analytic Sciences (LAS), the Department of Energy National Nuclear Security Administration under Award Number(s) DE-NA0002576 and NSF grant 1029711. Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd. and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.
PMD has received research grants and/or advisory fees from several government agencies, advocacy groups and pharmaceutical/imaging companies, and received a grant from ADNI to support data collection for this study. He also owns stock in several companies whose products are not discussed here.

Publisher Copyright:
© Springer International Publishing AG 2016.

