Significance Tests of Feature Relevance for a Black-Box Learner

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


An exciting recent development is the uptake of deep neural networks in many scientific fields, where the main objective is outcome prediction with a black-box model. Significance testing is a promising way to address the black-box issue and to explore novel scientific insights and interpretations of the decision-making process of a deep learning model. However, testing for a neural network is challenging because of its black-box nature and the unknown limiting distributions of its parameter estimates, while existing methods require strong assumptions or excessive computation. In this article, we derive one-split and two-split tests that relax the assumptions and reduce the computational complexity of existing black-box tests, and that extend to examining the significance of a collection of features of interest in a dataset of a possibly complex type, such as images. The one-split test estimates and evaluates a black-box model on estimation and inference subsets obtained through sample splitting, together with data perturbation. The two-split test further splits the inference subset into two but requires no perturbation. We also develop combined versions of both tests by aggregating the $p$-values from repeated sample splitting. By deflating the bias-sd-ratio, we establish the asymptotic null distributions of the test statistics and their consistency in terms of Type II error. Numerically, we demonstrate the utility of the proposed tests on seven simulated examples and six real datasets. Accompanying this article is our Python library dnn-inference, which implements the proposed tests.
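The one-split idea described in the abstract can be illustrated with a minimal sketch. This is not the dnn-inference API and not the authors' exact procedure: the function names, the zero-masking of the features under test, the least-squares stand-in for the black-box learner, the perturbation scale `perturb_sd`, and the Bonferroni-style combination rule are all assumptions made for illustration only.

```python
import math
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with intercept (a stand-in for any black-box learner)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]


def one_split_pvalue(X, y, features, est_frac=0.5, perturb_sd=0.01, seed=0):
    """Hypothetical one-split-style test: are `features` relevant for predicting y?"""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_est = int(est_frac * len(y))
    est, inf = idx[:n_est], idx[n_est:]  # estimation / inference subsets

    # Assumed masking choice: zero out the features under test.
    X_mask = X.copy()
    X_mask[:, features] = 0.0

    # Fit both models on the estimation subset only.
    full = fit_linear(X[est], y[est])
    masked = fit_linear(X_mask[est], y[est])

    # Evaluate per-sample squared-error losses on the inference subset.
    loss_full = (y[inf] - predict_linear(full, X[inf])) ** 2
    loss_mask = (y[inf] - predict_linear(masked, X_mask[inf])) ** 2

    # Small Gaussian perturbation keeps the variance of the differences
    # away from zero when the two models nearly coincide.
    diff = loss_full - loss_mask + perturb_sd * rng.standard_normal(len(inf))
    stat = math.sqrt(len(inf)) * diff.mean() / diff.std(ddof=1)

    # One-sided normal p-value: small when the full model has lower loss.
    return 0.5 * math.erfc(-stat / math.sqrt(2))


def combine_pvalues_bonferroni(pvals):
    """Conservative combination over repeated splits (an assumed rule)."""
    return min(1.0, len(pvals) * min(pvals))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((400, 3))
    y = 2.0 * X[:, 0] + rng.standard_normal(400)  # only feature 0 matters
    pvals = [one_split_pvalue(X, y, [0], seed=s) for s in range(5)]
    print("combined p-value for feature 0:", combine_pvalues_bonferroni(pvals))
```

In this sketch both models are fitted only on the estimation subset and compared only on the inference subset, so the comparison is not contaminated by the fitting step. A two-split variant would instead evaluate the two models on disjoint halves of the inference subset and drop the perturbation term.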

Original language: English (US)
Pages (from-to): 1-14
Number of pages: 14
Journal: IEEE Transactions on Neural Networks and Learning Systems
State: Accepted/In press - 2022

  • Adaptive splitting
  • Cathode ray tubes
  • Computational modeling
  • Deep learning
  • Estimation
  • Hafnium
  • Standards
  • Testing
  • black-box tests
  • combining
  • computational constraints
  • feature relevance

PubMed: MeSH publication types

  • Journal Article

