Abstract
An exciting recent development is the uptake of deep neural networks in many scientific fields, where the main objective is outcome prediction with a black-box nature. Significance testing is promising to address the black-box issue and explore novel scientific insights and interpretations of the decision-making process based on a deep learning model. However, testing for a neural network poses a challenge because of its black-box nature and the unknown limiting distributions of parameter estimates, while existing methods require strong assumptions or excessive computation. In this article, we derive one-split and two-split tests that relax the assumptions and computational complexity of existing black-box tests and extend to examining the significance of a collection of features of interest in a dataset of a possibly complex type, such as an image. The one-split test estimates and evaluates a black-box model based on estimation and inference subsets through sample splitting and data perturbation. The two-split test further splits the inference subset into two but requires no perturbation. We also develop their combined versions by aggregating the $p$-values based on repeated sample splitting. By deflating the bias-sd-ratio, we establish asymptotic null distributions of the test statistics and their consistency in terms of Type II error. Numerically, we demonstrate the utility of the proposed tests on seven simulated examples and six real datasets. Accompanying this article is our Python library dnn-inference (https://dnn-inference.readthedocs.io/en/latest/) that implements the proposed tests.
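The two-split idea described in the abstract can be illustrated with a minimal sketch. This is not the dnn-inference API and not the authors' exact procedure. It assumes a least-squares fit as a stand-in for the black-box learner, zero-masking as the way to remove the features of interest, squared-error loss, and a one-sided normal approximation for the test statistic. The names `fit_linear` and `two_split_pvalue` are hypothetical.

```python
import math
import numpy as np


def fit_linear(X, y):
    """Least-squares fit, used here only as a stand-in for a black-box learner."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_split_pvalue(X, y, features, rng, est_frac=0.5):
    """Sketch of a two-split test for the relevance of `features` (assumed form).

    The data are split into an estimation subset and an inference subset;
    the inference subset is split again into two halves of equal size.
    """
    n = X.shape[0]
    perm = rng.permutation(n)
    n_est = int(est_frac * n)
    est, inf = perm[:n_est], perm[n_est:]
    m = len(inf) // 2
    inf_a, inf_b = inf[:m], inf[m:2 * m]

    # Masked copy of the data: the features under test are set to zero (assumption).
    X_masked = X.copy()
    X_masked[:, features] = 0.0

    # Fit both models on the estimation subset only.
    full = fit_linear(X[est], y[est])
    masked = fit_linear(X_masked[est], y[est])

    # Evaluate each model on a different half of the inference subset.
    loss_full = (y[inf_a] - X[inf_a] @ full) ** 2
    loss_masked = (y[inf_b] - X_masked[inf_b] @ masked) ** 2

    # Standardized mean loss difference; strongly negative values suggest
    # that the full model predicts better, i.e. the features are relevant.
    diff = loss_full - loss_masked
    stat = math.sqrt(m) * diff.mean() / diff.std(ddof=1)

    # One-sided p-value from the standard normal CDF.
    return 0.5 * (1.0 + math.erf(stat / math.sqrt(2.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 5))
    y = 2.0 * X[:, 0] + rng.normal(size=2000)  # only feature 0 matters
    print("p-value, feature 0:", two_split_pvalue(X, y, [0], rng))
    print("p-value, feature 4:", two_split_pvalue(X, y, [4], rng))
```

Because the two losses are computed on disjoint halves, the differences do not degenerate to zero when the masked features are irrelevant, which is consistent with the abstract's remark that the two-split test needs no perturbation. Repeating the call with different random splits would yield several $p$-values that could then be aggregated, as the abstract describes for the combined versions.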
Original language | English (US) |
---|---|
Pages (from-to) | 1-14 |
Number of pages | 14 |
Journal | IEEE Transactions on Neural Networks and Learning Systems |
Volume | PP |
DOIs | |
State | Accepted/In press - 2022 |
Bibliographical note
Publisher Copyright: Author
Keywords
- Adaptive splitting
- Cathode ray tubes
- Computational modeling
- Deep learning
- Estimation
- Hafnium
- Standards
- Testing
- black-box tests
- combining
- computational constraints
- feature relevance
PubMed: MeSH publication types
- Journal Article