Abstract
This paper introduces a novel Perturbation-Assisted Inference (PAI) framework that uses synthetic data generated by the Perturbation-Assisted Sample Synthesis (PASS) method. The framework targets uncertainty quantification in complex data scenarios, particularly those involving unstructured data and deep learning models. PASS employs a generative model to create synthetic data that closely mirrors the raw data while preserving its rank properties through data perturbation, thereby enhancing data diversity and bolstering privacy. By incorporating knowledge transfer from large pre-trained generative models, PASS improves estimation accuracy and yields refined distributional estimates of various statistics via Monte Carlo experiments. PAI, in turn, offers statistically guaranteed validity. In pivotal inference, it enables precise conclusions even without prior knowledge of the distribution of the pivotal. In non-pivotal situations, we enhance the reliability of synthetic data generation by training the generator on an independent holdout sample. We demonstrate the effectiveness of PAI in advancing uncertainty quantification in complex, data-driven tasks by applying it to diverse areas such as image synthesis, sentiment word analysis, multimodal inference, and the construction of prediction intervals.
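The two ideas named in the abstract, rank-preserving synthesis and Monte Carlo estimation of a statistic's distribution, can be illustrated with a toy one-dimensional sketch. This is not the authors' implementation: the Gaussian `generator` is a stand-in for a learned generative model, and the names `rank_matched`, `mc_distribution`, and `t_stat` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def rank_matched(raw, draws):
    """Reorder `draws` so that its ranks agree with the ranks of `raw`."""
    raw = np.asarray(raw)
    draws = np.asarray(draws)
    ranks = np.argsort(np.argsort(raw))  # rank of each raw value, 0..n-1
    return np.sort(draws)[ranks]

def mc_distribution(generator, statistic, n, n_rep, rng):
    """Monte Carlo draws of `statistic` over synthetic samples of size n."""
    return np.array([statistic(generator(n, rng)) for _ in range(n_rep)])

def t_stat(x):
    # Studentized mean: a pivotal quantity under a normal location model.
    return np.sqrt(len(x)) * x.mean() / x.std(ddof=1)

def generator(n, rng):
    # Assumption: a standard normal sampler stands in for a trained
    # generative model (e.g. a diffusion model or normalizing flow).
    return rng.normal(0.0, 1.0, size=n)

rng = np.random.default_rng(0)
raw = rng.normal(loc=0.3, scale=1.0, size=50)  # toy "raw" data

# Synthetic sample whose ordering matches the raw sample's ordering.
synthetic = rank_matched(raw, generator(len(raw), rng))

# Estimate the statistic's distribution from synthetic samples alone,
# then compare the observed statistic against the estimated quantiles.
null_draws = mc_distribution(generator, t_stat, len(raw), 2000, rng)
lo, hi = np.quantile(null_draws, [0.025, 0.975])
reject = not (lo <= t_stat(raw) <= hi)
```

The point of the sketch is that the reference distribution of the statistic is never specified analytically; it is approximated entirely from repeated synthetic samples, which is the role the abstract assigns to the Monte Carlo experiments.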
Original language | English (US) |
---|---|
Pages (from-to) | 1-12 |
Number of pages | 12 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
State | Accepted/In press - 2024 |
Bibliographical note
Publisher Copyright: IEEE
Keywords
- Data models
- Diffusion
- High-dimensionality
- Large pre-trained Models
- Monte Carlo methods
- Multimodality
- Normalizing Flows
- Perturbation methods
- Synthetic data
- Task analysis
- Testing
- Uncertainty
- Uncertainty Quantification
PubMed: MeSH publication types
- Journal Article