Abstract
In this study, we solve a class of multiple testing problems under a Bayesian sequential decision framework. Our work is motivated by binary labeling tasks in crowdsourcing, where a requestor needs to simultaneously choose a worker to provide a label and decide when to stop collecting labels, under a certain budget constraint. We begin by using a binary hypothesis testing problem to determine the true label of a single object, and provide an optimal solution by casting it under an adaptive sequential probability ratio test framework. Then, we characterize the structure of the optimal solution, that is, the optimal adaptive sequential design, which minimizes the Bayes risk using a log-likelihood ratio statistic. We also develop a dynamic programming algorithm to efficiently compute the optimal solution. For the multiple testing problem, we propose an empirical Bayes approach for estimating the class priors, and show that the average loss of our method converges to the minimal Bayes risk under the true model. Experiments on both simulated and real data show the robustness of our method, as well as its superiority over existing methods in terms of its labeling accuracy.
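As a point of reference for the sequential probability ratio test mentioned in the abstract, the following is a minimal sketch of the classic, non-adaptive Wald test for the label of a single object. It is not the paper's optimal adaptive design or its dynamic programming algorithm. The function name `sprt_label`, the single known worker `accuracy`, and the error targets `alpha` and `beta` are all illustrative assumptions, and the stopping boundaries are Wald's standard approximations rather than Bayes-optimal boundaries.

```python
import math


def sprt_label(labels, accuracy=0.7, prior=0.5, alpha=0.05, beta=0.05):
    """Classic Wald SPRT for one object's binary label (illustrative only).

    labels   : iterable of 0/1 worker responses, consumed one at a time
    accuracy : assumed probability that a worker reports the true label
    prior    : assumed prior probability that the true label is 1
    alpha    : target probability of deciding 1 when the truth is 0
    beta     : target probability of deciding 0 when the truth is 1
    Returns (decision, number_of_labels_used, final_log_odds).
    """
    # Wald's approximate stopping boundaries on the log-odds scale.
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))

    # Each response shifts the log-likelihood ratio by +/- this amount.
    step = math.log(accuracy / (1 - accuracy))

    # Start from the prior log-odds (zero when prior = 0.5).
    log_odds = math.log(prior / (1 - prior))

    n = 0
    for n, y in enumerate(labels, start=1):
        log_odds += step if y == 1 else -step
        if log_odds >= upper:
            return 1, n, log_odds
        if log_odds <= lower:
            return 0, n, log_odds

    # Labels ran out (for example, the budget was exhausted):
    # fall back to the sign of the current log-odds.
    return (1 if log_odds > 0 else 0), n, log_odds


decision, used, _ = sprt_label([1, 1, 1, 1, 1])
print(decision, used)
```

With the default settings, four agreeing responses are enough to cross a boundary, so the fifth label in the usage example is never requested. In the adaptive setting the abstract describes, the choice of worker would also vary from step to step, which this sketch does not model.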
Original language | English (US) |
---|---|
Pages (from-to) | 519-546 |
Number of pages | 28 |
Journal | Statistica Sinica |
Volume | 31 |
Issue number | 1 |
State | Published - Jan 2021 |
Bibliographical note
Funding Information: The authors thank the editors and two referees for their constructive comments. Xiaoou Li’s research was partially supported by NSF grant DMS-1712657. Yunxiao Chen’s research was partially supported by the NAEd/Spencer postdoctoral fellowship. Xi Chen’s research was partially supported by NSF grant IIS-1845444 and a Bloomberg Data Science Research Grant. Jingchen Liu’s research was partially supported by NSF grants IIS-1633360 and SES-1826540. Zhiliang Ying’s research was partially supported by NSF grants IIS-1633360 and SES-1826540, and NIH grant R01GM047845.
Publisher Copyright:
© 2021 Institute of Statistical Science. All rights reserved.
Keywords
- Bayesian decision theory
- Crowdsourcing
- Empirical Bayes
- Sequential analysis
- Sequential probability ratio test