Optimal stopping and worker selection in crowdsourcing: An adaptive sequential probability ratio test framework

Xiaoou Li, Yunxiao Chen, Xi Chen, Jingchen Liu, Zhiliang Ying

Research output: Contribution to journal › Article › peer-review

Abstract

In this study, we solve a class of multiple testing problems under a Bayesian sequential decision framework. Our work is motivated by binary labeling tasks in crowdsourcing, where a requestor needs to simultaneously choose a worker to provide a label and decide when to stop collecting labels, under a certain budget constraint. We begin by using a binary hypothesis testing problem to determine the true label of a single object, and provide an optimal solution by casting it under an adaptive sequential probability ratio test framework. Then, we characterize the structure of the optimal solution, that is, the optimal adaptive sequential design, which minimizes the Bayes risk using a log-likelihood ratio statistic. We also develop a dynamic programming algorithm to efficiently compute the optimal solution. For the multiple testing problem, we propose an empirical Bayes approach for estimating the class priors, and show that the average loss of our method converges to the minimal Bayes risk under the true model. Experiments on both simulated and real data show the robustness of our method, as well as its superiority over existing methods in terms of its labeling accuracy.
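For readers unfamiliar with sequential testing, the single-object stopping idea can be illustrated with a minimal sketch. This is not the adaptive procedure of the paper: it assumes every worker has the same known accuracy, ignores worker selection and the budget constraint, and simply stops once the posterior probability of one label reaches a chosen confidence level. The function name `sprt_label` and the parameters `accuracy`, `prior`, and `error_rate` are hypothetical, introduced only for illustration.

```python
import math


def sprt_label(labels, accuracy, prior=0.5, error_rate=0.05):
    """Non-adaptive sequential test for the binary label of one object.

    labels     : iterable of worker answers, each 0 or 1
    accuracy   : assumed probability that a worker reports the true label
                 (same for all workers, strictly between 0.5 and 1)
    prior      : assumed prior probability that the true label is 1
    error_rate : stop once one label has posterior probability >= 1 - error_rate

    Returns (decision, n_used); decision is None if labels run out first.
    """
    # Thresholds on the posterior log-odds of label 1 versus label 0.
    upper = math.log((1 - error_rate) / error_rate)
    lower = -upper
    # Each answer shifts the log-odds by the log-likelihood ratio of that answer.
    step = math.log(accuracy / (1 - accuracy))
    # Start from the prior log-odds.
    log_odds = math.log(prior / (1 - prior))
    n = 0
    for n, y in enumerate(labels, start=1):
        log_odds += step if y == 1 else -step
        if log_odds >= upper:
            return 1, n
        if log_odds <= lower:
            return 0, n
    return None, n


if __name__ == "__main__":
    # Three agreeing answers from workers assumed to be 80% accurate.
    print(sprt_label([1, 1, 1], accuracy=0.8))
```

With an assumed accuracy of 0.8 and the default 0.05 error rate, two agreeing answers are not enough to cross the threshold, while three are; a disagreeing answer cancels one agreeing answer, which delays stopping.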

Original language: English (US)
Pages (from-to): 519-546
Number of pages: 28
Journal: Statistica Sinica
Volume: 31
Issue number: 1
State: Published - Jan 2021

Bibliographical note

Funding Information:
The authors thank the editors and two referees for their constructive comments. Xiaoou Li’s research was partially supported by NSF grant DMS-1712657. Yunxiao Chen’s research was partially supported by the NAEd/Spencer postdoctoral fellowship. Xi Chen’s research was partially supported by NSF grant IIS-1845444 and a Bloomberg Data Science Research Grant. Jingchen Liu’s research was partially supported by NSF grants IIS-1633360 and SES-1826540. Zhiliang Ying’s research was partially supported by NSF grants IIS-1633360 and SES-1826540, and NIH grant R01GM047845.

Publisher Copyright:
© 2021 Institute of Statistical Science. All rights reserved.

Keywords

  • Bayesian decision theory
  • Crowdsourcing
  • Empirical Bayes
  • Sequential analysis
  • Sequential probability ratio test
