Abstract
Crowdsourcing approaches rely on collecting input from multiple individuals to solve problems that require timely and accurate analysis of large data sets. The inexperience of participants, or annotators, motivates robust techniques. Focusing on clustering setups, the data provided by all annotators is modeled here as a mixture of Gaussian components plus a uniformly distributed random variable that captures outliers. The proposed algorithm builds on the expectation-maximization (EM) algorithm and jointly performs soft assignment of data to clusters, rates annotators according to their performance, and estimates the number of Gaussian components in the Gaussian plus non-Gaussian mixture model.
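To make the modeling idea concrete, the following is a minimal sketch of EM for a one-dimensional mixture of Gaussian components plus one uniform outlier component, with the Bayesian Information Criterion (BIC) used to compare candidate numbers of components. It is not the paper's algorithm: the annotator-rating step is omitted, all annotations are assumed to be pooled into a single array, the uniform support is assumed fixed to the observed data range, and the function names (`fit_gauss_plus_uniform`, `_densities`) and the quantile-based initialization are illustrative assumptions.

```python
import numpy as np


def _densities(x, mu, var, u_dens):
    """Per-point densities: one column per Gaussian, last column uniform."""
    k = mu.size
    dens = np.empty((x.size, k + 1))
    diff = x[:, None] - mu                      # shape (n, k)
    dens[:, :k] = np.exp(-0.5 * diff ** 2 / var) / np.sqrt(2 * np.pi * var)
    dens[:, k] = u_dens                         # constant outlier density
    return dens


def fit_gauss_plus_uniform(x, k, n_iter=200):
    """EM for k Gaussians plus one uniform component (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Assumption: the uniform component spans the observed data range.
    u_dens = 1.0 / (x.max() - x.min())
    # Assumption: deterministic initialization from evenly spaced quantiles.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var())
    w = np.full(k + 1, 1.0 / (k + 1))           # last weight: outlier component

    for _ in range(n_iter):
        # E-step: soft assignments (responsibilities) of points to components.
        joint = _densities(x, mu, var, u_dens) * w
        resp = joint / joint.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0)
        w = nk / n
        g = np.maximum(nk[:k], 1e-12)           # guard against empty clusters
        mu = (resp[:, :k] * x[:, None]).sum(axis=0) / g
        var = (resp[:, :k] * (x[:, None] - mu) ** 2).sum(axis=0) / g + 1e-6

    # Log-likelihood under the final parameters, then BIC.
    loglik = np.log((_densities(x, mu, var, u_dens) * w).sum(axis=1)).sum()
    n_params = 3 * k                            # k means, k variances, k free weights
    bic = n_params * np.log(n) - 2.0 * loglik
    return {"weights": w, "means": mu, "vars": var, "resp": resp, "bic": bic}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.concatenate([
        rng.normal(-5.0, 1.0, 200),             # cluster 1
        rng.normal(5.0, 1.0, 200),              # cluster 2
        rng.uniform(-20.0, 20.0, 40),           # outliers
    ])
    # Pick the number of Gaussian components with the lowest BIC.
    fits = {k: fit_gauss_plus_uniform(data, k) for k in (1, 2, 3)}
    best_k = min(fits, key=lambda k: fits[k]["bic"])
    print(best_k, np.sort(fits[best_k]["means"]))
```

The last column of `resp` gives each point's posterior probability of being an outlier, which is the role the uniform component plays in the model described above.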
Original language | English (US) |
---|---|
Title of host publication | 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 4014-4018 |
Number of pages | 5 |
ISBN (Electronic) | 9781509041176 |
State | Published - Jun 16 2017 |
Event | 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States. Duration: Mar 5 2017 → Mar 9 2017 |
Publication series
Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
---|---|
ISSN (Print) | 1520-6149 |
Other
Other | 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 |
---|---|
Country/Territory | United States |
City | New Orleans |
Period | 3/5/17 → 3/9/17 |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Keywords
- Bayesian Information Criterion
- Crowdsourcing
- EM algorithm
- Gaussian plus non-Gaussian Mixture
- Outlier