Crowdsourcing approaches rely on the collective effort of multiple individuals to solve problems that require the analysis of large data sets in a timely and accurate manner. The inexperience of participants, or annotators, motivates robust techniques. Focusing on clustering setups, the data provided by all annotators is modeled here as a mixture of Gaussian components plus a uniformly distributed random variable that captures outliers. The proposed algorithm is based on the expectation-maximization (EM) algorithm and jointly performs three tasks: it makes soft assignments of data to clusters, rates annotators according to their performance, and estimates the number of Gaussian components in the non-Gaussian/Gaussian mixture model.
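The model described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes a plain EM fit of K Gaussian components plus one uniform outlier component whose support is the bounding box of the data, and it picks K by minimizing the Bayesian Information Criterion. The function names (`fit_em`, `bic`, `select_k`, `annotator_scores`) are hypothetical, and the annotator score used here (mean inlier responsibility of an annotator's points) is only a stand-in for the paper's rating rule, which the abstract does not specify.

```python
import numpy as np

def gaussian_logpdf(X, mean, cov):
    """Log-density of a multivariate Gaussian, evaluated at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)            # whitened residuals, shape (d, n)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2 * np.pi) + logdet + np.sum(z**2, axis=0))

def fit_em(X, K, n_iter=200, reg=1e-6, seed=0):
    """EM for K Gaussians plus one uniform outlier component (last column)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    # Assumption: the uniform density is constant over the data bounding box.
    log_unif = -np.sum(np.log(X.max(axis=0) - X.min(axis=0)))
    means = X[rng.choice(n, size=K, replace=False)].copy()
    cov0 = np.cov(X, rowvar=False).reshape(d, d) + reg * np.eye(d)
    covs = np.repeat(cov0[None], K, axis=0)
    weights = np.full(K + 1, 1.0 / (K + 1))

    def e_step():
        logp = np.empty((n, K + 1))
        for k in range(K):
            logp[:, k] = np.log(weights[k]) + gaussian_logpdf(X, means[k], covs[k])
        logp[:, K] = np.log(weights[K]) + log_unif
        m = logp.max(axis=1, keepdims=True)          # log-sum-exp for stability
        log_norm = m + np.log(np.exp(logp - m).sum(axis=1, keepdims=True))
        return np.exp(logp - log_norm), float(log_norm.sum())

    for _ in range(n_iter):
        resp, _ = e_step()                           # soft assignments
        Nk = resp.sum(axis=0) + 1e-12
        weights = Nk / n                             # M-step: mixing weights
        for k in range(K):
            means[k] = resp[:, k] @ X / Nk[k]
            diff = X - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(d)
    resp, loglik = e_step()                          # final pass with fitted parameters
    return weights, means, covs, resp, loglik

def bic(loglik, n, d, K):
    """BIC; counts K means, K covariances, and K free mixing weights."""
    n_params = K * (d + d * (d + 1) // 2) + K
    return -2.0 * loglik + n_params * np.log(n)

def select_k(X, candidates=(1, 2, 3, 4)):
    """Return the candidate K with the lowest BIC, plus all BIC values."""
    n, d = X.shape
    scores = {K: bic(fit_em(X, K)[-1], n, d, K) for K in candidates}
    return min(scores, key=scores.get), scores

def annotator_scores(resp, annotator_ids):
    """Proxy rating (an assumption): mean inlier responsibility per annotator."""
    inlier = 1.0 - resp[:, -1]
    return {a.item(): float(inlier[annotator_ids == a].mean())
            for a in np.unique(annotator_ids)}
```

A typical call would stack all annotators' points into one array `X`, keep a parallel array of annotator identifiers, run `select_k(X)` to choose the number of components, and then pass the responsibilities returned by `fit_em` to `annotator_scores`. A single random initialization is used for brevity; EM is sensitive to initialization, so several restarts would be a reasonable extension.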
|Original language||English (US)|
|Title of host publication||2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||5|
|State||Published - Jun 16 2017|
|Event||2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States|
Duration: Mar 5 2017 → Mar 9 2017
|Name||ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
Bibliographical note
Funding Information:
This work has been funded by the Ministerio de Economia y Competitividad of the Spanish Government, ERDF funds (TEC2013-41315-R, TEC2015-69648-REDC, TEC2016-75067-C4-2-R, TEC2013-47020-C2-1-R, TACTICA), the Catalan Government (2014 SGR 60 AGAUR), and the Galician Government (AtlantTIC, GRC2013/009, R2014/037).
© 2017 IEEE.
- Bayesian Information Criterion
- EM algorithm
- Gaussian plus non-Gaussian Mixture