Robust clustering of data collected via crowdsourcing

Alba Pages-Zamora, Georgios B. Giannakis, Roberto Lopez-Valcarce, Pere Gimenez-Febrer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citations

Abstract

Crowdsourcing approaches rely on collecting input from multiple individuals to solve problems that require analysis of large data sets in a timely and accurate manner. The inexperience of participants, or annotators, motivates robust techniques. Focusing on clustering setups, the data provided by all annotators is modeled here as a mixture of Gaussian components plus a uniformly distributed random variable that captures outliers. The proposed algorithm is based on the expectation-maximization algorithm and jointly allows for soft assignment of data to clusters, rating of annotators according to their performance, and estimation of the number of Gaussian components in the non-Gaussian/Gaussian mixture model.
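To make the model in the abstract concrete, the following is a minimal sketch of expectation-maximization for a mixture of Gaussians plus one uniform outlier component. It is not the authors' algorithm: it assumes one-dimensional data, a single annotator (so no annotator rating), and a known outlier support `[lo, hi]` that contains every sample. The function names, the quantile-based initialization, and the parameter count used in `bic` are illustrative assumptions. The Bayesian Information Criterion appears among the paper's keywords, but how the paper applies it is not stated in the abstract.

```python
import numpy as np


def _e_step(x, w, mu, var, u_density):
    """Return soft assignments (n, k + 1) and the data log-likelihood."""
    k = mu.size
    # Gaussian densities for every sample/component pair, shape (n, k).
    gauss = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    # Last column is the weighted uniform (outlier) density.
    dens = np.hstack([w[:k] * gauss, np.full((x.size, 1), w[k] * u_density)])
    total = dens.sum(axis=1, keepdims=True)
    return dens / total, float(np.log(total).sum())


def em_gaussian_plus_uniform(x, k, lo, hi, n_iter=200):
    """EM for k 1-D Gaussians plus a uniform component on [lo, hi].

    Assumption: all samples lie inside [lo, hi].
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    u_density = 1.0 / (hi - lo)
    # Illustrative initialization: spread the means over data quantiles.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, np.var(x))
    w = np.full(k + 1, 1.0 / (k + 1))  # w[k] is the outlier weight
    for _ in range(n_iter):
        resp, _ll = _e_step(x, w, mu, var, u_density)
        # M-step: closed-form updates; the uniform part only has a weight.
        nk = resp.sum(axis=0)
        w = nk / n
        mu = (resp[:, :k] * x[:, None]).sum(axis=0) / (nk[:k] + 1e-12)
        var = (resp[:, :k] * (x[:, None] - mu) ** 2).sum(axis=0) / (nk[:k] + 1e-12)
        var = var + 1e-6  # small floor to avoid a collapsing component
    resp, loglik = _e_step(x, w, mu, var, u_density)
    return w, mu, var, resp, loglik


def bic(loglik, k, n):
    """BIC with k means, k variances and k free mixing weights (assumed count)."""
    n_params = 3 * k
    return -2.0 * loglik + n_params * np.log(n)
```

Each row of `resp` is a soft assignment: the first `k` entries are cluster memberships and the last entry is the probability that the sample is an outlier. Under the assumptions above, one way to choose the number of Gaussian components is to run the fit for several candidate values of `k` and keep the one with the lowest `bic` value.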

Original language: English (US)
Title of host publication: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4014-4018
Number of pages: 5
ISBN (Electronic): 9781509041176
State: Published - Jun 16 2017
Event: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States
Duration: Mar 5 2017 to Mar 9 2017

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017
Country: United States
City: New Orleans
Period: 3/5/17 to 3/9/17

Bibliographical note

Funding Information:
This work has been funded by the Ministerio de Economia y Competitividad of the Spanish Government, ERDF funds (TEC2013-41315-R, TEC2015-69648-REDC, TEC2016-75067-C4-2-R, TEC2013-47020-C2-1-R, TACTICA), the Catalan Government (2014 SGR 60 AGAUR), and the Galician Government (AtlantTIC, GRC2013/009, R2014/037).

Publisher Copyright:
© 2017 IEEE.

Copyright:
Copyright 2017 Elsevier B.V., All rights reserved.

Keywords

  • Bayesian Information Criterion
  • Crowdsourcing
  • EM algorithm
  • Gaussian plus non-Gaussian Mixture
  • Outlier
