Simulation experiments on (the absence of) ratings bias in reputation systems

Jacob Thebault-Spieker, Daniel Kluver, Maximillian Klein, Aaron Halfaker, Brent Hecht, Loren G Terveen, Joseph A Konstan

Research output: Contribution to journal > Article > peer-review

18 Scopus citations


As the gig economy continues to grow and freelance work moves online, five-star reputation systems are becoming more and more common. At the same time, there are increasing accounts of race and gender bias in evaluations of gig workers, with negative impacts for those workers. We report on a series of four Mechanical Turk-based studies in which participants who rated simulated gig work did not show race or gender bias, while manipulation checks showed they reliably distinguished between low- and high-quality work. Given prior research, this was a striking result. To explore further, we used a Bayesian approach to verify the absence of ratings bias (as opposed to merely not detecting bias). This Bayesian test let us identify an upper bound: if any bias did exist in our studies, it was below an average of 0.2 stars on a five-star scale. We discuss possible interpretations of our results and outline future work to better understand them.
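
The idea of bounding a possible bias, rather than merely failing to detect one, can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes a large-sample normal approximation with flat priors on each group's mean rating, and the function name `prob_bias_within` and the ratings data are hypothetical, made up for illustration. It computes the posterior probability that the difference in mean ratings between two groups is smaller than 0.2 stars in absolute value.

```python
import math
from statistics import NormalDist, mean, variance


def prob_bias_within(ratings_a, ratings_b, bound=0.2):
    """Approximate posterior probability that |mean_a - mean_b| < bound.

    Assumption (not from the paper): with flat priors and large samples,
    the posterior of the difference in means is roughly normal, centered
    on the observed difference, with the usual standard error as its
    standard deviation.
    """
    diff = mean(ratings_a) - mean(ratings_b)
    se = math.sqrt(
        variance(ratings_a) / len(ratings_a)
        + variance(ratings_b) / len(ratings_b)
    )
    posterior = NormalDist(mu=diff, sigma=se)
    # Posterior mass inside the interval (-bound, +bound)
    return posterior.cdf(bound) - posterior.cdf(-bound)


# Hypothetical five-star ratings for two simulated worker groups
group_a = [4, 5, 3, 4, 4, 5, 3, 4, 4, 4] * 20
group_b = [4, 4, 3, 5, 4, 5, 3, 4, 4, 4] * 20

p = prob_bias_within(group_a, group_b)
print(f"P(|difference| < 0.2 stars) = {p:.3f}")
```

A value of `p` close to 1 supports the claim that any bias is smaller than the bound, which is a stronger statement than a non-significant difference test.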

Original language: English (US)
Article number: 101
Journal: Proceedings of the ACM on Human-Computer Interaction
Issue number: CSCW
State: Published - Nov 2017

Bibliographical note

Publisher Copyright:
© 2017 Association for Computing Machinery.


  • Bayesian statistics
  • Gender bias
  • Gig economy
  • Racial bias
  • Reputation bias
  • Reputation systems

