Quantifying the gap: A case study of Wikidata gender disparities

Charles Chuankai Zhang, Loren Terveen

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

2 Scopus citations


Much prior research has found gender bias in peer production systems like Wikipedia and OpenStreetMap. This bias affects both women's participation in these platforms and content about women on these platforms. We investigated the gender content gap in Wikidata, where less than 22% of items that represent people are about women. We asked: what is the source of this bias? Specifically, does it originate from the actions of Wikidata editors or from external factors; that is, does it simply reflect existing real-world gender bias? We conducted a quantitative case study that found: (i) the most popular categories of people included in Wikidata represent male-dominated professions, such as American football; (ii) within a selected set of professions where we could obtain gender distribution data, Wikidata is no more biased than the real world: men and women are included at similar percentages, and the quality of items representing men and women is also similar. We provide possible explanations for our findings and implications for addressing the Wikidata content gap.

Original language: English (US)
Title of host publication: Proceedings of the 17th International Symposium on Open Collaboration, OpenSym 2021
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450385008
State: Published - Sep 15 2021
Event: 17th International Symposium on Open Collaboration, OpenSym 2021 - Virtual, Online, Spain
Duration: Sep 15 2021 - Sep 17 2021

Publication series

Name: ACM International Conference Proceeding Series


Conference: 17th International Symposium on Open Collaboration, OpenSym 2021
City: Virtual, Online

Bibliographical note

Funding Information:
We thank the anonymous reviewers for their comments and suggestions that helped us strengthen our paper. This work was supported by the National Science Foundation (NSF) under Award No. IIS-1816348.

Publisher Copyright:
© 2021 ACM.


  • Peer-production
  • Structured data
  • Wikidata

