Widespread sampling biases in herbaria revealed from large-scale digitization

Barnabas H. Daru, Daniel S. Park, Richard B. Primack, Charles G. Willis, David S. Barrington, Timothy J.S. Whitfeld, Tristram G. Seidler, Patrick W. Sweeney, David R. Foster, Aaron M. Ellison, Charles C. Davis

Research output: Contribution to journal; Article; peer-reviewed

172 Scopus citations


Nonrandom collecting practices may bias conclusions drawn from analyses of herbarium records. Recent efforts to fully digitize and mobilize regional floras online offer a timely opportunity to assess commonalities and differences in herbarium sampling biases. We determined spatial, temporal, trait, phylogenetic, and collector biases in c. 5 million herbarium records, representing three of the most complete digitized floras of the world: Australia (AU), South Africa (SA), and New England, USA (NE). We identified numerous shared and unique biases among these regions. Shared biases included specimens collected close to roads and herbaria; specimens collected more frequently during biological spring and summer; specimens of threatened species collected less frequently; and specimens of close relatives collected in similar numbers. Regional differences included overrepresentation of graminoids in SA and AU and of annuals in AU; and peak collection during the 1910s in NE, 1980s in SA, and 1990s in AU. Finally, in all regions, a disproportionately large percentage of specimens were collected by very few individuals. We hypothesize that these mega-collectors, with their associated preferences and idiosyncrasies, shaped patterns of collection bias via ‘founder effects’. Studies using herbarium collections should account for sampling biases, and future collecting efforts should avoid compounding these biases to the extent possible.

Original language: English (US)
Pages (from-to): 939-955
Number of pages: 17
Journal: New Phytologist
Issue number: 2
State: Published - Jan 2018

Bibliographical note

Funding Information:
We thank the Harvard University Herbaria for logistic and financial support, and the virtual herbaria in the three regional floras for granting us access to their data: the Australian Virtual Herbarium (http://avh.chah.org.au), the South African National Biodiversity Institute (http://newposa.sanbi.org/) and the Consortium for Northeast Herbaria (http://portal.neherbaria.org/portal/). Digitization of most New England specimens was funded by the ADBC program of the US National Science Foundation (Awards 1208829, 1208835, 1208972, 1208973, 1208975, 1208989, and 1209149). Special thanks to T. J. Davies, E. K. Meineke, K. M. Peterson, and K. G. Dexter for valuable discussion during the early formation of this paper. We appreciate the constructive comments of the Associate Editor and three anonymous reviewers on the submitted manuscript.

Publisher Copyright:
© 2017 The Authors. New Phytologist © 2017 New Phytologist Trust


  • collector bias
  • geographic bias
  • herbarium
  • regional flora
  • sampling bias
  • temporal bias
  • trait bias

