Identifying animal species in camera trap images using deep learning and citizen science

Marco Willi, Ross T. Pitman, Anabelle W. Cardoso, Christina Locke, Alexandra Swanson, Amy Boyer, Marten Veldthuis, Lucy F Fortson

Research output: Contribution to journal › Article › peer-review

180 Scopus citations


Ecologists often study wildlife populations by deploying camera traps. This approach generates large datasets that can be difficult for research teams to evaluate manually. Researchers increasingly enlist volunteers from the general public as citizen scientists to help classify images. The growing number of camera trap studies, however, makes it ever more challenging to find enough volunteers to process all projects in a timely manner. Advances in machine learning, especially deep learning, allow for accurate automatic image classification. By training models on existing datasets of images classified by citizen scientists and then applying those models to new studies, human effort may be reduced substantially. The goals of this study were to (a) assess the accuracy of deep learning in classifying camera trap data, (b) investigate how to process datasets with only a few classified images that are generally difficult to model, and (c) apply a trained model to a live online citizen science project. Convolutional neural networks (CNNs) were used to differentiate among images of different animal species, images of humans or vehicles, and empty images (no animals, vehicles, or humans). We used four different camera trap datasets featuring a wide variety of species, different habitats, and a varying number of images. All datasets were labelled by citizen scientists on Zooniverse. Accuracies for identifying empty images across projects ranged between 91.2% and 98.0%, whereas accuracies for identifying specific species were between 88.7% and 92.7%. Transferring information from CNNs trained on large datasets (“transfer-learning”) was increasingly beneficial as the size of the training dataset decreased and raised accuracy by up to 10.3%. Removing low-confidence predictions increased model accuracies to the level of citizen scientists.
By combining a trained model with classifications from citizen scientists, human effort was reduced by 43% while maintaining overall accuracy for a live experiment running on Zooniverse. Ecology researchers can significantly reduce image classification time and manual effort by combining citizen scientists and CNNs, enabling faster processing of data from large camera trap studies.
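
The abstract's point about removing low-confidence predictions can be illustrated with a short sketch. This is not the authors' implementation: the function names, the threshold value, and the toy probabilities are all hypothetical, and the sketch assumes only that a classifier returns one probability per class for each image. Predictions whose top probability falls below the threshold are set aside (for example, to be classified by volunteers), and accuracy is measured on the remainder.

```python
from typing import List, Optional, Sequence, Tuple


def split_by_confidence(
    probs: Sequence[Sequence[float]], threshold: float
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Split predictions into accepted (image index, predicted class) pairs
    and deferred image indices, based on the top class probability."""
    accepted: List[Tuple[int, int]] = []
    deferred: List[int] = []
    for i, p in enumerate(probs):
        # Index of the most probable class for this image.
        best = max(range(len(p)), key=p.__getitem__)
        if p[best] >= threshold:
            accepted.append((i, best))
        else:
            deferred.append(i)  # hypothetical: route to human classifiers
    return accepted, deferred


def accuracy_and_coverage(
    probs: Sequence[Sequence[float]], labels: Sequence[int], threshold: float
) -> Tuple[Optional[float], float]:
    """Accuracy on accepted predictions, and the fraction of images accepted.
    Accuracy is None when no prediction passes the threshold."""
    accepted, _ = split_by_confidence(probs, threshold)
    coverage = len(accepted) / len(probs) if probs else 0.0
    if not accepted:
        return None, coverage
    correct = sum(1 for i, c in accepted if labels[i] == c)
    return correct / len(accepted), coverage


# Toy data (invented for illustration): 4 images, 3 classes.
example_probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.10, 0.85, 0.05],
    [0.50, 0.45, 0.05],
]
example_labels = [0, 1, 1, 1]

for t in (0.0, 0.8):
    acc, cov = accuracy_and_coverage(example_probs, example_labels, t)
    print(f"threshold={t}: accuracy={acc}, coverage={cov}")
```

Raising the threshold trades coverage for accuracy: fewer images are labelled automatically, but those that are tend to be labelled more reliably, and the deferred images are the ones left for human effort.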

Original language: English (US)
Pages (from-to): 80-91
Number of pages: 12
Journal: Methods in Ecology and Evolution
Issue number: 1
State: Published - Jan 2019

Bibliographical note

Funding Information:
We thank Hugh Dickinson, Chris Lintott, Sarah Pati, Laura Trouille, and Mike Walmsley for reviewing the manuscript. We also thank Jennifer Stenglein and other members of the SW team for program and data management. EE was funded by the University of Oxford’s Hertford College Mortimer May fund. We thank ANPN Gabon, U. Stirling, J. Edzang-Ndong, D. Lehmann, Yadvinder Malhi, Imma Oliveras, William Bond, and Katharine Abernethy for contributing to EE. This study was partially supported by the NSF under award IIS 1619177. The development of the Zooniverse platform was partially supported by a Global Impact Award from Google. We also acknowledge support from STFC under grant ST/N003179/1.

Funding Information:
National Science Foundation, Grant/Award Number: IIS 1619177

Publisher Copyright:
© 2018 The Authors. Methods in Ecology and Evolution © 2018 British Ecological Society


  • animal identification
  • camera trap
  • citizen science
  • convolutional neural networks
  • deep learning
  • machine learning

