Pitfalls and pointers: An accessible guide to marker gene amplicon sequencing in ecological applications

Anita J Krause, Alexander Strauss, Jeremiah A Henning, Eric W. Seabloom, Elizabeth T. Borer

Research output: Contribution to journal › Review article › peer-review

4 Scopus citations


Next-generation sequencing (NGS) is a powerful tool that has been rapidly adopted by many ecologists studying microbial communities. Despite the exciting demonstration of NGS technology as a tool for ecological research, cryptic pitfalls inherent to its use can obscure correct interpretation of NGS data. Here, we provide an accessible overview of an NGS process that uses marker gene amplicon sequences (MGAS). The overview is intended to help scientists, particularly community ecologists, make appropriate methodological choices and understand the limits on inference about community composition and diversity that can be drawn from MGAS data. We describe the MGAS pipeline, focusing specifically on cryptic sources of variation that have received less emphasis in the ecological literature but which may substantially impact inference about microbial community diversity and composition. By simulating communities from published microbiome data, we demonstrate how these sources of variation can generate inaccurate or misleading patterns. We specifically highlight two cryptic sources of variation arising during the MGAS pipeline: sample dilution performed without the researcher's awareness, and lane-to-lane variability. These sources of variation affect estimates of species presence and relative abundance, particularly for species with moderate to low abundances. Each of these sources of bias can lead to errors in the estimation of both absolute and relative abundance within, and turnover among, microbial communities. Awareness and understanding of what happens and, specifically, why it happens during MGAS generation is key to generating a strong dataset and building a robust community matrix. Three practices are critical for drawing correct ecological inferences using MGAS data: requesting sample dilution information from the sequencing centre, including technical replicates across sequencing lanes, and understanding how sampling intensity and community taxa distribution patterns shape the measurement of community richness, evenness and diversity.
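The effect of sampling intensity on detection of low-abundance taxa can be illustrated with a minimal sketch. This is not the authors' simulation, which drew on published microbiome data; the geometric abundance distribution, the number of taxa, the two read depths, and all function names below are assumptions chosen purely for illustration.

```python
import random
from collections import Counter

def simulate_reads(rel_abund, n_reads, rng):
    """Draw n_reads reads, with replacement, from taxa weighted by relative abundance."""
    taxa = range(len(rel_abund))
    return Counter(rng.choices(taxa, weights=rel_abund, k=n_reads))

def observed_richness(counts):
    """Number of taxa detected by at least one read."""
    return sum(1 for c in counts.values() if c > 0)

# Hypothetical community: 50 taxa whose abundances decay geometrically,
# so most taxa are rare (an assumption, not data from the article).
n_taxa = 50
raw = [0.8 ** i for i in range(n_taxa)]
rel_abund = [x / sum(raw) for x in raw]

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# The same community sequenced at a high depth and at a much lower depth,
# the latter standing in for a sample that was diluted before sequencing.
full = simulate_reads(rel_abund, 50_000, rng)
diluted = simulate_reads(rel_abund, 500, rng)

print("true richness:          ", n_taxa)
print("observed at 50,000 reads:", observed_richness(full))
print("observed at 500 reads:   ", observed_richness(diluted))
```

Under these assumptions, the lower read depth misses many of the rarer taxa even though the underlying community is identical, which is why an unreported dilution step can masquerade as a difference in richness between samples.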

Original language: English (US)
Pages (from-to): 266-277
Number of pages: 12
Journal: Methods in Ecology and Evolution
Issue number: 2
State: Published - Feb 2022

Bibliographical note

Funding Information:
Financial support was provided through National Science Foundation Award: DEB1241895 to E.T.B. We would like to thank Daryl M. Gohl for his incredibly insightful and technical comments on the manuscript.

Publisher Copyright:
© 2021 The Authors. Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society.

