TY - JOUR
T1 - A Guide to Enterotypes across the Human Body
T2 - Meta-Analysis of Microbial Community Structures in Human Microbiome Datasets
AU - Koren, Omry
AU - Knights, Dan
AU - Gonzalez, Antonio
AU - Waldron, Levi
AU - Segata, Nicola
AU - Knight, Rob
AU - Huttenhower, Curtis
AU - Ley, Ruth E.
PY - 2013/1
Y1 - 2013/1
N2 - Recent analyses of human-associated bacterial diversity have categorized individuals into 'enterotypes' or clusters based on the abundances of key bacterial genera in the gut microbiota. There is a lack of consensus, however, on the analytical basis for enterotypes and on the interpretation of these results. We tested how the following factors influenced the detection of enterotypes: clustering methodology, distance metrics, OTU-picking approaches, sequencing depth, data type (whole genome shotgun (WGS) vs. 16S rRNA gene sequence data), and 16S rRNA region. We included 16S rRNA gene sequences from the Human Microbiome Project (HMP) and from 16 additional studies and WGS sequences from the HMP and MetaHIT. In most body sites, we observed smooth abundance gradients of key genera without discrete clustering of samples. Some body habitats displayed bimodal (e.g., gut) or multimodal (e.g., vagina) distributions of sample abundances, but not all clustering methods and workflows accurately highlight such clusters. Because identifying enterotypes in datasets depends not only on the structure of the data but is also sensitive to the methods applied to identifying clustering strength, we recommend that multiple approaches be used and compared when testing for enterotypes.
AB - Recent analyses of human-associated bacterial diversity have categorized individuals into 'enterotypes' or clusters based on the abundances of key bacterial genera in the gut microbiota. There is a lack of consensus, however, on the analytical basis for enterotypes and on the interpretation of these results. We tested how the following factors influenced the detection of enterotypes: clustering methodology, distance metrics, OTU-picking approaches, sequencing depth, data type (whole genome shotgun (WGS) vs. 16S rRNA gene sequence data), and 16S rRNA region. We included 16S rRNA gene sequences from the Human Microbiome Project (HMP) and from 16 additional studies and WGS sequences from the HMP and MetaHIT. In most body sites, we observed smooth abundance gradients of key genera without discrete clustering of samples. Some body habitats displayed bimodal (e.g., gut) or multimodal (e.g., vagina) distributions of sample abundances, but not all clustering methods and workflows accurately highlight such clusters. Because identifying enterotypes in datasets depends not only on the structure of the data but is also sensitive to the methods applied to identifying clustering strength, we recommend that multiple approaches be used and compared when testing for enterotypes.
UR - http://www.scopus.com/inward/record.url?scp=84873510063&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84873510063&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1002863
DO - 10.1371/journal.pcbi.1002863
M3 - Article
C2 - 23326225
AN - SCOPUS:84873510063
SN - 1553-734X
VL - 9
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 1
M1 - e1002863
ER -