Computational methods for high-throughput comparative analyses of natural microbial communities

Sarah P. Preheim, Allison R. Perrotta, Jonathan Friedman, Chris Smilie, Ilana Brito, Mark B. Smith, Eric Alm*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-review

28 Scopus citations


One of the most widely employed methods in metagenomics is the amplification and sequencing of the highly conserved ribosomal RNA (rRNA) genes from organisms in complex microbial communities. rRNA surveys, typically using the 16S rRNA gene for prokaryotic identification, provide information about the total diversity and taxonomic affiliation of organisms present in a sample. Greatly enhanced by high-throughput sequencing, these surveys have uncovered the remarkable diversity of uncultured organisms and revealed unappreciated ecological roles ranging from nutrient cycling to human health. This chapter outlines the best practices for comparative analyses of microbial community surveys. We explain how to transform raw data into meaningful units for further analysis and discuss how to calculate sample diversity and community distance metrics. Finally, we outline how to find associations of species with specific metadata and how to identify true correlations between species from compositional data. We focus on data generated by next-generation sequencing platforms, using the Illumina platform as a test case because of its widespread use, especially among researchers just entering the field.

Original language: American English
Title of host publication: Microbial Metagenomics, Metatranscriptomics, and Metaproteomics
Publisher: Academic Press Inc.
Number of pages: 18
ISBN (Print): 9780124078635
State: Published - 2013
Externally published: Yes

Publication series

Name: Methods in Enzymology
ISSN (Print): 0076-6879
ISSN (Electronic): 1557-7988


Keywords

  • 16S ribosomal RNA survey
  • operational taxonomic units
  • distance metric
  • correlation analysis
  • diversity estimates
  • community

