      Exploring Microbial Diversity and Taxonomy Using SSU rRNA Hypervariable Tag Sequencing


          Massively parallel pyrosequencing of hypervariable regions from small subunit ribosomal RNA (SSU rRNA) genes can sample a microbial community two or three orders of magnitude more deeply per dollar and per hour than capillary sequencing of full-length SSU rRNA. As with full-length rRNA surveys, each sequence read is a tag surrogate for a single microbe. However, rather than assigning taxonomy by creating gene trees de novo that include all experimental sequences and certain reference taxa, we compare the hypervariable region tags to an extensive database of rRNA sequences and assign taxonomy based on the best match in a Global Alignment for Sequence Taxonomy (GAST) process. The resulting taxonomic census provides information on both composition and diversity of the microbial community. To determine the effectiveness of using only hypervariable region tags for assessing microbial community membership, we compared the taxonomy assigned to the V3 and V6 hypervariable regions with the taxonomy assigned to full-length SSU rRNA sequences isolated from both the human gut and a deep-sea hydrothermal vent. The hypervariable region tags and full-length rRNA sequences provided equivalent taxonomy and measures of relative abundance of microbial communities, even for tags up to 15% divergent from their nearest reference match. The greater sampling depth per dollar afforded by massively parallel pyrosequencing reveals many more members of the “rare biosphere” than does capillary sequencing of the full-length gene. In addition, tag sequencing eliminates cloning bias, and the sequences are short enough to be completely sequenced in a single read, maximizing the number of organisms sampled in a run while minimizing chimera formation. This technique allows the cost-effective exploration of changes in microbial community structure, including the rare biosphere, over space and time and can be applied immediately to initiatives such as the Human Microbiome Project.
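          The best-match idea described above can be illustrated with a toy sketch. This is not the authors' GAST implementation: it uses plain edit distance as a stand-in for a global alignment distance, a tiny in-memory dictionary in place of a reference rRNA database, and a hypothetical `max_divergence` cutoff (set to 0.15 only to echo the 15% figure in the abstract). The names `edit_distance`, `assign_taxonomy`, and the example sequences and taxa are all invented for illustration.

```python
from typing import Dict, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: a global alignment score with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # gap in b
                            curr[j - 1] + 1,      # gap in a
                            prev[j - 1] + cost))  # match or mismatch
        prev = curr
    return prev[-1]


def assign_taxonomy(tag: str,
                    references: Dict[str, str],
                    max_divergence: float = 0.15) -> Tuple[str, float]:
    """Return (taxon, divergence) of the closest reference sequence.

    `references` maps reference sequence -> taxonomy string (toy database).
    `max_divergence` is a hypothetical cutoff, not a value taken from GAST.
    """
    best_taxon, best_div = "unassigned", float("inf")
    for ref_seq, taxon in references.items():
        div = edit_distance(tag, ref_seq) / max(len(tag), len(ref_seq))
        if div < best_div:
            best_taxon, best_div = taxon, div
    if best_div > max_divergence:
        return "unassigned", best_div
    return best_taxon, best_div


if __name__ == "__main__":
    refs = {
        "ACGTACGTAC": "Bacteria;Firmicutes",
        "TTGGCCAATT": "Bacteria;Proteobacteria",
    }
    print(assign_taxonomy("ACGTACGTAA", refs))
    print(assign_taxonomy("GGGGGGGGGG", refs))
```

          In this sketch a tag that differs from a reference at one of ten positions is assigned that reference's taxonomy, while a tag far from every reference is reported as unassigned. A real pipeline would need a proper alignment method and a curated reference database.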

          Author Summary

          Microbes play a critical role in both human and environmental health. The more we explore microbial populations, the more complexity and diversity we find. Phylogenetic trees based on 16S ribosomal RNA genes have been used with great success to identify microbial taxonomy from DNA alone. New DNA sequencing technologies, such as massively parallel pyrosequencing, can provide orders of magnitude more DNA sequences than ever before; however, the sequences are much shorter, so new methods are necessary to identify the microbes from short DNA tags. We demonstrate the effectiveness of identifying microbial taxa by comparing short tags from 16S hypervariable regions against a large database of known 16S genes. Using this technique, hypervariable region tags provide taxonomy and relative abundances of microbial communities equivalent to those from full-length rRNA sequences. The greater sampling depth afforded by tag pyrosequencing uncovers not only the dominant microbial species, but many more members of the “rare biosphere” than does capillary sequencing of the full-length gene. Tag pyrosequencing greatly enhances projects exploring composition, diversity, and distribution of microbial populations, such as the Human Microbiome Initiative. A companion paper in PLoS Biology (see Dethlefsen et al., doi:10.1371/journal.pbio.0060280) successfully uses this technique to characterize the effects of antibiotics on the human gut microbiota.
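          As a rough illustration of how per-tag assignments become a census of relative abundances, the sketch below tallies taxon labels and flags low-abundance taxa. The function name `census`, the `rare_threshold` of 1%, and the example counts are assumptions made for this example; the source does not define a numerical cutoff for the "rare biosphere".

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def census(assignments: Iterable[str],
           rare_threshold: float = 0.01) -> Tuple[Dict[str, float], List[str]]:
    """Tally taxon labels into relative abundances and list rare taxa.

    `rare_threshold` is a hypothetical cutoff chosen for illustration only.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    abundance = {taxon: n / total for taxon, n in counts.items()}
    rare = sorted(t for t, frac in abundance.items() if frac < rare_threshold)
    return abundance, rare


if __name__ == "__main__":
    # Invented example: 1000 reads, one dominant taxon and two scarce ones.
    reads = ["Firmicutes"] * 990 + ["Chloroflexi"] * 9 + ["Deferribacteres"] * 1
    abundance, rare = census(reads)
    for taxon, frac in sorted(abundance.items(), key=lambda kv: -kv[1]):
        print(f"{taxon}\t{frac:.3f}")
    print("rare taxa:", rare)
```

          With only a few hundred reads, a taxon present at one read in a thousand would often go unobserved, which is the intuition behind deeper sampling revealing more low-abundance members.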


                Author and article information

                Role: Editor
                PLoS Genet
                PLoS Genetics
                Public Library of Science (San Francisco, USA )
                21 November 2008
                Volume: 4
                Issue: 11
                [1 ]Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts, United States of America
                [2 ]Department of Microbiology and Immunology, Stanford School of Medicine, Stanford, California, United States of America
                [3 ]Department of Medicine, Stanford University School of Medicine, Stanford, California, United States of America
                [4 ]Veterans Affairs Palo Alto Health Care System, Palo Alto, California, United States of America
                University of California Davis, United States of America
                Author notes

                Conceived and designed the experiments: SMH DMW DAR MLS. Performed the experiments: SMH LD JAH. Analyzed the data: SMH DMW. Contributed reagents/materials/analysis tools: SMH. Wrote the paper: SMH DMW DAR MLS.

                Huse et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                Pages: 10
                Research Article
                Biotechnology/Applied Microbiology
                Biotechnology/Environmental Microbiology
                Computational Biology/Comparative Sequence Analysis
                Computational Biology/Metagenomics
                Computational Biology/Molecular Genetics
                Ecology/Community Ecology and Biodiversity
                Ecology/Environmental Microbiology
                Evolutionary Biology/Bioinformatics
                Evolutionary Biology/Microbial Evolution and Genomics
                Genetics and Genomics/Microbial Evolution and Genomics
                Microbiology/Applied Microbiology
                Microbiology/Environmental Microbiology
                Microbiology/Medical Microbiology
                Microbiology/Microbial Evolution and Genomics


