
      Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences



          Profiling phylogenetic marker genes, such as the 16S rRNA gene, is a key tool for studies of microbial communities but does not provide direct evidence of a community’s functional capabilities. Here we describe PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States), a computational approach to predict the functional composition of a metagenome using marker gene data and a database of reference genomes. PICRUSt uses an extended ancestral-state reconstruction algorithm to predict which gene families are present and then combines gene families to estimate the composite metagenome. Using 16S information, PICRUSt recaptures key findings from the Human Microbiome Project and accurately predicts the abundance of gene families in host-associated and environmental communities, with quantifiable uncertainty. Our results demonstrate that phylogeny and function are sufficiently linked that this ‘predictive metagenomic’ approach should provide useful insights into the thousands of uncultivated microbial communities for which only marker gene surveys are currently available.
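The final step described in the abstract, combining per-organism gene family predictions into a composite metagenome, can be illustrated with a minimal sketch. It is not PICRUSt's implementation: the function name `estimate_metagenome`, the toy identifiers, and all numbers are hypothetical, and the ancestral-state reconstruction that would produce the per-organism gene content is taken as a given input. Dividing read counts by a marker gene copy number is an assumption added here, so that counts approximate organism abundance.

```python
from typing import Dict

Table = Dict[str, Dict[str, float]]


def estimate_metagenome(
    otu_counts: Table,
    gene_content: Table,
    marker_copies: Dict[str, float],
) -> Table:
    """Sum predicted gene family copies over all organisms in each sample.

    otu_counts:    sample -> {organism: marker gene read count}
    gene_content:  organism -> {gene family: predicted copies per genome}
    marker_copies: organism -> predicted marker gene copies per genome
    """
    metagenome: Table = {}
    for sample, counts in otu_counts.items():
        totals: Dict[str, float] = {}
        for organism, reads in counts.items():
            # Assumption: normalize reads by marker copy number so that
            # organisms with many marker copies are not over-counted.
            abundance = reads / marker_copies[organism]
            for family, copies in gene_content[organism].items():
                totals[family] = totals.get(family, 0.0) + abundance * copies
        metagenome[sample] = totals
    return metagenome


# Hypothetical toy inputs, for illustration only.
otu_counts = {"sample_1": {"otu_a": 10, "otu_b": 4}}
marker_copies = {"otu_a": 5, "otu_b": 2}
gene_content = {
    "otu_a": {"gene_family_1": 1, "gene_family_2": 2},
    "otu_b": {"gene_family_1": 3},
}

result = estimate_metagenome(otu_counts, gene_content, marker_copies)
print(result)
```

In this toy case each organism contributes two normalized genome equivalents, so `gene_family_1` totals 2 × 1 + 2 × 3 = 8 and `gene_family_2` totals 2 × 2 = 4.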



                Author and article information

                Nature Biotechnology (Nat. Biotechnol.)
                12 September 2013
                25 August 2013
                September 2013
                01 March 2014
                Volume: 31
                Issue: 9
                [1] Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada
                [2] Department of Microbiology, Oregon State University, Corvallis, OR, USA
                [3] Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA
                [4] Institute for Genomics and Systems Biology, Argonne National Laboratory, Lemont, IL, USA
                [5] BioFrontiers Institute, University of Colorado, Boulder, CO, USA
                [6] Department of Computer Science, University of Colorado, Boulder, CO, USA
                [7] Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
                [8] Biotechnology Institute, University of Minnesota, Saint Paul, MN, USA
                [9] Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
                [10] Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO, USA
                [11] Department of Biological Sciences, Florida International University, Miami Beach, FL, USA
                [12] Howard Hughes Medical Institute, Boulder, CO, USA
                [13] Broad Institute of MIT and Harvard, Cambridge, MA, USA
                Author notes

                These authors contributed equally.

                Funded by: National Human Genome Research Institute (NHGRI)
                Award ID: U01 HG004866
                Funded by: National Human Genome Research Institute (NHGRI)
                Award ID: R01 HG005969
                Funded by: National Human Genome Research Institute (NHGRI)
                Award ID: R01 HG004872
                Funded by: National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
                Award ID: P01 DK078669
                Funded by: Howard Hughes Medical Institute


