      A Bayesian Approach to Inferring the Phylogenetic Structure of Communities from Metagenomic Data

          Abstract

          Metagenomics provides a powerful new tool set for investigating evolutionary interactions with the environment. However, an absence of model-based statistical methods means that researchers are often not able to make full use of this complex information. We present a Bayesian method for inferring the phylogenetic relationship among related organisms found within metagenomic samples. Our approach exploits variation in the frequency of taxa among samples to simultaneously infer each lineage haplotype, the phylogenetic tree connecting them, and their frequency within each sample. Applications of the algorithm to simulated data show that our method can recover a substantial fraction of the phylogenetic structure even in the presence of high rates of migration among sample sites. We provide examples of the method applied to data from green sulfur bacteria recovered from an Antarctic lake, plastids from mixed Plasmodium falciparum infections, and virulent Neisseria meningitidis samples.
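The mixture idea in the abstract can be made concrete with a small sketch. This is an illustration only and not the authors' model: it assumes biallelic sites, a binomial read-count likelihood, and a fixed sequencing error rate, and the names `expected_allele_freqs`, `log_likelihood`, and `error` are hypothetical. It shows how haplotype frequencies that differ among samples produce allele frequencies that differ among samples at each site, which is the signal the abstract says the method exploits.

```python
import math


def expected_allele_freqs(freqs, haplotypes):
    """Expected frequency of allele 1 at each site in each sample.

    freqs[s][k]      : frequency of haplotype k in sample s (each row sums to 1)
    haplotypes[k][j] : allele (0 or 1) carried by haplotype k at site j
    """
    n_sites = len(haplotypes[0])
    return [
        [sum(f_k * hap[j] for f_k, hap in zip(sample_freqs, haplotypes))
         for j in range(n_sites)]
        for sample_freqs in freqs
    ]


def log_likelihood(counts, depths, freqs, haplotypes, error=0.01):
    """Binomial log-likelihood of the observed read counts (assumed model).

    counts[s][j] : reads showing allele 1 at site j in sample s
    depths[s][j] : total reads covering site j in sample s
    error        : assumed per-read error rate; must be > 0 so that log() is defined
    """
    p = expected_allele_freqs(freqs, haplotypes)
    total = 0.0
    for s, row in enumerate(p):
        for j, p_sj in enumerate(row):
            # Probability that a single read shows allele 1, allowing for error.
            q = p_sj * (1 - error) + (1 - p_sj) * error
            k, n = counts[s][j], depths[s][j]
            total += (math.log(math.comb(n, k))
                      + k * math.log(q)
                      + (n - k) * math.log(1 - q))
    return total


# Two hypothetical haplotypes over three sites, mixed differently in two samples.
haplotypes = [[0, 0, 1], [1, 0, 1]]
freqs = [[0.8, 0.2], [0.3, 0.7]]
counts = [[20, 1, 99], [70, 0, 100]]
depths = [[100, 100, 100], [100, 100, 100]]
print(log_likelihood(counts, depths, freqs, haplotypes))
```

A full Bayesian treatment would place priors on the haplotypes, the tree, and the frequencies and sample them jointly; the sketch only evaluates the likelihood for one fixed configuration.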

          Most cited references (25)

          The global distribution of clinical episodes of Plasmodium falciparum malaria.

          Interest in mapping the global distribution of malaria is motivated by a need to define populations at risk for appropriate resource allocation and to provide a robust framework for evaluating its global economic impact. Comparison of older and more recent malaria maps shows how the disease has been geographically restricted, but it remains entrenched in poor areas of the world with climates suitable for transmission. Here we provide an empirical approach to estimating the number of clinical events caused by Plasmodium falciparum worldwide, by using a combination of epidemiological, geographical and demographic data. We estimate that there were 515 (range 300-660) million episodes of clinical P. falciparum malaria in 2002. These global estimates are up to 50% higher than those reported by the World Health Organization (WHO) and 200% higher for areas outside Africa, reflecting the WHO's reliance upon passive national reporting for these countries. Without an informed understanding of the cartography of malaria risk, the global extent of clinical disease caused by P. falciparum will continue to be underestimated.

            Comparative metagenomics of microbial communities.

            The species complexity of microbial communities and challenges in culturing representative isolates make it difficult to obtain assembled genomes. Here we characterize and compare the metabolic capabilities of terrestrial and marine microbial communities using largely unassembled sequence data obtained by shotgun sequencing DNA isolated from the various environments. Quantitative gene content analysis reveals habitat-specific fingerprints that reflect known characteristics of the sampled environments. The identification of environment-specific genes through a gene-centric comparative analysis presents new opportunities for interpreting and diagnosing environments.

              Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population.

              Molecular techniques allow the survey of a large number of linked polymorphic loci in random samples from diploid populations. However, the gametic phase of haplotypes is usually unknown when diploid individuals are heterozygous at more than one locus. To overcome this difficulty, we implement an expectation-maximization (EM) algorithm leading to maximum-likelihood estimates of molecular haplotype frequencies under the assumption of Hardy-Weinberg proportions. The performance of the algorithm is evaluated for simulated data representing both DNA sequences and highly polymorphic loci with different levels of recombination. As expected, the EM algorithm is found to perform best for large samples, regardless of recombination rates among loci. To ensure finding the global maximum likelihood estimate, the EM algorithm should be started from several initial conditions. The present approach appears to be useful for the analysis of nuclear DNA sequences or highly variable loci. Although the algorithm, in principle, can accommodate an arbitrary number of loci, there are practical limitations because the computing time grows exponentially with the number of polymorphic loci.
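The EM iteration described in this abstract can be sketched as follows. This is a minimal illustration and not the cited authors' implementation: it assumes biallelic loci with each genotype coded as the count (0, 1, or 2) of one allele per locus, it enumerates all 2^L haplotypes (which is why the cost grows exponentially with the number of loci), and it uses a single uniform starting point rather than the several starts the abstract recommends. The function name `em_haplotype_freqs` and its parameters are hypothetical.

```python
from itertools import combinations_with_replacement, product


def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM under Hardy-Weinberg proportions.

    genotypes : list of tuples, one per diploid individual; entry i is the
                number of copies (0, 1, 2) of allele 1 at locus i.
    Returns a dict mapping each haplotype (tuple of 0/1) to its frequency.
    """
    n_loci = len(genotypes[0])
    haps = list(product((0, 1), repeat=n_loci))

    # For each distinct genotype, list the unordered haplotype pairs
    # that could have produced it.
    compatible = {
        g: [(a, b) for a, b in combinations_with_replacement(haps, 2)
            if all(x + y == gi for x, y, gi in zip(a, b, g))]
        for g in set(genotypes)
    }

    freqs = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    for _ in range(n_iter):
        # E-step: expected haplotype counts given the current frequencies.
        counts = {h: 0.0 for h in haps}
        for g in genotypes:
            pairs = compatible[g]
            # Hardy-Weinberg probability of each unordered pair.
            weights = [(1 if a == b else 2) * freqs[a] * freqs[b]
                       for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total

        # M-step: each individual contributes two haplotypes.
        new_freqs = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Two loci: homozygotes fix the phase, double heterozygotes (1, 1) are ambiguous.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print(em_haplotype_freqs(sample))
```

In the usage example the homozygous individuals carry only the (0, 0) and (1, 1) haplotypes, so the iteration should assign the ambiguous double heterozygotes to that pair rather than to (0, 1) and (1, 0).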

                Author and article information

                Journal
                Genetics
                Genetics Society of America
                ISSN: 0016-6731, 1943-2631
                July 2014
                1 May 2014
                Volume: 197
                Issue: 3
                Pages: 925-937
                Affiliations
                [*] Department of Mathematics, Bowdoin College, Brunswick, Maine 04011
                [†] School of Public Health, Imperial College London, London W2 1PG, United Kingdom
                [‡] Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom
                [§] Department of Statistics, University of Oxford, Oxford OX1 3TG, United Kingdom
                [**] Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany
                Author notes

                Available freely online through the author-supported open access option.

                [1] Corresponding authors: Department of Mathematics, Bowdoin College, 8600 College Station, Brunswick, ME 04011. E-mail: jobrien@bowdoin.edu; and Max Planck Center for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany. E-mail: daniel_falush@eva.mpg.de
                Article
                Publisher ID: 161299
                DOI: 10.1534/genetics.114.161299
                PMC ID: 4096371
                PubMed ID: 24793089
                Copyright © 2014 by the Genetics Society of America


                History
                Received: 07 January 2014
                Accepted: 09 April 2014
                Page count
                Pages: 13
                Categories
                Investigations
                Population and Evolutionary Genetics

                Genetics
                metagenomics, Bayesian phylogenetics, microevolution
