24
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Emerging Concepts of Data Integration in Pathogen Phylodynamics

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. [Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.

          Related collections

          Most cited references112

          • Record: found
          • Abstract: found
          • Article: not found

          Bayesian phylogenetic analysis of combined data.

          The recent development of Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) techniques has facilitated the exploration of parameter-rich evolutionary models. At the same time, stochastic models have become more realistic (and complex) and have been extended to new types of data, such as morphology. Based on this foundation, we developed a Bayesian MCMC approach to the analysis of combined data sets and explored its utility in inferring relationships among gall wasps based on data from morphology and four genes (nuclear and mitochondrial, ribosomal and protein coding). Examined models range in complexity from those recognizing only a morphological and a molecular partition to those having complex substitution models with independent parameters for each gene. Bayesian MCMC analysis deals efficiently with complex models: convergence occurs faster and more predictably for complex models, mixing is adequate for all parameters even under very complex models, and the parameter update cycle is virtually unaffected by model partitioning across sites. Morphology contributed only 5% of the characters in the data set but nevertheless influenced the combined-data tree, supporting the utility of morphological data in multigene analyses. We used Bayesian criteria (Bayes factors) to show that process heterogeneity across data partitions is a significant model component, although not as important as among-site rate variation. More complex evolutionary models are associated with more topological uncertainty and less conflict between morphology and molecules. Bayes factors sometimes favor simpler models over considerably more parameter-rich models, but the best model overall is also the most complex and Bayes factors do not support exclusion of apparently weak parameters from this model. Thus, Bayes factors appear to be useful for selecting among complex models, but it is still unclear whether their use strikes a reasonable balance between model complexity and error in parameter estimates.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            Estimating the rate of evolution of the rate of molecular evolution.

            A simple model for the evolution of the rate of molecular evolution is presented. With a Bayesian approach, this model can serve as the basis for estimating dates of important evolutionary events even in the absence of the assumption of constant rates among evolutionary lineages. The method can be used in conjunction with any of the widely used models for nucleotide substitution or amino acid replacement. It is illustrated by analyzing a data set of rbcL protein sequences.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics.

              Kingman's coalescent process opens the door for estimation of population genetics model parameters from molecular sequences. One paramount parameter of interest is the effective population size. Temporal variation of this quantity characterizes the demographic history of a population. Because researchers are rarely able to choose a priori a deterministic model describing effective population size dynamics for data at hand, nonparametric curve-fitting methods based on multiple change-point (MCP) models have been developed. We propose an alternative to change-point modeling that exploits Gaussian Markov random fields to achieve temporal smoothing of the effective population size in a Bayesian framework. The main advantage of our approach is that, in contrast to MCP models, the explicit temporal smoothing does not require strong prior decisions. To approximate the posterior distribution of the population dynamics, we use efficient, fast mixing Markov chain Monte Carlo algorithms designed for highly structured Gaussian models. In a simulation study, we demonstrate that the proposed temporal smoothing method, named Bayesian skyride, successfully recovers "true" population size trajectories in all simulation scenarios and competes well with the MCP approaches without evoking strong prior assumptions. We apply our Bayesian skyride method to 2 real data sets. We analyze sequences of hepatitis C virus contemporaneously sampled in Egypt, reproducing all key known aspects of the viral population dynamics. Next, we estimate the demographic histories of human influenza A hemagglutinin sequences, serially sampled throughout 3 flu seasons.
                Bookmark

                Author and article information

                Journal
                Syst Biol
                Syst. Biol
                sysbio
                Systematic Biology
                Oxford University Press
                1063-5157
                1076-836X
                January 2017
                06 June 2016
                06 June 2016
                : 66
                : 1
                : e47-e65
                Affiliations
                [1 ] Department of Microbiology and Immunology, Rega Institute, KU Leuven - University of Leuven, Leuven, Belgium
                [2 ] Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
                [3 ] Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
                [4 ] Department of Biostatistics, School of Public Health, University of California, Los Angeles, CA 90095, USA
                [5 ] Institute of Evolutionary Biology, University of Edinburgh, Kings Buildings, Edinburgh EH9 3FL, UK
                [6 ] Centre for Immunity, Infection and Evolution, University of Edinburgh, Kings Buildings, Edinburgh EH9 3FL, UK
                Author notes
                [* ] Correspondence to be sent to: Philippe Lemey, Department of Microbiology and Immunology, KU Leuven - University of Leuven, Minderbroedersstaat 10, 3000 Leuven, Belgium; E-mail: philippe.lemey@ 123456kuleuven.be .

                Associate Editor: David Bryant

                Article
                syw054
                10.1093/sysbio/syw054
                5837209
                28173504
                29f0fadb-809e-41c2-a9b0-fe92821ea1ce
                © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 14 November 2015
                : 22 January 2016
                : 2 June 2016
                Page count
                Pages: 19
                Funding
                Funded by: National Institutes of Health 10.13039/100000002
                Categories
                The following are online-only papers that are freely available as part of Issue 66(1) online.
                Custom metadata
                corrected-proof

                Animal science & Zoology
                Animal science & Zoology

                Comments

                Comment on this article