
      Novel non-parametric models to estimate evolutionary rates and divergence times from heterochronous sequence data

      research-article
      BMC Evolutionary Biology
      BioMed Central


          Abstract

          Background

          Early methods for estimating divergence times from gene sequence data relied on the assumption of a molecular clock. More sophisticated methods were created to model rate variation and used auto-correlation of rates, local clocks, or the so-called “uncorrelated relaxed clock”, where substitution rates are assumed to be drawn from a parametric distribution. In the case of Bayesian inference methods, the impact of the prior on branching times is not clearly understood, and if the amount of data is limited, the posterior could be strongly influenced by the prior.
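
          As a rough illustration of the difference between a strict molecular clock and an uncorrelated relaxed clock, the sketch below converts branch durations into expected branch lengths under each model. It is not taken from the article or from Physher: the function names, the choice of a lognormal rate distribution, and the numeric values are all assumptions made for illustration.

```python
import math
import random


def strict_clock_lengths(durations, rate):
    """Branch lengths (substitutions/site) when every branch shares one rate."""
    return [rate * t for t in durations]


def relaxed_clock_lengths(durations, mean_rate, sigma, rng):
    """Branch lengths when each branch draws its own rate independently
    from a lognormal distribution whose mean is mean_rate (an assumed choice
    of parametric distribution)."""
    # If log(rate) ~ Normal(mu, sigma), then E[rate] = exp(mu + sigma^2 / 2),
    # so this choice of mu gives the requested mean rate.
    mu = math.log(mean_rate) - 0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) * t for t in durations]


# Hypothetical branch durations (in years) and rate (substitutions/site/year).
durations = [2.0, 5.0, 1.5, 10.0]
rng = random.Random(1)  # fixed seed so the draws are reproducible

strict = strict_clock_lengths(durations, rate=0.003)
relaxed = relaxed_clock_lengths(durations, mean_rate=0.003, sigma=0.5, rng=rng)

for t, s, r in zip(durations, strict, relaxed):
    print(f"duration={t:5.1f}  strict={s:.5f}  relaxed={r:.5f}")
```

          With `sigma` set to 0 every drawn rate equals `mean_rate`, so the relaxed model reduces to the strict clock; larger values of `sigma` allow more rate variation among branches.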

          Results

          We develop a maximum likelihood method – Physher – that uses local or discrete clocks to estimate evolutionary rates and divergence times from heterochronous sequence data. Using two empirical data sets we show that our discrete clock estimates are similar to those obtained by other methods, and that Physher outperformed some methods in the estimation of the root age of an influenza virus data set. A simulation analysis suggests that Physher can outperform a Bayesian method when the real topology contains two long branches below the root node, even when evolution is strongly clock-like.

          Conclusions

          These results suggest it is advisable to use a variety of methods to estimate evolutionary rates and divergence times from heterochronous sequence data. Physher and the associated data sets used here are available online at http://code.google.com/p/physher/.


                Author and article information

                Contributors
                Journal
                Title: BMC Evolutionary Biology (BMC Evol Biol)
                Publisher: BioMed Central
                ISSN: 1471-2148
                Publication date: 24 July 2014
                Volume: 14
                Article number: 163
                Affiliations
                [1 ]Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Biological Sciences and Sydney Medical School, The University of Sydney, Sydney, Australia
                Article
                Publisher ID: s12862-014-0163-6
                DOI: 10.1186/s12862-014-0163-6
                PMCID: PMC4222489
                PMID: 25055743
                c4935b55-0e9a-4b1e-a446-069608d055e2
                Copyright © 2014 Fourment and Holmes; licensee BioMed Central Ltd

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 2 March 2014
                Accepted: 14 July 2014
                Categories
                Research Article

                Evolutionary Biology
