
      More on the Best Evolutionary Rate for Phylogenetic Analysis


          Abstract

          The accumulation of genome-scale molecular data sets for nonmodel taxa brings us ever closer to resolving the tree of life of all living organisms. However, despite the depth of data available, a number of studies that each used thousands of genes have reported conflicting results. The focus of phylogenomic projects must thus shift to more careful experimental design. Even though we still have a limited understanding of what best predicts the phylogenetic informativeness of a gene, there is wide agreement that one key factor is its evolutionary rate. There is no consensus, however, as to whether the rates derived as optimal in various analytical, empirical, and simulation approaches have any general applicability. Here we use simulations to infer optimal rates in a set of realistic phylogenetic scenarios with varying tree sizes, numbers of terminals, and tree shapes. Furthermore, we study the relationship between the optimal rate and rate variation among sites and among lineages. Finally, we examine how well the predictions made by a range of experimental design methods correlate with the observed performance in our simulations.

          We find that the optimal level of divergence is surprisingly robust to differences in taxon sampling and even to among-site and among-lineage rate variation as often encountered in empirical data sets. This finding encourages the use of methods that rely on a single optimal rate to predict a gene’s utility. Focusing on correct recovery either of the most basal node in the phylogeny or of the entire topology, the optimal rate is about 0.45 substitutions from root to tip in average Yule trees and about 0.2 in difficult trees with short basal and long apical branches, but all rates leading to divergence levels between about 0.1 and 0.5 perform reasonably well.

          Testing the performance of six methods that can be used to predict a gene’s utility against our simulation results, we find that the probability of resolution, signal-noise analysis, and Fisher information are good predictors of phylogenetic informativeness, but they require specification of at least part of a model tree. Likelihood quartet mapping also shows very good performance but only requires sequence alignments and is thus applicable without making assumptions about the phylogeny. Despite being the most commonly used methods for experimental design, geometric quartet mapping and the integration of phylogenetic informativeness curves perform rather poorly in our comparison. Instead of derived predictors of phylogenetic informativeness, we suggest that the number of sites in a gene that evolve at near-optimal rates (as inferred here) could be used directly to prioritize genes for phylogenetic inference. In combination with measures of model fit, especially with respect to compositional biases and among-site and among-lineage rate variation, such an approach has the potential to greatly improve marker choice and should be tested on empirical data.
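
          The suggestion to count sites evolving at near-optimal rates can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example genes, and the per-site rate values are hypothetical, and the sketch assumes that per-site rates have already been estimated and are expressed in substitutions per unit of tree height. The window of 0.1 to 0.5 substitutions from root to tip is the range the abstract reports as performing reasonably well.

```python
from typing import Dict, List, Sequence, Tuple

# Root-to-tip divergence window reported in the abstract as performing
# reasonably well (expected substitutions per site).
LOW, HIGH = 0.1, 0.5


def count_near_optimal_sites(
    site_rates: Sequence[float],
    tree_height: float = 1.0,
    low: float = LOW,
    high: float = HIGH,
) -> int:
    """Count sites whose root-to-tip divergence lies in [low, high].

    Divergence is approximated as rate * tree_height, which is an
    assumption of this sketch rather than a method from the article.
    """
    return sum(1 for rate in site_rates if low <= rate * tree_height <= high)


def rank_genes(
    genes: Dict[str, Sequence[float]], tree_height: float = 1.0
) -> List[Tuple[str, int]]:
    """Rank genes by their number of near-optimal sites, highest first."""
    counts = {
        name: count_near_optimal_sites(rates, tree_height)
        for name, rates in genes.items()
    }
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-site rates for three made-up genes.
example_genes = {
    "geneA": [0.05, 0.2, 0.45, 0.3, 1.2],
    "geneB": [0.01, 0.02, 0.9, 1.5, 0.4],
    "geneC": [0.15, 0.25, 0.35, 0.45, 0.5],
}

ranking = rank_genes(example_genes)
for gene_name, n_sites in ranking:
    print(f"{gene_name}: {n_sites} near-optimal sites")
```

          In practice, as the abstract notes, such a count would be combined with measures of model fit before choosing markers; the sketch covers only the counting step.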

          Most cited references (51)

          Testing for phylogenetic signal in comparative data: behavioral traits are more labile.

          The primary rationale for the use of phylogenetically based statistical methods is that phylogenetic signal, the tendency for related species to resemble each other, is ubiquitous. Whether this assertion is true for a given trait in a given lineage is an empirical question, but general tools for detecting and quantifying phylogenetic signal are inadequately developed. We present new methods for continuous-valued characters that can be implemented with either phylogenetically independent contrasts or generalized least-squares models. First, a simple randomization procedure allows one to test the null hypothesis of no pattern of similarity among relatives. The test demonstrates correct Type I error rate at a nominal alpha = 0.05 and good power (0.8) for simulated datasets with 20 or more species. Second, we derive a descriptive statistic, K, which allows valid comparisons of the amount of phylogenetic signal across traits and trees. Third, we provide two biologically motivated branch-length transformations, one based on the Ornstein-Uhlenbeck (OU) model of stabilizing selection, the other based on a new model in which character evolution can accelerate or decelerate (ACDC) in rate (e.g., as may occur during or after an adaptive radiation). Maximum likelihood estimation of the OU (d) and ACDC (g) parameters can serve as tests for phylogenetic signal because an estimate of d or g near zero implies that a phylogeny with little hierarchical structure (a star) offers a good fit to the data. Transformations that improve the fit of a tree to comparative data will increase power to detect phylogenetic signal and may also be preferable for further comparative analyses, such as of correlated character evolution. 
Application of the methods to data from the literature revealed that, for trees with 20 or more species, 92% of traits exhibited significant phylogenetic signal (randomization test), including behavioral and ecological ones that are thought to be relatively evolutionarily malleable (e.g., highly adaptive) and/or subject to relatively strong environmental (nongenetic) effects or high levels of measurement error. Irrespective of sample size, most traits (but not body size, on average) showed less signal than expected given the topology, branch lengths, and a Brownian motion model of evolution (i.e., K was less than one), which may be attributed to adaptation and/or measurement error in the broad sense (including errors in estimates of phenotypes, branch lengths, and topology). Analysis of variance of log K for all 121 traits (from 35 trees) indicated that behavioral traits exhibit lower signal than body size, morphological, life-history, or physiological traits. In addition, physiological traits (corrected for body size) showed less signal than did body size itself. For trees with 20 or more species, the estimated OU (25% of traits) and/or ACDC (40%) transformation parameter differed significantly from both zero and unity, indicating that a hierarchical tree with less (or occasionally more) structure than the original better fit the data and so could be preferred for comparative analyses.

            Is a new and general theory of molecular systematics emerging?

            The advent and maturation of algorithms for estimating species trees (phylogenetic trees that allow gene tree heterogeneity and whose tips represent lineages, populations and species, as opposed to genes) represent an exciting confluence of phylogenetics, phylogeography, and population genetics, and usher in a new generation of concepts and challenges for the molecular systematist. In this essay I argue that to better deal with the large multilocus datasets brought on by phylogenomics, and to better align the fields of phylogeography and phylogenetics, we should embrace the primacy of species trees, not only as a new and useful practical tool for systematics, but also as a long-standing conceptual goal of systematics that, largely due to the lack of appropriate computational tools, has been eclipsed in the past few decades. I suggest that phylogenies as gene trees are a "local optimum" for systematics, and review recent advances that will bring us to the broader optimum inherent in species trees. In addition to adopting new methods of phylogenetic analysis (and ideally reserving the term "phylogeny" for species trees rather than gene trees), the new paradigm suggests shifts in a number of practices, such as sampling data to maximize not only the number of accumulated sites but also the number of independently segregating genes; routinely using coalescent or other models in computer simulations to allow gene tree heterogeneity; and understanding better the role of concatenation in influencing topologies and confidence in phylogenies. By building on the foundation laid by concepts of gene trees and coalescent theory, and by taking cues from recent trends in multilocus phylogeography, molecular systematics stands to be enriched.
Many of the challenges and lessons learned for estimating gene trees will carry over to the challenge of estimating species trees, although adopting the species tree paradigm will clarify many issues (such as the nature of polytomies and the star tree paradox), raise conceptually new challenges, or provide new answers to old questions.

              Among-site rate variation and its impact on phylogenetic analyses.

              Although several decades of study have revealed the ubiquity of variation of evolutionary rates among sites, reliable methods for studying rate variation were not developed until very recently. Early methods fit theoretical distributions to the numbers of changes at sites inferred by parsimony and substantially underestimate the rate variation. Recent analyses show that failure to account for rate variation can have drastic effects, leading to biased dating of speciation events, biased estimation of the transition:transversion rate ratio, and incorrect reconstruction of phylogenies.

                Author and article information

                Journal
                Systematic Biology (Syst. Biol.)
                Oxford University Press
                1063-5157
                1076-836X
                September 2017
                08 June 2017
                Volume: 66
                Issue: 5
                Pages: 769-785
                Affiliations
                [1 ] Naturhistorisches Museum der Burgergemeinde Bern, Bernastr. 15, CH-3005 Bern, Switzerland
                [2 ] University of Bern, Institute of Ecology and Evolution, Baltzerstr. 6, CH-3012 Bern, Switzerland
                [3 ] European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
                Author notes
                [* ] Correspondence to be sent to: Naturhistorisches Museum der Burgergemeinde Bern, Bernastr. 15, CH-3005 Bern, Switzerland; E-mail: klopfstein@nmbe.ch.

                Associate Editor: Jeffrey Townsend

                Article
                syx051
                10.1093/sysbio/syx051
                5790136
                28595363
                de2ac6f0-348d-430b-86a1-19a91faddd22
                © The Author(s) 2017. Published by Oxford University Press on behalf of the Society of Systematic Biologists.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                History
                : 19 August 2016
                : 19 May 2016
                : 24 May 2017
                Page count
                Pages: 17
                Categories
                Regular Articles

                Animal science & Zoology
                experimental design, phylogenomics
