20
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Terrace Aware Data Structure for Phylogenomic Inference from Supermatrices

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          In phylogenomics the analysis of concatenated gene alignments, the so-called supermatrix, is commonly accompanied by the assumption of partition models. Under such models each gene, or more generally partition, is allowed to evolve under its own evolutionary model. Although partition models provide a more comprehensive analysis of supermatrices, missing data may hamper the tree search algorithms due to the existence of phylogenetic (partial) terraces. Here, we introduce the phylogenetic terrace aware (PTA) data structure for the efficient analysis under partition models. In the presence of missing data PTA exploits (partial) terraces and induced partition trees to save computation time. We show that an implementation of PTA in IQ-TREE leads to a substantial speedup of up to 4.5 and 8 times compared with the standard IQ-TREE and RAxML implementations, respectively. PTA is generally applicable to all types of partition models and common topological rearrangements thus can be employed by all phylogenomic inference software.

          Related collections

          Most cited references 24

          • Record: found
          • Abstract: found
          • Article: not found

          TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics

          Background Most analysis programs for inferring molecular phylogenies are difficult to use, in particular for researchers with little programming experience. Results TREEFINDER is an easy-to-use integrative platform-independent analysis environment for molecular phylogenetics. In this paper the main features of TREEFINDER (version of April 2004) are described. TREEFINDER is written in ANSI C and Java and implements powerful statistical approaches for inferring gene tree and related analyzes. In addition, it provides a user-friendly graphical interface and a phylogenetic programming language. Conclusions TREEFINDER is a versatile framework for analyzing phylogenetic data across different platforms that is suited both for exploratory as well as advanced studies.
            Bookmark
            • Record: found
            • Abstract: not found
            • Article: not found

            Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis.

              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous.

              All inferences in comparative biology depend on accurate estimates of evolutionary relationships. Recent phylogenetic analyses have turned away from maximum parsimony towards the probabilistic techniques of maximum likelihood and bayesian Markov chain Monte Carlo (BMCMC). These probabilistic techniques represent a parametric approach to statistical phylogenetics, because their criterion for evaluating a topology--the probability of the data, given the tree--is calculated with reference to an explicit evolutionary model from which the data are assumed to be identically distributed. Maximum parsimony can be considered nonparametric, because trees are evaluated on the basis of a general metric--the minimum number of character state changes required to generate the data on a given tree--without assuming a specific distribution. The shift to parametric methods was spurred, in large part, by studies showing that although both approaches perform well most of the time, maximum parsimony is strongly biased towards recovering an incorrect tree under certain combinations of branch lengths, whereas maximum likelihood is not. All these evaluations simulated sequences by a largely homogeneous evolutionary process in which data are identically distributed. There is ample evidence, however, that real-world gene sequences evolve heterogeneously and are not identically distributed. Here we show that maximum likelihood and BMCMC can become strongly biased and statistically inconsistent when the rates at which sequence sites evolve change non-identically over time. Maximum parsimony performs substantially better than current parametric methods over a wide range of conditions tested, including moderate heterogeneity and phylogenetic problems not normally considered difficult.
                Bookmark

                Author and article information

                Journal
                Syst Biol
                Syst. Biol
                sysbio
                sysbio
                Systematic Biology
                Oxford University Press
                1063-5157
                1076-836X
                November 2016
                26 April 2016
                26 April 2016
                : 65
                : 6
                : 997-1008
                Affiliations
                1Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, A-1030 Vienna, Austria and
                2Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, A-1090 Vienna, Austria;
                Author notes
                *Correspondence to be sent to: Center for Integrative Bioinformatics Vienna (CIBIV), Max F. Perutz Laboratories, Dr Bohrgasse 9, A-1030 Vienna, Austria; E-mail: minh.bui@ 123456univie.ac.at .

                Associate Editor: Edwards Susko

                Article
                syw037
                10.1093/sysbio/syw037
                5066062
                27121966
                © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                Page count
                Pages: 12
                Categories
                Regular Articles

                Comments

                Comment on this article