      Maximum Likelihood Inference of Small Trees in the Presence of Long Branches

      Research article, Systematic Biology, Oxford University Press

          Abstract

          The statistical basis of maximum likelihood (ML), its robustness, and the fact that it appears to suffer less from biases lead to it being one of the most popular methods for tree reconstruction. Despite its popularity, very few analytical solutions for ML exist, so biases suffered by ML are not well understood. One possible bias is long branch attraction (LBA), a regularly cited term generally used to describe a propensity for long branches to be joined together in estimated trees. Although initially mentioned in connection with inconsistency of parsimony, LBA has been claimed to affect all major phylogenetic reconstruction methods, including ML. Despite the widespread use of this term in the literature, exactly what LBA is and what may be causing it is poorly understood, even for simple evolutionary models and small model trees. Studies looking at LBA have focused on the effect of two long branches on tree reconstruction. However, to understand the effect of two long branches it is also important to understand the effect of just one long branch. If ML struggles to reconstruct one long branch, then this may have an impact on LBA. In this study, we look at the effect of one long branch on three-taxon tree reconstruction. We show that, counterintuitively, long branches are preferentially placed at the tips of the tree. This can be understood through the use of analytical solutions to the ML equation and distance matrix methods. We go on to look at the placement of two long branches on four-taxon trees, showing that there is no attraction between long branches, but that for extreme branch lengths long branches are joined together disproportionally often. These results illustrate that even small model trees are still interesting to help understand how ML phylogenetic reconstruction works, and that LBA is a complicated phenomenon that deserves further study. [analytic solutions; long branch attraction; maximum likelihood; simulation.]

                Author and article information

                Journal
                Systematic Biology (Syst. Biol.)
                Oxford University Press
                ISSN: 1063-5157, 1076-836X
                September 2014; 04 July 2014
                Volume 63, Issue 5, Pages 798-811
                Affiliations
                European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD, United Kingdom
                Author notes
                *Correspondence to be sent to: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD, United Kingdom; E-mail: sparks@ 123456ebi.ac.uk .

                Associate Editor: Olivier Gascuel

                Article
                syu044
                10.1093/sysbio/syu044
                6371681
                24996414
                8627ffbc-9743-4744-9f56-57b01f566714
                © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                14 August 2013
                20 November 2013
                20 June 2014
                Page count
                Pages: 14
                Categories
                Regular Articles

                Animal science & Zoology
