
      Pruning Rogue Taxa Improves Phylogenetic Accuracy: An Efficient Algorithm and Webservice

      research-article


          Abstract

          The presence of rogue taxa (rogues) in a set of trees can frequently have a negative impact on the results of a bootstrap analysis (e.g., the overall support in consensus trees). We introduce an efficient graph-based algorithm for rogue taxon identification as well as an interactive webservice implementing this algorithm. Compared with our previous method, the new algorithm is up to 4 orders of magnitude faster, while returning qualitatively identical results. Because of this significant improvement in scalability, the new algorithm can now identify substantially more complex and compute-intensive rogue taxon constellations. On a large and diverse collection of real-world data sets, we show that our method yields better supported reduced/pruned consensus trees than any competing rogue taxon identification method. Using the parallel version of our open-source code, we successfully identified rogue taxa in a set of 100 trees with 116 334 taxa each. For simulated data sets, we show that when removing/pruning rogue taxa with our method from a tree set, we consistently obtain bootstrap consensus trees as well as maximum-likelihood trees that are topologically closer to the respective true trees.
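The effect described in the abstract can be illustrated with a toy example. The sketch below is not the authors' graph-based algorithm; it is a brute-force, single-taxon search written only for illustration. It assumes rooted trees given as nested tuples, uses rooted clades in place of unrooted bipartitions, and scores a tree set by the summed support of its majority-rule clades. The names `clusters`, `consensus_support`, `best_single_prune`, and `BOOTSTRAP_TREES` are hypothetical, and the four trees are made-up data in which taxon `R` jumps between positions.

```python
from collections import Counter


def clusters(tree, drop=frozenset()):
    """Return the non-trivial clades (frozensets of leaf names) of a rooted
    tree given as nested tuples, ignoring any leaves listed in `drop`."""
    found = set()

    def walk(node):
        if isinstance(node, str):  # a leaf
            return frozenset() if node in drop else frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = walk(tree)
    # Discard clades that carry no information after pruning.
    return {c for c in found if 1 < len(c) < len(all_leaves)}


def consensus_support(trees, drop=frozenset(), threshold=0.5):
    """Sum the support of all clades found in more than `threshold` of the trees."""
    counts = Counter(c for t in trees for c in clusters(t, drop))
    n = len(trees)
    return sum(k / n for k in counts.values() if k / n > threshold)


def best_single_prune(trees):
    """Try pruning each taxon in turn; return (taxon, pruned score, unpruned score)."""
    taxa = sorted(set().union(*(clusters(t) for t in trees)).pop() | {
        leaf for t in trees for c in clusters(t) for leaf in c})
    base = consensus_support(trees)
    scores = {x: consensus_support(trees, frozenset([x])) for x in taxa}
    best = max(taxa, key=scores.get)
    return best, scores[best], base


# Hypothetical bootstrap replicates: a stable backbone (((A,B),C),(D,E))
# with one unstable taxon R attached in a different place each time.
BOOTSTRAP_TREES = [
    (((("A", "R"), "B"), "C"), ("D", "E")),
    ((("A", "B"), ("C", "R")), ("D", "E")),
    ((("A", "B"), "C"), (("D", "R"), "E")),
    (((("A", "B"), "C"), ("D", "E")), "R"),
]

if __name__ == "__main__":
    taxon, pruned, unpruned = best_single_prune(BOOTSTRAP_TREES)
    print(f"prune {taxon}: summed support {unpruned} -> {pruned}")
```

On this made-up input, removing `R` lets the clades {A,B}, {A,B,C}, and {D,E} appear in every replicate, so the summed majority-rule support rises from 1.5 to 3.0, while pruning any other taxon does not improve it. Comparing raw sums across tree sets with different numbers of taxa is a simplification chosen for brevity; trying every taxon (let alone every group of taxa) this way scales poorly, which is the kind of cost an efficient algorithm has to avoid.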


          Most cited references (15)


          ATV: display and manipulation of annotated phylogenetic trees.

          A Tree Viewer (ATV) is a Java tool for the display and manipulation of annotated phylogenetic trees. It can be utilized both as a standalone application and as an applet in a web browser.

            Phylogenetic-signal dissection of nuclear housekeeping genes supports the paraphyly of sponges and the monophyly of Eumetazoa.

            The relationships at the base of the metazoan tree have been difficult to robustly resolve, and there are several different hypotheses regarding the interrelationships among sponges, cnidarians, ctenophores, placozoans, and bilaterians, with each hypothesis having different implications for the body plan of the last common ancestor of animals and the paleoecology of the late Precambrian. We have sequenced seven nuclear housekeeping genes from 17 new sponges, bringing the total to 29 species analyzed, including multiple representatives of the Demospongiae, Calcarea, Hexactinellida, and Homoscleromorpha, and analyzed a data set also including six nonmetazoan outgroups and 36 eumetazoans using a variety of phylogenetic methods and evolutionary models. We used leaf stability to identify rogue taxa and investigate their effect on the support of the nodes in our trees, and we identified clades most likely to represent phylogenetic artifacts through the comparison of trees derived using different methods (and models) and through site-stripping analyses. Further, we investigated compositional heterogeneity and tested whether amino acid composition bias affected our results. Finally, we used Bayes factors to compare our results against previously published phylogenies. All our maximum likelihood (ML) and Bayesian analyses find sponges to be paraphyletic, with all analyses finding three extant paraphyletic sponge lineages, Demospongiae plus Hexactinellida, Calcarea, and Homoscleromorpha. All but one of our ML and Bayesian analyses support the monophyly of Eumetazoa (here Cnidaria + Bilateria) and a sister group relationship between Placozoa (here Trichoplax adhaerens) and Eumetazoa. Bayes factors invariably provide decisive support in favor of poriferan paraphyly when compared against either a sister group relationship between Porifera and Cnidaria or with a monophyletic Porifera with respect to a monophyletic Eumetazoa. Although we were able to recover sponge monophyly using our data set, this was only possible under unrealistic evolutionary models, if poorly performing phylogenetic methods were used, or in situations where the potential for the generation of tree reconstruction artifacts was artificially exacerbated. Everything considered, our data set does not provide any support for a monophyletic Diploblastica (here Placozoa + Cnidaria + Porifera) and suggests that a monophyletic Porifera may be better seen as a phylogenetic artifact.

              Sparse supermatrices for phylogenetic inference: taxonomy, alignment, rogue taxa, and the phylogeny of living turtles.

              As phylogenetic data sets grow in size and number, objective methods to summarize this information are becoming increasingly important. Supermatrices can combine existing data directly and in principle provide effective syntheses of phylogenetic information that may reveal new relationships. However, several serious difficulties exist in the construction of large supermatrices that must be overcome before these approaches will enjoy broad utility. We present analyses that examine the performance of sparse supermatrices constructed from large sequence databases for the reconstruction of species-level phylogenies. We develop a largely automated informatics pipeline that allows for the construction of sparse supermatrices from GenBank data. In doing so, we develop strategies for alleviating some of the outstanding impediments to accurate phylogenetic inference using these approaches. These include taxonomic standardization, automated alignment, and the identification of rogue taxa. We use turtles as an exemplar clade and present a well-supported species-level phylogeny for two-thirds of all turtle species based on an approximately 50 kb supermatrix consisting of 93% missing data. Finally, we discuss some of the remaining pitfalls and concerns associated with supermatrix analyses, provide comparisons to supertree approaches, and suggest areas for future research.

                Author and article information

                Journal
                Systematic Biology (Syst. Biol.)
                Oxford University Press
                ISSN: 1063-5157, 1076-836X
                January 2013
                8 November 2012
                Volume: 62
                Issue: 1
                Pages: 162-166
                Affiliations
                Exelixis Laboratory, Scientific Computing Group, Heidelberg Institute for Theoretical Studies (HITS gGmbH), Schloss-Wolfsbrunnenweg 35, D-69118 Heidelberg, Germany
                Author notes

                Associate Editor: David Posada

                *Correspondence to be sent to: Exelixis Laboratory, Scientific Computing Group, Heidelberg Institute for Theoretical Studies (HITS gGmbH), Schloss-Wolfsbrunnenweg 35, D-69118 Heidelberg, Germany; E-mail: andre.aberer@h-its.org.
                Article
                sys078
                DOI: 10.1093/sysbio/sys078
                PMCID: PMC3526802
                PMID: 22962004
                © The Author(s) 2012. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                25 May 2012
                5 July 2012
                31 August 2012
                Page count
                Pages: 5
                Categories
                Software for Systematics and Evolution

                Animal science & Zoology
                bootstrap support, consensus tree, phylogenetic postanalysis, rogue taxa, software, webservice
