      Is Open Access

      Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs



          Phylogenetic trees are used to analyze and visualize evolution. However, a single tree can be an imperfect datatype for summarizing multiple trees. This is especially problematic when accommodating biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertree approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large-scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.
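
The idea of placing several rooted trees in one graph and reading conflict from node degree can be illustrated with a small Python sketch. This is a deliberately simplified toy and not the authors' procedure: it assumes every tree has the same taxon set, identifies each node by the set of taxa beneath it, and uses hypothetical names (`leaf_set`, `add_tree`, `build_graph`, `parent_counts`) and a nested-tuple tree encoding chosen only for illustration. The TAG described in the abstract additionally handles partially overlapping taxon sets, which this sketch does not attempt.

```python
from collections import defaultdict


def leaf_set(tree):
    """Return the frozenset of taxon names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def add_tree(graph, tree, source):
    """Add child -> parent edges for one rooted tree, tagged with its source."""
    if isinstance(tree, str):
        return
    parent = leaf_set(tree)
    for child in tree:
        # Nodes with the same taxon set in different trees share one graph node.
        graph[leaf_set(child)][parent].add(source)
        add_tree(graph, child, source)


def build_graph(trees):
    """Map every tree into a shared graph: node -> parent -> set of sources."""
    graph = defaultdict(lambda: defaultdict(set))
    for index, tree in enumerate(trees):
        add_tree(graph, tree, index)
    return graph


def parent_counts(graph):
    """Count distinct parents per node; a value above 1 flags conflict."""
    return {node: len(parents) for node, parents in graph.items()}


# Three toy source trees on taxa A-D; the third disagrees with the first two.
trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
]
graph = build_graph(trees)
counts = parent_counts(graph)
```

In this toy input, the node for {A, B} has two distinct parents ({A, B, C, D} in the first two trees and {A, B, C} in the third), so its parent count is 2, while the edge from A to {A, B} is supported by all three sources. The set of sources stored on each edge plays the role of a crude support measure in this sketch.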

          Author Summary

          Phylogenetic trees are the most common datatype by which we examine evolutionary patterns. However, biological and practical considerations require the exploration of other models. Here, we address a problem concerning the representation of conflicting and partially overlapping datasets in phylogenetics. We examine the problem of aligning many source trees from independent phylogenetic analyses into a structure that can be analyzed and synthesized but retains all of the original structure and source information. We present methods to map trees into a common graph structure using a graph database. This allows the information in the trees to be stored and synthesized in several ways. Specifically, we demonstrate how these graphs can be used to construct enormous trees as an alternative to labor-intensive grafting exercises and other methods that make the synthetic tree difficult to update. We also show how examination of the relationships in the graph allows patterns to emerge concerning support and information that are difficult to discern with existing methods. Because these methods scale well into the millions of nodes, these techniques should lead to the construction and maintenance of even larger phylogenies and new techniques for analyzing graphs that maintain the structure of the underlying trees.


                Author and article information

                Role: Editor
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, USA)
                26 September 2013
                Volume 9, Issue 9
                Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
                University of California San Diego, United States of America
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: SAS JWB CEH. Performed the experiments: SAS JWB CEH. Analyzed the data: SAS JWB CEH. Contributed reagents/materials/analysis tools: SAS JWB CEH. Wrote the paper: SAS JWB CEH.


                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Page count
                Pages: 15
                SAS, JWB, and CEH were supported by NSF AVATOL grant 1207915. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Research Article

                Quantitative & Systems biology

