Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs

Abstract

Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing, and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertree approaches. We demonstrate these methods with two empirical datasets. To explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also constructed a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and whose underlying source information can be updated. The methods presented here are tractable for large-scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.
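
The idea of mapping several rooted trees into one shared graph can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on simplifying assumptions: all input trees share the same taxon set (as bootstrap trees do), each node is identified by the set of taxa below it, and "support" is taken to be the fraction of source trees containing a given child-to-parent edge. The function names (add_tree, build_graph, parent_support) and the nested-tuple tree encoding are hypothetical choices made for illustration; the partially overlapping taxon sets handled by a TAG are not covered here.

```python
from collections import defaultdict


def add_tree(graph, tree, source):
    """Add one rooted tree (nested tuples of taxon names) to the graph.

    Nodes are keyed by the frozenset of taxa below them. Each
    child -> parent edge stores the set of source trees containing it.
    Returns the taxon set of the root of `tree`.
    """
    if isinstance(tree, str):  # a leaf is a single taxon name
        return frozenset([tree])
    child_sets = [add_tree(graph, child, source) for child in tree]
    node = frozenset().union(*child_sets)
    for child in child_sets:
        graph[child][node].add(source)
    return node


def build_graph(trees):
    """Map every tree into one shared graph: child -> parent -> sources."""
    graph = defaultdict(lambda: defaultdict(set))
    for index, tree in enumerate(trees):
        add_tree(graph, tree, index)
    return graph


def parent_support(graph, node, n_trees):
    """Fraction of source trees supporting each parent edge of `node`."""
    return {parent: len(sources) / n_trees
            for parent, sources in graph[node].items()}


# Three toy trees over taxa A-D; the third conflicts with the first two.
trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "C"), "D"),
    (("A", ("B", "C")), "D"),
]
graph = build_graph(trees)

a = frozenset(["A"])
out_degree = len(graph[a])  # number of distinct parents proposed for A
support = parent_support(graph, a, len(trees))
```

In this toy example the node for taxon A has two competing parent edges, one to {A, B} found in two of the three trees and one to {A, B, C} found in one, whereas a majority-rule consensus would report only the first. Keeping the source labels on every edge is what lets conflict be inspected rather than collapsed.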

Author Summary

Phylogenetic trees are the most common datatype by which we examine evolutionary patterns. However, biological and practical considerations require the exploration of other models. Here, we address a problem concerning the representation of conflicting and partially overlapping datasets in phylogenetics. We examine the problem of aligning many source trees from independent phylogenetic analyses into a structure that can be analyzed and synthesized but that retains all of the original structure and source information. We present methods to map trees into a common graph structure using a graph database. This allows the information in the trees to be stored and synthesized in several ways. Specifically, we demonstrate how these graphs can be used to construct enormous trees as an alternative to labor-intensive grafting exercises and other methods that make the synthetic tree difficult to update. We also show how examination of the relationships in the graph allows patterns to emerge concerning support and information that are difficult to discern with existing methods. Because these methods scale well into the millions of nodes, these techniques should lead to the construction and maintenance of even larger phylogenies and new techniques for analyzing graphs that maintain the structure of the underlying trees.

Author and article information

Contributors

Sergei L. Kosakovsky Pond: Role: Editor

Journal

Journal ID (nlm-ta): PLoS Comput Biol

Journal ID (iso-abbrev): PLoS Comput. Biol

Journal ID (publisher-id): plos

Journal ID (pmc): ploscomp

Title: PLoS Computational Biology

Publisher: Public Library of Science (San Francisco, USA)

ISSN (Print): 1553-734X

ISSN (Electronic): 1553-7358

Publication date Collection: September 2013

Publication date (Print): September 2013

Publication date (Electronic): 26 September 2013

Volume: 9

Issue: 9

Electronic Location Identifier: e1003223

Affiliations

[1]Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America

University of California San Diego, United States of America

Author notes

* E-mail: eebsmith@umich.edu

The authors have declared that no competing interests exist.

Conceived and designed the experiments: SAS JWB CEH. Performed the experiments: SAS JWB CEH. Analyzed the data: SAS JWB CEH. Contributed reagents/materials/analysis tools: SAS JWB CEH. Wrote the paper: SAS JWB CEH.

Article

Publisher ID: PCOMPBIOL-D-13-00522

DOI: 10.1371/journal.pcbi.1003223

PMC ID: 3784503

PubMed ID: 24086118

SO-VID: 7c508fb3-1885-4a7f-98f0-328c9ebdc887

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received: 28 March 2013

Date accepted: 31 July 2013

Page count

Pages: 15

Funding

SAS, JWB, and CEH were supported by the NSF AVATOL 1207915. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
