
      Signal, Uncertainty, and Conflict in Phylogenomic Data for a Diverse Lineage of Microbial Eukaryotes (Diatoms, Bacillariophyta)

      research-article


          Abstract

          Diatoms (Bacillariophyta) are a species-rich group of eukaryotic microbes diverse in morphology, ecology, and metabolism. Previous reconstructions of the diatom phylogeny based on one or a few genes have resulted in inconsistent resolution or low support for critical nodes. We applied phylogenetic paralog pruning techniques to a data set of 94 diatom genomes and transcriptomes to infer perennially difficult species relationships, using concatenation and summary-coalescent methods to reconstruct species trees from data sets spanning a wide range of thresholds for taxon and column occupancy in gene alignments. Conflicts between gene and species trees decreased with both increasing taxon occupancy and bootstrap cutoffs applied to gene trees. Concordance between gene and species trees was lowest for short internodes and increased logarithmically with increasing edge length, suggesting that incomplete lineage sorting disproportionately affects species tree inference at short internodes, which are a common feature of the diatom phylogeny. Although species tree topologies were largely consistent across many data treatments, concatenation methods appeared to outperform summary-coalescent methods for sparse alignments. Our results underscore that approaches to species-tree inference based on few loci are likely to be misled by unrepresentative sampling of gene histories, particularly in lineages that may have diversified rapidly. In addition, phylogenomic studies of diatoms, and potentially other hyperdiverse groups, should maximize the number of gene trees with high taxon occupancy, though there is clearly a limit to how many of these genes will be available.
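          The abstract refers to thresholds for taxon and column occupancy in gene alignments without defining them. The following is a minimal sketch of how such filters might be computed, not the authors' pipeline. It assumes taxon occupancy means the fraction of study taxa with at least one non-missing character in an alignment, and column occupancy means the fraction of sequences with a non-missing character at a site. The function names, the `MISSING` character set, and the toy alignment are all hypothetical.

```python
from typing import Dict

# Characters treated as missing data (an assumption for this sketch).
MISSING = {"-", "?"}


def taxon_occupancy(alignment: Dict[str, str], n_study_taxa: int) -> float:
    """Fraction of all study taxa that carry data in this alignment."""
    present = sum(
        1 for seq in alignment.values() if any(c not in MISSING for c in seq)
    )
    return present / n_study_taxa


def column_occupancy(alignment: Dict[str, str], col: int) -> float:
    """Fraction of sequences with a non-missing character at one column."""
    seqs = list(alignment.values())
    filled = sum(1 for seq in seqs if seq[col] not in MISSING)
    return filled / len(seqs)


def filter_columns(alignment: Dict[str, str], min_col_occ: float) -> Dict[str, str]:
    """Keep only columns whose occupancy meets the threshold."""
    n_cols = len(next(iter(alignment.values())))
    keep = [i for i in range(n_cols) if column_occupancy(alignment, i) >= min_col_occ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


def passes_taxon_filter(
    alignment: Dict[str, str], n_study_taxa: int, min_taxon_occ: float
) -> bool:
    """True if the alignment samples enough of the study's taxa."""
    return taxon_occupancy(alignment, n_study_taxa) >= min_taxon_occ


# Toy alignment: three of four hypothetical study taxa are sampled.
toy = {"A": "MK-L", "B": "MKAL", "C": "M--L"}
trimmed = filter_columns(toy, min_col_occ=0.5)
```

          In this toy case the third column is dropped because only one of three sequences has data there, and the alignment would pass a taxon occupancy threshold of 0.75 but not one of 0.8.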


          Most cited references (53)


          ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data

          The Environment for Tree Exploration (ETE) is a computational framework that simplifies the reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. Here, we present ETE v3, featuring numerous improvements in the underlying library of methods, and providing a novel set of standalone tools to perform common tasks in comparative genomics and phylogenetics. The new features include (i) building gene-based and supermatrix-based phylogenies using a single command, (ii) testing and visualizing evolutionary models, (iii) calculating distances between trees of different size or including duplications, and (iv) providing seamless integration with the NCBI taxonomy database. ETE is freely available at http://etetoolkit.org

            Gene Trees in Species Trees


              ASTRAL: genome-scale coalescent-based species tree estimation

              Motivation: Species trees provide insight into basic biology, including the mechanisms of evolution and how it modifies biomolecular function and structure, biodiversity and co-evolution between genes and species. Yet, gene trees often differ from species trees, creating challenges to species tree estimation. One of the most frequent causes for conflicting topologies between gene trees and species trees is incomplete lineage sorting (ILS), which is modelled by the multi-species coalescent. While many methods have been developed to estimate species trees from multiple genes, some which have statistical guarantees under the multi-species coalescent model, existing methods are too computationally intensive for use with genome-scale analyses or have been shown to have poor accuracy under some realistic conditions. Results: We present ASTRAL, a fast method for estimating species trees from multiple genes. ASTRAL is statistically consistent, can run on datasets with thousands of genes and has outstanding accuracy—improving on MP-EST and the population tree from BUCKy, two statistically consistent leading coalescent-based methods. ASTRAL is often more accurate than concatenation using maximum likelihood, except when ILS levels are low or there are too few gene trees. Availability and implementation: ASTRAL is available in open source form at https://github.com/smirarab/ASTRAL/. Datasets studied in this article are available at http://www.cs.utexas.edu/users/phylo/datasets/astral. Contact: warnow@illinois.edu Supplementary information: Supplementary data are available at Bioinformatics online.
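              The statement that gene trees often differ from species trees can be made concrete with a small comparison. The sketch below is not ASTRAL's algorithm; it is a simple rooted-clade comparison, assuming trees are written as nested Python tuples of leaf names and that the gene tree and species tree share the same taxa. The function names and the four-taxon example are hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

# A tree is either a leaf name or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf set below every internal node of a rooted tree."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def clade_concordance(gene_tree: Tree, species_tree: Tree) -> float:
    """Fraction of non-root species-tree clades also present in the gene tree."""
    species_clades = clades(species_tree)
    root = max(species_clades, key=len)  # the root clade holds every leaf
    informative = species_clades - {root}
    if not informative:
        return 1.0
    return len(informative & clades(gene_tree)) / len(informative)


# Hypothetical example: the gene tree swaps the positions of B and C.
species = ((("A", "B"), "C"), "D")
gene = ((("A", "C"), "B"), "D")
score = clade_concordance(gene, species)
```

              Here the species-tree clade {A, B, C} is recovered by the gene tree but {A, B} is not, so half of the informative clades are concordant.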

                Author and article information

                Journal: Molecular Biology and Evolution (Mol. Biol. Evol.)
                Publisher: Oxford University Press
                ISSN: 0737-4038, 1537-1719
                Publication dates: January 2018, 12 October 2017
                Volume 35, issue 1, pages 80-93
                Affiliations
                [1 ]Daniel F. and Ada L. Rice Plant Conservation Science Center, Chicago Botanic Garden, Glencoe, IL
                [2 ]Department of Biological Sciences, University of Arkansas, Fayetteville, AR
                Author notes

                All nuclear transcript assemblies, ortholog alignments, and Newick-formatted gene and species trees are available in the Zenodo data repository (DOI 10.5281/zenodo.344519; https://zenodo.org/).

                Corresponding author. E-mail: mparks@chicagobotanic.org

                Associate editor: Beth Shapiro

                Article
                Publisher ID: msx268
                DOI: 10.1093/molbev/msx268
                PMC ID: 5850769
                PubMed ID: 29040712
                © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                Pages: 14
                Funding
                Funded by: National Science Foundation 10.13039/100000001
                Award ID: DEB-1353131
                Award ID: DEB-1353152
                Funded by: Arkansas Economic Development Commission
                Award ID: DEB-1239992
                Award ID: DEB-1342873
                Categories
                Discoveries

                Molecular biology
                diatoms, Bacillariophyta, phylogenomics, phylotranscriptomics, incomplete lineage sorting
