      Decisive Data Sets in Phylogenomics: Lessons from Studies on the Phylogenetic Relationships of Primarily Wingless Insects


          Phylogenetic relationships of the primarily wingless insects are still considered unresolved. Even the most comprehensive phylogenomic studies that addressed this question did not yield congruent results. To get a grip on these problems, we here analyzed the sources of incongruence in these phylogenomic studies by using an extended transcriptome data set. Our analyses showed that unevenly distributed missing data can be severely misleading by inflating node support despite the absence of phylogenetic signal. In consequence, only decisive data sets should be used which exclusively comprise data blocks containing all taxa whose relationships are addressed. Additionally, we used Four-cluster Likelihood Mapping (FcLM) to measure the degree of congruence among genes of a data set, as a measure of support alternative to bootstrap. FcLM showed incongruent signal among genes, which in our case is correlated neither with functional class assignment of these genes nor with model misspecification due to unpartitioned analyses. The herein analyzed data set is the currently largest data set covering primarily wingless insects, but failed to elucidate their interordinal phylogenetic relationships. Although this is unsatisfying from a phylogenetic perspective, we try to show that the analyses of structure and signal within phylogenomic data can protect us from biased phylogenetic inferences due to analytical artifacts.
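The decisiveness criterion described in the abstract (keep only data blocks that contain all taxa whose relationships are addressed) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `has_data` and `decisive_blocks`, the dictionary layout of the alignment blocks, the taxon labels, and the set of characters treated as missing are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Assumed symbols for gaps and ambiguous/missing amino acids (hypothetical choice).
MISSING = set("-?X")


def has_data(seq: str) -> bool:
    """Return True if the sequence has at least one non-missing character."""
    return any(ch.upper() not in MISSING for ch in seq)


def decisive_blocks(
    blocks: Dict[str, Dict[str, str]], required_taxa: Set[str]
) -> List[str]:
    """Return names of blocks in which every required taxon has data.

    `blocks` maps a block (gene) name to a {taxon: aligned sequence} dict.
    A taxon that is absent, or present only as gaps, counts as missing.
    """
    kept = []
    for name, alignment in blocks.items():
        if all(has_data(alignment.get(taxon, "")) for taxon in required_taxa):
            kept.append(name)
    return kept


# Toy example with made-up sequences.
blocks = {
    "gene_A": {"Protura": "MKV", "Diplura": "MRV", "Collembola": "MKI", "Insecta": "MKL"},
    "gene_B": {"Protura": "---", "Diplura": "MRV", "Collembola": "MKI", "Insecta": "MKL"},
    "gene_C": {"Diplura": "MRV", "Collembola": "MKI", "Insecta": "MKL"},
}
required = {"Protura", "Diplura", "Collembola", "Insecta"}

print(decisive_blocks(blocks, required))  # ['gene_A']
```

In this toy input, `gene_B` is dropped because one required taxon is present only as gaps, and `gene_C` is dropped because that taxon is absent altogether. Only blocks that can contribute signal about every addressed relationship survive the filter.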

          Most cited references 59

          Amino acid substitution matrices from protein blocks.

          Methods for alignment of protein sequences typically measure similarity by using a substitution matrix with scores for all possible exchanges of one amino acid with another. The most widely used matrices are based on the Dayhoff model of evolutionary rates. Using a different approach, we have derived substitution matrices from about 2000 blocks of aligned sequence segments characterizing more than 500 groups of related proteins. This led to marked improvements in alignments and in searches using queries from each of the groups.
            Among-site rate variation and its impact on phylogenetic analyses.

             Ziheng Yang (1996)
            Although several decades of study have revealed the ubiquity of variation of evolutionary rates among sites, reliable methods for studying rate variation were not developed until very recently. Early methods fit theoretical distributions to the numbers of changes at sites inferred by parsimony and substantially underestimate the rate variation. Recent analyses show that failure to account for rate variation can have drastic effects, leading to biased dating of speciation events, biased estimation of the transition:transversion rate ratio, and incorrect reconstruction of phylogenies.
              Inferring ancient divergences requires genes with strong phylogenetic signals.

              To tackle incongruence, the topological conflict between different gene trees, phylogenomic studies couple concatenation with practices such as rogue taxon removal or the use of slowly evolving genes. Phylogenomic analysis of 1,070 orthologues from 23 yeast genomes identified 1,070 distinct gene trees, which were all incongruent with the phylogeny inferred from concatenation. Incongruence severity increased for shorter internodes located deeper in the phylogeny. Notably, whereas most practices had little or negative impact on the yeast phylogeny, the use of genes or internodes with high average internode support significantly improved the robustness of inference. We obtained similar results in analyses of vertebrate and metazoan phylogenomic data sets. These results question the exclusive reliance on concatenation and associated practices, and argue that selecting genes with strong phylogenetic signals and demonstrating the absence of significant incongruence are essential for accurately reconstructing ancient divergences.

                Author and article information

                Molecular Biology and Evolution (Mol. Biol. Evol.)
                Oxford University Press
                January 2014
                18 October 2013
                Volume 31, Issue 1, Pages 239-249
                1 Department of Integrative Zoology, University of Vienna, Vienna, Austria
                2 Zoologisches Forschungsmuseum Alexander Koenig, Zentrum für Molekulare Biodiversitätsforschung (zmb), Bonn, Germany
                3 CSIRO Ecosystem Sciences, Australian National Insect Collection, Acton, ACT, Australia
                4 Zoologisches Forschungsmuseum Alexander Koenig, Abteilung Arthropoda, Bonn, Germany
                5 Institut für Systemische Neurowissenschaften, Universitätsklinikum Hamburg-Eppendorf, Hamburg, Germany
                6 Biozentrum Grindel & Zoologisches Museum, Universität Hamburg, Hamburg, Germany
                7 Heidelberg Institute for Theoretical Studies (HITS), Scientific Computing Group, Heidelberg, Germany
                8 Karlsruher Institut für Technologie, Fakultät für Informatik, Karlsruhe, Germany
                9 Center for Integrative Bioinformatics Vienna (CIBIV), Max F Perutz Laboratories, University of Vienna, Medical University of Vienna, Vienna, Austria
                10 Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria
                11 Institute for Cell Biology and Neuroscience, Goethe-Universität Frankfurt, Frankfurt am Main, Germany
                Author notes

                These authors contributed equally to this work.

                Associate editor: Nicolas Vidal

                © The Author 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

                Page count
                Pages: 11