
      Phylogenomics provides strong evidence for relationships of butterflies and moths.



          Butterflies and moths constitute some of the most popular and charismatic insects. Lepidoptera include approximately 160 000 described species, many of which are important model organisms. Previous studies on the evolution of Lepidoptera did not confidently place butterflies, and many relationships among superfamilies in the megadiverse clade Ditrysia remain largely uncertain. We generated a molecular dataset with 46 taxa, combining 33 new transcriptomes with 13 available genomes, transcriptomes and expressed sequence tags (ESTs). Using HaMStR with a Lepidoptera-specific core-orthologue set of single copy loci, we identified 2696 genes for inclusion into the phylogenomic analysis. Nucleotides and amino acids of the all-gene, all-taxon dataset yielded nearly identical, well-supported trees. Monophyly of butterflies (Papilionoidea) was strongly supported, and the group included skippers (Hesperiidae) and the enigmatic butterfly-moths (Hedylidae). Butterflies were placed sister to the remaining obtectomeran Lepidoptera, and the latter was grouped with greater than or equal to 87% bootstrap support. Establishing confident relationships among the four most diverse macroheteroceran superfamilies was previously challenging, but we recovered 100% bootstrap support for the following relationships: ((Geometroidea, Noctuoidea), (Bombycoidea, Lasiocampoidea)). We present the first robust, transcriptome-based tree of Lepidoptera that strongly contradicts historical placement of butterflies, and provide an evolutionary framework for genomic, developmental and ecological studies on this diverse insect order.


          Most cited references (33)


          TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations

          We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
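The idea of aligning coding nucleotides by way of their amino acid translations can be illustrated with a minimal sketch. It assumes the protein alignment has already been produced by some aligner, and it only shows the final step of mapping each aligned residue back to its codon. The function name `thread_codons` and the toy sequences are hypothetical and are not taken from TranslatorX.

```python
def thread_codons(nucleotide_seq: str, aligned_protein: str, gap: str = "-") -> str:
    """Map one row of a protein alignment back onto its unaligned coding sequence.

    Each aligned residue is replaced by its original codon and each protein
    gap by three nucleotide gaps, so codons stay intact in the result.
    """
    n_residues = sum(1 for residue in aligned_protein if residue != gap)
    if len(nucleotide_seq) != 3 * n_residues:
        raise ValueError("coding sequence length must be 3 x number of residues")

    # Split the coding sequence into consecutive codons.
    codons = iter(nucleotide_seq[i:i + 3] for i in range(0, len(nucleotide_seq), 3))

    # Walk the aligned protein row: gap -> '---', residue -> next codon.
    return "".join(gap * 3 if residue == gap else next(codons)
                   for residue in aligned_protein)


# Toy example (hypothetical data): two sequences and their protein alignment.
coding = {"seqA": "ATGGGCAAATTT", "seqB": "ATGAAATTT"}
protein_alignment = {"seqA": "MGKF", "seqB": "M-KF"}

codon_alignment = {name: thread_codons(coding[name], protein_alignment[name])
                   for name in coding}
```

Because gaps are always inserted in multiples of three, the reading frame is preserved in every row of the nucleotide alignment.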

            How many bootstrap replicates are necessary?

            Phylogenetic bootstrapping (BS) is a standard technique for inferring confidence values on phylogenetic trees that is based on reconstructing many trees from minor variations of the input data, trees called replicates. BS is used with all phylogenetic reconstruction approaches, but we focus here on one of the most popular, maximum likelihood (ML). Because ML inference is so computationally demanding, it has proved too expensive to date to assess the impact of the number of replicates used in BS on the relative accuracy of the support values. For the same reason, a rather small number (typically 100) of BS replicates are computed in real-world studies. Stamatakis et al. recently introduced a BS algorithm that is 1 to 2 orders of magnitude faster than previous techniques, while yielding qualitatively comparable support values, making an experimental study possible. In this article, we propose stopping criteria--that is, thresholds computed at runtime to determine when enough replicates have been generated--and we report on the first large-scale experimental study to assess the effect of the number of replicates on the quality of support values, including the performance of our proposed criteria. We run our tests on 17 diverse real-world DNA--single-gene as well as multi-gene--datasets, which include 125-2,554 taxa. We find that our stopping criteria typically stop computations after 100-500 replicates (although the most conservative criterion may continue for several thousand replicates) while producing support values that correlate at better than 99.5% with the reference values on the best ML trees. Significantly, we also find that the stopping criteria can recommend very different numbers of replicates for different datasets of comparable sizes. 
            Our results are thus twofold: (i) they give the first experimental assessment of the effect of the number of BS replicates on the quality of support values returned through BS, and (ii) they validate our proposals for stopping criteria. Practitioners will no longer have to enter a guess nor worry about the quality of support values; moreover, with most counts of replicates in the 100-500 range, robust BS under ML inference becomes computationally practical for most datasets. The complete test suite is available at http://lcbb.epfl.ch/BS.tar.bz2, and BS with our stopping criteria is included in the latest release of RAxML v7.2.5, available at http://wwwkramer.in.tum.de/exelixis/software.html.
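The "minor variations of the input data" used in bootstrapping are usually produced by resampling alignment columns with replacement. The sketch below shows only that resampling step under this standard assumption. It is not the RAxML implementation and does not implement the stopping criteria proposed in the article; the function name `bootstrap_replicate` and the toy alignment are hypothetical.

```python
import random


def bootstrap_replicate(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Build one bootstrap replicate by sampling alignment columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_sites = lengths.pop()

    # Draw n_sites column indices with replacement; the same draw is applied
    # to every taxon so that columns stay intact.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


# Toy alignment (hypothetical data) and a fixed seed for reproducibility.
toy_alignment = {"taxon1": "ACGTAC", "taxon2": "ACGTTC", "taxon3": "ATGTAC"}
replicates = [bootstrap_replicate(toy_alignment, random.Random(seed))
              for seed in range(100)]
```

In a full analysis, a tree would be inferred from each replicate and the support for a branch would be the fraction of replicate trees that contain it.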

              A survey of sequence alignment algorithms for next-generation sequencing.

              Rapidly evolving sequencing technologies produce data on an unparalleled scale. A central challenge to the analysis of this data is sequence alignment, whereby sequence reads must be compared to a reference. A wide variety of alignment algorithms and software have been subsequently developed over the past two years. In this article, we will systematically review the current development of these algorithms and introduce their practical applications on different types of experimental data. We come to the conclusion that short-read alignment is no longer the bottleneck of data analyses. We also consider future development of alignment algorithms with respect to emerging long sequence reads and the prospect of cloud computing.

                Author and article information

                Proc. Biol. Sci.
                Proceedings. Biological sciences / The Royal Society
                Aug 7 2014
                Volume: 281
                Issue: 1788
                [1] Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; kawahara@flmnh.ufl.edu
                [2] Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA; jessebreinholt@gmail.com
                © 2014 The Author(s) Published by the Royal Society. All rights reserved.

                Keywords: Lepidoptera, butterfly, moth, orthologue, phylogeny, transcriptome

