
      Inferring Phylogenies from RAD Sequence Data

      research-article


          Abstract

          Reduced-representation genome sequencing represents a new source of data for systematics, and its potential utility in interspecific phylogeny reconstruction has not yet been explored. One approach that seems especially promising is the use of inexpensive short-read technologies (e.g., Illumina, SOLiD) to sequence restriction-site associated DNA (RAD), the regions of the genome that flank the recognition sites of restriction enzymes. In this study, we simulated the collection of RAD sequences from sequenced genomes of different taxa (Drosophila, mammals, and yeasts) and developed a proof-of-concept workflow to test whether informative data could be extracted and used to accurately reconstruct “known” phylogenies of species within each group. The workflow consists of three basic steps: first, sequences are clustered by similarity to estimate orthology; second, clusters are filtered by taxonomic coverage; and third, they are aligned and concatenated for “total evidence” phylogenetic analysis. We evaluated the performance of clustering and filtering parameters by comparing the resulting topologies with well-supported reference trees and we were able to identify conditions under which the reference tree was inferred with high support. For Drosophila, whole genome alignments allowed us to directly evaluate which parameters most consistently recovered orthologous sequences. For the parameter ranges explored, we recovered the best results at the low ends of sequence similarity and taxonomic representation of loci; these generated the largest supermatrices with the highest proportion of missing data. Applications of the method to mammals and yeasts were less successful, which we suggest may be due partly to their much deeper evolutionary divergence times compared to Drosophila (crown ages of approximately 100 and 300 versus 60 Mya, respectively). RAD sequences thus appear to hold promise for reconstructing phylogenetic relationships in younger clades in which sufficient numbers of orthologous restriction sites are retained across species.
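
The second and third workflow steps (filtering clusters by taxonomic coverage, then concatenating them into a supermatrix) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that clustering and alignment have already been done, so each cluster is a mapping from taxon name to an aligned sequence of equal length. The function names, the `min_taxa` parameter, the taxon labels, the toy sequences, and the use of `N` for missing data are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# One putative locus: taxon name -> aligned sequence (assumed equal length).
Cluster = Dict[str, str]


def filter_by_coverage(clusters: List[Cluster], min_taxa: int) -> List[Cluster]:
    """Keep only clusters that contain sequences from at least `min_taxa` taxa."""
    return [c for c in clusters if len(c) >= min_taxa]


def concatenate(clusters: List[Cluster], taxa: List[str],
                missing: str = "N") -> Dict[str, str]:
    """Join aligned clusters end to end; pad taxa absent from a cluster."""
    parts: Dict[str, List[str]] = {t: [] for t in taxa}
    for cluster in clusters:
        length = len(next(iter(cluster.values())))
        for t in taxa:
            parts[t].append(cluster.get(t, missing * length))
    return {t: "".join(p) for t, p in parts.items()}


def missing_fraction(matrix: Dict[str, str], missing: str = "N") -> float:
    """Proportion of supermatrix cells filled with the missing-data symbol."""
    total = sum(len(s) for s in matrix.values())
    if total == 0:
        return 0.0
    return sum(s.count(missing) for s in matrix.values()) / total


# Toy data (hypothetical): four taxa, four clusters with uneven coverage.
taxa = ["sp_A", "sp_B", "sp_C", "sp_D"]
clusters = [
    {"sp_A": "ACGT", "sp_B": "ACGA", "sp_C": "ACGT", "sp_D": "ACTT"},
    {"sp_A": "GGCC", "sp_B": "GGCT"},
    {"sp_A": "TTAA", "sp_B": "TTAA", "sp_C": "TTAG"},
    {"sp_D": "CCCC"},
]

kept = filter_by_coverage(clusters, min_taxa=2)
supermatrix = concatenate(kept, taxa)
```

In this toy case, raising `min_taxa` to 4 keeps a single fully sampled locus, while `min_taxa=2` keeps three loci at the cost of padded cells. This is the same trade-off between supermatrix size and missing data that the abstract describes.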


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 6 April 2012
                Volume 7, Issue 4, e33394
                Affiliations
                [1] Committee on Evolutionary Biology, University of Chicago, Chicago, Illinois, United States of America
                [2] Department of Zoology, Field Museum of Natural History, Chicago, Illinois, United States of America
                [3] Department of Botany, Field Museum of Natural History, Chicago, Illinois, United States of America
                Barnard College, Columbia University, United States of America
                Author notes

                Conceived and designed the experiments: BERR RHR CSM. Performed the experiments: BERR. Analyzed the data: BERR. Contributed reagents/materials/analysis tools: BERR RHR. Wrote the paper: BERR RHR CSM.

                Article
                Manuscript number: PONE-D-11-17992
                DOI: 10.1371/journal.pone.0033394
                PMCID: PMC3320897
                PMID: 22493668
                Rubin et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 13 September 2011
                Accepted: 14 February 2012
                Page count
                Pages: 12
                Categories
                Research Article
                Biology
                Computational Biology
                Evolutionary Biology
                Evolutionary Systematics
                Genomics
                Model Organisms
                Animal Models
                Zoology
