      Predicting RAD-seq Marker Numbers across the Eukaryotic Tree of Life

      research-article


          Abstract

          High-throughput sequencing of reduced representation libraries obtained through digestion with restriction enzymes, generically known as restriction site associated DNA sequencing (RAD-seq), is a common strategy to generate genome-wide genotypic and sequence data from eukaryotes. A critical design element of any RAD-seq study is knowledge of the approximate number of genetic markers that can be obtained for a taxon using different restriction enzymes, as this number determines the scope of a project, and ultimately defines its success. This number can only be directly determined if a reference genome sequence is available, or it can be estimated if the genome size and restriction recognition sequence probabilities are known. However, both scenarios are uncommon for nonmodel species. Here, we performed systematic in silico surveys of recognition sequences for diverse and commonly used type II restriction enzymes across the eukaryotic tree of life. Our observations reveal that recognition sequence frequencies for a given restriction enzyme are strikingly variable among broad eukaryotic taxonomic groups, being largely determined by phylogenetic relatedness. We demonstrate that genome sizes can be predicted from cleavage frequency data obtained with restriction enzymes targeting “neutral” elements. Models based on genomic compositions are also effective tools to accurately calculate probabilities of recognition sequences across taxa, and can be applied to species for which reduced representation data are available (including transcriptomes and neutral RAD-seq data sets). The analytical pipeline developed in this study, PredRAD (https://github.com/phrh/PredRAD), and the resulting databases constitute valuable resources that will help guide the design of any study using RAD-seq or related methods.
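
          The two routes to a marker-number estimate mentioned in the abstract (counting recognition sequences directly in a reference sequence, or combining genome size with a recognition-sequence probability) can be illustrated with a minimal Python sketch. This is not the PredRAD code: the function names are hypothetical, and the probability model is an assumption, namely the simplest mononucleotide composition model in which bases are independent and P(G) = P(C) = GC/2, P(A) = P(T) = (1 - GC)/2.

```python
def site_probability(site: str, gc_content: float) -> float:
    """Probability that `site` starts at a given genome position.

    Assumes a mononucleotide model (illustrative only): bases are
    independent, P(G) = P(C) = gc/2 and P(A) = P(T) = (1 - gc)/2.
    """
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    p_gc = gc_content / 2.0
    p_at = (1.0 - gc_content) / 2.0
    prob = 1.0
    for base in site.upper():
        if base in "GC":
            prob *= p_gc
        elif base in "AT":
            prob *= p_at
        else:
            raise ValueError(f"unsupported base: {base!r}")
    return prob


def expected_cut_sites(site: str, gc_content: float, genome_size_bp: int) -> float:
    """Expected number of recognition sequences in a genome of the given size."""
    positions = max(genome_size_bp - len(site) + 1, 0)
    return site_probability(site, gc_content) * positions


def count_cut_sites(sequence: str, site: str) -> int:
    """Count occurrences of `site` on one strand, allowing overlaps.

    For palindromic recognition sequences the reverse strand carries
    the same sites, so scanning one strand is sufficient.
    """
    sequence, site = sequence.upper(), site.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count
```

          For example, `expected_cut_sites("CCTGCAGG", 0.41, 3_000_000_000)` gives a rough expectation for an 8-bp recognition sequence in a 3 Gb genome with 41% GC content (both numbers are placeholders, not values from the article). As the abstract notes, observed frequencies vary strongly among taxonomic groups, so an estimate from such a simple composition model should be read as a first approximation only.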


                Author and article information

                Journal
                Genome Biology and Evolution (Genome Biol Evol; gbe)
                Oxford University Press
                ISSN: 1759-6653
                December 2015
                03 November 2015
                Volume 7, Issue 12, pages 3207-3225
                Affiliations
                1Biology Department, Woods Hole Oceanographic Institution
                2Biology Department, Massachusetts Institute of Technology
                3Colombian Corporation for Agricultural Research (CORPOICA), Bogotá, Colombia
                Author notes
                *Corresponding author: E-mail: sherrera@alum.mit.edu.

                Associate editor: Cécile Ané

                Data deposition: This project, including the analytical software pipeline (PredRAD), the visualization scripts, and the output databases, has been deposited at GitHub under the accession https://github.com/phrh/PredRAD.

                Article
                Article ID: evv210
                DOI: 10.1093/gbe/evv210
                PMCID: PMC4700943
                PMID: 26537225
                © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

                History: 23 October 2015
                Page count
                Pages: 19
                Categories
                Genome Resources

                Genetics
                rad-seq, reduced representation sequencing, predrad, experimental design, genome size prediction, restriction recognition sequence probability
