      TASSEL-GBS: A High Capacity Genotyping by Sequencing Analysis Pipeline

      research-article


          Abstract

          Genotyping by sequencing (GBS) is a next generation sequencing based method that takes advantage of reduced representation to enable high throughput genotyping of large numbers of individuals at a large number of SNP markers. The relatively straightforward, robust, and cost-effective GBS protocol is currently being applied in numerous species by a large number of researchers. Herein we describe a bioinformatics pipeline, TASSEL-GBS, designed for the efficient processing of raw GBS sequence data into SNP genotypes. The TASSEL-GBS pipeline successfully fulfills the following key design criteria: (1) Ability to run on the modest computing resources that are typically available to small breeding or ecological research programs, including desktop or laptop machines with only 8–16 GB of RAM, (2) Scalability from small to extremely large studies, where hundreds of thousands or even millions of SNPs can be scored in up to 100,000 individuals (e.g., for large breeding programs or genetic surveys), and (3) Applicability in an accelerated breeding context, requiring rapid turnover from tissue collection to genotypes. Although a reference genome is required, the pipeline can also be run with an unfinished “pseudo-reference” consisting of numerous contigs. We describe the TASSEL-GBS pipeline in detail and benchmark it based upon a large scale, species wide analysis in maize (Zea mays), where the average error rate was reduced to 0.0042 through application of population genetic-based SNP filters. Overall, the GBS assay and the TASSEL-GBS pipeline provide robust tools for studying genomic diversity.
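
          The abstract mentions population genetic-based SNP filters without naming them. As a rough illustration only, the sketch below assumes three commonly used per-SNP statistics (call rate, minor allele frequency, and an inbreeding coefficient computed from observed versus expected heterozygosity). The function names, the 0/1/2 genotype coding, and every threshold are hypothetical and are not taken from the article or from the pipeline itself.

```python
def snp_stats(genotypes):
    """Summarise one biallelic SNP.

    genotypes: list of calls coded 0 (homozygous reference), 1 (heterozygous),
    2 (homozygous alternate) or None (missing). This coding is an assumption.
    Returns (call_rate, minor_allele_frequency, inbreeding_coefficient).
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 0.0, 0.0, 0.0
    call_rate = n / len(genotypes)
    alt_freq = sum(called) / (2 * n)          # frequency of the alternate allele
    maf = min(alt_freq, 1 - alt_freq)
    expected_het = 2 * alt_freq * (1 - alt_freq)
    observed_het = called.count(1) / n
    # F = 1 - Hobs/Hexp; defined as 0.0 here when the site is monomorphic.
    f = 1 - observed_het / expected_het if expected_het > 0 else 0.0
    return call_rate, maf, f


def passes_filters(genotypes, min_call_rate=0.5, min_maf=0.01, min_f=0.8):
    """Keep a SNP only if it clears all three (hypothetical) thresholds."""
    call_rate, maf, f = snp_stats(genotypes)
    return call_rate >= min_call_rate and maf >= min_maf and f >= min_f


if __name__ == "__main__":
    inbred_like = [0, 0, 2, 2, 0, 2, None, 0]   # no heterozygotes, one missing call
    paralog_like = [1] * 8                      # every sample called heterozygous
    print(passes_filters(inbred_like))
    print(passes_filters(paralog_like))
```

          In a panel of largely inbred lines, a site where nearly every sample appears heterozygous is more likely an artifact (for example, reads from paralogous loci collapsing onto one position) than a real polymorphism, which is why a high minimum inbreeding coefficient is a plausible filter in that setting. Outbred populations would need a different threshold.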

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 28 February 2014
                Volume: 9
                Issue: 2
                Article: e90346
                Affiliations
                [1]Institute for Genomic Diversity, Cornell University, Ithaca, New York, United States of America
                [2]Biotechnology Resource Center Bioinformatics Facility, Cornell University, Ithaca, New York, United States of America
                [3]USDA Agricultural Research Service, Ithaca, New York, United States of America
                Agriculture and Agri-Food Canada, Canada
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: ESB JCG QS. Performed the experiments: ESB JCG TMC FL JH QS RJE. Analyzed the data: JCG ESB TMC QS RJE. Wrote the paper: JCG ESB.

                [¤] Current address: AgResearch Limited, Grasslands Research Centre, Palmerston North, New Zealand

                Article
                Manuscript: PONE-D-13-47679
                DOI: 10.1371/journal.pone.0090346
                PMCID: PMC3938676
                PMID: 24587335
                Copyright © 2014

                This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

                History
                Received: 13 November 2013
                Accepted: 28 January 2014
                Page count
                Pages: 11
                Funding
                This work was supported by the National Science Foundation (www.nsf.gov) under the Plant Genome Research Program (PGRP) (grant numbers DBI-0820619 and IOS-1238014) and the Basic Research to Enable Agricultural Development (BREAD) project (ID: IOS-0965342), as well as by the USDA-ARS (www.usda.gov). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Agriculture
                Agricultural biotechnology
                Marker-assisted selection
                Crops
                Cereals
                Maize
                Biology
                Computational biology
                Genomics
                Genome analysis tools
                Genome sequencing
                Population genetics
                Genetic polymorphism
                Biological data management
                Sequence analysis
                Genetics
                Animal genetics
                Genome-wide association studies
                Plant genetics
                Population genetics
                Genomics
                Genome analysis tools
                Genetic maps
                Genome-wide association studies
                Linkage maps
                Sequence assembly tools
                Genome sequencing
                Model organisms
                Plant and algal models
                Maize
                Plant science
                Agronomy
                Plant breeding
                Plant genetics
                Plant genomics
