
      Comprehensive Transcriptome Assembly of Chickpea (Cicer arietinum L.) Using Sanger and Next Generation Sequencing Platforms: Development and Applications

      research-article


          Abstract

          A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprises 46,369 transcript assembly contigs (TACs) and has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as with the chickpea genome. Comparative analysis of CaTA v2 against the transcriptomes of three legumes (Medicago, soybean and common bean) identified 27,771 TACs common to all three, indicating strong conservation of genes across legumes. CaTA v2 was also used to identify simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at the chromosomal level were identified using the transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig with an inferred map position, distributed across the eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups was validated on five chickpea genotypes and showed 20% polymorphism with an average polymorphic information content (PIC) of 0.27.
In summary, the hybrid transcriptome assembly developed and the novel markers identified can be used for a variety of applications, such as gene discovery, marker-trait association and diversity analysis, to advance genetics research and breeding applications in chickpea and other related legumes.
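
The abstract reports two summary statistics, an N50 contig length and an average PIC, without restating how they are defined. The sketch below shows one common way to compute each. It is an illustration under stated assumptions, not the authors' pipeline: the function names `n50` and `pic` are hypothetical, the PIC formula is the widely used Botstein et al. (1980) form (the abstract does not say which variant was applied), and the values in the usage note are toy inputs, not data from the study.

```python
from itertools import combinations


def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def pic(allele_freqs):
    """Polymorphic information content for one marker.

    Assumed formula (Botstein et al. 1980):
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    where p_i are allele frequencies that sum to 1.
    """
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1 - homozygosity - cross_term
```

For example, `n50([2, 3, 4, 5, 6])` returns 5, because the two longest toy contigs (6 + 5 = 11) already cover half of the 20 total bases, and `pic([0.5, 0.5])` returns 0.375, the maximum for a marker with two alleles under this formula.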


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 23 January 2014
                Volume 9, Issue 1, e86039
                Affiliations
                [1] Research Program on Grain Legumes, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Andhra Pradesh, India
                [2] National Research Council Canada (NRC-CNRC), Saskatoon, Saskatchewan, Canada
                [3] Department of Plant Sciences, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
                [4] Department of Agronomy, Iowa State University, Ames, Iowa, United States of America
                [5] National Center for Genome Resources (NCGR), Santa Fe, New Mexico, United States of America
                [6] United States Department of Agriculture–Agricultural Research Service (USDA–ARS), Corn Insects and Crop Genetics Research Unit (USDA-ARS-CICGRU), Ames, Iowa, United States of America
                [7] CGIAR Generation Challenge Programme (GCP), c/o CIMMYT, Mexico DF, Mexico
                University of Guelph, Canada
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: RKV AGS BT ADF SC. Performed the experiments: HK SA CC RL. Analyzed the data: HK SA BD CC. Contributed reagents/materials/analysis tools: RKV ADF SC BT AGS. Wrote the paper: RKV HK SA ADF AGS SC.

                Article
                Manuscript ID: PONE-D-13-19774
                DOI: 10.1371/journal.pone.0086039
                PMCID: PMC3900451
                PMID: 24465857
                Copyright © 2014

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 14 May 2013
                Accepted: 3 December 2013
                Page count
                Pages: 12
                Funding
                The authors are thankful to the DST-INSPIRE program of the Department of Science and Technology, Government of India, the Indo-German Science Technology Centre (IGSTC), the CGIAR Generation Challenge Programme (GCP) and the Saskatchewan Agriculture Development Fund (ADF) for financial support to undertake part of the research presented in this study. This work has been undertaken as part of the CGIAR Research Program on Grain Legumes. ICRISAT is a member of the CGIAR Consortium. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Agriculture
                Agricultural Biotechnology
                Crops
                Biology
                Biotechnology
                Computational Biology
                Genomics
                Genome Analysis Tools
                Sequence Assembly Tools
                Transcriptomes
                Sequence Analysis
                Developmental Biology
                Genetics
                Genomics
                Genome Analysis Tools
                Sequence Assembly Tools
                Transcriptomes
                Genome Databases
                Sequence Databases
                Model Organisms
                Plant Science
                Plant Biotechnology
                Plant Genetics
                Plant Genomics
                Plants
