      Is Open Access

      In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae

      research-article


          Abstract

          The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55–83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

          Most cited references (49)


          TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets.

          TGICL is a pipeline for analysis of large Expressed Sequence Tags (EST) and mRNA databases in which the sequences are first clustered based on pairwise sequence similarity, and then assembled by individual clusters (optionally with quality values) to produce longer, more complete consensus sequences. The system can run on multi-CPU architectures including SMP and PVM.

            The origins of genome complexity.

            Complete genomic sequences from diverse phylogenetic lineages reveal notable increases in genome complexity from prokaryotes to multicellular eukaryotes. The changes include gradual increases in gene number, resulting from the retention of duplicate genes, and more abrupt increases in the abundance of spliceosomal introns and mobile genetic elements. We argue that many of these modifications emerged passively in response to the long-term population-size reductions that accompanied increases in organism size. According to this model, much of the restructuring of eukaryotic genomes was initiated by nonadaptive processes, and this in turn provided novel substrates for the secondary evolution of phenotypic complexity by natural selection. The enormous long-term effective population sizes of prokaryotes may impose a substantial barrier to the evolution of complex genomes and morphologies.

              RepeatExplorer: a Galaxy-based web server for genome-wide characterization of eukaryotic repetitive elements from next-generation sequence reads.

              Repetitive DNA makes up large portions of plant and animal nuclear genomes, yet it remains the least-characterized genome component in most species studied so far. Although the recent availability of high-throughput sequencing data provides necessary resources for in-depth investigation of genomic repeats, its utility is hampered by the lack of specialized bioinformatics tools and appropriate computational resources that would enable large-scale repeat analysis to be run by biologically oriented researchers. Here we present RepeatExplorer, a collection of software tools for characterization of repetitive elements, which is accessible via web interface. A key component of the server is the computational pipeline using a graph-based sequence clustering algorithm to facilitate de novo repeat identification without the need for reference databases of known elements. Because the algorithm uses short sequences randomly sampled from the genome as input, it is ideal for analyzing next-generation sequence reads. Additional tools are provided to aid in classification of identified repeats, investigate phylogenetic relationships of retroelements and perform comparative analysis of repeat composition between multiple species. The server allows to analyze several million sequence reads, which typically results in identification of most high and medium copy repeats in higher plant genomes.

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 25 November 2015
                Volume 10, Issue 11, e0143424
                Affiliations
                [1] Biology Centre of the Czech Academy of Sciences, Institute of Plant Molecular Biology, České Budějovice, Czech Republic
                [2] Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey, United Kingdom
                [3] Institute of Experimental Botany, Olomouc, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
                [4] School of Biological and Chemical Sciences, Queen Mary University of London, London, United Kingdom
                Leibniz-Institute of Plant Genetics and Crop Plant Research (IPK), GERMANY
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: JM P. Novák P. Neumann JD IJL. Performed the experiments: JM AK IF JC JD. Analyzed the data: JM P. Novák JP P. Neumann LJK IJL. Contributed reagents/materials/analysis tools: P. Novák JM. Wrote the paper: JM JD IJL.

                Article
                Manuscript: PONE-D-15-44868
                DOI: 10.1371/journal.pone.0143424
                PMCID: PMC4659654
                PMID: 26606051
                Record ID: c63d2908-9719-4203-9f30-10b201b6e07a
                Copyright © 2015

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 12 October 2015
                Accepted: 4 November 2015
                Page count
                Figures: 4, Tables: 3, Pages: 23
                Funding
                This work was supported by grants from the Czech Science Foundation [GBP501/12/G090] and the Czech Academy of Sciences [RVO:60077344] to JM and from the National Program of Sustainability I. [LO1204] to JD. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Custom metadata
                Sequence data for all newly sequenced species are available from the European Nucleotide Archive (http://www.ebi.ac.uk/ena) under the study accession number ERP004630 (Repeat characterization in Fabeae genomes).
