
      Comparative analysis of morabine grasshopper genomes reveals highly abundant transposable elements and rapidly proliferating satellite DNA repeats

      research-article


          Abstract

          Background

          Repetitive DNA sequences, including transposable elements (TEs) and tandemly repeated satellite DNA (satDNAs), collectively called the “repeatome”, are found in high proportion in organisms across the Tree of Life. Grasshoppers have large genomes, averaging 9 Gb, that contain a high proportion of repetitive DNA, which has hampered progress in assembling reference genomes. Here we combined linked-read genomics with transcriptomics to assemble, characterize, and compare the structure of repetitive DNA sequences in four chromosomal races of the morabine grasshopper Vandiemenella viatica species complex and determine their contribution to genome evolution.

          Results

          We obtained linked-read genome assemblies of 2.73–3.27 Gb from estimated genome sizes of 4.26–5.07 Gb DNA per haploid genome of the four chromosomal races of V. viatica. These constitute the third largest insect genomes assembled so far. Combining complementary annotation tools and manual curation, we found a large diversity of TEs and satDNAs, constituting 66 to 75% per genome assembly. A comparison of sequence divergence within the TE classes revealed massive accumulation of recent TEs in all four races (314–463 Mb per assembly), indicating that their large genome sizes are likely due to similar rates of TE accumulation. Transcriptome sequencing showed more biased TE expression in reproductive tissues than somatic tissues, implying permissive transcription in gametogenesis. Out of 129 satDNA families, 102 satDNA families were shared among the four chromosomal races, which likely represent a diversity of satDNA families in the ancestor of the V. viatica chromosomal races. Notably, 50 of these shared satDNA families underwent differential proliferation since the recent diversification of the V. viatica species complex.
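          The proportions implied by the figures above can be checked with a short calculation. The following is a minimal sketch, not part of the authors' analysis: the function and constant names are hypothetical, and because the abstract does not pair each assembly size with its race's genome size estimate, only the loosest possible bounds on the assembled fraction are computed.

```python
# Figures quoted in the Results paragraph (Gb = gigabases).
ASSEMBLY_GB = (2.73, 3.27)   # smallest and largest linked-read assembly
GENOME_GB = (4.26, 5.07)     # smallest and largest estimated haploid genome size
SATDNA_TOTAL = 129           # satDNA families found overall
SATDNA_SHARED = 102          # families shared among all four races
SATDNA_DIFFERENTIAL = 50     # shared families with differential proliferation


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def assembled_fraction_bounds(assembly, genome):
    """Loosest bounds on assembly size / genome size, in percent.

    The per-race pairing is not given in the abstract (an assumption of
    this sketch), so the smallest assembly is compared with the largest
    estimate and vice versa.
    """
    low = percent(min(assembly), max(genome))
    high = percent(max(assembly), min(genome))
    return low, high


if __name__ == "__main__":
    low, high = assembled_fraction_bounds(ASSEMBLY_GB, GENOME_GB)
    print(f"Assembled fraction of estimated genome: {low}% to {high}%")
    print(f"Shared satDNA families: {percent(SATDNA_SHARED, SATDNA_TOTAL)}%")
    print(
        "Shared families with differential proliferation: "
        f"{percent(SATDNA_DIFFERENTIAL, SATDNA_SHARED)}%"
    )
```

          Under these assumptions the assemblies cover somewhere between roughly half and three quarters of the estimated genome sizes, about four in five satDNA families are shared among the races, and about half of the shared families show differential proliferation.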

          Conclusion

          This in-depth annotation of the repeatome in morabine grasshoppers provided new insights into the genome evolution of Orthoptera. Our TE analysis revealed a massive recent accumulation of TEs equivalent to the size of entire Drosophila genomes, which likely explains the large genome sizes in grasshoppers. Despite an overall high similarity of the TE and satDNA diversity between races, the patterns of TE expression and satDNA proliferation suggest rapid evolution of grasshopper genomes on recent timescales.

          Supplementary information

          Supplementary information accompanies this paper at 10.1186/s12915-020-00925-x.


                Author and article information

                Contributors
                octavio.palacios@ebc.uu.se
                tkawakami@embarkvet.com
                alexander.suh@ebc.uu.se
                Journal
                BMC Biol
                BMC Biology
                BioMed Central (London)
                1741-7007
                21 December 2020
                2020; 18: 199
                Affiliations
                [1] GRID grid.8993.b, ISNI 0000 0004 1936 9457, Department of Ecology and Genetics – Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
                [2] GRID grid.8993.b, ISNI 0000 0004 1936 9457, Department of Organismal Biology – Systematic Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
                [3] GRID grid.437963.c, ISNI 0000 0001 1349 5098, Evolutionary Biology Unit, South Australian Museum, Adelaide, SA 5000, Australia
                [4] GRID grid.1010.0, ISNI 0000 0004 1936 7304, School of Biological Sciences and Australian Centre for Evolutionary Biology and Biodiversity, The University of Adelaide, Adelaide, SA 5005, Australia
                [5] Embark Veterinary, Inc., Boston, MA, USA
                [6] GRID grid.8273.e, ISNI 0000 0001 1092 7967, School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, NR4 7TU, UK
                Author information
                http://orcid.org/0000-0002-1472-9949
                Article
                925
                10.1186/s12915-020-00925-x
                7754599
                33349252
                © The Author(s) 2020

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 22 August 2020
                Accepted: 10 November 2020
                Funding
                Funded by: Swedish Research Council Vetenskapsrådet
                Award ID: 2014-6325
                Funded by: Marie Sklodowska Curie Actions, Co-fund Project INCA
                Award ID: 600398
                Funded by: Swedish Research Council Formas
                Award ID: 2017-01597
                Funded by: Sven och Lilly Lawskis fund
                Award ID: N2018-0045
                Categories
                Research Article

                Life sciences
