Similar Ratios of Introns to Intergenic Sequence across Animal Genomes


      Abstract

      One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with a likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes, arguing against differences in genome usage between animal groups and suggesting that the transcribed portion is more dependent on genome size than previously thought.
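      To make the central quantity concrete, the following is a minimal sketch of how an intron-to-intergenic ratio could be computed from a gene annotation. It is not the authors' pipeline. The function names (`merge`, `covered`, `composition`), the half-open coordinate convention, and the toy genome are all assumptions made for illustration. Overlapping genes are merged, and a base that is exonic in any gene is counted as exonic.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end), an assumed convention


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered(intervals: List[Interval]) -> int:
    """Number of distinct bases covered by the intervals."""
    return sum(end - start for start, end in merge(intervals))


def composition(genome_size: int, genes: List[List[Interval]]) -> Dict[str, Optional[float]]:
    """Split a genome into exonic, intronic and intergenic bases.

    `genes` is a hypothetical input format: one list of exon intervals per gene.
    """
    exons = [exon for gene in genes for exon in gene]
    # A gene's span runs from its first exon start to its last exon end.
    spans = [(min(s for s, _ in g), max(e for _, e in g)) for g in genes if g]
    genic = covered(spans)
    exonic = covered(exons)
    intronic = genic - exonic          # genic bases not covered by any exon
    intergenic = genome_size - genic   # bases outside every gene span
    ratio = intronic / intergenic if intergenic else None
    return {
        "exonic": exonic,
        "intronic": intronic,
        "intergenic": intergenic,
        "intron_to_intergenic": ratio,
    }


# Toy genome of 1000 bases with two annotated genes (made-up coordinates).
toy_genes = [
    [(100, 200), (300, 400)],
    [(600, 650), (700, 800), (850, 900)],
]
print(composition(1000, toy_genes))
```

      In this toy case the two genes span 600 bases, 400 of which are exonic, leaving 200 intronic and 400 intergenic bases, a ratio of 0.5. Removing one gene from the annotation would move its introns and exons into the intergenic count, which is the direction of deviation the abstract attributes to underannotated genomes.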


            Author and article information

            Affiliations
            [1] Department of Earth and Environmental Sciences, Paleontology and Geobiology, Ludwig-Maximilians-Universität München, Munich, Germany
            [2] GeoBio-Center, Ludwig-Maximilians-Universität München, Munich, Germany
            [3] Bavarian State Collection for Paleontology and Geology, Munich, Germany
            Author notes
            Associate editor
            [*] Corresponding author: E-mail: woerheide@lmu.de.
            Journal
            Genome Biology and Evolution (Genome Biol Evol)
            Publisher: Oxford University Press
            ISSN: 1759-6653
            Published: 13 June 2017 (June 2017 issue)
            Volume 9, issue 6, pages 1582-1598
            PMID: 28633296
            PMCID: PMC5534336
            DOI: 10.1093/gbe/evx103
            Publisher ID: evx103
            © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

            This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

            Pages: 17
            Category: Research Article
            Subject: Genetics
            Keywords: c-value, complexity, junk DNA, comparative genomics, Metazoa
