      Is Open Access

      Highly Continuous Genome Assembly of Eurasian Perch (Perca fluviatilis) Using Linked-Read Sequencing

      research-article


          Abstract

          The Eurasian perch (Perca fluviatilis) is the most common fish of the Percidae family and is widely distributed across Eurasia. Perch is a popular target for professional and recreational fisheries, and a promising freshwater aquaculture species in Europe. However, despite its high ecological, economic and societal importance, the available genomic resources for P. fluviatilis are rather limited. In this work, we report de novo assembly and annotation of the whole genome sequence of perch. The linked-read based technology with 10X Genomics Chromium chemistry and Supernova assembler produced a draft perch genome assembly of ∼1.0 Gbp (scaffold N50 = 6.3 Mb; longest individual scaffold of 29.3 Mb; BUSCO completeness of 88.0%), which included 281.6 Mb of putative repeated sequences. The perch genome assembly presented here, generated from a small amount of starting material (0.75 ng) and a single linked-read library, is highly continuous and considerably more complete than the currently available draft of the P. fluviatilis genome. A total of 23,397 protein-coding genes were predicted, 23,171 (99%) of which were annotated functionally from either sequence homology or protein signature searches. Linked-read technology enables fast, accurate and cost-effective de novo assembly of large non-model eukaryote genomes. The highly continuous assembly of the Eurasian perch genome presented in this study will be an invaluable resource for a range of genetic, ecological, physiological, ecotoxicological, functional and comparative genomic studies in perch and other fish species of the Percidae family.
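          The scaffold N50 quoted in the abstract is a contiguity statistic: the length of the scaffold at which the running total of scaffold lengths, sorted from longest to shortest, first reaches half of the total assembly size. The sketch below illustrates that common definition only; it is not the authors' pipeline or the Supernova implementation, and the function name `n50` and the toy lengths are hypothetical, chosen purely for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Uses the common definition: sort lengths in descending order and
    return the length at which the cumulative sum first reaches at
    least half of the total. This is an illustrative sketch, not the
    method used in the article.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths (not data from the article).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # total is 300; 80 + 70 = 150 reaches half, so this prints 70
```

          Under this definition, a reported N50 of 6.3 Mb means that scaffolds of at least 6.3 Mb together account for at least half of the assembled sequence.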


                Author and article information

                Journal
                G3 (Bethesda)
                G3: Genes, Genomes, Genetics
                Genetics Society of America
                ISSN: 2160-1836
                24 October 2018
                December 2018
                Volume: 8
                Issue: 12
                Pages: 3737-3743
                Affiliations
                [*] Department of Biology, University of Turku, 20014, Finland
                [§] Institute of Technology, Faculty of Science and Technology, University of Tartu, Tartu, 50411, Estonia
                [] Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Tartu, 51014, Estonia
                [] Department of Fisheries and Wildlife, Michigan State University, Michigan, 48824
                [**] Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, Drottningholm, 17893, Sweden
                Author notes
                [1] Corresponding author: Anti Vasemägi, Swedish University of Agricultural Sciences, Department of Aquatic Resources, Institute of Freshwater Research, Stångholmsvägen 2, Drottningholm 17893, Sweden. Tel: +46 10 478 4277; e-mail: anti.vasemagi@slu.se
                Author information
                http://orcid.org/0000-0002-1817-7707
                http://orcid.org/0000-0002-8994-4723
                http://orcid.org/0000-0003-0311-3003
                http://orcid.org/0000-0001-9653-8291
                http://orcid.org/0000-0001-7922-7096
                http://orcid.org/0000-0002-5535-1639
                http://orcid.org/0000-0002-2184-5534
                Article
                GGG_200768
                DOI: 10.1534/g3.118.200768
                PMCID: PMC6288837
                PMID: 30355765
                Copyright © 2018 Ozerov et al.

                This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 27 August 2018
                Accepted: 22 October 2018
                Page count
                Figures: 1, Tables: 3, Equations: 0, References: 62, Pages: 7
                Categories
                Genome Report

                Genetics
                Perca fluviatilis, whole genome sequencing, de novo assembly, 10X Genomics Chromium linked-read, fish
