
      Chromosomal-Level Assembly of the Asian Seabass Genome Using Long Sequence Reads and Multi-layered Scaffolding

      PLoS Genetics
      Public Library of Science

          Abstract

          We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species’ native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
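
          The contig and scaffold N50 values quoted above are a standard contiguity statistic: the length L such that sequences of length L or longer together make up at least half of the total assembly length. The following is a minimal sketch of that calculation in Python. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the article.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical, not taken from the seabass assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

          With these toy values the total is 300, and the two longest contigs (80 + 70) already reach half of it, so the N50 is 70. Applied to real assemblies, the same calculation is run over contig lengths and scaffold lengths separately.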

          Author Summary

          We describe the genome assembly of Asian seabass (Lates calcarifer), a marine teleost with aquaculture relevance. Though >500 eukaryotic genome sequences are available in public repositories, the majority are highly fragmented with incomplete assemblies, which explains why considerable effort and resources are often spent to improve their quality after publication. In our study, we employed long read sequencing combined with genetic and optical mapping, and syntenic information to produce a chromosomal level assembly. The largely continuous genome assembly will be useful for comparative genomics and offers an opportunity to look into regions less explored such as tandem repeats (the core component of centromeres and telomeres). In addition, the population structure of the species was analysed based on low-coverage genome sequence information from 61 individuals representing diverse geographic locations stretching from North-Western India across South-East Asia and Australia to Papua New Guinea.


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genetics (PLoS Genet)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-7390 (print), 1553-7404 (electronic)
                Published: 15 April 2016
                Volume 12, Issue 4, e1005954
                Affiliations
                [1] Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
                [2] Max Planck Institute for Molecular Genetics, Berlin, Germany
                [3] Laboratory of Chromosome Structure and Function, Department of Cytology and Histology, Biological Faculty, Saint Petersburg State University, St. Petersburg, Russia
                [4] Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, St. Petersburg, Russia
                [5] South African MRC Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
                [6] Pacific Biosciences, Menlo Park, California, United States of America
                [7] Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
                [8] Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
                [9] Genomics Core Facility, Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
                [10] Norwich Medical School, University of East Anglia, Norwich Research Park, Norwich, United Kingdom
                [11] The Genome Analysis Centre, Norwich, United Kingdom
                [12] Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, New York, United States of America
                [13] Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
                [14] Nutrition, Genetics & Biotechnology Division, ICAR-Central Institute of Brackishwater Aquaculture, Tamil Nadu, India
                [15] College of Marine and Environmental Sciences and Center for Sustainable Tropical Fisheries and Aquaculture, James Cook University, Townsville, Queensland, Australia
                [16] CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, India
                [17] Oceanographic Center, Nova Southeastern University Ft. Lauderdale, Ft. Lauderdale, Florida, United States of America
                [18] School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, United Kingdom
                [19] The Centre for Applied Genomics, The Hospital for Sick Children, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada
                [20] Department of Animal Sciences and Animal Husbandry, Georgikon Faculty, University of Pannonia, Keszthely, Hungary
                [21] Centre for Comparative Genomics, Murdoch University, Murdoch, Australia
                MicroTrek Incorporated, UNITED STATES
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: SV SL AC LO. Performed the experiments: SV HK ISK AK AAY PVH SSin NMT SRSP KP JMS JJ SKM MJ AHYT DL LSH JPD MB RH CSC VT MK AT DG SMo TG FJS GWV GG VKK THN VS SSiv DRJ. Analyzed the data: SV HK ISK AK AAY PVH SSin NMT PSRS SKM MJ SMw SYN WCL XS SMo TG FJS GWV GG VKK MCS TD AC LO. Wrote the paper: SV HK ISK SSin JK DRJ MCS SL AC LO. Advised and/or coordinated the study: SV SSin RL JK MCS TD SWT SJO SL AC LO.

                Article
                Manuscript number: PGENETICS-D-15-03078
                DOI: 10.1371/journal.pgen.1005954
                PMC ID: 4833346
                PubMed ID: 27082250

                This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

                History
                Received: 19 December 2015
                Accepted: 3 March 2016
                Page count
                Figures: 6, Tables: 2, Pages: 35
                Funding
                This work was supported by the National Research Foundation, Prime Minister’s Office, Singapore under its Competitive Research Program [NRF-CRP7-2010-01]; South African Research Chairs Initiative of the Department of Science and Technology and National Research Foundation of South Africa; interdisciplinary grant of the Siberian Branch of the Russian Academy of Sciences [SB RAS no. 137]; the Russian Ministry of Science [Mega-grant no.11.G34.31.0068 to SJO, AK and AAY]; St Petersburg State University [Research grant IAS 1.37.153.2014]; Russian Foundation for Basic Research [RFBR no. 14-14-00275 to VT]; National Science Foundation awards [DBI-1350041 and IOS-1237880 to TG, GWV, FJS and MCS]; National Institute of Health award [R01-HG006677 to TG, GWV, FJS and MCS]; and the Watson School of Biological Sciences at Cold Spring Harbor Laboratory through a training grant [5T32GM065094 to TG, GWV, FJS and MCS]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Computational Biology > Genome Analysis > Genomic Libraries
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Genomic Libraries
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis > Sequence Alignment
                Research and Analysis Methods > Molecular Biology Techniques > Sequencing Techniques > Sequence Analysis > Sequence Alignment
                Biology and Life Sciences > Computational Biology > Genome Analysis > Sequence Assembly Tools
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Sequence Assembly Tools
                Biology and Life Sciences > Cell Biology > Chromosome Biology > Chromosomes
                Biology and Life Sciences > Computational Biology > Genome Analysis
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis
                Biology and Life Sciences > Computational Biology > Comparative Genomics
                Biology and Life Sciences > Genetics > Genomics > Comparative Genomics
                Biology and Life Sciences > Molecular Biology > Molecular Biology Techniques > DNA Construction > DNA Library Construction > Genomic Library Construction
                Research and Analysis Methods > Molecular Biology Techniques > DNA Construction > DNA Library Construction > Genomic Library Construction
                Biology and Life Sciences > Genetics > Genomics > Animal Genomics > Fish Genomics
                Custom metadata
                The scaffolded genome assembly (v2) has been submitted to DDBJ/EMBL/NCBI GenBank under the accession LLXD00000000. It is also available for download at http://seabass.sanbi.ac.za/, together with the annotations (for the v2 assembly). The chromosome-level genome assembly (v3) is available at the same website. The Illumina and PacBio reads utilized for the genome assembly, as well as the whole-genome resequencing reads, have been submitted to NCBI SRA under BioProject accession numbers SRP069219 and SRP069848, respectively. The BAC end sequences have been submitted to NCBI dbGSS under the accession numbers KS320706 - KS326261 for the BamHI library and KS326262 - KS331896 for the EcoRI library.

                Genetics