
      BlobToolKit – Interactive Quality Assessment of Genome Assemblies

          Abstract

          Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the assembly process, the final assembled genome may be a mixture of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of underlying problems. We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility. We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Database Collaboration and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then maintain currency with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.
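
The idea of isolating non-target data described in the abstract can be illustrated with a minimal sketch. This is not BlobToolKit's API or file format: the `Contig` record, its `phylum` field, the `"no-hit"` label and the `partition_contigs` helper are all hypothetical names chosen for illustration. The sketch assumes each contig already carries a taxonomic assignment and simply separates contigs that match the target taxon from those that do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contig:
    """Hypothetical per-contig summary; field names are illustrative only."""
    name: str
    length: int
    phylum: str  # assumed taxonomic assignment, e.g. from a similarity search


def partition_contigs(contigs, target_phylum, keep_unassigned=True,
                      unassigned_label="no-hit"):
    """Split contigs into (target, non_target) lists by assigned phylum.

    Contigs with no assignment are kept with the target by default,
    since the absence of a hit is not evidence of contamination.
    """
    target, non_target = [], []
    for contig in contigs:
        is_target = contig.phylum == target_phylum
        is_unassigned = contig.phylum == unassigned_label
        if is_target or (keep_unassigned and is_unassigned):
            target.append(contig)
        else:
            non_target.append(contig)
    return target, non_target


# Made-up example data, not taken from any real assembly.
contigs = [
    Contig("contig_1", 250_000, "Nematoda"),
    Contig("contig_2", 12_000, "Proteobacteria"),
    Contig("contig_3", 8_500, "no-hit"),
]
kept, flagged = partition_contigs(contigs, target_phylum="Nematoda")
print([c.name for c in kept])
print([c.name for c in flagged])
```

In practice a decision like this would rest on several lines of evidence explored interactively rather than a single label, so the sketch should be read as a conceptual aid only.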

                Author and article information

                Journal
                G3: Genes, Genomes, Genetics
                G3 (Bethesda)
                Genetics Society of America
                ISSN: 2160-1836
                18 February 2020
                April 2020
                Volume: 10
                Issue: 4
                Pages: 1361-1374
                Affiliations
                Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
                Wellcome Sanger Institute, Cambridge CB10 1SA, UK
                European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, UK
                Author notes
                Corresponding author: Wellcome Sanger Institute, Cambridge CB10 1SA, UK. Email: rc28@sanger.ac.uk
                Author information
                http://orcid.org/0000-0002-3502-1122
                http://orcid.org/0000-0001-5975-6003
                http://orcid.org/0000-0002-3613-0013
                http://orcid.org/0000-0001-7954-7057
                http://orcid.org/0000-0003-2861-949X
                Article
                Publisher ID: GGG_400908
                DOI: 10.1534/g3.119.400908
                PMCID: PMC7144090
                PMID: 32071071
                Copyright © 2020 Challis et al.

                This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 14 November 2019
                Accepted: 15 February 2020
                Page count
                Figures: 5, Tables: 4, Equations: 0, References: 46, Pages: 14
                Categories
                Investigations

                Subject: Genetics
                Keywords: bioinformatics, visualisation web-tool, genome assembly, quality control
