      Insights from 20 years of bacterial genome sequencing

      review-article

          Abstract

          Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive, as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
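
          The pan- and core genome calculation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each genome has already been reduced to a set of gene-family identifiers (the clustering of genes into families is not shown), and it uses the common definitions: the pan-genome is the union of all families, the core genome is the set of families present in every genome. The function name `pan_and_core` and the toy strain and family names are hypothetical and are not taken from the article.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene-family sets.

    `genomes` maps a genome name to an iterable of gene-family identifiers.
    Assumption: families were assigned beforehand by some clustering step.
    """
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)          # families seen in at least one genome
    core = set.intersection(*family_sets)    # families seen in every genome
    return pan, core


# Hypothetical toy data: three strains, six gene families in total.
toy_genomes = {
    "strain_A": {"famA", "famB", "famC"},
    "strain_B": {"famA", "famB", "famD"},
    "strain_C": {"famA", "famB", "famE", "famF"},
}

pan, core = pan_and_core(toy_genomes)
print(len(pan), len(core))  # prints "6 2"
```

          A strict intersection is only one possible convention; analyses of thousands of draft genomes often relax "every genome" to a high fraction of genomes, which this sketch does not attempt.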

          Most cited references (136)

          ISfinder: the reference centre for bacterial insertion sequences

          ISfinder is a dedicated database for bacterial insertion sequences (ISs). It has superseded the Stanford reference center. One of its functions is to assign IS names and to provide a focal point for a coherent nomenclature. It is also the repository for ISs. Each new IS is indexed together with information such as its DNA sequence and open reading frames or potential coding sequences, the sequence of the ends of the element and target sites, and its origin and distribution, together with a bibliography where available. Another objective is to continuously monitor ISs to provide updated comprehensive groupings or families and to provide some insight into their phylogenies. The site also contains extensive background information on ISs and transposons in general. Online tools are gradually being added. At present, an online BLAST facility against the entire bank is available; additional features will include alignment capability, PSI-BLAST and HMM profiles. ISfinder also includes a section on bacterial genomes and is involved in annotating the IS content of these genomes. Finally, this database is currently recommended by several microbiology journals for registration of new IS elements before their publication.

            Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome".

            The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for approximately 80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes.

              Community structure and metabolism through reconstruction of microbial genomes from the environment.

              Microbial communities are vital in the functioning of all ecosystems; however, most microorganisms are uncultivated, and their roles in natural systems are unclear. Here, using random shotgun sequencing of DNA from a natural acidophilic biofilm, we report reconstruction of near-complete genomes of Leptospirillum group II and Ferroplasma type II, and partial recovery of three other genomes. This was possible because the biofilm was dominated by a small number of species populations and the frequency of genomic rearrangements and gene insertions or deletions was relatively low. Because each sequence read came from a different individual, we could determine that single-nucleotide polymorphisms are the predominant form of heterogeneity at the strain level. The Leptospirillum group II genome had remarkably few nucleotide polymorphisms, despite the existence of low-abundance variants. The Ferroplasma type II genome seems to be a composite from three ancestral strains that have undergone homologous recombination to form a large population of mosaic genomes. Analysis of the gene complement for each organism revealed the pathways for carbon and nitrogen fixation and energy generation, and provided insights into survival strategies in an extreme environment.

                Author and article information

                Contributors
                (865) 574-8201, usserydw@ornl.gov
                Journal
                Functional & Integrative Genomics (Funct Integr Genomics)
                Publisher: Springer Berlin Heidelberg (Berlin/Heidelberg)
                ISSN: 1438-793X, 1438-7948
                Published: 27 February 2015
                Volume 15, Issue 2, Pages 141-161
                Affiliations
                Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
                Joint Institute for Biological Sciences, University of Tennessee, Knoxville, TN 37996 USA
                Department of Microbiology, University of Tennessee, Knoxville, TN 37996 USA
                Computer Science and Mathematics Division, Computer Science Research Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
                Center for Biological Sequence Analysis, Department of Systems Biology, The Technical University of Denmark, Kgs. Lyngby, 2800 Denmark
                Molecular Microbiology and Genomics Consultants, Tannenstr 7, 55576 Zotzenheim, Germany
                Genome Science and Technology, University of Tennessee, Knoxville, TN 37996 USA
                Article
                Publisher ID: 433
                DOI: 10.1007/s10142-015-0433-4
                PMC ID: 4361730
                PubMed ID: 25722247
                ScienceOpen ID: cc1a97c0-9f9e-452d-a581-f972f8f7e9e6
                © The Author(s) 2015

                Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

                History
                Received: 19 January 2015
                Revised: 11 February 2015
                Accepted: 12 February 2015
                Categories
                Review
                Custom metadata
                © Springer-Verlag Berlin Heidelberg 2015

                Genetics
                bacteria, comparative genomics, bacterial genomes, metagenomics, core-genome, pan-genome, next-generation sequencing
