Microbial species delineation using whole genome sequences


      Abstract

      Increased sequencing of microbial genomes has revealed that prevailing prokaryotic species assignments can be inconsistent with whole genome information for a significant number of species. The long-standing need for a systematic and scalable species assignment technique can be met by the genome-wide Average Nucleotide Identity (gANI) metric, which is widely acknowledged as a robust measure of genomic relatedness. In this work, we demonstrate that the combination of gANI and the alignment fraction (AF) between two genomes accurately reflects their genomic relatedness. We introduce an efficient implementation of (AF, gANI) and discuss its successful application to 86.5M genome pairs between 13,151 prokaryotic genomes assigned to 3,032 species. Subsequently, by comparing the genome clusters obtained from complete linkage clustering of these pairs to existing taxonomy, we observed that nearly 18% of all prokaryotic species suffer from anomalies in species definition. Our results can be used to explore central questions such as whether microorganisms form a continuum of genetic diversity or distinct species represented by distinct genetic signatures. We propose that this precise and objective (AF, gANI)-based species definition, the MiSI (Microbial Species Identifier) method, be used to address previous inconsistencies in species classification and as the primary guide for new taxonomic species assignment, supplemented by the traditional polyphasic approach as required.
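
The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes that pairwise AF and gANI values have already been computed, and it treats two genomes as related only when both values meet a cutoff. The function names, the toy genome labels, and the cutoff values (`min_af=0.6`, `min_gani=96.5`) are illustrative assumptions and are not taken from the abstract. The sketch also checks the pair count: 13,151 genomes give 13,151 × 13,150 / 2 unordered pairs, which matches the roughly 86.5M pairs reported.

```python
from itertools import combinations
from math import inf


def pair_count(n):
    """Number of unordered pairs among n genomes."""
    return n * (n - 1) // 2


def complete_linkage(genomes, pairs, min_af, min_gani):
    """Greedy complete-linkage clustering (illustrative sketch only).

    pairs maps frozenset({a, b}) -> (af, gani). A missing pair, or a pair
    whose AF is below min_af, is treated as infinitely distant.
    """
    def distance(a, b):
        af, gani = pairs.get(frozenset((a, b)), (0.0, 0.0))
        return inf if af < min_af else 100.0 - gani

    cutoff = 100.0 - min_gani
    clusters = [[g] for g in genomes]
    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Complete linkage: the distance between two clusters is the
            # largest distance between any pair of their members.
            d = max(distance(a, b) for a in clusters[i] for b in clusters[j])
            if d <= cutoff and (best is None or d < best[0]):
                best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j


# Toy data (hypothetical values, not from the paper).
toy_pairs = {
    frozenset(("A", "B")): (0.90, 98.5),
    frozenset(("A", "C")): (0.80, 97.0),
    frozenset(("B", "C")): (0.85, 95.0),  # gANI below the cutoff
    frozenset(("C", "D")): (0.30, 99.0),  # AF below the cutoff
}
clusters = complete_linkage(["A", "B", "C", "D"], toy_pairs,
                            min_af=0.6, min_gani=96.5)
total_pairs = pair_count(13151)
```

In the toy data, C is close to A but not to B, so complete linkage keeps C out of the A/B cluster; a single-linkage rule would have merged all three. The quadratic search used here is only meant to show the rule and would not scale to tens of millions of pairs.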

      Most cited references 32

      Basic local alignment search tool.

      A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score. Recent mathematical results on the stochastic properties of MSP scores allow an analysis of the performance of this method as well as the statistical significance of alignments it generates. The basic algorithm is simple and robust; it can be implemented in a number of ways and applied in a variety of contexts including straightforward DNA and protein sequence database searches, motif searches, gene identification searches, and in the analysis of multiple regions of similarity in long DNA sequences. In addition to its flexibility and tractability to mathematical analysis, BLAST is an order of magnitude faster than existing sequence comparison tools of comparable sensitivity.

        Shifting the genomic gold standard for the prokaryotic species definition.

        DNA-DNA hybridization (DDH) has been used for nearly 50 years as the gold standard for prokaryotic species circumscriptions at the genomic level. It has been the only taxonomic method that offered a numerical and relatively stable species boundary, and its use has had a paramount influence on how the current classification has been constructed. However, now, in the era of genomics, DDH appears to be an outdated method for classification that needs to be substituted. The average nucleotide identity (ANI) between two genomes seems the most promising method since it mirrors DDH closely. Here we examine the work package JSpecies as a user-friendly, biologist-oriented interface to calculate ANI and the correlation of the tetranucleotide signatures between pairwise genomic comparisons. The results agreed with the use of ANI to substitute DDH, with a narrowed boundary that could be set at approximately 95-96%. In addition, the JSpecies package implemented the tetranucleotide signature correlation index, an alignment-free parameter that generally correlates with ANI and that can be of help in deciding when a given pair of organisms should be classified in the same species. Moreover, for taxonomic purposes, the analyses can be produced by simply randomly sequencing at least 20% of the genome of the query strains rather than obtaining their full sequence.
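
The idea of an alignment-free tetranucleotide comparison can be sketched as follows. This is a simplified illustration that correlates raw tetranucleotide frequencies counted on both strands; it is an assumption made for the example and not necessarily the index that JSpecies computes. All function names and the example sequence are hypothetical.

```python
from itertools import product
from math import sqrt

TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def tetra_frequencies(seq):
    """Relative frequency of each of the 256 tetranucleotides, both strands."""
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - 3):
            kmer = strand[i:i + 4]
            if kmer in counts:  # skips windows containing N or other symbols
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no valid tetranucleotides")
    return [counts[k] / total for k in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / sqrt(vx * vy)


example = "ATGCGTACGTTAGCCGATAGCTAGGCTAACGTTGCA"
freqs = tetra_frequencies(example)
self_correlation = pearson(freqs, freqs)
```

A sequence compared with itself gives a correlation of 1 up to rounding; real genomes would be compared pairwise in the same way, with values closer to 1 suggesting more similar signatures.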

          A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea.

          Sequencing of bacterial and archaeal genomes has revolutionized our understanding of the many roles played by microorganisms. There are now nearly 1,000 completed bacterial and archaeal genomes available, most of which were chosen for sequencing on the basis of their physiology. As a result, the perspective provided by the currently available genomes is limited by a highly biased phylogenetic distribution. To explore the value added by choosing microbial genomes for sequencing on the basis of their evolutionary relationships, we have sequenced and analysed the genomes of 56 culturable species of Bacteria and Archaea selected to maximize phylogenetic coverage. Analysis of these genomes demonstrated pronounced benefits (compared to an equivalent set of genomes randomly selected from the existing database) in diverse areas including the reconstruction of phylogenetic history, the discovery of new protein families and biological properties, and the prediction of functions for known genes from other organisms. Our results strongly support the need for systematic 'phylogenomic' efforts to compile a phylogeny-driven 'Genomic Encyclopedia of Bacteria and Archaea' in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come.

            Author and article information

            Affiliations
            [1] Microbial and Metagenome Superprogram, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
            [2] Department of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0355, USA
            [3] Celgene Corp., San Francisco, CA 94158, USA
            Author notes
            [*] To whom correspondence should be addressed. Tel: +1 925 296 5696; Fax: +1 925 296 5666; Email: njvarghese@lbl.gov
            Correspondence may also be addressed to Amrita Pati. Tel: +1 925 927 2580; Fax: +1 925 296 5666; Email: apati@lbl.gov
            Correspondence may also be addressed to Nikos C. Kyrpides. Tel: +1 925 296 5718; Fax: +1 925 296 5666; Email: nckyrpides@lbl.gov
            Journal
            Title: Nucleic Acids Research (Nucleic Acids Res)
            Publisher: Oxford University Press
            ISSN: 0305-1048 (print), 1362-4962 (electronic)
            Dates: 18 August 2015; 06 July 2015
            Volume: 43
            Issue: 14
            Pages: 6761-6771
            PMID: 26150420
            PMCID: PMC4538840
            DOI: 10.1093/nar/gkv657
            © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

            This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

            Counts
            Pages: 11
            Category: Computational Biology

            Genetics
