      Is Open Access

      Standard operating procedure for calculating genome-to-genome distances based on high-scoring segment pairs

      research-article

          Abstract

          DNA-DNA hybridization (DDH) is a widely applied wet-lab technique for estimating the overall similarity between the genomes of two organisms. Microbiologists chose to base the species concept for prokaryotes ultimately on DDH as a pragmatic approach to deciding whether to recognize novel species, a choice that also allowed a relatively high degree of standardization compared to other areas of taxonomy. However, DDH is tedious and error-prone and, first and foremost, cannot be used to incrementally establish a comparative database. Recent studies have shown that in-silico methods for the comparison of genome sequences can be used to replace DDH. Considering the ongoing rapid technological progress of sequencing methods, genome-based prokaryote taxonomy is coming into reach. However, calculating distances between genomes depends on multiple choices of software and program settings. We here provide an overview of the modifications that can be applied to distance methods based on high-scoring segment pairs (HSPs) or maximally unique matches (MUMs) and that need to be documented. General recommendations on determining HSPs using BLAST or other algorithms are also provided. As a reference implementation, we introduce the GGDC web server (http://ggdc.gbdp.org).
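
To make the idea of an HSP-based distance concrete, the following is a minimal sketch, not the GGDC implementation. It assumes one simple, coverage-style distance: one minus the fraction of both genomes covered by HSPs. The abstract does not state a formula, so the function names, the half-open interval convention, and the decision to merge overlapping HSPs before counting are all illustrative assumptions. They are also examples of the kind of choices the abstract says need to be documented.

```python
from typing import List, Tuple

# Assumption: an HSP's footprint on one genome is a half-open interval [start, end).
Interval = Tuple[int, int]


def covered_length(intervals: List[Interval]) -> int:
    """Number of positions covered by the union of the intervals.

    Overlapping HSPs are merged so that no position is counted twice.
    """
    total = 0
    current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Starts a new, disjoint covered stretch.
            total += end - start
            current_end = end
        elif end > current_end:
            # Extends the current stretch; count only the new part.
            total += end - current_end
            current_end = end
    return total


def coverage_distance(hsps_on_x: List[Interval], hsps_on_y: List[Interval],
                      len_x: int, len_y: int) -> float:
    """Hypothetical coverage-style distance: 1 - covered / total genome length."""
    covered = covered_length(hsps_on_x) + covered_length(hsps_on_y)
    return 1.0 - covered / (len_x + len_y)


# Toy example: two genomes of 1000 bp each.
hsps_x = [(0, 400), (300, 600)]   # overlapping HSPs cover 600 bp of x
hsps_y = [(100, 500)]             # a single HSP covers 400 bp of y
print(coverage_distance(hsps_x, hsps_y, 1000, 1000))
```

Counting overlapping HSPs once rather than twice changes the result, which is why such settings have to be reported alongside the distance.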

          Most cited references (5)

          Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison

          The pragmatic species concept for Bacteria and Archaea is ultimately based on DNA-DNA hybridization (DDH). While enabling the taxonomist, in principle, to obtain an estimate of the overall similarity between the genomes of two strains, this technique is tedious and error-prone and cannot be used to incrementally build up a comparative database. Recent technological progress in the area of genome sequencing calls for bioinformatics methods to replace the wet-lab DDH by in-silico genome-to-genome comparison. Here we investigate state-of-the-art methods for inferring whole-genome distances in their ability to mimic DDH. Algorithms to efficiently determine high-scoring segment pairs or maximally unique matches perform well as a basis of inferring intergenomic distances. The examined distance functions, which are able to cope with heavily reduced genomes and repetitive sequence regions, outperform previously described ones regarding the correlation with and error ratios in emulating DDH. Simulation of incompletely sequenced genomes indicates that some distance formulas are very robust against missing fractions of genomic information. Digitally derived genome-to-genome distances show a better correlation with 16S rRNA gene sequence distances than DDH values. The future perspectives of genome-informed taxonomy are discussed, and the investigated methods are made available as a web service for genome-based species delineation.

            Human-mouse alignments with BLASTZ.

            The Mouse Genome Analysis Consortium aligned the human and mouse genome sequences for a variety of purposes, using alignment programs that suited the various needs. For investigating issues regarding genome evolution, a particularly sensitive method was needed to permit alignment of a large proportion of the neutrally evolving regions. We selected a program called BLASTZ, an independent implementation of the Gapped BLAST algorithm specifically designed for aligning two long genomic sequences. BLASTZ was subsequently modified, both to attain efficiency adequate for aligning entire mammalian genomes and to increase its sensitivity. This work describes BLASTZ, its modifications, the hardware environment on which we run it, and several empirical studies to validate its results.

              Whole-genome prokaryotic phylogeny.

              Current understanding of the phylogeny of prokaryotes is based on the comparison of the highly conserved small ssu-rRNA subunit and similar regions. Although such molecules have proved to be very useful phylogenetic markers, mutational saturation is a problem, due to their restricted lengths. Now, a growing number of complete prokaryotic genomes are available. This paper addresses the problem of determining a prokaryotic phylogeny utilizing the comparison of complete genomes. We introduce a new strategy, GBDP, 'genome blast distance phylogeny', and show that different variants of this approach robustly produce phylogenies that are biologically sound, when applied to 91 prokaryotic genomes. In this approach, first Blast is used to compare genomes, then a distance matrix is computed, and finally a tree- or network-reconstruction method such as UPGMA, Neighbor-Joining, BioNJ or Neighbor-Net is applied.
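
The last step of the pipeline described in this abstract, turning a distance matrix into a tree, can be sketched as follows. This is a minimal pure-Python UPGMA, given only as an illustration of one of the named reconstruction methods. It is not the GBDP code; the function name, the `frozenset`-keyed distance dictionary, and the toy distances are assumptions made for the example, and branch lengths are omitted.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA and return the topology as nested tuples.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    clusters = {name: 1 for name in names}  # cluster -> number of taxa in it
    d = dict(dist)                          # working copy, extended as clusters merge
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance to every remaining cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Toy intergenomic distances (invented for illustration).
toy = {
    frozenset(("A", "B")): 0.1,
    frozenset(("A", "C")): 0.5,
    frozenset(("B", "C")): 0.6,
}
print(upgma(["A", "B", "C"], toy))
```

In the toy example, A and B are joined first because they have the smallest distance, and C is attached afterwards.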

                Author and article information

                Journal
                Stand Genomic Sci
                SIGS
                Standards in Genomic Sciences
                Michigan State University
                1944-3277
                28 January 2010
                28 February 2010
                Volume: 2
                Issue: 1
                Pages: 142-148
                Affiliations
                [1] Center for Bioinformatics Tübingen, Eberhard-Karls-Universität, Tübingen, Germany
                [2] DSMZ – German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany
                Author notes
                [*] Corresponding author: Hans-Peter Klenk.
                Article
                Publisher ID: sigs.541628
                DOI: 10.4056/sigs.541628
                PMC ID: 3035261
                PubMed ID: 21304686
                70e1a997-6319-4d03-87b3-6731740fe9a2
                Copyright © 2010

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Categories
                Standard Operating Procedures

                Genetics
                genomics, phylogeny, mummer, microbial taxonomy, ggdc web server, gbdp, blast, species delineation
