154
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Structure and dynamics of the pan-genome of Streptococcus pneumoniae and closely related species

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Background

          Streptococcus pneumoniae is one of the most important causes of microbial diseases in humans. The genomes of 44 diverse strains of S. pneumoniae were analyzed and compared with strains of non-pathogenic streptococci of the Mitis group.

          Results

          Despite evidence of extensive recombination, the S. pneumoniae phylogenetic tree revealed six major lineages. With the exception of serotype 1, the tree correlated poorly with capsular serotype, geographical site of isolation and disease outcome. The distribution of dispensable genes - genes present in more than one strain but not in all strains - was consistent with phylogeny, although horizontal gene transfer events attenuated this correlation in the case of ancient lineages. Homologous recombination, involving short stretches of DNA, was the dominant evolutionary process of the core genome of S. pneumoniae. Genetic exchange occurred both within and across the borders of the species, and S. mitis was the main reservoir of genetic diversity of S. pneumoniae. The pan-genome size of S. pneumoniae increased logarithmically with the number of strains and linearly with the number of polymorphic sites of the sampled genomes, suggesting that acquired genes accumulate proportionately to the age of clones. Most genes associated with pathogenicity were shared by all S. pneumoniae strains, but were also present in S. mitis, S. oralis and S. infantis, indicating that these genes are not sufficient to determine virulence.

          Conclusions

          Genetic exchange with related species sharing the same ecological niche is the main mechanism of evolution of S. pneumoniae. The open pan-genome guarantees the species a quick and economical response to diverse environments.

          Related collections

          Most cited references43

          • Record: found
          • Abstract: found
          • Article: not found

          Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome".

          The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for approximately 80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            Improved tools for biological sequence comparison.

            We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data.

              The introduction of multilocus sequence typing (MLST) for the precise characterization of isolates of bacterial pathogens has had a marked impact on both routine epidemiological surveillance and microbial population biology. In both fields, a key prerequisite for exploiting this resource is the ability to discern the relatedness and patterns of evolutionary descent among isolates with similar genotypes. Traditional clustering techniques, such as dendrograms, provide a very poor representation of recent evolutionary events, as they attempt to reconstruct relationships in the absence of a realistic model of the way in which bacterial clones emerge and diversify to form clonal complexes. An increasingly popular approach, called BURST, has been used as an alternative, but present implementations are unable to cope with very large data sets and offer crude graphical outputs. Here we present a new implementation of this algorithm, eBURST, which divides an MLST data set of any size into groups of related isolates and clonal complexes, predicts the founding (ancestral) genotype of each clonal complex, and computes the bootstrap support for the assignment. The most parsimonious patterns of descent of all isolates in each clonal complex from the predicted founder(s) are then displayed. The advantages of eBURST for exploring patterns of evolutionary descent are demonstrated with a number of examples, including the simple Spain(23F)-1 clonal complex of Streptococcus pneumoniae, "population snapshots" of the entire S. pneumoniae and Staphylococcus aureus MLST databases, and the more complicated clonal complexes observed for Campylobacter jejuni and Neisseria meningitidis.
                Bookmark

                Author and article information

                Journal
                Genome Biol
                Genome Biology
                BioMed Central
                1465-6906
                1465-6914
                2010
                29 October 2010
                : 11
                : 10
                : R107
                Affiliations
                [1 ]Novartis Vaccines and Diagnostics, Via Fiorentina 1, 53100 Siena, Italy
                [2 ]Allegheny General Hospital, Allegheny-Singer Research Institute, Center for Genomic Sciences, Pittsburgh, Pennsylvania 152123, USA
                [3 ]Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, 801 West Baltimore Street, MD 21201, USA
                [4 ]The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
                [5 ]Laboratorio di Microbiologia Molecolare e Biotecnologia, Dipartimento di Biologia Molecolare, Universita' di Siena, Policlinico Le Scotte, 53100 Siena, Italy
                [6 ]Division of Infection and Immunity, Glasgow Biomedical Research Centre, University of Glasgow, 120 University Place, Glasgow G12 8TA, UK
                [7 ]Institute of Medical Microbiology and Immunology, Aarhus University, DK-8000 Aarhus, Denmark
                [8 ]University of Oxford Department of Paediatrics, Medical Sciences Division, John Radcliffe Hospital, Headington OX3 9DU, UK
                Article
                gb-2010-11-10-r107
                10.1186/gb-2010-11-10-r107
                3218663
                21034474
                8cd7e3a0-489b-46ef-8814-477e51662ac4
                Copyright ©2010 Donati et al.; licensee BioMed Central Ltd.

                This is an open access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited

                History
                : 7 June 2010
                : 19 October 2010
                : 29 October 2010
                Categories
                Research

                Genetics
                Genetics

                Comments

                Comment on this article