      Organised Genome Dynamics in the Escherichia coli Species Results in Highly Diverse Adaptive Paths

      PLoS Genetics
      Public Library of Science


          Abstract

          The Escherichia coli species represents one of the best-studied model organisms, but also encompasses a variety of commensal and pathogenic strains that diversify by high rates of genetic change. We uniformly (re-)annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the species most closely related to E. coli), including seven that we sequenced to completion. Within the ∼18,000 families of orthologous genes, we found ∼2,000 common to all strains. Although recombination rates are much higher than mutation rates, we show, both theoretically and using phylogenetic inference, that this does not obscure the phylogenetic signal, which places the B2 phylogenetic group and one group D strain at the basal position. Based on this phylogeny, we inferred past evolutionary events of gain and loss of genes, identifying functional classes under opposite selection pressures. We found an important adaptive role for metabolism diversification within group B2 and Shigella strains, but identified few or no extraintestinal virulence-specific genes, which could make it difficult to develop a vaccine against extraintestinal infections. Genome flux in E. coli is confined to a small number of conserved positions in the chromosome, which most often are not associated with integrases or tRNA genes. Core genes flanking some of these regions show higher rates of recombination, suggesting that a gene, once acquired by a strain, spreads within the species by homologous recombination at the flanking genes. Finally, the genome's long-scale structure of recombination indicates lower recombination rates, but not higher mutation rates, at the terminus of replication. The ensuing effect of background selection and biased gene conversion may thus explain why this region is A+T-rich and shows high sequence divergence but low sequence polymorphism. Overall, despite a very high gene flow, genes co-exist in an organised genome.

          Author Summary

          Although abundant knowledge has been accumulated regarding the E. coli laboratory strain K-12, little is known about the evolutionary trajectories that have driven the high diversity observed among natural isolates of the species, which encompass both commensal and highly virulent intestinal and extraintestinal pathogenic strains. We have annotated or re-annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the species most closely related to E. coli), including seven that we sequenced to completion. Although recombination rates are much higher than mutation rates, we were able to reconstruct a robust phylogeny based on the ∼2,000 genes common to all strains. Based on this phylogeny, we established the evolutionary scenario of gains and losses of thousands of specific genes, identifying functional classes under opposite selection pressures. This genome flux is confined to very few positions in the chromosome, which are the same for every genome. Notably, we identified few or no extraintestinal virulence-specific genes. We also defined a long-scale structure of recombination in the genome with lower recombination rates at the terminus of replication. These findings demonstrate that, despite a very high gene flow, genes can co-exist in an organised genome.
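
The distinction between the ∼18,000 orthologous gene families found across all genomes and the ∼2,000 families common to every strain can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `core_and_pan_genome`, the strain labels, and the family identifiers are all hypothetical, and the sketch assumes that orthologous families have already been assigned to each strain.

```python
def core_and_pan_genome(families_by_strain):
    """Return (core, pan) sets of gene-family IDs.

    families_by_strain maps a strain name to the collection of
    orthologous gene-family IDs present in that strain (hypothetical
    input format, assumed for illustration).
    """
    family_sets = [set(families) for families in families_by_strain.values()]
    if not family_sets:
        return set(), set()
    core = set.intersection(*family_sets)  # families present in every strain
    pan = set.union(*family_sets)          # families present in at least one strain
    return core, pan


# Toy data: three made-up strains, not real E. coli genomes.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2", "fam5", "fam6"},
}

core, pan = core_and_pan_genome(toy)
print(sorted(core))  # ['fam1', 'fam2']
print(len(pan))      # 6
```

In this toy case the core holds two families and the full repertoire holds six, mirroring on a small scale how a set of genes shared by all strains can be much smaller than the total set of families observed.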

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genetics (PLoS Genet)
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390 (print), 1553-7404 (electronic)
                Published: 23 January 2009
                Volume 5, Issue 1, e1000344
                Affiliations
                [1] Atelier de BioInformatique, Université Pierre et Marie Curie - Paris 6 (UPMC), Paris, France
                [2] Microbial Evolutionary Genomics, Institut Pasteur, CNRS URA2171, Paris, France
                [3] Faculté de Médecine, Université Paris 7 Denis Diderot, INSERM U722, Site Xavier Bichat, Paris, France
                [4] Génoscope, Institut de Génomique, CEA, Evry, France
                [5] Faculté de Médecine, Université Paris 5 René Descartes, INSERM U571, Paris, France
                [6] Université Paris 7 Denis Diderot, Hôpital Robert Debré (APHP), EA 3105, Paris, France
                [7] Plate-Forme Génomique, Institut Pasteur, Paris, France
                [8] Laboratoire de Génomique Comparative, CNRS UMR8030, Institut de Génomique, CEA, Génoscope, Evry, France
                [9] UR1077 Mathématique, Informatique, et Génome, INRA, Jouy en Josas, France
                [10] Unité de Génétique des Génomes Bactériens, Institut Pasteur, CNRS URA2171, Paris, France
                [11] UR888 Unité des Bactéries Lactiques et Pathogènes Opportunistes, INRA, Jouy en Josas, France
                [12] Faculté de Médecine, Université Paris 5 René Descartes, INSERM U570, Paris, France
                [13] Unité de Génétique des Biofilms, Institut Pasteur, CNRS URA2172, Paris, France
                [14] Veterans Affairs Medical Center, Minneapolis, Minnesota, United States of America
                [15] Department of Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
                [16] Pathogénie Bactérienne des Muqueuses, Institut Pasteur, Paris, France
                [17] Université Grenoble 1 Joseph Fourier, CNRS UMR 5163, Grenoble, France
                Editor affiliation: Universidad de Sevilla, Spain
                Author notes

                Conceived and designed the experiments: OT VB EPCR ED. Performed the experiments: VB CB OC CD LG SM SO BV. Analyzed the data: MT CH OT VB SB PB EB SB OB AC HC SC AD MD MEK EF JMG AMG JJ CLB ML VMJ IM XN MAP CP ZR CSR DS JT DV CM EPCR ED. Contributed reagents/materials/analysis tools: MT CH OT VB CM EPCR. Wrote the paper: MT CH OT JJ CM EPCR ED.

                Article
                Manuscript ID: 08-PLGE-RA-1131R2
                DOI: 10.1371/journal.pgen.1000344
                PMCID: PMC2617782
                PMID: 19165319
                Touchon et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 2 September 2008
                Accepted: 16 December 2008
                Page count
                Pages: 25
                Categories
                Research Article
                Evolutionary Biology/Evolutionary and Comparative Genetics
                Evolutionary Biology/Microbial Evolution and Genomics
                Microbiology/Medical Microbiology

                Genetics
