      Organised Genome Dynamics in the Escherichia coli Species Results in Highly Diverse Adaptive Paths


      PLoS Genetics

      Public Library of Science


          Abstract

          The Escherichia coli species represents one of the best-studied model organisms, but also encompasses a variety of commensal and pathogenic strains that diversify by high rates of genetic change. We uniformly (re-)annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the species most closely related to E. coli), including seven that we sequenced to completion. Within the ∼18,000 families of orthologous genes, we found ∼2,000 common to all strains. Although recombination rates are much higher than mutation rates, we show, both theoretically and using phylogenetic inference, that this does not obscure the phylogenetic signal, which places the B2 phylogenetic group and one group D strain at the basal position. Based on this phylogeny, we inferred past evolutionary events of gain and loss of genes, identifying functional classes under opposite selection pressures. We found an important adaptive role for metabolism diversification within group B2 and Shigella strains, but identified few or no extraintestinal virulence-specific genes, which could make it difficult to develop a vaccine against extraintestinal infections. Genome flux in E. coli is confined to a small number of conserved positions in the chromosome, which most often are not associated with integrases or tRNA genes. Core genes flanking some of these regions show higher rates of recombination, suggesting that a gene, once acquired by a strain, spreads within the species by homologous recombination at the flanking genes. Finally, the genome's long-scale structure of recombination indicates lower recombination rates, but not higher mutation rates, at the terminus of replication. The ensuing effect of background selection and biased gene conversion may thus explain why this region is A+T-rich and shows high sequence divergence but low sequence polymorphism. Overall, despite a very high gene flow, genes co-exist in an organised genome.
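The contrast between the ∼18,000 gene families found in total and the ∼2,000 shared by every strain can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: it assumes each strain has already been reduced to a set of orthologous-family identifiers, and the function name `pan_and_core`, the strain names, and the family labels are all hypothetical.

```python
from typing import Dict, Set, Tuple


def pan_and_core(families_by_strain: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan_genome, core_genome) as sets of family identifiers.

    pan_genome  : families present in at least one strain (set union)
    core_genome : families present in every strain (set intersection)
    """
    strain_sets = list(families_by_strain.values())
    if not strain_sets:
        return set(), set()
    pan = set().union(*strain_sets)
    core = set.intersection(*strain_sets)
    return pan, core


# Hypothetical toy input: three strains, five gene families in total.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2", "fam5"},
}

pan, core = pan_and_core(toy)
print(len(pan), len(core))  # prints: 5 2
```

In this toy case two of five families are shared by all strains; in the abstract the corresponding figures are roughly 2,000 of 18,000.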

          Author Summary

          Although abundant knowledge has been accumulated regarding the E. coli laboratory strain K-12, little is known about the evolutionary trajectories that have driven the high diversity observed among natural isolates of the species, which encompass both commensal and highly virulent intestinal and extraintestinal pathogenic strains. We have annotated or re-annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the species most closely related to E. coli), including seven that we sequenced to completion. Although recombination rates are much higher than mutation rates, we were able to reconstruct a robust phylogeny based on the ∼2,000 genes common to all strains. Based on this phylogeny, we established the evolutionary scenario of gains and losses of thousands of specific genes, identifying functional classes under opposite selection pressures. This genome flux is confined to very few positions in the chromosome, which are the same for every genome. Notably, we identified few or no extraintestinal virulence-specific genes. We also defined a long-scale structure of recombination in the genome with lower recombination rates at the terminus of replication. These findings demonstrate that, despite a very high gene flow, genes can co-exist in an organised genome.
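To make the idea of inferring gene gains and losses on a phylogeny concrete, here is a small sketch using Fitch parsimony on a presence/absence character. This is a stand-in technique chosen for illustration; the summary above does not say which inference method the authors used. The tree encoding (nested 2-tuples of leaf names), the function name `fitch_gain_loss`, and the tie-break that treats the gene as absent at the root when both states are equally parsimonious are all assumptions.

```python
def fitch_gain_loss(tree, present):
    """Count gene gains and losses under one most-parsimonious reconstruction.

    tree    : rooted binary tree as nested 2-tuples of leaf names (assumed encoding)
    present : set of leaf names whose genome carries the gene
    Returns (gains, losses) counted along branches.
    """

    def up(node):
        # Bottom-up pass: compute the candidate state set of every node.
        if isinstance(node, str):
            return ({1 if node in present else 0}, None, None)
        left, right = up(node[0]), up(node[1])
        common = left[0] & right[0]
        states = common if common else left[0] | right[0]
        return (states, left, right)

    def down(annotated, parent_state):
        # Top-down pass: keep the parent's state when it is allowed.
        # Otherwise the candidate set has a single element, except at the
        # root, where min() prefers absence (0) if both states are possible.
        states, left, right = annotated
        state = parent_state if parent_state in states else min(states)
        gains = int(parent_state == 0 and state == 1)
        losses = int(parent_state == 1 and state == 0)
        if left is not None:
            for child in (left, right):
                g, l = down(child, state)
                gains += g
                losses += l
        return gains, losses

    return down(up(tree), None)


# Hypothetical four-strain tree: ((A, B), C) with D as the outgroup.
toy_tree = ((("A", "B"), "C"), "D")

print(fitch_gain_loss(toy_tree, {"A", "B"}))       # prints: (1, 0)
print(fitch_gain_loss(toy_tree, {"A", "B", "D"}))  # prints: (0, 1)
```

In the first call the gene is reconstructed as gained once on the branch leading to A and B; in the second it is reconstructed as ancestral and lost once in C. Parsimony treats gains and losses as equally costly, which is a simplification.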

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genetics (PLoS Genet)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390, 1553-7404
                Published: 23 January 2009
                Volume 5, Issue 1
                Affiliations
                [1] Atelier de BioInformatique, Université Pierre et Marie Curie - Paris 6 (UPMC), Paris, France
                [2] Microbial Evolutionary Genomics, Institut Pasteur, CNRS URA2171, Paris, France
                [3] Faculté de Médecine, Université Paris 7 Denis Diderot, INSERM U722, Site Xavier Bichat, Paris, France
                [4] Génoscope, Institut de Génomique, CEA, Evry, France
                [5] Faculté de Médecine, Université Paris 5 René Descartes, INSERM U571, Paris, France
                [6] Université Paris 7 Denis Diderot, Hôpital Robert Debré (APHP), EA 3105, Paris, France
                [7] Plate-Forme Génomique, Institut Pasteur, Paris, France
                [8] Laboratoire de Génomique Comparative, CNRS UMR8030, Institut de Génomique, CEA, Génoscope, Evry, France
                [9] UR1077 Mathématique, Informatique, et Génome, INRA, Jouy en Josas, France
                [10] Unité de Génétique des Génomes Bactériens, Institut Pasteur, CNRS URA2171, Paris, France
                [11] UR888 Unité des Bactéries Lactiques et Pathogènes Opportunistes, INRA, Jouy en Josas, France
                [12] Faculté de Médecine, Université Paris 5 René Descartes, INSERM U570, Paris, France
                [13] Unité de Génétique des Biofilms, Institut Pasteur, CNRS URA2172, Paris, France
                [14] Veterans Affairs Medical Center, Minneapolis, Minnesota, United States of America
                [15] Department of Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
                [16] Pathogénie Bactérienne des Muqueuses, Institut Pasteur, Paris, France
                [17] Université Grenoble 1 Joseph Fourier, CNRS UMR 5163, Grenoble, France
                Universidad de Sevilla, Spain
                Author notes

                Conceived and designed the experiments: OT VB EPCR ED. Performed the experiments: VB CB OC CD LG SM SO BV. Analyzed the data: MT CH OT VB SB PB EB SB OB AC HC SC AD MD MEK EF JMG AMG JJ CLB ML VMJ IM XN MAP CP ZR CSR DS JT DV CM EPCR ED. Contributed reagents/materials/analysis tools: MT CH OT VB CM EPCR. Wrote the paper: MT CH OT JJ CM EPCR ED.

                Article
                Manuscript ID: 08-PLGE-RA-1131R2
                DOI: 10.1371/journal.pgen.1000344
                PMC ID: 2617782
                PubMed ID: 19165319
                Touchon et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                Counts
                Pages: 25
                Categories
                Research Article
                Evolutionary Biology/Evolutionary and Comparative Genetics
                Evolutionary Biology/Microbial Evolution and Genomics
                Microbiology/Medical Microbiology

                Genetics
