
      Comparison of 61 Sequenced Escherichia coli Genomes


      Microbial Ecology

      Springer-Verlag


          Abstract

          Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of this set of sequenced strains. A hierarchical clustering of variable genes allowed clear separation of the strains into clusters, including known pathotypes; clinically relevant serotypes can also be resolved in this way. In contrast, when in silico MLST was performed, many of the strains appeared jumbled and less well resolved. The predicted pan-genome comprises 15,741 gene families, and only 993 (6%) of the families are represented in every genome, comprising the core genome. The variable or ‘accessory’ genes thus make up more than 90% of the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggest a continuum rather than sharp species borders in this group of Enterobacteriaceae.
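
          The pan- and core genome definitions used in the abstract can be illustrated with a minimal Python sketch. It assumes each genome has already been reduced to a set of gene-family identifiers; the function name `pan_and_core`, the strain names, and the family IDs are hypothetical and are not taken from the article, which does not describe its pipeline at this level of detail. The last line only re-derives the reported core share from the two counts given in the abstract.

```python
def pan_and_core(genomes):
    """Return (pan, core) for a dict mapping strain name -> set of gene-family IDs.

    pan  = families present in at least one genome (union)
    core = families present in every genome (intersection)
    """
    families = list(genomes.values())
    if not families:
        raise ValueError("at least one genome is required")
    pan = set().union(*families)
    core = set.intersection(*families)
    return pan, core


# Hypothetical toy data: three strains, six gene families in total.
toy = {
    "strain_A": {"f1", "f2", "f3", "f4"},
    "strain_B": {"f1", "f2", "f5"},
    "strain_C": {"f1", "f2", "f3", "f6"},
}

pan, core = pan_and_core(toy)
accessory = pan - core              # variable ("accessory") families
core_share = len(core) / len(pan)   # fraction of the pan-genome that is core

# Counts reported in the abstract: 993 core families out of 15,741.
reported_core_share = 993 / 15741   # about 0.063, i.e. roughly 6%
```

          In this toy case two of six families are shared by all strains, so the accessory share of the pan-genome is two thirds; with the abstract's counts the accessory share is above 90%, as stated.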

          Most cited references 44

          Clustal W and Clustal X version 2.0.

          The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of Linux, Macintosh and Windows operating systems. The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/

            RNAmmer: consistent and rapid annotation of ribosomal RNA genes

            The publication of a complete genome sequence is usually accompanied by annotations of its genes. In contrast to protein coding genes, genes for ribosomal RNA (rRNA) are often poorly or inconsistently annotated. This makes comparative studies based on rRNA genes difficult. We have therefore created computational predictors for the major rRNA species from all kingdoms of life and compiled them into a program called RNAmmer. The program uses hidden Markov models trained on data from the 5S ribosomal RNA database and the European ribosomal RNA database project. A pre-screening step makes the method fast with little loss of sensitivity, enabling the analysis of a complete bacterial genome in less than a minute. Results from running RNAmmer on a large set of genomes indicate that the location of rRNAs can be predicted with a very high level of accuracy. Novel, unannotated rRNAs are also predicted in many genomes. The software as well as the genome analysis results are available at the CBS web server.

              The complete genome sequence of Escherichia coli K-12.

              The 4,639,221-base pair sequence of Escherichia coli K-12 is presented. Of 4288 protein-coding genes annotated, 38 percent have no attributed function. Comparison with five other sequenced microbes reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident. The largest family of paralogous proteins contains 80 ABC transporters. The genome as a whole is strikingly organized with respect to the local direction of replication; guanines, oligonucleotides possibly related to replication and recombination, and most genes are so oriented. The genome also contains insertion sequence (IS) elements, phage remnants, and many other patches of unusual composition indicating genome plasticity through horizontal transfer.

                Author and article information

                Contributors
                +45-4525-2488, +45-4593-1585, dave@cbs.dtu.dk
                Journal
                Microbial Ecology (Microb Ecol)
                Springer-Verlag (New York)
                ISSN: 0095-3628, 1432-184X
                11 July 2010
                November 2010
                Volume 60, Issue 4, Pages 708-720
                Affiliations
                [1] Center for Biological Sequence Analysis, Building 208, Department of Systems Biology, The Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
                [2] Molecular Microbiology and Genomics Consultants, Tannenstrasse 7, 55576 Zotzenheim, Germany
                Article
                9717
                DOI: 10.1007/s00248-010-9717-3
                PMC ID: 2974192
                PubMed ID: 20623278
                © The Author(s) 2010
                Categories
                Minireviews
                Custom metadata
                © Springer Science+Business Media, LLC 2010

                Microbiology & Virology
