
      Linking high GC content to the repair of double strand breaks in prokaryotic genomes

      PLoS Genetics
      Public Library of Science


          Abstract

          Genomic GC content varies widely among microbes for reasons unknown. While mutation bias partially explains this variation, prokaryotes near-universally have a higher GC content than predicted solely by this bias. Debate surrounds the relative importance of the remaining explanations of selection versus biased gene conversion favoring GC alleles. Some environments (e.g. soils) are associated with a high genomic GC content of their inhabitants, which implies that either high GC content is a selective adaptation to particular habitats, or that certain habitats favor increased rates of gene conversion. Here, we report a novel association between the presence of the non-homologous end joining DNA double-strand break repair pathway and GC content; this observation suggests that DNA damage may be a fundamental driver of GC content, leading in part to the many environmental patterns observed to date. We discuss potential mechanisms accounting for the observed association, and provide preliminary evidence that sites experiencing higher rates of double-strand breaks are under selection for increased GC content relative to the genomic background.

          Author summary

          The overall nucleotide composition of an organism’s genome varies greatly between species. Previous work has identified certain environmental factors (e.g., oxygen availability) associated with the relative number of GC bases as opposed to AT bases in the genomes of species. Many of these environments that are associated with high GC content are also associated with relatively high rates of DNA damage. We show that organisms possessing the non-homologous end-joining DNA repair pathway, which is one mechanism to repair DNA double-strand breaks, have an elevated GC content relative to expectation. We also show that certain sites on the genome that are particularly susceptible to double-strand breaks have an elevated GC content. This leads us to suggest that an important underlying driver of variability in nucleotide composition across environments is the rate of DNA damage (specifically double-strand breaks) to which an organism living in each environment is exposed.
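
The quantity discussed above, GC content, is the fraction of a sequence's bases that are G or C rather than A or T. The following is a minimal illustrative sketch, not the authors' pipeline: the function names `gc_content` and `gc_excess` are hypothetical, and skipping ambiguous bases such as N is an assumption made here for simplicity.

```python
from collections import Counter


def gc_content(sequence: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C.

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    counts = Counter(sequence.upper())
    gc = counts["G"] + counts["C"]
    at = counts["A"] + counts["T"]
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous A/C/G/T bases")
    return gc / total


def gc_excess(site: str, background: str) -> float:
    """Return GC content of a site minus GC content of a background sequence.

    A positive value means the site is more GC-rich than the background.
    """
    return gc_content(site) - gc_content(background)


if __name__ == "__main__":
    genome = "ATGCATGC"   # toy background sequence
    site = "GGCC"         # toy site of interest
    print(gc_content(genome))
    print(gc_excess(site, genome))
```

A positive `gc_excess` for a site corresponds to the kind of pattern described above, where particular sites have elevated GC content relative to the genomic background; any real analysis would also need to control for sequence length and phylogenetic relatedness.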

          Most cited references (57)

          The All-Species Living Tree project: a 16S rRNA-based phylogenetic tree of all sequenced type strains.

          The signing authors together with the journal Systematic and Applied Microbiology (SAM) have started an ambitious project that has been conceived to provide a useful tool especially for the scientific microbial taxonomist community. The aim of what we have called "The All-Species Living Tree" is to reconstruct a single 16S rRNA tree harboring all sequenced type strains of the hitherto classified species of Archaea and Bacteria. This tree is to be regularly updated by adding the species with validly published names that appear monthly in the Validation and Notification lists of the International Journal of Systematic and Evolutionary Microbiology. For this purpose, the SAM executive editors, together with the responsible teams of the ARB, SILVA, and LPSN projects (www.arb-home.de, www.arb-silva.de, and www.bacterio.cict.fr, respectively), have prepared a 16S rRNA database containing over 6700 sequences, each of which represents a single type strain of a classified species up to 31 December 2007. The selection of sequences had to be undertaken manually due to a high error rate in the names and information fields provided for the publicly deposited entries. In addition, from among the often occurring multiple entries for a single type strain, the best-quality sequence was selected for the project. The living tree database that SAM now provides contains corrected entries and the best-quality sequences with a manually checked alignment. The tree reconstruction has been performed by using the maximum likelihood algorithm RAxML. The tree provided in the first release is a result of the calculation of a single dataset containing 9975 single entries, 6728 corresponding to type strain gene sequences, as well as 3247 additional high-quality sequences to give robustness to the reconstruction. Trees are dynamic structures that change on the basis of the quality and availability of the data used for their calculation. Therefore, the addition of new type strain sequences in further subsequent releases may help to resolve certain branching orders that appear ambiguous in this first release. On the web sites: www.elsevier.de/syapm and www.arb-silva.de/living-tree, the All-Species Living Tree team will release a regularly updated database compatible with the ARB software environment containing the whole 16S rRNA dataset used to reconstruct "The All-Species Living Tree". As a result, the latest reconstructed phylogeny will be provided. In addition to the ARB file, a readable multi-FASTA universal sequence editor file with the complete alignment will be provided for those not using ARB. There is also a complete set of supplementary tables and figures illustrating the selection procedure and its outcome. It is expected that the All-Species Living Tree will help to improve future classification efforts by simplifying the selection of the correct type strain sequences. For queries, information updates, remarks on the dataset or tree reconstructions shown, a contact email address has been created (living-tree@arb-silva.de). This provides an entry point for anyone from the scientific community to provide additional input for the construction and improvement of the first tree compiling all sequenced type strains of all prokaryotic species for which names had been validly published.

            A comparison of homologous recombination rates in bacteria and archaea.

            It is a standard practice to test for the signature of homologous recombination in studies examining the genetic diversity of bacterial populations. Although it has emerged that homologous recombination rates can vary widely between species, comparing the results from different studies is made difficult by the diversity of estimation methods used. Here, Multi Locus Sequence Typing (MLST) datasets from a wide variety of bacteria and archaea are analyzed using the ClonalFrame method. This enables a direct comparison between species and allows for a first exploration of the question whether phylogeny or ecology is the primary determinant of homologous recombination rate.

              REBASE--a database for DNA restriction and modification: enzymes, genes and genomes.

              REBASE is a comprehensive database of information about restriction enzymes, DNA methyltransferases and related proteins involved in the biological process of restriction-modification (R-M). It contains fully referenced information about recognition and cleavage sites, isoschizomers, neoschizomers, commercial availability, methylation sensitivity, crystal and sequence data. Experimentally characterized homing endonucleases are also included. The fastest growing segment of REBASE contains the putative R-M systems found in the sequence databases. Comprehensive descriptions of the R-M content of all fully sequenced genomes are available including summary schematics. The contents of REBASE may be browsed from the web (http://rebase.neb.com) and selected compilations can be downloaded by ftp (ftp.neb.com). Additionally, monthly updates can be requested via email.

                Author and article information

                Contributors
                Roles: Conceptualization, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing
                Roles: Writing – review & editing
                Roles: Conceptualization, Supervision, Writing – original draft, Writing – review & editing
                Role: Editor
                Journal
                PLoS Genet
                PLoS Genetics
                Public Library of Science (San Francisco, CA USA )
                ISSN (print): 1553-7390
                ISSN (electronic): 1553-7404
                Issue date: November 2019
                Published online: 8 November 2019
                Volume: 15
                Issue: 11
                Article number: e1008493
                Affiliations
                [001] Department of Biology, University of Maryland - College Park, College Park, Maryland, United States of America
                University of Warwick, UNITED KINGDOM
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                http://orcid.org/0000-0002-4237-4807
                http://orcid.org/0000-0001-6087-7064
                Article
                Manuscript ID: PGENETICS-D-19-01378
                DOI: 10.1371/journal.pgen.1008493
                PMC ID: 6867656
                PubMed ID: 31703064
                © 2019 Weissman et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 16 August 2019
                Accepted: 25 October 2019
                Page count
                Figures: 5, Tables: 1, Pages: 19
                Funding
                Funded by: U.S. Department of Education (funder ID: http://dx.doi.org/10.13039/100000138); Award ID: GAANN
                Funded by: Division of Graduate Education (funder ID: http://dx.doi.org/10.13039/100000082); Award ID: DGE-1632976
                Funded by: Army Research Laboratory (funder ID: http://dx.doi.org/10.13039/100006754); Award ID: W911NF-14-1-0490
                Funded by: National Institute of General Medical Sciences (funder ID: http://dx.doi.org/10.13039/100000057); Award ID: R00 GM104158
                JLW was supported by a GAANN Fellowship from the U.S. Department of Education and the University of Maryland as well as a COMBINE Fellowship from the University of Maryland and funded by NSF DGE-1632976. WFF was partially supported by the U.S. Army Research Laboratory and the U.S. Army Research Office under Grant W911NF-14-1-0490. PLFJ was supported in part by NIH R00 GM104158. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Genetics > DNA > DNA repair > Non-Homologous End Joining
                Biology and Life Sciences > Biochemistry > Nucleic acids > DNA > DNA repair > Non-Homologous End Joining
                Research and Analysis Methods > Database and Informatics Methods > Biological Databases > Genomic Databases
                Biology and Life Sciences > Computational Biology > Genome Analysis > Genomic Databases
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Genomic Databases
                Biology and Life Sciences > Computational Biology > Comparative Genomics
                Biology and Life Sciences > Genetics > Genomics > Comparative Genomics
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Systematics > Phylogenetics
                Biology and Life Sciences > Taxonomy > Evolutionary Systematics > Phylogenetics
                Computer and Information Sciences > Data Management > Taxonomy > Evolutionary Systematics > Phylogenetics
                Biology and Life Sciences > Cell Biology > Cellular Types > Prokaryotic Cells
                Biology and Life Sciences > Genetics > DNA > DNA recombination > Homologous Recombination
                Biology and Life Sciences > Biochemistry > Nucleic acids > DNA > DNA recombination > Homologous Recombination
                Biology and Life Sciences > Genetics > DNA > DNA repair
                Biology and Life Sciences > Biochemistry > Nucleic acids > DNA > DNA repair
                Biology and Life Sciences > Evolutionary Biology > Evolutionary Systematics > Phylogenetics > Phylogenetic Analysis
                Biology and Life Sciences > Taxonomy > Evolutionary Systematics > Phylogenetics > Phylogenetic Analysis
                Computer and Information Sciences > Data Management > Taxonomy > Evolutionary Systematics > Phylogenetics > Phylogenetic Analysis
                Custom metadata
                Version of record update to uncorrected proof: 2019-11-20
                All data used came from public repositories. Completely sequenced prokaryotic genomes were from NCBI’s non-redundant RefSeq database (ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/). Relationships between prokaryotes were from the SILVA Living Tree (https://www.arb-silva.de/projects/living-tree/). Clusters of related genomes were from the Alignable Tight Genomic Cluster (ATGC) database (http://dmk-brain.ecn.uiowa.edu/ATGC/). Prokaryotic trait data were from the ProTraits database (http://protraits.irb.hr/). Linkages between genomes and restriction enzymes were from the REBASE database (http://rebase.neb.com/rebase/rebase.html). Intermediate data files and code may be found at: https://github.com/jlw-ecoevo/gcku.
