      RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12

      research-article
      Microbial Genomics
      Microbiology Society
      ChIP-seq, ChIP-exo, RNA-seq, gSELEX, DAP-seq, Transcriptional Regulatory Network, High-Throughput Nucleotide Sequencing, Escherichia coli K-12

          Abstract

          Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high- and low-throughput data on transcriptional regulation in Escherichia coli K-12 in a form that is easy to use either as whole datasets or as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites and transcription units, as well as transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, and expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors and data uniformly processed in-house, which enhances their comparability as well as the traceability of the methods and the reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata. A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy applying natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of such a wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12.

                Author and article information

                Journal
                Microb Genom
                mgen
                Microbial Genomics
                Microbiology Society
                2057-5858
                18 May 2022
                Volume: 8
                Issue: 5
                Article: mgen000833
                Affiliations
                [1] Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Avenida Universidad s/n, Cuernavaca 62210, Morelos, Mexico
                [2] Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, MA 02215, USA
                [3] Instituto Nacional de Medicina Genómica, INMEGEN, Periférico Sur 4809, Arenal Tepepan, Tlalpan 14610, CDMX, Mexico
                [4] Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Querétaro 76230, Querétaro, Mexico
                [5] Department of Biology, Wilfrid Laurier University, 75 University Ave W, Waterloo, ON N2L 3C5, Canada
                [6] Wadsworth Center, New York State Department of Health, Albany, NY, USA
                [7] Department of Biomedical Sciences, University at Albany, SUNY, Albany, NY, USA
                [8] Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Universitat Pompeu Fabra (UPF), Barcelona, Spain
                Author notes
                [†] These authors contributed equally to this work

                *Correspondence: Julio Collado-Vides, colladojulio@gmail.com
                Author information
                https://orcid.org/0000-0003-3506-8657
                https://orcid.org/0000-0002-7684-8679
                https://orcid.org/0000-0002-3166-5801
                https://orcid.org/0000-0001-7708-5143
                https://orcid.org/0000-0002-6320-9501
                https://orcid.org/0000-0002-9462-2737
                https://orcid.org/0000-0002-8895-3564
                https://orcid.org/0000-0002-2549-1614
                https://orcid.org/0000-0003-4966-138X
                https://orcid.org/0000-0002-2457-4450
                https://orcid.org/0000-0001-8780-7664
                Article
                Publisher ID: 000833
                DOI: 10.1099/mgen.0.000833
                PMC ID: 9465075
                PubMed ID: 35584008
                967faaa8-0d1f-497c-80db-2439da502ec6
                © 2022 The Authors

                This is an open-access article distributed under the terms of the Creative Commons Attribution License.

                History
                Received: 18 December 2021
                Accepted: 24 April 2022
                Funding
                Funded by: DGAPA-UNAM
                Award ID: 369220
                Award Recipient: Estefani Gaytan-Nuñez
                Funded by: DGAPA-UNAM
                Award ID: 18182
                Award Recipient: Estefani Gaytan-Nuñez
                Funded by: CONACyT
                Award ID: 929687
                Award Recipient: Claire Rioualen
                Funded by: Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada
                Award Recipient: Gabriel Moreno-Hagelsieb
                Funded by: National Institute of General Medical Sciences
                Award ID: 5RO1GM131643
                Award Recipient: Julio Collado-Vides
                Funded by: UNAM-PAPIIT
                Award ID: IA203420
                Award Recipient: Carlos-Francisco Méndez-Cruz
                Funded by: DGAPA-UNAM
                Award ID: Postdoctoral Fellowship
                Award Recipient: Paloma Lara
                Categories
                Research Articles
                Genomic Methodologies