      CAGEfightR: analysis of 5′-end data using R/Bioconductor

      research-article


          Abstract

          Background

          5′-end sequencing assays, and Cap Analysis of Gene Expression (CAGE) in particular, have been instrumental in studying transcriptional regulation. 5′-end methods provide genome-wide maps of transcription start sites (TSSs) at base-pair resolution. Because active enhancers often feature bidirectional TSSs, such data can also be used to predict enhancer candidates. Few mature and comprehensive computational tools currently exist for the analysis of 5′-end data, which hinders efficient analysis of new and existing datasets.

          Results

          We present CAGEfightR, a framework for the analysis of CAGE and other 5′-end data, implemented as an R/Bioconductor package. CAGEfightR can import data from BigWig files and allows fast and memory-efficient prediction and analysis of TSSs and enhancers. Downstream analyses include quantification, normalization, annotation with transcript and gene models, TSS shape statistics, linking TSSs to enhancers via co-expression, identification of enhancer clusters, and genome-browser style visualization. Although CAGEfightR was built to analyze CAGE data, we demonstrate its utility in analyzing nascent RNA 5′-end data (PRO-Cap). CAGEfightR is implemented using standard Bioconductor classes, making it easy to learn, use and combine with other Bioconductor packages, for example popular differential expression tools such as limma, DESeq2 and edgeR.
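
To make two of the steps named above more concrete (normalization and grouping nearby 5′-ends into TSS candidates), here is a minimal plain-Python sketch. It is not CAGEfightR code and does not use the package's API. The function names `to_tpm` and `cluster_sites`, the thresholds `min_tpm` and `max_gap`, and the toy counts are all assumptions chosen for illustration. The merging rule is a simple distance-based grouping, which may differ from the clustering the package actually performs.

```python
from collections import defaultdict


def to_tpm(counts):
    """Rescale raw 5'-end tag counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        return {site: 0.0 for site in counts}
    return {site: count * 1e6 / total for site, count in counts.items()}


def cluster_sites(tpm, min_tpm=1.0, max_gap=20):
    """Merge same-strand sites with TPM >= min_tpm that lie within max_gap bp.

    Returns a list of (chrom, start, end, strand, summed_tpm) tuples.
    Both thresholds are illustrative defaults, not values from the paper.
    """
    by_strand = defaultdict(list)
    for (chrom, pos, strand), value in tpm.items():
        if value >= min_tpm:
            by_strand[(chrom, strand)].append((pos, value))

    clusters = []
    for (chrom, strand), sites in sorted(by_strand.items()):
        sites.sort()
        start, end, total = sites[0][0], sites[0][0], sites[0][1]
        for pos, value in sites[1:]:
            if pos - end <= max_gap:
                # Close enough to the current cluster: extend it.
                end, total = pos, total + value
            else:
                # Gap too large: close the current cluster and start a new one.
                clusters.append((chrom, start, end, strand, total))
                start, end, total = pos, pos, value
        clusters.append((chrom, start, end, strand, total))
    return clusters


# Toy data (hypothetical): (chromosome, position, strand) -> raw tag count
counts = {
    ("chr1", 100, "+"): 50,
    ("chr1", 105, "+"): 30,
    ("chr1", 500, "+"): 20,
    ("chr1", 102, "-"): 0,
}
clusters = cluster_sites(to_tpm(counts))
for cluster in clusters:
    print(cluster)
```

In this toy input the two plus-strand sites five base pairs apart are merged into one candidate, while the distant site at position 500 forms its own cluster and the zero-count minus-strand site is filtered out.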

          Conclusions

          CAGEfightR provides a single, scalable and easy-to-use framework for comprehensive downstream analysis of 5′-end data. CAGEfightR is designed to be interoperable with other Bioconductor packages, thereby unlocking hundreds of mature transcriptomic analysis tools for 5′-end data. CAGEfightR is freely available via Bioconductor: bioconductor.org/packages/CAGEfightR .

          Electronic supplementary material

          The online version of this article (10.1186/s12859-019-3029-5) contains supplementary material, which is available to authorized users.


                Author and article information

                Contributors
                malte.thodberg@bric.ku.dk
                albin@binf.ku.dk
                Journal
                BMC Bioinformatics
                BioMed Central (London)
                1471-2105
                4 October 2019
                Volume: 20
                Article number: 487
                Affiliations
                [1] ISNI 0000 0001 0674 042X, GRID grid.5254.6, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK2100, Copenhagen N, Denmark
                [2] ISNI 0000 0001 0674 042X, GRID grid.5254.6, Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaløes Vej 5, DK2100, Copenhagen N, Denmark
                [3] ISNI 0000 0001 2175 6024, GRID grid.417390.8, Danish Cancer Society, Strandboulevarden 49, DK2100, Copenhagen Ø, Denmark
                Author information
                http://orcid.org/0000-0001-6244-3841
                Article
                3029
                10.1186/s12859-019-3029-5
                6778389
                31585526
                598643f7-5676-4872-93ef-1ae02f5348ac
                © The Author(s). 2019

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 15 March 2019
                Accepted: 15 August 2019
                Categories
                Software

                Bioinformatics & Computational biology
                transcription start site, promoter, enhancer, enhancer RNA, CAGE, 5′-end methods, R package, Bioconductor
