      Is Open Access

      iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections

      research-article


          Abstract

          Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor (TF) motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and by linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, profile the cells by RNA-seq and ChIP-seq, and identify an extensive up-regulated network controlled directly by p53. Similarly, we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org.
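The ranking-and-recovery idea named in the abstract can be illustrated with a minimal sketch. It is not the iRegulon implementation: it assumes each motif comes with a genome-wide gene ranking (best-scoring gene first), measures how quickly the input gene set is recovered near the top of each ranking, and standardizes that value across motifs. The function names, the `top_n` cutoff, the toy gene and motif names, and the z-score style normalization are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def recovery_auc(ranking, gene_set, top_n):
    """Area under the recovery curve over the top_n ranked genes.

    The area is divided by top_n * len(gene_set), so it is at most 1.
    """
    hits = 0
    area = 0
    for gene in ranking[:top_n]:
        if gene in gene_set:
            hits += 1
        area += hits  # height of the recovery curve at this rank
    return area / (top_n * len(gene_set))


def normalized_enrichment_scores(rankings, gene_set, top_n):
    """Standardize each motif's AUC against the AUCs of all motifs."""
    aucs = {motif: recovery_auc(r, gene_set, top_n) for motif, r in rankings.items()}
    mu = mean(aucs.values())
    sigma = pstdev(aucs.values())
    if sigma == 0:
        return {motif: 0.0 for motif in aucs}
    return {motif: (auc - mu) / sigma for motif, auc in aucs.items()}


# Toy data (hypothetical gene and motif names).
gene_set = {"g1", "g2", "g3"}
rankings = {
    "motif_A": ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"],
    "motif_B": ["g8", "g7", "g6", "g5", "g4", "g3", "g2", "g1"],
    "motif_C": ["g4", "g1", "g5", "g2", "g6", "g3", "g7", "g8"],
}
scores = normalized_enrichment_scores(rankings, gene_set, top_n=4)
best_motif = max(scores, key=scores.get)
```

In this toy setting `motif_A` ranks all three input genes first and therefore receives the highest score. A real analysis would use thousands of motifs and genome-wide rankings, as the abstract describes.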

          Author Summary

          Gene regulatory networks control developmental, homeostatic, and disease processes by governing precise levels and spatio-temporal patterns of gene expression. Determining their topology can provide mechanistic insight into these processes. Gene regulatory networks consist of interactions between transcription factors and their direct target genes. Each regulatory interaction represents the binding of the transcription factor to a specific DNA binding site near its target gene. Here we present a computational method, called iRegulon, to identify master regulators and direct target genes in a human gene signature, i.e. a set of co-expressed genes. iRegulon relies on the analysis of the regulatory sequences around each gene in the gene set to detect enriched TF motifs or ChIP-seq peaks, using databases of nearly 10,000 TF motifs and 1000 ChIP-seq data sets or “tracks”. Next, it associates enriched motifs and tracks with candidate transcription factors and determines the optimal subset of direct target genes. We validate iRegulon on ENCODE data, and use it in combination with RNA-seq and ChIP-seq data to map a p53 downstream network with new predicted co-factors and targets. iRegulon is available as a Cytoscape plugin, supporting human, mouse, and Drosophila genes, and provides access to hundreds of cancer-related TF-target subnetworks or “regulons”.
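The last two steps of the summary, associating an enriched motif with candidate transcription factors and choosing a subset of direct targets, can be sketched as follows. This is an illustrative simplification rather than the published procedure: the motif-to-TF table is a hypothetical lookup, and the target subset is taken at the rank where the observed recovery most exceeds what a uniformly random ranking would give. That cutoff criterion, the function name, and all gene, motif, and TF names are assumptions for the example.

```python
def leading_edge_targets(ranking, gene_set, top_n):
    """Return the gene-set members ranked at or before the position where
    observed recovery most exceeds the uniform-random expectation."""
    expected_rate = len(gene_set) / len(ranking)  # expected hits per rank
    hits = 0
    best_excess = 0.0
    best_rank = 0
    for rank, gene in enumerate(ranking[:top_n], start=1):
        if gene in gene_set:
            hits += 1
        excess = hits - rank * expected_rate
        if excess > best_excess:
            best_excess, best_rank = excess, rank
    return [gene for gene in ranking[:best_rank] if gene in gene_set]


# Hypothetical motif-to-TF annotation table (names are placeholders).
motif_to_tfs = {"motif_A": ["TF_X", "TF_Y"]}

# Toy ranking for one enriched motif and a toy input gene set.
ranking = ["g1", "g2", "g9", "g3", "g4", "g5", "g6", "g7", "g8", "g10"]
gene_set = {"g1", "g2", "g3", "g8"}

targets = leading_edge_targets(ranking, gene_set, top_n=10)

# One candidate regulon per TF annotated to the enriched motif.
regulons = {tf: targets for tf in motif_to_tfs["motif_A"]}
```

Here the excess recovery peaks at rank 4, so `g8`, which sits far down the ranking, is left out of the predicted target set even though it belongs to the input gene list.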


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 24 July 2014
                Volume: 10
                Issue: 7
                Article: e1003731
                Affiliations
                [1] Laboratory of Computational Biology, KU Leuven Center for Human Genetics, Leuven, Belgium
                [2] Laboratory for Molecular Cancer Biology, KU Leuven Center for Human Genetics, Leuven, Belgium
                [3] VIB Center for the Biology of Disease, Laboratory for Molecular Cancer Biology, Leuven, Belgium
                Columbia University, United States of America
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: JCM SA. Performed the experiments: RJ AV LS VC. Analyzed the data: RJ AV SA. Wrote the paper: RJ AV SA. Developed the software: RJ BVdS KH GH SA. Performed the tool comparison: RJ AV HI LS VC MNS DP DS ZKA MF.

                Article
                Manuscript number: PCOMPBIOL-D-14-00274
                DOI: 10.1371/journal.pcbi.1003731
                PMC ID: 4109854
                PubMed ID: 25058159
                afb052dd-4d7b-4132-8eb2-d2f431a4fdf7
                Copyright © 2014

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 13 February 2014
                Accepted: 27 May 2014
                Page count
                Pages: 19
                Funding
                This work is funded by FWO (www.fwo.be) (grants G.0704.11N and G.0640.13 to SA), Special Research Fund (BOF) KU Leuven (http://www.kuleuven.be/research/funding/bof/) (grants PF/10/016 and OT/13/103 to SA), HFSP (www.hfsp.org) (grant RGY0070/2011 to SA), and Foundation Against Cancer (http://www.cancer.be) (grants 2010-154 and 2012-F2 to SA). RJ is supported by postdoc fellowships from Belspo, KU Leuven Research Fund (F+) and FWO. AV and LS have PhD fellowships from FWO. BVdS was supported by a 1-year fellowship from the Vlaamse Liga tegen Kanker (VLK). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences
                Computational Biology
                Genome Analysis
                Transcriptome Analysis
                Genetics
                Genomics
                Computer and Information Sciences
                Network Analysis
                Regulatory Networks

                Quantitative & Systems biology