      A Biophysical Model for Analysis of Transcription Factor Interaction and Binding Site Arrangement from Genome-Wide Binding Data


          Abstract

          Background

          How transcription factors (TFs) interact with cis-regulatory sequences and with each other is a fundamental but poorly understood aspect of gene regulation.

          Methodology/Principal Findings

          We present a computational method to address this question, relying on established biophysical principles. This method, STAP (sequence to affinity prediction), takes into account all combinations and configurations of strong and weak binding sites to analyze large-scale TF-DNA binding data, with three aims: to discover cooperative interactions among TFs, to infer sequence rules of interaction, and to predict TF target genes in new conditions for which no TF-DNA binding data are available. STAP differs from other statistical approaches for analyzing cis-regulatory sequences in its use of physical principles and in its treatment of DNA binding data as a quantitative representation of binding strengths. Applying this method to the ChIP-seq data of 12 TFs in mouse embryonic stem (ES) cells, we found that the strength of TF-DNA binding could be significantly modulated by cooperative interactions among TFs with adjacent binding sites. However, further analysis of five putatively interacting TF pairs suggests that such interactions may be relatively insensitive to the distance and orientation of binding sites. Testing a set of putative Nanog motifs, STAP showed that a novel Nanog motif could better explain the ChIP-seq data than previously published ones. We then experimentally tested and verified the new Nanog motif. A series of comparisons showed that STAP has more predictive power than several state-of-the-art methods for cis-regulatory sequence analysis. We took advantage of this power to study the evolution of TF-target relationships in Drosophila. By learning the TF-DNA interaction models from the ChIP-chip data of D. melanogaster (Mel) and applying them to the genome of D. pseudoobscura (Pse), we found that only about half of the sequences strongly bound by TFs in Mel have high binding affinities in Pse. We show that prediction of functional TF targets from ChIP-chip data can be improved by using the conservation of STAP-predicted affinities as an additional filter.
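
          The idea of summing over "all combinations and configurations of strong and weak binding sites" can be illustrated with a small thermodynamic occupancy calculation. The sketch below is not the STAP implementation; it is a minimal example under stated assumptions: a single TF, sites given as hypothetical (position, weight) pairs, a fixed site length, and a single cooperativity factor applied to neighboring bound sites within a cutoff distance. The names site_weight, expected_occupancy, coop, and max_coop_dist are all illustrative and do not come from the article.

```python
import math
from itertools import combinations


def site_weight(energy_gap, tf_conc=1.0):
    """Boltzmann weight of one bound site.

    energy_gap: mismatch energy (in units of kT) relative to the best site.
    tf_conc:    assumed scaling for TF concentration (hypothetical parameter).
    """
    return tf_conc * math.exp(-energy_gap)


def expected_occupancy(sites, site_len=6, coop=1.0, max_coop_dist=20):
    """Expected number of bound TF molecules on one sequence.

    sites: list of (position, weight) pairs for candidate binding sites.
    Every subset of non-overlapping sites is one configuration. Its
    statistical weight is the product of the bound-site weights, times
    `coop` for each pair of neighboring bound sites whose spacing is at
    most `max_coop_dist`. Brute-force enumeration, for illustration only.
    """
    ordered = sorted(sites)
    partition = 0.0   # sum of weights over all configurations (Z)
    bound_sum = 0.0   # sum of (number bound * weight)
    for k in range(len(ordered) + 1):
        for config in combinations(ordered, k):
            weight = 1.0
            valid = True
            # Check neighboring bound sites for overlap and cooperativity.
            for (p1, _), (p2, _) in zip(config, config[1:]):
                gap = p2 - p1
                if gap < site_len:
                    valid = False  # two molecules cannot overlap
                    break
                if gap - site_len <= max_coop_dist:
                    weight *= coop
            if not valid:
                continue
            for _, q in config:
                weight *= q
            partition += weight
            bound_sum += k * weight
    return bound_sum / partition


if __name__ == "__main__":
    strong = site_weight(0.0)   # consensus-like site
    weak = site_weight(2.0)     # weaker site, 2 kT worse
    sites = [(10, strong), (22, weak)]
    print(expected_occupancy(sites, coop=1.0))
    print(expected_occupancy(sites, coop=10.0))
```

          In this toy setting, raising coop above 1 increases the predicted occupancy of a sequence carrying a strong site next to a weak one, which is the kind of effect the abstract describes for TFs with adjacent binding sites. A practical implementation would replace the exponential enumeration with dynamic programming and fit the parameters to ChIP data.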

          Conclusions/Significance

          STAP is an effective method to analyze binding site arrangements, TF cooperativity, and TF target genes from genome-wide TF-DNA binding data.

                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS ONE
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 1 December 2009
                Volume 4, Issue 12, e8155
                Affiliations
                [1] Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America
                [2] Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America
                [3] Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America
                [4] Gene Regulation Laboratory, Genome Institute of Singapore, Singapore, Singapore
                National University of Ireland Galway, Ireland
                Author notes

                Conceived and designed the experiments: XH SZ. Performed the experiments: XH CCC FH FF HHN. Analyzed the data: XH CCC FH SZ. Contributed reagents/materials/analysis tools: XH CCC SS HHN. Wrote the paper: XH SS SZ.

                Article
                09-PONE-RA-13262
                DOI: 10.1371/journal.pone.0008155
                2780727
                19956545
                dc3774ef-da7f-4ec4-bbf7-3b0b334f9f3a
                He et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 30 September 2009
                Accepted: 10 November 2009
                Page count
                Pages: 14
                Categories
                Research Article
                Computational Biology/Sequence Motif Analysis
                Computational Biology/Transcriptional Regulation
                Developmental Biology/Stem Cells
