
      Discovery of non-directional and directional pioneer transcription factors by modeling DNase profile magnitude and shape

      research-article


          Abstract

          Here we describe Protein Interaction Quantitation (PIQ), a computational method that models the magnitude and shape of genome-wide DNase profiles to facilitate the identification of transcription factor (TF) binding sites. Through the use of machine learning techniques, PIQ identified binding sites for >700 TFs from one DNase-seq experiment with accuracy comparable to ChIP-seq for motif-associated TFs (median AUC=0.93 across 303 TFs). We applied PIQ to analyze DNase-seq data from mouse embryonic stem cells differentiating into pre-pancreatic and intestinal endoderm. We identified (n=120) and experimentally validated eight ‘pioneer’ TF families that dynamically open chromatin, enabling other TFs to bind to adjacent DNA. Four pioneer TF families only open chromatin in one direction from their motifs. Furthermore, we identified a class of ‘settler’ TFs whose genomic binding is principally governed by proximity to open chromatin. Our results support a model of hierarchical TF binding in which directional and non-directional pioneer activity shapes the chromatin landscape for population by settler TFs.


          Most cited references (36)


          High-resolution mapping and characterization of open chromatin across the genome.

          Mapping DNase I hypersensitive (HS) sites is an accurate method of identifying the location of genetic regulatory elements, including promoters, enhancers, silencers, insulators, and locus control regions. We employed high-throughput sequencing and whole-genome tiled array strategies to identify DNase I HS sites within human primary CD4+ T cells. Combining these two technologies, we have created a comprehensive and accurate genome-wide open chromatin map. Surprisingly, only 16%-21% of the identified 94,925 DNase I HS sites are found in promoters or first exons of known genes, but nearly half of the most open sites are in these regions. In conjunction with expression, motif, and chromatin immunoprecipitation data, we find evidence of cell-type-specific characteristics, including the ability to identify transcription start sites and locations of different chromatin marks utilized in these cells. In addition, and unexpectedly, our analyses have uncovered detailed features of nucleosome structure.

            Opening of compacted chromatin by early developmental transcription factors HNF3 (FoxA) and GATA-4.

            The transcription factors HNF3 (FoxA) and GATA-4 are the earliest known to bind the albumin gene enhancer in liver precursor cells in embryos. To understand how they access sites in silent chromatin, we assembled nucleosome arrays containing albumin enhancer sequences and compacted them with linker histone. HNF3 and GATA-4, but not NF-1, C/EBP, and GAL4-AH, bound their sites in compacted chromatin and opened the local nucleosomal domain in the absence of ATP-dependent enzymes. The ability of HNF3 to open chromatin is mediated by a high affinity DNA binding site and by the C-terminal domain of the protein, which binds histones H3 and H4. Thus, factors that potentiate transcription in development are inherently capable of initiating chromatin opening events.

              An expansive human regulatory lexicon encoded in transcription factor footprints

              Regulatory factor binding to genomic DNA protects the underlying sequence from cleavage by DNaseI, leaving nucleotide-resolution footprints. Using genomic DNaseI footprinting across 41 diverse cell and tissue types, we detected 45 million factor occupancy events within regulatory regions, representing differential binding to 8.4 million distinct short sequence elements. Here we show that this small genomic sequence compartment, roughly twice the size of the exome, encodes an expansive repertoire of conserved recognition sequences for DNA-binding proteins that nearly doubles the size of the human cis-regulatory lexicon. We find that genetic variants affecting allelic chromatin states are concentrated in footprints, and that these elements are preferentially sheltered from DNA methylation. High-resolution DNaseI cleavage patterns mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein-DNA interfaces, indicating that transcription factor structure has been evolutionarily imprinted on the human genome sequence. We identify a stereotyped 50 base-pair footprint that precisely defines the site of transcript origination within thousands of human promoters. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation, and pluripotency.

                Author and article information

                Journal
                9604648
                20305
                Nat Biotechnol
                Nat. Biotechnol.
                Nature biotechnology
                1087-0156
                1546-1696
                13 February 2014
                19 January 2014
                February 2014
                01 August 2014
                Volume 32, Issue 2, Pages 171-178
                Affiliations
                [1] Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
                [2] Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142
                [3] Department of Stem Cell and Regenerative Biology, Harvard University and Harvard Medical School, 7 Divinity Avenue, Cambridge, MA 02138
                Author notes
                Please direct correspondence to R.I.S. (rsherwood@partners.org) and D.K.G. (gifford@mit.edu).
                Article
                NIHMS549898
                10.1038/nbt.2798
                3951735
                24441470
                d200c8dc-6169-40e3-9c1e-718652ecba83

                Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

                Categories
                Article

                Biotechnology