      Integrative annotation of chromatin elements from ENCODE data

      research-article

          Abstract

          The ENCODE Project has generated a wealth of experimental information mapping diverse chromatin properties in several human cell lines. Although each such data track is independently informative toward the annotation of regulatory elements, their interrelations contain much richer information for the systematic annotation of regulatory elements. To uncover these interrelations and to generate an interpretable summary of the massive datasets of the ENCODE Project, we apply unsupervised learning methodologies, converting dozens of chromatin datasets into discrete annotation maps of regulatory regions and other chromatin elements across the human genome. These methods rediscover and summarize diverse aspects of chromatin architecture, elucidate the interplay between chromatin activity and RNA transcription, and reveal that a large proportion of the genome lies in a quiescent state, even across multiple cell types. The resulting annotations of non-coding regulatory elements correlate strongly with mammalian evolutionary constraint, and provide an unbiased approach for evaluating metrics of evolutionary constraint in human. Lastly, we use the regulatory annotations to revisit previously uncharacterized disease-associated loci, resulting in focused, testable hypotheses through the lens of the chromatin landscape.

                Author and article information

                Journal
                Nucleic Acids Research (Nucleic Acids Res)
                Oxford University Press
                ISSN: 0305-1048, 1362-4962
                January 2013
                5 December 2012
                Volume: 41
                Issue: 2
                Pages: 827-841
                Affiliations
                1 Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, WA 98195-5065
                2 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA 02139
                3 Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
                4 EMBL–European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, England, UK
                5 Department of Computer Science, Stanford University, 318 Campus Dr, Stanford, CA 94305
                6 Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, PA 16802
                7 Department of Computer Science and Engineering, University of Washington, 185 Stevens Way, Seattle, WA 98195-2350
                8 Department of Electrical Engineering, University of Washington, 185 Stevens Way, Seattle, WA 98195-2500, USA
                Author notes
                *To whom correspondence should be addressed. Tel: +1 206 221 4973; Fax: +1 206 685 7301; Email: william-noble@uw.edu
                Correspondence may also be addressed to Manolis Kellis. Tel: +1 617 253 2419; Fax: +1 617 452 5034; Email: manoli@mit.edu

                The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

                Present addresses: Jason Ernst, Department of Biological Chemistry, University of California, Los Angeles, 615 Charles E Young Dr S, Los Angeles, CA 90095, USA.

                Anshul Kundaje, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA 02139, USA.

                Article
                Publisher ID: gks1284
                DOI: 10.1093/nar/gks1284
                PMCID: PMC3553955
                PMID: 23221638
                35b068ea-7379-45a5-9957-185cec1dad6a
                © The Author(s) 2012. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

                History
                Received: 13 September 2012
                Revised: 5 November 2012
                Accepted: 10 November 2012
                Page count
                Pages: 15
                Categories
                Gene Regulation, Chromatin and Epigenetics

                Genetics
