      Computational Identification of Genomic Features That Influence 3D Chromatin Domain Formation

      PLoS Computational Biology

      Public Library of Science

          Abstract

          Recent advances in long-range Hi-C contact mapping have revealed the importance of the 3D structure of chromosomes in gene expression. A current challenge is to identify the key molecular drivers of this 3D structure. Several genomic features, such as architectural proteins and functional elements, have been shown to be enriched at topological domain borders using classical enrichment tests. Here we propose multiple logistic regression to identify the genomic features that positively or negatively influence domain border establishment or maintenance. The model is flexible and can account for statistical interactions among multiple genomic features. Using both simulated and real data, we show that our model outperforms enrichment tests and non-parametric models, such as random forests, for the identification of genomic features that influence domain borders. Using Drosophila Hi-C data at a very high resolution of 1 kb, our model suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domain borders. In humans, our model identifies the well-known architectural proteins CTCF and cohesin, as well as ZNF143 and Polycomb group proteins, as positive drivers of domain borders. The model also reveals the existence of several negative drivers that counteract the presence of domain borders, including P300, RXRA, BCL11A and ELK1.
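
The abstract does not spell out the model. As a reading aid, a standard form of multiple logistic regression with pairwise interaction terms can be written as below. The notation is an assumption made here, not taken from the article: Y_i = 1 if genomic bin i is a domain border, X_ij is the value of genomic feature j in bin i, and p is the number of features.

```latex
\[
\operatorname{logit} P(Y_i = 1 \mid X_i)
  = \log \frac{P(Y_i = 1 \mid X_i)}{1 - P(Y_i = 1 \mid X_i)}
  = \beta_0 + \sum_{j=1}^{p} \beta_j X_{ij}
  + \sum_{1 \le j < k \le p} \beta_{jk} X_{ij} X_{ik}
\]
```

Under this form, a feature with \beta_j > 0 would be read as a positive driver of borders and one with \beta_j < 0 as a negative driver, each conditional on the other features in the model; the \beta_{jk} terms capture statistical interactions between pairs of features.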

          Author Summary

          Chromosomal DNA is tightly packed in 3D such that around 2 meters of this long molecule fits into the microscopic nucleus of every cell. Genome packing is not random but structured into 3D domains that are essential to numerous key processes in the cell, such as the regulation of gene expression and the replication of DNA. A current challenge is to identify the key molecular drivers of this higher-order chromosome organization. Here we propose a novel computational integrative approach to identify proteins and DNA elements that positively or negatively influence the establishment or maintenance of 3D domains. Analysis of Drosophila data at very high resolution suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domains. In humans, our results highlight the roles of CTCF, cohesin, ZNF143 and Polycomb group proteins as positive drivers of 3D domains, in contrast to P300, RXRA, BCL11A and ELK1, which act as negative drivers.
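
The toy simulation below is a rough sketch of how a multiple regression can separate positive and negative drivers from features that merely colocalize with a driver. It is not the authors' code or data. The features A, B and C, their frequencies, the effect sizes, and the helper names `solve`, `fit_logistic`, `odds_ratio` and `simulate` are all hypothetical. A raises the probability of a border, C lowers it, and B has no effect of its own but tends to sit in the same bins as A. A marginal odds ratio stands in for a classical enrichment test, and the logistic regression is fitted by plain Newton-Raphson.

```python
import math
import random


def solve(A, b):
    """Solve the small linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_logistic(X, y, n_iter=15):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    Each row of X must start with 1.0 (the intercept column).
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, xi))))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


def odds_ratio(feature, y):
    """Marginal 2x2 odds ratio between one binary feature and border status."""
    n11 = sum(1 for f, b in zip(feature, y) if f and b)
    n10 = sum(1 for f, b in zip(feature, y) if f and not b)
    n01 = sum(1 for f, b in zip(feature, y) if not f and b)
    n00 = sum(1 for f, b in zip(feature, y) if not f and not b)
    return (n11 * n00) / (n10 * n01)


def simulate(n_bins=6000, seed=0):
    """Toy genome: A drives borders, B colocalizes with A, C opposes borders."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_bins):
        a = 1 if rng.random() < 0.3 else 0
        # B copies A 80% of the time, otherwise it is drawn independently.
        b = a if rng.random() < 0.8 else (1 if rng.random() < 0.3 else 0)
        c = 1 if rng.random() < 0.3 else 0
        eta = -2.0 + 2.0 * a - 1.5 * c  # B has no effect of its own
        y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)
        X.append([1.0, a, b, c])
    return X, y


X, y = simulate()
beta = fit_logistic(X, y)
or_b = odds_ratio([row[2] for row in X], y)
print("marginal odds ratio for B:", round(or_b, 2))
print("coefficients (intercept, A, B, C):", [round(v, 2) for v in beta])
```

In this setup the marginal odds ratio makes B look enriched at borders, because B shares bins with A. The regression coefficient for B should sit near zero once A is in the model, while A comes out positive and C negative. Interaction terms could be added by appending product columns such as `a * c` to each row of `X`.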

          Most cited references 54

          An Integrated Encyclopedia of DNA Elements in the Human Genome

          The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research.

            Comprehensive mapping of long-range interactions reveals folding principles of the human genome.

            We describe Hi-C, a method that probes the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. We constructed spatial proximity maps of the human genome with Hi-C at a resolution of 1 megabase. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. At the megabase scale, the chromatin conformation is consistent with a fractal globule, a knot-free, polymer conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus. The fractal globule is distinct from the more commonly used globular equilibrium model. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.

              Topological Domains in Mammalian Genomes Identified by Analysis of Chromatin Interactions

              The spatial organization of the genome is intimately linked to its biological function, yet our understanding of higher order genomic structure is coarse, fragmented and incomplete. In the nucleus of eukaryotic cells, interphase chromosomes occupy distinct chromosome territories (CT), and numerous models have been proposed for how chromosomes fold within CTs [1]. These models, however, provide only few mechanistic details about the relationship between higher order chromatin structure and genome function. Recent advances in genomic technologies have led to rapid revolutions in the study of 3D genome organization. In particular, Hi-C has been introduced as a method for identifying higher order chromatin interactions genome wide [2]. In the present study, we investigated the 3D organization of the human and mouse genomes in embryonic stem cells and terminally differentiated cell types at unprecedented resolution. We identify large, megabase-sized local chromatin interaction domains, which we term “topological domains”, as a pervasive structural feature of the genome organization. These domains correlate with regions of the genome that constrain the spread of heterochromatin. The domains are stable across different cell types and highly conserved across species, suggesting that topological domains are an inherent property of mammalian genomes. Lastly, we find that the boundaries of topological domains are enriched for the insulator binding protein CTCF, housekeeping genes, tRNAs, and SINE retrotransposons, suggesting that these factors may play a role in establishing the topological domain structure of the genome.

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                May 2016 (20 May 2016)
                Volume 12, Issue 5
                Affiliations
                Laboratoire de Biologie Moléculaire Eucaryote (LBME), CNRS, Université Paul Sabatier (UPS), Toulouse, France
                University of Pennsylvania, UNITED STATES
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: RM. Performed the experiments: RM. Analyzed the data: RM. Contributed reagents/materials/analysis tools: RM. Wrote the paper: RM OC.

                Manuscript ID: PCOMPBIOL-D-16-00058
                DOI: 10.1371/journal.pcbi.1004908
                PMC ID: 4874696
                PubMed ID: 27203237
                © 2016 Mourad, Cuvier

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                Counts
                Figures: 8, Tables: 1, Pages: 24
                Funding
                This work was supported by the University of Toulouse IDEX program, the CNRS and by the ANR ‘INSULa’. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Genetics > Genomics > Animal Genomics > Invertebrate Genomics
                Research and Analysis Methods > Model Organisms > Animal Models > Drosophila Melanogaster
                Biology and Life Sciences > Organisms > Animals > Invertebrates > Arthropoda > Insects > Drosophila > Drosophila Melanogaster
                Biology and Life Sciences > Biochemistry > Proteins > DNA-binding proteins
                Biology and Life Sciences > Biochemistry > Enzymology > Enzyme Chemistry > Cofactors (Biochemistry)
                Biology and Life Sciences > Cell Biology > Chromosome Biology > Chromatin
                Biology and Life Sciences > Genetics > Epigenetics > Chromatin
                Biology and Life Sciences > Genetics > Gene Expression > Chromatin
                Research and Analysis Methods > Database and Informatics Methods > Biological Databases > Genomic Databases
                Biology and Life Sciences > Computational Biology > Genome Analysis > Genomic Databases
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Genomic Databases
                Biology and Life Sciences > Genetics > Genomics > Functional Genomics
                Physical Sciences > Materials Science > Materials by Attribute > Insulators
                Custom metadata
                All relevant data are within the paper and its Supporting Information files.

                Quantitative & Systems biology
