The nucleosome repeat length (NRL) is an integral property of chromatin that is important for its biological functions. Recent experiments have revealed several conflicting trends in the dependence of the NRL on the concentrations of histones and other architectural chromatin proteins, both in vitro and in vivo, but a systematic theoretical description of the NRL as a function of DNA sequence and epigenetic determinants is currently lacking. To address this problem, we have performed an integrative biophysical and bioinformatics analysis in species ranging from yeast to frog to mouse, in which the NRL was studied as a function of various parameters. We show that in simple eukaryotes such as yeast, a lower limit for the NRL exists, determined by internucleosome interactions and remodeler action. In higher eukaryotes, an upper limit also exists, since the NRL is an increasing but saturating function of the linker histone concentration. Counterintuitively, smaller H1 variants or non-histone architectural proteins can induce larger effects on the NRL for entropic reasons. Furthermore, we demonstrate that different regimes of the NRL dependence on histone concentrations exist, depending on whether DNA sequence-specific effects dominate over boundary effects or vice versa. We consider several classes of genomic regions with apparently different regimes of NRL variation. At one extreme, our analysis reveals that the period of oscillations of the nucleosome density around bound RNA polymerase coincides with the period of oscillations of positioning sites of the corresponding DNA sequence. At the other extreme, we show that although mouse major satellite repeats intrinsically encode well-defined nucleosome preferences, they have no unique nucleosome arrangement and can undergo a switch between two distinct types of nucleosome positioning.
The DNA molecule of a human or mouse can be up to two meters long if stretched out. However, it is stored inside the small volume of the nucleus in the living cell. DNA compaction is achieved at different hierarchical levels with the help of a number of architectural proteins. The elementary unit of compaction is the nucleosome, in which DNA is wrapped around a protein octamer core. Each nucleosome contains about 147 DNA base pairs; the length of DNA between neighboring nucleosomes varies from nearly zero to several hundred base pairs. This variability determines the biological function of the underlying DNA, since some parts of the genome are less compact, and thus potentially actively transcribed, while others are more compact, and their transcription is limited. The DNA distances between neighboring nucleosomes depend on interactions with many chromatin-associated proteins. The average distance between two neighboring nucleosomes changes between genomic locations, and even for the same genomic region between different cell states of the same organism. Here we study these effects and provide a quantitative biophysical description of them using available experimental data from a number of organisms, ranging from yeast to frog to mouse.
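To make the quantities above concrete, the following minimal sketch computes a repeat length as the mean centre-to-centre distance between neighbouring nucleosomes, and the corresponding mean linker length by subtracting the roughly 147 bp wrapped in each core. It is an illustration only, not the analysis used in this work: the function names and the example positions are hypothetical, and real data would require handling of overlapping or fuzzy positions.

```python
from statistics import mean

# Approximate DNA length wrapped around one histone octamer (value from the text).
CORE_BP = 147


def repeat_length(centres):
    """Mean distance (bp) between centres of neighbouring nucleosomes.

    `centres` is a hypothetical list of nucleosome centre coordinates
    along one DNA region; it is an assumption of this sketch.
    """
    ordered = sorted(centres)
    if len(ordered) < 2:
        raise ValueError("need at least two nucleosome positions")
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return mean(gaps)


def mean_linker(centres):
    """Mean linker DNA length (bp): repeat length minus the core length."""
    return repeat_length(centres) - CORE_BP


# Made-up positions for illustration; not experimental data.
example_centres = [100, 265, 432, 600, 765]
print(repeat_length(example_centres))  # 166.25
print(mean_linker(example_centres))    # 19.25
```

In this toy case the neighbouring centres are 165, 167, 168 and 165 bp apart, so the repeat length is 166.25 bp and the mean linker is 19.25 bp. Repeating the calculation for different genomic regions or cell states gives the region-dependent averages discussed above.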