Histones: At the Crossroads of Peptide and Protein Chemistry

Abstract

1 Introduction

In eukaryotic cells, inheritable information is stored in a nucleoprotein complex referred to as chromatin. 1 This genome architecture serves two key purposes. On the one hand, wrapping DNA (approximately 145–147 base pairs) twice around a spool composed of two copies each of the highly basic core histones H2A, H2B, H3, and H4 leads to compaction of DNA strands (Figure 1a,b). These assemblies are called nucleosomes. Contacts between individual nucleosomes are often mediated by cationic tails at the N- and C-termini of all histone proteins that protrude from the core and further tighten the chromatin fiber (Figure 1c). Additional packing is achieved through attachment of histone H1 to the DNA that links neighboring nucleosomes or by nonhistone proteins that are able to bridge units within or between chromatin fibers. 2 The second pivotal function of storing genetic information as a DNA–protein complex is the additional layer of regulation that this feature provides. 3−5 For instance, the very presence of histones on DNA sequences can occlude access to these sites by transcription factors and other DNA binding proteins. 6 Thus, nucleosome positioning, shaped in part by DNA sequence preferences and shifted by ATP-powered molecular motors (referred to as chromatin remodelers), directly affects chromatin transactions. 7 Beyond their location, the biochemical makeup of nucleosomes provides further opportunity for regulation. Canonical histones can be replaced with closely resembling variants, and all histones are dynamically decorated with post-translational modifications (PTMs). These biochemical marks can be as small as just a few atoms, such as methyl (Lys, Arg, Gln), acetyl (Lys), or phosphoryl groups (Ser, Thr), or as large as an entire protein in the case of ubiquitin or SUMO.
Upon attachment by dedicated transferase enzymes, PTMs can directly alter the biophysical properties of the target protein, provide a docking site for specific interaction partners, interfere with binding events of other factors, or act through a combination of these mechanisms. In this way, signaling through histone PTMs serves to orchestrate chromatin-templated processes, including fine-tuning transcriptional outputs. Remarkably, transcriptional states can be inherited through cell division cycles, thus providing a mode of epigenetic memory. 8,9 Not surprisingly, misregulation of the inputs and outputs of chromatin signaling occurs in many diseases, especially cancer. 10−13 Figure 1 Chromatin architecture in eukaryotic cells. (a) Structure of a mononucleosome. DNA (gray) is wrapped around two copies each of H2A (orange), H2B (red), H3 (blue), and H4 (green); pdb code: 1kx5. (b) Electrostatic surface rendering of a histone octamer. Highly cationic patches (blue) guide the trajectory of DNA wrapping. (c) Schematic representation of genome architecture. Lysine acetylation, serine/threonine phosphorylation, and lysine ubiquitylation have a strong propensity to directly influence the structure of chromatin. Both acetylation and phosphorylation reduce the net positive charge of histones, and thereby weaken electrostatic interactions with the negatively charged DNA. In particular, acetylation at multiple lysine residues is associated with decompaction of chromatin, providing space for the transcription machinery to engage with acetylated chromatin domains. These transcriptionally active, open regions are referred to as euchromatin. Attaching an entire protein such as ubiquitin (8.5 kDa) to histones (10–15 kDa) can also preclude tight packing of nucleosomes. Consequently, histone ubiquitylation is associated with active transcription (specifically, ubiquitylation of histone H2B at lysine 120; abbreviated as H2B-K120ub) and DNA damage repair (H2A-K119ub). 
14 In contrast, lysine and arginine methylation events only slightly change the biophysical properties of nucleosomes. These modifications are often targeted by protein factors present in the nucleus that discern the methylation states and the surrounding sequences, and thereby act as signaling hubs. A paradigm for this mechanism is the binding of heterochromatin protein 1 (HP1) to histone H3 carrying a trimethylation mark at lysine 9 (H3K9me3). 15 Through oligomerization, HP1 can noncovalently link multiple nucleosomes to create a compact architecture that impedes transcription, and simultaneously provides a docking platform for a cohort of associated proteins. 16 Such inactive chromatin domains are commonly referred to as heterochromatin. Several factors drive the interactions between nuclear proteins and histones. Remarkably, all histones display a strong compositional bias in their amino acid content. They are highly enriched in basic residues, whereas acidic and aromatic residues as well as cysteines are strongly underrepresented (Figure 2a). In addition, the protruding tail regions of all four of the core histones, as well as the linker histone H1, contain strikingly few hydrophobic amino acids. Notably, high charge density and low hydrophobicity are defining features of intrinsically disordered proteins. 17 Thus, histone binding frequently relies on electrostatic contributions and hydrogen bonding rather than complementary hydrophobic surfaces that typically drive protein–protein interactions. Perhaps as a consequence of the low building block diversity found in histone tails, recurring sequence motifs can be discerned (Figure 2b). Most prominently, the ARKS tetrapeptide occurs twice in the H3 tail, encompassing Lys9 and Lys27, both associated with heterochromatin-specific methylation. A permutated variation, ARTK, is located at the very N-terminus of H3. Similarly, the H4 tail contains three instances of a GKG tripeptide. 
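The recurring motifs described above can be located with a simple sequence scan. The following is a minimal sketch, not part of the original Review: the two tail sequences are an assumption (the N-terminal residues of canonical human H3 and H4, numbered without the initiator methionine, and best checked against a sequence database before use), and the helper names `motif_positions` and `lysine_sites` are hypothetical.

```python
import re

# Assumed sequences (not given in the text): N-terminal tails of canonical
# human histone H3 (residues 1-40) and H4 (residues 1-30), initiator Met removed.
H3_TAIL = "ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHR"
H4_TAIL = "SGRGKGGKGLGKGGAKRHRKVLRDNIQGIT"


def motif_positions(sequence, motif):
    """Return 1-based start positions of every motif match, overlaps included."""
    # A zero-width lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() + 1 for match in pattern.finditer(sequence)]


def lysine_sites(sequence, motif):
    """Return 1-based positions of the lysines inside each motif occurrence."""
    offsets = [i for i, residue in enumerate(motif) if residue == "K"]
    return [start + off
            for start in motif_positions(sequence, motif)
            for off in offsets]


if __name__ == "__main__":
    for name, tail, motif in [("H3", H3_TAIL, "ARKS"),
                              ("H3", H3_TAIL, "ARTK"),
                              ("H4", H4_TAIL, "GKG")]:
        print(name, motif, "starts:", motif_positions(tail, motif),
              "lysines:", lysine_sites(tail, motif))
```

Under these assumed sequences, the scan places the two ARKS motifs around Lys9 and Lys27 of H3, ARTK at the very N-terminus, and the three GKG motifs around Lys5, Lys8, and Lys12 of H4, matching the sites named in the text.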
Many histone binders engage these short linear motifs, 18 often in a PTM-dependent fashion, allowing for tightly controlled interactions between chromatin and designated binding proteins. 4,19 Many endogenous proteins contain histone-like sequences, suggesting that such motifs play an important role in cellular physiology. 20,22 Interestingly, an influenza protein mimics the ARTK sequence of the H3 tail (ARSK) to hijack the host cell’s transcription machinery. 21 Figure 2 Histone sequence features. (a) Histones contain a skewed amino acid composition. Amino acid frequencies are normalized to the average occurrence found in all proteins contained in the UniProt database (www.uniprot.org). Cationic residues, Arg, Lys; anionic, Asp, Glu; polar, Asn, Gln, Ser, Thr; aromatic, His, Phe, Trp, Tyr; aliphatic, Ile, Leu, Met, Val; Ala and Cys are plotted individually; the secondary structure breaking residues Gly and Pro are binned together. (b) Recurring sequence motifs in histone tails surrounding modified lysine residues. Given that many chromatin-related processes involve interactions with the unstructured histone tails that protrude from nucleosomes and are subject to a plethora of PTMs, peptide chemistry has aided tremendously in assigning functional roles to these modifications. The small size of these tails makes them ideal targets for peptide synthesis, and their lack of a defined 3D-structure obviates the need for refolding synthetic material. In particular, peptide models have contributed to the characterization of enzymes that attach and remove histone PTMs, and proteins that interact with specific marks. These proteins are often anthropomorphically called histone mark writers, erasers, and readers, respectively. With increasing sophistication of proposed mechanisms for the regulation of chromatin structure and function by signaling cascades, there is a growing need for chemically defined model systems with which to directly address these emerging hypotheses.
Contemporary protein chemistry and chromatin assembly strategies can fulfill this requirement to a large degree. In this Review, we first summarize the contributions of peptide chemistry over the last 40+ years in an eclectic journey that aims to provide a glimpse into the variety of histone PTMs that modulate chromatin. We focus on the synthesis of modified histone peptides and their contribution to deciphering the supramolecular chemistry that controls the function of histone PTMs. We then discuss modern approaches to generate chemically defined chromatin templates, involving innovative uses of protein chemistry and synthetic biology, and how “designer” chromatin has furthered our understanding of key molecular recognition events that govern nuclear biochemistry.

2 Historic Perspective

Since the early days of chromatin biology, protein chemistry has played a pivotal role in exploring the mechanisms of chromatin transactions. Following the discoveries that histones inhibit RNA synthesis in nuclear extracts, 23−25 and that histones are also heavily acetylated, 26 Allfrey and co-workers surmised that these modifications are installed post-translationally and serve to regulate transcription. 27 Indeed, limited chemical acetylation of isolated histones using acetic anhydride diminished their ability to inhibit transcription. In the following years, it became evident that histone acetylation occurs largely on lysine side-chains 28 and is biochemically reversible. 29 As determined by then emerging protein sequencing technologies, the main sites of acetylation on histone H4 correspond to Lys16, and to a lesser degree Lys5, 8, and 12. 30,31 These early biochemical investigations coincided with the development of solid-phase peptide synthesis (SPPS) by Merrifield (Scheme 1), 32 a process that led to a tremendous surge in efficiency of oligopeptide preparation, as exemplified by the total synthesis of Ribonuclease A, a 124-residue protein.
33 Because Allfrey and Merrifield were colleagues at the Rockefeller University, the stage was set for the first targeted studies on the biochemistry of specific histone post-translational modifications employing synthetic peptides. Scheme 1 Solid-Phase Peptide Synthesis (SPPS) Using the N-α-Boc-Protection Strategy

3 Histone Peptide Chemistry

Chromatin biochemistry is fertile ground for peptide chemists. Chromatin-associated proteins perform many molecular transactions with the flexible histone tails that protrude from the compact nucleosome core. Thus, assays based on synthetic peptides can recapitulate certain key aspects of the interplay between chromatin and the nuclear proteome. In this section, we will highlight the contributions of peptide chemistry to solving some of the mysteries that chromatin biology harbors. We will discuss histone PTMs, as well as the utility of cross-linking and combinatorial methods to investigate their functions, with an emphasis on synthesis and molecular recognition.

3.1 Lysine Acetylation

3.1.1 Pioneering Studies with Acetylated Histone Peptides

Early studies on the biochemistry of specific histone PTMs focused on delineating the substrate scope of deacetylases (HDACs). 34,35 To narrow the substrate specificity of calf thymus histone deacetylase, Merrifield, Allfrey, and co-workers prepared histone peptides by limited proteolysis of H4, purified from calf thymus nuclei that were previously incubated with radioactive acetate. 35 Digestion with CNBr and chymotrypsin yielded two fragments, H4(1–84) and H4(1–37), respectively (Figure 3a). In addition, a small peptide spanning residues 15–21 containing radiolabeled Ac-Lys at position 16 was prepared by SPPS using standard N-α-tert-butyloxycarbonyl (Boc) protected building blocks, as well as N-α-Boc-N-ε-[14C]acetyllysine (Figure 3b). Cleavage from the resin was achieved with hydrofluoric acid (HF), and the resulting peptide was purified by ion exchange chromatography.
HDAC activity, monitored by release of radiolabeled acetate, was detected only when using long peptide constructs; the synthetic peptide was not a substrate. Because H4K16 is a prominent histone acetylation site, these results suggested that long N-terminal peptides were required for substrate recognition. As SPPS became more routine, and the first automated peptide synthesizers were built, 36−38 peptides of such length became accessible. Thus, a doubly modified peptide encompassing H4(1–37) was prepared with a 3H-labeled and a 14C-labeled acetyl group at positions 16 and 12, respectively (Figure 3c). This setup allows for a straightforward distinction between the acetyl groups at each position. HDAC-catalyzed release of 3H and 14C was equal, demonstrating that this enzyme is able to remove both marks efficiently. Figure 3 Synthesis of acetylated H4 peptides to define the substrate specificity of an HDAC. (a) Limited proteolysis of acetylated H4 yields long peptidic HDAC substrates. HATs = histone acetyltransferases. (b) Solid-phase peptide synthesis (SPPS) using a radiolabeled acetyllysine building block (inset) yields a heptapeptide that is not an HDAC substrate. (c) A long synthetic peptide bearing two distinctly radiolabeled acetyl groups illustrates promiscuity in HDAC activity.

3.1.2 Molecular Recognition of Acetyllysine in Histones

Electrostatic interactions contribute strongly to nucleosome formation. Lysine and arginine residues, present at the lateral surface of histone octamers and on the flexible tails, direct DNA wrapping and mediate internucleosomal contacts to establish higher order chromatin structure, respectively. 39,40 Accordingly, lysine acetylation is expected to affect DNA binding because this modification decreases the basicity of histones. Consistent with this hypothesis, several studies employing either full-length histone H4, or peptide fragments thereof, indicate that acetylation weakens histone–DNA interactions.
27,41−43 Besides a direct biophysical effect on chromatin structure, histone acetylation also serves a biochemical function by recruiting specialized reader domains. Characterization of these interactions was made possible by the ease of access to site-specifically acetylated histone peptides granted by SPPS. Currently, the best-studied acetyllysine reader module is the Bromodomain (BD), a small protein domain encompassing approximately 110 amino acids that is often found in transcriptional coactivators. 44−46 A structural investigation by NMR of the BD from one such coactivator, the acetyltransferase P/CAF, revealed a 4-helix-bundle fold with a prominent hydrophobic pocket. 47 The small molecule acetyllysine analogue, N-acetyl-histamine, was able to bind to this site, with the acetamide moiety facing toward the protein interior. 47 Local chemical shift perturbations upon titration of the BD with a synthetic H4 peptide containing a single acetyl mark, K8ac, revealed an interaction with a high micromolar Kd. A cocrystal structure of the BD from another acetyltransferase, GCN5, with an H4 peptide acetylated at Lys16 provides further insight into the binding interface (Figure 4a). 48 The nature of the interaction is predominantly hydrophobic with a tight fit of the lysine side-chain methylene groups into an apolar cleft and a somewhat looser fit of the terminal methyl group within the pocket. Specific hydrogen bonds with a conserved Asn residue in the BD, as well as several ordered water molecules in the partially solvent-accessible binding crevice, orient the acetamide modification. Additional interactions with residues surrounding the acetylated lysine originate from shape complementarity to the BD surface, a limited number of backbone–backbone hydrogen bonds, and, in the case of the GCN5 BD, an ion pair at the i+3 position (to Arg19 of H4). 48,49 Consequently, associations of BDs with acetylated peptides are often weak and rather unspecific.
50 Moreover, thermodynamic analyses performed on the binding of a typical BD to H3K9ac confirm that such interactions are primarily driven by the hydrophobic effect. 51 Figure 4 Recognition of acetyllysine residues by bromodomains. (a) Binding pocket of the GCN5 BD in complex with an H4 peptide acetylated at Lys16 (green). A hydrogen bond between Asn407 of the BD (black) and the acetyl group is indicated with a dotted line; pdb code: 1E6I. (b) Architecture of the double-BD module of TAFII250. The acetyllysine binding pocket of each lobe is indicated in red; pdb code: 1EQF. (c) Simultaneous binding of two acetyllysine residues by BD1 of Brdt. A synthetic H4 peptide bearing K5ac and K8ac is depicted in green with hydrogen-bonding networks indicated by dotted lines; pdb code: 2WP2. Surfaces in subfigures (a) and (c) are shown in electrostatic rendering (blue, positive; white, neutral; red, negative). Ordered water molecules are shown as red spheres. The frequent occurrence of multiple acetyl marks on a single histone tail 52 raises the question as to how two or more acetyllysine residues are recognized. Crystal structures and binding studies of the double-BD containing proteins TAFII250 (a general transcription factor) 53 and Brdt (a testis-specific genome organizing factor) 54 shed light on this question. In TAFII250, the two BDs are oriented by protein–protein interactions to enable simultaneous binding of i, i+7, or i+8 acetyl marks with low micromolar affinity (Figure 4b). 53 In contrast, the first BD of Brdt preferentially binds multiple acetyl marks in a single pocket. 54 This feature is accomplished due to the open BD cleft, where one residue (K5ac) is bound in an orientation typical for BD–acetyllysine interactions, including a hydrogen bond to Asn108 (Figure 4c). The side-chain of K8ac reaches into the open pocket, forming a hydrogen-bond network from its amide oxygen through an ordered water molecule to the amide nitrogen of K5ac. 
The orientation of K8ac is reinforced by hydrophobic interactions with the methylene groups as well as the terminal methyl group of the acetamide moiety.

3.1.3 Generation of Histone-PTM-Specific Antibodies

Modern chromatin biology relies heavily on antibodies that recognize distinct histone PTMs. Modification-specific antibodies serve to detect the presence of their cognate mark within a biological sample, for instance, in a Western blot format. Moreover, they represent indispensable affinity reagents for chromatin immunoprecipitation (ChIP), 55 enabling the isolation of mono- or oligonucleosomes bearing a designated histone PTM (Figure 5a). Subsequent analysis of isolated chromatin segments by proteomic (e.g., mass spectrometry) or genomic (e.g., DNA sequencing) methods provides detailed information about the biochemistry of the targeted histone PTM, and its genomic distribution. Access to site-specifically modified histone peptides through SPPS represents the basis for generating these invaluable tools. Initial efforts to elicit antibodies that recognize acetylated H4 focused on purified acetylated forms of the protein, 56 chemically acetylated full-length protein (ref (57)), or H4 N-terminal peptides. 58 The resulting antisera were capable of distinguishing acetylated from nonacetylated H4, but lacked the ability to distinguish individual acetylation sites. To address this limitation, Turner et al. synthesized a series of acetylated H4 peptides and used these as epitopes for antibody generation (Figure 5b). 59 These same peptides were then used to probe antibody specificity, enabling estimates of acetylation site usage during cell division 59 and in human cells. 60 Figure 5 PTM-selective antibodies as tools in chromatin biochemistry. (a) General outline of the chromatin immunoprecipitation (ChIP) workflow. (b) Production of site-specific acetyllysine antibodies using synthetic peptides.
Specifically acetylated peptides are used to immunize rabbits to elicit a collection of antibodies that recognize defined acetylation marks. In this example, antibody selectivity was probed using the synthetic peptide substrates (right). Plus symbols denote a strong recognition, (+) stands for weak binding, whereas minus signs indicate no cross-reactivity. Data taken from ref (60). This seminal series of studies served as a template for many future endeavors. Indeed, a cohort of poly- and monoclonal antibodies that recognize site-specific acetylation marks with improved selectivity have been raised, and many are commercially available. 61 Similarly, antibodies against essentially all known histone PTMs, elicited using synthetic peptides featuring the modification in question, have been added to the toolkit of chromatin biochemists. This list is continuously growing, and newly discovered histone PTMs are immediately incorporated into peptide epitopes for antibody generation (see also examples in section 3.8).

3.1.4 Mechanism of Histone Deacetylases

Chemical synthesis permits the installation of non-natural analogues of acetyllysine. To scrutinize the mechanism of substrate recognition and turnover by class III histone deacetylases (these enzymes consume NAD+ during deacetylation, yielding O-acetyl-ADP-ribose and nicotinamide as byproducts), Smith and Denu prepared versions of the H3 tail containing acetyllysine mimics at position 14 (Figure 6). 62,63 Hydrophobicity was found to correlate with binding strength, 63 and nucleophilicity of the amide oxygen with catalysis. 62 These results suggest a concerted SN2-like mechanism for NAD+ cleavage, and highlight that some HDACs tolerate bulkier substrates such as propionyllysine (see also section 3.8). Figure 6 Mechanism of class III HDACs (a) probed with histone peptides carrying analogues of acetyllysine (b). n.d. stands for not determined. Data taken from refs (62) and (63).
3.2 Lysine Methylation

The protein sequencing efforts performed in the 1960s revealed not only that some histone lysine residues are acetylated, but also the presence of methyllysine isoforms. 31,64,65 However, biochemical investigations of histone lysine methylation lagged behind the more conveniently assayed histone acetylation. 66 A further complication is that lysine side-chains are mono-, di-, and trimethylated, and each methylation state may confer a distinct biological impact. 67−69 Nevertheless, a tremendous body of research has been amassed on the biochemistry of histone lysine methylation, sparked by the discoveries of S-adenosylmethionine (SAM)-dependent, lysine-specific histone methyltransferases, 70 protein domains that specifically interact with lysines in different methylation states, 15,71 and the importance of lysine methylation in the regulation of gene expression 72,73 and the DNA damage response. 74−76 While initially thought of as irreversible marks, lysine methyl groups can be removed through the action of site-specific histone lysine demethylases. 77,78 As for acetyllysine, synthetic peptides bearing homogeneously modified methyllysine residues were instrumental in this endeavor.

3.2.1 Synthesis of Methyllysine-Containing Peptides

Since the turn of the millennium, when histone lysine methylation became a prolific area of study, routine SPPS of methyllysine-containing peptides has been performed using the N-α-Fmoc protecting group scheme. 79 In this strategy, base-labile main chain protection is combined with side-chain protecting groups and resin linkages sensitive to TFA treatment (Scheme 2), thereby bypassing the hazardous HF cleavage step commonly employed in Boc-SPPS. In addition, SPPS benefited from improved coupling chemistries based on novel uronium 80−82 and phosphonium 83 reagents, as well as auxiliary nucleophiles such as oximes 84 (Figure 7a).
Building blocks for the incorporation of all lysine methylation states are readily available synthetically, or can be obtained commercially (Figure 7b). N-α-Fmoc-protected di- and trimethyllysine are prepared by reductive alkylation with formaldehyde and electrophilic alkylation with iodomethane, respectively. 85 The monomethylated isoform is typically employed in the N-ε-Boc protected form, accessible through reductive alkylation of an N-ε-benzyl-protected intermediate. 86 Peptides synthesized with these building blocks are at the routine disposal of chromatin biochemists, and have been harnessed to obtain a palette of PTM-specific antibodies 87 and have found use in countless biochemical and biophysical studies. Scheme 2 Solid-Phase Peptide Synthesis (SPPS) Using the N-α-Fmoc-Protection Strategy Figure 7 Synthesis of methyllysine-containing peptides. (a) Commonly used activating agents and additives. (b) Standard methyllysine building blocks used for Fmoc-based SPPS.

3.2.2 Molecular Recognition of Methyllysine-Containing Histone Peptides

As lysine methylation does not change the side-chain charge, this class of modification exerts its biochemical effects predominantly by serving as a docking platform for protein–protein interactions. 88 Trimethylation at H4K20 represents a prominent exception to this rule, and will be treated in section 4.1.4. 89 Although many modules capable of interpreting lysine methyl marks do exist, 69,88 we will focus here on chromodomains (CDs) to discuss the energetics of methyllysine binding and how specificity between methylation states is achieved. For a more comprehensive survey of the range of protein modules that specifically interact with histone PTMs, including methyllysine, the reader is directed to the recent review by Patel and colleagues. 90 CDs are small protein modules (approximately 50 residues in size) initially identified in heterochromatin protein 1 (HP1) and polycomb protein (Pc), key organizers of heterochromatin.
91−94 The CD of HP1 specifically recognizes H3K9me2/3, 15 and its structure was solved in complex with a series of short peptides containing either H3K9me3, H3K9me2, or H3K9me2 in combination with H3K4me2. 95,96 The methylated ammonium side-chain is enveloped in an aromatic cage, formed by an induced fit mechanism upon peptide binding (Figure 8a). 95,97 The structures of the CD bound to di- and trimethyllysine are highly similar, as are their binding affinities, both in the low micromolar range. 95 Imperfect size selection in dimethyllysine binding is compensated for by a water-mediated hydrogen bond between the lysine ε-amine and a glutamate side-chain (Figure 8a). The HP1 chromodomain discriminates strongly against the lower methylation states of Lys9: its affinity for monomethyllysine and unmodified lysine is reduced by 1.3 and >2.7 kcal/mol, respectively. 98 An analogue of the dimethyllysine side-chain, 3-dimethylamino-1-propanol, does not bind appreciably. 96 Instead, additional residues on the substrate peptide, bound as an extended strand, contribute to histone recognition based on size and charge, and concomitantly confer specificity for designated methyllysine sites. 95,96 In agreement with this mechanism, the K4/K9 doubly modified peptide binds HP1 exclusively through K9me2. 96 Specificity for lower methylation states is exemplified by the interaction of the CD of MSL3 (a transcriptional regulator) with mono- and dimethyllysine. 99 As compared to the CD from HP1, the MSL3 CD contains an additional Trp residue that serves as a tight lid for the aromatic cage to favor binding of secondary and tertiary ammonium ions over the quaternary trimethyllysine (Figure 8b). 99,100 Additional strategies to favor lower methylation states, discussed in detail in ref (90), include steric restriction as well as ionic hydrogen bonds to the ε N–H group. 101,102 Figure 8 Recognition of methyllysine. 
(a) Structures of the HP1 chromodomain in complex with methyllysine residues (pdb codes for K9me3, 1kne; K9me2, 1kna; K9me1, 1q3l) or in its apo form (right, pdb code: 1ap0). The CD is depicted in green, the ligand in yellow. Note that the apo-structure was solved with murine HP1 while the liganded structures were obtained from Drosophila HP1, which contains a Tyr residue in place of Phe45. (b) Selective recognition of lower methylation states by the chromodomain of MSL3 (pdb code: 3m9p). The CD is depicted in cyan, the ligand in pink. For comparison, the corresponding residues in the HP1 CD are indicated in pale rendering. (c) Structure of tert-butylnorleucine (1), a trimethyllysine isostere. (d) Structure of a calix[4]arene receptor (2) for methyllysine-containing peptides. The driving force for methyllysine binding is the cation−π interaction, a common motif for recognition of cations in biology. 103−105 As is typical for this type of interaction, 104 complex formation between HP1 and K9me3-modified peptides is mediated by a strong favorable enthalpy, with a slightly unfavorable entropic contribution. 106 To gain more insight into the forces governing CD binding, Waters and co-workers prepared an H3 peptide containing tert-butylnorleucine (1) at position 9 (Figure 8c). 98 This residue is isosteric to trimethyllysine but lacks the charge, and therefore precludes electrostatic interactions with the aromatic cage of HP1. CD binding of the H3 peptide was reduced by approximately 2 kcal/mol upon replacing K9me3 with its neutral isostere, 98 in agreement with typical values for cation−π interactions (on the order of 0.4–2.4 kcal/mol). 104 Synthetic receptors have been generated that mimic the biological mode of binding methyllysine residues. 107,108 For example, sulfonated calix[4]arene-based hosts (2, Figure 8d) can engulf methyllysine residues by harnessing cation−π and electrostatic interactions.
108,109 By matching the dimensions of the aromatic cage to the size of methylated lysine, specificity for methylation states can be achieved. Such supramolecular receptors are able to compete for the binding of H3K9me3 with its natural readers, and, as a consequence, perturb chromatin structure in cells. 109

3.2.3 Identification of New Methyllysine Binders

To identify proteins that specifically bind a given histone PTM, Wysocka et al. performed pull-down experiments with synthetic peptides and cell lysates. 110,111 To this end, H3 peptides, carrying the K4me3 mark and a biotin tag, were immobilized on avidin beads (Figure 9a). Incubation with nuclear extracts, followed by SDS-PAGE analysis of bound proteins, yielded a band at molecular weight >300 kDa. 111 Mass spectrometry identified this H3K4me3 binder as BPTF, the largest subunit of the chromatin remodeling complex NURF. 112 BPTF contains two zinc finger motifs termed plant homeodomains (PHDs) and a bromodomain. Repeating the peptide pull-down assays with purified truncated BPTF constructs demonstrated that the second PHD was necessary and sufficient for H3K4me3 binding. Subsequent structural characterization indicated that the H3K4me3 mark is bound in an aromatic cage, and sequence specificity is granted by additional cation−π interactions and an ionic H-bond to Arg2 in the peptide (Figure 9b). 113 Notably, this peptide pull-down workflow has been applied to numerous histone PTMs and has provided a vast body of knowledge on stable interactions between nuclear proteins and specific histone marks. 114 Figure 9 Identification of new histone PTM binders. (a) Schematic of the workflow for peptide pull-downs of nuclear proteins. Modified peptides are immobilized on avidin beads and used to fish out specific binders such as the H3K4me3 binder BPTF. (b) Structure of the BPTF PHD finger (mauve) in complex with H3K4me3 (yellow, pdb code: 2f6j).
An ion pair between Arg2 of histone H3 and an Asp residue of the PHD finger contributes to selectivity. (c) SILAC-based identification of methyllysine binders. Modified and control histone peptides are immobilized and incubated with isotopically labeled nuclear extracts. A hypothetical mass spectrum illustrating different selectivities of detected proteins is depicted on the right. The use of stable isotope labeling by amino acids in cell culture (SILAC) 115 greatly increases the sensitivity and throughput of this pull-down approach. 116 Vermeulen et al. generated a map of the human histone-methyllysine interactome using histone peptides containing one of the key trimethyl marks: H3K4me3, H3K9me3, H3K27me3, H3K36me3, or H4K20me3. Methylated peptides were used to pull down nuclear factors from HeLa cells grown in normal media. In parallel, unmodified peptide controls were used to enrich binding proteins from cells grown in the presence of 13C- and 15N-labeled Arg and Lys (“heavy” medium). The “light” proteins isolated with a specific methyllysine peptide are combined with “heavy” proteins from unmodified peptide pull-downs, and the mixture analyzed by mass spectrometry (Figure 9c). For each protein identified, the ratio of “light” versus “heavy” signal (L/H) obtained by MS reveals its binding preference: L/H > 1 indicates a Kme3-dependent interaction, while analytes with L/H < 1 favor unmethylated lysine residues. Proteins with L/H ≈ 1 are nonspecific interactors, and are typically ignored in further analyses. This approach yielded between 10 and 60 specific binding protein candidates for each mark, thus significantly expanding the catalog of potential trimethyllysine reader proteins. Reactivity-based probes have been developed to enable specific isolation of histone demethylases from nuclear lysates. To achieve this, Cole and co-workers installed a propargyllysine residue in place of Lys4 of an H3 peptide. 
117 Peptides armed with this warhead were recognized by the H3K4-specific demethylase LSD1, triggering their oxidation with FAD (Figure 10). This reaction yields a potent electrophile that covalently links the reduced flavin cofactor to the probe. Thus, propargyllysine peptides represent potent mechanism-based inhibitors of FAD-dependent lysine demethylases. Immobilized versions of these probes successfully pulled down LSD1 and its binding partner, the corepressor CoREST, from nuclear lysates. A panel of related probes has since been devised by the same group. 118 In conjunction with SAM cofactor analogues developed by the Luo group, 119 these reagents facilitate chemical proteomic approaches to delineate histone lysine methylation and demethylation pathways. Figure 10 Mechanism-based histone demethylase inhibitors. Propargyllysine is oxidized by LSD1 via its FAD cofactor. The resulting Michael acceptor forms a covalent adduct with the reduced cofactor. 3.3 Arginine Methylation Histone arginine methylation occurs in three flavors: monomethylarginine (Rme) as well as the asymmetric (Rme2a) and symmetric (Rme2s) isoforms of dimethylarginine (Figure 11a). Methylarginine marks are installed by a panel of protein arginine methyltransferases (PRMTs), which are specific in regard to the Rme2 isomer they produce, but rather promiscuous in terms of site. 120,121 Chemically, the synthesis of methylarginine-containing peptides using Fmoc-SPPS is straightforward. Methylarginine isoforms that contain a free N-ω atom (Rme and Rme2a) are commonly sold in Pbf-protected forms, while Rme2s is available with di-Boc protection (Figure 11b). Given the importance of arginine residues in mediating both histone–DNA and histone–protein interactions, it is not surprising that its methylation has a range of critical functions in chromatin biology, including transcription regulation. 
122−124 However, the majority of PRMT substrates are nonhistone proteins, often involved in RNA biochemistry, which complicates the assignment of cellular roles for histone arginine methylation. 124 3.3.1 Arginine Methylation and Protein–Histone Interactions Arginine methylation often exerts its biological effect by interfering with the biochemistry of other histone PTMs, in particular with methyllysine. 125 Many key sites of lysine methylation contain an arginine residue at the −1 (H3K9, H3K27, H4K20, all typically considered repressive marks) or the −2 position (H3K4, an activating mark). Inspired by the negative correlation between the presence of H3K4me3 and H3R2me2a, 126 Guccione et al. tested if peptides containing preinstalled methyl marks at K4 or R2 were substrates of PRMT6 and ASH2, the corresponding arginine and lysine methyltransferases, respectively. 127 In agreement with their hypothesis, H3K4me3-peptides were poor substrates for PRMT6, and, reciprocally, peptides containing H3R2me2a were not methylated by ASH2. Furthermore, the presence of H3R2me2a impeded the interaction of K4me3 with many of its known readers. 127,128 In contrast, some effectors of K4me2/3, such as the recombinase RAG2, benefit slightly from an additional R2me2s mark. 129 Figure 11 Methylarginine structure and recognition. (a) Isoforms of methylarginine residues. (b) Standard methylarginine building blocks for Fmoc-based SPPS. (c) Structure of the aromatic cage of the TDRD3 tudor domain (pdb code: 2lto). The Rme2a residue is colored in yellow, the specificity-determining tyrosine in pale green. (d) Structure of a synthetic Rme2a receptor isolated from a dynamic combinatorial library. The search for histone methylarginine readers gained a boost with the discovery that certain tudor domain proteins, some of which were previously known to be methyllysine binders, can specifically recognize Rme2s residues. 
130 In 2009, the DNA methyltransferase DNMT3A was shown to bind the H4 tail in an R3me2s specific manner. 131 To find additional site-specific readers of histone me2a marks, Bedford and co-workers employed a microarray featuring more than 100 chromatin associated domains 132 including bromo, chromo, and tudor domains, among others. 133 To generate the array, individual domains were produced and purified as fusions with the enzyme glutathione S-transferase (GST) and spotted on a glass slide precoated with nitrocellulose polymer and immobilized by drying. 134 When the array was probed with H3 peptides containing R17me2a and a Cy3 label, a single protein domain, the tudor domain of TDRD3, displayed a fluorescent spot. 133 This interaction was confirmed using peptide pull-down experiments, and promiscuous binding between TDRD3 and several histone-derived Rme2a marks was observed. TDRD3 functions as a transcriptional coactivator; thus, another link between histone arginine methylation and transcription regulation was found. 133 The structure of the TDRD3 tudor domain has been solved by crystallography in its apo form 135 and by NMR in complex with an RNA polymerase-derived peptide containing the Rme2a mark. 136 The domain features a spacious aromatic cage, ideally suited to accommodate methyl arginine residues (Figure 11c). 137 Selectivity for the asymmetric isomer is controlled, at least in part, by a tyrosine residue that stacks with the guanidinium group. 136 However, the molecular mechanisms that underlie the discrimination for histone sites remain unclear. Synthetic receptors for Rme2 have been generated using dynamic combinatorial libraries. 138 Several aromatic dithiol building blocks were incubated in the presence of a short Rme2a-containing peptide (Figure 11d). Upon prolonged incubation, a three-membered, disulfide-bonded host molecule had formed that recognized histone peptides featuring an Rme2a mark. 
The same peptides containing Rme2s or Rme were bound less tightly by approximately 1 kcal/mol, although the host displayed no selectivity against trimethyllysine. 138 Conceivably, such receptors may find application as affinity reagents for enriching methylated histones and other proteins for proteomics studies. 3.3.2 Histone Citrullination Whether histone arginine methylation marks can be removed is contentious. 120,139 One possibility under active research is the potential for methyl-deimination of methylarginine into citrulline by peptidyl arginine deiminases such as PAD4 (Figure 12). 140−143 Interestingly, histone citrullination steers diverse biochemical functions independent of arginine methylation. Examples include the regulation of transcription 140,144 and linker histone binding. 145 It is currently unknown whether histone citrullination is reversible. However, given that biological mechanisms to convert free citrulline into arginine exist, 146 it is tempting to speculate that related enzymes might also operate on proteins containing this residue. Figure 12 PAD4-catalyzed deimination and possibly demethylimination to citrulline. Whether mechanisms exist to convert citrulline back to arginine in the context of histones is unknown. 3.4 Histone Phosphorylation Protein phosphorylation plays a central role in signaling, and histone substrates are no exception. Regulation of chromatin structure by histone phosphorylation is particularly important during cell cycle progression. As is common for protein phosphorylation in eukaryotes in general, serine and threonine phosphorylation have been studied in most detail, although histone tyrosine phosphorylation is also known to control chromatin structure and function. 147−152 In addition, phosphoarginine 153,154 and phosphohistidine 154−156 residues have been detected in histones, but their biochemistry is much less well studied due to the chemical instability of these marks. 
3.4.1 Synthesis of Histone Phosphopeptides Incorporation of residues with O-linked phosphoryl groups by Fmoc-based solid-phase synthesis is in most cases routine nowadays. Typically, monobenzyl groups are used to protect the phosphoryl group during synthesis (Figure 13a). The presence of a negative charge on the monoprotected phosphoryl group during Fmoc deprotection with piperidine drastically reduces beta elimination for phosphoserine and phosphothreonine as compared to when dialkylated phosphoamino acids are used. 157 However, during the coupling of monoprotected phosphorylated amino acids, additional base is required for efficient coupling, and reversible acylation of the phosphoryl group can occur. 157,158 Figure 13 Building blocks for the synthesis of O-linked (a) and N-linked (b) phosphopeptides and their analogues. The synthesis of peptides containing acid-labile N-linked phosphoryl groups is much more challenging. 159,160 Nevertheless, recent developments have enabled the incorporation of phosphoarginine residues through the use of trichloroethyl (Tc) protecting groups that are selectively removed by hydrogenolysis after global deprotection using a TFA/scavenger cocktail (Figure 13b). 161 Furthermore, stable analogues for both isomers of phosphohistidine, where the phosphoryl group is attached to either N-τ (3-pHis, analogues 3 and 4) or N-π (1-pHis, analogue 5), have been synthesized, using a click reaction, for SPPS using Boc and Fmoc strategies (Figure 13b). 162,163 These analogues permitted the generation of pan-antiphosphohistidine antibodies 164 as well as variants that selectively recognize phosphohistidine in histone peptides. 162 Thus, chemical and biochemical tools to study histone phosphorylation at basic residues are coming of age, enabling long-awaited investigations into the biochemistry of these intriguing PTMs. 
3.4.2 Effects of Ser/Thr Phosphorylation in Protein–Protein Interactions The close proximity of Ser/Thr residues to the major sites of histone lysine methylation on histone H3 (Thr3,Lys4; Lys9,Ser10; Lys27,Ser28) led Fischle, Wang, and Allis to propose that phosphorylation can switch the function of adjacent methylation marks. 165 Experiments using site-specifically phosphorylated histone peptides were able to directly confirm this hypothesis. The binding of the CD of HP1 to H3K9me3 is abolished in the presence of a phosphorylation mark on the neighboring H3S10 (Figure 14a). 166,167 Phosphorylation of H3S10 occurs during mitosis, and serves to evict HP1. In addition, H3S10ph precludes methylation of H3K9 by the heterochromatin-specific methyltransferase Suv39h1. 70 Histone serine phosphorylation can also be recognized by dedicated reader modules. To isolate binders of pSer in the context of the N-terminal tail of histone H3, Mahadevan and co-workers affinity purified cell lysate using an immobilized synthetic peptide containing H3S10ph and acetyl marks at Lys9 and Lys14. 168 Using mass spectrometry, they identified a member of the 14-3-3 family, 169 a helical pSer/pThr binding motif. 168 Structural studies revealed that the phosphoryl group was accommodated in a cationic binding pocket featuring two arginine residues from 14-3-3 that form salt bridges with the ligand (Figure 14b). In addition, Arg8 on the histone peptide was sandwiched between pSer and a glutamate residue, thereby contributing to substrate specificity. Consistent with this binding mode, 14-3-3 also binds to H3S28ph with Arg26 at the −2 position. Figure 14 Biochemical readout of histone phosphorylation. (a) Illustration of a meLys/pSer switch. Phosphorylation at H3S10 ejects the K9me3 binding protein HP1, and prevents K9 methylation by the methyltransferase Suv39h1. (b) Structure of 14-3-3γ (green, pdb code: 2c1j) in complex with an H3 peptide containing S10ph and K9ac (yellow). 
Hydrogen bonds are indicated by dotted lines. Synthetic phosphopeptides also aided in illuminating the biochemistry of histone tyrosine phosphorylation. For example, the Drosophila transcription regulator Eyes Absent (EYA) was determined to be a histone tyrosine phosphatase that was able to dephosphorylate peptides of the histone variant H2A.X containing pTyr142 but not pSer139. 150,151 Currently, no specific binding module for histone tyrosine phosphorylation is known, and the positions of Tyr residues in histones (only two of 15 histone Tyr residues are surface-exposed) 152,170 suggest that pTyr may be able to exert its functions by directly modulating nucleosome structure and DNA access. 3.5 Glycosylation Glycosylation has important implications for protein structure and function. 171 Among the myriad of biologically pivotal carbohydrate modifications, attachment of β-N-acetylglucosamine (GlcNAc) to Ser and Thr residues is most germane to histone biochemistry. 172 Using lectins 173 (carbohydrate-binding proteins), GlcNAc-specific antibodies, 173 or metabolic labeling with azide-modified GlcNAc 172 (and subsequent derivatization with a biotinylated alkyne moiety) to enrich GlcNAc-ylated proteins, all core histones have been shown by mass spectrometry to carry this PTM. Biochemically, GlcNAc-ylation of histone H2B at Ser112 promotes the ubiquitination of the proximal Lys120, and is associated with transcription activation. 174 To test the effect of the GlcNAc modification in vitro, the ubiquitylation of nucleosomes by the E3 ligase BRE1A and its associated complex members was studied in the presence of H2B peptides. GlcNAc-modified H2B peptides inhibited the reaction, while unmodified congeners or free GlcNAc-ylated serine did not. These results suggest that the ligase binds strongly to site-specifically glycosylated H2B. 
The study of histone glycosylation is still in its infancy, but advanced methods to study protein glycosylation may be borrowed from other fields of research, 175−177 and highly complex glycopeptides and glycoproteins can be synthesized. 178 These tools might provide a means to answer the remaining biochemical questions about how glycosylation intersects with chromatin biology. 3.6 ADP-Ribosylation Histones are subject to mono- and poly-ADP-ribosylation (MAR and PAR, respectively) involving many different side-chains, including lysine, arginine, asparagine, and glutamate. 179,180 These modifications are associated with a plethora of important biological functions, 181,182 yet the mechanistic contributions of individual ADP-ribosylation marks are difficult to dissect due to a dearth of (bio)chemical tools to study this diverse class of PTMs. 183 Nevertheless, recent progress has provided strategies to incorporate ADP-ribosylated building blocks and analogues into peptides. Orthogonally protected ribose conjugates to Asn and Gln have been synthesized that allow selective phosphorylation of the 5′-OH group, followed by coupling with an activated AMP building block during Fmoc-SPPS (Figure 15a). 184 In this way, a heptapeptide corresponding to the N-terminus of H2B containing an analogue of mono-ADP-ribosylated Glu has been created. Stable analogues of mono-ADP-ribosylated Glu residues can be generated using a chemoselective ligation approach. 185 Peptides encompassing residues 1–19 of histone H2B, functionalized with a nucleophilic aminoxy group at position 2, form oximes with ADP-ribose at pH 4.5 (Figure 15b). This reaction is selective because lysine and arginine residues are protonated under these conditions. While the use of a secondary alkoxyamine was beneficial for retaining the ADP-ribose conjugate in the furanose form as opposed to an open configuration (Figure 15c), the yield of the ligation was poor. 
185 ADP-ribosylated proteins specifically interact with macrodomain-containing proteins. 183 The histone variant macroH2A is the founding member of this family. 186 Indeed, chemically ADP-ribosylated H2B(1–19) interacted with macroH2A, 185 suggesting a role for this modification in regulating chromatin structure. 181 Conceivably, the synthetic advances discussed above will enable the generation of antibodies recognizing mono-ADP-ribosylated proteins, and will thus provide a much needed tool to study ADP-ribosylation. 183 Figure 15 Synthesis of mono-ADP-ribosylated peptides. (a) On-resin phosphorylation and AMP conjugation of an orthogonally protected ribosyl moiety. (b) Chemoselective ADP-ribose (inset) ligation to aminoxy-functionalized peptides. (c) ADP-ribose conjugates of N-methyl aminoxy-functionalized peptides retain the ribo-furanosyl-form. AMP = adenosine monophosphate, ADP = adenosine diphosphate. 3.7 Ubiquitylation Histone ubiquitylation represents a particularly intriguing PTM given that the size of the modification (76 amino acids) rivals the size of the histone substrate. In contrast to polyubiquitylation, which commonly serves to flag proteins for degradation, the monoubiquitylation signals observed on H2A and H2B are associated with regulation of gene expression. 14 Detailed evaluation of the genomic distribution of H2B modified with ubiquitin at Lys120 (H2B-K120ub) was enabled by an antibody that specifically recognizes this species. 187 As described by Minsky et al., 187 a branched peptide encompassing residues 116–124 of H2B and the C-terminal four residues of ubiquitin, conjugated to H2B-K120 via an isopeptide bond, served as a surrogate for H2B-ubiquitin in the immunization process. The authors subsequently performed ChIP assays on human cell lines with this antibody and found that H2B ubiquitination occurs in the transcribed regions of highly expressed genes. 
Beyond antibody preparation, fully understanding the diversity of direct biochemical and biophysical consequences of attaching ubiquitin and related proteins to histones required the development of synthetic strategies for site-specific attachment of the complete ubiquitin protein to histones. A key step toward this goal involved the synthesis of peptide-ubiquitin conjugates. 188 As this process hinges upon protein ligation techniques, we defer its detailed description to section 4.3.5. 3.8 A Growing List of Histone PTMs Novel histone PTMs continue to be discovered. Highly sensitive mass spectrometry has revealed, for instance, that lysine residues can be modified with a diverse set of acyl groups. 189 The prime example in this category is lysine crotonylation (Figure 16a), a mark widely distributed through active chromatin regions. 189 The presence of this PTM was authenticated by synthesis; the chromatographic and mass spectrometric properties of cell-derived and synthetic histone peptides were identical. Antibodies that recognize this mark are already commercially available. Still, little is known about nuclear factors that attach, remove, or specifically bind this modification, although HDAC3, as well as members of the sirtuin family, have been found to possess measurable but small decrotonylase activity. 189−191 The latest addition to the histone lysine acylation roster is 2-hydroxyisobutyrylation (Khib, 6). 192 Zhao and co-workers detected a mass shift of +86.0354 in tryptic digests of histones from mouse testis cells, corresponding to the addition of a C4H7O2 fragment. Several isoforms of this composition are plausible (Figure 16b, 6–10). Therefore, five peptides encompassing residues 68–78 of H4 were synthesized with different lysine modifications. Among these, the variant where lysine 77 was acylated with 2-hydroxyisobutyric acid was indistinguishable from the biological sample in LC/MS/MS assays, thus confirming the identity of the novel PTM. 
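The mass arithmetic behind this assignment can be sketched in a few lines of Python. This is an illustrative calculation only, not the authors' analysis pipeline: the function names are hypothetical, the isotope masses are standard values rounded to eight decimals, and the sketch assumes that the acyl group replaces one hydrogen atom of the lysine ε-amine, so that the net shift equals the mass of the C4H7O2 group minus one hydrogen.

```python
# Monoisotopic masses (in u) of the most abundant isotopes; standard values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def acylation_shift(acyl_group):
    """Net mass change when an acyl group replaces one H on the lysine amine.

    Assumption of this sketch: the observed shift is (group mass) - (one H).
    """
    return monoisotopic_mass(acyl_group) - MONOISOTOPIC["H"]


# Acyl groups named in the text, written as the fragment that is attached.
HYDROXYISOBUTYRYL = {"C": 4, "H": 7, "O": 2}  # C4H7O2, shared by isomers 6-10
CROTONYL = {"C": 4, "H": 5, "O": 1}           # C4H5O

OBSERVED_SHIFT = 86.0354  # mass shift quoted in the text for Khib

khib_shift = acylation_shift(HYDROXYISOBUTYRYL)
kcr_shift = acylation_shift(CROTONYL)

print(f"calculated Khib shift: {khib_shift:.4f}")
print(f"difference to quoted shift: {khib_shift - OBSERVED_SHIFT:+.4f}")
print(f"calculated crotonyl shift: {kcr_shift:.4f}")
```

Because all candidate structures 6–10 share the same elemental composition, this calculation returns the same value for each of them. Accurate mass alone therefore narrows the assignment to a composition, and comparison with synthetic standards by LC/MS/MS is needed to distinguish the isomers.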
The genomic localization of H4K8hib was found to differ from the distribution of H4K8ac, suggesting a distinct biochemical function for these two marks. Recently, Tessarz et al. described that the amide side-chain of Gln104 in human histone H2A can also be selectively methylated in vivo. 193 This modification abrogates binding of H2A to the histone chaperone FACT (facilitates chromatin transcription), as evidenced by a peptide-based pull-down in vitro. Glutamine methylation was only detected in the nucleolus, where it regulates the expression of the 35S rDNA gene, and hence represents the first histone mark that is associated with only one specific polymerase. Figure 16 Newly discovered lysine acylation marks. (a) Lysine crotonylation. (b) Lysine hydroxyisobutyrylation (6), and control isomers (7–10). With the ever increasing sensitivity of mass spectrometers, as well as advances in sample workup, more histone PTMs will likely appear in the near future. 194 These new marks contribute to the immense complexity of biological signaling, and challenge the analytical creativity of protein biochemists, not to mention the synthetic skills of peptide chemists for subsequent mechanistic investigations. 3.9 Proline Isomerization Amino acids in proteins can occur in the cis and the trans conformation with respect to the backbone amide bond. While for most residues the equilibrium lies far on the side of the trans isomer, Pro residues populate the cis conformer to a significant extent (Figure 17a). 195 The position of the equilibrium can be fine-tuned by tertiary interactions that stabilize either state. Interconversion between the two forms occurs spontaneously, albeit slowly, on the time scale of minutes. Dedicated proline isomerases such as the yeast enzyme Fpr4 catalyze this process, and several Pro residues on histone tails have been identified as substrates. 
196 Kouzarides and co-workers proposed that the H3K36-specific methyltransferase Set2 is only active when the neighboring Pro38 is in the trans conformation, and that the H3K36-specific demethylase, JMJD2A, prefers the cis isomer. 197 Reciprocally, K36 trimethylation inhibits the activity of Fpr4, leading to a model where genes can be activated quickly through the combined action of Fpr4 and Set2 (Figure 17b). 197 The slow isomerization of the methylated trans conformer to the cis state followed by JMJD2A-mediated demethylation could act to set a timer for the duration of the active state. Figure 17 Proline isomerization. (a) Amino acid cis/trans equilibria. (b) Proposed switches through coupled Pro isomerization and Lys methylation to activate associated genes. Lys36 methyl marks are indicated as green spheres. Chemical tools for the synthesis of peptides and proteins containing proline analogues with distinct conformational preferences have found application in the study of ion channels 198 and protein aggregation, 199 among others. 200 They might also lend themselves to directly probe the structural and functional consequences of this noncovalent histone PTM. 3.10 Probing the Function of Cancer-Derived Histone Mutations Genomic sequencing efforts have revealed that several cancers are associated with histone mutations. 201−204 In particular, the mutation Lys27Met on histone H3 isoforms occurs in the majority of cases of a subtype of pediatric brain tumor (diffuse intrinsic pontine gliomas, DIPG). Lys27 is the target site of the multisubunit methyltransferase, Polycomb Repressor Complex 2 (PRC2, Figure 18a). This molecular machine and its associated histone PTM, H3K27me3, play a central role in gene silencing, and thus are essential for cell differentiation and development of multicellular organisms. 205 In mammalian cells, histone proteins are encoded on many synonymous genes. 
Consequently, it came as a surprise that cells carrying the K27M mutation on only one H3 gene, corresponding to 3–18% of the total histone H3 protein pool, 206 display strongly reduced H3K27me3 on all wild-type histones (Figure 18b). 206−209 Similarly, the presence of the H3K27M mutant dramatically lowered Lys27 methylation in cell lines 206,208,209 and in Drosophila. 210 Figure 18 Cancer-derived H3K27M mutants inhibit PRC2 activity. (a) Molecular architecture of PRC2 according to Ciferri et al. 218 (b) PRC2 inhibition by K27M causes aberrant gene expression. PRC2 serves to silence certain genes through its HMT activity (left). In K27M tumor cells (right), trimethylation at Lys27 is dramatically reduced, preventing gene repression. K27me3 marks are shown as green flags, K27M mutant as a red circle. (c) Structure of Lys, Met, and Nle. Recent collaborative efforts from the Allis and Muir groups provided unequivocal proof that these mutant histones directly inhibit PRC2. 206,211 In vitro histone methyltransferase activity on recombinant unmodified nucleosome substrates was strongly reduced by a synthetic peptide bearing the K27M mutation. 206 Substituting the thioether moiety in methionine with a methylene group in norleucine (Nle, Figure 18c) resulted in even more potent inhibition of PRC2. By contrast, peptides with polar and branched residues at position 27 were poor inhibitors. 211 Peptide-based inhibitor studies revealed extensive contacts between the entire H3 tail and EZH2, the catalytic subunit of the complex (see also section 3.11.1). Intriguingly, many naturally occurring PTMs of the H3 tail drastically reduced inhibitor potency of K-to-M mutant peptides, illustrating that chromatin context influences the downstream effect of “oncohistones”. 211 Lewis et al. also demonstrated that Lys to Met mutations inhibit many different HMTs (all sharing a common catalytic SET domain) in vitro and in vivo. 
206 Specific peptide inhibitors, derived from these initial observations, would be tremendously useful to understand the biochemistry of histone methyltransferases, and might find use in combatting diseases associated with hyperactive HMTs. The recent discovery that Lys to Met histone mutations at H3K36 are also associated with pathologies 212 underscores the importance of investigating the interactions of histone methyltransferases with their substrates and inhibitors. Simultaneously, these findings provide an enormous challenge to medicinal chemists and (chemical) biologists alike to devise novel strategies to inhibit the inhibition of pivotal nuclear factors, such as PRC2, by pathological histone mutants. 3.11 Cross-linkers Synthetic peptides can be furnished with a broad range of invaluable probes, including cross-linkers. These are stable molecules that, upon activation with a chemical or physical stimulus, become extremely reactive and covalently attach themselves to diverse functional groups in spatial proximity. 213,214 Cross-linking can be harnessed to capture ephemeral interactions, and thus it lends itself to the study of transient protein–protein contacts and detection of binding partners and surfaces in complex mixtures. 3.11.1 Analysis of PRC2 Regulation PRC2 (see section 3.10) interacts with nucleosomes through several of its subunits, and many of these binding events are regulated by chromatin state. 205 Cross-linking strategies have been exploited to aid in disentangling the PRC2 regulatory network, specifically by identifying to which subunit cancer-derived histone mutants bind, and how PRC2 detects the nucleosome density of genomic targets. The methionine analogue photomethionine 215 (Figure 19a) can be incorporated into peptides by Fmoc-based SPPS, and represents an excellent tool to study the binding site of H3K27M mutants. 
Upon irradiation with UV light, the diazirine moiety decomposes into N2 and a highly reactive carbene, immediately inserting into nearby bonds, including C–H bonds. 213,214 An H3 peptide (residues 23–34) containing K27photoMet and a biotin tag cross-linked efficiently to EZH2, the catalytic subunit of PRC2, suggesting that histone mutants act as orthosteric active site-directed inhibitors (Figure 19b). 206 Figure 19 Cross-linking strategies to study PRC2 regulation. (a) Structure and photo-cross-linking mechanism of photomethionine. (b) H3(23–34)K27photoMet cross-links to the catalytic subunit EZH2. The diazirine cross-linker is shown as a red triangle, the covalent adduct as a red line. (c) Structure and oxidative cross-linking mechanism of DOPA. (d) H3(35–42) cross-links to SUZ12. The DOPA cross-linker is shown as a red hexagon, the covalent adduct as a red line. Dense chromatin is methylated more efficiently by PRC2 than dispersed arrays. 216 This stimulation is mediated by an octapeptide corresponding to residues 35–42 of H3. To determine which component of PRC2 senses local chromatin density, this peptide was synthesized with a DOPA 217 residue and a biotin tag (Figure 19c). Treatment of the H3(35–42)-DOPA peptide, bound to PRC2, with periodate led to covalent cross-linking to SUZ12, the central scaffolding subunit 218 of the complex (Figure 19d). 216 3.11.2 Capture of Transient Interactions Cross-linking turns weak or transient interactions into covalent ones. This feature is particularly useful to isolate binding partners from complex mixtures such as cell lysates. For example, ADP-ribosylated peptides (see section 3.6) were not able to pull down macroH2A doped into nuclear extracts, but furnishing the peptide with a benzoyl-phenylalanine (BPA) residue 219 (Figure 20a) and performing the purification step after UV irradiation enabled trapping of this weak interaction. 185 Figure 20 Photo-cross-linking strategies. 
(a) Structure and photoexcitation of p-benzoyl-phenylalanine. (b) Cross-linking-based workflow to identify proteins that are sensitive to the methylation state of H3K4. Kapoor and co-workers used photo-cross-linking to identify PTM-specific histone binding proteins in an unbiased fashion. Initially, H3K4me3 peptides furnished with a BPA residue and an alkyne group were used to evaluate the cross-linking reaction in vitro and in vivo. 220 As expected, the probe modified the PHD finger protein ING2, a known binding module for this PTM, but not HP1, which binds H3K9me3 (section 3.2.2). In follow-up studies, mass spectrometric analysis of cross-linked samples enabled proteome-wide analysis of PTM binders. 221 For this work, the group synthesized a second version of their probe without the K4me3 mark and performed a SILAC experiment (Figure 20b, see also section 3.2.3). Cells grown in “light” media were lysed, incubated with the K4me3 probe, and subjected to UV irradiation. In parallel, the K4me0 probe was cross-linked to extracts from cells grown in “heavy” media. Subsequently, the two experiments were combined and reacted with biotin-N3 allowing for enrichment of cross-linked species using streptavidin beads. MS analysis provided a list of known K4me3 binders, along with a set of potentially novel readers of this mark. Similarly, the panel of proteins that prefer K4me0 included familiar and candidate interactors. Several of the newly discovered interactions were verified using ITC, demonstrating the validity of the approach. 221 Extension of this methodology to H3 tails modified with K9me3, H3T3ph, and the doubly modified T3ph/K4me3 has been reported since. 222 3.12 Combinatorial Approaches To Study Histone Biochemistry Over 100 distinct histone PTMs are currently known. 189 These marks seldom occur in isolation. Instead, many histone PTMs coassociate into so-called chromatin states, 223−225 characterizing the biochemical environment of genomic loci. 
For instance, the activating signature H3K4me2/3 often manifests in combination with H3K9ac and H2B-K120ub. 225 Thus, it is not surprising that many histone PTMs exert their full effect only in conjunction with other marks. Individual reader domains are sensitive to the presence of histone modifications close to their main target residue, and most chromatin associated proteins contain several histone binding domains. It is therefore important to interrogate the molecular consequences of histone modification in a combinatorial fashion. Peptide libraries provide an ideal means to screen interactions between histone PTMs with nuclear proteins, in particular when synergism and antagonism of local PTM combinations on the same peptide ought to be explored. 226 In this section, we will discuss various strategies to assemble histone peptide libraries, and highlight their key applications. 3.12.1 Histone Peptide Microarrays Individually synthesized and purified peptides, each bearing a biotin handle and a unique PTM signature, can be printed onto avidin-coated glass slides to yield a densely covered microarray (Figure 21a). While the synthesis of such a collection is time and labor intensive, a typical synthesis scale provides enough material for hundreds of chips. 227 The synthetic effort is rewarded by a dramatically increased throughput based on the simultaneous analysis of pairwise interactions between effectors and each member of the peptide library. Readout is most easily achieved by fluorescently labeled antibodies, and the use of epitope tagged proteins of interest facilitates this process (Figure 21a). Upon hybridization, bright spots are simply matched with the peptide identity through their position on the microchip. Peptide arrays containing tens to hundreds of peptides displaying varying histone PTMs at distinct residues have been utilized to screen the binding specificity of known and novel chromatin interacting domains. 
228−236 Figure 21 Histone peptide microarrays. (a) Preparation of microarrays and protein binding assay. POI stands for protein of interest, AB for antibody. (b) Structure of the coupled TTD (light blue) and PHD (pale green) of UHRF1 (pdb code: 3ask). The H3 peptide trimethylated at residue 9 is depicted in yellow, the linker between the two modules in black. (c) HDAC assay using SAMDI. Xaa and Yaa denote any amino acid. A case in point is the study by Matthews et al. on how RAG2, a protein essential to V(D)J recombination during immune cell maturation, engages chromatin. 229 A 45-membered histone peptide array featuring different methyllysine, methylarginine, acetyllysine, and phosphothreonine marks identified the PHD finger of RAG2 as a K4me3-binding module. This interaction and its specificity were verified by classical pull-down approaches. Notably, mutations that cripple the aromatic cage of the RAG2 PHD finger caused a reduction in V(D)J recombination, and similar mutations occur in patients suffering from immunodeficiency. 237 The same approach led to the characterization of ORC1, a component of the origin recognition complex (ORC). 232 This protein contains a BAH (bromo-adjacent homology) domain, 238 which mediates selective binding to H4K20me2, as determined with an 82-peptide microarray. Again, the results of the screen were verified in vitro, in this case by ITC, and in vivo. Indeed, H4K20me2 binding by ORC1 is important for recruitment of ORC to designated genomic loci, and the loss of this interaction is linked to a growth retardation syndrome. 239,240 Strahl and co-workers profiled several methyllysine binding domains with a peptide microarray containing 130 peptides with up to six simultaneous PTMs including lysine and arginine methylation, serine and threonine phosphorylation, and lysine acetylation. 233 In most cases, the presence of a phosphoryl group proximal to the target methyllysine residue abolished binding.
In contrast, the tandem tudor domain (TTD) of the E3 ubiquitin ligase UHRF1 tolerated a peptide epitope containing both H3K9me3 and S10ph. This feature enables UHRF1 to remain bound to H3K9me3 during mitosis when Aurora B-mediated S10 phosphorylation ejects many known K9me3 binders. 166 A rescreen of the binding preference of the UHRF1 TTD coupled to its neighboring PHD finger suggested that the PHD, which recognizes the unmodified N-terminus of H3, 241 dominates the association with histone peptides. 234 Variants with a mutated PHD finger unable to bind the H3 tail did not interact significantly with any peptide probe on the chip. A crystal structure of the coupled TTD-PHD domains demonstrates that the two modules associate and compactly bind to an H3 tail containing K9me3 (Figure 21b). 242 Interestingly, the lipid phosphatidylinositol 5-phosphate can allosterically activate the TTD of UHRF1 for H3K9me3 binding, thus providing a link between lipid metabolism and chromatin architecture. 243 Microarrays consisting of 250 biotinylated peptides encompassing all monoacetyllysine marks on the H3 and H4 tails, as well as di- and polyacetylated versions, have been used to profile commercial, site-specific acetyllysine antibodies. 244 Surprisingly, all antibodies tested preferentially bound to polyacetylated peptides, suggesting that there is a need for improved acetyllysine detection reagents. When peptides are immobilized on gold plates covered with a self-assembled monolayer, the resulting arrays can be analyzed by matrix-assisted laser desorption/ionization mass spectrometry (SAMDI-MS). 245,246 Gold surfaces were covered with alkanethiolates and subsequently functionalized with maleimide groups. Hexapeptides, centered around an acetylated lysine, were attached to the surface via C-terminal cysteine residues. Subsequent treatment with various HDACs, followed by SAMDI-MS, enabled the substrate scope of these eraser enzymes to be profiled (Figure 21c).
246 3.12.2 SPOT Synthesis of Peptide Arrays Direct synthesis of peptides on cellulose paper (so-called SPOT synthesis) provides a convenient route to spatially addressable microarrays. 247 This strategy parallelizes the library synthesis and bypasses labor intensive purification steps associated with the immobilization strategies discussed above, but, as a consequence, limits the length (6–18 residues are common) 248 and complexity of peptide targets. 247 To commence peptide synthesis, spots on cellulose membranes are first esterified with Fmoc-β-Ala-OH or a similar protected amine. The membrane is then capped with acetic anhydride. Subsequent Fmoc deprotection is followed by iterative, parallelized peptide synthesis, where each spot is reacted with a desired Fmoc-protected amino acid separately by dispensing only enough reagents to cover the spot (Figure 22). Because cellulose membranes are resilient to short exposures in TFA, side-chain deprotection can be achieved while retaining peptide attachment and the integrity of the support. Alternatively, peptides can be cleaved from the membrane by base treatment for analytical purposes, or if soluble peptides are required. Binding assays are performed in analogy to dot-blot detection. SPOT arrays are incubated with epitope tagged proteins of interest, which are subsequently detected with primary and, if required, secondary antibodies conjugated to horseradish peroxidase or alkaline phosphatase. Peptides, identified by their position on the membrane, targeted by the protein of interest are visualized using bioluminescent or chromogenic substrates. In this way, SPOT arrays containing hundreds of modified peptides have been used to profile the specificity of a range of sequence specific methyllysine reader domains, including the CDs of HP1β 249 and HP1γ, 250 the PHD finger of the chromatin remodeler ATRX, 251 and the PWWP domain of a DNA methyltransferase, 252 among others. 
249,250 In addition, the diversity of sequences that can be synthesized on spot arrays is ideally suited to assess the promiscuous binding of readers, as exemplified by the interaction of MBT repeats of L3MBTL1 with dimethyllysine residues. 250 Figure 22 SPOT synthesis of histone peptide arrays on cellulose membranes. Xaa and Yaa denote any amino acid, pg stands for side chain protecting group. Of particular interest is a recent comprehensive analysis of human bromodomains. 253 SPOT arrays containing all possible acetylation sites on human histones were used to profile 33 individual BD family members, together spanning thousands of pairwise interactions. In general, BD binding to acetylated peptides was weak. Some binding modules displayed remarkable specificity (e.g., the BDs of TRIM28 and MLL), while others bound most acetylated peptides (e.g., the BDs of SP140 and PCAF). SPOT arrays with numerous combinations of acetyl marks were synthesized to evaluate cooperative binding. Several BDs, including those of the transcriptional coactivator BRD4, were shown to strongly prefer multiply acetylated histone peptides. A fraction of the hits were assayed by ITC using soluble peptides, and almost 30 crystal structures of BDs were determined in this landmark study. SPOT arrays are compatible with a range of different detection strategies, and are well-suited for enzymatic assays. The substrate specificity of the histone methyltransferase G9a was evaluated using a SPOT membrane encompassing residues 1–20 of H3 with numerous mutations and PTMs. 254 G9a activity was determined by fluorography upon incubation with 3H-S-adenosylmethionine. A minimal recognition motif includes an unmethylated Arg in the −1 position, with moderate selectivity at the −2,+1,+2 positions, indicating that G9a is quite promiscuous. Indeed, several nonhistone targets were found to be methylated in vitro, and the products recognized by HP1β. 
These results suggest that G9a exerts its effects through a combination of histone and nonhistone pathways. Similar analyses were carried out for the methyltransferases Dim-5, 255 NSD1, 256 and SET7/9 257 to determine the substrate specificities of these important enzymes. A 384-membered SPOT library of 19-mers was used to probe a variety of different PTM-recognizing antibodies. 258 Overall, many antibodies displayed the desired specificity, but noncognate binding to the same PTM at different sites was certainly an issue. False negatives due to epitope occlusion by additional modifications surrounding the targeted residues were also frequently observed. 258,259 Thus, the thorough profiling of antibody specificity using a range of different peptide approaches has provided valuable insight into the applicability of some of the most used reagents in chromatin biochemistry. While many antibodies display the proclaimed specificity, some suffer from severe cross-reactivity, and most exhibit additional preferences for the modification state of adjacent residues. Using a particularly comprehensive array (746 peptides), Denu, Garcia, and co-workers evaluated histone-PTM reader domains as specific reagents to isolate nucleosomes from particular chromatin states. 260 Consistent with previous observations, the authors found that the ADD domain (a type of zinc finger) of ATRX binds with high specificity to H3K9me3 in the context of unmodified H3K4. In contrast, antibodies raised against H3K9me3 displayed poor selectivity for their cognate marks. Chromatin affinity purifications with the ADD domain led to the enrichment of histones that were hypermethylated at H3K9 and H4K20, and hypomethylated at H3K4 and H3K79, as judged by mass spectrometry. These results demonstrate that reader domains can serve as valuable alternatives to antibodies to interrogate the composition and distribution of chromatin states. 
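Because peptides on microarrays and SPOT membranes are identified purely by their position, decoding a hit amounts to a lookup from grid coordinates to the synthesis list. The following is a minimal sketch of that bookkeeping, not a description of any published layout: the 16 x 24 grid (chosen only because it gives 384 spots), the row-by-row ordering, and the placeholder peptide names are all assumptions made for illustration.

```python
from string import ascii_uppercase

ROWS, COLS = 16, 24  # assumed grid; 16 x 24 = 384 spots


def position_to_index(row: int, col: int) -> int:
    """Zero-based (row, col) on the membrane -> zero-based index in the synthesis list."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("position lies outside the array")
    return row * COLS + col  # assumes spots were dispensed row by row


def index_to_label(index: int) -> str:
    """Zero-based index -> plate-style label such as 'A1' or 'P24'."""
    if not 0 <= index < ROWS * COLS:
        raise ValueError("index lies outside the array")
    row, col = divmod(index, COLS)
    return f"{ascii_uppercase[row]}{col + 1}"


# Placeholder identifiers standing in for a real, curated peptide list
peptides = [f"peptide_{i + 1:03d}" for i in range(ROWS * COLS)]

hit = (1, 0)  # hypothetical bright spot: second row, first column
idx = position_to_index(*hit)
print(index_to_label(idx), peptides[idx])
```

In practice the peptide list would carry the sequence and PTM pattern of each spot, so that a bright position can be read directly as a modification signature.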
3.12.3 One Bead-One Compound Peptide Libraries Libraries containing thousands of peptides are produced most easily by split-pool synthesis. 261,262 In this approach, peptides are synthesized using Fmoc chemistry on beads that are resilient to TFA cleavage. Additionally, for every coupling step, resin beads are split into different vials, each containing a unique activated amino acid. Upon completion of the reaction, beads are pooled again, and randomly redistributed for subsequent couplings. Finally, peptides are deprotected with TFA containing scavengers. In this way, each bead will carry only one peptide sequence, although several beads may contain the same peptide. Identification of peptides upon isolation of individual beads is achieved by microsequencing or by mass spectrometry, facilitated by performing partial capping steps at strategic sites, thus generating a mass ladder. 263 Cyanogen bromide can be used to cleave peptides from the resin prior to MS analysis when a C-terminal methionine residue is included in the sequence. 264 Figure 23 One bead-one compound libraries of modified H3 and H4 tails. One bead-one compound libraries are particularly useful when a large number of closely related peptides are desirable, as is the case when synergies between PTMs on a histone tail are queried. Denu and co-workers have prepared peptide collections encompassing 800 and 5000 members with combinations of known histone PTMs on the H4 (ref (265)) and H3 (refs (264,266)) tail, respectively (Figure 23). A colorimetric on-bead western assay was used to evaluate the binding profile of a range of GST-tagged domains to the H3 library (residues 1–10) in an unbiased manner. The expected preferences for methylation states at Lys4 of the interrogated PHD domains were observed along with various degrees of sensitivity to proximal PTMs. 
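The split-pool logic described at the start of this section can be made concrete with a small simulation. This is a sketch under stated assumptions rather than a synthesis protocol: the three variable positions, their building blocks, the bead count, and the function name `split_pool` are hypothetical, and every coupling is treated as quantitative. It illustrates why each bead ends up with exactly one sequence while several beads may share the same one.

```python
import math
import random
from collections import Counter


def split_pool(n_beads, cycles, seed=0):
    """Simulate split-pool synthesis; returns the list of residues coupled to each bead."""
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]
    for blocks in cycles:
        rng.shuffle(beads)  # pool all beads, then redistribute them at random
        for i, bead in enumerate(beads):
            # bead i lands in vial i mod len(blocks); each vial holds one building block
            bead.append(blocks[i % len(blocks)])
    return beads


# Hypothetical variable positions, listed in coupling order (C- to N-terminus)
cycles = [
    ["K", "Kme1", "Kme2", "Kme3"],  # position 4
    ["T", "Tph"],                   # position 3
    ["R", "Rme1", "Rme2"],          # position 2
]

beads = split_pool(600, cycles)
sequences = ["-".join(reversed(b)) for b in beads]  # written N- to C-terminus
counts = Counter(sequences)
max_size = math.prod(len(blocks) for blocks in cycles)

print(f"theoretical library size: {max_size}")
print(f"distinct sequences found on beads: {len(counts)}")
print(f"most frequent sequence and bead count: {counts.most_common(1)[0]}")
```

Because the number of beads exceeds the theoretical library size, at least one sequence must occur on more than one bead, which is why libraries are usually prepared with a large excess of beads over sequences.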
264 Switch-like behavior occurred in the case of phosphorylation at Thr3 in that this modification abrogated binding to surrounding residues by all proteins tested. Regulation of binding by arginine methylation followed a rheostat model in some cases. For example, ING2 binding was gradually decreased by each additional methyl group at Arg2. Some domains (the PHD fingers of RAG2, BHC80, AIRE) were ejected by Thr6ph, while the double tudor domain (DTD) of the demethylase JMJD2A was insensitive to this mark. The potential for reader-specific responses to Thr6ph prompted a search for this modification in vivo. Indeed, MS analysis detected this mark upon phosphopeptide enrichment by affinity chromatography. 264 3.12.4 Toward Nucleic Acid Encoded Histone Peptide Libraries Suga and co-workers performed in vitro translation of RNA sequences coding for histone peptides with an expanded genetic code. 267 Redundant codons were reassigned to be interpreted by tRNA molecules acylated with modified lysine building blocks (Figure 24a). Ironically, this strategy entails the incorporation of desired post-translational modifications prior to ribosomal translation on the residue level. Peptides containing Kme1, Kme2, Kme3, and Kac residues at positions 4, 9, 27, and 36 on the H3 tail were synthesized, although the yield for monomethylated products was poor. Up to four PTMs were incorporated simultaneously, allowing synergies between different marks to be explored. As expected, HP1 bound specifically to peptides containing H3K9me3, with a slight increase in affinity when K27 is methylated as well. Figure 24 In vitro translation of histone peptides. (a) Reassigned codons with corresponding amino-acyl-tRNAs. (b) Schematic representation of mRNA display with the puromycin-mediated attachment of the mRNA to the growing peptide chain. Notably, peptides translated in vitro can be tethered to their coding mRNA sequence, for example, through the use of puromycin-tagged mRNAs (Figure 24b). 
268 The link between the translated peptide or protein to its mRNA enables decoding of molecules that exhibit a given phenotype, such as binding of the peptide to a receptor, simply by sequencing the RNA portion. This strategy, combined with an expanded genetic code, has been applied by the Suga laboratory to select macrocyclic peptide inhibitors for diverse targets. 269 Thus, mRNA display and related technologies 270 harbor great potential for the synthesis of large encoded histone peptide libraries. 3.13 Beyond Peptides Chemical synthesis enables routine preparation of histone peptides carrying most known PTMs and, if desired, a wealth of probes, including cross-linkers, affinity reagents, and spectroscopic handles. Such peptides have proven to be indispensable in chromatin research because many central molecular transactions in chromatin biochemistry occur at the unstructured histone tails. Specifically, the biochemical rules and the physical chemical driving forces for how histone PTMs mediate interactions with nuclear proteins have been elucidated using the peptide chemistry toolbox. Peptides are thus a first resort utensil to validate and characterize biological discoveries. Despite their utility, peptides are, however, insufficient for certain protein interaction studies and functional assays. For example, nucleosomes are required to investigate processes that involve multivalency through different histones or depend on the presence of DNA. Nucleosomal DNA can provide electrostatic interactions to strengthen otherwise weak binding events, or even serve as the substrate per se, as is the case in transcription or remodeling assays, among others. Access to modified nucleosomes requires corresponding access to modified histones, all of which are over 100 amino acids in length. 
Because routine SPPS approaches are limited to approximately 50 residues, and thus fall short of attaining entire histone proteins, continuative and complementary technologies are required to increase the level of complexity of chromatin-related phenomena that can be scrutinized in vitro. 4 Chemical Approaches To Manufacture Histones and Chromatin In this section, we review modern approaches to synthesize site-specifically modified histones and chemically defined “designer” chromatin templates, and their application in investigating chromatin biochemistry. Robust protocols for the assembly of nucleosomes and chromatin templates have been developed for structural, biochemical, and biophysical studies. 271 Besides isolation from eukaryotes, histones can be produced recombinantly in E. coli as inclusion bodies. 272,273 Recombinant proteins are devoid of PTMs, and thus represent clean slates for in vitro studies. Their small size and positive charge permit facile purification of recombinant histones through size exclusion, ion exchange, and reverse phase chromatography. Stoichiometric amounts of each of the core histones are then refolded into octamers and supplied with DNA sequences with a high propensity to bend, that is, wrap around histone octamers. Nucleosomes assembled in this way have been crystallized, and their structure solved to <2 Å by the Richmond group. 274,275 The histone octamer forms a disk with a cationic lateral surface, which is enveloped by DNA (Figure 25a). The histone tails protrude from this compact structure, with residues 16–23 of H4 docking into an acidic patch on the H2A/H2B interface of an adjacent particle in the crystal lattice (Figure 25b). 274 These contacts, initially observed in crystal packing, are believed to play a central role in the folding of the chromatin fiber. 
276,277 Homogeneous nucleosome arrays can be assembled from repeats of a strong nucleosome positioning sequence such as the “Widom 601” (ref (278)) or the 5S rDNA sequence. 279−281 Structural studies with tetranucleosome arrays by X-ray crystallography 282 and dodecanucleosome arrays by cryo-EM 283 reveal details on the packing interactions that govern chromatin folding. In both structures, chromatin adopts a two-start helix with close interactions between i, i+2 nucleosomes (Figure 25c and d). However, alternative packing models for the chromatin fiber have been supported experimentally, 284,285 and the existence of highly ordered fibers in vivo is still contentious. 286 Several recent accounts cover this controversy in detail. 287−289 Figure 25 Nucleosome and chromatin architecture. (a) Electrostatic surface rendering of the mononucleosome (pdb code: 1kx5). Cationic areas are colored in blue, anionic patches in red, the DNA backbone is drawn in gray. (b) Interaction of the acidic patch on H2A/H2B (red surface) with the H4 tail of a neighboring particle (yellow). (c) Crystal structure of a tetranucleosome array (pdb code: 1zbb). (d) Dodecanucleosome arrays fold into a two-start helix as suggested by a cryo-EM structural model (EMD-2600). To understand how specific histone PTMs alter chromatin behavior, access to homogeneously modified chromatin templates is crucial. Initial studies relied on enzymatic preparation of modified histones and, by extension, chromatin carrying the desired marks. This approach provided invaluable insight into chromatin signaling, but suffers from lack of specificity. Many histone-modifying enzymes target multiple sites on histones as well as other nuclear proteins, thereby rendering it difficult to produce unique PTMs without unintentionally affecting alternate aspects of the biochemical pathway in question. In addition, the low activity of many isolated histone mark writers precludes complete modification. 
Fortunately, chemical biology has provided chromatin biochemists with a rich toolbox geared toward the assembly of site-specifically modified “designer” chromatin. Below we discuss the contributions of protein chemistry to our understanding of chromatin structure and function, and how PTMs modulate these properties from a biophysical and biochemical point of view. 4.1 Site-Specific Modifications of Histones and Chromatin Most histone proteins are devoid of cysteine residues. Only H3 contains one completely conserved cysteine residue, which has been shown to be inessential in yeast, 290 and is frequently mutated to alanine in biochemical and biophysical studies. This feature greatly facilitates site-directed modification of histones and nucleosomes upon genetically incorporating Cys residues at desired locations due to the unique reactivity of the cysteine sulfhydryl group toward a diverse repertoire of electrophilic probes. 291 4.1.1 Site-Specific Protein Cross-linking of Chromatin Dorigo et al. exploited non-native cysteine residues to investigate predicted contacts between the H4 tail and the acidic patch in chromatin fibers. 292 Upon compaction with MgCl2, 12-mer nucleosome arrays containing H2A-E64C and H4 V21C were cross-linked by treatment with a mixture of oxidized and reduced glutathione (Figure 26a). The resulting disulfide bond between H2A and H4 stabilized the compact state of the array, and additionally revealed the fold of the chromatin fiber. Limited digestion with a nonspecific nuclease, followed by native gel electrophoresis, yielded cross-linked arrays containing maximally six nucleosomes. This result confirms a two-start helix, reinforced by contacts between i, i+2 nucleosomes, but lacking i, i+1 interactions (Figure 25c and d). Similar interactions also occur to stabilize interstrand association, as seen when arrays containing exclusively H2A-E64C are mixed with arrays containing only H4 V21C. 
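The reasoning behind the six-nucleosome limit can be sketched as a connectivity problem. The model below is an idealization and not the authors' analysis: it assumes that every possible cross-link forms, that nuclease digestion cuts every linker DNA, and that nucleosomes then stay together only through disulfide bonds. The function name and the union-find bookkeeping are illustrative choices.

```python
def crosslinked_fragments(n_nucleosomes, offsets):
    """Group nucleosomes (numbered from 1) that remain connected through cross-links
    between positions i and i + d, for each d in offsets, once all linker DNA is cut."""
    parent = list(range(n_nucleosomes))

    def find(x):
        # follow parent pointers to the group representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n_nucleosomes):
        for d in offsets:
            j = i + d
            if j < n_nucleosomes:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n_nucleosomes):
        groups.setdefault(find(i), []).append(i + 1)
    return sorted(groups.values())


# i, i+2 contacts only, as in a two-start helix
print(crosslinked_fragments(12, [2]))
# hypothetical alternative in which neighbors (i, i+1) are cross-linked
print(crosslinked_fragments(12, [1]))
```

With only i, i+2 contacts a 12-mer array separates into two stacks of six nucleosomes, whereas a fold with i, i+1 contacts would hold all twelve together, so the size of the largest cross-linked species discriminates between the two arrangements.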
293 Chemical cross-linkers installed at specific histone sites complemented earlier approaches 294,295 that relied on nonspecific protein–DNA cross-linking to study nucleosome and chromatin architecture. 296 Cysteine residues, introduced, for example, at positions 2 or 12 of the H2A tail, can be conveniently alkylated with 4-azidophenacyl bromide (APB, Figure 26b). 297 Upon UV irradiation, the APB moiety decomposes to yield a nitrene that covalently inserts itself at two specific DNA sites, hence suggesting a defined arrangement for the H2A N-terminal tail with respect to nucleosomal DNA (Figure 26c). A similar strategy has been used to map intra- and internucleosomal contacts of other histone tails, 298,299 the position of linker histone H1 on a nucleosome, 300 as well as the mechanism of chromatin remodelers. 301 Figure 26 Nucleosome cross-linking. (a) Disulfide cross-linking from the H4 tail (green) to the acidic patch of H2A/H2B with engineered cysteines (black). (b) Structure of 4-azidophenacyl bromide (APB) and its reaction with cysteine. (c) Photo-cross-linking reveals the position of the H2A N-terminal tail. APB is attached to an engineered cysteine within the H2A tail (black), the cross-linking site on DNA is shown in black. Figure 27 Site-directed footprinting to map nucleosome positioning. (a) Activated disulfide reagent (7) to attach an EDTA derivative to cysteine residues (top) and hydroxyl radical generation by the Fenton reaction employing Fe(II) (bottom). (b) Model of the preferred cleavage site (red) upon introducing a sensitizer at H4S47C (yellow). The nucleosome dyad is indicated with a white arrow. (c) Structure of N-(1,10-phenanthroline-5-yl)iodoacetamide (8) in complex with Cu(I). 4.1.2 Footprinting Analysis of Nucleosome Positioning Information about the register in which nucleosomal DNA wraps around the histone octamer can be obtained from site-directed footprinting studies. 
By tethering an Fe(II)–EDTA complex (via disulfide 7) to a cysteine residue mutated into position 47 of H4, Richmond and co-workers were able to target hydroxyl radicals generated by the Fenton reaction 302,303 to specific DNA sites close to the dyad axis (Figure 27a and b). 304 The resulting strand cleavage was then used to map H3/H4 tetramer and nucleosome position at basepair resolution. Widom and co-workers have used a related approach to accurately map nucleosome positioning in yeast. 305 A copper chelator, N-(1,10-phenanthroline-5-yl)iodoacetamide (8), was used to modify the histone mutant H4S47C (Figure 27c), enabling Cu(I)- and H2O2-dependent strand cleavage in permeabilized cells. Next generation sequencing of the resulting fragments yielded a portrait of nucleosome architecture in cells at unprecedented resolution. This map provided detailed insight into DNA sequence patterns that govern histone positioning rules, and how distinct regions within nucleosomes impact the transactions of DNA with nuclear factors such as RNA polymerase. 4.1.3 Cysteine Labeling with Biophysical Probes Spectroscopic probes have played a central role in characterizing protein structure and function, and the ability to introduce labels into DNA or by cysteine conjugation into histones makes nucleosomes attractive targets for diverse types of spectroscopy. Biophysical studies on nucleosomes have been comprehensively reviewed recently, 276,306 and we will here only provide a glimpse into the types of probes that are commonly attached to histones. Fluorophores, available in all forms and colors as cysteine reactive dyes, provide a handle to study nucleosome stability and dynamics. FRET measurements were performed with donor-labeled DNA and acceptor-labeled histone octamers, for example, on H2AK119C or H3 V35C, both close to the DNA entry/exit site (Figure 28a and b). 
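The distance dependence that makes such FRET pairs useful as reporters of DNA wrapping follows the Förster relation, E = 1/(1 + (r/R0)^6). The sketch below simply evaluates this relation; the Förster radius of 5.4 nm is an assumed, typical literature value for a Cy3/Cy5 pair and not a number taken from the studies discussed here, and the distances are arbitrary.

```python
def fret_efficiency(distance_nm: float, r0_nm: float = 5.4) -> float:
    """Förster relation: E = 1 / (1 + (r / R0)^6).

    r0_nm is the Förster radius; 5.4 nm is an assumed value for Cy3/Cy5.
    """
    if distance_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be non-negative and R0 positive")
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


# Arbitrary donor-acceptor distances spanning the sensitive range around R0
for r in (3.0, 4.0, 5.4, 7.0, 9.0):
    print(f"r = {r:.1f} nm  ->  E = {fret_efficiency(r):.2f}")
```

The sixth-power dependence means that efficiency changes steeply near R0, so even a modest increase in the separation between a DNA-bound donor and a histone-bound acceptor produces a marked loss of signal.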
307 A decrease in FRET efficiency, corresponding to an unwrapping of DNA, was found under physiological ionic strength, suggesting that nucleosomes breathe to facilitate access of trans acting factors. Luger and co-workers used FRET pairs strategically installed at H2B-T112C and on the histone chaperone, NAP1, to dissect the interaction of this factor with its substrates. 308,309 In addition, this assay could be harnessed to measure nucleosome stability through a coupled equilibrium cycle. 310 FRET-based assays of nucleosome properties have been adapted to the single-molecule level, reviewed in refs (311,312). Zhang et al. probed the interactions of the histone chaperones RbAp48 and ASF1 with H3–H4 complexes by EPR spectroscopy. 313 MTSL ((1-oxyl-2,2,5,5-tetramethylpyrroline-3-methyl)-methanethiosulfonate, 9) spin labels were installed at various positions within the histones to monitor structural changes in the H3/H4 tetramer through pulsed electron–electron double-resonance (PELDOR) 314 spectroscopy (Figure 28c and d). 313,315 This technique can measure distances ranging from 20 to 80 Å, 316 and is therefore ideally suited to probe histone assemblies. Association with RbAp48 disrupted the H3–H3 interaction (probed through labeling at H3Q125C), and changed H3–H4 distance distributions, demonstrating that this chaperone binds an H3–H4 dimer rather than a tetramer, the prominent oligomerization state in solution, and causes major structural rearrangements of the H3–H4 folds. 313 Figure 28 Spectroscopic characterization of nucleosomes and histones. (a) FRET assay to investigate DNA breathing. DNA (gray) is labeled with Cy3 (green star), H3 V35C (black), with Cy5 (red star). DNA unwrapping increases the distance between the fluorophores, leading to a loss in FRET signal. (b) Structure of cysteine-reactive Cy5-maleimide. (c) Structure of the MTSL spin label. (d) Distance measurement with PELDOR between spin labels (arrow) installed at H3Q125C. 
H3 is drawn in blue, H4 in green. (e) Asymmetric positioning of H1 (blue) on the nucleosome core particle. A spin label (arrow) placed at H3K37C perturbs NMR signals in its vicinity (dashed sphere). Introduction of paramagnetic probes into proteins also facilitates characterization of protein–protein interactions by NMR spectroscopy. 317 To investigate how the globular domain of linker histone H1 engages the nucleosome core, Bai and co-workers conjugated MTSL or Mn2+–EDTA complexes to cysteine residues at the periphery of the H1 globular domain, 318 or close to the dyad axis on the nucleosome (H2A-T119C and H3K37C). 319 By identifying NMR signals that are perturbed by the spin labels, the authors were able to triangulate the position of H1 on the nucleosome, and found that the complex is asymmetric; that is, H1 binds DNA on only one of its exit sites (Figure 28e). Thus, site-specific conjugation of probes to cysteine residues strategically engineered into histones has enabled characterization of the structure, stability, and dynamics of nucleosomes. Ever more sophisticated spectroscopic methods facilitate analysis of increasingly large chromatin templates, interactions with bigger complexes, and more detailed aspects of the properties of chromatin, even at the single molecule level. 4.1.4 Installation of PTM Mimics Inspired by the ease with which diverse probes can be conjugated to sulfhydryl groups, 291,320 Simon et al. used a cysteine alkylation strategy to prepare analogues of methyllysine residues. 321 N-Methylated 2-haloethylamines represent convenient reagents for such transformations (Figure 29a). Alkylation of cysteine with the mono- and dimethyl species to yield the thialysine products, KCme1 and KCme2, respectively, proceeds through an aziridine intermediate that is readily formed from the corresponding chloride. 
In contrast, the electrophile needed to produce the trimethyl analogue, KCme3, lacks a lone pair on nitrogen and consequently the ability to form an aziridine, and so requires the stronger bromide leaving group. Methyllysine analogues installed in this way at positions H3K4, H3K9, H3K36, H3K79, and H4K20 were recognized by cognate antisera. In addition, K9Cme2 peptides bound to HP1α, and were further methylated by the HMT Suv39h1. 321 Later studies confirmed that methyllysine analogue-containing nucleosomes could serve as substrates for other histone methyltransferases 322 and demethylases. 323 In general, nucleosomes constructed from alkylated histones behaved exactly like unmodified counterparts, auguring well for the use of these versatile reagents in chromatin biochemistry and biophysics, 321 or even as quantification standards in ChIP experiments. 324 A methylene-to-sulfide substitution causes an increase in length, flexibility, and acidity of the side chain (Figure 29b). 321,325,326 To assess how these structural differences translate into energetic penalties for methyllysine analogue binding by reader modules, Fischle and co-workers investigated computational and experimental models. 327 Experimentally determined ΔΔG values for association of Kme versus KCme substrates to binder modules were highly context dependent, and ranged from −0.2 kcal/mol (the PHD finger of ING1 preferentially binds to H3K4Cme3 over H3K4me3) to +1.2 kcal/mol (L3MBT preferentially binds to H4K20me1 over the analogue H4K20Cme1). In the majority of cases, methyllysine analogues recapitulate the function of the native PTM, although there are isolated examples where this approach falls short. Figure 29 Structure and applications of methyllysine analogues. (a) Synthesis of methyllysine analogues by cysteine alkylation. (b) Comparison between methyllysine and a thioether analogue. (c and d) Subtle local changes in nucleosome structure upon histone lysine methylation.
Modified nucleosomes are depicted in light colors (H3K79Cme2 in light blue, pdb code 3c1c in (c); H4K20Cme3 in pale green, pdb code 3c1b in (d)). Unmodified versions are shown in corresponding dark tones (pdb code: 1kx5). Yellow arrows indicate modified residues. (e) Model of the structure of a nucleosome containing H3K36me3 (yellow arrow) in complex with the PSIP1 PWWP domain (the backbones of neutral, basic, and acidic side chains are shown in white, blue, and red, respectively; pdb code: 3ZH1). The ease of obtaining large amounts of homogeneously modified histones through cysteine alkylation permitted the determination of crystal structures of mononucleosomes containing either H3K79Cme2 or H4K20Cme3 (Figure 29c and d). 89 Both PTMs have known genomic associations, H3K79me2 is found in actively transcribed genes 328 and H4K20me3 is enriched in heterochromatin, 329 but little is known about how and if they modulate chromatin structure and function. Overall, the structures of the modified histones are superimposable with previously solved X-ray structures, but both modifications cause subtle local differences in conformation. Remarkably, nucleosome arrays prepared with H4K20Cme3 or H4K20Cme2 compacted much more readily than arrays containing unmodified H4 or a control protein with H4K20Cme0. This study demonstrates that methyllysine marks can directly affect the biophysical properties of chromatin and need not necessarily rely on protein effectors. Figure 30 Model for HP1 binding to heterochromatin domains. The chromodomain (CD, blue) binds to H3K9me3 (red flag), although this interaction is weak and dynamic. Stable dimerization through the chromoshadow domain (CSD, light blue and pale green) provides a polyvalent scaffold for chromatin binding. Plasticity is granted by the flexibility of the H3 tail and the linker between the HP1 domains (red arrows). 
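To put the ΔΔG range quoted above for analogue versus native methyllysine binding (−0.2 to +1.2 kcal/mol) into more intuitive terms, the values can be converted into ratios of dissociation constants. The sketch below assumes the sign convention implied by the examples in the text, ΔΔG = ΔG(analogue) − ΔG(native) = RT ln(Kd,analogue/Kd,native), and a temperature of 25 °C; both are assumptions, as are the function names.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_ratio(ddg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Kd(analogue) / Kd(native) for a given ddG = dG(analogue) - dG(native)."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


def ddg_from_ratio(ratio: float, temperature_k: float = 298.15) -> float:
    """Inverse conversion: ddG in kcal/mol for a given Kd ratio."""
    if ratio <= 0:
        raise ValueError("Kd ratio must be positive")
    return R_KCAL * temperature_k * math.log(ratio)


for ddg in (-0.2, 1.2):
    print(f"ddG = {ddg:+.1f} kcal/mol  ->  Kd(analogue)/Kd(native) = {kd_ratio(ddg):.2f}")
```

Under these assumptions, +1.2 kcal/mol corresponds to the analogue binding roughly 7.6-fold more weakly than the native mark, and −0.2 kcal/mol to about 1.4-fold tighter binding, which illustrates that the penalties are modest in most contexts.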
The interaction between HP1 and nucleosomes marked with H3K9me3 is central to the formation of large repressive chromatin domains. 330 HP1 self-assembles into dimers and higher order oligomers through its C-terminal chromo-shadow domain (CSD), providing a polyvalent scaffold. Designer chromatin featuring H3 with the trimethyllysine analogue KCme3 at position 9 has enabled detailed studies of how HP1 recognizes its targets. For instance, biophysical studies based on NMR spectroscopy by Munari et al. revealed that only the CD (note, this module is distinct from the CSD) of HP1 stably contacts mononucleosomes bearing the H3K9Cme3 mark (Figure 30). 331 Additionally, the highly charged hinge region that bridges the N-terminal CD and dimerization domain interacts nonspecifically with DNA and nucleosomes. Overall, the complex of HP1 with an H3K9Cme3-mononucleosome displays considerable flexibility, granted by the mobility of the H3 tail and the HP1 hinge. 331,332 This plasticity presumably enables selection of diverse substrate arrangements. Notably, HP1 binding to heterochromatin domains is highly mobile in vivo, as evidenced by fluorescence recovery after photobleaching (FRAP) on the time scale of seconds. 333 Accordingly, the binding of methylated mononucleosomes by HP1 is weak, and, in the case of the yeast homologue Swi6, rather unselective. 334 Instead, Canzio et al. demonstrated that specificity for heterochromatin domains is imparted by cooperative engagement of HP1 oligomers with multiple H3KC9me3-modified nucleosomes in arrays. 334 Intriguingly, polyvalent binding based on multiple weak interactions is a typical feature of systems that undergo phase transition to provide compartments with liquid-like properties, 335 and might govern the formation of spatially distinct chromatin domains. Site-specifically modified histones are crucial reagents for structural analysis of PTM binding modules that are sensitive to the nucleosomal context. 
The molecular recognition of methylated H3K36 is a case in point, due to its proximity to nucleosomal DNA. Peptides carrying K36me3 bind poorly to the PWWP domain (a methyllysine binding domain characterized by a Pro-Trp-Trp-Pro sequence) of the coactivator PSIP1. 336 Binding studies with nucleosomes modified with K36Cme3 revealed that a basic surface on the PWWP domain reinforces K36me3 engagement. 337 NMR analysis of K36Cme3-containing nucleosomes and the PSIP1 PWWP domain, enabled by strategic isotopic labeling of the methyl groups of isoleucine, leucine, and valine (methyl-TROSY), 338 led to the construction of a model of this multivalent interaction (Figure 29e). 337 Binding of H3K36me3 by the aromatic cage of the PWWP domain positions the basic surface of PSIP1 proximal to the nucleosomal DNA, thereby reinforcing binding approximately 10 000 fold. Methyllysine analogues have also aided in characterizing histone modifying enzymes that require or prefer nucleosome substrates. For instance, the methylation state specificity of the methyltransferases NSD2 and SET2 was elucidated using mononucleosomes containing methyllysine analogues in place of Lys36. 339 Histone methyltransferase assays demonstrated that NSD2 mediates mono- and dimethylation, whereas SET2 is capable of trimethylating H3K36. Similarly, the interplay between activating trimethyl marks and the repressive PRC2-dependent methylation at Lys27 was studied using site-specifically modified mononucleosomes. 340 Templates carrying the methyllysine surrogates H3KC4me3 or H3KC36me3 were less efficiently methylated by PRC2, further illustrating the diversity of histone PTMs that this complex can sense. Roeder and co-workers have harnessed modified chromatin templates to characterize the biochemistry of transcription. To this end, histone octamers can be loaded onto a supercoiled plasmid backbone with the help of the chaperone NAP1 and the remodeler ACF in the presence of ATP. 
281,341 When octamers carrying H3K4Cme3 are used, the impact of this mark, commonly associated with active genes, on transcription can be studied. H3K4Cme3 facilitated the recruitment of the preinitiation complex (an assembly of several general transcription factors that guide the positioning of RNA polymerase II to transcription start sites) to promoters, 342,343 thereby increasing transcription. 342 In addition, this analogue enhanced transcription activation by the activator p53 and the coactivator p300. 344 Strategies to prepare analogues of several other histone PTMs have also been reported. For example, cysteine can be modified with N-vinyl-acetamide in a radical-promoted thiol–ene reaction (Figure 31a). 345 This transformation results in a thia-analogue of Kac, KCac. Histones bearing H4KC16ac were recognized by an H4K16ac-specific antibody, and the PTM-analogue caused chromatin decompaction, which is a hallmark of this modification. An acetyllysine analogue featuring a methylthiocarbamoyl group, installed through alkylation of cysteine with an aziridine moiety (Figure 31b), was found to be resilient toward the deacetylases HDAC8 and Sir2. 346 When placed at position 5 or 8 of H4 tail peptides, this mimic is recognized by the bromodomain of Brdt, albeit with slightly lower affinity than the corresponding acetylated peptides. H3 variants functionalized with methylthiocarbamoyl-thialysine cross-reacted with designated antibodies, providing further evidence that this acetyllysine analogue is suitable for biochemical studies, in particular when HDAC resistance is desirable. Figure 31 Synthesis of (a,b) acetyllysine and (c) methylarginine analogues from cysteine-containing histones. Methylarginine analogues with different geometries were similarly prepared by a conjugate addition of a Michael acceptor to cysteine-containing histones (Figure 31c). 
347 Besides the methylene-to-sulfur substitutions, the resulting residues contain an amidine functional group rather than the native guanidinium group. Nevertheless, full-length histones and peptides displaying methylarginine analogues were recognized by cognate antibodies and the H4R3me2a-binder TDRD3. The chemoselectivity of cysteine functionalization with versatile probes has been a workhorse in protein biochemistry and biophysics. Serendipitously, this reaction can be considered bio-orthogonal in chromatin research, because there is only one conserved yet nonessential cysteine residue in histones. Spectroscopic probes and cross-linkers, attached to engineered cysteines, have informed on the structure and dynamics of nucleosomes as well as the chromatin fiber. In addition, PTM analogues generated from cysteine mutants have contributed valuable data on the structural effects and molecular recognition of histone modifications, especially in the case of lysine methylation. 4.2 Synthetic Biology Meets Chromatin Research The genetic code specifies 20 standard amino acids. In addition, natural mechanisms exist to expand the scope of genetically encoded building blocks to include selenocysteine and pyrrolysine. The same strategies have been exploited to incorporate unnatural monomers into proteins in cells (Figure 32). 348 This methodology relies on the ability to suppress a stop-codon with a tRNA containing a complementary anticodon. The tRNA must be charged with the desired nonstandard amino acid in vivo using an engineered aminoacyl-tRNA synthetase (aaRS). Importantly, this system needs to be orthogonal to the cell’s endogenous apparatus in two ways: (a) the exogenous tRNA and amino acid must be recognized only by the exogenous aaRS, but not by any endogenous aaRSs, and (b) the exogenous aaRS must utilize only the exogenous building blocks, but none of the cell’s natural raw materials. 
This feature is usually achieved through elaborate directed evolution schemes, starting with a tRNA-aaRS pair from a different host organism such as the tRNACUA (complementary to the “amber” UAG stop codon) and the Pyrrolysyl-tRNA synthetase (PylRS) from Methanosarcinae. A comprehensive review by Lang and Chin on strategies to incorporate unnatural amino acids into proteins has recently been published in this journal. 349 Figure 32 Incorporation of nonstandard amino acids (blue) into proteins in E. coli. An engineered orthogonal aaRS (top) charges a cognate tRNA with a designated amino acid, but does not interact with natural amino acids or E. coli tRNAs (bottom). Similarly, neither the exogenously introduced tRNA (here from M. barkeri) nor the nonstandard amino acid is recognized by any aaRS from E. coli (gray). Translation of the unnatural amino acid occurs opposite an amber stop codon (UAG). 4.2.1 Genetic Incorporation of Acetyllysine Residues In the context of histones, the ability to genetically encode residues containing PTMs has been tremendously useful. In particular, acetyllysine can be integrated into ribosomal protein synthesis using an engineered aaRS originally dedicated to pyrrolysine in Methanosarcina barkeri. 350 Neumann et al. harnessed the amber suppression strategy to define the biophysical and biochemical effects of H3K56 acetylation. 351 By replacing the codon that specifies lysine 56 with an amber stop codon, and supplementing the growth medium with acetyllysine, the authors were able to produce H3 homogeneously modified with K56ac in E. coli carrying the orthogonal tRNACUA and the evolved aaRS. This protein was subsequently incorporated into nucleosomes and nucleosome arrays using standard techniques. Surprisingly, K56ac did not alter chromatin compaction, and only had minor effects on chromatin remodeling by bromodomain-containing motor proteins. 
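The codon replacement step described above can be sketched in a few lines of Python. The example swaps one codon of an open reading frame for the amber codon (TAG at the DNA level) and refuses to proceed if the frame already contains an in-frame TAG, since that codon would also be suppressed. The toy sequence and the function name are hypothetical and do not correspond to the real H3 gene; the sketch also assumes that residue 1 is the first codon of the supplied sequence, which may differ from histone numbering conventions.

```python
AMBER = "TAG"                # DNA-level amber stop codon (UAG in the mRNA)
LYS_CODONS = {"AAA", "AAG"}  # the two lysine codons of the standard genetic code


def introduce_amber(orf, residue_number, expected_codons=LYS_CODONS):
    """Replace the codon at a 1-based residue position with the amber codon.

    Assumption: residue 1 corresponds to the first codon of `orf`.
    Raises ValueError if the target codon is not one of `expected_codons`
    or if the frame already contains an in-frame amber codon.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    index = residue_number - 1
    if not 0 <= index < len(codons):
        raise ValueError("residue number outside the ORF")
    if codons[index] not in expected_codons:
        raise ValueError(f"codon {codons[index]} is not an expected codon")
    if AMBER in codons:
        raise ValueError("ORF already contains an in-frame amber codon")
    codons[index] = AMBER
    return "".join(codons)


if __name__ == "__main__":
    toy_orf = "ATGGCTAAGCGTTAA"  # hypothetical: Met-Ala-Lys-Arg-stop
    print(introduce_amber(toy_orf, 3))
```

The check for a pre-existing in-frame TAG also catches a gene that terminates with an amber codon, in which case the natural stop would need to be changed before suppression is attempted.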
In contrast, single-molecule FRET measurements revealed that DNA “breathing” was enhanced in K56ac-containing mononucleosomes as compared to unmodified versions, consistent with the position of K56 close to the DNA entry/exit site (Figure 33). 351 K56ac marks also facilitated binding of the pluripotency factor Oct4 to nucleosomes in vitro, 352 yet inhibited interactions with the components of the yeast silencing apparatus Sir2–4. 353 The modularity of the genetic acetyllysine incorporation strategy enabled Schneider and co-workers to insert this modified residue at several H3 sites and study the impact of site-specific histone acetylation on transcription. H3K64ac 354 and H3K122ac, 355 both present at the lateral surface of the histone octamer, 356 are found at the transcription start site (TSS) of actively transcribed genes. In vitro assays demonstrated that these PTMs, presumably installed by the histone acetyltransferase p300, stimulate transcription and facilitate histone eviction by NAP1. 354,355 In addition, K64ac directly destabilizes nucleosomes, as evidenced by FRET measurements. 354 Figure 33 K56ac (black) increases breathing of nucleosomal DNA. The Schultz and Carell laboratories have independently reported that amber suppression methods can also be harnessed to produce histones with crotonyllysine residues in E. coli. 357,358 Schultz and co-workers evolved a PylRS to recognize this residue, enabling the biosynthesis of H2B with a crotonyl modification on Lys11. 357 Carell and co-workers speculated that wild-type PylRS is able to accommodate modified lysines as well. 358 Indeed, this enzyme could be used to prepare H3 crotonylated at Lys9. In addition, PylRS tolerated propionyl- and butyryllysine, providing access to H3 with the corresponding residues at position 9. 4.2.2 Genetic Incorporation of Protected Species Direct incorporation of methyllysine residues into proteins through reengineered aaRSs has so far been unsuccessful. 
359,360 In contrast, the spacious binding pocket of Pyl-tRNA synthetase is ideally suited to accommodate protected lysine species. This feature has been harnessed to achieve the incorporation of N-ε-Boc- or N-ε-Alloc-protected methyllysine residues into histones (Figure 34a). 360,361 Post-translational deprotection using dilute TFA or a ruthenium complex for Boc- and Alloc-protected building blocks, respectively, afforded monomethylated histones. H3K9me1 produced in this way interacted specifically with cognate antibodies and HP1, thus demonstrating the authenticity of the epitope. 360 Dimethyllysine can be incorporated into histones using a molecular detour involving a more elaborate protecting group scheme (Figure 34b). 359 First, N-ε-Boc-lysine was site-specifically incorporated at H3K9. Global protection of all other amine groups using Cbz-OSu ensued, followed by chemoselective Boc deprotection in TFA/H2O. Reductive methylation with formaldehyde and a borane reagent preceded global deprotection with a mixture of trifluoromethanesulfonic acid, TFA, and dimethylsulfide. The resulting protein, modified with H3K9me2 to >90%, was recognized by HP1, as expected. Figure 34 Strategies to genetically encode methyllysine residues. (a) Incorporation of protected Kme1-species. (b) Protection-modification scheme to access Kme2-containing proteins. Boc-protected lysine is incorporated into histones through amber suppression. Orthogonal protection of other lysine residues, followed by removal of the Boc group and reductive alkylation enables site-specific modification. Global deprotection then provides the desired histone. TFMSA = trifluoromethanesulfonic acid, TFA = trifluoroacetic acid, DMS = dimethylsulfide. A range of additional PTM analogues have been incorporated into histones upon addition of phenylselenocysteine, a caged electrophile, to the genetic code. 
362,363 Being prone to oxidation, this residue readily undergoes selenoxide pyrolysis in the presence of H2O2 to yield dehydroalanine, a Michael acceptor (Figure 35). 364 Subsequent functionalization with N-acetylcysteamine or N-methylcysteamines provides acetyllysine and methyllysine analogues. 363 H3 variants containing H3K9Cac or a thialysine control residue prepared in this way confirmed that the Ser10-targeting kinase AuroraB is sensitive to acetylation at the neighboring Lys9. 363,365 The yield of dehydroalanine-containing histones can be improved through the use of selenocysteine derivatives that are incorporated into proteins more readily. 366 Overall, however, this strategy is limited by undesired oxidation of methionine residues, 362 as well as a lack of stereochemical control; that is, both diastereoisomers are generated. 367 Figure 35 Biosynthetic incorporation of PTM analogues through dehydroalanine intermediates. Dehydroalanine can be generated through selenoxide pyrolysis (left) or cysteine-specific reagents (right). Michael addition of thiols to dehydroalanine generates PTM analogues, albeit with loss of stereochemical information. Chalker et al. devised a mild approach to generate dehydroalanine sites from cysteine residues, thereby bypassing the need for unnatural amino acid incorporation. 368 2,5-Dibromohexanediamide selectively bis-alkylates cysteine at sulfur and causes elimination (Figure 35). Subsequent derivatization of dehydroalanine with sulfur-containing nucleophiles provided analogues for acetyllysine, methyllysine, phosphoserine (i.e., phosphocysteine), and GlcNAc-serine. 369 Phosphocysteine at residue 10 of H3 was detected by an anti-S10ph antibody, H3K9Cac was deacetylated by HDAC1 and HDAC2, and even doubly modified histones carrying two copies of either dimethyllysine or acetyllysine analogues at positions 4 and 79 could be prepared, demonstrating the versatility of the approach. 
While this mild method steers clear of problems with methionine oxidation, PTMs are still incorporated as stereochemical mixtures, which may be of concern in certain instances. While amber suppression is typically limited to one residue per protein, more sophisticated approaches to increase the efficiency of incorporating specific 370 or multiple 371−376 unnatural amino acids have been developed. For example, by using in vitro translation systems with cell extracts derived from E. coli strains with deleted release factors, Mukai et al. were able to produce H4 acetylated simultaneously at K5, K8, K12, and K16. 375 In addition, the design of innovative systems to add complex building blocks, including phosphoserine 377 or photocaged amino acids, 378,379 to the genetic code further strengthens the potential of synthetic biology to contribute to chromatin biochemistry. 4.2.3 A Synthetic Biology Strategy To Probe Chromatin Structure in Vivo The key advantage of the synthetic biology approach is that, in principle, histones can be generated with site-specific modifications in vivo and studied in situ, as demonstrated by Neumann and co-workers in an impressive study on mitotic chromatin compaction. 380 Use of amber suppression allowed the introduction of a BPA photo-cross-linker at position 58 of H2A to report on chromatin condensation in live yeast (Figure 36). Upon UV irradiation, a cross-link to H4 formed, presumably mediated by the H4 tail engaging the acidic patch (see also section 4.1.1). Indeed, deletion of H4 tail fragments or mutagenesis of H4K16 to alanine reduced the magnitude of cross-linking. Intriguingly, the H2A–H4 interaction was highly dependent on cell cycle stage, peaking during M phase in synchrony with aurora B-dependent phosphorylation of H3S10, a classical mitotic marker. 381,382 In contrast, the K16ac mark is anticorrelated with cross-linking, that is, at its lowest during M phase, suggesting that this PTM prevents chromatin condensation in vivo. 
380 In agreement with this conjecture, cross-linking in yeast carrying an H4K16R mutant was not cell-cycle dependent. Dissection of the signaling pathway led to a model where H3S10ph recruits the deacetylase Hst2p, and this interaction was required for H2A–H4 cross-link formation and hence chromatin compaction. This study traced the signaling pathway that governs chromatin condensation during cell division, and simultaneously demonstrated that synthetic biology strategies are well suited to investigate chromatin biology in live cells. Figure 36 Schematic of the signaling cascade that controls chromatin condensation during mitosis. Decompacted chromatin, partially labeled with a photo-cross-linker (BPA, red star), is acetylated at H4K16 (yellow flag). Upon entry into M phase, aurora B kinase phosphorylates H3S10 (blue lollipop), which recruits the HDAC Hst2p (blue). Once H4K16 is deacetylated, chromatin compacts, as observed by an H2A–H4 cross-link (green star). 4.3 Chemical (Semi)-Synthesis of Histones Solid-phase synthesis has provided access to peptides containing essentially any histone PTM in the context of histone tails. The virtues of chemical synthesis can be extended to the manufacture of site-specifically modified histones and chromatin templates through convergent assembly strategies. Rather than synthesizing histones as a single linear chain, two or more fragments are prepared individually by SPPS and ligated after purification. The archetypical methodology for this purpose is native chemical ligation (NCL), that is, the condensation of a synthetic peptide α-thioester with a second peptide carrying an N-terminal cysteine (Figure 37). 383,384 The reaction is initiated by trans-thioesterification to join the two peptide fragments, arranging the thioester intermediate for an intramolecular S-to-N acyl shift to yield a native amide bond. 
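The convergent logic of NCL can be illustrated with a small planning helper. Given a protein sequence and a chosen junction, the Python sketch below returns the N-terminal segment (to be prepared as an α-thioester) and the C-terminal segment (which must begin with cysteine), and reports whether the junction uses a native cysteine or requires a substitution. The sequence in the example is a made-up toy peptide rather than a histone, and the class, function, and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LigationPlan:
    thioester_fragment: str   # N-terminal segment, made as an alpha-thioester
    cys_fragment: str         # C-terminal segment, begins with cysteine
    mutation: Optional[str]   # e.g. "T5C"; None if the junction residue is a native Cys


def plan_ncl(sequence, junction_residue):
    """Split `sequence` for NCL at a 1-based junction position.

    `junction_residue` is the residue that becomes the N-terminal cysteine of
    the second fragment. If it is not already Cys, the required substitution
    is reported so that it can be tolerated, reverted, or masked later.
    """
    seq = sequence.upper()
    if not 2 <= junction_residue <= len(seq):
        raise ValueError("junction must leave a non-empty N-terminal fragment")
    i = junction_residue - 1
    native = seq[i]
    mutation = None if native == "C" else f"{native}{junction_residue}C"
    return LigationPlan(
        thioester_fragment=seq[:i],
        cys_fragment="C" + seq[i + 1:],
        mutation=mutation,
    )


if __name__ == "__main__":
    toy = "GSAKTLRCAV"  # hypothetical sequence used only for illustration
    print(plan_ncl(toy, 5))  # junction at Thr5 requires a Thr-to-Cys substitution
    print(plan_ncl(toy, 8))  # junction at the native Cys8 needs no substitution
```

Reporting the substitution explicitly makes it easier to compare candidate junctions, since a junction at a native cysteine leaves no trace in the ligated product.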
Importantly, thioesters, often used in nature for acyl group transfers (including in nonribosomal peptide synthesis), 385 are soft electrophiles. They are therefore uniquely activated toward thiol nucleophiles at neutral pH and react much more sluggishly with the harder O- and N-nucleophiles also present in proteins. 386,387 In addition, although trans-thioesterification can proceed with internal cysteine residues, this side reaction is reversible because the absence of a proximal amino group precludes stable amide bond formation. Because of its exquisite chemoselectivity, NCL can be performed with unprotected peptide segments in water. In this section, we will discuss various convergent synthetic and semisynthetic strategies to produce designer histones, as well as applications of these powerful reagents in chromatin biochemistry. Figure 37 Mechanism of native chemical ligation (NCL). 4.3.1 Semisynthesis of N-Terminally Modified Histones Because of the frequency of PTMs on the N-terminal histone tails, semisynthetic approaches are ideally suited for the preparation of designer histones. Semisynthesis integrates the expanded scope of peptide chemistry with the ease of generating biopolymers recombinantly (Figure 38). 388 Specifically, peptides encompassing the PTM site are chemically synthesized as α-thioesters and reacted with histone fragments carrying an N-terminal cysteine. Peptide α-thioesters can be directly synthesized on solid phase using Boc chemistry (Figure 39a), but special measures are required to access α-thioesters via Fmoc-SPPS due to the copious use of base necessary for Fmoc deprotection. 389 Most simply, fully protected peptides, synthesized on very acid-labile chlorotrityl resins, can be converted to α-thioesters in solution using HBTU in the presence of thiols (Figure 39b). Epimerization at the C-terminus is very common in this procedure, and much care is needed to avoid this unwanted side reaction. 
390,391 Preferably, peptides can be synthesized on 2-hydroxy-3-mercaptopropionic acid, 392 diaminobenzoyl, 393 or hydrazine 394 linkers, allowing activation of the C-terminal residue via O-to-S acyl shift, as an acylurea or as an acylazide moiety, respectively (Figure 39c–e). Recombinantly produced histone fragments bearing an N-terminal cysteine are typically generated from a fusion protein using site-specific proteolysis. Factor Xa, 395 TEV protease, 396 thrombin, 397 and SUMO 398 protease are most frequently used in this context. 399,400 Alternatively, in some cases precise removal of the N-terminal methionine residue in E. coli can provide protein fragments featuring a newly exposed N-terminal cysteine. 401 Given the scarcity of cysteine residues in histones, an engineered cysteine has to be placed strategically at a desired ligation junction. Figure 38 Protein semisynthesis by native chemical ligation. Figure 39 Comparison of peptide α-thioester synthesis by Boc- (a) and Fmoc-SPPS (b–e). (a) Synthesis of peptide α-thioesters on a mercaptopropionic acid linker by Boc-SPPS. (b) Direct conversion of a protected peptide acid into an α-thioester. (c) Latent thioester synthesis on a 2-hydroxy-3-mercaptopropionic acid linker. (d) α-Thioester synthesis through an acylurea intermediate. (e) Acyl-hydrazide method for α-thioester synthesis. Pg = protecting group. Figure 40 Semisynthesis of H3S10ph. A synthetic peptide is converted into an α-thioester in solution with HBTU and benzyl mercaptan (top). Simultaneously, a recombinant fragment with an N-terminal cysteine residue (in place of Thr32) is prepared by site-specific proteolysis using Factor Xa (middle). Joining of the two fragments by NCL yields full-length H3 site-specifically phosphorylated at Ser10. A T32C mutation remains at the ligation junction (below). Pg = protecting group. In 2003, Shogren-Knaak et al. 
reported the first preparation and application of a semisynthetic histone, H3 containing S10ph (Figure 40). 402 A synthetic phosphopeptide encompassing residues 1–31 was synthesized by Fmoc-based SPPS and converted into an α-thioester following cleavage of the protected peptide from the resin. In parallel, an H3 fragment consisting of residues 33–135 furnished with an N-terminal cysteine in place of Thr32 was produced recombinantly. The fragments were joined by NCL, and purified by ion exchange chromatography. The resulting H3 variant as well as a control protein synthesized with unmodified Ser10 and also containing the T32C mutation were subsequently incorporated into nucleosome arrays. In agreement with previous reports based on peptide substrates, 403,404 semisynthetic chromatin modified with S10ph was acetylated more readily by GCN5 than the control array. 402 Surprisingly, however, in the context of chromatin, S10ph did not stimulate acetylation by the SAGA complex (which contains GCN5), suggesting that other subunits override the expected preference. Shortly thereafter, He et al. described the traceless semisynthesis of several methylated and acetylated histones. 401 Peptide α-thioesters corresponding to residues 1–24 of H3 or 1–14 of H4 were synthesized via Boc-SPPS on a mercaptopropionic acid linker or by Fmoc SPPS employing a postcleavage thioesterification. 405 NCL with appropriate recombinant fragments (H3 residues 26–135 and H4 residues 16–102, both with an additional N-terminal cysteine) yielded H3 variants containing K9me3 or acetyl marks at K4, K9, K14, K18, and K23, as well as H4 acetylated at K5 and K8 or K5, K8, and K12 (Figure 41a). 401 Notably, the ligation junctions were judiciously chosen to entail Ala-to-Cys substitutions. These mutations were reverted in the full length proteins through hydrogenolytic desulfurization of cysteine to alanine in the presence of Raney nickel. 
406 The resulting modified histones and unmodified controls were successfully assembled into chromatin, attesting to their integrity. 401 To scrutinize the effect of histone acetylation on chromatin structure and remodeling, several research groups have produced site-specifically acetylated nucleosomes and arrays. In a landmark study, Peterson and co-workers harnessed histone semisynthesis to investigate the biophysical and biochemical consequences of acetylating H4K16, 407 a key modulator of chromatin structure and function in development and disease. 408 Because of the lack of alanine residues in proximity to K16, the authors chose to use an R23C mutation to mediate native chemical ligation. Nucleosome arrays carrying K16ac and R23C, but not control variants with only the R23C mutation, displayed drastically reduced propensity to compact and self-assemble. 407 In addition, H4K16ac stimulated acetylation on H3 by the SAGA complex in a bromodomain-dependent manner, 409 and slightly inhibited chromatin remodeling by the ACF complex. 407 In another example, Ferreira et al. studied Snf2-dependent remodeling of nucleosomes acetylated at H3 (at lysines 9, 14, 18, and 23) or H4 (at lysines 5, 8, 12, and 16). 410 To enable NCL, cysteine residues were engineered in place of Ser28 and Val21, respectively. Native gel electrophoresis was used to monitor the position of a nucleosome (which affects its electrophoretic mobility) within a short DNA fragment. Remodeling assays revealed that acetylation at H3, especially at Lys14, drastically increased recruitment of the enzyme complex to modified nucleosomes. This effect was later shown to be mediated by the bromodomain of SWI/SNF. 411 By contrast, acetylation of H4 inhibited remodeling of mononucleosomes by Chd1 and Isw2 in vitro. 410 No such inhibition could be detected in nucleosome arrays containing semisynthetic H4K16ac, however. 
412 Nordenskiöld, Liu, and co-workers dissected the contribution of H4 acetylation marks to chromatin compaction and self-assembly. 413 H4 peptide α-thioesters, monoacetylated at K16, triacetylated at K5, K8, and K12, or tetra-acetylated at K5, K8, K12, and K16, were ligated to truncated H4 using a K20C mutation (Figure 41b). The ligation “scar” was neatly covered up by alkylation of the cysteine with bromoethylamine. Nucleosome arrays containing differentially acetylated H4 tails were subjected to compaction and self-assembly assays. Acetylation at K16 was specifically found to disrupt intra-array folding 413 (which is consistent with previous work) and mononucleosome aggregation. 414 In contrast, nucleosome array aggregation was dependent on the number of acetylation marks rather than their position. 413 These results suggest that intra-array folding and interarray assembly are governed by specific interactions and coarse electrostatic effects, respectively. Figure 41 Strategies to fix ligation scars in histone semisyntheses. (a) Conversion of Cys to Ala by desulfurization with Raney nickel in the semisynthesis of H3K9me3. (b) Alkylation of cysteine with bromoethylamine to produce thialysine in the semisynthesis of H4K16ac. (c) Radical-based desulfurization in the semisynthesis of H2B-S14ph. Semisynthesis of site-specifically acetylated and phosphorylated H2B informed on the ability of Mst1 kinase and a commercial antibody to recognize modified targets. 415 H2B tails corresponding to residues 1–16 were synthesized as latent thioesters containing S14ph, four acetyl marks, or a combination of these PTMs (Figure 41c). The truncated histone (residues 18–125) was produced in E. coli with a Met-Cys dipeptide leader. The initiator Met was spontaneously removed in vivo, and treatment of the protein with methoxylamine freed the N-terminal cysteine from thiazolidine adducts. 
Upon ligation, the full-length histones were desulfurized to restore Ala17, rendering the procedure traceless. In this process, the radical desulfurization approach developed by Danishefsky and co-workers 416 was found to be superior to Raney-nickel treatment. 415 Mst1 was able to phosphorylate unmodified and polyacetylated H2B in vitro. Immunoblots of semisynthetic H2B variants with an antibody directed against S14ph, however, clearly demonstrated that H2B acetylation masks the epitope of this antibody. This finding agrees with the general conclusions from peptide array-based mapping of antibody specificity (discussed in section 3.12), jointly raising the issue that PTM detection needs to be considered in the context of neighboring marks, that is, epitope occlusion. The examples described above illustrate the diversity of histones with N-terminal modifications that can be generated using protein semisynthesis. Recent developments in the synthesis of peptide α-thioesters using Fmoc chemistry and commercially available modified α-thioester peptides have made protein semisynthesis an amenable approach for molecular biology laboratories. When ligation junctions are chosen appropriately, chemical traces of semisynthesis can be removed through desulfurization strategies or covered up with the installation of amino acid analogues. 4.3.2 Semisynthesis of C-Terminally Modified Histones To access modifications at the C-terminal tails of histones, a short synthetic peptide bearing the PTM and an N-terminal cysteine is condensed with an α-thioester encompassing the majority of the histone sequence. Protein α-thioesters can be obtained recombinantly with the help of inteins, 417−419 protein domains that effect their own excision from a protein precursor. 388,420 The splicing reaction is initiated by an N-to-S acyl shift at a cysteine residue at the N-terminus of the intein (Figure 42a). 
Subsequently, the linear thioester is converted to a branched thioester by an acyl transfer to a cysteine residue adjacent to the intein domain. Amide bond cleavage through succinimide formation liberates an amine, onto which the acyl chain collapses to yield a new, native peptide bond joining the protein sequences that previously flanked the intein. When succinimide formation is precluded by an Asn-to-Ala mutation, the internal thioester intermediates can be captured by exogenously added thiols (Figure 42b). 421 The resulting protein α-thioester can be purified and used for NCL, a process usually referred to as expressed protein ligation (EPL). 388,420 Figure 42 Intein-mediated protein splicing. (a) Mechanism of intein autoprocessing. (b) Recombinant preparation of a protein α-thioester using a mutated intein. Thiolysis is mediated by a large excess of soluble thiol, such as sodium 2-mercaptoethanesulfonate (MesNa). Ottesen, Poirier, and co-workers have taken advantage of EPL to synthesize H3 and H4 variants with PTMs close to the C-termini. 422−425 Lys115, Lys122, and Thr118 of H3 are all positioned in proximity to the nucleosome dyad axis where they form contacts with DNA, and Lys77 and Lys79 of H4 interact with DNA on the lateral surface (Figure 43a). Mutation of these residues affects DNA-templated processes in yeast, 426 and acetylation of the lysine residues or phosphorylation of Thr118 might similarly control nucleosomal functions. This conjecture was directly addressed by protein semisynthesis. An H3 fragment (residues 1–109) was fused to an intein to generate an α-thioester, which was subsequently ligated to a synthetic peptide carrying the K115ac and K122ac marks, taking advantage of the only conserved naturally occurring cysteine in histones (Cys110, Figure 43b). 425 Formation of nucleosomes with the modified H3 was possible, although the acetylation marks decreased the affinity of the histone octamer for DNA. 
Note that nucleosomes containing the corresponding K-to-Q mutations did not affect DNA binding, thereby illustrating the need for native PTMs rather than amino acid surrogates. Once formed, the acetylated nucleosomes were shown to slide more easily than unmodified counterparts without affecting DNA breathing. Nucleosomes phosphorylated at H3T118, prepared in the same fashion, displayed a similar phenotype. 424 DNA at the nucleosome dyad is substantially more accessible, and unusual nucleosome architectures were observed in the presence of H3T118ph. 423,424 These topologies might include structures where the DNA loops around two octamers to minimize crossing over the phosphorylated dyad axis. 427 The H4 variant was synthesized from a recombinant α-thioester (residues 1–75) and a synthetic peptide acetylated at K77 and K79, using Ala76Cys for the ligation (Figure 43c). 422 Upon ligation and desulfurization, the modified histone was incorporated into nucleosomes. FRET and DNA accessibility measurements demonstrated that acetylation of lysine residues at the lateral surface increases DNA breathing, as hypothesized. Figure 43 Preparation of modified histones by EPL. (a) Location of selected residues at the DNA binding surface. Residues on H3 and H4 are indicated with blue and green arrows, respectively. For clarity, labels are only placed on one copy of each histone. (b) Semisynthesis of H3 with acetyl marks close to the C-terminus using the native C110 for NCL. (c) EPL strategy to synthesize H4K77,79ac via an Ala76Cys mutation. Figure 44 Streamlined expressed protein ligation to synthesize H2B-K120ac. Intein self-assembly is harnessed for affinity purification in a column-format. α-Thioester intermediates are subsequently captured by washing the column with excess thiols. The isolated H2B α-thioester is condensed with a synthetic peptide containing an N-terminal cysteine and the K120ac modification. 
The ligation product is subsequently desulfurized to render the process traceless. Recent discoveries of ultrafast inteins that are naturally split, that is, they perform protein splicing from two separate polypeptides, have improved the production of recombinant protein α-thioesters. 428,429 Because the individual intein segments associate with high affinity, their assembly can be co-opted for purification. 430 In this streamlined EPL, the C-terminal intein fragment is immobilized on a solid support, and incubated with crude lysate from bacteria producing the protein of interest fused to the N-terminal intein portion (Figure 44). The intein segments assemble into a tight complex, allowing the removal of contaminating proteins by washing. Elution of the protein α-thioester is achieved by incubation with excess exogenous thiol. This expedited procedure was applied to access an H2B α-thioester (residues 1–116), which was subsequently ligated with a synthetic peptide acetylated at K120. Figure 45 Thiolated amino acid derivatives used for NCL. Development of strategies to extend NCL and EPL to involve noncysteine residues can alleviate some constraints on choosing ligation junctions. 431 Thiolated derivatives of lysine (e.g., 10) 432 and arginine (e.g., 11) 433 are particularly useful in the context of histones, and the ability to selectively desulfurize thiolated aspartate analogues (e.g., 12) in the presence of cysteine 434 may find use in the synthesis of native H3 (Figure 45). Amino acids related to 10–12 are compatible with genetic code expansion, suggesting that biosynthetic strategies can complement chemical methods to provide raw materials for NCL reactions at noncysteine sites. 435,436 A variation on the NCL theme is the use of α-thioacid capture to join histone fragments. 437 In this approach, an intein-derived α-thioester is converted into an α-thioacid with H2S. 
This nucleophile reacts rapidly with activated disulfides to form an acyldisulfide, which rearranges by an S-to-N acyl shift (Figure 46). Upon reduction, a cysteine residue with a native amide bond is obtained. The same strategy can be extended to the ligation of a synthetic peptide α-thioacid with a recombinant protein carrying an N-terminal cysteine, previously activated as an asymmetric disulfide with 2,2′-dithiobis(5-nitropyridine). H3 variants with K4me2 and without modification have been synthesized using this strategy. In certain cases, the fast reaction rate (on the order of minutes) of the α-thioacid capture and the ensuing acyl transfer steps could outweigh the extra steps needed for this mode of protein semisynthesis. Figure 46 Histone semisynthesis using a thioacid capture strategy. A truncated histone-intein conjugate is converted to a thioacid with H2S (left). This fragment is coupled with a C-terminal peptide, activated as an asymmetric disulfide (right). Disulfide exchange is followed by an intramolecular acyl shift and reduction to a native cysteine residue. By combining intein technology with peptide chemistry, histones bearing PTMs at their C-terminal tails are readily accessible. Extensions and variations of these methodologies further broaden the scope of potential histone targets, thereby contributing to elucidating the molecular function of the flexible histone C-termini. 4.3.3 Enzyme-Assisted Semisynthesis of Modified Histones Enzyme-catalyzed ligation reactions harbor great potential for the semisynthesis of proteins. Sortases, bacterial transpeptidases that cross-link proteins of cell walls, are particularly promising. 438 Sortase A recognizes a pentapeptide stretch (LPxTG) at the C-terminus of its substrate. The enzyme cleaves the terminal glycine residue and concomitantly forms a thioester using an active site cysteine. The acyl group is thereafter transferred to a glycine-rich sequence of the peptidoglycan. 
By reengineering the substrate specificity of Sortase A to recognize an H3 peptide sequence (APATG, residues 29–33), Piotukh et al. achieved a traceless synthesis of full length H3 (Figure 47). 439 In this process, neither the synthetic peptide (residues 1–33) nor the truncated H3 variant with a native Gly-Gly N-terminus (residues 33–135) required preactivation. Further engineering to include different recognition motifs and to improve the rate and yield (currently below 50% in the case of the engineered Sortase variant) of the transpeptidase reaction will broaden the application of this intriguing biosynthetic tool to the generation of modified histones. Figure 47 Histone semisynthesis using an engineered Sortase variant. 4.3.4 Multistep Synthesis of Histones With the discovery of increasing numbers of PTMs that are positioned within the globular domains of histones, 189 chemical methods to access these site-specifically modified histones require refinement. Synthetic peptides with core modifications approach lengths prohibitive for routine SPPS if a two-piece ligation is attempted. Instead, convergent strategies to access full length histones from three or four fragments have been developed. In a synthetic tour de force, Ottesen and co-workers assembled histone H3 with K56ac using a three-piece ligation strategy (Figure 48a). 440 The ligation sites were chosen to exclusively involve Ala residues (A47, A91) because preliminary results indicated that the introduction of non-native cysteine residues in a test protein impacted nucleosome structure. All peptides were synthesized with N-α-Boc protected building blocks and, where appropriate, as α-thioesters on a mercaptopropionamide linker. The central segment (residues 47–90) with K56ac, and A47C protected as a thiazolidine species, was first ligated to the C-terminal piece (residues 91–135, A91C). Subsequently, C47 was unmasked with methoxylamine, and the N-terminal fragment (residues 1–46) was added. 
Because of the presence of Val46, the second ligation reaction proceeded sluggishly, requiring 4–6 days for completion. Finally, desulfurization provided access to fully synthetic H3 bearing K56ac and the popular C110A mutation in an overall 7% yield. FRET measurements with labeled nucleosomes containing K56ac confirmed that this PTM promotes DNA breathing. Figure 48 Multistep histone synthesis. (a) Total synthesis of H3K56ac using a three-step NCL procedure. (b) Total synthesis of H3K9me3 using Cys-Pro ester fragments, joined by NCL and direct aminolysis in the presence of Ag+ ions. (c) Three-piece semisynthesis to generate H3R42me2a. Initially, two synthetic peptides are joined by NCL. Subsequent activation of a C-terminal acyl hydrazide by oxidation enables a second NCL step to attach a recombinant fragment. pg denotes protecting groups. Aimoto and co-workers reported a three-piece total synthesis of H3 trimethylated at K9. 441 The authors prepared an N-terminal segment (residues 1–43) and a central segment (residues 44–95) as latent thioesters using the cysteine-proline ester autoactivation motif. This unit promotes transpeptidation by forming an α-thioester upon condensing into a diketopiperazine moiety (Figure 48b). 442 NCL of the middle segment, still N-α-Fmoc protected, with the unprotected C-terminal portion took advantage of Cys96, which naturally occurs in some isoforms of H3. 441 To prepare for the second ligation step, cysteine residues were protected as disulfides, lysine side-chains masked with Boc-OSu, and the N-terminal glycine residue exposed with piperidine treatment. This fragment was then joined to the protected N-terminal α-thioester fragment (containing a C-terminal proline residue, Pro43) by Ag+-promoted aminolysis. DTT and TFA were sequentially used to deprotect the cysteine and lysine residues, respectively, to yield H3K9me3 in an overall yield of approximately 6%. 
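The overall yields quoted for these total syntheses (7% for H3K56ac, approximately 6% for H3K9me3) reflect multiplicative losses over every ligation, deprotection, and purification step. As a rough illustration only, the Python sketch below computes an overall yield from per-step yields and, conversely, the average per-step yield implied by an overall figure. The step count used in the example (five) is a hypothetical value chosen for illustration; the text reports only overall yields, and the function names are likewise assumptions.

```python
from typing import Iterable


def overall_yield(step_yields: Iterable[float]) -> float:
    """Overall yield of a linear sequence: the product of the individual step yields."""
    total = 1.0
    for y in step_yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError("each step yield must be a fraction between 0 and 1")
        total *= y
    return total


def mean_step_yield(overall: float, n_steps: int) -> float:
    """Geometric-mean yield per step implied by an overall yield over n_steps steps."""
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return overall ** (1.0 / n_steps)


if __name__ == "__main__":
    # Hypothetical: treat the 7% overall yield as the outcome of five isolated steps.
    # The true number of isolated steps is not stated in the text.
    print(f"{mean_step_yield(0.07, 5):.2f}")
```

Under this assumed step count, an overall yield of 7% corresponds to an average of just under 60% per step, which illustrates why each additional ligation or handling step weighs heavily on the final amount of material.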
Although shown for the synthesis of H3K9me3, this multistep procedure is equally suitable to generate H3 variants with modifications in the central domain. Arginine 42 of H3 is situated close to the nucleosomal DNA entry site. This residue was recently identified by mass spectrometry to be dimethylated. 443 Given that the methyltransferases that install this PTM catalyze asymmetric dimethylation reactions at other sites, 124 R42 is most likely converted to the me2a form as well. To directly probe the biochemical function of R42 methylation, H3R42me2a was assembled in a three-step semisynthesis. 443 The N-terminal fragment (residues 1–28) was prepared by Boc chemistry as an α-thioester, while the central segment containing R42me2a (residues 29–46, A29C) was assembled by Fmoc chemistry as an acylhydrazide (Figure 48c). The C-terminal portion (residues 47–135, A47C) was produced recombinantly as a SUMO fusion. Cleavage by Ulp1 liberated the histone fragment bearing an N-terminal cysteine. The fragments were joined by NCL in an N-to-C direction. Importantly, the acylhydrazide functionality is inert under the conditions of the first ligation, but is activated by oxidation during the second step, 444 granting full regioselectivity to the sequential ligation process. Radical-based desulfurization concluded the semisynthesis of the homogeneously modified histone. Chromatin assembled with H3R42me2a was found to be more permissive to transcription as compared to unmodified congeners, supporting a direct role for arginine methylation in the control of DNA-templated processes, congruent with the position of this residue close to the DNA entry/exit site. The examples discussed above demonstrate the feasibility of chemically synthesizing entire histones. A selection of different ligation strategies exists to place modified residues at diverse positions within the histone core. 
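As a minimal sketch of how candidate ligation junctions might be enumerated for such multistep routes, the following Python snippet lists cysteine residues (usable directly) and alanine residues (usable as temporary Ala-to-Cys replacements that are reverted by desulfurization), and flags junctions where the residue preceding the junction is expected to slow the ligation. The class and function names, the toy sequence, and the inclusion of Ile, Thr, and Pro alongside Val in the slow set are assumptions made for illustration; only the sluggish ligation at Val46 is reported in the text.

```python
from dataclasses import dataclass
from typing import List

# Residues at the C-terminus of the thioester fragment assumed to slow ligation.
# Val is the case described in the text (Val46); Ile, Thr, and Pro are added here
# as an assumption, not taken from the text.
SLOW_THIOESTER_RESIDUES = {"V", "I", "T", "P"}


@dataclass(frozen=True)
class Junction:
    position: int   # 1-based position of the residue that serves as N-terminal Cys
    residue: str    # "C" = native cysteine, "A" = Ala-to-Cys, restored by desulfurization
    preceding: str  # residue that would carry the alpha-thioester
    slow: bool      # True if the preceding residue is in the assumed slow set


def find_junctions(sequence: str, min_fragment: int = 1) -> List[Junction]:
    """List Cys/Ala positions that leave at least min_fragment residues on each side."""
    junctions = []
    for i, aa in enumerate(sequence):
        if aa not in ("C", "A"):
            continue
        # i residues precede the junction; len(sequence) - i residues follow (inclusive).
        if i < min_fragment or len(sequence) - i < min_fragment:
            continue
        prev = sequence[i - 1]
        junctions.append(Junction(i + 1, aa, prev, prev in SLOW_THIOESTER_RESIDUES))
    return junctions


if __name__ == "__main__":
    toy = "MKTVAGRCLPAE"  # hypothetical sequence, not a histone
    for j in find_junctions(toy, min_fragment=2):
        print(j)
    # Caveat: global desulfurization also converts unprotected native Cys to Ala,
    # so a native-Cys junction is only traceless if desulfurization is avoided.
```

A scan of this kind only narrows the choice; fragment length limits for SPPS and the position of the modification still determine which junctions are practical.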
While these approaches are technically much more challenging than the more prevalent two-component semisyntheses, they significantly broaden the scope of chemically accessible modified histones. Total synthesis or multistep semisynthesis represents the method of choice for preparing histones where centrally located modifications are inaccessible by state of the art biosynthetic means, and analogues are not available or are inadequate. In addition, and very importantly, sequential ligations allow multiple different modifications to be placed at distinct sites along the polypeptide chain. 4.3.5 Synthesis of Ubiquitylated Histones Investigating the functions of histone ubiquitylation poses unique chemical challenges. Given its size, the ubiquitin PTM cannot be installed as a single building block at the peptide synthesis stage. Instead, a convergent strategy drawing from expressed protein ligation can be harnessed to obtain site-selectively ubiquitylated peptides, a first step in the challenging paths toward H2A-K119ub and H2B-K120ub. 188 Chatterjee et al. produced ubiquitin recombinantly as an intein fusion to yield an α-thioester lacking the C-terminal residue, Gly76. A surrogate for this residue, bromoacetic acid, is attached to the ε-amine of an orthogonally protected lysine (corresponding to K120 of H2B) via an isopeptide bond on a synthetic histone peptide. The branch is reacted with an 1,2-amino-thiol auxiliary that provides the nitrogen atom of Gly76 and enables ligation of the ubiquitin α-thioester (Figure 49a). Upon ubiquitylation, photolysis cleaves the ligation handle, thereby restoring a native ubiquitin C-terminus, site-specifically attached to a single lysine residue. Figure 49 Auxiliary-based semisynthesis of uH2B. (a) Site-specific ubiquitylation of histone peptides. An amino-thiol ligation auxiliary permits ligation of a ubiquitin α-thioester to a glycine residue attached to the side-chain of Lys120. 
(b) Semisynthesis of native, full-length uH2B via a two-step ligation. MesNa = sodium 2-mercaptoethanesulfonate, TCEP = tris(2-carboxyethyl)phosphine. This procedure can be conveniently extended with a second ligation step to produce full-length H2B-K120ub. 445 An Ala117Cys mutation in H2B permits this extension, and protecting this residue with the photolabile o-nitrobenzyl group prevents double-ubiquitylation during the first ligation step (Figure 49b). Upon completion of the synthesis, desulfurization provided native H2B-K120ub in an overall yield of 20%. This valuable reagent enabled McGinty et al. to explore the functional consequences of histone ubiquitylation. 445 In vivo, H2B-K120ub is associated with increased HMT activity of the Set1 complex and Dot1 toward H3K4 and H3K79, respectively. 446−448 Methyltransferase assays with semisynthetic H2B-K120ub-containing nucleosomes confirmed that this mark directly stimulates the catalytic domains of Set1 449,450 and Dot1. 445 Nevertheless, despite its association with actively transcribed genes, H2B-K120ub did not enhance transcription in vitro. 449 Also, nucleosomes containing semisynthetic H2B-K120ub or enzymatically generated H2A-K119ub were used to define the substrate specificity of the deubiquitinase Calypso. 451 This polycomb group protein efficiently removed the ubiquitin mark from H2A-K119ub but not H2B-K120ub. McGinty et al. further subjected the stimulation of Dot1 by H2B-K120ub to detailed structure–activity relationship studies. To facilitate this process, an innocuous Gly76Ala mutation was introduced at the C-terminus of ubiquitin to streamline the synthesis of an H2B-K120ub analogue (Figure 50a). 452 In this strategy, residue 76 is attached to Lys120 of H2B as a cysteine, enabling direct ligation of the ubiquitin α-thioester at this site; the auxiliary is no longer necessary. 
Subsequent completion of the H2B sequence by a second NCL step is followed by desulfurization, providing tens of milligrams of H2B-K120ub with one additional methyl group. When incorporated into nucleosomes, this protein was found to be biochemically indistinguishable from the native structure. Several analogues of H2B-K120ub-containing nucleosomes were synthesized to test if Dot1 is stimulated by the canonical protein–protein interaction hotspots on nucleosomes: the H2A acidic patch (E64, N68), a basic stretch in the H4 tail (R17,19), and the ubiquitin hydrophobic patch (L8, I44). Methyltransferase assays revealed that Dot1 engages ubiquitylated nucleosomes through surfaces orthogonal to the acidic and hydrophobic patches of H2A and ubiquitin, respectively. Mutation of H4R17 and R19 to alanine, however, reduced Dot1 activity both on ubiquitylated and on unmodified nucleosomes, suggesting that Dot1 binds the H4 tail regardless of stimulation. The size of ubiquitin as a PTM raises the question if exact placement is essential for its diverse biochemical functions. To tackle this question, a semisynthesis of H2A-K119ub, a PTM associated with polycomb-dependent gene repression, was developed. 453,454 Borrowing from the H2B-K120ub strategy, Fierz et al. ligated a ubiquitin α-thioester (residues 1–75) to a cysteine residue attached to H2A-K119 on a synthetic peptide via an isopeptide bond (Figure 50b). The second ligation to complete the H2A sequence was mediated by a penicillamine residue in place of Val104. Upon desulfurization, this amino acid is converted to valine, restoring the native H2A sequence, and leaving the benign G76A ligation “scar” at the C-terminus of ubiquitin. 454 H2A-K119ub, in contrast to H2B-K120ub, did not stimulate Dot1-catalyzed methylation at H3K79. 
453 The ubiquitin marks had only minor effects on the activity of PRC2 containing the core subunits, 453 although the presence of additional PRC2 modules increases HMT activity upon H2A ubiquitylation. 455 Figure 50 Streamlined semisyntheses of ubiquitylated histones. (a) Synthesis of H2B-K120ub containing the G76A mutation in ubiquitin. This mutation enables introduction of residue 76 of ubiquitin as a cysteine and subsequent NCL with a ubiquitin α-thioester. Finally, desulfurization converts Cys76 into an alanine residue. (b) Synthesis of H2A-K119ub containing the G76A mutation in ubiquitin. A penicillamine moiety permits NCL at a valine residue. (c) Disulfide-based conjugation of ubiquitin to H2B-K120C. In this approach, the ubiquitin α-thioester is reacted with cysteamine (top) and coupled to histones activated as disulfides (below), thus yielding disulfide-bonded analogues of H2B-K120ub (H2B-K120ubSS). Figure 51 Plasticity in the stimulation of Dot1 by histone ubiquitylation. The canonical ubiquitylation site of H2B is indicated in black (K120, white arrow), permissive sites in green (H2A-G22 and H2B-K125), prohibitive sites in red (H2B-K108 and H2B-K116). The substrate residue (H3K79) of Dot1 is highlighted in blue. To probe position-dependent biochemical effects of histone ubiquitylation in more detail, the synthesis of these reagents was further accelerated through the use of a disulfide-directed histone ubiquitylation scheme (Figure 50c). 456 In this approach, ubiquitin is furnished with a C-terminal sulfhydryl group by treatment of an intein-derived ubiquitin α-thioester with cysteamine. In parallel, Lys120 of H2B was mutated to a cysteine, and this residue was activated with 2,2′-dithiobis(5-nitropyridine), enabling attachment of the thiolated ubiquitin to yield H2B-K120ubSS. The slightly elongated attachment handle did not affect Dot1 stimulation. 
Repositioning of the ubiquitin mark to side-chains adjacent to H2B-K120, using the appropriate histone cysteine mutants, revealed considerable plasticity in the regulation of Dot1. Disulfide-based transplantation of the ubiquitin mark to H2B-125 and H2A-22 was well-tolerated, but transfer to residues 108 or 116 of H2B, both of which lie closer to the methylation site at H3K79, was detrimental to stimulation (Figure 51). Figure 52 Homo-FRET assay to monitor chromatin compaction. In the extended conformation, fluorescein labels (yellow stars) are far apart, thus limiting the amount of homo-FRET. Upon compaction, the distance between fluorophores is decreased, resulting in homo-FRET, which is detected by a reduction in the steady-state anisotropy (SSA) of the system. What are the biophysical consequences of attaching an entire protein like ubiquitin to a nucleosome? On the mononucleosome level, semisynthetic H2A-K119ub and H2B-K120ub containing the G76A mutation are only marginally destabilizing compared to unmodified histones. 454 The impact of histone ubiquitylation on nucleosome arrays is much more dramatic. 457 Analytical ultracentrifugation revealed that H2B-K120ubSS inhibits array compaction. The mechanism of this effect was scrutinized using a homo-FRET-based compaction assay. H2A was fluorescently labeled at an engineered cysteine residue at position 110 with maleimide chemistry. In open chromatin, fluorescence polarization is high due to the long rotational correlation time of large arrays. Upon Mg2+-induced compaction, labeled H2A moieties approach each other, and homo-FRET occurs. Because of the relative orientation of the fluorophores, homo-FRET results in a decrease in polarization, monitored by the steady-state anisotropy (Figure 52). 458 H2B-K120ubSS specifically interfered with the later stages of compaction, but did not alter chromatin structure at low concentrations of MgCl2. 
457 This effect could be completely reversed by treating H2B-K120ubSS-containing arrays with DTT. Interestingly, acetylated H4, generated by NCL, inhibited chromatin compaction even at low ionic strength, indicating that these modifications alter chromatin structure through different mechanisms. In addition to their impact on array compaction, these PTMs reduce the propensity for interstrand interactions. 407,413,457 Figure 53 Stepwise total synthesis of H2B-K34ub. This convergent ligation strategy involving four NCL steps commences with the synthesis of orthogonally protected histone H2B (top to bottom right) and finishes with ubiquitin conjugation and desulfurization (bottom left). Besides the canonical H2B ubiquitylation at K120, this PTM also occurs on K34, K46, K108, and K116. 459 Brik and co-workers developed a synthetic protocol to generate H2B site-specifically ubiquitylated at K34 to shed light on this lesser known modification. 460 This tour-de-force strategy entailed the convergent assembly of H2B-K34ub from five fragments (Figure 53). First, H2B containing a thiolated lysine analogue at position 34 was assembled. The synthesis was initiated by the ligation of an HA-tagged N-terminal fragment (residues 1–20) to residues 21–57. The latter fragment contained an A21C mutation and an o-nitrobenzyl protected δ-mercaptolysine at position 34. Lys57 was Nvoc protected to preclude lactamization upon oxidation of the C-terminal acylhydrazide moiety. Concurrently, the C-terminal fragment (residues 97–125, A97C) was condensed with residues 58–96, where A58 was replaced by a thiazolidine-protected cysteine residue. Upon deprotection of Cys58, residues 58–125 were coupled to residues 1–57, activated as an acyl azide. Photolysis liberated the δ-mercaptolysine residue, which was reacted with a ubiquitin α-thioester. Finally, desulfurization afforded native H2B-K34ub. To complement SPPS-based approaches, Virdee et al. 
reported biosynthetic incorporation of N-ε-protected δ-mercaptolysine residues into proteins. 435 Upon deprotection by photolysis, this residue can be reacted with ubiquitin α-thioesters by NCL. Subsequent desulfurization results in traceless, site-specifically ubiquitylated proteins. Chemical synthesis is a reliable method to grant access to diversely modified histones. With state of the art semisynthesis methods, any chemically stable PTM can be homogeneously incorporated at any position within histones, although modifications at the termini are preferable. Notably, histone semisynthesis allows the installation of many PTMs into a single protein, in particular if the sites are clustered at the termini. Semisynthetic nucleosomes have served to unravel the biochemical and biophysical consequences of a broad variety of histone modifications. Histone ubiquitylation in particular has been fertile ground for chemical explorations, resulting in an assortment of technologies that promise to find use far beyond chromatin biochemistry. 4.4 Synthesis of Chromatin with Defined PTM Patterns In cells, histone PTMs rarely manifest in isolation. 461,462 Instead, patterns of modifications co-occur within spatially defined chromatin states. 463,464 Thus, chromatin domains display global enrichment of specific histone PTMs and their combinations. Within each domain, however, histone modifications are not uniformly distributed but occur in specific patterns that help guide DNA-templated processes. 465 For example, methylation of H3K4 is focused at transcription start sites, while the abundance of H3K79me2 and H3K36me3 peaks early and late, respectively, within the coding sequence of actively transcribed regions. 466,467 These observations prompted the in vitro preparation of chromatin templates that carry selected patterns of histone PTMs. 4.4.1 Multivalent Recognition of Histone PTMs Individual reader domains are often sensitive to histone PTMs adjacent to their cognate mark. 
165 To integrate multiple signals present on distal sites within the same tail or on separate histones, many nuclear proteins harbor several binding modules, or multiply their interaction capabilities through oligomerization. 464,468 Moreover, additional sensor domains are also frequently contributed by partners in multisubunit complexes. In some cases, cooperative binding of effectors to several PTMs can be measured using histone peptide substrates. Examples include the recognition of polyacetylated histone tails by certain bromodomains 53,54 (see section 3.1.2) or the interaction between a chromodomain dimer of CMT3 with H3 tails trimethylated at K9 and K27. 469 However, when PTMs are distributed to different histones, the use of site-specifically modified nucleosomes to investigate the binding mechanism becomes imperative. The coupled PHD-BD of BPTF (introduced in section 3.2.3) is the paradigm for this mode of multivalent PTM recognition. 470 Individually, the PHD finger specifically binds to H3K4me3, but the bromodomain promiscuously interacts with several acetylated H4 peptides on SPOT arrays and in solution. To test if coupling of the domains would impart specificity on the BD, Ruthenburg et al. assembled nucleosomes containing combinations of unmodified histones, semisynthetic H3K4me3, and H4 acetylated at K12, K16, or K20 (Figure 54). The BD alone was insufficient for binding any of the nucleosomes, but in the presence of the PHD and H3K4me3, selective interaction with H4K16ac-containing nucleosomes was observed. Synergistic binding depended on proper orientation of the PHD-BD: disruption of the helical linker or insertion of additional residues to alter the relative rotation between the domains abolished the cooperativity. This landmark study demonstrated that coupled binding modules are indeed capable of recognizing histone PTM patterns, which dramatically increases the depth of information that can be administered on nucleosomes. 
Figure 54 Bivalent recognition of doubly modified mononucleosomes by BPTF. The PHD finger of BPTF binds to H3K4me3 (gray arrows). In a nucleosomal context, this binding is reinforced through the recognition of H4K16ac by the adjacent bromodomain (black arrow). 4.4.2 Asymmetric Nucleosomes Histone octamers are inherently symmetric entities, but DNA sequence can render nucleosomes asymmetric. Upon post-translational modification of nucleosomes, one histone copy is most likely targeted first, providing an additional level of asymmetry. Little is known about the biological significance and functional consequence of asymmetrically modified nucleosomes, but recent progress based on studying laboratory manufactured asymmetric nucleosomes in vitro has provided insight into their recognition, and enabled the development of a pipeline to quantify symmetry of nucleosomes from cells. Pioneering work on dissecting the contribution of single tails to the biochemistry of nucleosome acetylation was performed in the Shogren-Knaak laboratory. 409,471 Prompted by the observation that H3 acetylation by the yeast SAGA complex was cooperative on nucleosome arrays and mononucleosome substrates but not histone peptides, this group prepared asymmetrically modified nucleosomes. 471 Histone octamers were refolded in the presence of a 10-fold excess of unmodified H3 over a modified His-tagged H3 variant (Figure 55a). Subsequent purification by metal affinity chromatography yielded complexes with one copy of wild-type H3 and either tail-less, Lys-to-Ala mutant (residues 9, 14, 18, 23), or tetra-acetylated H3. Histone acetyltransferase assays with SAGA revealed that cooperative acetylation hinges upon the presence of two acetylatable H3 tails (Figure 55b). Preacetylation of one H3 tail increased the affinity of SAGA for its substrate, and the bromodomain of GCN5 (the active subunit of the complex) was required for this effect. 
409 Thus, the coupling of a BD with a histone acetyltransferase (HAT) domain leads to burst-like nucleosome acetylation to aid in transcription activation. Figure 55 Preparation and application of asymmetric mononucleosomes. (a) Synthesis of asymmetrically modified nucleosomes using a tagged, modified copy of H3 and an excess of an unmodified version. (b) The SAGA-complex is stimulated by its own mark. Nucleosomes that can only be acetylated on one H3 tail (tail-less and Lys9,14,18,23Ala) are poor SAGA substrates (gray arrows). Asymmetrically acetylated nucleosomes (right) recruit SAGA (dashed arrow) to promote acetylation of the unmodified H3 tail. Figure 56 Nucleosome asymmetry in vivo. (a) Assembly of asymmetric H3/H4 tetramers using a tandem affinity tag strategy. (b) Distribution of H3K27 methyl marks in ES cells into symmetric and asymmetric mononucleosomes. (c) Bivalent domains consist of asymmetric nucleosomes with one H3 tail di- or trimethylated at Lys4, and another tail marked with K27me2/3. Asymmetrically modified histones have also shed light on the mechanism of PTM binding and crosstalk. Partially ubiquitylated histones, for instance, revealed that each H2B-K120ub molecule stimulates Dot1 methylation on only one nucleosome face, presumably in an orthosteric fashion. 452 In addition, HP1 dimers bound mononucleosomes carrying only one H3K9me3 mark just as well as they bound doubly modified variants. 331 It is therefore likely that each of the dimer’s CDs engages a distinct nucleosome to effect compaction of chromatin regions demarked with the heterochromatin mark H3K9me3. 334,472 Asymmetric nucleosomes produced in vitro also represent a gateway to quantify nucleosome asymmetry in vivo. 473 Voigt et al. harnessed tandem affinity purification of H3/H4 tetramers, where one copy of H3 is furnished with a trimethyllysine analogue at position 27 and a Strep tag, while the wild-type copy of H3 contains a His-tag (Figure 56a). 
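The logic behind the two assembly routes can be illustrated with simple binomial statistics. The Python sketch below assumes, as a simplification not stated in the text, that tagged and untagged H3 copies pair randomly and independently during refolding; the function names are hypothetical. It estimates how a 10-fold excess of untagged H3 followed by a single affinity step enriches the one-tagged-copy species, in contrast to a tandem scheme with two different tags, which retains only complexes carrying one copy of each.

```python
from fractions import Fraction
from typing import Dict


def octamer_distribution(excess_untagged: int) -> Dict[int, Fraction]:
    """Fractions of complexes with 0, 1, or 2 tagged H3 copies.

    Assumes random, independent incorporation of two H3 copies from a pool
    containing excess_untagged untagged H3 per tagged H3 (a simplification).
    """
    p = Fraction(1, 1 + excess_untagged)  # probability that a given copy is tagged
    q = 1 - p
    return {0: q * q, 1: 2 * p * q, 2: p * p}


def asymmetric_fraction_after_pulldown(excess_untagged: int) -> Fraction:
    """Share of one-tagged-copy complexes among all complexes retained by the tag."""
    dist = octamer_distribution(excess_untagged)
    retained = dist[1] + dist[2]  # complexes without any tag are washed away
    return dist[1] / retained


if __name__ == "__main__":
    dist = octamer_distribution(10)
    print({k: str(v) for k, v in dist.items()})
    print(asymmetric_fraction_after_pulldown(10))  # 20/21 under these assumptions
```

Under these assumptions, a 10-fold excess leaves about 95% of the retained complexes asymmetric, with the remainder carrying two tagged copies. A tandem purification over two different tags removes this residual symmetric population at the cost of a second chromatography step.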
Purification by Ni-NTA chromatography, followed by a streptactin column, provided access to mononucleosomes containing one copy of each H3 version. Immunoprecipitation of asymmetrically modified mononucleosomes, as well as unmodified and doubly methylated congeners with K27me3-specific antibodies and subsequent analysis of H3 variants by MS, confirmed the expected composition of each nucleosome batch: Asymmetric nucleosomes contained equal proportions of wild-type and K27me3, while symmetrically trimethylated nucleosomes contained only K27me3. Unmodified nucleosomes were not enriched by the pull-down step. When the same analytics workflow was applied to mononucleosomes extracted from embryonic stem cells, the presence of symmetrically and asymmetrically modified nucleosomes was established on the basis of the relative amounts of K27me2/3 and K27me0/1 before and after immunoprecipitation (Figure 56b). Approximately one-half of the total mononucleosomes contain K27 in the me0/1 methylation state in both H3 copies, and nucleosomes containing K27me2/3 on both tails are overrepresented as compared to asymmetric versions. A particularly interesting finding concerns so-called bivalent domains. These regions of chromatin are common in stem cells and contain H3K4me3 and H3K27me3, archetypal activating and repressive PTMs, respectively. Analysis of the symmetry of bivalent nucleosomes failed to detect histones that are simultaneously trimethylated at H3K4 and H3K27, suggesting that these marks are present on different tails within one nucleosome (Figure 56c). Consistent with this observation, nucleosome arrays containing cysteine-derived trimethyllysine analogues at H3K4 in only one H3 copy were substrates for the H3K27-specific methyltransferase PRC2. In contrast, arrays homogeneously carrying trimethylated thialysine at H3K4 could not be further methylated by PRC2 at H3K27. 
These results demonstrate that symmetrically and asymmetrically modified nucleosomes exist in vivo, and might exert unique biochemical downstream effects. 4.4.3 Synthesis of Sequence-Specific Oligonucleosome Arrays Many functional features of histone PTMs can be recapitulated at the mononucleosome level or with homogeneous arrays, but certain phenomena are dependent on the presence of neighboring sites with specific properties. Examples of such chromatin transactions include the multivalent recognition of heterotypic PTMs distributed on different nucleosomes, or deposition of homo- or heterotypic modifications upon stimulation in trans. Sequence-defined hetero-oligomers of chromatin are therefore indispensable to study the effect of histone PTMs on distinct nucleosomes. Their preparation and applications are detailed below. More than 10 years ago, Zheng and Hayes used asymmetric dinucleosomes that contain a phenylazide photo-cross-linking group on the N-terminal histone tails on one of the nucleosomes and a reporter on the DNA of the other to investigate internucleosomal contacts. 299 To construct these reagents, homogeneously modified mononucleosomes were linked using T4 DNA ligase (Figure 57a). Specificity in the linkage was configured by nonpalindromic single-stranded overhangs present on the mononucleosome starting material. Following UV irradiation and DNA cleavage, cross-linked products were detected by gel shift experiments, which informed on contacts between histone tails and DNA on a neighboring nucleosome. The results showed that the N-terminal tails of H2A and H2B, but not H3 and H4, partook in internucleosomal interactions in this dinucleosome system. Extending the nucleosome ligation technology to tetranucleosome arrays, Blacketer et al. investigated the contribution of H4 tails on interstrand association. 
474 Installation of one to four nucleosomes lacking the H4 tails into sequence-defined tetranucleosome arrays revealed that the H4 tails cooperate to mediate chromatin self-assembly. Sequence-defined nucleosome arrays have also been harnessed to analyze the geometric preference of PTM-binding and histone-modifying enzymes. 324,445,470,475 For example, comparative assays using mono- and heterotypic dinucleosomes established that BPTF preferentially engages H3K4me3 and H4K16ac on the same nucleosome (Figure 57b). 470 Similarly, Dot1 was found to engage in intranucleosomal crosstalk: H2B-K120ub stimulates Dot1 on its own nucleosome, but not toward methylation of adjacent nucleosomes (Figure 57c). 445

Figure 57 Oligonucleosome arrays. (a) Synthesis of asymmetric dinucleosomes using nonpalindromic DNA overhangs and their application in studying histone-DNA contacts. Cross-links are detected through a gel-shift of the 32P-labeled DNA. (b) BPTF binds its marks (H3K4me3, blue flag, and H4K16ac, green circle) in a mononucleosomal context (black arrow). (c) H2B-K120ub stimulates Dot1 in cis (black arrow), but not toward methylation of adjacent nucleosomes. (d) Clr4/Suv39-mediated spreading of the heterochromatin-associated H3K9me3 mark (red flag). (e) Rpd3S deacetylation is stimulated by H3K36 methylation (orange flag) in an intra- and internucleosomal fashion. Note that Rpd3S recognizes dinucleosomes more readily than similarly modified mononucleosomes.

By contrast, several histone-modifying enzymes are controlled by histone PTMs in an internucleosomal fashion. For instance, the histone methyltransferase Clr4 (the yeast homologue of the human Suv39h1) mediates heterochromatin spreading 330,476 by propagating a methyl mark in trans. 324 This enzyme uses a SET domain to methylate H3K9, 70 and a chromodomain to bind to its product, H3K9me3, thus generating a positive feedback loop. 
477 Narlikar and co-workers used dinucleosomes composed of one octamer with unmodified H3 and one octamer containing H3K9Cme3 to study Clr4 activation (Figure 57d). 324 The presence of the K9me3 mark stimulated methylation of the adjacent nucleosome within the dinucleosome, but not when the two differently modified mononucleosomes were incubated in trans. Kinetic characterization demonstrated that this circuit operates by enhancing catalysis by Clr4, rather than by enhancing its binding to the hemimodified dinucleosomes. Another example of internucleosomal effects on enzymatic activity is provided by the histone deacetylase Rpd3S. 478,479 This enzyme binds thialysine analogues of H3K36me3, preferentially in a dinucleosomal context. 478 In model ligated dinucleosomes, K36 methylation stimulated deacetylation both intra- and internucleosomally (Figure 57e), 479 thereby allowing Rpd3S to produce a hypo-acetylated microenvironment surrounding H3K36me3 marks to suppress aberrant transcription initiation within coding sequences. 480 Biochemical and biophysical analyses using ordered nucleosome arrays have thus provided insight into the geometric properties of a variety of chromatin-associated systems. Arrays of greater size and compositional sophistication, which better reflect the heteropolymeric complexity of chromatin in vivo, will broaden the range of chromatin-related questions that can be addressed in vitro and sharpen the resolution of the answers. 4.5 Increasing the Throughput of Chromatin Biochemistry In the last 10 years, methods to produce site-specifically modified histones have burgeoned. Armed with these tools, chemists and biologists have captured the mechanistic essence of a broad range of phenomena operating on chromatin. Despite this tremendous progress, crafting “designer” chromatin and deploying these precious reagents in biochemical and biophysical investigations has remained a challenging pursuit. 
Therefore, strategies geared toward parallelized interrogation of the features that define the molecular circuits operating on chromatin and the nuclear proteome are extremely valuable. 4.5.1 Identification of PTM-Specific Chromatin Binding Proteins SILAC-based approaches (see also sections 3.2.3 and 3.11.2) represent appealing strategies to identify factors that interact with nucleosomes in a PTM-dependent fashion and on a proteome-wide level. Kouzarides and co-workers explored the possibility for crosstalk in the interactions mediated by histone and DNA methylation. 481 A set of three modified H3 variants was assembled by ligating peptide α-thioesters containing H3K4me3, K9me3, or K27me3 to a tail-less recombinant fragment with an N-terminal cysteine in place of Thr32. Subsequently, the semisynthetic histones were incorporated into nucleosomes in the presence or absence of methylation at C5 of cytosine in CpG dinucleotide sequences, artificially installed with a prokaryotic DNA methyltransferase. Initially, the approach was validated by screening known interactors of specific Kme3 marks such as HP1 for binding H3K9me3 or the PRC2 subunit Suz12 for (indirectly) engaging H3K27me3. By comparing the SILAC profiles between nucleosomes containing methylated DNA and/or methylated histones, some novel interactions were observed. For instance, several members of the origin recognition complex associated specifically with the heterochromatin marks H3K9me3 and H3K27me3. In certain cases, synergies and antagonisms between histone and DNA methylation were detected (Figure 58a). The former behavior is exemplified by UHRF where binding to H3K9me3-containing nucleosomes (discussed in detail in section 3.12.1) was reinforced by DNA methylation. By contrast, the interaction of PRC2 components with H3K27me3 was diminished when CpG groups were methylated. Nikolov et al. 
used site-specifically modified nucleosome arrays for SILAC-based identification of PTM-specific binders, and directly compared the results to screens performed with peptide-based affinity matrices. 482 These analyses expanded the list of potential readers for the H3K4me3 and H3K9me3 marks, including the spindle-associated protein Spindlin1. The H3K4me3 binding properties of Spindlin1 have been corroborated by a contemporary study relying on biochemical analyses with nucleosomes containing methyllysine analogues and structural data. 483 Of note, the candidate list generated by using methylated chromatin partially overlaps with the predictions from mononucleosome and peptide-based SILAC assays, but each format yielded a remarkable set of unique proteins (Figure 58b). Possibly, the high concentrations of peptides that can be employed enable the isolation of weakly interacting modules. In contrast, chromatin-based templates provide additional contact opportunities, particularly in the case of nucleosome arrays, which are inherently polyvalent.

Figure 58 Identification of PTM-binding factors using modified chromatin as bait. (a) Synergies and antagonisms between DNA and histone methylation recognition. (b) Peptide- and chromatin-based probes reveal partially overlapping interactors. Here only a few examples are shown: BPTF and CHD4 are associated with chromatin remodeling, ING1 is a transcriptional regulator, TFIID is a general transcription factor complex, SIN3 is a histone deacetylase, PCAF and CDYL are histone acetyltransferases, HP1 is a “glue” for heterochromatin, UHRF is a recruiter of DNA methyltransferase, and NUP93 is a member of the nuclear pore. (c) Examples of H2B-K120ub-binding complexes. (d) Synthesis of a hydrolase-resistant H2B-K120ub analogue.

In related proteomic studies, arrays containing H2B-K120ub were used as bait to discover specific binding proteins for this PTM. 
484 Resilience against deubiquitination was imparted by methylation of the N-ε-amino group of Lys120 according to the method of Brik and co-workers. 485 Specifically, this modification was installed by on-resin alkylation of the N-ε-amino group of lysine 120 within a synthetic H2B peptide (residues 118–125), enabled by orthogonal protection of this residue. Subsequently, the secondary amine was acylated with the C-terminal Gly residue of ubiquitin, followed by SPPS of the C-terminal portion of ubiquitin. Iterative NCL and ensuing desulfurization provided the final, deubiquitinase resistant H2B-K120ub analogue. As anticipated, SILAC analysis confirmed known H2B-K120ub interacting modules, and suggested that several complexes associated with diverse chromatin transactions (gene expression, DNA replication, chromatin remodeling, etc.) recognize this moiety (Figure 58c). Follow-up experiments confirmed that ubiquitylated H2B interacts with SWI/SNF family chromatin remodelers, and that this association is important for gene regulation. By further altering the linkage between histones and ubiquitin, Long et al. were able to generate deubiquitinase-resistant H2A-K119ub and H2B-K120ub isoforms. 486 The authors joined a G76C mutant of ubiquitin to H2B-K120C or H2A-K119C via a dichloroacetone cross-link (Figure 58d). 486 This non-native linkage is hydrolase-resistant due to the absence of an isopeptide bond, the increased length of the module, and the presence of a carboxyl group. Notably, because of the hydrolytic stability of these compounds, H2A-K119ub/H2B or H2A/H2B-K120ub dimer baits could be used to isolate the deubiquitinase Usp15 from HeLa cell nuclear extracts. This enzyme was able to cleave semisynthetic H2B-K120ub containing the native isopeptide linkage, preferentially in the form of histone octamers rather than nucleosomes. 
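The quantitative comparison at the heart of the SILAC screens in this section can be sketched in a few lines. The example assumes, purely for illustration, that the "heavy" channel reports proteins recovered with the modified nucleosome bait and the "light" channel those recovered with the unmodified control. The protein names, intensities, and the twofold threshold are hypothetical and are not taken from the cited studies.

```python
import math


def silac_log2_ratios(intensities):
    """Return log2(heavy/light) for each protein quantified in both channels.

    intensities maps a protein name to a (heavy, light) intensity pair.
    """
    ratios = {}
    for protein, (heavy, light) in intensities.items():
        if heavy <= 0 or light <= 0:
            continue  # not quantified in both channels; no ratio can be formed
        ratios[protein] = math.log2(heavy / light)
    return ratios


def classify_binders(ratios, threshold=1.0):
    """Label proteins as recruited, excluded, or unchanged.

    threshold is in log2 units; 1.0 corresponds to a twofold change
    (an arbitrary cutoff chosen for this sketch).
    """
    labels = {}
    for protein, value in ratios.items():
        if value >= threshold:
            labels[protein] = "recruited by mark"
        elif value <= -threshold:
            labels[protein] = "excluded by mark"
        else:
            labels[protein] = "unchanged"
    return labels


if __name__ == "__main__":
    # Hypothetical intensities: (modified bait, unmodified control).
    example = {
        "reader_A": (8.0, 1.0),
        "background_B": (1.0, 1.0),
        "excluded_C": (1.0, 4.0),
        "missing_D": (0.0, 2.0),
    }
    print(classify_binders(silac_log2_ratios(example)))
```

In practice such screens are usually repeated with the isotope labels swapped, so that genuine binders show inverted ratios in the second experiment while contaminants do not; that control is omitted here for brevity.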
Thus, the union of histone and chromatin substrates with modern proteomic approaches such as SILAC has provided substantial insight into how PTMs are recognized in a chromatin context on a proteome-wide level. Biochemical and genetic follow-up studies have confirmed several predicted interaction pairs, attesting to the general utility of these sophisticated screens. 4.5.2 Chromatin Biochemistry with DNA-Barcoded Nucleosome Libraries Peptide libraries have greatly increased the throughput in the analysis of signaling through histone modifications (section 3.12). Can similar strategies be harnessed to accelerate chromatin biochemistry at the nucleosome level? To realize this goal, two challenges must be overcome. Specifically, synthetic protocols to obtain nucleosome libraries and analytical methods to read out the desired biochemical properties of the library members need to be implemented. Nguyen et al. have recently solved these issues and developed a versatile platform to perform chromatin biochemistry with increased throughput. 487 Nucleosome libraries containing over 50 unique combinations of acetylation, methylation, and ubiquitylation signatures were assembled from semisynthetic histones on a microgram scale (Figure 59). Importantly, each nucleosome contained a unique hexanucleotide barcode that specified the histone variants from which the particular nucleosome is constructed. Library members were subjected to various biochemical assays, and desired variants isolated by affinity- or immunoprecipitation (IP) and analyzed by next generation DNA sequencing, which provides exquisite sensitivity. This workflow was applied to profile the specificity of PTM-specific antibodies, histone binding proteins, and histone modifying enzymes. For example, the coupled BD-PHD of BPTF (see also section 4.4.1) was found to display a marked preference for nucleosomes containing H3K4me3 and polyacetylated H4. 
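The sequencing readout of a barcoded library can be illustrated with a minimal counting sketch. It assumes, hypothetically, that the hexanucleotide barcode occupies the first six bases of each read and that enrichment is expressed as the share of a variant in the pull-down divided by its share in the input. The barcode sequences, variant labels, and function names are invented for the example and do not come from the cited work.

```python
from collections import Counter

BARCODE_LENGTH = 6  # hexanucleotide barcode, as described in the text


def count_barcodes(reads, barcode_to_variant):
    """Count reads per nucleosome variant; reads with unknown barcodes are ignored."""
    counts = Counter()
    for read in reads:
        barcode = read[:BARCODE_LENGTH]  # assumption: barcode sits at the read start
        variant = barcode_to_variant.get(barcode)
        if variant is not None:
            counts[variant] += 1
    return counts


def enrichment(ip_counts, input_counts):
    """Return (IP share) / (input share) for every variant present in the input."""
    ip_total = sum(ip_counts.values())
    input_total = sum(input_counts.values())
    if ip_total == 0 or input_total == 0:
        raise ValueError("both samples need at least one assigned read")
    scores = {}
    for variant, n_input in input_counts.items():
        if n_input == 0:
            continue  # cannot normalize a variant that is absent from the input
        ip_share = ip_counts.get(variant, 0) / ip_total
        input_share = n_input / input_total
        scores[variant] = ip_share / input_share
    return scores


if __name__ == "__main__":
    # Hypothetical barcode assignments and reads.
    barcodes = {"ACGTAC": "unmodified", "TTGCAA": "H3K4me3"}
    input_reads = ["ACGTACGGATCC", "ACGTACGGATCC", "TTGCAAGGATCC", "TTGCAAGGATCC"]
    ip_reads = ["TTGCAAGGATCC"] * 3 + ["ACGTACGGATCC"]
    scores = enrichment(count_barcodes(ip_reads, barcodes),
                        count_barcodes(input_reads, barcodes))
    print(scores)
```

Normalizing to the input pool corrects for unequal representation of library members before the pull-down, which is why the score is a ratio of shares rather than a raw read count.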
Similarly, polyacetylated H4 mediated the recruitment of p300 to nucleosomes. This interaction stimulated p300-dependent acetylation of H3, as determined by performing pull-downs with antibodies selective for the p300 product H3K18ac. This positive feedback loop was also observed when nucleosome libraries were incubated with nuclear extracts: H4 acetylation promoted H3 acetylation.

Figure 59 Schematic overview of a screening platform based on DNA-barcoded nucleosome libraries. Recombinant and semisynthetic histones are refolded into >50 different octamers in parallel, and assembled into mononucleosomes with barcoded DNA. Upon pooling, aliquots of the library are subjected to biochemical assays involving a pull-down step to enrich variants that exhibit certain traits. Subsequently, nucleosomal DNA is isolated and analyzed by next generation sequencing to provide a semiquantitative readout of hundreds to thousands of experiments.

DNA barcoding enables storage of the synthetic and biochemical history of each MN variant in an easily interpretable format, which facilitates ultrasensitive and versatile readout of the molecular properties of the corresponding nucleosome. Accordingly, Nguyen et al. were able to decipher how multiple PTMs on chromatin are interpreted and converted into orthogonal signals. Although demonstrated so far for binding studies and for acetyltransferase and methyltransferase reactions, DNA-barcoded nucleosome libraries may find application in many more areas of chromatin biochemistry and biophysics. 5 Summary and Future Perspectives Peptide and protein chemistry have become an integral part of chromatin research. Methods ranging from solid-phase synthesis to recombinant technology are available to construct site-specifically modified histone peptides and chromatin templates with distinct patterns. In particular, a plethora of histones carrying PTMs and their analogues have been generated in a chemically defined fashion (Figure 60). 
These reagents can directly feed into cutting-edge biochemical and biophysical pipelines, including transcription assays 341 or structural studies based on X-ray crystallography, 282,488,489 electron microscopy, 283,490 or NMR spectroscopy. 491,492 Given these advances, what are the remaining challenges and opportunities for protein chemists to further contribute to unraveling the mechanism of histone-based signaling?

Figure 60 Diagram of PTMs and their analogues that have been site-specifically incorporated into histones (as of September 2014). For clarity, connections are shown only to one copy of each histone.

5.1 Tackling the Combinatorial Complexity Several peptide- and mononucleosome-based approaches have been developed to biochemically address the enormous combinatorial possibilities of histone PTM combinations. Nevertheless, methods to synthesize hundreds of proteins in parallel are still lacking. Innovative purification schemes or reliable fragment condensation protocols that allow bypassing of individual workup steps are needed to attain the level of throughput that peptide synthesis can achieve. Furthermore, can the resulting histone libraries be incorporated into templates that more closely reflect the heteropolymeric nature of chromatin fibers? Such arrays will enable dissection of spatial components that underlie the control of chromatin-templated processes, both on a biophysical and on a biochemical level. Open questions include whether defined PTM patterns alter structures synergistically, which patterns do so and how, and whether these structural perturbations are propagated along the chromatin fiber beyond the actual installation site. Transcription is a vectorial process, and chromatin architecture contributes to defining the coordinates of the origin and direction of polymerase action. 
At which level do histone PTM gradients facilitate guidance of the transcription machinery, and in what ways do these modification patterns also contribute to local memory of transcriptional states? We believe that some of these issues can be addressed with next-generation chromatin biochemistry on the foundation of designer histones. 5.2 Beyond Histones The vast majority of contributions that protein chemistry has so far made in the chromatin biochemistry area have revolved around histone modifications. Yet, many other chromatin-associated proteins are hubs for PTMs, in particular RNA polymerase and coactivators such as p53. In addition, many enzymes characterized as histone methyl- and acetyltransferases also act on nonhistone targets. Thus, elucidating the mechanism of these processes requires that the chemical toolkit, which originated in basic research and has since been refined for histone synthesis, be extended to the manufacture of other cellular factors that are considerably larger than histones. Semisyntheses of p53, 493 a bacterial RNA polymerase, 494 and the p300 HAT domain 495 have already been achieved, laying the groundwork for systems-wide analysis of how PTM-based nuclear signaling affects transcription. 5.3 Synthetic Chromatin Chemistry in Live Cells Modified histones have contributed immensely to biochemical and biophysical analyses in vitro. What are the prospects of implementing these reagents in vivo to elucidate the mechanism of their action? Currently, access to specifically modified histones in vivo is mainly limited to genetic strategies. Typical examples include site-directed mutagenesis of a target histone residue, such as Lys-to-Gln or Lys-to-Arg substitutions to mimic acetyllysine side-chains and preclude methylation or acetylation at that position, respectively. 290,496 Alternatively, overexpression of a histone-modifying enzyme can result in global accumulation of a desired PTM. 
Upon targeting enzymes to specific genomic sites (using the Gal4 system, or perhaps CRISPR-Cas9), perturbations can be localized to genetic reporters, enabling more defined functional assignments of histone-modifying activities. In conjunction with reversible dimerization modules, these targeting strategies can shed light on the kinetics of the formation and interpretation of histone PTMs. 497 Artificial expansion of the genetic code further diversifies the scope of genetic approaches to study the effect of histone modifications, as exemplified by the recent success in tracking structural consequences of mitotic histone phosphorylation. 380 A multitude of bioorthogonal reactions, 498 as well as the ability to perform protein trans-splicing in vivo, 499,500 might also aid in generating designer chromatin in living cells. Fueled by diverse success stories since the late 1960s, the journey for chemical biologists into the chromatin field continues. Many exciting milestones still lie ahead, with key challenges involving the vastness of the combinatorial landscape of histone modifications and the complexity of their interpretation in a cellular context. Whether these routes lead to high-throughput biochemistry or meander into cells, the journey promises to be extremely fruitful. These efforts will likely contribute to the rich tradition at the intersection of peptide and protein chemistry with histone biology, and target a systems-level understanding of how cellular signaling converges on chromatin and is relayed into functional outputs.


Author and article information

Journal

Journal ID (nlm-ta): Chem Rev

Journal ID (iso-abbrev): Chem. Rev

Journal ID (publisher-id): cr

Journal ID (coden): chreay

Title: Chemical Reviews

Publisher: American Chemical Society

ISSN (Print): 0009-2665

ISSN (Electronic): 1520-6890

Publication date PMC-release: 20 October 2015

Publication date (Electronic): 20 October 2014

Publication date (Print): 25 March 2015

Volume: 115

Issue: 6 , 2015 Epigenetics

Pages: 2296-2349

Affiliations

[1]Department of Chemistry, Princeton University , Frick Laboratory, Princeton, New Jersey 08544, United States

Author notes

[* ]E-mail: muir@ 123456princeton.edu .

Article

DOI: 10.1021/cr5003529

PMC ID: 4378460

PubMed ID: 25330018

SO-VID: c777c600-588a-4ebf-890a-c64d2c6776de

License:

This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.

History

Date received : 02 July 2014

Funding

National Institutes of Health, United States

Custom metadata

document-id-old-9 cr5003529

document-id-new-14 cr-2014-003529


ScienceOpen disciplines: Chemistry

