      Is Open Access

      pKID Binds to KIX via an Unstructured Transition State with Nonnative Interactions

          Abstract

          Understanding the detailed mechanism of interaction of intrinsically disordered proteins with their partners is crucial to comprehend their functions in signaling and transcription. Through its interaction with KIX, the disordered pKID region of CREB protein is central in the transcription of cAMP responsive genes, including those involved in long-term memory. Numerous simulation studies have investigated these interactions. Combined with experimental results, these can provide valuable and comprehensive understanding of the mechanisms involved. Here, we probe the transition state of this interaction experimentally through analyzing the kinetic effect of mutating both interface and solvent exposed residues in pKID. We show that very few specific interactions between pKID and KIX are required in the initial binding process. Only a small number of weak interactions are formed at the transition state, including nonnative interactions, and most of the folding occurs after the initial binding event. These properties are consistent with computational results and also the majority of experimental studies of intrinsically disordered protein coupled folding and binding in other protein systems, suggesting that these may be common features.


          Most cited references (36)


          Mechanism of coupled folding and binding of an intrinsically disordered protein.

          Protein folding and binding are analogous processes, in which the protein 'searches' for favourable intramolecular or intermolecular interactions on a funnelled energy landscape. Many eukaryotic proteins are disordered under physiological conditions, and fold into ordered structures only on binding to their cellular targets. The mechanism by which folding is coupled to binding is poorly understood, but it has been hypothesized on theoretical grounds that the binding kinetics may be enhanced by a 'fly-casting' effect, where the disordered protein binds weakly and non-specifically to its target and folds as it approaches the cognate binding site. Here we show, using NMR titrations and ¹⁵N relaxation dispersion, that the phosphorylated kinase inducible activation domain (pKID) of the transcription factor CREB forms an ensemble of transient encounter complexes on binding to the KIX domain of the CREB binding protein. The encounter complexes are stabilized primarily by non-specific hydrophobic contacts, and evolve by way of an intermediate to the fully bound state without dissociation from KIX. The carboxy-terminal helix of pKID is only partially folded in the intermediate, and becomes stabilized by intermolecular interactions formed in the final bound state. Future applications of our method will provide new understanding of the molecular mechanisms by which intrinsically disordered proteins perform their diverse biological functions.

            Intrinsic Disorder Is a Common Feature of Hub Proteins from Four Eukaryotic Interactomes

            Introduction Systematic binary protein–protein interaction maps with various percentages of proteome coverage are currently available for S. cerevisiae [1,2], C. elegans [3], D. melanogaster [4], H. pylori [5], and, most recently, for H. sapiens [6,7]. As a result of these studies, it is now proposed that most networks within the cell have similar overall broad-scale topology where most proteins interact with just a few partners and a small number of proteins interact with many partners. Although all currently available networks represent only samples of the complete interactomes [8], the investigation of such partial networks is a first step toward a systems-biology understanding of cells and organisms. While much has been learned to date about the general mechanisms of protein–protein interactions, the specific structural features that account for differences in protein interactivity are still unknown. It has recently been suggested that intrinsically disordered (ID) proteins play an important role in protein–protein interactions [9,10]. ID proteins and protein regions lack a unique 3-D structure and exist in a dynamic ensemble of conformations. More than 427 proteins containing 802 disordered regions have been annotated (http://www.disprot.org). Computational estimates suggest that eukaryotic proteomes have a significantly higher occurrence of ID proteins relative to prokaryotic proteomes [11,12]. The prevalence of ID proteins in eukaryotes is likely to be due to more complex signaling and regulatory pathways that heavily rely on disordered proteins [13]. Many ID proteins have been shown to mediate interactions through a disorder-to-order transition upon binding to their biological targets [14,15]. 
The lack of prior structure provides several advantages to ID-mediated protein interactions relative to interactions between folded proteins, such as decoupling of specificity and affinity, and the ability to recognize multiple binding partners with distinct interaction surfaces. In addition, the interaction interface area of ID proteins is in general much larger per residue [16], which suggests that ID proteins would make more efficient hub proteins relative to ordered proteins [17]. Recent reviews [18–20] discuss the importance of intrinsic disorder for protein–protein interactions that involve binding to multiple partners. These reviews focus on individual examples of hub proteins with known disordered regions. However, no systematic study of organism-specific protein interaction networks that investigates their disorder content is currently available. The hypothesis that intrinsic structural disorder may be an important attribute that can distinguish between hub and end proteins is tested in the present study. The prediction of disorder in the interaction networks from four eukaryotic organisms is carried out using PONDR VL-XT [21,22]. The comparison of proteins from these networks shows that while the disorder content varies between organisms, hub proteins are consistently found to be more disordered than end proteins in all organisms. Results Datasets Characterization Protein interaction datasets from four eukaryotes, C. elegans (WORM), H. sapiens (HUMAN), D. melanogaster (FLY), and S. cerevisiae (YEAST) were selected for this study (Table 1). High-throughput datasets with experimentally demonstrated verification rates between 75% and 80% were selected for WORM and HUMAN; the literature-curated low-throughput dataset was selected for YEAST; and the literature-curated dataset that also included the high-throughput interactions was selected for FLY (Materials and Methods).
Although another FLY dataset consisting of only high-confidence high-throughput interactions also was available [4], it was not particularly useful because highly connected proteins (i.e., hubs) were removed with the intent of reducing the number of nonspecific interactions. Subsequently, four additional datasets (WORM BioGRID, HUMAN HPRD, FLY BioGRID, and YEAST BioGRID) from two public databases [23,24], to which no confidence-based filtering has been applied (Materials and Methods, Table S1), were investigated for comparison. From these datasets, ends and hubs were defined as proteins with one and ten or more interacting partners, respectively. Although this cutoff was chosen somewhat arbitrarily, the results of future analysis did not depend significantly on the cutoff value (unpublished data). The gap in the definition between hubs and ends was intended to buffer the classes, and should be considered as a conservative classification of hubs and ends. As shown in Table 1, the number of ends is ~2-fold to 10-fold greater than the number of hubs, which is consistent with a scale-free network topology. Analysis of Disorder Predictions Predictions of intrinsic structural disorder were carried out on four datasets using PONDR VL-XT [21,22]. As shown in Figure 1, significant differences between hubs and ends in the percentages of proteins containing predicted disordered regions of various lengths are observed. For example, 78% of hub proteins in WORM carry predicted disordered regions of ≥30 consecutive residues, whereas only 58% of end proteins have this characteristic. The prediction error rate of PONDR VL-XT (i.e., the prediction of disorder on the completely ordered dataset O_PDB_S25, see Materials and Methods) for this disorder length is ~13%, and it gradually decreases as the length of the predicted disordered region increases.
The significant differences in the disorder content between WORM hubs and ends are observed for most disorder lengths, thereby indicating that WORM hub proteins are overall more disordered than WORM end proteins. Similar conclusions arise for two other organisms, HUMAN and FLY. In YEAST, however, significant differences in the percentages of proteins with predicted disordered regions are observed for only two disorder lengths (≥40 and ≥70). By comparison, the analysis of a much larger YEAST BioGRID dataset shows that the disorder content of hubs and ends is significant for all disorder lengths for this organism (Figure S1). In addition, the results of the disorder predictions for the remaining BioGRID and HPRD datasets (Table S1) are also consistent with a significantly greater amount of disorder in hubs relative to ends (Figure S1). When hubs from all four organisms are compared with each other, HUMAN hubs have the overall highest disorder content (i.e., higher percentage of proteins with predicted disordered regions) for all disorder lengths, whereas YEAST hubs have the lowest. Interestingly, when ends from all four organisms are compared with each other, HUMAN ends again have the highest disorder content. This suggests that the HUMAN interaction network has the highest disorder content among all studied organisms, in agreement with the predicted disorder content of the entire human proteome [12]. It should also be noted that disorder predictions for proteins with an intermediate number of partners (from 2 to 9) generally fall in between the predictions for hubs and ends (Figures S2 and S3). Since PONDR VL-XT predicts disorder on a per-residue basis, it is important to account for the differences in protein lengths when comparing predictions for entire datasets, because longer proteins are expected to have a greater number, as well as longer regions, of predicted disorder in comparison with shorter proteins. 
To compensate for the length dependency of disorder predictions, the per-residue disorder predictions were normalized by protein length. The percentages of disordered residues within segments of all possible lengths (starting from one and ending with the longest disordered region in the dataset) were calculated for all proteins, and then plotted against the predicted disordered region length (Figure 2). The same procedure was repeated using a completely ordered set of proteins (O_PDB_S25) to estimate the error rate of the predictions. The length-normalized predictions further confirm the differences in the disorder content of hubs and ends. The percentages of predicted disordered residues in hubs are generally higher than in ends (Figure 2), although the differences between hubs and ends are more apparent for the HUMAN and FLY than for the WORM and YEAST datasets. Furthermore, WORM hubs and ends have similar percentages of predicted disordered residues within long segments of disorder (80 residues and longer). When length-normalized predictions are considered, the proportion of predicted disordered residues is highest in the HUMAN dataset, and lowest in the YEAST dataset (Figure 2). Analysis of Various Disorder Parameters To determine which specific disorder attributes contribute toward the differences observed between hubs and ends in each dataset, seven additional disorder parameters were calculated (see Materials and Methods for details). The results of a t-test for three representative disorder attributes (RdisAA, avgScore, and RnumDR) are shown in Table 2. The average disorder scores for hubs and ends were significantly different in all four organisms (p Yeast > Worm” [11] (note that human genome was not available at that time), and “Human > Fly > Yeast > Worm” [12]. 
Interestingly, when the prediction of disorder was carried out on all proteins (hubs, ends, and proteins with two to nine partners) from the networks in the present study (unpublished data), the ranking “Human > Fly > Yeast > Worm” agreed with the previous studies that were carried out on complete genomes. At the same time, the relative percentages of predicted disorder in the networks were generally higher than those reported previously for the complete genomes [11], even though the same predictor PONDR VL-XT was used in both studies. This result may indicate that proteins that interact with other proteins are on average more disordered than proteins that interact with ligands, such as nucleic acids, small molecules, lipids, etc. Another interesting observation that follows from comparison of the networks to the complete genomes is that the disorder content of the proteomes is closer to the disorder content of ends than to the disorder content of hubs (unpublished data). Although differing views regarding the scale-free nature of the protein interaction networks exist [40,41], it is still tempting to speculate that this bias could be explained by a potentially higher fraction of ends as compared with hubs in all genomes. We previously determined that human cell signaling and cancer-associated proteins are significantly more disordered than proteins from other functional categories [13]. Interestingly, the disorder content of HUMAN hubs (Figure 1) is very similar to that of human regulatory and cancer-associated proteins, suggesting that many cell signaling and regulatory proteins are network hubs. The high disorder content of hubs relates directly to their function. Intrinsic disorder provides several important functional benefits for interactions with multiple partners. First, it allows hubs to adapt to the structure of a variety of differently shaped binding partners. 
Such structural malleability is especially important for hubs that interact with their partners using the same or overlapping binding surfaces. Second, disorder may enable a hub protein to elicit both inhibiting and activating effects on different partners, as was recently noted for moonlighting proteins [42]. Third, structural plasticity may enable some proteins to serve as hubs in multiple and distinct signaling networks. One example of such a hub is glycogen synthase kinase 3β, which uses two different ID regions to participate in two unrelated signaling pathways, Wnt and insulin signaling [18]. While intrinsic disorder is an important feature of hub proteins, many ordered hub proteins also exist [18]. Interestingly, it has been recently proposed that ordered hubs have higher surface charge than nonhub proteins, and that this increased charge is likely to have an impact on their binding ability [43]. Furthermore, it has been noted that the binding partners of several ordered hubs are intrinsically disordered [18]. The examples include the partners of 14-3-3 proteins [44] the partners of β-catenin [45], and the partners of several other proteins (such as calmodulin, actin, and Cdk) [18]. The results of the present study suggest that wholly ordered hubs, as defined by the CDF/CH consensus classification, constitute a substantial fraction of all hub proteins and are especially prevalent in the YEAST dataset (Table 3). Among all the networks examined here, the YEAST interaction network appears to exhibit the smallest difference between hubs and ends in terms of predicted disorder, at least when literature-curated interactions are considered (Figure 1, Tables 2 and 3, compare with Figures S1 and S3). Notably, the amino acid composition of proteins from the YEAST network appears to be the least similar to the three other organisms (Figure 3). 
In addition, the proportion of wholly ordered proteins within both YEAST hubs and YEAST ends is the highest among the four datasets (Table 3, Table S3). A plausible explanation of the smaller differences in disorder content of YEAST hubs and ends is that the interactomes of the unicellular organisms are inherently simpler than metazoan interactomes due to less sophisticated signaling and regulation pathways. Because of their greater simplicity, these yeast pathways may rely less heavily on disorder than the networks of higher eukaryotes. In summary, the present study shows that intrinsic structural disorder is a distinctive and common characteristic of eukaryotic hub proteins, and it suggests that disorder may serve as a determinant of protein interactivity. In the future, it would be interesting to compare more specialized signaling and metabolic networks to each other to determine whether the high disorder content of hubs is a common feature of all cellular networks. In addition, it would certainly be interesting to perform the disorder analysis on the complete interactomes (when they are available) to determine whether similar conclusions are reached. Materials and Methods Datasets. The protein–protein interaction datasets for each organism (Table 1) were constructed as follows: (i) The interaction dataset for C. elegans (WORM) corresponds to the “First-Pass” interactions of the worm interactome version 5, or “WI5” [3]; (ii) The interaction dataset for H. sapiens (HUMAN) represents a union of the CCSB human interactome version 1, or “CCSB-HI1” extracted from Rual et al. [6] and high-confidence interactions with three or more quality points extracted from Stelzl et al. [7]; (iii) The interaction dataset for D. 
melanogaster (FLY) represents a union of literature-curated Drosophila interactions stored in the BIND (http://www.bind.ca), DIP (http://dip.doe-mbi.ucla.edu), and MINT (http://mint.bio.uniroma2.it/mint) interactions databases; (iv) The interaction dataset for S. cerevisiae (YEAST) represents the union of literature-curated yeast interactions stored in the BIND, DIP, and MINT interactions databases; (v) The dataset O_PDB_S25 contains only ordered parts of proteins extracted from the database PDB Select 25 [28]. The disorder predictions on this mostly nonredundant dataset served as a control for estimating the false-positive prediction error rate; (vi) The DisProt dataset consists of experimentally verified disordered protein regions extracted from the DisProt database [27]. Four additional datasets, WORM BioGRID, HUMAN HPRD, FLY BioGRID, and YEAST BioGRID (Table S1), to which no confidence-based filtering has been applied, were extracted from BioGRID [23] and HPRD [24] and used for comparison. The redundancy removal from all datasets did not significantly reduce the number of interactions. On average, only 2.2% of interactions were removed at 70% protein sequence identity level, and 15.6% of interactions were removed at 30% protein sequence identity level (unpublished data). Therefore, the original datasets were used in the present study. Since a clear definition of a hub protein, in terms of a number of interacting partners, is not well-established, and since the definition might vary from one dataset to another, we somewhat arbitrarily chose ten partners as a cutoff value and defined proteins with ≥10 partners as hubs. Proteins with one interacting partner are defined here as ends. However, it should be mentioned that varying the cutoffs of hub definition gives rise to similar results (Figures S2 and S3). Disorder predictions. Predictions of intrinsic disorder were carried out using the well-characterized disorder predictor PONDR VL-XT [21,22].
This predictor was trained on the experimentally (X-ray and NMR) confirmed disordered protein regions, while the ordered training set included completely ordered proteins extracted from the nonredundant set of proteins from PDB Select 25 [28]. The accuracy of this predictor, benchmarked on the 42 CASP5 targets, reached 72.8% [46]. PONDR VL-XT is currently being used successfully to guide the removal of disordered regions that interfere with crystallization of problematic proteins for high-throughput structure determination [47]. Access to PONDR VL-XT (http://www.pondr.com) was provided by Molecular Kinetics (Indianapolis, Indiana, United States). Disorder parameters. The following disorder parameters (Table 2) have been calculated for all studied datasets: (i) disAA, the number of predicted disordered residues in the protein; (ii) avgScore, the average disorder prediction score for an entire protein; (iii) shortDR, the number of continuous, predicted disordered regions of length 10–30 amino acids; (iv) medDR, the number of continuous, predicted disordered regions of length 31–60 amino acids; (v) longDR, the number of continuous, predicted disordered regions of length 61–longest DR; (vi) numDR, the number of continuous, predicted disordered regions of length 10–longest DR; (vii) maxDR, the longest predicted disordered region in the protein. To eliminate the dependency of calculated parameters on protein length, the relative values of the attributes (RdisAA, RshortDR, RmedDR, RlongDR, RallDR, and RmaxDR) were derived by dividing the numerical value of each attribute by the protein length. Student's t-test was used to calculate p-values in Table 2. Consensus classification. Predictions of wholly ordered and wholly disordered proteins (Table 3) were made as previously described [25]. Briefly, these predictions assume that proteins fall into one of two classes: wholly disordered or wholly ordered. 
PONDR VL-XT CDF classification [11] and CH classification [48] were used to make predictions based on the consensus between the two methods. A degree of confidence was derived for both methods, and, for the purposes of consensus prediction, predictions were taken as being either high or low confidence. If both methods agree, a protein is assigned to that class. If one method gives a high confidence prediction and the other a low confidence prediction, a protein is assigned the class indicated by the high confidence prediction. Finally, if the methods disagree and both give either high confidence or low confidence prediction, the protein is left unclassified. The normal test for two binomial proportions was used to calculate 95% confidence intervals and p-values for Table 3. Amino acid composition. The amino acid composition analysis was performed as previously described [22]. Briefly, the mole fraction of the amino acid in a database was calculated as: where Pji is the frequency of amino acid j in sequence i of length ni . The variances of the amino acids in the dataset were calculated as: where Var(Pji ) = Pji (1 – Pji )/ni . The fractional difference in composition between two datasets a and b was calculated as . The variances for these ratios were calculated as: where is the mole fraction of amino acid j in the dataset a, and is the variance of amino acid j in the dataset a. GO annotations. Gene Ontology (GO) [49] annotations for S. cerevisiae [29] were obtained from the GOA database [50]. The correlation between PONDR VL-XT disorder predictions and process/function/localization GO annotations were determined using an approach related to Fisher's permutation test [51]. This approach has been previously used to examine the association of disorder predictions and GO annotations [12]. In this test, a null distribution, which assumes no association between disorder predictions and annotations, is generated. 
Disorder predictions for adjacent residues are highly correlated due to overlapping compositional windows. To partially account for this, the observed disordered regions (rather than individual residue predictions) were permuted. Predicted disordered regions were randomly distributed 10,000 times for hubs and ends separately, and the number of disordered residues associated with specific annotations was counted. This null distribution was used to calculate a Z-score for the observed counts for each annotation, and significance was evaluated based on the number of trials that contradicted the hypothesis indicated by the Z-score. The calculated p-values have not been corrected for multiple testing. High-level GO annotations of interest were selected prior to testing, and results were restricted to annotations with at least five examples in each of the hubs and ends sets.

Supporting Information

Figure S1. The Percentages of Hub and End Proteins from BioGRID and HPRD with ≥30 to ≥100 Consecutive Residues Predicted to Be Disordered. 95% confidence intervals were calculated using the normal test for two binomial proportions. (911 KB EPS)

Figure S2. The Percentages of All Interacting Proteins from Four Datasets with ≥30 to ≥100 Consecutive Residues Predicted to Be Disordered. 95% confidence intervals were calculated using the normal test for two binomial proportions. (982 KB EPS)

Figure S3. The Percentages of All Interacting Proteins from BioGRID and HPRD with ≥30 to ≥100 Consecutive Residues Predicted to Be Disordered. 95% confidence intervals were calculated using the normal test for two binomial proportions. (913 KB EPS)

Table S1. Properties of Protein Interaction Datasets Derived from BioGRID and HPRD. (19 KB XLS)

Table S2. Disorder Attributes Calculated for Four Datasets. (23 KB XLS)

Table S3. Results of a Binary Classification Using Consensus Method on BioGRID and HPRD Datasets. The percentages of ordered, disordered, and unclassified proteins in each dataset are shown. (19 KB XLS)

Accession Numbers

Swiss-Prot (http://www.ebi.ac.uk/swissprot) accession numbers for proteins mentioned in this paper are: Abp1p (P15891), Act1p (P60010), Arp2 and Arp3 (P32381, P47117), Cmd1p (P06787), FlgM (P26477), Las17p (Q12446), Rvs167p (P39743), and Sla1p (P32790).
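
The disorder parameters and the consensus classification rule described in the Materials and Methods of the hub-protein study above can be illustrated with a short sketch. This is not the authors' code. It assumes per-residue disorder scores between 0 and 1 are already available (for example from a predictor such as PONDR VL-XT), that a residue counts as disordered when its score is at least 0.5, and that each classifier reports a hypothetical (label, confidence) pair. The function names and data layout are invented for illustration only.

```python
from typing import Dict, List, Optional, Tuple

THRESHOLD = 0.5   # assumed per-residue cutoff for "disordered"; not stated in the text
MIN_REGION = 10   # regions shorter than 10 residues are not counted as DRs


def disordered_run_lengths(scores: List[float], threshold: float = THRESHOLD) -> List[int]:
    """Lengths of consecutive runs of residues with score >= threshold."""
    runs, current = [], 0
    for s in scores:
        if s >= threshold:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def disorder_parameters(scores: List[float]) -> Dict[str, float]:
    """Compute the seven attributes listed in the text, plus length-normalized values."""
    n = len(scores)
    runs = disordered_run_lengths(scores)
    counted = [r for r in runs if r >= MIN_REGION]
    params: Dict[str, float] = {
        "disAA": sum(runs),                                  # disordered residues
        "avgScore": sum(scores) / n,                         # mean prediction score
        "shortDR": sum(1 for r in counted if r <= 30),       # regions of 10-30 residues
        "medDR": sum(1 for r in counted if 31 <= r <= 60),   # regions of 31-60 residues
        "longDR": sum(1 for r in counted if r >= 61),        # regions of 61+ residues
        "numDR": len(counted),                               # all regions of 10+ residues
        "maxDR": max(runs, default=0),                       # longest region
    }
    # Relative values: each attribute divided by protein length.
    for key in ("disAA", "shortDR", "medDR", "longDR", "numDR", "maxDR"):
        params["R" + key] = params[key] / n
    return params


Call = Tuple[str, str]  # hypothetical format: (label, confidence), e.g. ("ordered", "high")


def consensus_class(cdf: Call, ch: Call) -> Optional[str]:
    """Combine two binary predictions following the rule described in the text."""
    (label_a, conf_a), (label_b, conf_b) = cdf, ch
    if label_a == label_b:
        return label_a                      # both methods agree
    if conf_a != conf_b:
        return label_a if conf_a == "high" else label_b  # high confidence wins
    return None                             # disagreement at equal confidence


def classify_by_partners(n_partners: int) -> str:
    """Hub/end definition from the text: one partner is an end, ten or more is a hub."""
    if n_partners < 1:
        raise ValueError("a protein in the network has at least one partner")
    if n_partners == 1:
        return "end"
    return "hub" if n_partners >= 10 else "intermediate"
```

For instance, a 59-residue toy protein with disordered stretches of 12, 35, and 4 residues would have two counted regions (one short, one medium) under these assumptions, because the 4-residue stretch falls below the 10-residue minimum.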

              The interplay between structure and function in intrinsically unstructured proteins.

              Intrinsically unstructured proteins (IUPs) are common in various proteomes and occupy a unique structural and functional niche in which function is directly linked to structural disorder. The evidence that these proteins exist without a well-defined folded structure in vitro is compelling, and justifies considering them a separate class within the protein world. In this paper, novel advances in the rapidly advancing field of IUPs are reviewed, with the major attention directed to the evidence of their unfolded character in vivo, the interplay of their residual structure and their various functional modes and the functional benefits their malleable structural state provides. Via all these details, it is demonstrated that in only a couple of years after its conception, the idea of protein disorder has already come of age and transformed our basic concepts of protein structure and function.

                Author and article information

                Contributors
                Journal
                Biophysical Journal (Biophys. J.)
                The Biophysical Society
                ISSN: 0006-3495 (print), 1542-0086 (electronic)
                19 December 2017
                Volume 113, Issue 12, Pages 2713-2722
                Affiliations
                [1] Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
                Author notes
                [∗] Corresponding author: sarah.shammas@bioch.ox.ac.uk
                [∗∗] Corresponding author: jc162@cam.ac.uk
                Article
                PII: S0006-3495(17)31131-1
                DOI: 10.1016/j.bpj.2017.10.016
                PMCID: PMC5770965
                PMID: 29262364
                c3a8ac25-c741-4de6-9df9-5b72dedd2266
                © 2017 Biophysical Society.

                This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

                History
                Received: 12 July 2017
                Accepted: 10 October 2017
                Categories
                Proteins

                Biophysics
