      Prediction of Protein Binding Regions in Disordered Proteins

      research-article

          Abstract

          Many disordered proteins function via binding to a structured partner and undergo a disorder-to-order transition. The coupled folding and binding can confer several functional advantages such as the precise control of binding specificity without increased affinity. Additionally, the inherent flexibility allows the binding site to adopt various conformations and to bind to multiple partners. These features explain the prevalence of such binding elements in signaling and regulatory processes. In this work, we report ANCHOR, a method for the prediction of disordered binding regions. ANCHOR relies on the pairwise energy estimation approach that is the basis of IUPred, a previous general disorder prediction method. In order to predict disordered binding regions, we seek to identify segments that are in disordered regions, cannot form enough favorable intrachain interactions to fold on their own, and are likely to gain stabilizing energy by interacting with a globular protein partner. The performance of ANCHOR was found to be largely independent of the amino acid composition and adopted secondary structure. Longer binding sites were generally predicted to be segmented, in agreement with available experimentally characterized examples. Scanning several hundred proteomes showed that the occurrence of disordered binding sites increased with the complexity of the organisms even compared to disordered regions in general. Furthermore, the length distribution of binding sites was different from disordered protein regions in general and was dominated by shorter segments. These results underline the importance of disordered proteins and protein segments in establishing new binding regions. Due to their specific biophysical properties, disordered binding sites generally carry a robust sequence signal, and this signal is efficiently captured by our method. Through its generality, ANCHOR opens new ways to study the essential functional sites of disordered proteins.
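
          The three criteria named in the abstract (located in a disordered region, unable to form enough favorable intrachain interactions, likely to gain stabilizing energy from a globular partner) can be illustrated with a toy filter. This is only a sketch of the selection logic, not ANCHOR's actual scoring: the three per-residue score lists, the sign conventions, the thresholds, and the function name `candidate_binding_segments` are all assumptions made for illustration.

```python
from typing import List, Tuple


def candidate_binding_segments(
    disorder: List[float],           # hypothetical per-residue disorder tendency (higher = more disordered)
    intrachain_energy: List[float],  # hypothetical intrachain energy (negative = favorable)
    partner_gain: List[float],       # hypothetical energy gain on binding (positive = stabilizing)
    disorder_min: float = 0.5,       # assumed threshold, not a published value
    energy_max: float = 0.0,         # assumed threshold, not a published value
    min_len: int = 3,                # assumed minimum segment length
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where all three criteria hold."""
    if not (len(disorder) == len(intrachain_energy) == len(partner_gain)):
        raise ValueError("score lists must have equal length")

    # A residue qualifies only if it satisfies all three conditions at once.
    flags = [
        d >= disorder_min and e >= energy_max and g > 0.0
        for d, e, g in zip(disorder, intrachain_energy, partner_gain)
    ]

    segments: List[Tuple[int, int]] = []
    start = None
    # The trailing False is a sentinel that closes a run ending at the last residue.
    for i, ok in enumerate(flags + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    return segments
```

          Requiring a minimum run length is one simple way to suppress isolated single-residue hits; the real method may smooth or combine its scores differently.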

          Author Summary

          Intrinsically unstructured/disordered proteins (IUPs/IDPs) do not adopt a stable structure in isolation but exist as a highly flexible ensemble of conformations. Despite the lack of a well-defined structure, these proteins carry out important functions. Many IUPs/IDPs function by binding specifically to other macromolecules, a process that involves a disorder-to-order transition. The molecular recognition functions of IUPs/IDPs include regulatory and signaling interactions where binding to multiple partners and high-specificity/low-affinity interactions play a crucial role. Due to their specific functional and structural properties, these binding regions have distinct properties compared to both globular proteins and disordered regions in general. Here, we present a general method to identify disordered binding regions from the amino acid sequence. Our method targets the essential feature of these regions: they behave in a characteristically different manner in isolation than when bound to their partner protein. This prediction method allows us to compare the binding properties of short and long binding sites. The evolutionary relationship between the amount of disordered binding regions and general disordered regions in various organisms was also analyzed. Our results suggest that disordered binding regions can be recognized even without taking into account their adopted secondary structure or their specific binding partner.

          Most cited references (71)

          Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm.

          A major challenge in the post-genome era will be determination of the functions of the encoded protein sequences. Since it is generally assumed that the function of a protein is closely linked to its three-dimensional structure, prediction or experimental determination of the library of protein structures is a matter of high priority. However, a large proportion of gene sequences appear to code not for folded, globular proteins, but for long stretches of amino acids that are likely to be either unfolded in solution or adopt non-globular structures of unknown conformation. Characterization of the conformational propensities and function of the non-globular protein sequences represents a major challenge. The high proportion of these sequences in the genomes of all organisms studied to date argues for important, as yet unknown functions, since there could be no other reason for their persistence throughout evolution. Clearly the assumption that a folded three-dimensional structure is necessary for function needs to be re-examined. Although the functions of many proteins are directly related to their three-dimensional structures, numerous proteins that lack intrinsic globular structure under physiological conditions have now been recognized. Such proteins are frequently involved in some of the most important regulatory functions in the cell, and the lack of intrinsic structure in many cases is relieved when the protein binds to its target molecule. The intrinsic lack of structure can confer functional advantages on a protein, including the ability to bind to several different targets. It also allows precise control over the thermodynamics of the binding process and provides a simple mechanism for inducibility by phosphorylation or through interaction with other components of the cellular machinery. 
Numerous examples of domains that are unstructured in solution but which become structured upon binding to the target have been noted in the areas of cell cycle control and both transcriptional and translational regulation, and unstructured domains are present in proteins that are targeted for rapid destruction. Since such proteins participate in critical cellular control mechanisms, it appears likely that their rapid turnover, aided by their unstructured nature in the unbound state, provides a level of control that allows rapid and accurate responses of the cell to changing environmental conditions. Copyright 1999 Academic Press.

            The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins.

            The structural stability of a protein requires a large number of interresidue interactions. The energetic contribution of these can be approximated by low-resolution force fields extracted from known structures, based on observed amino acid pairing frequencies. The summation of such energies, however, cannot be carried out for proteins whose structure is not known or for intrinsically unstructured proteins. To overcome these limitations, we present a novel method for estimating the total pairwise interaction energy, based on a quadratic form in the amino acid composition of the protein. This approach is validated by the good correlation of the estimated and actual energies of proteins of known structure and by a clear separation of folded and disordered proteins in the energy space it defines. As the novel algorithm has not been trained on unstructured proteins, it substantiates the concept of protein disorder, i.e. that the inability to form a well-defined 3D structure is an intrinsic property of many proteins and protein domains. This property is encoded in their sequence, because their biased amino acid composition does not allow sufficient stabilizing interactions to form. By limiting the calculation to a predefined sequential neighborhood, the algorithm was turned into a position-specific scoring scheme that characterizes the tendency of a given amino acid to fall into an ordered or disordered region. This application we term IUPred and compare its performance with three generally accepted predictors, PONDR VL3H, DISOPRED2 and GlobPlot on a database of disordered proteins.
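
            The idea of estimating pairwise energy as a quadratic form in the amino acid composition, and then restricting it to a sequential neighborhood to get a position-specific score, can be sketched as follows. This is a minimal illustration under stated assumptions: the matrix `TOY_P` contains made-up values over a two-letter alphabet (not the published parameters), the normalization by window length is assumed, and the function names are hypothetical rather than IUPred's own.

```python
from collections import Counter
from typing import Dict, List

# Made-up interaction matrix for illustration only (negative = favorable).
TOY_P: Dict[str, Dict[str, float]] = {
    "L": {"L": -1.0, "K": 0.2},
    "K": {"L": 0.2, "K": 0.5},
}


def composition(seq: str) -> Dict[str, float]:
    """Fraction of each residue type in the sequence."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def energy_per_residue(seq: str, P: Dict[str, Dict[str, float]]) -> float:
    """Quadratic form sum_i sum_j f_i * P[i][j] * f_j over composition f."""
    f = composition(seq)
    return sum(f[a] * P[a][b] * f[b] for a in f for b in f)


def windowed_profile(seq: str, P: Dict[str, Dict[str, float]], half_window: int) -> List[float]:
    """Position-specific score: the same quadratic form on a local window."""
    profile = []
    for i in range(len(seq)):
        window = seq[max(0, i - half_window): i + half_window + 1]
        profile.append(energy_per_residue(window, P))
    return profile
```

            With this toy matrix, a window rich in "L" scores lower (more favorable) than one rich in "K", which mirrors how a composition-based energy could separate order-prone from disorder-prone stretches.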

              The molecular architecture of the nuclear pore complex.

              Nuclear pore complexes (NPCs) are proteinaceous assemblies of approximately 50 MDa that selectively transport cargoes across the nuclear envelope. To determine the molecular architecture of the yeast NPC, we collected a diverse set of biophysical and proteomic data, and developed a method for using these data to localize the NPC's 456 constituent proteins (see the accompanying paper). Our structure reveals that half of the NPC is made up of a core scaffold, which is structurally analogous to vesicle-coating complexes. This scaffold forms an interlaced network that coats the entire curved surface of the nuclear envelope membrane within which the NPC is embedded. The selective barrier for transport is formed by large numbers of proteins with disordered regions that line the inner face of the scaffold. The NPC consists of only a few structural modules that resemble each other in terms of the configuration of their homologous constituents, the most striking of these being a 16-fold repetition of 'columns'. These findings provide clues to the evolutionary origins of the NPC.

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-734X, 1553-7358
                1 May 2009
                Volume 5, Issue 5: e1000376
                Affiliations
                [1] Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
                University of Bologna, Italy
                Author notes

                Conceived and designed the experiments: BM IS ZD. Performed the experiments: BM. Analyzed the data: BM IS ZD. Wrote the paper: BM IS ZD.

                Article
                Manuscript number: 08-PLCB-RA-1127R2
                DOI: 10.1371/journal.pcbi.1000376
                PMCID: PMC2671142
                PMID: 19412530
                Mészáros et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 11 December 2008
                Accepted: 30 March 2009
                Page count
                Pages: 18
                Categories
                Research Article
                Biochemistry/Bioinformatics
                Biochemistry/Theory and Simulation
                Computational Biology
                Computational Biology/Macromolecular Sequence Analysis
                Computational Biology/Systems Biology
                Evolutionary Biology/Bioinformatics
                Molecular Biology/Bioinformatics

                Quantitative & Systems biology