Prediction of Protein Binding Regions in Disordered Proteins

Abstract

Many disordered proteins function by binding to a structured partner and undergo a disorder-to-order transition. This coupled folding and binding can confer several functional advantages, such as precise control of binding specificity without increased affinity. Additionally, the inherent flexibility allows the binding site to adopt various conformations and to bind to multiple partners. These features explain the prevalence of such binding elements in signaling and regulatory processes. In this work, we report ANCHOR, a method for the prediction of disordered binding regions. ANCHOR relies on the pairwise energy estimation approach that is the basis of IUPred, a previous general disorder prediction method. To predict disordered binding regions, we seek to identify segments that are in disordered regions, cannot form enough favorable intrachain interactions to fold on their own, and are likely to gain stabilizing energy by interacting with a globular protein partner. The performance of ANCHOR was found to be largely independent of the amino acid composition and the adopted secondary structure. Longer binding sites were generally predicted to be segmented, in agreement with available experimentally characterized examples. Scanning several hundred proteomes showed that the occurrence of disordered binding sites increased with the complexity of the organisms, even when compared to disordered regions in general. Furthermore, the length distribution of binding sites differed from that of disordered protein regions in general and was dominated by shorter segments. These results underline the importance of disordered proteins and protein segments in establishing new binding regions. Due to their specific biophysical properties, disordered binding sites generally carry a robust sequence signal, and this signal is efficiently captured by our method. Through its generality, ANCHOR opens new ways to study the essential functional sites of disordered proteins.
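The three criteria named in the abstract can be pictured as a per-residue filter followed by grouping into segments. The following is only an illustrative sketch, not ANCHOR's actual scoring function: the function name, the three input score lists, the sign conventions, the cutoffs, and the minimum segment length are all hypothetical assumptions made for the example.

```python
def binding_segments(disorder, intrachain_energy, partner_gain,
                     disorder_cut=0.5, energy_cut=0.0, gain_cut=0.0, min_len=3):
    """Return (start, end) pairs (end exclusive) where all three criteria hold.

    All inputs are hypothetical per-residue scores of equal length:
      disorder          - higher means more likely disordered
      intrachain_energy - higher means less favorable intrachain interactions
      partner_gain      - higher means more energy gained with a globular partner
    """
    # Flag residues that satisfy every criterion at once (assumed cutoffs).
    flags = [
        d >= disorder_cut and e >= energy_cut and g > gain_cut
        for d, e, g in zip(disorder, intrachain_energy, partner_gain)
    ]

    segments = []
    start = None
    # A trailing False closes any segment that runs to the end of the sequence.
    for i, ok in enumerate(flags + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    return segments
```

Requiring all three conditions simultaneously mirrors the wording of the abstract; how the published method actually weights and combines its scores is not stated in this record.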

Author Summary

Intrinsically unstructured/disordered proteins (IUPs/IDPs) do not adopt a stable structure in isolation but exist as a highly flexible ensemble of conformations. Despite the lack of a well-defined structure, these proteins carry out important functions. Many IUPs/IDPs function by binding specifically to other macromolecules, a process that involves a disorder-to-order transition. The molecular recognition functions of IUPs/IDPs include regulatory and signaling interactions in which binding to multiple partners and high-specificity/low-affinity interactions play a crucial role. Due to their specific functional and structural properties, these binding regions have distinct properties compared to both globular proteins and disordered regions in general. Here, we present a general method to identify disordered binding regions from the amino acid sequence. Our method targets the essential feature of these regions: they behave in a characteristically different manner in isolation than when bound to their partner protein. This prediction method allows us to compare the binding properties of short and long binding sites. The evolutionary relationship between the amount of disordered binding regions and general disordered regions in various organisms was also analyzed. Our results suggest that disordered binding regions can be recognized even without taking into account their adopted secondary structure or their specific binding partner.

Most cited references 71

Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm.

Peter E. Wright, H. Jane Dyson (1999)

A major challenge in the post-genome era will be determination of the functions of the encoded protein sequences. Since it is generally assumed that the function of a protein is closely linked to its three-dimensional structure, prediction or experimental determination of the library of protein structures is a matter of high priority. However, a large proportion of gene sequences appear to code not for folded, globular proteins, but for long stretches of amino acids that are likely to be either unfolded in solution or adopt non-globular structures of unknown conformation. Characterization of the conformational propensities and function of the non-globular protein sequences represents a major challenge. The high proportion of these sequences in the genomes of all organisms studied to date argues for important, as yet unknown functions, since there could be no other reason for their persistence throughout evolution. Clearly the assumption that a folded three-dimensional structure is necessary for function needs to be re-examined. Although the functions of many proteins are directly related to their three-dimensional structures, numerous proteins that lack intrinsic globular structure under physiological conditions have now been recognized. Such proteins are frequently involved in some of the most important regulatory functions in the cell, and the lack of intrinsic structure in many cases is relieved when the protein binds to its target molecule. The intrinsic lack of structure can confer functional advantages on a protein, including the ability to bind to several different targets. It also allows precise control over the thermodynamics of the binding process and provides a simple mechanism for inducibility by phosphorylation or through interaction with other components of the cellular machinery. 
Numerous examples of domains that are unstructured in solution but which become structured upon binding to the target have been noted in the areas of cell cycle control and both transcriptional and translational regulation, and unstructured domains are present in proteins that are targeted for rapid destruction. Since such proteins participate in critical cellular control mechanisms, it appears likely that their rapid turnover, aided by their unstructured nature in the unbound state, provides a level of control that allows rapid and accurate responses of the cell to changing environmental conditions. Copyright 1999 Academic Press.

The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins.

Zsuzsanna Dosztányi, Veronika Csizmok, Peter Tompa … (2005)

The structural stability of a protein requires a large number of interresidue interactions. The energetic contribution of these can be approximated by low-resolution force fields extracted from known structures, based on observed amino acid pairing frequencies. The summation of such energies, however, cannot be carried out for proteins whose structure is not known or for intrinsically unstructured proteins. To overcome these limitations, we present a novel method for estimating the total pairwise interaction energy, based on a quadratic form in the amino acid composition of the protein. This approach is validated by the good correlation of the estimated and actual energies of proteins of known structure and by a clear separation of folded and disordered proteins in the energy space it defines. As the novel algorithm has not been trained on unstructured proteins, it substantiates the concept of protein disorder, i.e. that the inability to form a well-defined 3D structure is an intrinsic property of many proteins and protein domains. This property is encoded in their sequence, because their biased amino acid composition does not allow sufficient stabilizing interactions to form. By limiting the calculation to a predefined sequential neighborhood, the algorithm was turned into a position-specific scoring scheme that characterizes the tendency of a given amino acid to fall into an ordered or disordered region. This application we term IUPred and compare its performance with three generally accepted predictors, PONDR VL3H, DISOPRED2 and GlobPlot on a database of disordered proteins.
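As a rough illustration of the idea described above, an energy estimate can be written as a quadratic form in the amino acid composition and then restricted to a sequence window to obtain a per-position score. This is a minimal sketch under stated assumptions: the function names, the window size, and especially `TOY_MATRIX` are hypothetical placeholders and are not the published IUPred parameters or its exact positional formula.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(segment):
    """Fraction of each of the 20 standard amino acids in `segment`."""
    counts = Counter(segment)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]

def quadratic_energy(freqs, matrix):
    """Quadratic form f^T M f: an energy estimate from composition alone."""
    n = len(freqs)
    return sum(freqs[i] * matrix[i][j] * freqs[j]
               for i in range(n) for j in range(n))

def positional_energy(sequence, matrix, half_window=10):
    """Score each position from the composition of its sequential neighborhood."""
    scores = []
    for k in range(len(sequence)):
        lo = max(0, k - half_window)
        hi = min(len(sequence), k + half_window + 1)
        scores.append(quadratic_energy(composition(sequence[lo:hi]), matrix))
    return scores

# Hypothetical toy matrix (assumption for illustration only): -1 for pairs of
# hydrophobic residues, 0 otherwise. The real parameters are derived from
# known structures and are not reproduced here.
HYDROPHOBIC = set("AVILMFWC")
TOY_MATRIX = [[-1.0 if a in HYDROPHOBIC and b in HYDROPHOBIC else 0.0
               for b in AMINO_ACIDS] for a in AMINO_ACIDS]
```

With this toy matrix, windows rich in hydrophobic residues receive more negative (more favorable) scores, while windows without them score zero, which is the qualitative separation the abstract attributes to biased amino acid composition.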

The molecular architecture of the nuclear pore complex.

Frank Alber, Svetlana Dokudovskaya, Liesbeth Veenhoff … (2007)

Nuclear pore complexes (NPCs) are proteinaceous assemblies of approximately 50 MDa that selectively transport cargoes across the nuclear envelope. To determine the molecular architecture of the yeast NPC, we collected a diverse set of biophysical and proteomic data, and developed a method for using these data to localize the NPC's 456 constituent proteins (see the accompanying paper). Our structure reveals that half of the NPC is made up of a core scaffold, which is structurally analogous to vesicle-coating complexes. This scaffold forms an interlaced network that coats the entire curved surface of the nuclear envelope membrane within which the NPC is embedded. The selective barrier for transport is formed by large numbers of proteins with disordered regions that line the inner face of the scaffold. The NPC consists of only a few structural modules that resemble each other in terms of the configuration of their homologous constituents, the most striking of these being a 16-fold repetition of 'columns'. These findings provide clues to the evolutionary origins of the NPC.

Author and article information

Contributors

: Role: Editor

Journal

Journal ID (nlm-ta): PLoS Comput Biol

Journal ID (publisher-id): plos

Journal ID (pmc): ploscomp

Title: PLoS Computational Biology

Publisher: Public Library of Science (San Francisco, USA)

ISSN (Print): 1553-734X

ISSN (Electronic): 1553-7358

Publication date Collection: May 2009

Publication date (Print): May 2009

Publication date (Electronic): 1 May 2009

Volume: 5

Issue: 5

Electronic Location Identifier: e1000376

Affiliations

[1]Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary

University of Bologna, Italy

Author notes

* E-mail: zsuzsa@enzim.hu

Conceived and designed the experiments: BM IS ZD. Performed the experiments: BM. Analyzed the data: BM IS ZD. Wrote the paper: BM IS ZD.

Article

Publisher ID: 08-PLCB-RA-1127R2

DOI: 10.1371/journal.pcbi.1000376

PMC ID: 2671142

PubMed ID: 19412530

SO-VID: f81975e6-7031-4829-b5aa-1d8c47ec9da2

Copyright © Mészáros et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received : 11 December 2008

Date accepted : 30 March 2009

Page count

Pages: 18
