The formation of protein-protein complexes is essential for proteins to perform their physiological functions in the cell. Mutations that prevent the correct complexes from forming can have serious consequences for the associated cellular processes. Because experimental determination of protein-protein binding affinity remains difficult on a large scale, computational methods for predicting the effects of mutations on binding affinity are highly desirable. We show that a scoring function based on interface structure profiles collected from analogous protein-protein interactions in the PDB is a powerful predictor of protein binding affinity changes upon mutation. As a standalone feature, the difference between the interface profile scores of the mutant and wild-type proteins has an accuracy equivalent to that of the best all-atom potentials, while being two orders of magnitude faster to compute once the profile has been constructed. Because of its unique sensitivity in capturing the evolutionary profiles of analogous binding interactions and its high speed of calculation, the interface profile score is also well suited as a complementary feature that can be combined with physics-based potentials to improve the accuracy of composite scoring approaches. By combining sequence-derived and residue-level coarse-grained potentials with the interface structure profile score, we constructed a composite model through random forest training that yields a Pearson correlation coefficient >0.8 between the predicted and observed binding free-energy changes upon mutation. This accuracy is comparable to, and in most cases better than, that of the current best methods, yet the model does not require high-resolution full-atomic models of the mutant structures. The binding interface profiling approach should find useful application in human-disease mutation recognition and protein interface design studies.
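To make the idea of a mutant-minus-wild-type profile score concrete, the following is a minimal sketch only. It assumes a simple log-odds profile built from residue counts at aligned interface positions, with a uniform background and a pseudocount; the actual scoring function, weighting, and alignment procedure are not specified in this abstract, and the names `build_profile`, `profile_score`, and `delta_profile_score` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_columns, pseudocount=1.0):
    """Build a per-position log-odds profile (illustrative assumption).

    aligned_columns: list of strings, one per interface position, each
    holding the residues observed at that position across a set of
    structurally analogous interfaces.
    """
    background = 1.0 / len(AMINO_ACIDS)  # uniform background, an assumption
    profile = []
    for column in aligned_columns:
        counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def profile_score(profile, interface_sequence):
    """Sum the log-odds of each residue at its interface position."""
    return sum(col[aa] for col, aa in zip(profile, interface_sequence))


def delta_profile_score(profile, wild_type, position, mutant_residue):
    """Score of the point mutant minus score of the wild type."""
    mutant = wild_type[:position] + mutant_residue + wild_type[position + 1:]
    return profile_score(profile, mutant) - profile_score(profile, wild_type)


# Toy alignment: three interface positions seen in five analogous complexes.
columns = ["LLLLI", "DDEDD", "GGGGG"]
profile = build_profile(columns)
delta = delta_profile_score(profile, "LDG", 0, "D")  # L -> D at position 0
```

In this toy setting, a negative difference means only that the mutant residue is rarer than the wild-type residue at that position of the hypothetical alignment. In a composite model of the kind described above, such a difference would be one input feature alongside the other potentials.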
Few proteins carry out their tasks in isolation. Instead, proteins combine with each other in complicated ways that can be affected either by the natural genetic variation that occurs among people or by disease-causing mutations such as those that occur in cancer or in genetic disorders. To understand how these mutations affect our health, it is necessary to understand how they change the strength of the interactions that bind proteins together. This is difficult to measure in a laboratory on a large scale, and scientists are increasingly turning to computational methods to predict these effects in advance. We show that by examining multiple alignments of similar protein-protein complex structures at their interface regions, new constraints based on the evolution of the three-dimensional structures of proteins can be derived and used to predict which mutations are compatible with two proteins interacting and which are not.