Reconstructing Protein Structures by Neural Network Pairwise Interaction Fields and Iterative Decoy Set Construction

Abstract

Predicting the fold of a protein from its amino acid sequence is one of the grand problems in computational biology. While there has been progress towards a solution, especially when a protein can be modelled on one or more known structures (templates), even the best predictions are generally much less reliable when no templates are available. In this paper, we present an approach for predicting the three-dimensional structure of a protein from its sequence alone, for cases in which templates of known structure are not available. The approach relies on a simple reconstruction procedure guided by a novel knowledge-based evaluation function, implemented as a class of artificial neural networks that we have designed: Neural Network Pairwise Interaction Fields (NNPIF). This evaluation function takes into account the contextual information for each residue and is trained to distinguish native-like conformations from non-native-like ones, using large sets of decoys as a training set. The training set is generated and then iteratively expanded during successive folding simulations. As NNPIF are fast at evaluating conformations, thousands of models can be processed in a short amount of time, and clustering techniques can be adopted for model selection. Although the results we present here are very preliminary, we consider them promising, with predictions reaching state-of-the-art levels in some cases.
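The train, fold, expand loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. Conformations are plain lists of numbers, the evaluation function is a k-nearest-neighbour regressor standing in for NNPIF, and the folding simulation is a greedy random search. All function names and parameters (`make_evaluator`, `fold`, `iterative_decoy_construction`, `k`, `steps`, `step_size`) are hypothetical and chosen only for illustration.

```python
import random


def distance(a, b):
    """Root-mean-square difference between two equal-length toy conformations."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def make_evaluator(decoys, native, k=3):
    """Stand-in for training: k-NN regression of a decoy's distance to the native.

    Assumption: the native structure is known for training examples, so each
    decoy can be labelled with its true distance to the native.
    """
    labelled = [(d, distance(d, native)) for d in decoys]

    def evaluate(conf):
        # Predicted distance to native = mean label of the k closest decoys.
        nearest = sorted(labelled, key=lambda item: distance(conf, item[0]))[:k]
        return sum(label for _, label in nearest) / len(nearest)

    return evaluate


def fold(evaluate, start, rng, steps=50, step_size=0.5):
    """Greedy search: keep a random perturbation only if the evaluator scores it lower."""
    current, visited = list(start), []
    for _ in range(steps):
        candidate = [x + rng.uniform(-step_size, step_size) for x in current]
        visited.append(candidate)  # every sampled conformation becomes a decoy
        if evaluate(candidate) < evaluate(current):
            current = candidate
    return current, visited


def iterative_decoy_construction(native, rounds=3, n_initial=20, seed=0):
    """Alternate between (re)training the evaluator and folding to grow the decoy set."""
    rng = random.Random(seed)
    n = len(native)
    decoys = [[rng.uniform(-5, 5) for _ in range(n)] for _ in range(n_initial)]
    sizes = [len(decoys)]
    for _ in range(rounds):
        evaluate = make_evaluator(decoys, native)       # retrain on current decoys
        start = [rng.uniform(-5, 5) for _ in range(n)]  # fresh random start
        _, visited = fold(evaluate, start, rng)         # simulation guided by evaluator
        decoys.extend(visited)                          # expand the training set
        sizes.append(len(decoys))
    return decoys, sizes


if __name__ == "__main__":
    decoys, sizes = iterative_decoy_construction(native=[0.0] * 4)
    print("decoy set size after each round:", sizes)
```

The point of the sketch is the feedback loop: conformations sampled while following the current evaluator are added to the training set, so the next evaluator is trained on the regions that the search actually visits.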

Most cited references 53

Funnels, pathways, and the energy landscape of protein folding: a synthesis.

J Bryngelson, J. N. Onuchic, N. Socci … (1995)

The understanding, and even the description, of protein folding is impeded by the complexity of the process. Much of this complexity can be described and understood by taking a statistical approach to the energetics of protein conformation, that is, to the energy landscape. The statistical energy landscape approach explains when and why unique behaviors, such as specific folding pathways, occur in some proteins and more generally explains the distinction between folding processes common to all sequences and those peculiar to individual sequences. This approach also gives new, quantitative insights into the interpretation of experiments and simulations of protein folding thermodynamics and kinetics. Specifically, the picture provides simple explanations for folding as a two-state first-order phase transition, for the origin of metastable collapsed unfolded states and for the curved Arrhenius plots observed in both laboratory experiments and discrete lattice simulations. The relation of these quantitative ideas to folding pathways, to uniexponential vs. multiexponential behavior in protein folding experiments and to the effect of mutations on folding is also discussed. The success of energy landscape ideas in protein structure prediction is also described. The use of the energy landscape approach for analyzing data is illustrated with a quantitative analysis of some recent simulations, and a qualitative analysis of experiments on the folding of three proteins. The work unifies several previously proposed ideas concerning the mechanism of protein folding and delimits the regions of validity of these ideas under different thermodynamic conditions.

PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments.

David T. W. Jones, Daniel Buchan, Domenico Cozzetto … (2012)

The accurate prediction of residue-residue contacts, critical for maintaining the native fold of a protein, remains an open problem in the field of structural bioinformatics. Interest in this long-standing problem has increased recently with algorithmic improvements and the rapid growth in the sizes of sequence families. Progress could have major impacts in both structure and function prediction to name but two benefits. Sequence-based contact predictions are usually made by identifying correlated mutations within multiple sequence alignments (MSAs), most commonly through the information-theoretic approach of calculating mutual information between pairs of sites in proteins. These predictions are often inaccurate because the true covariation signal in the MSA is often masked by biases from many ancillary indirect-coupling or phylogenetic effects. Here we present a novel method, PSICOV, which introduces the use of sparse inverse covariance estimation to the problem of protein contact prediction. Our method builds on work which had previously demonstrated corrections for phylogenetic and entropic correlation noise and allows accurate discrimination of direct from indirectly coupled mutation correlations in the MSA. PSICOV displays a mean precision substantially better than the best performing normalized mutual information approach and Bayesian networks. For 118 out of 150 targets, the L/5 (i.e. top-L/5 predictions for a protein of length L) precision for long-range contacts (sequence separation >23) was ≥ 0.5, which represents an improvement sufficient to be of significant benefit in protein structure prediction or model quality assessment. The PSICOV source code can be downloaded from http://bioinf.cs.ucl.ac.uk/downloads/PSICOV.

Protein structure prediction and structural genomics.

D. Baker, A. Sali (2001)

Genome sequencing projects are producing linear amino acid sequences, but full understanding of the biological role of these proteins will require knowledge of their structure and function. Although experimental structure determination methods are providing high-resolution structure information about a subset of the proteins, computational structure prediction methods will provide valuable information for the large fraction of sequences whose structures will not be determined experimentally. The first class of protein structure prediction methods, including threading and comparative modeling, rely on detectable similarity spanning most of the modeled sequence and at least one known structure. The second class of methods, de novo or ab initio methods, predict the structure from sequence alone, without relying on similarity at the fold level between the modeled sequence and any of the known structures. In this Viewpoint, we begin by describing the essential features of the methods, the accuracy of the models, and their application to the prediction and understanding of protein function, both for single proteins and on the scale of whole genomes. We then discuss the important role that protein structure prediction methods play in the growing worldwide effort in structural genomics.

Author and article information

Journal

Journal ID (nlm-ta): Biomolecules

Journal ID (iso-abbrev): Biomolecules

Title: Biomolecules

Publisher: MDPI

ISSN (Electronic): 2218-273X

Publication date (Collection): March 2014

Publication date (Electronic): 10 February 2014

Volume: 4

Issue: 1

Pages: 160-180

Affiliations

[1] School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland

[2] Complex and Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland

Author notes

[*] Author to whom correspondence should be addressed; E-Mail: gianluca.pollastri@ucd.ie; Tel.: +353-1-716-5382.

Article

Publisher ID: biomolecules-04-00160

DOI: 10.3390/biom4010160

PMC ID: 4030983

SO-VID: f5383ef1-5c49-47ce-a837-e28ed80d677c

License:

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

History

Date received: 24 December 2013

Date revision received: 22 January 2014

Date accepted: 30 January 2014
