SHIFTX2: significantly improved protein chemical shift prediction

Abstract

We describe a new computer program, SHIFTX2, that rapidly and accurately calculates diamagnetic ¹H, ¹³C and ¹⁵N chemical shifts from protein coordinate data. SHIFTX2 is substantially more accurate than its predecessor (SHIFTX) and other existing protein chemical shift prediction programs: compared with the next best performing program, it is up to 26% better by correlation coefficient, with an RMS error that is up to 3.3× smaller. It also provides significantly more coverage (up to 10% more), is significantly faster (up to 8.5×) and can calculate a wider variety of backbone and side chain chemical shifts (up to 6×) than many other shift predictors. In particular, SHIFTX2 attains correlation coefficients between experimentally observed and predicted backbone chemical shifts of 0.9800 (¹⁵N), 0.9959 (¹³Cα), 0.9992 (¹³Cβ), 0.9676 (¹³C′), 0.9714 (¹HN) and 0.9744 (¹Hα), with RMS errors of 1.1169, 0.4412, 0.5163, 0.5330, 0.1711 and 0.1231 ppm, respectively. The correlation between SHIFTX2's predicted and observed side chain chemical shifts is 0.9787 (¹³C) and 0.9482 (¹H), with RMS errors of 0.9754 and 0.1723 ppm, respectively. SHIFTX2 achieves this level of accuracy by using a large, high quality database of training proteins (>190), by utilizing advanced machine learning techniques, by incorporating many more features (χ₂ and χ₃ angles, solvent accessibility, H-bond geometry, pH, temperature), and by combining sequence-based with structure-based chemical shift prediction techniques. With this substantial improvement in accuracy, we believe that SHIFTX2 will open the door to many long-anticipated applications of chemical shift prediction to protein structure determination, refinement and validation. SHIFTX2 is available both as a standalone program and as a web server (http://www.shiftx2.ca).
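The two accuracy measures quoted in the abstract, the correlation coefficient and the RMS error between observed and predicted shifts, can be illustrated with a minimal Python sketch. It assumes the correlation coefficient is the ordinary Pearson coefficient, which the abstract does not state explicitly. The function names and the shift values are hypothetical and invented for illustration; they are not taken from SHIFTX2 or from any real protein.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


def rms_error(observed, predicted):
    """Root-mean-square difference, in the same units as the inputs (ppm here)."""
    n = len(observed)
    if n != len(predicted) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical 13C-alpha shifts in ppm, invented purely for illustration.
observed_ca = [58.1, 62.4, 55.0, 52.3, 60.7]
predicted_ca = [58.4, 62.0, 55.3, 52.9, 60.2]

r = pearson_r(observed_ca, predicted_ca)
rmse = rms_error(observed_ca, predicted_ca)
print(f"correlation = {r:.4f}, RMS error = {rmse:.4f} ppm")
```

Computing both measures separately for each nucleus type, as the abstract reports them, matters because the typical shift ranges differ widely between nuclei: an RMS error of about 1 ppm for ¹⁵N and about 0.1 ppm for ¹Hα are not directly comparable.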

Electronic supplementary material

The online version of this article (doi:10.1007/s10858-011-9478-4) contains supplementary material, which is available to authorized users.

Author and article information

Contributors

+780-492-0383, david.wishart@ualberta.ca

Journal

Journal ID (nlm-ta): J Biomol NMR

Title: Journal of Biomolecular NMR

Publisher: Springer Netherlands (Dordrecht)

ISSN (Print): 0925-2738

ISSN (Electronic): 1573-5001

Publication date (Electronic): 30 March 2011

Publication date PMC-release: 30 March 2011

Publication date (Print): May 2011

Volume: 50

Issue: 1

Pages: 43-57

Affiliations

[1] Department of Computing Science, University of Alberta, Edmonton, AB, Canada

[2] Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada

[3] National Research Council, National Institute for Nanotechnology (NINT), Edmonton, AB T6G 2E8, Canada

[4] Department of Molecular Biology, Division of Bioinformatics, Center of Applied Molecular Engineering, University of Salzburg, Hellbrunnerstr. 34/3.OG, 5020 Salzburg, Austria

Article

Publisher ID: 9478

DOI: 10.1007/s10858-011-9478-4

PMC ID: 3085061

PubMed ID: 21448735

SO-VID: d551a689-8d80-462e-81eb-e6804afe58f6

History

Date received: 22 December 2010

Date accepted: 28 January 2011

Custom metadata

ScienceOpen disciplines: Molecular biology

Keywords: machine learning, chemical shift, protein, NMR

