The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis

Abstract

The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment.

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (hwp): nar

Journal ID (publisher-id): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): 08 July 2016

Publication date (Electronic): 29 April 2016

Publication date PMC-release: 29 April 2016

Volume: 44

Issue: Web Server issue

Pages: W410-W415

Affiliations

[1] Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen D-72076, Germany

[2] Group for Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen D-37077, Germany

Author notes

[*] To whom correspondence should be addressed. Tel: +49 7071 601 341; Fax: +49 7071 601 352; Email: andrei.lupas@tuebingen.mpg.de

Correspondence may also be addressed to Vikram Alva. Tel: +49 7071 601 451; Fax: +49 7071 601 352; Email: vikram.alva@tuebingen.mpg.de

Article

DOI: 10.1093/nar/gkw348

PMC ID: 4987908

PubMed ID: 27131380

SO-VID: 1cd7d832-b8bf-4f7d-be50-ddd5603773f0

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

History

Date accepted: 19 April 2016

Date revision received: 08 April 2016

Date received: 14 February 2016

Page count

Pages: 6

Custom metadata

cover-date 08 July 2016

ScienceOpen disciplines: Genetics

