230
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          The MPI Bioinformatics Toolkit ( http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment.

          Related collections

          Most cited references35

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          PredictProtein—an open resource for online prediction of protein structural and functional features

          PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            Kalign – an accurate and fast multiple sequence alignment algorithm

            Background The alignment of multiple protein sequences is a fundamental step in the analysis of biological data. It has traditionally been applied to analyzing protein families for conserved motifs, phylogeny, structural properties, and to improve sensitivity in homology searching. The availability of complete genome sequences has increased the demands on multiple sequence alignment (MSA) programs. Current MSA methods suffer from being either too inaccurate or too computationally expensive to be applied effectively in large-scale comparative genomics. Results We developed Kalign, a method employing the Wu-Manber string-matching algorithm, to improve both the accuracy and speed of multiple sequence alignment. We compared the speed and accuracy of Kalign to other popular methods using Balibase, Prefab, and a new large test set. Kalign was as accurate as the best other methods on small alignments, but significantly more accurate when aligning large and distantly related sets of sequences. In our comparisons, Kalign was about 10 times faster than ClustalW and, depending on the alignment size, up to 50 times faster than popular iterative methods. Conclusion Kalign is a fast and robust alignment method. It is especially well suited for the increasingly important task of aligning large numbers of sequences.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              An HMM model for coiled-coil domains and a comparison with PSSM-based predictions.

              Large-scale sequence data require methods for the automated annotation of protein domains. Many of the predictive methods are based either on a Position Specific Scoring Matrix (PSSM) of fixed length or on a window-less Hidden Markov Model (HMM). The performance of the two approaches is tested for Coiled-Coil Domains (CCDs). The prediction of CCDs is used frequently, and its optimization seems worthwhile. We have conceived MARCOIL, an HMM for the recognition of proteins with a CCD on a genomic scale. A cross-validated study suggests that MARCOIL improves predictions compared to the traditional PSSM algorithm, especially for some protein families and for short CCDs. The study was designed to reveal differences inherent in the two methods. Potential confounding factors such as differences in the dimension of parameter space and in the parameter values were avoided by using the same amino acid propensities and by keeping the transition probabilities of the HMM constant during cross-validation. The prediction program and the databases are available at http://www.wehi.edu.au/bioweb/Mauro/Marcoil
                Bookmark

                Author and article information

                Journal
                Nucleic Acids Res
                Nucleic Acids Res
                nar
                nar
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                08 July 2016
                29 April 2016
                29 April 2016
                : 44
                : Web Server issue
                : W410-W415
                Affiliations
                [1 ]Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen D-72076, Germany
                [2 ]Group for Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen D-37077, Germany
                Author notes
                [* ]To whom correspondence should be addressed. Tel: +49 7071 601 341; Fax: +49 7071 601 352; Email: andrei.lupas@ 123456tuebingen.mpg.de
                Correspondence may also be addressed to Vikram Alva. Tel: +49 7071 601 451; Fax: +49 7071 601 352; Email: vikram.alva@ 123456tuebingen.mpg.de
                Article
                10.1093/nar/gkw348
                4987908
                27131380
                1cd7d832-b8bf-4f7d-be50-ddd5603773f0
                © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@ 123456oup.com

                History
                : 19 April 2016
                : 08 April 2016
                : 14 February 2016
                Page count
                Pages: 6
                Categories
                Web Server issue
                Custom metadata
                08 July 2016

                Genetics
                Genetics

                Comments

                Comment on this article