
      SSpro/ACCpro 6: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, deep learning and structural similarity

      Bioinformatics
      Oxford University Press (OUP)


          Abstract

          Motivation

          Accurately predicting protein secondary structure and relative solvent accessibility is important for the study of protein evolution and structure, and is an early-stage component of typical protein 3D structure prediction pipelines.

          Results

          We present a new, improved version of the SSpro/ACCpro suite of predictors for protein secondary structure (in three and eight classes) and relative solvent accessibility. The changes include improved, TensorFlow-trained deep learning predictors; a richer set of profile features (232 features per residue position) and sequence-only features (71 features per position); a more recent Protein Data Bank (PDB) snapshot for training; better hyperparameter tuning; and improvements to the HOMOLpro module, which leverages structural information from protein segment homologs in the PDB. The new SSpro 6 outperforms the previous version (SSpro 5) by 3–4% in Q3 accuracy and, when used with HOMOLpro, reaches accuracy in the 95–100% range.
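As a rough illustration of the Q3 metric mentioned above, the sketch below computes the percentage of residues whose three-class label (helix H, strand E, coil C) is predicted correctly. The 8-to-3 class reduction shown is one commonly used convention and is an assumption here; the abstract does not state which mapping SSpro 6 uses. The function names and the example strings are hypothetical.

```python
# One common reduction of the eight DSSP states to three classes.
# This mapping is an assumption; the abstract does not specify the one used.
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C", "-": "C",   # turns, bends, coil
}


def reduce_to_three(ss8):
    """Map an eight-class secondary structure string to three classes."""
    return "".join(DSSP8_TO_3.get(state, "C") for state in ss8)


def q3_accuracy(predicted, observed):
    """Return the percentage of positions where the three-class labels agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical ten-residue example: one position out of ten is mispredicted.
observed = reduce_to_three("HHHGT-EEB-")
predicted = "HHHHCCEECC"
print(q3_accuracy(predicted, observed))  # 90.0
```

Benchmark figures are normally computed over a whole test set of proteins; this sketch scores a single sequence pair only.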

          Availability and implementation

          The predictors’ software, data and web servers are available through the SCRATCH suite of protein structure predictors at http://scratch.proteomics.ics.uci.edu. To maximize compatibility and ease of use, the deep learning predictors are re-implemented as pure Python/numpy code without a TensorFlow dependency.
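To give a sense of what running a trained network without TensorFlow can look like, here is a minimal numpy-only forward pass for a small fully connected classifier. It is a sketch under stated assumptions, not the SSpro 6 implementation: the layer sizes, the random weights, the class order "HEC" and the function names are all hypothetical. Only the figure of 232 profile features per residue position comes from the abstract.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def forward(features, weights):
    """Apply dense layers with ReLU, then a softmax output layer.

    features: array of shape (n_residues, n_features)
    weights:  list of (W, b) pairs, as they might be exported from a trained model
    """
    hidden = features
    for W, b in weights[:-1]:
        hidden = relu(hidden @ W + b)
    W_out, b_out = weights[-1]
    return softmax(hidden @ W_out + b_out)


# Hypothetical sizes and random weights stand in for exported trained parameters.
rng = np.random.default_rng(0)
n_features, n_hidden, n_classes = 232, 64, 3
weights = [
    (rng.normal(scale=0.1, size=(n_features, n_hidden)), np.zeros(n_hidden)),
    (rng.normal(scale=0.1, size=(n_hidden, n_classes)), np.zeros(n_classes)),
]

features = rng.normal(size=(10, n_features))  # ten residues of made-up features
probs = forward(features, weights)

labels = "HEC"  # assumed class order, for illustration only
prediction = "".join(labels[i] for i in probs.argmax(axis=1))
print(prediction)
```

Because inference needs only matrix products and element-wise functions, trained weights saved as plain arrays are enough to reproduce predictions with numpy as the sole dependency.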

          Supplementary information

          Supplementary data are available at Bioinformatics online.


          Most cited references (5)


          The Protein Data Bank.

          The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.
            Is Open Access

            Scalable web services for the PSIPRED Protein Analysis Workbench

            Here, we present the new UCL Bioinformatics Group’s PSIPRED Protein Analysis Workbench. The Workbench unites all of our previously available analysis methods into a single web-based framework. The new web portal provides a greatly streamlined user interface with a number of new features to allow users to better explore their results. We offer a number of additional services to enable computationally scalable execution of our prediction methods; these include SOAP and XML-RPC web server access and new HADOOP packages. All software and services are available via the UCL Bioinformatics Group website at http://bioinf.cs.ucl.ac.uk/.

              SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity.

              Accurately predicting protein secondary structure and relative solvent accessibility is important for the study of protein evolution, structure and function and as a component of protein 3D structure prediction pipelines. Most predictors use a combination of machine learning and profiles, and thus must be retrained and assessed periodically as the number of available protein sequences and structures continues to grow.

                Author and article information

                Journal
                Bioinformatics
                Oxford University Press (OUP)
                1367-4803
                1460-2059
                April 01 2022
                March 28 2022
                February 02 2022
                Volume: 38
                Issue: 7
                Pages: 2064-2065
                Affiliations
                [1] Department of Computer Science, School of Information and Computer Sciences, University of California, Irvine, CA 92697, USA
                [2] Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, USA
                Article
                DOI: 10.1093/bioinformatics/btac019
                PMID: 35108364
                © 2022

                https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
