      Is Open Access

      Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites

      research-article


          Abstract

          Experimentally determined or computationally predicted protein phosphorylation sites for distinctive species are becoming increasingly common. In this paper, we compare the predictive performance of a novel classification algorithm with different encoding schemes to develop a rice-specific protein phosphorylation site predictor. Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. We have used SVM with AF-CKSAAP to construct a rice-specific protein phosphorylation site predictor, Rice_Phospho 1.0 (http://bioinformatics.fafu.edu.cn/rice_phospho1.0). We measure the Accuracy (ACC) and Matthews Correlation Coefficient (MCC) of Rice_Phospho 1.0 to be 82.0% and 0.64, respectively, significantly higher than those measures for other predictors such as Scansite, Musite, PlantPhos and PhosphoRice. Rice_Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC_Os03g51600.1, a protein sequence that did not appear in the training dataset. In summary, Rice_Phospho 1.0 outputs reliable predictions of protein phosphorylation sites in rice, and will serve as a useful tool to the community.
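
          The abstract names the AF-CKSAAP encoding and the ACC and MCC measures without spelling them out. The sketch below shows one common way such features and metrics are computed; it is not taken from the paper. The 20-letter alphabet, the window string, the `k_max` value, the function names, and the confusion-matrix counts are all illustrative assumptions, and the authors' actual window size, gap range, and normalisation may differ.

```python
from itertools import product
from math import sqrt

# Assumption: only the 20 standard residues are encoded; other symbols are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def af_features(window):
    """Amino acid occurrence frequency: one value per residue type."""
    n = len(window)
    return [window.count(a) / n for a in AMINO_ACIDS]


def cksaap_features(window, k_max=3):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    For each gap k, count residue pairs separated by k positions and
    divide by the number of such pairs in the window.
    """
    feats = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = len(window) - k - 1
        for i in range(max(n_pairs, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard symbols
                counts[pair] += 1
        feats.extend(counts[p] / n_pairs if n_pairs > 0 else 0.0 for p in PAIRS)
    return feats


def af_cksaap(window, k_max=3):
    """Concatenate AF and CKSAAP into one feature vector."""
    return af_features(window) + cksaap_features(window, k_max)


def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical 11-residue window centred on a candidate serine.
    window = "AKRSSSPEKLM"
    vector = af_cksaap(window, k_max=2)
    print(len(vector))  # 20 AF values + 3 gaps * 400 pair values

    # Invented confusion-matrix counts, not the paper's data.
    print(accuracy(tp=41, tn=41, fp=9, fn=9), mcc(tp=41, tn=41, fp=9, fn=9))
```

          A vector like this would be the input to an SVM classifier. MCC is usually reported alongside ACC because it stays informative when positive and negative sites are unevenly represented.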

          Most cited references 24

          Is Open Access

          A generic method for assignment of reliability scores applied to solvent accessibility predictions

          Background: Estimation of the reliability of specific real value predictions is nontrivial and the efficacy of this is often questionable. It is important to know if you can trust a given prediction and therefore the best methods associate a prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability for each prediction, in the form of a Z-score. Results: An ensemble of artificial neural networks has been trained on a set of experimentally solved protein structures to predict the relative exposure of the amino acids. The method assigns a reliability score to each surface accessibility prediction as an inherent part of the training process. This is in contrast to the most commonly used procedures where reliabilities are obtained by post-processing the output. Conclusion: The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best publicly available method, Real-SPINE. Both methods associate a reliability score with the individual predictions. However, our implementation of reliability scores in the form of a Z-score is shown to be the more informative measure for discriminating good predictions from bad ones in the entire range from completely buried to fully exposed amino acids. This is evident when comparing the Pearson's correlation coefficient for the upper 20% of predictions sorted according to reliability. For this subset, values of 0.79 and 0.74 are obtained using our method and the compared method, respectively. This tendency is true for any selected subset.

            KinasePhos: a web tool for identifying protein kinase-specific phosphorylation sites

            KinasePhos is a novel web server for computationally identifying catalytic kinase-specific phosphorylation sites. The known phosphorylation sites from public domain data sources are categorized by their annotated protein kinases. Based on the profile hidden Markov model, computational models are learned from the kinase-specific groups of the phosphorylation sites. After evaluating the learned models, the model with highest accuracy was selected from each kinase-specific group, for use in a web-based prediction tool for identifying protein phosphorylation sites. Therefore, this work developed a kinase-specific phosphorylation site prediction tool with both high sensitivity and specificity. The prediction tool is freely available at .

              Large-scale comparative phosphoproteomics identifies conserved phosphorylation sites in plants.

              Knowledge of phosphorylation events and their regulation is crucial to understand the functional biology of plants. Here, we report a large-scale phosphoproteome analysis in the model monocot rice (Oryza sativa japonica 'Nipponbare'), an economically important crop. Using unfractionated whole-cell lysates of rice cells, we identified 6,919 phosphopeptides from 3,393 proteins. To investigate the conservation of phosphoproteomes between plant species, we developed a novel phosphorylation-site evaluation method and performed a comparative analysis of rice and Arabidopsis (Arabidopsis thaliana). The ratio of tyrosine phosphorylation in the phosphoresidues of rice was equivalent to those in Arabidopsis and human. Furthermore, despite the phylogenetic distance and the use of different cell types, more than 50% of the phosphoproteins identified in rice and Arabidopsis, which possessed ortholog(s), had an orthologous phosphoprotein in the other species. Moreover, nearly half of the phosphorylated orthologous pairs were phosphorylated at equivalent sites. Further comparative analyses against the Medicago phosphoproteome also showed similar results. These data provide direct evidence for conserved regulatory mechanisms based on phosphorylation in plants. We also assessed the phosphorylation sites on nucleotide-binding leucine-rich repeat proteins and identified novel conserved phosphorylation sites that may regulate this class of proteins.

                Author and article information

                Journal
                Scientific Reports (Sci Rep)
                Nature Publishing Group
                ISSN 2045-2322
                07 July 2015
                Volume 5
                Affiliations
                [1 ]College of Life Sciences, Fujian Agriculture and Forestry University , Fuzhou 350002, China
                [2 ]Department of Mathematical Sciences, University of Essex , Wivenhoe Park, Colchester, CO4 3SQ, UK
                Author notes
                [*] These authors contributed equally to this work.

                Article
                srep11940
                10.1038/srep11940
                4493637
                26149854
                Copyright © 2015, Macmillan Publishers Limited

                This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
