
      PseAAC-General: Fast Building Various Modes of General Form of Chou’s Pseudo-Amino Acid Composition for Large-Scale Protein Datasets

      research-article


          Abstract

          The general form of pseudo-amino acid composition (PseAAC) has been widely used to represent protein sequences when predicting protein structural and functional attributes. We developed the program PseAAC-General to generate various modes of Chou’s general PseAAC, such as the gene ontology mode, the functional domain mode, and the sequential evolution mode. The program also allows users to define their own modes. In every mode, 544 physicochemical properties of the amino acids are available to choose from. Its computing efficiency is at least 100 times that of existing programs, which makes it suitable for large-scale studies of proteins and peptides. PseAAC-General is freely available via SourceForge and runs on both Linux and Windows.
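For readers unfamiliar with the descriptor, the classic (type 1) pseudo-amino acid composition that the general form extends can be sketched as below. This is a minimal illustration, not the interface or code of PseAAC-General: the function names, the defaults for `lam` (number of sequence-order tiers) and `w` (weight factor), and the use of the Kyte-Doolittle hydropathy index as the only property are all assumptions made for the example.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index, used here only as a stand-in property
# (an assumption for illustration; any per-residue property table would do).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(prop):
    """Rescale a property table to zero mean and unit standard deviation."""
    values = list(prop.values())
    mu, sigma = mean(values), pstdev(values)
    return {aa: (v - mu) / sigma for aa, v in prop.items()}


def pseaac(seq, props, lam=3, w=0.05):
    """Return the 20 + lam components of a type 1 PseAAC vector."""
    seq = seq.upper()
    if any(aa not in AMINO_ACIDS for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if lam >= len(seq):
        raise ValueError("lam must be smaller than the sequence length")

    std_props = [standardize(p) for p in props]

    # Conventional amino acid composition (occurrence frequencies).
    freqs = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

    # Sequence-order correlation factors theta_1 .. theta_lam: for tier k,
    # average the property-space squared distance between residues k apart.
    thetas = []
    for k in range(1, lam + 1):
        pair_scores = [
            mean([(p[a] - p[b]) ** 2 for p in std_props])
            for a, b in zip(seq, seq[k:])
        ]
        thetas.append(mean(pair_scores))

    # Normalize so that all 20 + lam components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]
```

Calling `pseaac("MKTAYIAKQRQISFVKSHFSRQ", [KYTE_DOOLITTLE], lam=3)` yields a 23-component vector: 20 composition terms followed by 3 sequence-order terms. Passing several property tables in `props` averages their squared differences, which is how the classic formulation combines multiple physicochemical properties.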


          Most cited references (86)


          Cell-PLoc: a package of Web servers for predicting subcellular localization of proteins in various organisms.

          Information on subcellular localization of proteins is important to molecular cell biology, proteomics, systems biology and drug discovery. To provide the vast majority of experimental scientists with a user-friendly tool in these areas, we present a package of Web servers developed recently by hybridizing the 'higher level' approach with the ab initio approach. The package is called Cell-PLoc and contains the following six predictors: Euk-mPLoc, Hum-mPLoc, Plant-PLoc, Gpos-PLoc, Gneg-PLoc and Virus-PLoc, specialized for eukaryotic, human, plant, Gram-positive bacterial, Gram-negative bacterial and viral proteins, respectively. Using these Web servers, one can easily get the desired prediction results with a high expected accuracy, as demonstrated by a series of cross-validation tests on benchmark data sets that covered up to 22 subcellular location sites and in which no protein had ≥25% sequence identity to any other protein in the same subcellular-location subset. Some of these Web servers can also handle multiplex proteins, which may simultaneously exist at, or move between, two or more different subcellular locations. Proteins with multiple locations or dynamic features of this kind are particularly interesting, because they may have special biological functions of interest to investigators in both basic research and drug discovery. This protocol is a step-by-step guide on how to use the Web-server predictors in the Cell-PLoc package. The computational time for each prediction is less than 5 s in most cases. The Cell-PLoc package is freely accessible at http://chou.med.harvard.edu/bioinf/Cell-PLoc.

            propy: a tool to generate various modes of Chou's PseAAC.

            Sequence-derived structural and physicochemical features have been frequently used for analysing and predicting structural, functional, expression and interaction profiles of proteins and peptides. To facilitate extensive studies of proteins and peptides, we developed a freely available, open source Python package called protein in python (propy) for calculating the widely used structural and physicochemical features of proteins and peptides from amino acid sequence. It computes five feature groups composed of 13 features, including amino acid composition, dipeptide composition, tripeptide composition, normalized Moreau-Broto autocorrelation, Moran autocorrelation, Geary autocorrelation, sequence-order-coupling number, quasi-sequence-order descriptors, composition, transition and distribution of various structural and physicochemical properties and two types of pseudo amino acid composition (PseAAC) descriptors. These features could be generally regarded as different Chou's PseAAC modes. In addition, it can easily compute these descriptors from user-defined properties, which are automatically available from the AAindex database. The Python package, propy, is freely available via http://code.google.com/p/protpy/downloads/list, and it runs on Linux and MS-Windows. Supplementary data are available at Bioinformatics online.

              PseAAC: a flexible web server for generating various kinds of protein pseudo amino acid composition.

              The pseudo amino acid (PseAA) composition can represent a protein sequence in a discrete model without completely losing its sequence-order information, and hence has been widely applied to improve the prediction quality for various protein attributes. However, different problems may need different kinds of PseAA composition. Here, we present a web server called PseAAC at http://chou.med.harvard.edu/bioinf/PseAA/, with which users can generate various kinds of PseAA composition to best fit their needs.

                Author and article information

                Journal
                Title: International Journal of Molecular Sciences (Int J Mol Sci)
                Publisher: Molecular Diversity Preservation International (MDPI)
                ISSN: 1422-0067
                Issue date: March 2014
                Published online: 26 February 2014
                Volume: 15
                Issue: 3
                Pages: 3495-3506
                Affiliations
                [1] School of Computer Science and Technology, Tianjin University, Tianjin 300072, China; E-Mails: shuwanggu@gmail.com (S.G.); yasenjiao@gmail.com (Y.J.)
                [2] Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin 300072, China
                [3] Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
                Author notes
                [*] Author to whom correspondence should be addressed; E-Mail: pufengdu@gmail.com; Tel./Fax: +86-22-2368-9450.
                Article
                Publisher ID: ijms-15-03495
                DOI: 10.3390/ijms15033495
                PMC ID: 3975349
                PubMed ID: 24577312
                Record ID: f42b8ed2-b2dc-4fc8-940b-40fd72bf3091
                © 2014 by the authors; licensee MDPI, Basel, Switzerland

                This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license ( http://creativecommons.org/licenses/by/3.0/).

                History
                Received: 20 January 2014
                Revised: 13 February 2014
                Accepted: 14 February 2014
                Categories
                Technical Note

                Molecular biology
                general form, large-scale datasets, pseudo-amino acid composition
