Searching for discrimination rules in protease proteolytic cleavage activity using genetic programming with a min-max scoring function.

Author(s): Ajit Narayanan, Q. Z. Yang, Sara K Doyle, T. Charlie Hodgman, J Dry, Rebecca Thomson, X. Wu

Publication date: 2003-10-31

Keywords: Algorithms, Artificial Intelligence, Binding Sites, Catalysis, Endopeptidases, chemistry, Enzyme Activation, Hydrolysis, Neural Networks (Computer), Oligopeptides, Peptide Fragments, chemical synthesis, Peptide Hydrolases, Protein Binding, Reproducibility of Results, Sensitivity and Specificity, Sequence Alignment, methods, Sequence Analysis, Protein, Structure-Activity Relationship

Abstract

This paper presents an algorithm that extracts discriminant rules from oligopeptides for predicting protease proteolytic cleavage activity. The algorithm is developed using genetic programming. Its three main components are a min-max scoring function, reverse Polish notation (RPN), and the use of minimum description length. The min-max scoring function uses amino acid similarity matrices to measure the similarity between an oligopeptide and a rule, where a rule is a complex algebraic equation of amino acids rather than a simple pattern sequence. The Fisher ratio is then calculated on the scores using the class labels associated with the oligopeptides, so that the discriminant ability of each rule can be evaluated. Using RPN simplifies the evolutionary operations and therefore reduces the computational cost. To prevent overfitting, the concept of minimum description length is used to penalize over-complicated rules. The fitness function therefore combines the Fisher ratio with the minimum description length for an efficient evolutionary process. When applied to four protease datasets (trypsin, Factor Xa, hepatitis C virus, and HIV protease cleavage site prediction), our algorithm is superior to C5, a conventional method for deriving decision trees.
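To make the moving parts of the abstract concrete, the following is a minimal Python sketch of how an RPN rule might be scored against peptides and then rated with a Fisher ratio and a length penalty. It is not the authors' implementation. The abstract does not define the rule grammar, the exact min-max scoring function, or the form of the description-length term, so the following are all assumptions made for illustration: the token format `(residue, position)`, the operator set (`min`, `max`, `+`), the toy similarity values (not a real substitution matrix), and the penalty `alpha * len(rule)`. The Fisher ratio uses the common two-class form, the squared difference of class means divided by the sum of class variances.

```python
from statistics import mean, pvariance

# Hypothetical toy similarity values; a real run would use an amino acid
# similarity matrix. Identical residues score 1.0, unlisted pairs 0.0.
TOY_SIMILARITY = {("K", "R"): 0.5}


def similarity(a, b):
    """Look up a symmetric similarity between two residues (toy values)."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get((a, b), TOY_SIMILARITY.get((b, a), 0.0))


# Assumed operator set for rules; the paper's actual operators may differ.
OPERATORS = {"min": min, "max": max, "+": lambda x, y: x + y}


def score(rule, peptide):
    """Evaluate an RPN rule against one peptide using a stack.

    Operand tokens are (residue, position) pairs and push the similarity
    between that residue and the peptide's residue at that position.
    Operator tokens are strings and combine the top two stack values.
    """
    stack = []
    for token in rule:
        if isinstance(token, str):
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            residue, position = token
            stack.append(similarity(residue, peptide[position]))
    return stack[0]


def fisher_ratio(pos_scores, neg_scores, eps=1e-9):
    """Two-class Fisher ratio; eps guards against zero total variance."""
    gap = mean(pos_scores) - mean(neg_scores)
    return gap ** 2 / (pvariance(pos_scores) + pvariance(neg_scores) + eps)


def fitness(rule, positives, negatives, alpha=0.01):
    """Fisher ratio minus an assumed length penalty standing in for MDL."""
    pos_scores = [score(rule, p) for p in positives]
    neg_scores = [score(rule, p) for p in negatives]
    return fisher_ratio(pos_scores, neg_scores) - alpha * len(rule)


# Usage: a rule meaning "K or R at position 0", written in postfix order.
rule = [("K", 0), ("R", 0), "max"]
cleaved = ["KA", "RG"]
uncleaved = ["AA", "KG"]
print(fitness(rule, cleaved, uncleaved))
```

Because the rule is a flat token list in postfix order, it can be evaluated with a single stack pass and no tree traversal, which is one plausible reading of why RPN keeps the evolutionary operations simple.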

Author and article information

PubMed ID: 14642665

ScienceOpen disciplines: Chemistry
