
      Protein similarity from knot theory: geometric convolution and line weavings.

      Journal of computational biology : a journal of computational molecular cell biology
      Algorithms; Biometry; Databases, Factual; Models, Molecular; Protein Structure, Tertiary; Proteins; chemistry; Sequence Alignment; Software


          Abstract

          Shape similarity is one of the most elusive and intriguing questions of nature and mathematics. Proteins provide a rich domain in which to test theories of shape similarity. Proteins can match at different scales and in different arrangements. Sometimes the detection of common local structure is sufficient to infer global alignment of two proteins; at other times it provides false information. Proteins with very low sequence identity may share large substructures, or perhaps just a central core. There are even examples of proteins with nearly identical primary sequences in which alpha-helices have become beta-sheets. Shape similarity can be formulated (i) in terms of global metrics, such as RMSD or Hausdorff distance, (ii) in terms of subgraph isomorphisms, such as the detection of shared substructures with similar relative locations, or (iii) purely topologically, in terms of structure preserving transformations. Existing protein structure detection programs are built on the first two types of similarity. The third forms the foundations of knot theory. The thesis of this paper is this: Protein similarity detection leads naturally to algorithms operating at the metric, relational, and isotopic scales. The paper introduces a definition of similarity based on atomic motions that preserve local backbone topology without incurring significant distance errors. Such motions are motivated by the physical requirements for rearranging subsequences of a protein. Similarity detection then seeks rigid body motions able to overlay pairs of substructures, each related by a substructure-preserving motion, without necessarily requiring global structure preservation. This definition is general enough to span a wide range of questions: One can ask for full rearrangement of one protein into another while preserving global topology, as in drug design; or one can ask for rearrangements of sets of smaller substructures, preserving local but not global topology, as in protein evolution. In the appendix, we exhibit an algorithm for answering the general rearrangement question. That algorithm has the complexity of robot motion planning. In the text, we consider a more common case in which one seeks protein similarity by rearrangements of relatively short peptide segments. We exhibit two algorithms, one based on writhing numbers and one based on line weavings. The algorithms have time complexities O(n^4) and O(s^11), respectively, where n is the maximum number of residues in the proteins being compared and s is the number of secondary structure elements. In practice, the running times were nearly interactive. We report results obtained with a dozen pairs of proteins, exhibiting a range of typical features.
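The quantities named in the abstract can be made concrete with a small sketch. The Python below is an illustration only, not the paper's algorithms: it computes the two global metrics mentioned (RMSD over already-superposed coordinates, and the symmetric Hausdorff distance) and a crude midpoint-rule estimate of the writhing number of a backbone treated as a polyline. The function names `rmsd`, `hausdorff`, and `writhe_estimate` are hypothetical, and the writhe discretization is an assumption chosen for brevity rather than the method used in the article.

```python
import math


def rmsd(a, b):
    """RMSD between two equal-length lists of 3D points.

    Assumes the two structures are already superposed and matched
    point-for-point; no optimal alignment is attempted here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / len(a))


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    def directed(xs, ys):
        # Largest distance from a point of xs to its nearest point of ys.
        return max(min(math.dist(x, y) for y in ys) for x in xs)
    return max(directed(a, b), directed(b, a))


def writhe_estimate(chain):
    """Midpoint-rule approximation of the Gauss double integral.

    Each segment of the polyline is represented by its midpoint and its
    segment vector. This is a coarse estimate and assumes no two segment
    midpoints coincide.
    """
    segs = []
    for p, q in zip(chain, chain[1:]):
        mid = tuple((pi + qi) / 2 for pi, qi in zip(p, q))
        vec = tuple(qi - pi for pi, qi in zip(p, q))
        segs.append((mid, vec))

    total = 0.0
    for i, (ri, ti) in enumerate(segs):
        for j, (rj, tj) in enumerate(segs):
            if i == j:
                continue
            d = tuple(x - y for x, y in zip(ri, rj))
            dist = math.sqrt(sum(x * x for x in d))
            # Cross product ti x tj.
            cross = (
                ti[1] * tj[2] - ti[2] * tj[1],
                ti[2] * tj[0] - ti[0] * tj[2],
                ti[0] * tj[1] - ti[1] * tj[0],
            )
            total += sum(c * x for c, x in zip(cross, d)) / dist ** 3
    return total / (4 * math.pi)


if __name__ == "__main__":
    # A short helix-like chain as stand-in backbone coordinates.
    helix = [(math.cos(0.5 * k), math.sin(0.5 * k), 0.15 * k) for k in range(20)]
    mirror = [(x, y, -z) for x, y, z in helix]
    print("RMSD helix vs mirror:", rmsd(helix, mirror))
    print("Hausdorff helix vs mirror:", hausdorff(helix, mirror))
    print("writhe helix:", writhe_estimate(helix))
    print("writhe mirror:", writhe_estimate(mirror))
```

The mirrored helix shows why a purely metric comparison and a topological one can disagree: the writhe estimate changes sign under reflection, while a planar chain yields zero.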


          Author and article information

          PMID: 16108707
          DOI: 10.1089/cmb.2005.12.609

