The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, for function prediction, and for guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of uncharacterized enzymes and direct studies in enzyme engineering. Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.
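The relationship between conserved and reacting substructures can be illustrated with a minimal sketch. It assumes that the atoms of every substrate in a superfamily have already been mapped onto a shared numbering (for example by a common-substructure search, which is not shown here). The class name, the function name, and the toy data are hypothetical and are not taken from the study; the overlap fraction is one plausible way to express "how much of the conserved substructure is reacting", not necessarily the measure the authors used.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SuperfamilySubstructures:
    """Hypothetical container for one superfamily's substructure annotation."""
    name: str
    # Atom indices of the substructure shared by all known substrates.
    conserved_atoms: FrozenSet[int]
    # Atom indices that change during the catalyzed reactions.
    reacting_atoms: FrozenSet[int]


def reacting_fraction_of_conserved(sf: SuperfamilySubstructures) -> Optional[float]:
    """Return the fraction of conserved atoms that are also reacting.

    Returns None when there is no conserved substructure, because the
    fraction is undefined in that case.
    """
    if not sf.conserved_atoms:
        return None
    overlap = sf.conserved_atoms & sf.reacting_atoms
    return len(overlap) / len(sf.conserved_atoms)


# Invented toy data, for illustration only.
examples = [
    # Conserved substructure is mostly reacting.
    SuperfamilySubstructures("toy_A", frozenset({0, 1, 2, 3}), frozenset({0, 1, 2})),
    # Conserved substructure is mostly nonreacting.
    SuperfamilySubstructures("toy_B", frozenset(range(8)), frozenset({0, 9})),
]

if __name__ == "__main__":
    for sf in examples:
        print(sf.name, reacting_fraction_of_conserved(sf))
```

A value near 1 would correspond to a superfamily whose conserved substructure is mostly reacting, and a value near 0 to one whose conserved substructure is mostly nonreacting. Intermediate values are the range in which most of the analyzed superfamilies are reported to fall.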
Enzymes are biological molecules essential for catalyzing the chemical reactions in living systems, allowing organisms to convert nutrients into usable forms and to convert harmful or unneeded molecules into forms that can be reused or excreted. During enzyme evolution, enzymes maintain the ability to perform some aspects of their function while other aspects change to accommodate changing environmental conditions. In analogy to studies of enzyme evolution focused on conservation of sequence and structural motifs, we have examined a large number of enzyme superfamilies using a new computational analysis of patterns of substrate conservation. The results provide a more nuanced picture of enzyme evolution than that obtained either by detailed small-scale studies or by large-scale studies that have provided only general descriptions of function and substrate similarity. The superfamilies in our set fall along the entire spectrum from the conserved substructure being mostly reacting to mostly nonreacting, with most superfamilies falling in the intermediate range. This view of enzyme evolution suggests more complex patterns of functional divergence than those proposed by previous theories of enzyme evolution. The method has been automated to facilitate large-scale annotation of enzymes discovered in sequencing and structural genomics projects.