O-linked glycosylation is an important post-translational modification of mucin-type proteins, changes to which are important biomarkers of cancer. For this study of the enzymes of O-glycosylation, we developed a shorthand notation for representing GalNAc-linked oligosaccharides, a method for their graphical interpretation, and a pattern-matching algorithm that generates networks of enzyme-catalysed reactions. Software for generating glycans from the enzyme activities is presented and is also available online. Simple graph-theoretic measures were used to characterise the resulting reaction networks, whose degree distributions were found to be Poisson in nature. In-silico single-enzyme knockouts of each of 25 enzymes known to be involved in mucin O-glycan biosynthesis showed that six of them, β-1,4-galactosyltransferase (β4Gal-T4), four other glycosyltransferases and one sulfotransferase, play the dominant role in determining O-glycan heterogeneity. In the absence of β4Gal-T4, all Lewis X, sialyl-Lewis X, Lewis Y and Sda/Cad glycoforms were eliminated, in contrast to knockouts of the N-acetylglucosaminyltransferases, which did not affect the relative abundances of O-glycans expressing these epitopes. A set of 244 experimentally determined mucin-type O-glycans obtained from the literature was used to validate the method, which was able to predict up to 98% of the most common structures obtained from human and engineered CHO cell glycoforms.
To model the enzymes of mucin-type O-linked glycosylation, we first developed a modelling language that represents O-glycan structures succinctly as linear strings. A set of pattern-matching rules was then applied to these strings to simulate the activities of 25 glycosyltransferase and sulfotransferase enzymes. The modelling language (a formal language), together with the set of transformation rules representing the enzymes of the model, comprises the deductive apparatus of a formal system. The system, implemented in software, was able to predict a highly heterogeneous set of structures when all enzymes were allowed to act, including many clinically important epitopes such as sialyl-Lewis X. We studied the effects of single-enzyme knockouts on the properties of the resulting enzyme-catalysed reaction networks and determined the enzymes most likely to be responsible for heterogeneity.
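The general idea of applying pattern-matching rules to linear glycan strings, and of repeating the exercise with one enzyme removed, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the string notation, the three rule names (`C1GalT`, `ST3Gal`, `ST6GalNAc`), the regular expressions, and the helper functions `apply_rule`, `build_network` and `knockout` are hypothetical and much simpler than the notation and the 25-enzyme rule set used in this study.

```python
import re
from collections import deque

# Hypothetical toy rules (NOT the rule set of the study):
# enzyme name -> (regex matched against the glycan string, replacement text).
RULES = {
    "C1GalT":    (r"^GalNAc$", "Gal(b1-3)GalNAc"),
    "ST3Gal":    (r"^Gal\(b1-3\)", "Neu5Ac(a2-3)Gal(b1-3)"),
    # Add a branch to the reducing-end GalNAc unless one is already present.
    "ST6GalNAc": (r"(?<!\])GalNAc$", "[Neu5Ac(a2-6)]GalNAc"),
}


def apply_rule(glycan, pattern, replacement):
    """Return the set of products from rewriting one match of the pattern."""
    products = set()
    for m in re.finditer(pattern, glycan):
        products.add(glycan[:m.start()] + replacement + glycan[m.end():])
    return products


def build_network(start, rules, max_size=1000):
    """Breadth-first expansion of a start glycan into a reaction network.

    Returns (nodes, edges); each edge is (substrate, enzyme, product).
    max_size is a safety cap in case a rule set never terminates.
    """
    nodes = {start}
    edges = set()
    queue = deque([start])
    while queue and len(nodes) < max_size:
        substrate = queue.popleft()
        for enzyme, (pattern, replacement) in rules.items():
            for product in apply_rule(substrate, pattern, replacement):
                edges.add((substrate, enzyme, product))
                if product not in nodes:
                    nodes.add(product)
                    queue.append(product)
    return nodes, edges


def knockout(rules, enzyme):
    """Return a copy of the rule set with one enzyme removed."""
    return {name: rule for name, rule in rules.items() if name != enzyme}


# Compare the full toy network with each single-enzyme knockout.
full_nodes, full_edges = build_network("GalNAc", RULES)
print("all enzymes:", len(full_nodes), "glycans,", len(full_edges), "reactions")
for name in RULES:
    ko_nodes, ko_edges = build_network("GalNAc", knockout(RULES, name))
    print("without", name + ":", len(ko_nodes), "glycans,", len(ko_edges), "reactions")
```

With these three toy rules the full network contains six glycans and six reactions, whereas removing the hypothetical `C1GalT` rule leaves only two glycans and one reaction. This mirrors, at a very small scale, how a knockout of a single enzyme can remove a large share of the predicted structures.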