INTRODUCTION
Lysozyme (EC 3.2.1.17) is a ubiquitous enzyme. Several types have been characterized: chicken (c-), goose (g-), phage, invertebrate (i-), plant, and bacterial types (for reviews, see [1]). The c-type enzymes are the most studied, and nearly 100 amino acid sequences have been established [2]. These enzymes share a high degree of similarity in their primary and tertiary structures, and their mechanism of action is very similar. They are considered to be involved in antibacterial defense; in certain groups of mammals (ruminants, colobine monkeys), c-type lysozyme was recruited to the stomach and became a digestive enzyme [3].
Human lysozyme is synthesized in the secretory cells of a variety of exocrine glands, and high concentrations have been detected, for example, in tears and mother's milk. The human lysozyme gene, its sequence organization, and its chromosomal localization have been described in detail by Peters et al. [4]. It consists of four exons and three introns. Other lysozyme genes have since been described, for example from hen (Gallus gallus) [5], rat (Rattus norvegicus) [6], cow (Bos taurus) [7], and pig (Sus scrofa) [8]. The present paper is devoted to the introns of these genes, and more particularly to the amino acid sequences obtained after their translation, which have so far not been studied.
METHODS
Translation and BLAST searches were performed according to Altschul et al. [9]. Hydrophobic cluster analysis (HCA) was carried out as described by Callebaut et al. [10].
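As an illustration of the translation step, the following minimal sketch translates a nucleotide sequence in all six reading frames, using the frame labels that appear in this paper (5′3′ frames 1 to 3 and 3′5′ frames 1 to 3) and writing stop codons as "*". It assumes the standard genetic code; the function names and the toy sequence are hypothetical and are not taken from the tools actually used in this study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard nuclear code applies to the sequences being translated).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, offset=0):
    """Translate one frame; stop codons become '*', unknown codons 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Return a dict mapping frame labels to translated peptide strings."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"5'3' frame {offset + 1}"] = translate(dna, offset)
        frames[f"3'5' frame {offset + 1}"] = translate(rc, offset)
    return frames


# Toy sequence for demonstration only (not a lysozyme intron).
for label, peptide in six_frame_translation("ATGGCCTAAGG").items():
    print(label, peptide)
```

Translation is not interrupted at stop codons here, so a translated intron is returned as one continuous string in which "*" marks each stop.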
RESULTS AND DISCUSSION
We investigated whether parts of lysozyme introns, once translated into amino acid sequences, had closely related counterparts in biologically active, well-defined proteins. Only longer sequences (30–45 amino acids) with E-values < 1 e-01, identities higher than 55%, and satisfactory HCA profiles were taken into consideration.
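The numerical part of these selection criteria can be expressed as a simple filter. The sketch below is only an illustration: the `Hit` record and the `passes_criteria` function are hypothetical names, the length range is treated as a lower bound of 30 residues (an assumption), and the HCA profile, which requires visual inspection, is not modelled. The two example hits use values quoted in Tables 1 and 2.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST hit between a translated intron fragment and a protein."""
    label: str
    e_value: float
    identity_pct: float
    fragment: str  # aligned fragment of the translated intron

    @property
    def length(self):
        return len(self.fragment)


def passes_criteria(hit, max_e=1e-01, min_identity=55.0, min_length=30):
    """Apply the E-value, identity, and length thresholds stated in the text.

    Assumption: the 30-45 residue range is used only as a minimum length.
    """
    return (
        hit.e_value < max_e
        and hit.identity_pct > min_identity
        and hit.length >= min_length
    )


hits = [
    Hit("Human zinc finger protein", 2e-08, 74.0,
        "EMGFHHVGQAGLELLASNDLPTSASQSGRITGVNHCTQP"),
    Hit("Major surface antigen precursor (pig intron 1)", 4.1, 50.0,
        "KFSW-SCSVPMAQWFKNLTPVAWVTA"),
]

for hit in hits:
    print(hit.label, passes_criteria(hit))
```

Keeping the thresholds as keyword arguments makes it easy to relax them, for instance when examining the shorter fragments found in the non-human introns.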
Human lysozyme
After translation, introns 1, 2, and 3 gave rise to peptide chains of 521, 646, and 284 amino acids, respectively. Only intron 1 (5′3′ frames 1 and 3) and intron 3 (3′5′ frame 2) had counterparts, as defined above, in various proteins. The presence of a stop codon did not constitute an obstacle. Fragments closely related to translated intron 1 were present in human zinc finger protein (O14628), human serine/threonine-protein kinase Nek4 (P51957), human thromboxane A2 receptor (P21731), and human mitogen-activated protein kinase 1 (O96J02). Table 1 illustrates these data for translated intron 1.
A) Human zinc finger protein (E = 2 e-08; Ident. = 74%)
214 | EMGFHHVGQAGLELLASNDLPTSASQSGRITGVNHCTQP | 253
    | EMGFHH  QA LELL S DLP SASQS  ITGVNH  QP |
 76 | EMGFHHATQACLELLGSSDLPASASQSAGITGVNHRAQP | 114
B) Human thromboxane A2 receptor (E = 3 e-05; Ident. = 65%)
159 | KTVSLCGPGWSAVA*SQLTATSAFWAQVILVLQPSE*L*LQ | 200
    |   VSLCGP WS VA S LTATSA   Q ILV QP E L LQ |
328 | RRVSLCGPAWSTVARSRLTATSASRVQAILVPQPPEQLGLQ | 368
C) Human serine/threonine-protein kinase Nek4 (E = 1 e-07; Ident. = 67%)
161 | SLTVWPRLECSGMISAHCNLCLLGSSDSRASAF*VAVTTGVYHHTQ | 207
    | SL   P LECSG I AH NL LLGSSDS ASA  VA  TGV HH Q |
457 | SLALSPKLECSGTILAHSNLRLLGSSDSPASASRVAGITGVCHHAQ | 502
D) Human mitogen-activated protein kinase kinase 1 (E = 7 e-05; Ident. = 69%)
166 | PRLECSGMISAHCNLCLLGSSDSRASAF*VAVTTGV | 202
    | PRLECSG IS HCNL L GSS S ASA  VA  TG  |
798 | PRLECSGTISPHCNLLLPGSSNSPASASRVAGITGL | 833
The numbers indicate the location of the fragment in the translated intron or in the protein.
* Corresponds to a stop codon.
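The identity values quoted in Table 1 can be cross-checked by counting the positions at which the two aligned fragments carry the same residue. The sketch below is an illustration under stated assumptions: the function names are hypothetical, gaps ("-") and stop codons ("*") are never counted as matches, and the percentage is taken over the full aligned length.

```python
def midline(query, subject):
    """Return the match line: identical residues are kept, others are blank."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        q if q == s and q not in "-*" else " "
        for q, s in zip(query, subject)
    )


def identity(query, subject):
    """Return (number of identical positions, aligned length)."""
    matches = sum(ch != " " for ch in midline(query, subject))
    return matches, len(query)


# Fragment pair A of Table 1 (translated intron 1 vs. zinc finger protein).
query = "EMGFHHVGQAGLELLASNDLPTSASQSGRITGVNHCTQP"
subject = "EMGFHHATQACLELLGSSDLPASASQSAGITGVNHRAQP"

matches, length = identity(query, subject)
print(midline(query, subject))
print(f"{matches}/{length} identical positions ({100 * matches / length:.1f}%)")
```

For pair A this counts 29 identical positions out of 39, in line with the 74% identity given in the table.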
Not only are the sequences reported in Table 1 related, but so are the secondary structures, as indicated in Figure 1, which shows the HCA diagrams corresponding to these closely related sequences.
It should be emphasized that a high number of other translated lysozyme intron sequences with less favorable E-values but nevertheless significant identities could be characterized in various proteins. All the peptides described above were situated in the first half of translated introns 1 and 3, where an Alu sequence is located. We were thus interested in extending the study to c-type lysozymes of other origins.
Cow, hen, pig, and rat lysozymes
The genes of these four lysozymes again contain three introns; however, these introns are devoid of Alu sequences. Nevertheless, after translation, some sequences, generally shorter than those from the human introns, corresponded, although to a lesser extent, to sequences contained in well-defined biologically active proteins: the identities were again around 60% but with more variable E-values. Some examples are given in Table 2.
Rat lysozyme, intron 2, 5′3′ frame 3 compared to ubiquitin-protein ligase Nedd-4 (E = 4 e-05; Ident. = 48%)
380 | QGSRAPGTGVTDSCELPCGCWESTPL---EEHPVLLASELLSS | 419
    |  G   PG  VTD CE PCGCWE  P    EEH    A    SS |
 26 | EGGGSPGSDVTDTCEPPCGCWELNPSSLEEEHVLFTAESIISS | 68
Rat lysozyme, intron 3, 3′5′ frame 1 compared to tumor necrosis factor ligand superfamily member 13B (B-cell activating factor) (E = 0.003; Ident. = 75%)
 39 | SDEDVELSAPPAPCLPGCCH | 58
    |    DV LSAPPAPCLPGC H |
134 | TEQDVDLSAPPAPCLPGCRH | 153
Pig lysozyme, intron 1, 5′3′ frame 3 compared to major surface antigen precursor (E = 4.1; Ident. = 50%)
506 | KFSW-SCSVPMAQWFKNLTPVAWVTA | 531
    |  FSW S  VP  QWF  L P  W  A |
343 | RFSWLSLLVPFVQWFVGLSPTVWLSA | 368
CONCLUSION
The present data constitute a contribution to studies devoted to the amino acid sequences of translated introns. These sequences seem to behave similarly to those corresponding to exons when the occurrence of the different amino acids (hydrophilic and hydrophobic) and the secondary structures are considered. The data also demonstrate that these intron sequences contain a high number of short, and in some cases long, sequences corresponding to parts of biologically active proteins.