      2023 Scopus CiteScore is 2.3, SNIP 0.757, ranking 15/35 in Category "Veterinary (Miscellaneous)" and 219/344 "Medicine (Infectious Diseases)".  

      Interested in becoming a Zoonoses published author? Check out the call for papers on our website https://zoonoses-journal.org/index.php/2023/04/26/zoonoses-call-for-papers-2/

      • Platinum Open Access with no APCs & Fast peer review/Fast publication online after article acceptance

      In Silico Analysis of Cross-Species Sequence Variability in Host Interferon Antiviral Pathway Proteins and SARS-CoV-2 Susceptibility

            Abstract

            Background:

            Zoonotic transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been found to result in infections in more than 30 mammalian species. The SARS-CoV-2 spike protein binds to the host’s angiotensin converting enzyme 2 (ACE2) cell surface receptor to gain entry into the cell. ACE2 protein sequence conservation has therefore been evaluated across species, and species with amino acid substitutions in ACE2 were ranked low for susceptibility to SARS-CoV-2 infection. However, many of these species have become infected by the virus.

            Methods:

            This study investigated the conservation of 24 host protein targets, including the entry proteins ACE2 and transmembrane serine protease 2 (TMPRSS2); 21 proteins in the interferon-I (IFN-I) antiviral response pathway; and tetherin, a protein that suppresses new virion release from cells. Bioinformatics approaches including Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS), Molecular Operating Environment (MOE), and iCn3D software were used to compare protein sequence similarity, conserved domains, and critical amino acids for host-viral protein-protein interactions. The types of bonding interactions were scored, and the results were compared with empirical data indicating which species have or have not become infected.

            Results:

            This pathway approach revealed that 1) 13 proteins were conserved, whereas five lacked data sufficient to determine specific critical amino acids; 2) variation in protein-protein interfaces is tolerated for many amino acid substitutions, and these substitutions follow taxonomic clades rather than correlating with empirically determined species infection status; and 3) four proteins (MDA5, NEMO, IRF3, and ISG15) contained potential domains or specific amino acids whose substitution may result in PPI disruption.

            Conclusion:

            This work provides evidence that certain substitutions in four IFN-I antiviral pathway proteins appear able to disrupt interactions and may be distinctive to resistant species, thus potentially aiding in determining species’ likelihood of transmitting SARS-CoV-2.

            Main article text

            INTRODUCTION

            Many proteins in the innate immune system become involved in interactions with pathogens that enter host cells. Initiation of the interferon-I (IFN-I) antiviral response is necessary for expression of IFNs, which in turn promote the expression of interferon stimulated genes (ISGs) and thereby inhibit or limit viral replication. Early in infection, IFN-I is the main component of the innate immune system that is suppressed by the SARS-CoV-2 coronavirus [1,2]. Unchecked viral replication increases disease severity, host-to-host transmission, and the potential for mutations creating more virulent strains. Recent empirical investigations have shown that most studied mammals allow viral entry [3,4], whereas outcomes differ regarding the generation of viral loads, and intra- and interspecies transmission [5,6]. These findings have implicated factors in the antiviral pathway as important determinants of viral replication and the spread of infection, after viral entry into host cells.

            In this study, 24 proteins were investigated to understand their conservation across species. These proteins included entry proteins; proteins in the IFN-I antiviral response pathway; and tetherin (bone marrow stromal antigen 2 [BST-2]), a last defense to restrict virion release from cells (Table 1, S1 Table). The objective of this study was to evaluate pathway conservation and whether amino acid substitutions in IFN-I pathway proteins might help predict species susceptibility to SARS-CoV-2 infection. Our hypothesis was that amino acid substitutions affect not only entry proteins but also IFN-I protein target interactions with viral stressors, thereby creating cross-species variation in susceptibility to SARS-CoV-2.

            TABLE 1 |

            Host protein target and protein-protein interaction information.

            SARS-CoV-2 viral protein stressor | Host protein | Human protein length (amino acids) | National Center for Biotechnology Information (NCBI) accession | UniProt accession | Conserved domains (a) | Critical contact amino acids (b) (numbered according to human host protein sequence)
            Non-structural protein (NSP) 2 | 4EHP (EIF4E2): Eukaryotic translation initiation factor 4E type 2 | 245 | NP_004837.1 | O60573 | Not available (NA) | NA
            NSP1 | 40S ribosomal subunit, uS3 (S3) | 243 | NP_000996.2 | P23396 | pfam00189: Ribosomal_S3_C (105–188) | Bond: 12, 113, 114, 116, 117, 143, 148; Distance: 115, 142, 150, 177
            NSP1 | 40S ribosomal subunit, uS5 (S2) | 293 | NP_002943.2 | P15880 | NA | Bond: 106, 146, 147; Distance: 105, 107, 109, 111, 113, 122, 124, 145, 148, 151
            S (spike protein receptor binding domain [RBD]) | ACE2: Angiotensin converting enzyme 2 | 805 | NP_068576.1 | Q9BYF1 | NA | Bond: 19, 24, 30, 31, 35, 38, 41, 42, 83, 353; Distance: 27, 28, 34, 37, 45, 79, 82, 330, 354, 355, 357, 393
            Open reading frame (ORF) 7a | BST-2: Bone marrow stromal antigen 2 (tetherin) | 180 | NP_004326.1 | Q10589 | NA | Predicted pocket: 48, 49, 50, 51, 52, 53, 84, 85, 87, 88, 89, 91, 92, 105, 106, 108, 109, 110, 112, 113, 117
            N (nucleocapsid) | G3BP1: ras GTPase-activating protein SH3 domain–binding protein | 466 | NP_005745.1 | Q13283 | cd00780: NTF2 (Nuclear transport factor domain 2) (7–135) | Bond: 15, 18, 32, 34, 117, 122, 123, 124; Distance: 6, 10, 11, 14, 33, 58, 121, 125
            NSP2 | GIGYF2: GRB10-interacting GYF protein 2 | 1299 | NP_056390.2 | Q6Y7W6 | NA | Predicted binding region: 864, 877, 881–896
            NSP3 Papain-like protease (Plpro) | IRF3: Interferon regulatory factor 3 | 427 | NP_001562.1 | Q14653 | cd00103: IRF TF DNA recognition site (5–110); pfam10401: IRF-3 Superfamily res (202–380) | Bond: 211, 263, 285, 288, 290, 313, 349, 350, 351, 360; Distance: 260, 264, 287, 289, 292, 362; Cleavage site: 267, 268, 269, 270, 271, 272, 273
            NSP3 Papain-like protease (Plpro) | ISG15: Ubiquitin-like IFN stimulated gene 15 | 165 | NP_005092.1 | P05161 | cd01792: Ubl1_ISG15 (3–77); cd01810: Ubl2_ISG15 (82–155) | Bond: 22, 123, 127, 128, 130, 153, 154, 155, 156; Distance: 20, 23, 27, 30, 121, 129, 131, 132, 151, 152
            ORF6 | KPNA2: Karyopherin Subunit Alpha 2 (Importin subunit alpha-1) | 529 | NP_001307540.1 | P52292 | pfam01749: IBB importin beta binding domain (13–98) | NA
            M (membrane) | MAVS: Mitochondrial antiviral signaling protein | 540 | NP_065797.2 | Q7Z434 | cd08811: CARD_IPS1 (3–93) | Bond: 436, 440, 441, 442, 456, 457, 458, 459; Distance: 438, 439, 453, 455
            NSP3 Papain-like protease (Plpro) | MDA5: Melanoma differentiation-associated gene 5 (Interferon-induced helicase C domain-containing protein 1) | 1025 | NP_071451.2 | Q9BYX4 | cd08818: CARD_MDA5_r1 (8–99); cd08819: CARD_MDA5_r2 (111–201); cd12090: MDA5_ID (insert domain) (548–694); cd15807: MDA5_C (C-term) (900–1015) | ISGylation: 23, 43, 68; CARD oligomerization: 74, 75; Serine phosphorylation: 88, 104
            ORF9b | NEMO: Nuclear factor kappa-B (NF-κB) essential modulator | 419 | NP_003630.1 | Q9Y6K9 | cd09803: UBAN (polyubiquitin binding domain of NEMO) (258–344) | NA
            NSP5 3-chymotrypsin-like protease (3CLpro) | NEMO | | | | | Bond: 226, 228, 229, 230, 231, 232, 233, 234; Distance: 227, 235
            NSP1 | POLA1: DNA polymerase alpha 1, catalytic subunit | 1462 | NP_058633.2 | P09884 | cd05776: DNA_polB_alpha_exo (535–767) | Bond: 610, 613, 616, 617, 655; Distance: 594, 611, 614, 620, 621, 624, 625, 657
            ORF6 | RAE1: Ribonucleic acid export 1 | 368 | NP_003601.1 | P78406 | NA | Bond: 239, 256, 257, 258, 305, 306, 307, 309, 310; Distance: 255, 300, 301, 302, 308, 365
            N (nucleocapsid) | RIG-I: Retinoic acid-inducible gene I | 925 | NP_055129.2 | O95786 | cd08816: CARD_RIG-I_r1 (N-term) (2–92); cd08817: CARD_RIG-I_r2 (100–189); pfam04851: RES III (241–410) | NA
            NSP9 | SRP19: Signal recognition particle 19 | 144 | NP_003126.1 | P09132 | NA | NA
            NSP8 | SRP54: Signal recognition particle | 504 | NP_003127.1 | P61011 | NA | NA
            N (nucleocapsid), S (spike), NSP13 (helicase) | STAT1: Signal transducer and activator of transcription 1 | 750 | NP_009330.1 | P42224 | cd10372: SH2_STAT1 (557–707) | Bond: 628, 629, 630, 631, 632, 651, 652; Distance: 584, 585, 588, 616, 627, 633, 634, 653; Phosphorylation sites: 701, 702, 703, 704, 705, 706, 707, 724, 727
            N (nucleocapsid), Orf7a, Orf7b, NSP6, NSP13 | STAT2: Signal transducer and activator of transcription 2 | 851 | NP_005410.1 | P52630 | cd10373: SH2_STAT2 (556–706) | Phosphorylation sites: 583, 601, 627, 629, 690
            NSP5 (3CLpro), ORF3a | STING: Stimulator of interferon genes | 379 | NP_938023.1 | Q86WV6 | cd12146: STING_C (C-terminal binding domain of STING) (155–337) | NA
            NSP6 | TBK1: TANK-binding kinase 1 | 729 | NP_037386.1 | Q9UHD2 | NA | NA
            NSP13 | TBK1 | | | | cd13988: STKc_TBK1 (15–330) | Predicted from model: 50, 51, 52, 54, 123, 127, 132, 133, 134, 135, 159, 177, 180, 181, 182, 183, 200, 201, 202, 203, 204, 206, 207, 232, 235, 289, 290, 294, 295
            S (spike protein) | TMPRSS2: Transmembrane serine protease 2 | 492 | NP_005647.3 | O15393 | cd00190: Tryp_SPc (Trypsin-like serine protease active/cleavage site) (256–487) | Bond: 296, 299, 338, 340, 341, 342, 389, 391, 392, 436, 438, 439, 441, 460, 462, 464; Distance: 280, 300, 419, 435, 437, 440, 459, 461, 463, 465, 470, 471, 472
            ORF9b | TOMM70: Translocase of outer mitochondrial membrane | 608 | NP_055635.3 | O94826 | NA | Bond: 219, 225, 379, 381, 413, 447, 477, 484, 544, 545, 556, 580, 594; Distance: 215, 256, 259, 260, 340, 341, 375, 378, 409, 410, 412, 414, 443, 480, 481, 511, 515, 518, 521, 522, 546, 549, 550, 553, 557, 559, 561, 579, 583, 587, 590, 591, 598

            (a) For conserved domains, amino acids within the domain are given in parentheses, numbered according to the human protein; see the Conserved Domain Database (ncbi.nlm.nih.gov/cdd).

            (b) Critical amino acid interactions are identified by the type of contact as “bond” (hydrogen, metal, ionic, arene, or covalent) or “distance” (e.g., van der Waals forces), or by function (see Materials and Methods).

            To gain entry to cells, the SARS-CoV-2 viral spike (S) protein binds the angiotensin converting enzyme 2 (ACE2) receptor; subsequently the S protein is primed by transmembrane serine protease 2 (TMPRSS2), which proteolytically cleaves and activates viral envelope glycoproteins, thus enabling membrane fusion [7]. ACE2 has been the sole subject of studies aimed at predicting alternate host susceptibility. Since the first report of a SARS-CoV virus in 2003 [8], and in reports of the original SARS-CoV-2 Wuhan strain [3,9–13] and its variants [14,15], researchers have compared ACE2 sequences across species with respect to interacting amino acid residues. However, wild and domestic mammals have been reported to contract COVID-19 regardless of the wide range of substitutions in the ACE2-S interacting amino acids [4,16].

            Extensive protein-protein interaction (PPI) networks have been demonstrated among SARS-CoV-2 viral RNA, non-structural proteins (NSPs), structural and accessory (open reading frame [ORF]) proteins, and host molecules [17]. The viral structural proteins include the S, envelope (E), membrane (M), and nucleocapsid (N) proteins. Viral-host PPIs in the IFN-I pathway (S1 Fig), as well as proteins involved in viral entry and “exit,” have been investigated through comparison of full protein sequences, conserved functional domains, and individual amino acid residues across species (Table 1). The criteria for the selection of the proteins investigated herein were 1) evidence of PPI between a SARS-CoV-2 protein and the host protein, and 2) experimentally demonstrated inhibition of IFN and ISG expression caused by the viral-host PPI.

            New Approach Methods (NAMs) are being developed in toxicology to avoid the use of live animals in chemical safety evaluations [18]. These methods include in vitro and in silico methods that can be used to predict cross-species susceptibility to chemicals. In silico NAMs are well suited to cross-species evaluations of PPIs in pathogen infection. The US Environmental Protection Agency’s Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool (https://www.epa.gov/chemical-research/sequence-alignment-predict-across-species-susceptibility) [19] was applied to evaluate conservation of the PPIs between host and SARS-CoV-2 pathogen protein stressors across species. The SeqAPASS tool (Version 6.1) assesses protein sequence data across species, by comparing the full or primary amino acid sequence (Level 1), conserved portions of a sequence representing functional domain(s) (Level 2), and individual amino acid positions, to determine whether the critical contact residues are conserved (Level 3). Each level requires a more in-depth preliminary investigation of the protein to identify features and residues involved in interactions with a stressor. Empirical data found through literature searches, protein sequence and structural databases, and computational molecular models were used to identify available sequences and species known to have tested positive for SARS-CoV-2 RNA or antibodies (seroconversion). In addition, the literature was mined to identify conserved binding domains, and interacting amino acid residues between SARS-CoV-2 proteins and host protein targets. As empirically determined SARS-CoV-2 infections across many species of mammals continue to be reported [16], cell entry apparently occurs in these species.

            Most viral protein mutations in the recent SARS-CoV-2 variants have not had their structures elucidated beyond the ACE2-S protein interaction. Therefore, this analysis included only evaluations involving the original Wuhan strain of SARS-CoV-2.

            The pathway approach described herein offers a novel and comprehensive perspective on the crucial epidemiological concerns regarding various species’ propensity to allow viral production and transmission.

            MATERIALS AND METHODS

            Host protein target selection

            To determine protein targets of SARS-CoV-2, the first criterion for protein selection was evidence of PPI between a SARS-CoV-2 protein and the host protein. These proteins included the entry proteins ACE2 and TMPRSS2, and the exit protein tetherin. Second, IFN-I proteins were the study focus because of their key roles in initial antiviral defense as a response pathway. Therefore, proteins in the canonical innate immune pathways involved in general responses to viruses (particularly those known to be involved in responses to coronaviruses, including the first SARS-CoV) were considered in our literature review. The literature searches were conducted in the PubMed and Google Scholar databases. A wide variety of search terms were used in conjunction with “SARS-CoV-2” and “COVID-19” to identify information on specific protein interactions, such as “SARS-CoV-2 nsp3+IRF3” (with both acronyms and full terms used in search strings). This process identified proteins that are crucial components of the IFN-I pathway signaling cascade and that have also been shown to interact with SARS-CoV-2 proteins (S1A and B Fig). Searches were then performed in the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) structure database (RCSB.org), National Center for Biotechnology Information (NCBI; ncbi.nlm.nih.gov/protein), and UniProt (uniprot.org) protein databanks, as well as the Conserved Domain Database (CDD; ncbi.nlm.nih.gov/cdd) [20,21], to find structural, functional, and conserved domain information for the identified host protein targets of SARS-CoV-2 proteins. Critical amino acids for binding were determined for targets for which experimental data (such as site-directed mutagenesis, solved structures, or structural modeling) were available. The 24 proteins meeting these criteria are listed in Table 1.

            Determination of critical amino acids for protein-protein interaction

            Critical amino acids for host-viral PPI were determined from the literature for several targets for which experiments such as truncation (MAVS) or cleavage (IRF3, NEMO) had been performed (Table 1, S1 Table). In some cases, e.g., IRF3, structures had previously been determined for PPI between IRF3 and its adaptor proteins STING, MAVS, and TRIF, which bind at the same location as rotavirus non-structural protein 1 (NSP1). The binding locations of these normal host-host PPIs were used to infer potential binding locations for SARS-CoV-2 proteins with the host target.

            Structural modeling had previously been conducted for TBK1 [22] and BST-2 [23] to propose interacting residues, through modeling strategies explained in those publications. The likely pocket residues for BST-2 were confirmed with the Molecular Operating Environment (MOE) software (CCG, Montreal, CA) Site Finder tool, using the unbound crystal structure PDB 3NWH.

            For targets for which the PPI had been structurally determined, e.g., by X-ray diffraction or cryo-EM, the interacting amino acids were identified using iCn3D (https://www.ncbi.nlm.nih.gov/Structure/icn3d/icn3d.html) [24], as well as the MOE software contacts tool (default settings), to view inter-chain interactions. Both iCn3D and MOE identify the contact type as “bond” (hydrogen, metal, ionic, arene, or covalent) or “distance” (e.g., Van der Waals forces). The energy (Gibbs free energy) of the interaction is given in kcal/mol, and the distance of contact is given in angstroms (Å) with a default cut-off for MOE of < 4.5 Å and iCn3D set to 5 Å. Interactions were compared for each structure, and across any additional structures for which more than one structure was available per target in the RCSB PDB. Additionally, any contacts specified by the researchers who published the structure were also compared. All amino acids forming bonds, as indicated by any method, were selected as critical bond contact amino acids. Distance contact amino acids were selected when agreement was found among most methods and/or the MOE contact data indicated that the host residue contacted two or more residues in the viral protein. Distance contacts without agreement among the majority of methods and with single contact residues were selected only if either the distance was < 3.8 Å or the energy was < −0.5 kcal/mol. Contacts not meeting these criteria were not included in the SeqAPASS Level 3 analyses.
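
            The contact-selection rules described above can be sketched in code. The Python sketch below is illustrative only and is not the authors' workflow, which relied on MOE and iCn3D output compared across structures. The function name, its parameters, and the reading of “most methods” as a strict majority of the methods compared are assumptions made for illustration.

```python
from typing import Optional


def classify_contact(
    bond_by_any_method: bool,
    methods_reporting: int,
    methods_total: int,
    viral_partners: int,
    distance_angstrom: Optional[float] = None,
    energy_kcal_mol: Optional[float] = None,
) -> Optional[str]:
    """Return "bond", "distance", or None (not carried into Level 3).

    All argument names are hypothetical; they stand in for values read
    from the structure-analysis output for one host residue.
    """
    # A bond reported by any method is enough for a critical bond contact.
    if bond_by_any_method:
        return "bond"
    # Assumption: "most methods" is read as a strict majority.
    majority_agrees = methods_reporting * 2 > methods_total
    # Majority agreement and/or two or more viral partner residues.
    if majority_agrees or viral_partners >= 2:
        return "distance"
    # Otherwise require a close (< 3.8 A) or favorable (< -0.5 kcal/mol) contact.
    close = distance_angstrom is not None and distance_angstrom < 3.8
    favorable = energy_kcal_mol is not None and energy_kcal_mol < -0.5
    if close or favorable:
        return "distance"
    return None
```

            Under these assumptions, a hypothetical contact reported by one of four methods, with a single viral partner residue at 4.2 Å and −0.1 kcal/mol, would be excluded (the function returns None).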

            SeqAPASS evaluations and clustered heatmaps

            SeqAPASS queries for each protein were run according to the SeqAPASS User Guide (Version 6.1; epa.gov/chemical-research/seqapass-user-guide), based on information available for each selected host protein. If no information was available on specific residues, only SeqAPASS Level 1 was performed, and Level 2 was performed if specific hits in the CDD corresponded to protein binding locations. For the proteins for which critical amino acids for binding were also determined, SeqAPASS Levels 1, 2, and 3 were performed. Briefly, a query protein sequence or NCBI accession number was used to begin a query. Herein, the human protein was used as the query species for all SeqAPASS jobs. The SeqAPASS algorithms use information from the NCBI protein database, conserved domains database, and taxonomy database, and align sequences with the Stand-Alone Basic Local Alignment Search Tool for proteins (BLASTp) and the Constraint-based Multiple Alignment Tool (COBALT). We did not use the susceptibility cut-off and ortholog candidate predictions; instead, our focus was on the percentage similarity to the query (human) sequence. SeqAPASS calculates the percent similarity by dividing the best hit BLAST bitscore for each hit species by the query species BLAST bitscore, and multiplying by 100 [19].
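
            The percent similarity calculation described above is a simple ratio of bitscores. The following Python sketch restates it; the function name and the example bitscores are hypothetical and are not values from this study or code from the SeqAPASS tool.

```python
def percent_similarity(hit_bitscore: float, query_bitscore: float) -> float:
    """Best-hit BLASTp bitscore of a species as a percentage of the query's own bitscore."""
    if query_bitscore <= 0:
        raise ValueError("query bitscore must be positive")
    return hit_bitscore / query_bitscore * 100.0


# Hypothetical bitscores, for illustration only.
human_self_hit = 1680.0
other_species_best_hit = 840.0
similarity = percent_similarity(other_species_best_hit, human_self_hit)
```

            Because the query species is scored against itself in the denominator, the human sequence always has a similarity of 100% under this definition.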

            Normally, SeqAPASS Level 3 evaluations yield a final “yes” or “no” susceptibility call based on the conservation of individual, critical amino acids. However, each PPI within the SARS-CoV-2 infection pathway is mediated by multiple amino acid residue contacts. Therefore, to account for each of these interacting amino acids, we converted the SeqAPASS Level 3 evaluation results into a similarity score with a custom R (v4.1.3) function; this external analysis is not part of the SeqAPASS tool. Within SeqAPASS Level 3, amino acid conservation is estimated according to three parameters: residue identity, side chain classification, and molecular weight, each of which is given either a “yes” or “no” result based on comparison with a reference protein sequence [19]. Of note, a match for molecular weight is an amino acid with a difference of ≤ 30 g/mol with respect to the template amino acid (as described in the SeqAPASS User Guide). The similarity scoring function summed all “yes” calls for each interacting amino acid across these three parameters; for amino acids involved in hydrogen, arene, or ionic bonding, this count was doubled to weight the stronger bonds. The summed value of the bond and distance residues was then normalized through conversion into a percentage of the total possible score (i.e., the score if all parameters were “yes” for each amino acid), thus yielding the “percentage similarity score.” This procedure was performed for each set of bond and distance interacting amino acids in each selected host protein in the study with crystal or cryo-electron microscopy (cryoEM) structures (Table 1), across all species. Empirical data capturing infection, replication, and transmission from the literature were compared with relevant protein sequence data.
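
            The scoring described above can be illustrated with a short sketch. The authors' function was written in R and is not reproduced in this text; the Python version below is an independent sketch of the described arithmetic, and the class name, field names, and example calls are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ResidueCall:
    """SeqAPASS Level 3 yes/no calls for one critical residue (hypothetical container)."""
    position: int
    contact: str        # "bond" or "distance"
    identity: bool      # residue identical to the human residue
    side_chain: bool    # same side chain classification
    mol_weight: bool    # molecular weight within the match threshold


def percentage_similarity_score(calls: Iterable[ResidueCall]) -> float:
    """Sum the "yes" calls, double them for bond residues, and normalize to a percentage."""
    score = 0
    max_score = 0
    for call in calls:
        if call.contact not in ("bond", "distance"):
            raise ValueError(f"unknown contact type: {call.contact}")
        weight = 2 if call.contact == "bond" else 1
        yes_calls = int(call.identity) + int(call.side_chain) + int(call.mol_weight)
        score += weight * yes_calls
        max_score += weight * 3  # score if all three parameters were "yes"
    if max_score == 0:
        raise ValueError("no residues supplied")
    return 100.0 * score / max_score


# Hypothetical calls for one species, for illustration only.
calls = [
    ResidueCall(353, "bond", True, True, True),        # fully conserved bond residue
    ResidueCall(38, "bond", False, True, True),        # conservative substitution
    ResidueCall(82, "distance", False, False, False),  # non-conservative substitution
]
score = percentage_similarity_score(calls)
```

            In this hypothetical case the weighted sum is 6 + 4 + 0 = 10 of a possible 6 + 6 + 3 = 15, which gives a score of about 66.7%.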

            Heatmap generation and hierarchical clustering were performed with percentage similarity data generated through SeqAPASS for Levels 1 and 2, and the above-described similarity scores for Level 3, after data curation. The R package “ComplexHeatmap” [25] was used, wherein dendrogram clusters were determined through Euclidean distance calculations. Sidebar annotations of viral RNA, seroconversion, and transmission were added according to results from previous studies.
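
            For readers unfamiliar with the distance measure underlying the dendrograms, the following Python sketch computes pairwise Euclidean distances between species similarity profiles. It is not the ComplexHeatmap code used in the study; the function names, species labels, and similarity values are hypothetical, and the linkage step that builds the dendrogram from these distances is not shown.

```python
import math
from typing import Dict, List, Tuple


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equal-length similarity profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_distances(profiles: Dict[str, List[float]]) -> Dict[Tuple[str, str], float]:
    """Distance for every unordered pair of species, keyed by sorted name pair."""
    names = sorted(profiles)
    return {
        (s, t): euclidean(profiles[s], profiles[t])
        for i, s in enumerate(names)
        for t in names[i + 1:]
    }


# Hypothetical percentage similarity values for two proteins per species.
profiles = {
    "species_A": [100.0, 100.0],
    "species_B": [97.0, 96.0],
    "species_C": [60.0, 70.0],
}
distances = pairwise_distances(profiles)
```

            Species with similar profiles across all proteins (here species_A and species_B) have small distances and would therefore be joined early in a hierarchical clustering.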

            RESULTS

            Critical contact amino acids in host target proteins

            The amino acids in the host proteins involved in PPI or in post-translational modifications were determined, because substitution at these positions in other species might cause disruption of the PPI, potentially resulting in differences in SARS-CoV-2 infection and transmissibility. Of the 24 proteins evaluated, 13 had, at the time of the study, been empirically elucidated with physicochemical methods including X-ray crystallography or cryoEM, and some viral-host protein complexes had more than one structure available in the RCSB PDB (S1 Table). All structures were of high quality, with resolutions below 3 Å, except for the horseshoe bat ACE2-S (PDB 7XA7), which had a resolution of 3.3 Å. Several lines of evidence of PPI were gathered through structural analyses in MOE software and the iCn3D web-based structure viewer (S2 and S3 Tables). Both the MOE and iCn3D structural analyses indicated the types of binding interactions as hydrogen, covalent, ionic, or arene. Most bonds were hydrogen bonds and some were ionic; arene (pi-stacking) bonds also occurred for residues with aromatic ring structures, such as phenylalanine. For this study, all of these were considered together as “bond” interactions. Interactions within a default cutoff of 4.5 Å (e.g., potential van der Waals forces) were designated as “distance” interactions by MOE, whereas a cutoff setting of 5 Å was used in iCn3D. The following lines of evidence were compared: interactions identified by 1) MOE and 2) iCn3D; 3) interactions determined by researchers who empirically solved the structures; and 4) interactions across different solved structures for the same complexed proteins. The results for these lines of evidence were similar but not in complete agreement (S2 and S3 Tables). A similar number of bond and distance interactions was identified for each of these structures (Table 1), with an average of 8.6 bond and 9.6 distance residues per protein. This finding was somewhat skewed by TOMM70, which had 13 bond and 33 distance residues. Of the 13 host proteins with solved structures, nine of the bound structures were host protein and SARS-CoV-2 protein complexes (40S ribosomal subunits uS3 and uS5-NSP1, ACE2-S, G3BP1-N, ISG15-NSP3, NEMO-NSP5, POLA1-NSP1, RAE1-Orf6, and TOMM70-Orf9b). In the four other structures, the host target protein was complexed with activating proteins (such as IRF3 with STING, MAVS, and TRIF; and MAVS with IRF3 and TRAF6), proteins from other viruses (IRF3 bound to rotavirus NSP1; STAT1 bound to Vaccinia [pox] virus protein 018), or an inhibitor bound to the active site (TMPRSS2 complex with nafamostat). These structures were considered to plausibly indicate the binding regions for the SARS-CoV-2 proteins. Additional post-translational modification sites were studied for ACE2 (N-glycosylation sites) [9], IRF3 (Plpro cleavage site) [26], and MDA5, STAT1, and STAT2 (phosphorylation sites). Predicted binding residues from the literature were investigated for five additional proteins without bound crystal or cryoEM structures: BST-2, GIGYF2, MDA5, STAT2, and TBK1 (Table 1).

            For ACE2, two human [27,28] and seven other mammalian ACE2-receptor-binding domain (RBD) structures (original Wuhan strain) have been solved. These additional mammals include the dog [29], cat [12], sea lion [30], pangolin [31], intermediate horseshoe bat [32], horse [33], and minke whale [30] (Table 2). We used MOE to investigate contacts in the mammalian structures, and identified 22 amino acids showing bond or distance interactions in most species (S4 Table). Highly conserved positions included S19, E35, Y83, K353, and D355; K31 and Q42 showed substitutions in only bats (asparagine) and minke whales (arginine), respectively. Most species had substitutions at Q24, D30, H34, D38, and M82. Bonding occurred in all species at K353, and at D38 regardless of D38E substitutions in all species but the minke whale. At all other positions, either bond or distance interactions were observed for the identical residues in different species. At least one amino acid in the RBD was paired in common in all species at each ACE2 binding position, regardless of the identity of the amino acid in the ACE2 position for that species, except at E35 and M82. The interactions most commonly shared across host species were those of G496, N501, G502, and Y505 in the S protein RBD with the conserved K353 in ACE2. In variable positions, most substitutions were a partial match; however, at H34, sea lions, pangolins, and horses have serine, and at M82, dogs, cats, horses, sea lions, and minke whales have threonine, both of which are non-conservative substitutions (Table 2).

            TABLE 2 |

            Structural basis for SARS-CoV-2 spike protein receptor-binding domain (RBD) amino acid interactions with specific angiotensin converting enzyme 2 (ACE2) receptor amino acids in several mammals.

            Species | Human | Human | Dog | Cat | Sea lion | Pangolin | Bat | Horse | Minke whale
            Study | [27] | [28] | [29] | [12] | [30] | [31] | [32] | [33] | [30]
            PDB | 6VW1 | 6M0J | 7E3J | 7C8D | 7WSH | 7DHX | 7XA7 | 7FC5 | 7WSE
            Kd mammal 123 85.70 170.69 34.97 448.3 11.04 175.98
            (nM) ± SENA19.1632.244.3170NA40.81
            Kd human 18.8 21.73 25.68 10.19 22.4 4.39 25.68
            (nM) ± SENA1.546.820.282.7NA6.82

            Spike RBD position Host Angiotensin Converting Enzyme 2 (ACE2) amino acid position

            Glu406Arg34

            Lys417Arg34
            Lys417 Asp30 Asp30 Glu30 Glu30 Glu30 Glu30 Asp30 Glu30 Gln30

            Gly446Gln42 Gln42 Gln42 Gln42 Gln42Gln42Gln42 Gln42 Arg42

            Tyr449 Asp38 Asp38 Glu38 Glu38Glu38Glu38Glu38Glu38Asp38
            Tyr449Gln42Gln42Gln42Gln42 Gln42 Gln42 Gln42 Gln42 Arg42

            Tyr453His34His34Tyr34His34ɸSer34Ser34Arg34Ser34His34

            Leu455Asp30Asp30Glu30Glu30Glu30Glu30Asp30Glu30Gln30
            Leu455Lys31Lys31Lys31Lys31Lys31Asn31Lys31Lys31
            Leu455His34His34ɸTyr34His34Ser34Ser34Arg34Ser34His34

            Phe456Asp30Asp30 Glu30 Glu30Glu30Glu30Asp30Glu30Gln30
            Phe456Lys31Lys31Lys31Lys31Lys31Lys31Asn31Lys31Lys31

            Ala475 Ser19 Ser19Ser19Ser19Ser19 Ser19 Ser19Ser19Ser19
            Ala475Gln24Gln24Leu24Leu24Leu24Glu24Arg24Leu24Gln24

            Gly476Ser19Ser19Ser19Ser19Ser19
            Gly476Gln24Gln24Leu24Leu24Leu24Glu24Arg24Leu24Gln24

            Ser477Ser19Ser19
            Ser477Gln24Gln24

            Glu484Lys31Lys31Lys31Lys31

            Phe486Met82Met82Thr82Thr82Asn82Asn82Thr82Thr82
            Phe486Tyr83Tyr83Tyr83Tyr83Tyr83Tyr83Tyr83

            Asn487 Gln24 Gln24 Leu24Leu24Leu24 Glu24 Arg24Leu24 Gln24
            Asn487 Tyr83 Tyr83 Tyr83Tyr83 Tyr83 Tyr83 Tyr83 Tyr83 Tyr83

            Tyr489Gln24Gln24Leu24Leu24Glu24Leu24
            Tyr489Lys31Lys31Lys31Lys31Lys31Lys31Asn31Lys31Lys31
            Tyr489Tyr83Tyr83Tyr83Tyr83Tyr83Tyr83Tyr83Tyr83Tyr83

            Gln493Lys31 Lys31 Lys31Lys31 Lys31 Asn31 Lys31 Lys31
            Gln493His34His34 His34 Ser34Ser34Arg34 Ser34 His34
            Gln493 Glu35 Glu35 Glu35Glu35Glu35 Glu35 Glu35

            Ser494Glu38His34

            Tyr495Glu38His34

            Gly496Asp38Asp38 Glu38 Glu38 Glu38 Glu38 Glu38 Glu38 Asp38
            Gly496 Lys353 Lys353 Lys353 Lys353 Lys353 Lys353 Lys353 Lys353 Lys353

            Gln498Asp38Asp38 Glu38 Glu38 Glu38 Glu38 Glu38Asp38
            Gln498 Gln42 Gln42Gln42Gln42Gln42Gln42 Gln42 Arg42
            Gln498Lys353 Lys353 Lys353 Lys353 Lys353 Lys353Lys353

            Thr500Asp355Asp355 Asp355 Asp355 Asp355Asp355Asp355 Asp355 Asp355

            Asn501Lys353Lys353 Lys353 Lys353Lys353 Lys353 Lys353Lys353 Lys353
            Asn501Asp355Asp355

            Gly502 Lys353 Lys353 Lys353 Lys353 Lys353 Lys353 Lys353 Lys353 Lys353
            Gly502Asp355Asp355Asp355Asp355Asp355Asp355Asp355Asp355

            Tyr505Lys353ɸLys353Lys353Lys353Lys353Lys353ɸLys353ɸLys353Lys353ɸ

            Interactions detected in the indicated structures from the Protein Data Bank (PDB) with the Molecular Operating Environment (MOE) software, contacts tool, limited to amino acids with hydrogen, ionic, or arene bonds (S1 File). Types of interactions are identified as follows: underline = hydrogen bond, bold = ionic bond, ɸ = arene/stacking bond, and all interacting pairs had distance bonds within 4.5 Angstroms (Å) (i.e., potential Van der Waals forces). Empty cells indicate no interaction detected. Blue = partial match to residue in human query sequence (conservative substitution). Yellow = not a match to human (non-conservative substitution), on the basis of SeqAPASS amino acid categorization. Green = indicated in literature as important contacts for binding (Damas et al., 2020; Luan et al., 2020). Binding affinity (Kd) between the ACE2 and SARS-CoV-2 spike protein receptor-binding domain (RBD) was determined by the authors of each referenced study for humans and the respective species of interest; nM = nanomoles/L, ± standard error (SE); NA = no SE reported.

            Empirically determined species infections, seroconversion, and transmission

            Since the start of the COVID-19 pandemic in late 2019, efforts have been made to identify potential intermediate or reservoir hosts. Bats are generally accepted to be a reservoir species [7]. Although speculation regarding the origin of the COVID-19 pandemic is ongoing, many mammals have been found to be susceptible and to act as intermediate hosts. Reports including confirmed natural infections as well as experimentally exposed animals with outcomes of either resistance, or infection with or without symptoms and transmission, have been compiled by Nielsen et al. (2023) and Nerpel et al. (2022) [4,16]. From this information and further investigation of the primary literature, we generated a list of species of interest (with available data on exposure outcomes) (Table 3). Species determined to transmit the virus in addition to primates include the prairie deer mouse (Peromyscus maniculatus bairdii), golden hamster, white-tailed deer, raccoon dog, red fox, domestic cat, domestic ferret, American mink, and Egyptian rousette bat. Species in which oral or nasal viral shedding was found but direct transmission was not investigated include the woodrat, skunk [34], and red fox [35]. Notable species not found to transmit the virus include the house mouse (Mus musculus), pig, domestic dog, cow, and raccoon. Some recent data, with possible caveats, are as follows. The house mouse resists viral entry of the original Wuhan strain [36]. Of 39 Norway rats sampled in the sewer system in Antwerp, Belgium, none had SARS-CoV-2 RNA or antibodies while the original virus was circulating [37]. More than 600 racehorses in California were tested through 2020 for viral presence in nasal secretions (qPCR) and for serum antibodies (ELISA); none tested positive by qPCR, whereas 5.9% tested positive for serum antibodies to SARS-CoV-2 [38]. Seroconversion was not detected for big brown bats, and was inconsistently observed among pigs and ferrets (Table 3).

            TABLE 3 |

            Empirical data indicating confirmed measures of SARS-CoV-2 infection and transmission for known tested species.

            | Family | Genus/species | Common name | Viral RNA | Seroconversion | Transmission or shedding | Sequences available |
            |---|---|---|---|---|---|---|
            | Class Mammalia, superorder Euarchontoglires | | | | | | |
            | Order Primates | | | | | | |
            | Hominidae | Homo sapiens | Human | Yes | Yes | Yes | Yes |
            | | Gorilla gorilla gorilla | Western lowland gorilla | Yes | Yes | Yes | Yes |
            | Cercopithecidae | Chlorocebus sabaeus | Green monkey | Yes | Yes | Yes | Yes |
            | | Papio anubis | Olive baboon | Yes | ni | ni | Yes |
            | | Macaca mulatta | Rhesus monkey | Yes | Y/N | ni | Yes |
            | Cebidae | Callithrix jacchus | White-tufted-ear marmoset | Yes | ni | ni | Yes |
            | Order Scandentia | | | | | | |
            | Tupaiidae | Tupaia chinensis | Chinese tree shrew | Yes | ni | ni | Y/N |
            | Order Lagomorpha | | | | | | |
            | Leporidae | Oryctolagus cuniculus | Rabbit | Yes | Yes | ni | Y/N |
            | | Sylvilagus floridanus | Eastern cottontail rabbit | No | No | No | No |
            | Order Rodentia | | | | | | |
            | Sciuridae | Neosciurus carolinensis | Gray squirrel | Yes | Yes | ni | Yes |
            | | Urocitellus elegans | Wyoming ground squirrel | No | No | No | No |
            | | Sciurus niger | Fox squirrel | No | No | No | No |
            | | Cynomys ludovicianus | Black-tailed prairie dog | No | No | No | No |
            | Cricetidae | Peromyscus maniculatus bairdii | Prairie deer mouse | Yes | Yes | Yes | Yes |
            | | Mesocricetus auratus | Golden hamster | Yes | Yes | Yes | Yes |
            | | Neotoma cinerea | Bushy-tailed wood rat | Yes | Yes | Yes | No |
            | | Myodes glareolus | Bank vole | Yes | Yes | No | Y/N |
            | Muridae | Rattus norvegicus | Norway rat | No | No | No | Yes |
            | | Mus musculus | House mouse | No | No | No | Yes |
            | Class Mammalia, superorder Laurasiatheria | | | | | | |
            | Order Artiodactyla | | | | | | |
            | Cervidae | Odocoileus virginianus texanus | White-tailed deer | Yes | Yes | Yes | Yes |
            | Bovidae | Bos taurus | Cow | Yes | Yes | No | Yes |
            | Equidae | Equus caballus | Horse§ | No | Yes | No | Y/N |
            | Suidae | Sus scrofa | Pig | Y/N | Y/N | No | Yes |
            | Order Carnivora | | | | | | |
            | Canidae | Nyctereutes procyonoides | Raccoon dog | Yes | Yes | Yes | Yes |
            | | Canis lupus familiaris | Dog | No | Yes | No | Yes |
            | | Vulpes vulpes | Red fox† | Yes | Yes | Yes | Yes |
            | | Canis latrans | Coyote | No | No | No | No |
            | Felidae | Lynx canadensis | Canada lynx | Yes | ni | ni | Yes |
            | | Felis catus | Domestic cat | Yes | Yes | Yes | Yes |
            | | Prionailurus viverrinus | Fishing cat | Yes | ni | ni | Y/N |
            | | Panthera tigris | Tiger | Yes | Yes | ni | Yes |
            | | Lynx rufus | Bobcat | Yes | ni | ni | Yes |
            | Mustelidae | Mustela putorius furo | Domestic ferret | Yes | Y/N | Yes | Yes |
            | | Neogale vison | American mink | Yes | Yes | Yes | Yes |
            | | Meles meles | Eurasian badger | No | Yes | ni | Yes |
            | | Aonyx cinereus | Asian small-clawed otter | Yes | ni | ni | No |
            | | Mustela lutreola | European mink | Yes | Yes | Yes | No |
            | Procyonidae | Procyon lotor | Raccoon | No | Yes | No | Y/N |
            | Viverridae | Arctictis binturong | Binturong (bearcat) | Yes | ni | ni | No |
            | Mephitidae | Mephitis mephitis | Striped skunk† | Yes | Yes | Yes | No |
            | Hyenidae | Crocuta crocuta | Spotted hyena | Yes | ni | ni | Y/N |
            | Order Pholidota | | | | | | |
            | Manidae | Manis javanica | Malayan pangolin | ni | Yes | ni | Yes |
            | Order Chiroptera | | | | | | |
            | Vespertilionidae | Eptesicus fuscus | Big brown bat | Yes | No | ni | Y/N |
            | Pteropodidae | Rousettus aegyptiacus | Egyptian rousette | Yes | Yes | Yes | Yes |
            | Other classes | | | | | | |
            | Class Aves | Columba livia | Rock pigeon | No | No | No | Y/N |
            | Class Crocodylia | Alligator sinensis | Chinese alligator | No | No | No | Yes |
            | Class Amphibia | Xenopus laevis | African clawed frog | No | No | No | Yes |
            | Class Actinopteri | Pimephales promelas | Fathead minnow | No | No | No | Yes |

            All data were compiled in EFSA/Nielsen et al. [4] except where noted. Susceptibility indicators: yes = detected or transmitted; no = not detected (or no evidence of transmission); ni = not investigated; Y/N = equivocal results (studies had different findings). Sequence availability: yes = sequences available for all investigated proteins; no = no sequences were available for that species; Y/N = some complete sequences were available, and some sequences were partial or rejected because of low quality.

            †Species in which oral/nasal viral shedding was found, but direct transmission was not investigated (woodrat, skunk [34] and red fox [35]).

            On the basis of Norway rats sampled in the sewer system in Antwerp, Belgium [37].

            The house mouse resists viral entry of the original Wuhan strain [36].

            §Racehorses tested in California through 2020 [38].

            SeqAPASS cross-species comparisons

            Primary amino acid sequences for the 24 investigated proteins were aligned in Level 1 of SeqAPASS, through comparison of all species with sequences in the NCBI database (S1 File). Twenty-six mammalian species with empirically determined SARS-CoV-2 exposure outcomes had primary amino acid sequences for the 24 proteins. Conserved domains were not found in the literature or the NCBI CDD for 10 of the 24 proteins. Conserved domains were obtained for the potential binding regions of 14 proteins, some of which contained more than one conserved domain, for a total of 22 (S2 File). Twenty-nine species had available sequences for these 14 proteins. Most of the conserved domains were specific hits in the CDD; however, some were non-specific but had high e-values and were specific to the regions encompassing interacting amino acids, as indicated in the literature and determined through molecular structural analyses (Table 1). Critical amino acid data used in SeqAPASS Level 3 were obtained for the species of interest for 18 of the 24 proteins, for which 26 species had sequences (S3 File). The SeqAPASS Level 1, 2, and 3 results (compiled in S2-S25 Figs) indicated that sequence similarity was markedly lower in the non-mammalian than in the mammalian vertebrates, except for the conserved 40S ribosomal subunits and the signal recognition protein SRP54; moreover, 4EHP, RAE1, and TBK1 were also well conserved. The SeqAPASS results were curated to replace partial with complete sequences, or predicted with empirically determined sequences, when available (S4 File). Clustered heat maps were generated from the curated data to compare the sequence similarity to the human query sequence for Level 1 (Fig 1A) and Level 2 (Fig 1B), and a similarity score (Methods Section 2.3.1) was determined for the 13 proteins with available structures for Level 3 (Fig 1C) among species with published COVID-19 susceptibility information. 
A cluster of 11 proteins in Level 1 was conserved across species, but showed visually apparent variations in percentage similarity between the rodent sequences and the human query sequences for STAT1, TOMM70, and G3BP1 (Fig 1A). The two least conserved proteins were MAVS and BST2. The Level 1 scores ranged from approximately 20% similarity to the human query sequence for BST2 in several species, including the Norway rat, to 100% similarity for all species’ 40S_S3 sequences. The species clustered according to taxonomic clades, as might be expected for full sequences (Fig 1A).
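How species end up grouped in a clustered heatmap can be sketched with a small agglomerative example. The linkage method and distance metric used for Fig 1 are not stated in this section, so average linkage on Euclidean distances is an assumption here, and the percent-similarity values below are hypothetical rather than SeqAPASS output.

```python
import math
from itertools import combinations

# Hypothetical percent-similarity profiles (three proteins per species), for illustration only.
profiles = {
    "human":       [100.0, 100.0, 100.0],
    "gorilla":     [100.0,  98.0,  95.0],
    "Norway rat":  [100.0,  80.0,  20.0],
    "house mouse": [100.0,  83.0,  24.0],
}

def average_linkage(profiles):
    """Agglomerative clustering with average linkage on Euclidean distances.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in profiles]
    merges = []

    def dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(math.dist(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the closest pair of clusters and record the step.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((c1, c2, round(dist(c1, c2), 2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

for left, right, d in average_linkage(profiles):
    print(left, right, d)
```

With these toy values the two rodents merge first and the two primates second, which mirrors the taxonomic grouping described above without implying anything about the real data.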

            FIGURE 1 |

            Clustered heatmaps for SeqAPASS results, based on sequence similarity (%) for Level 1 primary sequence (A), Level 2 conserved domain sequence (B), and Level 3 similarity score (C). Right sidebars indicate viral RNA detection, seroconversion, and transmission of SARS-CoV-2 infection according to literature data (see text and Table 3). Green (Y) indicates a positive test, red (N) indicates that the parameter was not detected, yellow (Y/N) indicates equivocal results, and blue (UNK) indicates unknown status (the parameter was not tested or investigated). The dendrogram at the left shows hierarchical clustering of species, whereas the dendrogram across the top shows the hierarchical relationships of the proteins, according to the level of sequence conservation.

            Conserved domains are regions of interest likely to contain binding sites in the target protein. The sequence lengths in the hit proteins ranged from 74 (ISG15 cd01102) to 316 amino acid residues (TBK1 cd13988) (Table 1, S2 File). The domains encompass the contact residues that had been determined from the respective crystal structures for the 40S ribosomal subunit uS3 pfam0189, G3BP1 cd00780, IRF3 pfam10401, ISG15 cd01792 and cd01810, MDA5 cd08818 and cd08819, POLA1 cd05776, STAT1 cd10372, and TMPRSS2 cd00190. In cases with no viral protein-bound structures, domains were selected on the basis of specific hits in the CDD within protein-protein binding domains. These proteins included KPNA2, MAVS, MDA5, NEMO (cd09803 for potential binding with Orf9b), RIG-I, STAT2, and STING. The TBK1 cd13988 encompassed the in silico predicted binding residues (Table 1). The most variable cluster among the 22 domains included ISG15 cd01792 and cd01810, MAVS cd08811, and MDA5 cd12090 (Fig 1B). The most conserved cluster mirrored that in Level 1 (KPNA2, STAT1, NEMO, 40S subunit uS3, TBK1, and G3BP1). As in Level 1, the most variable to most conserved proteins across species tended to be ordered from immune-specialized proteins to general cellular maintenance proteins. The conserved domain sequences also showed relatively high sequence similarity, with scores ranging from approximately 40% to 100% similarity to the human sequences. Nonetheless, taxonomic clades were strongly clustered, as observed for the hominids and other primates; the felid, canid, and mustelid families within the carnivores; the artiodactyls; and the rodents. Of note, the Malayan pangolin was included as a species of interest; however, its MDA5 and STING sequences are currently annotated in the NCBI database as low-quality proteins (S2 File).

            Furthermore, when the residues were narrowed to the specific amino acids involved in PPIs in the crystal structures, the Level 3 clustered heatmap had a score range of approximately 75–100% similarity, and also clustered the immune-specialized MAVS, ISG15, and IRF3 together with ACE2 (Fig 1C). Differences in these proteins were the main factor dividing the cricetid and murid rodents. However, taxonomic cohesiveness was maintained by each group to different degrees for each of these proteins. Of note, differences in amino acid identity in the contact residues in each protein were found between humans and certain species, but these differences did not coincide with the known measures of infection, seroconversion, and/or transmission for those species. That is, on the basis of the average residue contacts per protein at the positions interacting with the viral protein stressors, no single protein, including ACE2, accounted for a notable proportion of positive viral RNA or seroconversion tests.
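The idea of scoring contact residues as a match, partial match, or non-match to the human query can be sketched as follows. The side-chain grouping and the weights are simplifying assumptions made for this example; they are not the SeqAPASS categorization or the similarity score defined in Methods Section 2.3.1. The human ACE2 residues are those named in the text and Table 2, whereas the comparison species is a hypothetical combination of substitutions.

```python
# Simplified side-chain classes (an assumption for illustration; SeqAPASS applies its own rules).
CLASSES = {
    "nonpolar": "AVLIMGP",
    "aromatic": "FWY",
    "polar": "STNQC",
    "basic": "KRH",
    "acidic": "DE",
}
SIDE_CHAIN_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}

# Hypothetical weights: full credit for a match, half for a conservative substitution.
WEIGHTS = {"match": 1.0, "partial": 0.5, "no match": 0.0}

def classify(query_aa, hit_aa):
    """Label a hit residue relative to the query residue at the same position."""
    if query_aa == hit_aa:
        return "match"
    if SIDE_CHAIN_CLASS[query_aa] == SIDE_CHAIN_CLASS[hit_aa]:
        return "partial"
    return "no match"

def similarity_score(query_contacts, hit_contacts):
    """Percent score over the contact positions of the query."""
    calls = [classify(aa, hit_contacts[pos]) for pos, aa in sorted(query_contacts.items())]
    return 100.0 * sum(WEIGHTS[c] for c in calls) / len(calls)

# Human ACE2 contact residues named in the text; the second profile is hypothetical.
human = {24: "Q", 34: "H", 38: "D", 42: "Q", 353: "K"}
other = {24: "L", 34: "S", 38: "E", 42: "Q", 353: "K"}

print(similarity_score(human, other))
```

A per-protein score of this kind summarizes how many contact positions differ, which is exactly the quantity that the results above found not to track infection outcomes.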

            DISCUSSION

            The multi-level SeqAPASS alignments (Fig 1) aided in identifying key amino acid substitutions that might alter the PPIs in specific proteins of resistant species, as compared with sequences in humans and potential intermediate host species. The evaluation also identified data gaps for several proteins in the antiviral response pathway.

            ACE2, the receptor that binds the viral S protein RBD enabling SARS-CoV-2 entry into host cells, has been the sole focus of prior cross-species susceptibility studies [3,9–11]. Damas et al. [9] ranked sequences of more than 400 vertebrate species to predict their susceptibility to SARS-CoV-2 host cell entry, by scoring the number and type of differences in the 22 interacting amino acids in the human SARS-CoV and SARS-CoV-2 crystal structures. The researchers focused on the binding sites K31, E35, M82, and K353, and the glycosylation sites N53, N90, and N322, based on the effects of amino acid substitution on binding of SARS-CoV S protein [8]. Luan et al. [11] compared binding contacts across the sequences of 42 mammals, with a focus on the five residues K31, E35, D38, M82, and K353, and also conducted homology modeling of the interaction interfaces between cat, dog, pangolin, or Chinese hamster ACE2 and the SARS-CoV-2 RBD. Luan et al. [10] also modeled bovine and cricetid ACE2 structures and reported permissiveness to entry, whereas reptiles and birds showed changes in nearly half the interacting amino acid residues. At that time, few non-human species had been tested, and little empirical data on infections in other species existed. Susceptibility data are now available for more than 40 species [4] (Table 3).

            Additionally, at the time only human ACE2-SARS-CoV-2 RBD crystal structures had been solved. More recently, several crystal or cryo-EM structures have been solved for other mammals, including the dog, cat, sea lion, pangolin, intermediate horseshoe bat, horse, and minke whale (Table 2). These ACE2-RBD structures represent several species beyond the human and house mouse model structures typically available for other bound proteins, presenting a rare opportunity for cross-species comparison of contact amino acids. Analysis of the contacts showed that the position, and not the identity, of the amino acid in the ACE2 dictates the amino acid contact pairing in the RBD. For example, a glutamine occupies position 24 in humans, whereas leucine, glutamic acid, or arginine occupies position 24 in the other species, but each remains paired with A475, G476, and N487 in the S protein RBD. This finding was observed at most positions, regardless of whether the ACE2 amino acid in each species was a match, partial match, or not a match to that in the human ACE2, according to the SeqAPASS results (Table 2). Despite viral binding regardless of amino acid differences at the contact positions, most of these species (dog, cat, pangolin, and horse) have been found to allow viral entry (Table 3), whereas the sea lion and minke whale have not been tested. The greatest differences in ACE2-RBD contact residue pairs with respect to those in humans were observed in horseshoe bats and cats. The low binding affinity (high Kd) determined for the horseshoe bat provides an additional indication of the improbable binding of the bat ACE2 to the RBD. The horse had the lowest Kd, which was most similar to the human Kd [33], but has shown little propensity for disease and no transmission [38] (Table 2). Beyond binding information, N-glycosylation site comparisons showed that many species had substitutions at either N90 or N322, but only mice had substitutions at both positions (S5B Fig). 
Although the amenability of ACE2 to binding the SARS-CoV-2 S protein has been shown to correlate with infection, and the positions of the contact amino acids are important, the identity of those amino acids may be less important than previously thought, and more attention should be paid to post-translational modification sites. Additionally, cell entry assays conducted by Conceicao et al. [3], along with contact data from the solved mammalian structures (Table 2), suggest that, although cell entry is permitted, other mechanisms within cells may provide resistance to viral replication and transmission in species that do not transmit the virus.
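Whether a substitution removes an N-glycosylation site can be checked against the canonical sequon N-X-S/T, where X is any residue except proline. The sketch below applies that rule to short toy fragments; the fragments and the function name are hypothetical, and a real comparison at N90 or N322 would use full-length ACE2 sequences with positions taken from an alignment to the human protein.

```python
def has_sequon(seq, pos):
    """True if 1-based position `pos` starts an N-X-[S/T] sequon (X is not proline)."""
    i = pos - 1
    if i < 0 or i + 2 >= len(seq):
        return False
    return seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST"

# Hypothetical five-residue fragments with the candidate asparagine at position 2.
fragments = {
    "intact sequon": "ANLTV",
    "asparagine substituted": "ATLTV",
    "threonine substituted": "ANLAV",
    "proline at X": "ANPTV",
}

for label, fragment in fragments.items():
    print(label, has_sequon(fragment, 2))
```

The check shows why a site can be lost even when the asparagine itself is conserved: a substitution at the third position of the sequon, or a proline at the second, is sufficient.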

            Several proteins investigated in this study may be important targets for human therapeutics but were identified as highly conserved, and therefore were considered unlikely to be drivers in determining species susceptibility to SARS-CoV-2 infection or transmission. These proteins include EIF4E2 (4EHP); the 40S ribosome subunits uS3 and uS5; G3BP1; GIGYF2; KPNA2; RAE1; SRP19; SRP54; STAT1; STAT2; TBK1; and TOMM70 (S2-S4, S7-S9, S16, S18-S21, S23, and S25 Figs). Five proteins lacked solved structures or data sufficient to determine critical amino acids: MAVS, POLA1, RIG-I, STING, and TMPRSS2 (S10, S15, S17, S22, and S24 Figs; S1 Table).

            Tetherin (BST-2) was investigated because it is a protein found only in mammals that functions by tethering nascent mammalian enveloped virions to each other and to the membranes of infected cells, thus preventing their spread to other cells. BST-2 is IFN-induced (i.e., an ISG) [39]. Our analysis of the unbound structure (PDB 3NWH) with the MOE Site Finder tool indicated that residues in the pockets with the two highest propensity-for-ligand-binding scores matched those identified in an in silico analysis by Bisht et al. [23] (S2 Table, Table 1). Some evidence indicates that BST-2 might be a stronger barrier in restricting viral replication in bats than in humans [40]. However, with respect to the specific residues studied herein, no substitutions were conserved across bats that did not have similar substitutions in other mammals.

            Information found for the remaining proteins (MDA5, NEMO, IRF3, and ISG15) revealed potential domains or specific amino acids for which substitutions may result in disruption of the PPI observed in humans, as discussed below.

            Within the MDA5 N-terminus are NCBI domains cd08818 and cd08819, representing the first and second repeats of two caspase activation and recruitment domains (CARD). The cd08818 domain contains sites that bind ISG15 (ISGylation; K23, K43, and K68), CARD oligomerization sites (G74 and W75), and serine phosphorylation sites (S88 and S104). According to the SeqAPASS Level 3 results, the ISGylation sites (crucial to antiviral responses [41,42]) and S88 had partial-match substitutions in various species. A combination of a substitution at one of the ISGylation lysines and the substitution of S88 by asparagine (S88N) was found in the Eastern cottontail rabbit, big and little brown bats, cow, house mouse, and Norway rat. The gray squirrel, Chinese tree shrew, pig, and Jamaican fruit bat have only the S88N substitution (S11C Fig). Because serine can be phosphorylated and asparagine cannot, the S88N substitution might affect interactions with SARS-CoV-2 PLpro. Additionally, ISGylation requires lysine, and substitutions were found to result in attenuated signaling in humans [42]. Furthermore, the mammals with a combination of lysine and S88N substitutions have generally shown COVID-19 resistance [4] (Table 3); therefore, differences in ISGylation and phosphorylation of MDA5 may lead to interactions that differ from those in humans. Evidence of this possibility has been described in the literature [43,44]; thus, the low conservation of MDA5 and ISG15 (Fig 1B) and the complexity of the ISGylation pathway contribute to the potential for variation in antiviral responses across species.

            The interaction of the N-terminus of ORF9b with NEMO after viral infection interrupts its K63-linked polyubiquitination, thereby inhibiting IFNβ1 expression in HEK293T cells in an ORF9b-dose-dependent manner; however, binding residues were not determined [45]. For Level 2, domain cd09803 contains the polyubiquitin binding site that would be blocked by SARS-CoV-2 ORF9b [45]. This sequence was more than 90% conserved in mammalian orders Chiroptera, Rodentia, Artiodactyla, Carnivora, and Lagomorpha (S14B Fig). Hameedi et al. [46] solved the X-ray crystal structure of 3CLpro bound to NEMO and characterized 3CLpro cleavage of NEMO. An alanine substitution in the house mouse at the predicted binding hotspot V232 was considered to increase the propensity for helix formation, thus potentially destabilizing house mouse NEMO, relative to the stronger interaction between human NEMO and the 3CLpro catalytic site [46]. This substitution exists in the resistant big and little brown bats, and the Norway rat, but is also found in the prairie deer mouse and golden hamster, both of which have tested positive for SARS-CoV-2 infection (S14C Fig; Table 3). Further investigation of the 3CLpro binding propensities to NEMO in other species is warranted.

            Interferon regulatory factor 3 (IRF3) is the crucial factor in the pathway that, when activated and transported to the nucleus, binds DNA and induces expression of IFN-I. Crystal structures show interaction sites with the adaptor proteins STING (5JEJ), MAVS (5JEK), TRIF (5JEL), and a rotavirus protein (5JEO) that interferes with IRF3 activation by binding at the host adaptor protein sites (S2 Table). The adaptor protein binding site on IRF3 surrounds the SARS-CoV-2 PLpro cleavage location; cleavage would thwart activation by disrupting the adaptor protein binding site. NSP3 (PLpro) cleaves IRF3 at residues 267–273 (CLGGGLA), thereby decreasing the IRF3 available for induction of IFN-I expression [26]. Although variability is found across species throughout the protein (Fig 1; S12A Fig), as well as specifically in the adaptor binding site residues (S12C Fig) and the nearby conserved domain pfam10401 (S12B Fig), the most consequential residue according to Mostaquil et al. [26] is the cleavage site G270. At this position, the domestic ferret (R) and rodents (K, N) have substitutions with respect to the human protein. Strikingly, the queried bats showed conservation of the glycine as well as other residues in the cleavage motif except C267, a residue for which many species including primates have substitutions (S12C Fig). Further cellular experimentation is needed to determine the effects of IRF3 cleavage on SARS-CoV-2 infection in other species.
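Substitutions within the cleavage motif can be reported with human numbering by comparing aligned motif strings position by position. The sketch below uses the human motif and numbering given in the text; the two variant strings are hypothetical, each constructed to carry a single substitution, and do not represent the full motif of any named species.

```python
HUMAN_MOTIF = "CLGGGLA"  # IRF3 residues 267-273, as given in the text
MOTIF_START = 267

def motif_substitutions(reference, other, start=MOTIF_START):
    """List substitutions such as 'G270R' for two aligned, equal-length motifs."""
    if len(reference) != len(other):
        raise ValueError("motifs must be aligned to the same length")
    return [
        f"{ref}{start + i}{alt}"
        for i, (ref, alt) in enumerate(zip(reference, other))
        if ref != alt
    ]

# Hypothetical aligned variants, for illustration only.
print(motif_substitutions(HUMAN_MOTIF, "CLGRGLA"))
print(motif_substitutions(HUMAN_MOTIF, "SLGGGLA"))
```

The comparison assumes the motifs are already aligned without gaps; sequences with insertions or deletions near the site would first need a proper alignment to the human protein.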

            The function of ISG15 in antiviral immunity is direct inhibition of viral replication [47]. SARS-CoV-2 NSP3 (PLpro) preferentially cleaves ISG15 [48]. A house mouse crystal structure (PDB 6YVA) has been solved with PLpro bound to ISG15 [48]. In this study, we used an unbound human X-ray crystal structure (PDB 1Z2M) [49] as a template to which NSP3 was virtually docked using MOE and compared with the house mouse structure. Klemm et al. [50] generated a crystal structure of the C-terminal end of ISG15 bound to NSP3 (6XA9), and indicated that W123, P130, and E132 are important interacting residues. These residues were also determined by MOE to be interacting residues, but hydrogen bonding was not indicated at E132 (S2 Table). Variability in sequence and bonding was observed between the house mouse and human structures. Ionic and hydrogen bonds at K35 and E87 in the house mouse structure were not found in the human structures, but these bonds were present in both species at R153, and the hydrogen bonds matched between the two species in the conserved C-terminal motif, L154, R155, and G156. The conserved domain ISG15 cd01810 included these residues. Given the low ISG15 sequence conservation (Fig 1; S13 Fig), the differences between the house mouse and human structures’ bonding interactions, and the involvement of ISG15 in the ISGylation regulating several proteins in the IFN-I pathway (including MDA5, IRF3, STAT1, and RIG-I) [44], ISG15 appears to be critically important in determining cross-species variability in antiviral response.

            As observed in the clustered heatmaps (Fig 1), conserved domains and specific interacting amino acids clustered according to taxonomic groups rather than empirical measures of infection. Importantly, small numbers of individuals have been tested for infection or seroconversion within a species in some studies, and some results (pigs, ferrets, and rhesus monkeys) were inconsistent in interpretation among studies [4] (Table 3). Caveats in the empirical infection data include false positive or negative results, differences in testing strategies, and that some animals testing positive (“yes”) for viral RNA or seroconversion were described in reports in which other animals did not test positive; for example, in one study testing ten Malayan pangolins, only one tested positive [4]. Since the COVID-19 pandemic began, humans have shown variations in susceptibility, particularly with age and underlying conditions. Genetic factors such as gene regulation and splicing (e.g., in genes encoding antiviral 2′,5′-oligoadenylate synthetase [OAS] enzymes) have been found to affect human susceptibility [51]. Sequence variation in major histocompatibility complex (MHC) proteins has also been implicated in symptomatic versus asymptomatic cases [52]. Similar factors certainly exist in other mammalian species. In several studies in which animals were exposed to SARS-CoV-2, either naturally or experimentally, some individuals within a species were found to be infected, whereas others were not; moreover, some species appeared to be resistant, with none of the exposed individuals becoming infected [34]. Among the 43 species identified in Table 3 with data in the literature, 28 species were investigated for ability to transmit the virus, of which 13 showed no evidence of transmission or viral shedding. 
As discussed for the individual proteins, some evidence suggests that these differences might be explained by certain amino acid substitutions across species, in particular IFN-I pathway proteins investigated in this study.

            Whereas other studies have attempted predictive ranking of species according to the numbers of critical amino acid substitutions, particularly in ACE2, the results of this study indicated that the numbers of substitutions did not directly correlate with species infection status. The compelling and novel finding of this study is the identification of specific amino acid substitutions associated with key structural or physicochemical features in certain proteins that may alter the interaction with the viral protein. Such substitutions potentially underlie the mechanism of resistance in the species that have tested negative for SARS-CoV-2 or have been found not to transmit the virus; however, the mechanisms might differ among resistant species. Substitutions of the ISGylation residues K23, K43, or K68, and of the serine phosphorylation site (S88N), in MDA5 occur in several non-transmitting species (Eastern cottontail rabbit, big and little brown bats, cow, pig, house mouse, and Norway rat), including cattle and pigs, but not in dogs or horses. The NEMO V232A substitution at the predicted binding hotspot, described in the house mouse [46], is found only in rodents, tree shrews, and big brown bats. The IRF3 cleavage site G270 substitution is found in rodents but not bats [26]. These and other IFN-I pathway proteins show potential for binding location differences; however, more experimental data are needed, at both the molecular and cellular levels, to define the infection and transmission capabilities of species of epidemiological concern with greater certainty. Additionally, several proteins had no PPI crystal or modeled structures, a general research gap hindering studies of protein or ligand interactions across non-human vertebrate species. 
Furthermore, missing and low-quality sequences identified through the sequence-based prediction approaches presented herein are limitations of the study, and highlight the need for more reliable sequences and genome coverage for protein interaction predictions across species.

            CONCLUSIONS

            Our findings challenge previous assumptions regarding amino acid substitutions and pathogen-host protein interactions. This work demonstrated that susceptibility predictions cannot be based solely on the numbers and types of substitutions at the binding interfaces, in comparison to those in the most susceptible species. However, specific substitutions in NEMO, IRF3, and MDA5, in addition to ACE2, were found to potentially affect PPI binding affinity or protein modifications, such as ISGylation and phosphorylation, that do appear to correspond to species susceptibility. Our results also highlight the importance of evaluating both specific interaction-disrupting substitutions and overall structural similarity across species [53–56]. These findings will require further investigation as promising leads in the identification of intermediate hosts, thus providing important information for epidemiological monitoring of emerging zoonotic diseases.

            ACKNOWLEDGEMENTS

            We thank Dr. Daniel Chang, US EPA, and Dr. Jon Doering, Louisiana State University, for providing comments on an early draft of the manuscript. This work was funded wholly by the US Environmental Protection Agency.

            CONFLICTS OF INTEREST

            All authors declare that they have no competing interests.

            SUPPLEMENTARY MATERIAL

            REFERENCES

            1. Hayn M, Hirschenberger M, Koepke L, Nchioua R, Straub JH, Klute S, et al. Systematic functional analysis of SARS-CoV-2 proteins uncovers viral innate immune antagonists and remaining vulnerabilities. Cell Rep. 2021. Vol. 35(7):109126

            2. Liu Q, Chi S, Dmytruk K, Dmytruk O, Tan S. Coronaviral infection and interferon response: the virus-host arms race and COVID-19. Viruses. 2022. Vol. 14(7):1349

            3. Conceicao C, Thakur N, Human S, Kelly JT, Logan L, Bialy D, et al. The SARS-CoV-2 spike protein has a broad tropism for mammalian ACE2 proteins. Sugden B, editor. PLOS Biol. 2020. Vol. 18(12):e3001016

            4. EFSA Panel on Animal Health and Welfare (AHAW). Nielsen SS, Alvarez J, Bicout DJ, Calistri P, Canali E, Drewe JA, et al. SARS-CoV-2 in animals: susceptibility of animal species, risk for animal and public health, monitoring, prevention and control. EFSA J. 2023. Vol. 21(2):1–108

            5. Goraichuk IV, Arefiev V, Stegniy BT, Gerilovych AP. Zoonotic and reverse zoonotic transmissibility of SARS-CoV-2. Virus Res. 2021. Vol. 302:198473

            6. Hoagland DA, Møller R, Uhl SA, Oishi K, Frere J, Golynker I, et al. Leveraging the antiviral type I interferon system as a first line of defense against SARS-CoV-2 pathogenicity. Immunity. 2021. Vol. 54(3):557–570.e5

            7. Hoffmann M, Kleine-Weber H, Schroeder S, Krüger N, Herrler T, Erichsen S, et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell. 2020. Vol. 181(2):271–280.e8

            8. Li W, Zhang C, Sui J, Kuhn JH, Moore MJ, Luo S, et al. Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2. EMBO J. 2005. Vol. 24(8):1634–1643

            9. Damas J, Hughes GM, Keough KC, Painter CA, Persky NS, Corbo M, et al. Broad host range of SARS-CoV-2 predicted by comparative and structural analysis of ACE2 in vertebrates. Proc Natl Acad Sci USA. 2020. Vol. 117(36):22311–22322

            10. Luan J, Jin X, Lu Y, Zhang L. SARS-CoV-2 spike protein favors ACE2 from Bovidae and Cricetidae. J Med Virol. 2020. Vol. 92(9):1649–1656

            11. Luan J, Lu Y, Jin X, Zhang L. Spike protein recognition of mammalian ACE2 predicts the host range and an optimized ACE2 for SARS-CoV-2 infection. Biochem Biophys Res Commun. 2020. Vol. 526(1):165–169

            12. Wu L, Chen Q, Liu K, Wang J, Han P, Zhang Y, et al. Broad host range of SARS-CoV-2 and the molecular basis for SARS-CoV-2 binding to cat ACE2. Cell Discov. 2020. Vol. 6(1):68

            13. Yan H, Jiao H, Liu Q, Zhang Z, Xiong Q, Wang BJ, et al. ACE2 receptor usage reveals variation in susceptibility to SARS-CoV and SARS-CoV-2 infection among bat species. Nat Ecol Evol. 2021. Vol. 5(5):600–608

            14. Pan T, Chen R, He X, Yuan Y, Deng X, Li R, et al. Infection of wild-type mice by SARS-CoV-2 B.1.351 variant indicates a possible novel cross-species transmission route. Signal Transduct Target Ther. 2021. Vol. 6(1):420

            15. Shuai H, Chan JFW, Yuen TTT, Yoon C, Hu JC, Wen L, et al. Emerging SARS-CoV-2 variants expand species tropism to murines. EBioMedicine. 2021. Vol. 73:103643

            16. Nerpel A, Yang L, Sorger J, Käsbohrer A, Walzer C, Desvars-Larrive A. SARS-ANI: a global open access dataset of reported SARS-CoV-2 events in animals. Sci Data. 2022. Vol. 9(1):438

            17. Gordon DE, Jang GM, Bouhaddou M, Xu J, Obernier K, White KM, et al. A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature. 2020. Vol. 583(7816):459–468

            18. Mayasich SA, Goldsmith MR, Mattingly KZ, LaLone CA. Combining in vitro and in silico New Approach Methods to investigate type 3 iodothyronine deiodinase chemical inhibition across species. Environ Toxicol Chem. 2023. Vol. 42(5):1032–1048

            19. LaLone CA, Villeneuve DL, Lyons D, Helgen HW, Robinson SL, Swintek JA, et al.. Editor’s highlight: sequence alignment to predict across species susceptibility (SeqAPASS): a web-based tool for addressing the challenges of cross-species extrapolation of chemical toxicity. Toxicol Sci. 2016. Vol. 153(2):228–245

            20. Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, et al.. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017. Vol. 45(D1):D200–D203

            21. Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, et al.. CDD: NCBI’s conserved domain database. Nucleic Acids Res. 2015. Vol. 43(D1):D222–D226

            22. Sundar S, Thangamani L, Piramanayagam S, Rahul CN, Aiswarya N, Sekar K, et al.. Screening of FDA-approved compound library identifies potential small-molecule inhibitors of SARS-CoV-2 non-structural proteins NSP1, NSP4, NSP6 and NSP13: molecular modeling and molecular dynamics studies. J Proteins Proteomics. 2021. Vol. 12(3):161–175

            23. Bisht K, Pant K, Kumar N, Pande A, Pant B, Verma D. In-silico analysis of interaction of human tetherin protein with SARS-COV-2 ORF7A proteins and its mutants. Int J Curr Res Rev. 2020. Vol. 12(17):193–199

            24. Wang J, Youkharibache P, Zhang D, Lanczycki CJ, Geer RC, Madej T, et al.. iCn3D, a web-based 3D viewer for sharing 1D/2D/3D representations of biomolecular structures. Valencia A, editor. Bioinformatics. 2020. Vol. 36(1):131–135

            25. Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016. Vol. 32(18):2847–2849

            26. Moustaqil M, Ollivier E, Chiu HP, Van Tol S, Rudolffi-Soto P, Stevens C, et al.. SARS-CoV-2 proteases PLpro and 3CLpro cleave IRF3 and critical modulators of inflammatory pathways (NLRP12 and TAB1): implications for disease presentation across species. Emerg Microbes Infect. 2021. Vol. 10(1):178–195

            27. Shang J, Ye G, Shi K, Wan Y, Luo C, Aihara H, et al.. Structural basis of receptor recognition by SARS-CoV-2. Nature. 2020. Vol. 581(7807):221–224

            28. Lan J, Ge J, Yu J, Shan S, Zhou H, Fan S, et al.. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 2020. Vol. 581(7807):215–220

            29. Zhang Z, Zhang Y, Liu K, Li Y, Lu Q, Wang Q, et al.. The molecular basis for SARS-CoV-2 binding to dog ACE2. Nat Commun. 2021. Vol. 12(1):4195

            30. Li S, Yang R, Zhang D, Han P, Xu Z, Chen Q, et al.. Cross-species recognition and molecular basis of SARS-CoV-2 and SARS-CoV binding to ACE2s of marine animals. Natl Sci Rev. 2022. Vol. 9(9):nwac122

            31. Wu L, Su J, Niu S, Chen Q, Zhang Y, Yan J, et al.. Molecular basis of pangolin ACE2 engaged by COVID-19 virus. Chin Sci Bull. 2021. Vol. 66(1):73–84

            32. Tang L, Zhang D, Han P, Kang X, Zheng A, Xu Z, et al.. Structural basis of SARS-CoV-2 and its variants binding to intermediate horseshoe bat ACE2. Int J Biol Sci. 2022. Vol. 18(12):4658–4668

            33. Lan J, Chen P, Liu W, Ren W, Zhang L, Ding Q, et al.. Structural insights into the binding of SARS-CoV-2, SARS-CoV, and hCoV-NL63 spike receptor-binding domain to horse ACE2. Structure. 2022. Vol. 30(10):1432–1442.e4

            34. Bosco-Lauth AM, Root JJ, Porter SM, Walker AE, Guilbert L, Hawvermale D, et al.. Peridomestic mammal susceptibility to severe acute respiratory syndrome coronavirus 2 infection. Emerg Infect Dis. 2021. Vol. 27(8):2073–2080

            35. Porter SM, Hartwig AE, Bielefeldt-Ohmann H, Bosco-Lauth AM, Root JJ. Susceptibility of wild canids to SARS-CoV-2. Emerg Infect Dis. 2022. Vol. 28(9):1852–1855

            36. Zhou P, Yang XL, Wang XG, Hu B, Zhang L, Zhang W, et al.. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020. Vol. 579(7798):270–273

            37. Colombo VC, Sluydts V, Mariën J, Vanden Broecke B, Van Houtte N, Leirs W, et al.. SARS-CoV-2 surveillance in Norway rats (Rattus norvegicus) from Antwerp sewer system, Belgium. Transbound Emerg Dis. 2022. Vol. 69(5):3016–3021

            38. Lawton KOY, Arthur RM, Moeller BC, Barnum S, Pusterla N. Investigation of the role of healthy and sick equids in the COVID-19 Pandemic through serological and molecular testing. Animals. 2022. Vol. 12(5):614

            39. Schubert HL, Zhai Q, Sandrin V, Eckert DM, Garcia-Maya M, Saul L, et al.. Structural and functional studies on the extracellular domain of BST2/tetherin in reduced and oxidized conformations. Proc Natl Acad Sci U S A. 2010. Vol. 107(42):17951–17956

            40. Hayward JA, Tachedjian M, Johnson A, Irving AT, Gordon TB, Cui J, et al.. Unique evolution of antiviral tetherin in bats. Lowen AC, editor. J Virol. 2022. Vol. 96(20):e01152-22

            41. Liu G, Lee JH, Parker ZM, Acharya D, Chiang JJ, Van Gent M, et al.. ISG15-dependent activation of the sensor MDA5 is antagonized by the SARS-CoV-2 papain-like protease to evade host innate immunity. Nat Microbiol. 2021. Vol. 6(4):467–478

            42. Vere G, Alam MR, Farrar S, Kealy R, Kessler BM, O’Brien DP, et al.. Targeting the ubiquitylation and ISGylation machinery for the treatment of COVID-19. Biomolecules. 2022. Vol. 12(2):300

            43. Speer SD, Li Z, Buta S, Payelle-Brogard B, Qian L, Vigant F, et al.. ISG15 deficiency and increased viral resistance in humans but not mice. Nat Commun. 2016. Vol. 7(1):11496

            44. Kang JA, Kim YJ, Jeon YJ. The diverse repertoire of ISG15: more intricate than initially thought. Exp Mol Med. 2022. Vol. 54(11):1779–1792

            45. Wu J, Shi Y, Pan X, Wu S, Hou R, Zhang Y, et al.. SARS-CoV-2 ORF9b inhibits RIG-I-MAVS antiviral signaling by interrupting K63-linked ubiquitination of NEMO. Cell Rep. 2021. Vol. 34(7):108761

            46. Hameedi MA, Prates ET, Garvin MR, Mathews II, Amos BK, Demerdash O, et al.. Structural and functional characterization of NEMO cleavage by SARS-CoV-2 3CLpro. Nat Commun. 2022. Vol. 13(1):5285

            47. Perng YC, Lenschow DJ. ISG15 in antiviral immunity and beyond. Nat Rev Microbiol. 2018. Vol. 16(7):423–439

            48. Shin D, Mukherjee R, Grewe D, Bojkova D, Baek K, Bhattacharya A, et al.. Papain-like protease regulates SARS-CoV-2 viral spread and innate immunity. Nature. 2020. Vol. 587(7835):657–662

            49. Narasimhan J, Wang M, Fu Z, Klein JM, Haas AL, Kim JJP. Crystal structure of the interferon-induced ubiquitin-like protein ISG15. J Biol Chem. 2005. Vol. 280(29):27356–27365

            50. Klemm T, Ebert G, Calleja DJ, Allison CC, Richardson LW, Bernardini JP, et al.. Mechanism and inhibition of the papain-like protease, PLpro, of SARS-CoV-2. EMBO J. 2020. Vol. 39(18):e106275

            51. Huffman JE, Butler-Laporte G, Khan A, Pairo-Castineira E, Drivas TG, Peloso GM, et al.. Multi-ancestry fine mapping implicates OAS1 splicing in risk of severe COVID-19. Nat Genet. 2022. Vol. 54(2):125–127

            52. Castelli EC, De Castro MV, Naslavsky MS, Scliar MO, Silva NSB, Andrade HS, et al.. MHC variants associated with symptomatic versus asymptomatic SARS-CoV-2 infection in highly exposed individuals. Front Immunol. 2021. Vol. 12:742881

            53. Fischhoff IR, Castellanos AA, Rodrigues JPGLM, Varsani A, Han BA. Predicting the zoonotic capacity of mammals to transmit SARS-CoV-2. Proc R Soc B Biol Sci. 2021. Vol. 288(1963):20211651

            54. Kaushik R, Kumar N, Zhang KYJ, Srivastava P, Bhatia S, Malik YS. A novel structure-based approach for identification of vertebrate susceptibility to SARS-CoV-2: implications for future surveillance programmes. Environ Res. 2022. Vol. 212:113303

            55. LaLone CA, Blatz DJ, Jensen MA, Vliet SMF, Mayasich S, Mattingly KZ, et al.. From protein sequence to structure: the next frontier in cross-species extrapolation for chemical safety evaluations. Environ Toxicol Chem. 2023. Vol. 42(2):463–474

            56. Wang G, Liu X, Wang K, Gao Y, Li G, Baptista-Hon DT, et al.. Deep-learning-enabled protein-protein interaction analysis for prediction of SARS-CoV-2 infectivity and variant evolution. Nat Med. 2023. Vol. 29(8):2007–2018

            Author and article information

            Journal
            Zoonoses
            Compuscript (Shannon, Ireland)
            ISSN: 2737-7466, 2737-7474
            27 November 2024
            Vol. 4(1):e965
            Affiliations
            [1] Aquatic Sciences Center, University of Wisconsin-Madison, Madison, USA
            [2] US Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Great Lakes Toxicology and Ecology Division, Duluth, MN, USA
            [3] Oak Ridge Institute for Science and Education, Duluth, MN, USA
            Author notes
            *Corresponding author: E-mail: LaLone.Carlie@epa.gov (CAL)

            Present addresses: (a) Sally A. Mayasich, Battelle Memorial Institute, Columbus, OH, mayasich@battelle.org;

            (b) P.G.S.: US EPA, Duluth, MN, Schumann.Peter@epa.gov;

            (c) M.B.: Biology Department, University of Minnesota-Duluth, Duluth, MN, botz0027@d.umn.edu

            Author information
            https://orcid.org/0000-0002-7621-481X
            https://orcid.org/0000-0002-4475-1928
            https://orcid.org/0000-0003-3174-1314
            Article
            DOI: 10.15212/ZOONOSES-2024-0028
            © 2024 The Authors.

            This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY) 4.0, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

            History
            25 June 2024
            09 September 2024
            19 September 2024
            Page count
            Figures: 1, Tables: 3, References: 56, Pages: 17
            Categories
            Original Article

            Parasitology, Animal science & Zoology, Molecular biology, Public health, Microbiology & Virology, Infectious disease & Microbiology
            protein-protein interaction, COVID-19, SARS-CoV-2, cross-species susceptibility, interferon
