INTRODUCTION
In early 2020, the causative agent of Coronavirus Disease 2019 (COVID-19) was identified as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) [1, 2]. Although the precise origin of the virus is unknown, it shows high similarity with several bat coronaviruses. After the World Health Organization issued global warnings, SARS-CoV-2 crossed international borders and led to a pandemic [3]. As of December 2022, SARS-CoV-2 had caused approximately 650 million infections and 6.6 million deaths globally [4]. Compared with other beta-coronaviruses (e.g., Middle East Respiratory Syndrome Coronavirus [MERS-CoV] and SARS-CoV), SARS-CoV-2 has a higher replication rate and attack rate in communities and induces high fatality rates in hospitalized patients [5]. Nevertheless, the infection and recovery patterns scarcely depend on demographic factors [6], and the underlying mechanisms are not fully understood to date. The clinical manifestations of COVID-19 are similar to those of influenza and include fever, frontal headache, retro-orbital or temporal headache, gastrointestinal symptoms, ground-glass opacity, pleural effusion [7], loss of taste or smell, and neurological symptoms. Laboratory findings, including lymphopenia, thrombocytopenia, and abnormal values of hypersensitive C-reactive protein, erythrocyte sedimentation rate, creatine kinase, and transaminases (aspartate aminotransferase and alanine aminotransferase), are common in patients with COVID-19 and thus may serve as diagnostic biomarkers [8]. Similarly, lung radiographic findings indicate ground-glass opacity, predominantly peripheral distribution, and interlobular septal thickening in patients with COVID-19. Patients with severe COVID-19 and respiratory distress are transferred to intensive care for mechanical ventilation. Of note, SARS-CoV-2 infection severity is associated with comorbidities, such as cardiovascular diseases, diabetes, and obesity, which contribute to high mortality [9, 10].
COVID-19 PATHOPHYSIOLOGY
COVID-19 pathophysiology has been extensively discussed elsewhere [11–13]. Detailed research on host-virus interactions has revealed that SARS-CoV-2 enters hosts through different routes, including nasal and oral routes, and binds the functional receptor angiotensin-converting enzyme 2 (ACE2) [14]. ACE2 is highly expressed on alveolar epithelial cells, enterocytes, endothelial cells, and the oral mucosa. SARS-CoV-2 enters cells through binding of its spike (S) protein to ACE2, and the viral RNA is subsequently replicated and propagated by the host machinery. However, the possibility that other receptors (neuropilin 1 and the neutral amino acid transporter B0AT1) support viral entry should not be ruled out [15]. Because of its rapid replication, the virus damages infected organs, particularly the lungs. During the acute phase, immune cells, particularly dendritic cells and macrophages, secrete signaling molecules, which recruit other immune cells to the infected environment [11]. This process is not transient and leads to the hyperactivation of immune cells, which copiously secrete cytokines and chemokines that damage infected organs, in a response termed a “cytokine storm” [16]. Furthermore, the involvement of diverse pathways has been proposed, including cyclooxygenase-2 (COX-2) and prostaglandin E2 (PGE2) mediated pathways [17, 18], programmed cell death mediated by stimulation of neutrophil extracellular traps (NETosis) [19], and other pathways [20]. In addition, the S protein-ACE2 interaction results in down-regulation of ACE2 on lung epithelial cells; this response is believed to be a primary cause of lung injury [21]. ACE2 down-regulation simultaneously upregulates ACE1 via negative feedback mechanisms [22], thus leading to excessive angiotensin II. Angiotensin II in turn binds angiotensin II receptor type 1 (AGTR1A) and elicits excessive pulmonary vascular activity consistent with the lung pathology observed in pulmonary destruction [22, 23].
Irrespective of the signaling pathways, COVID-19 progresses through processes including lymphopenia, cytokine storms, accumulation of macrophages and neutrophils in the lungs, immune dysregulation, and acute respiratory distress syndrome.
EMERGENCE OF SARS-COV-2 VARIANTS
The human immune system is tightly regulated during infection and exhibits long-lasting antigen/pathogen-specific memory over time [24–26]. To evade the host immune system, continually evolving viruses may either hijack antiviral pathways or produce diverse mutations increasing genetic diversity. These mutations can lead to significant changes in viral protein structure, including antigenic drift, folding/masking of critical antigenic residues exposed to the host, and production of new protein sequences [27, 28]. Multiple new SARS-CoV-2 variants have arisen worldwide, including the Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1), Delta (B.1.617.2), Epsilon (B.1.427 and B.1.429), Eta (B.1.525), Iota (B.1.526), Kappa (B.1.617.1), Mu (B.1.621, B.1.621.1), Zeta (P.2), Theta (P.3), and Omicron (B.1.1.529, BA.1, BA.1.1, BA.2, BA.3, BA.4, and BA.5) variants (Fig 1) [29–32]. Changes in viral S protein alter the viral receptor binding pattern in the host, thus enhancing the infection potential, evading conventional detection methods, producing diverse disease outcomes, and decreasing susceptibility to the immune response raised against earlier circulating SARS-CoV-2 strains or vaccines [33]. The emergence of these variants has posed continuing challenges to host immunity induced by first-generation vaccines or prior infection.
FIRST-GENERATION COVID-19 VACCINES
Despite causing extraordinary economic and livelihood losses globally, the COVID-19 pandemic has also driven remarkable scientific and technological progress in the past several years. A complete SARS-CoV-2 genome was quickly determined [34], and structural elucidation was performed by mapping the host receptor interactions [35]. Substantial success was attained in vaccine development against SARS-CoV-2. More than 90 vaccine candidates are being developed worldwide through various strategies, including live attenuated or inactivated vaccines, viral vector-based vaccines, recombinant subunit vaccines, and nucleic acid (DNA or RNA) vaccines [36, 37]. Among them, two mRNA-based vaccines (Pfizer-BioNTech and Moderna) (Fig 2) and an adenovirus 26-based vaccine (Johnson & Johnson) were approved by the United States Food and Drug Administration for clinical use under emergency authorization. Since then, the two mRNA vaccines have received full approval. Subsequently, additional vaccines have been approved for COVID-19, including Novavax’s adjuvanted vaccine (SARS-CoV-2 S protein with Matrix-M adjuvant). Millions of doses of these vaccines have been distributed worldwide and are being used to immunize both previously infected and uninfected populations [38]. These first-generation COVID-19 vaccines have been extensively reviewed elsewhere [37, 39–41].
Despite these remarkable achievements, new problems have emerged that challenge the long-term control of the pandemic; these problems include emerging viral variants (Fig 1) with greater transmissibility and immune escape ability, and waning immunity over time in vaccinated individuals [32, 42].
Most first-generation COVID-19 vaccines approved for clinical use or under clinical development target the viral S protein, its subunits (S1 and S2) or its receptor binding domain (RBD) (Fig 3A), and are aimed at eliciting a protective neutralizing antibody response. Variants with S proteins bearing structural changes—particularly those of the Delta and Omicron strains with increased transmissibility and immune escape—have emerged and become predominant [43–50]. For example, multiple S variants have been shown to have diminished sensitivity to neutralizing antibodies induced by the first-generation S-targeting vaccines [45–47]. Clinical research has also indicated that S-targeting COVID-19 vaccines show diminished efficacy against the Delta [51] and Omicron [52] strains. Thus, novel vaccine strategies targeting conserved regions of the virus to induce broader protection against COVID-19 variants are needed.
SARS-COV-2 NUCLEOCAPSID (N) PROTEIN BIOLOGY
N is an important structural protein of SARS-CoV-2 and is abundantly expressed in infected cells [53]. Hundreds of copies of N are present in the viral core, which encloses the genomic RNA. Like other viral structural proteins, N protein plays crucial roles in the CoV life cycle, including mRNA transcription, cytoskeletal organization, RNA replication, packaging, and immune regulation [54]. Liquid-liquid phase separation of N protein regulates viral transcription and assembly [55]. Initial studies elucidating N protein structure by nuclear magnetic resonance have indicated that its N-terminal domain (NTD) has an overall right-handed fold structure composed of a β-sheet core with an extended central loop [56]. The protein contains a total of 419 amino acids and is composed of two alpha helices and five beta sheets. It is functionally divided into two domains: the NTD and the C-terminal domain (CTD) (Fig 3). The NTD, which contains an RNA-binding domain, acts as a linker between the RNA and matrix, thus enabling virion formation. N binds the positive-sense single-stranded RNA to form the ribonucleoprotein core, which has a “beads-on-a-string” appearance in micrographs. The CTD assists in RNA binding and contains a dimerization sequence, which facilitates RNA synthesis via interaction with the replication-transcription complexes [57, 58]. The CTD dimerization sequence also facilitates viral genome incorporation into virions. The intrinsically disordered region (IDR), located between the two domains, physically links the positive-sense viral RNA genome and the matrix protein [55]. Interestingly, the presence of the IDR favors N protein proteolysis, which is a key strategy for viral proliferation [59]. The proteolytic products, particularly N1–209, interact with immunophilin (i.e., Cyclophilin A) in host cells and promote the SARS-CoV-2 replication cycle [59].
Because of its abundant expression after SARS-CoV-2 infection, N protein has been used as a diagnostic marker in several antigen tests, including IgG binding assays [60]. The coronavirus N protein has also been shown to modulate the host intracellular machinery and play regulatory roles during the viral life cycle. Therefore, the N protein domains have been targeted for antiviral development through inhibition or blocking of their RNA-binding activity or oligomerization capabilities. More detailed information can be found in comprehensive reviews published elsewhere [61, 62].
HOST IMMUNITY TO SARS-COV-2 N PROTEIN
Most patients with COVID-19 mount a specific immune response against the full-length N protein and its fragments and, to a lesser extent, against a fragment containing amino acids 300–685 of S protein [63]. A total of 82% of convalescent patients have detectable N-specific IgG 12 months after SARS-CoV-2 infection, in addition to S-specific antibodies [64]. In terms of T cell responses, data from convalescent patients have indicated that the S, M, and N proteins account for 27%, 21%, and 11% of the total CD4+ response, respectively [65]. Higher frequencies of multi-cytokine-producing (poly-functional) M- and N-specific CD8+ T cells are associated with mild COVID-19 [66]. Moreover, N-specific T cell and B cell (in terms of antibody secretion) responses have been reported in different cohorts of patients with COVID-19 [65, 67, 68]. N induces robust CD4+ and CD8+ T cell responses. CD4+ and CD8+ T cells have been found to recognize multiple regions of N protein in all individuals who have recovered from COVID-19 [66]. Another study comparing the T cell response to SARS-CoV-1 and SARS-CoV-2 N protein has indicated that N-specific memory T cells from SARS-CoV-1 infected individuals are long-lasting: they remained detectable 17 years after the SARS outbreak in 2003 and showed strong cross-reactivity to the N protein of SARS-CoV-2 [66].
Similarly, CD8+ T cells specific to an immunodominant SARS-CoV-2 N epitope cross-react with selected seasonal coronaviruses [69]. Screening of SARS-CoV-2 peptide pools has revealed that the N protein induces an immunodominant response in HLA-B7+ patients who have recovered from COVID-19, and that this response is also detectable in SARS-CoV-2 unexposed donors [70]. A single N-encoded epitope that is highly conserved across circulating coronaviruses drives this immunodominant response. Notably, CD8+ T cell responses against the N protein epitope (N105–113) restricted by HLA-B*07:02 demonstrate strong antiviral activity and correlate with protection against severe disease [70]. Indeed, the most immunodominant CD8+ T cell response known to date is directed against the N105 peptide presented by HLA-B*07:02; this response arises from a high frequency of T cells within the naive T cell repertoire that recognize N105, rather than from T cells previously primed with coronavirus [70]. SARS-CoV-2 infection induces strong CD4+ and CD8+ T cell responses, particularly in recovered patients with severe rather than mild disease [10, 71]. Importantly, in recovered patients with mild disease, a high N-specific CD8+ T cell response has been observed [71]. More detailed information on cellular immunity induced by SARS-CoV-2 infection or vaccination has been reviewed elsewhere [72].
NUCLEOCAPSID-BASED SARS-COV-2 VACCINES
Beyond the S protein, the N protein is a well-recognized dominant target of antibody and T cell responses in SARS-CoV-2-infected individuals and therefore has been suggested as a potential immunogen to augment vaccine-mediated protective immunity [73]. Beyond its use as a diagnostic marker, N-specific IgG detected in the first week of SARS-CoV-2 infection is associated with rapid symptom resolution [74]. Although current S-protein-targeting vaccines confer strong protection against ancestral SARS-CoV-2, the emerging variants show greater evasion of vaccine- and/or infection-elicited S-protein-specific neutralization than prior variants [75]. Therefore, vaccines expressing N protein have potential value in vaccine development. Indeed, efforts a decade ago were aimed at investigating N protein as a potential vaccine immunogen against coronaviruses. The COVID-19 pandemic has renewed interest in these investigations [61, 73, 76–78]. Several N-based vaccine immunogens have been developed via various technologies.
An adenovirus type 5 (Ad5) vector expressing the N protein of ancestral SARS-CoV-2 has been investigated as a vaccine candidate [79, 80]. The vaccine is immunogenic and elicits both antibody and T-cell (CD4+ and CD8+) responses in mice [79]. Ad5-N alone has been found to induce very modest protection against SARS-CoV-2 in a K18-hACE2 transgenic mouse model (expressing the human angiotensin-converting enzyme 2 [hACE2] receptor). However, combining Ad5-N with Ad5 expressing S (Ad5-S) has been observed to have synergistic effects and to induce stronger protection in the mouse brain (but not the lungs) than Ad5-S alone [79]. Another independent study testing an Ad5-N vaccine has observed vaccine-induced protective immunity in Syrian hamsters and K18-hACE2 mice, as evidenced by decreased animal weight loss and viral loads [81].
N protein-based subunit vaccines have also been developed and tested. SARS-CoV-2 N protein either alone or in combination with well-known adjuvants (e.g., Freund’s adjuvant, alum, or QS-21) has been shown to be immunogenic in animal models [82, 83]. A synthetic peptide vaccine comprising N epitopes (HLA class I bound cytotoxic T lymphocyte peptides) along with adjuvants (the Toll-like receptor [TLR] 4 agonist MPLA and the TLR9 agonist CpG oligonucleotide) has shown moderate protection against SARS-CoV-2 in rhesus macaques [84].
Our group has generated a nucleoside-modified, lipid-nanoparticle (LNP)-formulated mRNA vaccine that encodes the full-length N protein of SARS-CoV-2 (Wuhan-Hu-1 strain) (mRNA-N) [60]. Immunogenicity analyses in mice have indicated that mRNA-N is highly immunogenic and induces strong N-specific binding antibody and T cell (both CD4+ and CD8+) responses. As expected, no neutralizing antibody response is elicited by mRNA-N [60]. Similarly to N-expressing vaccines based on other platforms [79, 80], the mRNA-N vaccine has been found to induce modest protection against mouse-adapted SARS-CoV-2 and the Delta strain in mice and hamsters [60].
MULTIVALENT SARS-COV-2 VACCINES EXPRESSING MULTIPLE VIRAL PROTEINS
To elicit both neutralizing antibodies and broadly protective T cell responses, several multivalent SARS-CoV-2 vaccines expressing both S and N proteins have been developed, mainly through use of viral vectors. A synthetic modified vaccinia Ankara (MVA) vaccine expressing SARS-CoV-2 S and N proteins (COH04S1) has been shown to be immunogenic [85] and to confer protection against a challenge with ancestral SARS-CoV-2 in non-human primates (NHPs) [86]. The efficacy of this vaccine against major SARS-CoV-2 variants remains to be determined. The vaccine has been evaluated in a phase I clinical trial, which has indicated that the vaccine is well tolerated and elicits S- and N-specific immune responses in healthy participants [87]. Similarly, another independent study has investigated the effect of a recombinant MVA vaccine expressing S and N proteins (MVA/SdFCS-N) in NHPs against the SARS-CoV-2 Delta variant [88]. Among the different routes of administration, MVA/SdFCS-N administration via an intramuscular route has been found to confer better protection against the Delta variant than other routes [88].
Another study has focused on the development of trivalent COVID-19 vaccines by using adenoviral vectors of human and chimpanzee origin expressing the S1, N, and RNA-dependent RNA polymerase proteins of SARS-CoV-2 [89] and has also compared intramuscular and intranasal immunization strategies. Of note, a single intranasal mucosal immunization led to strong systemic neutralizing antibody responses. In addition, intranasal vaccination increased tissue-resident memory CD8+ T cells in the respiratory mucosa as well as locally trained macrophages. The intranasal strategy conferred protection against the parent strain of SARS-CoV-2 as well as variants of concern (VOCs; B.1.1.7 and B.1.351) [89].
Using the human Ad5 vector, ImmunityBio has developed an approach targeting both SARS-CoV-2 S and N proteins. The S protein and the N protein with an Enhanced T-cell Stimulation Domain (N-ETSD) are used together in the vaccine, and both have been found to be immunogenic in mice. Subcutaneous priming and intranasal or subcutaneous boosting with this dual antigen vaccine induces greater Th1-biased T-cell and humoral responses than Ad5-S alone [90]. In a phase I trial, a single-dose (1 × 10¹¹ viral particles) vaccination in healthy adults has been found to elicit a strong T cell response [91]. This platform supports the feasibility of needleless immunizations in general. However, viral challenge studies are needed to evaluate the vaccine efficacy in animal models.
Vesicular stomatitis virus is a commonly used vector for mucosal delivery. Intranasal administration of a vesicular stomatitis virus-based vaccine expressing S and N proteins has been found to elicit a mucosal immune response and protective effects in a hamster viral challenge model, whereas the same vaccine administered intramuscularly is only marginally protective [92].
In addition to viral vectors, multivalent vaccines based on DNA or proteins have been reported. Mice immunized with a DNA vaccine encoding S/RBD and N protein have been found to develop broad neutralizing antibodies and N-specific T cell responses [93]. The DNA vaccine has been found to protect mice against lethal SARS-CoV-2 infection [93]. Similarly, a phase I clinical trial of a DNA vaccine (GX-19N™) encoding the RBD and N protein has demonstrated its safety and immunogenicity [94], although the efficacy of this vaccine in animal models is unclear. Hong and colleagues have immunized macaques with a multivalent protein subunit vaccine comprising the RBD fused with tetanus toxoid epitope P2 (RBD-P2) and N protein, and have shown that addition of N protein induces slightly faster SARS-CoV-2 clearance than that induced by RBD-P2 alone in NHPs [95].
Although the above multivalent vaccine approaches have been found to elicit various levels of protection against SARS-CoV-2 or its variants in different animal models, a direct comparison of their efficacy in protecting against SARS-CoV-2 VOCs, with respect to the efficacy of current clinically approved first-generation COVID-19 vaccines, particularly the mRNA-S vaccines, is critically lacking. Their efficacy against the currently predominant Omicron variants also remains to be determined through comparison with the clinically approved S-protein-targeting vaccines.
As discussed earlier, the mRNA vaccine expressing the full-length N protein (mRNA-N) generated by our group confers modest protection against a mouse-adapted SARS-CoV-2 strain and the Delta strain. To enable comparison with first-generation S-protein-targeting vaccines, we also generated an mRNA-LNP vaccine expressing the prefusion-stabilized SARS-CoV-2 S protein (mRNA-S), whose immunogenicity and protective efficacy against multiple SARS-CoV-2 VOCs were evaluated either alone or in combination with mRNA-N (mRNA-S+N) in rodent models [60]. Our data showed that mRNA-S+N vaccination induced markedly stronger protection than mRNA-S alone against multiple SARS-CoV-2 strains, including a mouse-adapted strain, Delta, and Omicron, in both the lower and upper respiratory tracts. The difference between mRNA-S alone and mRNA-S+N was most pronounced in the protection against Omicron: whereas mRNA-S alone had substantially diminished efficacy against Omicron because of strong immune escape, mRNA-S+N vaccination conferred substantial protection against Omicron (four of five animals had no detectable viral RNA or titers in the lungs) [60], thus indicating induction of cross-protective immunity against VOCs by mRNA-S+N. In vivo CD8+ T cell depletion experiments indicated that CD8+ T cells play a critical role in protection against VOCs after mRNA-S+N immunization [60]. Collectively, our findings suggest that the bivalent mRNA-S+N is a potential pan-COVID-19 vaccine for emerging SARS-CoV-2 variants.
SUMMARY AND FUTURE DIRECTIONS
As new variants of SARS-CoV-2 continue to arise, pan-COVID-19 vaccines that induce broad protection against emerging strains must urgently be developed. Current strategies to develop vaccines against SARS-CoV-2 VOCs include the use of VOC-specific S immunogens. Although a VOC-targeted S protein booster has been clinically approved for the two mRNA vaccines, this strategy is likely to be challenging, because as S proteins continually mutate, the design and selection of VOC-specific sequences will become less optimal. Given the more conserved sequence of N than S protein across different variants, as well as the cross-reactive, long-lasting N-protein-specific T cell immunity [96, 97], our study and those from other groups have demonstrated the benefit and utility of including an N immunogen in next-generation COVID-19 vaccines for VOCs. Our study has indicated that, despite the use of ancestral sequences, the inclusion of mRNA-N along with mRNA-S for vaccination elicits robust protection against both the Delta and Omicron strains, which is not achieved by mRNA-S alone. Given the demonstrated safety profile of the mRNA-LNP platform in large human populations, the bivalent mRNA-S+N vaccine should be advanced to clinical testing.
Several important questions remain to be explored. First, the durability of vaccine-elicited protective immunity is critical [98, 99]. Whether the dual mRNA-S+N vaccine induces more durable protective immunity than mRNA-S alone is unclear and is currently under investigation. Second, because a large human population has been vaccinated with first-generation vaccines or naturally infected with the virus, further testing of the inclusion of an N immunogen as a booster strategy is likely to reveal additional insights into the utility of this bivalent vaccine approach against SARS-CoV-2 VOCs. Third, further validation of the safety and efficacy of the vaccine approach in larger animal models will be helpful before clinical testing. Finally, because SARS-CoV-2 VOCs can infect individuals through vaccine breakthrough and antibody escape, screening for broadly neutralizing or pan-neutralizing antibodies against SARS-CoV-2 VOCs is also important for the development of a pan-COVID-19 vaccine, as well as for the prevention and treatment of VOC infections in the future.