Mass Spectrometry Analysis of SARS-CoV-2 Nucleocapsid Protein Reveals Camouflaging Glycans and Unique Post-Translational Modifications : Infectious Microbes & Diseases


Original Article

Mass Spectrometry Analysis of SARS-CoV-2 Nucleocapsid Protein Reveals Camouflaging Glycans and Unique Post-Translational Modifications

Sun, Zeyu#; Zheng, Xiaoqin#; Ji, Feiyang; Zhou, Menghao; Su, Xiaoling; Ren, Keyi; Li, Lanjuan

Editor(s): van der Veen, Stijn

Infectious Microbes & Diseases 3(3):p 149-157, September 2021. | DOI: 10.1097/IM9.0000000000000071

Introduction

The ongoing devastating coronavirus disease 2019 (COVID-19) pandemic is caused by the newly identified severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which can cause lower respiratory tract infections and atypical pneumonia.1–3 Compared to its close relative SARS-CoV, SARS-CoV-2 is more contagious.4–6 As in other coronaviruses, the SARS-CoV-2 virion is composed of the trimeric Spike protein (S), the membrane glycoprotein (M), and the envelope protein (E) on its surface, and nucleocapsid phosphoprotein (N, or NCP) in its core.7 Tremendous efforts have been made to elucidate structural and functional details of protein S due to its essential role in host attachment, thus making it the primary target for vaccine or antibody development.1,8–11 However, structural and functional details of other structural proteins of SARS-CoV-2 have not been investigated thoroughly.

Being the most abundant structural protein, NCP constitutes the coronavirus (CoV) core that encapsulates and stabilizes the viral genome.12 In infected cells, NCP plays a vital role in viral transcription and virion assembly.12–16 The multifunctional roles of NCP were also revealed by studies showing its ability to disrupt host innate immune responses17–19 and to manipulate host gene expression.20 At the early stage of SARS-CoV infections, NCP can trigger immediate and strong immune responses. Therefore, both NCP and anti-NCP antibodies can be detected in the serum of SARS-CoV patients, even at the early stage of infections.21 Studies have demonstrated that NCP of SARS-CoV or SARS-CoV-2 is a more sensitive indicator of early infections than IgG or viral RNA.22,23 Because SARS-CoV-2 NCP shares high homology with its SARS-CoV counterpart,24 it is presumed to have similar value for diagnostic applications.23,25–28 Moreover, the strong immunogenicity of NCP also makes it an attractive target to develop SARS-CoV vaccines as an alternative to protein S.29 Analogously, SARS-CoV-2 NCP has also been proposed as a candidate target for the development of vaccines30–32 and antiviral drugs33–36 against COVID-19. This idea was further prompted by increasing concerns over the efficacy of currently deployed vaccines or antibodies targeting SARS-CoV-2 protein S, which mutates rapidly worldwide.37–39 Encouragingly, recent studies have shown that SARS-CoV-2 patients are more likely to develop T cell responses against NCP than against protein S, and are able to generate durable B cell memory for both NCP and protein S.40–42

The SARS-CoV NCP was previously found to contain several important post-translational modifications (PTMs), primarily in the form of phosphorylation.7,43–45 Phosphorylation of SARS-CoV-2 NCP was also recently proposed as a key factor in modulating nucleocapsid assembly.45,46 However, little is known about NCP decoration with other forms of PTMs that may regulate its functions or interactions with host and viral components. In particular, as bioinformatic analyses predict five possible glycosylation sequons (47NNT, 77NSS, 192NSS, 196NST, 269NVT) in NCP, it will be interesting to test if those sites contain glycans, which may contribute greatly to its immunogenicity and functions. In this study, we report a comprehensive analysis of the SARS-CoV-2 NCP PTM profile using high-resolution mass spectrometry, based on which a protein structural model was proposed to highlight possible biological or immunological implications. For the first time, we show that SARS-CoV-2 NCP is decorated with N-glycans and ubiquitin in addition to phosphoryl groups.

Results

Landscape of PTMs on SARS-CoV-2 NCP

To portray the PTM landscape, liquid chromatography with tandem mass spectrometry (LC-MSMS) data of peptides, derived by digestion of SARS-CoV-2 NCP with multiple proteases, were subjected to iterative spectra-database matching for common PTMs (Figure 1). Detailed PTM identifications can be found in Supplementary Table 1, https://links.lww.com/IMD/A9. Our data revealed that SARS-CoV-2 NCP is decorated with a plethora of PTMs (Figure 2A), including acetylation (K342, K375), succinylation (K387, K388), di-Gly (K169, K374, K388), phosphorylation (S2, S26, S176, S180, S184, T265, T379, T391, T393, S410, S416), and methylation (E136), as summarized in Table 1. All domains of NCP were decorated with PTMs, with the majority of PTMs clustered in the N terminal domain (NTD) and C tail of NCP. Consistent with the expectation that NCP is heavily phosphorylated, this study identified five novel NCP phosphorylation sites in addition to previously archived sites,47,48 amounting to a total of 27 identified phosphorylation sites out of 80 potential sites in SARS-CoV-2 NCP predicted by the GPS 5.0 tool49 (Figure 2B, Supplementary Table 2, https://links.lww.com/IMD/A10). Example mass spectra supporting the phosphorylation identifications can be found in Supplementary Figure 1, https://links.lww.com/IMD/A12. Other PTMs, such as crotonylation (K), farnesylation (K/Nterm), myristoylation (K/Nterm), palmitoylation, or prenylation (C), were not found in this study.

Figure 1: Overall workflow for comprehensive PTM survey of SARS-CoV-2 NCP using multi-enzyme digestion and enrichment techniques. DTT: dithiothreitol; FASP: filter aided sample preparation; HILIC: hydrophilic interaction liquid chromatography; IAA: iodoacetamide; NCP: nucleocapsid phosphoprotein; PTM: post-translational modification; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; SIMAC: sequential elution from IMAC (immobilized metal ion affinity based chromatography).
Figure 2: Summary of PTMs in SARS-CoV-2 NCP. A: Distribution of PTMs identified in this study relative to functional subunits and domains of SARS-CoV-2 NCP. PTM notation can be found in the lower right legend. Predominant N-glycan forms are marked directly on both glycosylation sites. B: Venn diagram summary of SARS-CoV-2 NCP phosphorylation sites identified in our and previous studies. C: Lysates from 293T cells expressing SARS-CoV-2 NCP were subjected to N and IgG pull-down followed by immunoblotting analyses. CTD: C terminal domain; NCP: nucleocapsid phosphoprotein; NTD: N terminal domain; PTM: post-translational modification; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.
Table 1 - Summary of PTMs identified from SARS-CoV-2 NCP
| Domain | Site | PTM | Sequence identified |
| N-arm | 2 | Phosphorylation | SDNGPQNQRNAPRITF |
| | 26 | Phosphorylation | ITFGGPSDSTGSNQNGER |
| NTD | 77 | N-glycosylation | GQGVPINTNSSPDDQIGYYR |
| | 136 | Methylation | DGIIWVATEGALNTPK |
| | 169 | di-Gly | NPANNAAIVLQLPQGTTLPK |
| | 176 | Phosphorylation | GFYAEGSRGGSQASSR; GFYAEGSR |
| | 180 | Phosphorylation | GFYAEGSRGGSQASSR |
| | 184 | Phosphorylation | GFYAEGSRGGSQASSR |
| CTD | 265 | Phosphorylation | QKRTATKAYNVTQAFGR |
| | 269 | N-glycosylation | AYNVTQAFGR |
| | 342 | Acetylation | LDDKDPNFK |
| C-tail | 374 | di-Gly | KKADETQALPQR |
| | 375 | Acetylation | KADETQALPQR |
| | 379 | Phosphorylation | KADETQALPQR; ADETQALPQR |
| | 387 | Succinylation | QKKQQTVTLLPAADLDDFSK |
| | 388 | di-Gly | KQQTVTLLPAADLDDFSK |
| | 388 | Succinylation | QKKQQTVTLLPAADLDDFSK |
| | 391 | Phosphorylation | KQQTVTLLPAADLDDFSK; QTVTLLPAADLDDFSK |
| | 393 | Phosphorylation | QTVTLLPAADLDDFSK |
| | 410 | Phosphorylation | KQQTVTLLPAADLDDFSKQLQQSMSSADSTQA |
| | 416 | Phosphorylation | |
A blank Domain cell means the row belongs to the domain named above it.
CTD: C terminal domain; NCP: nucleocapsid phosphoprotein; NTD: N terminal domain.
Modified sites. Tryptic digestion resulted in di-Gly residue on lysine modified by ubiquitin.

Confirmation of ubiquitination of SARS-CoV-2 NCP

A di-Gly remnant on lysine after tryptic digestion can arise from modification by either ubiquitin or ISG15. To resolve this ambiguity, SARS-CoV-2 NCP expressed in 293T cells was immunoprecipitated and then immunoblotted with either anti-ubiquitin or anti-ISG15 antibodies. Compared with the negative control sample (IgG immunoprecipitated), SARS-CoV-2 NCP was clearly modified by ubiquitin rather than by ISG15 (Figure 2C).

Determination of N-glycosylation on SARS-CoV-2 NCP

To characterize glycosylation, glycopeptides from SARS-CoV-2 NCP digested with multiple proteases were enriched by hydrophilic interaction liquid chromatography-solid phase extraction (HILIC SPE). To determine glycosylated sites, peptides were deglycosylated by PNGase F in H2O18, creating a +2.98 Da mass shift that marks each formerly glycosylated asparagine. Out of five possible N-X-S/T glycosylation sites, two sites (N77, N269, Table 1) in NCP were confirmed by intact glycopeptide characterization, while no evidence of N-glycosylation was found for the other three potential sites (N47, N192, and N196). Example mass spectra showing evidence of glycosylation are shown in Supplementary Figure 2A and 2B, https://links.lww.com/IMD/A12.
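The size of this isotopic mark can be checked with simple arithmetic. The following is a minimal sketch, assuming standard monoisotopic atomic masses (these values are not taken from the article): PNGase F converts the glycosylated asparagine to aspartate, so the residue gains one oxygen from the solvent and loses NH.

```python
# Monoisotopic atomic masses in Da (standard reference values, assumed here;
# they are not reported in the article).
MASS_H = 1.007825
MASS_N = 14.003074
MASS_O16 = 15.994915
MASS_O18 = 17.999160


def deamidation_shift(oxygen_mass: float) -> float:
    """Mass change for Asn -> Asp: gain one O from the solvent, lose NH."""
    return oxygen_mass - (MASS_N + MASS_H)


# Deamidation in ordinary water versus 18O-labelled water.
shift_16o = deamidation_shift(MASS_O16)
shift_18o = deamidation_shift(MASS_O18)
print(f"16O water: +{shift_16o:.4f} Da, 18O water: +{shift_18o:.4f} Da")
```

The 18O value comes out close to the +2.98 Da shift quoted above, while deamidation in ordinary water would give only about +0.98 Da, which is why the heavy-water label helps distinguish enzymatic deglycosylation from spontaneous deamidation.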

Intact glycopeptides were also analyzed directly by LC-MSMS to resolve the highly heterogeneous glycan components. Our analysis identified 55 unique N-glycopeptides, each defined as a unique amino acid sequence with a unique N-glycan composition (Supplementary Table 3, https://links.lww.com/IMD/A11). Glycans at both sites (N77, N269) were confirmed by intact glycopeptide profiling. The composition analysis by pGlyco (Figure 3A) suggested that N77 was preferentially occupied by complex type N-glycans, with 13 possible configurations comprising nearly 95% of total LC-MS intensity of all glycopeptides involving site N77, while the remaining 5% intensity was contributed by a hybrid N-glycan. According to LC-MS intensity, the most dominant glycan on N77 was Fuc2Hex3HexNAc6, which is predicted to be a fucosylated bi-antennary complex. In comparison, nearly 79% of LC-MS intensity of all N269-related glycopeptides can be attributed to six high-mannose oligosaccharides, while the remaining 21% was contributed by complex and hybrid N-glycans. The most dominant glycan on N269 was Hex5HexNAc2. So far, no evidence of sialic acid conjugation to NCP glycans was found at either site. Example N-glycopeptide spectra can be found in Supplementary Figure 2C and 2D, https://links.lww.com/IMD/A12.
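Composition strings such as Fuc2Hex3HexNAc6 can be turned into monoisotopic masses with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the monosaccharide residue masses are standard reference values rather than values from the article, and it is not how pGlyco computes masses internally.

```python
import re

# Monoisotopic residue masses in Da (standard values, assumed here).
RESIDUE_MASS = {"Hex": 162.052824, "HexNAc": 203.079373, "Fuc": 146.057909}
WATER = 18.010565

# "HexNAc" is listed before "Hex" so the longer name is matched first.
TOKEN = re.compile(r"(HexNAc|Hex|Fuc)(\d+)")


def parse_composition(text: str) -> dict:
    """Parse e.g. 'Fuc2Hex3HexNAc6' into {'Fuc': 2, 'Hex': 3, 'HexNAc': 6}."""
    return {name: int(count) for name, count in TOKEN.findall(text)}


def free_glycan_mass(text: str) -> float:
    """Monoisotopic mass of the released (free) glycan, including one water."""
    composition = parse_composition(text)
    residues = sum(RESIDUE_MASS[name] * n for name, n in composition.items())
    return residues + WATER


for glycan in ("Fuc2Hex3HexNAc6", "Hex5HexNAc2"):
    print(glycan, parse_composition(glycan), round(free_glycan_mass(glycan), 4))
```

When the glycan is attached to a peptide, the mass it adds is the residue sum without the extra water, so the `WATER` term would be dropped for glycopeptide calculations.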

Figure 3: SARS-CoV-2 NCP decorated with glycans. A: Relative LC-MS intensity of four major N-glycan categories on both sites. B: The most likely glycan structure on site N77 was added to the NTD RNA-binding domain of SARS-CoV-2 NCP based on the NTD binding model with dsRNA (Protein Data Bank ID: 7ACS, sequence 44-180). C: The most likely glycan structure on site N269 was added to the CTD dimerization domain of SARS-CoV-2 NCP based on the model of four polymerized CTD subunits (Protein Data Bank ID: 6WZQ, sequence 247-364), with each subunit colored differently. All NCP sequences are represented in cartoon mode while glycans are represented in sphere mode. CTD: C terminal domain; dsRNA: double-stranded RNA; LC-MS: liquid chromatography with mass spectrometry; NCP: nucleocapsid phosphoprotein; NTD: N terminal domain; PTM: post-translational modification; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.

Based on the structures of the most likely N-glycans derived in this study and the SARS-CoV-2 NCP structure models, we generated atomic models that represent the most likely spatial distribution of the N-glycans on the SARS-CoV-2 NCP (Figure 3B and 3C). Since the full structure of NCP has not yet been determined, currently available partial structure models of the N-terminal RNA-binding domain (7ACS) and C-terminal dimerization domain (6WZQ) were used to model glycans on sites N77 and N269, respectively.

Alignment of PTMs with immunogenic epitopes on SARS-CoV-2 NCP

As PTMs, particularly in the form of glycans, can drastically change the immunogenicity of a protein, we checked whether PTMs are located in potential epitope areas of SARS-CoV-2 NCP. Because experimental data on SARS-CoV-2 NCP immune recognition are currently lacking, we resorted to bioinformatic prediction tools to generate potential epitope sequences. Out of 11 B cell epitopes of NCP predicted by Bepipred, six involved phosphorylation, ubiquitination, succinylation, acetylation, and/or glycosylation (Table 2). In particular, multiple phosphorylation sites in the SR-rich domain (S176, S180, S184) and ubiquitination (K169) were found in epitope 165-216. Another C-tail epitope (358-402) was decorated by a variety of PTMs including phosphorylation (T379, T391, T393), ubiquitination (K374, K388), acetylation (K375), and succinylation (K387, K388). Notably, glycosylation (N77) was also found in a B cell epitope (59-105). In addition, as a viral core protein, NCP is likely to be processed endogenously and presented to CD8+ T cells as major histocompatibility complex class I (MHC-I)–associated peptides. Out of 11 T-cell epitope peptides predicted by NetMHCpan, only two contain PTMs: ubiquitination (K169) in epitope 165-173 and acetylation (K342) in epitope 338-346.

Table 2 - Predicted immuno-epitopes in SARS-CoV-2 NCP aligned with PTMs
| Start | End | Epitope sequence | Length | Score | PTM |
B cell epitope sequences predicted by Bepipred algorithm
| 4 | 15 | NGPQNQRNAPRI | 12 | 0.60 | |
| 17 | 48 | FGGPSDSTGSNQNGERSGARSKQRRPQGLPNN | 32 | 0.68 | Phosphorylation: S26 |
| 59 | 105 | HGKEDLKFPRGQGVPINTNSSPDDQIGYYRRATRRIRGGDGKMKDLS | 47 | 0.56 | Glycosylation: N77 |
| 119 | 127 | AGLPYGANK | 9 | 0.54 | |
| 137 | 163 | GALNTPKDHIGTRNPANNAAIVLQLPQ | 27 | 0.58 | |
| 165 | 216 | TTLPKGFYAEGSRGGSQASSRSSSRSRNSSRNSTPGSSRGTSPARMAGNGGD | 52 | 0.66 | Ubiquitination: K169; Phosphorylation: S176, S180, S184 |
| 226 | 267 | RLNQLESKMSGKGQQQQGQTVTKKSAAEASKKPRQKRTATKA | 42 | 0.60 | Phosphorylation: T265 |
| 276 | 299 | RRGPEQTQGNFGDQELIRQGTDYK | 24 | 0.56 | |
| 343 | 348 | DPNFKD | 6 | 0.53 | |
| 358 | 402 | DAYKTFPPTEPKKDKKKKADETQALPQRQKKQQTVTLLPAADLDD | 45 | 0.62 | Ubiquitination: K374, K388; Acetylation: K375; Phosphorylation: T379, T391, T393; Succinylation: K387, K388 |
| 404 | 416 | SKQLQQSMSSADS | 13 | 0.58 | Phosphorylation: S410, S416 |
MHC-I epitope sequences predicted by NetMHCpan EL algorithm
| 112 | 121 | YLGTGPEAGL | 10 | 0.229 | |
| 158 | 167 | VLQLPQGTTL | 10 | 0.197 | |
| 165 | 173 | TTLPKGFYA | 9 | 0.095 | Ubiquitination: K169 |
| 220 | 230 | ALLLLDRLNQL | 11 | 0.956 | |
| 226 | 234 | RLNQLESKM | 9 | 0.126 | |
| 305 | 313 | AQFAPSASA | 9 | 0.176 | |
| 316 | 324 | GMSRIGMEV | 9 | 0.422 | |
| 330 | 339 | WLTYTGAIKL | 10 | 0.066 | |
| 338 | 346 | KLDDKDPNF | 9 | 0.552 | Acetylation: K342 |
| 351 | 359 | ILLNKHIDA | 9 | 0.125 | |
| 397 | 407 | AADLDDFSKQL | 11 | 0.191 | |
Bepipred scores were calculated for each amino acid, and the score for a peptide (selection threshold >0.5) was calculated as the average of all its amino acids. For MHC peptide prediction, the panEL prediction score is provided, and peptides with a top 2% ranking score (corresponding to >0.095 in our analysis) were selected. MHC-I: major histocompatibility complex class I; NCP: nucleocapsid phosphoprotein; PTM: post-translational modification; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.

Discussion

In coronavirus-infected cells, NCP is mainly located in the cytoplasm. It is able to self-associate into an oligomer and bind to viral RNA via its NTD, thus forming the ribonucleoprotein core complex that protects viral RNA and aids virus replication.12–16 On top of this, the multifunctional NCP has also been shown to participate in many host cellular processes, including gene transcription, signal transduction, apoptosis regulation, interferon inhibition, cytoskeleton reorganization, and cell cycle regulation.13,14,17,20,50,51 Given its importance, a detailed survey of PTMs may help to extend our understanding of the structural details and biological functions of SARS-CoV-2 NCP.

CoV NCP is widely known as a phosphoprotein.7,43,44 Multiple phosphorylation sites have been mapped to the SR-rich region in SARS-CoV NCP, and these have been shown to play roles in the polymerization and translocation of NCP and its binding to viral RNA and host heterogeneous nuclear ribonucleoproteins.44,52,53 However, previous LC-MS–based studies have revealed more phosphorylation sites outside of the SR-rich region.47,48 Our data also support that SARS-CoV-2 NCP is heavily decorated by phosphoryl groups across all domains. Compared with previous studies,47,48 our data have identified five novel phosphorylation sites, all located in the C-tail region where phosphorylation has not been discovered before. The difference in phosphorylation profiles between our study and the two previous studies can probably be attributed to differential bias of enrichment techniques, and to the different cellular consequences of viral infection and NCP transfection. Given that there are as many as 80 potential phosphorylation sites in SARS-CoV-2 NCP, and given the technical bias of the peptide-centric LC-MS approach against hydrophilic phosphorylated sequences, more phosphorylation sites are likely to be mapped in future studies with complementary techniques or different expression systems. Future studies are also warranted to explore virological or immunological implications of the complex NCP phosphorylation profile, particularly for those sites outside of the SR-rich region, such as the C-tail region, which has been demonstrated to have high antigenicity on SARS-CoV NCP.29 Other than phosphorylation, additional forms of PTMs such as acetylation, methylation, and succinylation were also found in SARS-CoV-2 NCP. Acylation can reduce the charge state and increase the hydrophobicity of modified sites. Further studies are needed to investigate possible biological roles of these PTMs.

To the best of our knowledge, this is the first time ubiquitination has been identified in a CoV protein. Besides targeting proteins for degradation, ubiquitination can also trigger signaling or protein translocation. The locations of the newly identified ubiquitination sites on SARS-CoV-2 NCP are interesting: site K169 resides immediately before the SR-rich domain, while sites K374 and K388 both sit within the predicted bipartite nuclear localization signal domain (NLS-BP, KKKKADETQALPQRQKKQ), which is responsible for NCP translocation from the cytoplasm to the nucleus. Notably, alignment analysis (Supplementary Figure 3A, https://links.lww.com/IMD/A12) suggests that all three ubiquitination sites are strictly conserved in NCP of SARS-CoV-2 (K169, K374, K388), SARS-CoV (K170, K375, K389) and bat CoV HKU3 (K169, K374, K388). In fact, ubiquitination, rather than ISGylation, can also be detected by immunoblotting of NCP from SARS-CoV (Supplementary Figure 3B, https://links.lww.com/IMD/A12). The impact of these ubiquitination events on the cellular location and functions of NCP remains to be investigated. It will also be interesting to know which E3 ligase is responsible for the ubiquitination of NCP. Interestingly, E3 ligase TRIM25 was recently found to be an anti-viral host determinant,54,55 and an interactor of SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV) NCP.17 Therefore, future studies are needed to test if SARS-CoV-2 NCP can be ubiquitinated by TRIM25.

Glycosylation is typically found on the viral surface or envelope proteins, and it has been shown to be a critical determinant for viral pathogenesis and immunogenicity,56–58 and thus has been considered as a key molecular aspect for antiviral treatment or vaccine development.59,60 For instance, CoV Spike proteins, including those of SARS-CoV and SARS-CoV-2, are heavily masked by glycan camouflage,61–64 with important implications for host attachment, immune responses, as well as virion assembly and budding.58,65–73 Glycosylation can also be found on CoV membrane proteins.74,75 However, so far there has been no report about glycosylation of non-surface proteins such as CoV NCP. In our study, glycosylation of SARS-CoV-2 NCP was evidenced indirectly by PNGase F treatment in H2O18, which leaves an isotopic mark at the glycosylation locus. Alignment analysis of CoV sequences (Supplementary Figure 4, https://links.lww.com/IMD/A12) revealed that only N68 in NCP of the highly pathogenic MERS-CoV shares homology with site N77 in SARS-CoV-2 NCP. In contrast, potential glycosylation sites that are homologous to N269 of SARS-CoV-2 NCP can be found across multiple CoV species: SARS-CoV (N270), H-CoV-229E (N260), Bat-CoV-HKU3 (N269). To further confirm NCP glycosylation, the glycan component at these sites was determined directly by LC-MSMS as well. Typically, due to incomplete glycan maturation, viral protein glycans are mainly composed of short oligomannose branches, as in the case of site N269 in the C terminal domain.76 Nonetheless, site N77 in the NTD was almost completely occupied by complex glycans. These heterogeneous N-glycan patterns greatly extend conformational flexibility and epitope diversity of NCP and may profoundly contribute to the unique biological and clinical characteristics of SARS-CoV-2 as compared to other CoVs.

The ongoing global efforts to generate vaccines or antibodies for SARS-CoV-2 primarily focus on protein S as the target antigen.37,38,77 Compared to surface protein S, which mutates at a higher rate, NCP is more conserved across CoV species.24 Therefore, NCP is a more attractive target for the development of universal CoV vaccines. This idea is further prompted by recent studies showing that NCP is capable of inducing long-term humoral immune responses in SARS-CoV-2 patients.40–42 As a viral core protein that is highly expressed in infected cells, NCP is likely to induce immune responses involving cytotoxic T lymphocytes,78,79 providing complementary protection in addition to neutralizing antibodies. This is underscored by the recent discovery that SARS-CoV-2 patients developed stronger T cell responses against NCP than against protein S.40 Furthermore, current anti-CoV drug developments primarily target replicase proteins; therefore, development of “cocktail” antiviral therapies can be achieved by additionally targeting NCP.36 The comprehensive PTM profile, along with the refined structural models of SARS-CoV-2 NCP proposed in this study, can provide guidance to produce new vaccines or therapeutics to join the current vaccination and treatment paradigm in our arduous battle against the COVID-19 pandemic.

Materials and methods

Expression and purification of SARS-CoV-2 NCP

HEK293T (human embryonic kidney, ATCC® CRL-3216™) cells were grown in Dulbecco's modified Eagle medium (Gibco BRL, Grand Island, NY, USA) supplemented with 10% fetal bovine serum (Gibco BRL). The full-length SARS-CoV-2 NCP gene (GenBank number QHD43419.1) was cloned into the pcDNA3.1 vector with a C-terminal 2× Strep tag. HEK293T cells or Vero cells were transfected with the plasmid using Lipofectamine2000 Transfection Reagent (Invitrogen, CA, USA, 11668019). After 40–60 hours, the cells were lysed in 1 mL cold immunoprecipitation (IP) buffer (50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM EDTA) supplemented with 0.5% Nonidet P 40 Substitute (NP-40, Solarbio) and 1× complete EDTA-free protease inhibitor cocktail (Roche, Switzerland, 11873580001) for 30 minutes at 4°C. The lysates were then centrifuged for 30 minutes at 15000×g. The remaining lysate was incubated with 30 μL of Strep-Tactin Sepharose beads (IBA Lifesciences, Germany) in 0.6 mL IP buffer for 2 hours with constant rotation at 4°C. The beads were washed three times with 1 mL of IP buffer containing 0.05% NP-40, followed by one final wash in detergent-free IP buffer in a fresh detergent-free tube. Proteins were eluted by agitating beads in 40 μL IP buffer supplemented with 2.5 mM D-Desthiobiotin (IBA Lifesciences) on a vortex mixer at room temperature for 30 minutes.

Sample preparation for PTM survey

NCP was digested into peptides according to a modified filter aided sample preparation protocol.80 Briefly, NCP was first reduced by 5 mM dithiothreitol in 0.5% RapiGest SF and 50 mM ammonium bicarbonate (ABC) in a 10 kDa molecular weight cut-off spin column (Millipore, Germany) at 56°C for 15 minutes, and then alkylated by 15 mM iodoacetamide for 45 minutes in the dark. The sample was centrifuged at 12000×g for 10 minutes and washed twice with 50 mM ABC. Protein was resuspended in 50 mM ABC with 0.1% RapiGest SF and digested overnight at 37°C by sequencing-grade trypsin, endoproteinase Lys-C, and chymotrypsin (all Promega, WI, USA) with a protein:protease ratio of 1:30. All peptides were desalted using C18 stagetips and dried by SpeedVac before direct LC-MSMS analyses or glycosylation characterization. The overall workflow for the comprehensive PTM survey can be found in Figure 1.

Sample preparation for protein glycosylation characterization

Peptides derived from nucleocapsid protein by multiple protease digestion were used to enrich intact N-glycopeptides on an in-house zwitterionic hydrophilic interaction liquid chromatography (ZIC-HILIC) stage-tip. Briefly, a stage-tip packed with Exsil Pure ZIK HILIC 5 μm beads (Dr. Maisch GmbH, Germany, 6136918) was pre-conditioned with 0.1% trifluoroacetic acid (TFA) in 80% acetonitrile (ACN). Peptides in 0.1% TFA in 80% ACN were loaded onto the beads and the unbound flow-through fraction was collected. The column was washed twice with 0.1% TFA in 80% ACN. Glycopeptides were eluted first by 0.1% TFA and then by 50 mM ABC. In addition to direct analysis by LC-MSMS, glycopeptides were also dried by SpeedVac and deglycosylated by PNGase F (NEB, 1:100) overnight at 37°C in 50 mM ABC in pure H2O18.

LC-MSMS experiments

Peptides (500 ng) were separated by a C18 nano-column using an Ultimate 3000 nanoflow liquid chromatography system (Thermo Scientific, MA, USA) and analyzed by a Q-Exactive HFX mass spectrometer (Thermo Scientific) operated in data-dependent mode. The MS spectra were recorded by Xcalibur software 2.3 (Thermo Scientific). All operation parameters for the nanoLC and MS systems are detailed in the Supplementary Methods (https://links.lww.com/IMD/A12), based on previously published methods.64

Bioinformatics

All MS raw data files of direct peptide analyses were searched by MaxQuant (version 1.6.10.43) against the Uniprot SARS-CoV-2 sequences. A mass tolerance of 4.5 ppm was set for the search. Trypsin/chymotrypsin/LysC with up to two missed cleavages was set. Carbamidomethylation on Cys was set as a fixed modification, while oxidation on Met and O18 deamidation on Asn were set as variable modifications. Identifications were further screened for possible N-glycosylation sequons (N-X-S/T, X≠P). To explore additional PTM forms, phosphorylation (S/T), acetylation (K), methylation (K/R/E), succinylation (K), crotonylation (K), di-Gly (K), farnesylation (K/Nterm), myristoylation (K/Nterm), palmitoylation or prenylation, and oxidation on proline were separately specified in individual searches. A peptide-level FDR of 1% was set to filter the results. Confident identification of a PTM was based on a localization probability of 95%.
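The sequon screen (N-X-S/T, X≠P) can be expressed as a short pattern match. The following is a minimal sketch, not the authors' pipeline: the function name and the `start_position` offsets are illustrative, with the offsets inferred from the site numbering in Tables 1 and 2.

```python
import re

# N followed by any residue except P, then S or T. The lookahead lets
# overlapping candidates (e.g. "NTNSS") each be tested independently.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(peptide: str, start_position: int = 1) -> list:
    """Return protein-level positions of Asn residues inside N-X-S/T sequons.

    start_position is the protein position of the peptide's first residue.
    """
    return [start_position + m.start() for m in SEQUON.finditer(peptide)]


# Peptides from Table 1; start positions (69 and 267) are inferred from the
# reported site numbers, not stated explicitly in the article.
print(find_sequons("GQGVPINTNSSPDDQIGYYR", start_position=69))
print(find_sequons("AYNVTQAFGR", start_position=267))
```

In the first peptide the asparagine two residues upstream of N77 is skipped because it is followed by T and then N, which does not complete a sequon.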

To reveal N-glycosylation forms, MS raw files from ZIC-HILIC fractionated peptides were analyzed by pGlyco (version 2.2.2)81 using the following parameters: carbamidomethylation on Cys was set as a fixed modification, while oxidation on Met was set as a variable modification; up to two missed trypsin/chymotrypsin/LysC cleavages; and mass errors of 5 ppm and 15 ppm for MS and MS2, respectively. The MS2 spectra were annotated by the built-in pGlyco.gdb glycan structure database81 to identify glycan fragments.
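For readers less familiar with ppm tolerances, the sketch below shows how a relative mass error is typically computed and compared against limits such as the 5 ppm (MS) and 15 ppm (MS2) values above. The function names and the m/z values are hypothetical and chosen only for illustration; search engines apply their own internal logic.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical example: a theoretical m/z of 1000.000 observed at 1000.010
# is 10 ppm off, so it fails a 5 ppm window but passes a 15 ppm window.
theoretical = 1000.000
observed = 1000.010
print(round(ppm_error(observed, theoretical), 2))
print(within_tolerance(observed, theoretical, 5.0))
print(within_tolerance(observed, theoretical, 15.0))
```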

Immunogenic epitopes prediction

Identified PTM positions were aligned with possible immunogenic epitopes on SARS-CoV-2 NCP predicted by bioinformatic tools. Briefly, predictions of B cell epitopes were performed by Bepipred (v2.0) available via IEDB database (http://tools.iedb.org/main/bcell).82 Sequences with average R score higher than 0.5 were considered as potential linear B cell epitopes. In addition, T cell epitopes likely presented by MHC-I were predicted by NetMHCpan (v4.1b).83 Sequences with top 2% ranking score were considered as potential T cell epitopes. Redundant or nested sequences were discarded.
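The alignment of PTM positions with predicted epitopes amounts to an interval lookup. The sketch below is a minimal illustration using the site positions from Table 1 and the epitope ranges from Table 2; the variable and function names are hypothetical, and it does not reproduce the Bepipred or NetMHCpan predictions themselves.

```python
# PTM site positions reported in Table 1.
PTM_SITES = [2, 26, 77, 136, 169, 176, 180, 184, 265, 269, 342,
             374, 375, 379, 387, 388, 391, 393, 410, 416]

# Predicted epitope ranges (start, end), inclusive, from Table 2.
B_CELL_EPITOPES = [(4, 15), (17, 48), (59, 105), (119, 127), (137, 163),
                   (165, 216), (226, 267), (276, 299), (343, 348),
                   (358, 402), (404, 416)]
MHC_I_EPITOPES = [(112, 121), (158, 167), (165, 173), (220, 230), (226, 234),
                  (305, 313), (316, 324), (330, 339), (338, 346), (351, 359),
                  (397, 407)]


def align_ptms(epitopes, sites):
    """Map each epitope range to the PTM sites that fall inside it."""
    hits = {}
    for start, end in epitopes:
        inside = [s for s in sites if start <= s <= end]
        if inside:
            hits[(start, end)] = inside
    return hits


b_cell_hits = align_ptms(B_CELL_EPITOPES, PTM_SITES)
mhc_hits = align_ptms(MHC_I_EPITOPES, PTM_SITES)
for epitope, sites in b_cell_hits.items():
    print("B cell", epitope, sites)
for epitope, sites in mhc_hits.items():
    print("MHC-I", epitope, sites)
```

With these inputs, six of the 11 B cell epitopes and two of the 11 MHC-I epitopes contain at least one PTM site, in line with the counts given in the Results.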

Immunoblotting analysis

To confirm protein ubiquitination or ISGylation status, SARS-CoV-2 NCP expressed in 293T cells was immunoprecipitated by Strep-Tactin Sepharose beads (IBA Lifesciences, 2-1502-001) and separated by 4%–15% gradient sodium dodecyl sulfate-polyacrylamide gel electrophoresis before being transferred to PVDF membranes. Immunoblotting was performed with Strep-Tactin-HRP conjugate antibody (IBA Lifesciences, 2-1502-001, 1:1000), anti-Ubiquitin (CST 3933, 1:1000), anti-ISG15 (CST 2758, 1:1000), and anti-GAPDH (CST 5174, 1:1000) antibodies and visualized with a chemiluminescence detection kit.

Acknowledgments

The authors thank the proteomics and metabolomics platform in the State Key Laboratory for Diagnosis and Treatment of Infectious Diseases at Zhejiang University for mass spectrometry analyses.

References

[1]. Wu F, Zhao S, Yu B, et al. A new coronavirus associated with human respiratory disease in China. Nature 2020;579(7798):265–269.
[2]. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis 2020;20(5):533–534.
[3]. Zhou P, Yang XL, Wang XG, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 2020;579(7798):270–273.
[4]. Guan WJ, Ni ZY, Hu Y, et al. Clinical characteristics of coronavirus disease 2019 in China. N Engl J Med 2020;382(18):1708–1720.
[5]. Zhou F, Yu T, Du R, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet 2020;395(10229):1054–1062.
[6]. Wang D, Hu B, Hu C, et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. JAMA 2020;323(11):1061–1069.
[7]. Satarker S, Nampoothiri M. Structural proteins in severe acute respiratory syndrome coronavirus-2. Arch Med Res 2020;51(6):482–491.
[8]. Wan Y, Shang J, Graham R, Baric RS, Li F. Receptor recognition by the novel coronavirus from Wuhan: an analysis based on decade-long structural studies of SARS coronavirus. J Virol 2020;94(7):e00127-20.
[9]. Hoffmann M, Kleine-Weber H, Schroeder S, et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell 2020;181(2):271–280.e8.
[10]. Walls AC, Park YJ, Tortorici MA, Wall A, McGuire AT, Veesler D. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell 2020;181(2):281–292.e6.
[11]. Wrapp D, Wang N, Corbett KS, et al. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science 2020;367(6483):1260–1263.
[12]. Hurst KR, Ye R, Goebel SJ, Jayaraman P, Masters PS. An interaction between the nucleocapsid protein and a component of the replicase-transcriptase complex is crucial for the infectivity of coronavirus genomic RNA. J Virol 2010;84(19):10276–10288.
[13]. Chang CK, Hou MH, Chang CF, Hsiao CD, Huang TH. The SARS coronavirus nucleocapsid protein--forms and functions. Antiviral Res 2014;103:39–50.
[14]. McBride R, van Zyl M, Fielding BC. The coronavirus nucleocapsid is a multifunctional protein. Viruses 2014;6(8):2991–3018.
[15]. Dinesh DC, Chalupska D, Silhan J, et al. Structural basis of RNA recognition by the SARS-CoV-2 nucleocapsid phosphoprotein. PLoS Pathog 2020;16(12):e1009100.
[16]. Savastano A, Ibanez de Opakua A, Rankovic M, Zweckstetter M. Nucleocapsid protein of SARS-CoV-2 phase separates into RNA-rich polymerase-containing condensates. Nat Commun 2020;11(1):6041.
[17]. Hu Y, Li W, Gao T, et al. The severe acute respiratory syndrome coronavirus nucleocapsid inhibits type I interferon production by interfering with TRIM25-mediated RIG-I ubiquitination. J Virol 2017;91(8):e02143-16.
[18]. Li JY, Liao CH, Wang Q, et al. The ORF6, ORF8 and nucleocapsid proteins of SARS-CoV-2 inhibit type I interferon signaling pathway. Virus Res 2020;286:198074.
[19]. Chen K, Xiao F, Hu D, et al. SARS-CoV-2 nucleocapsid protein interacts with RIG-I and represses RIG-mediated IFN-beta production. Viruses 2020;13(1):47.
[20]. Tsai TL, Lin CH, Lin CN, Lo CY, Wu HY. Interplay between the poly(A) tail, poly(A)-binding protein, and coronavirus nucleocapsid protein regulates gene expression of coronavirus and the host cell. J Virol 2018;92(23):e01162-18.
[21]. Che XY, Qiu LW, Pan YX, et al. Sensitive and specific monoclonal antibody-based capture enzyme immunoassay for detection of nucleocapsid antigen in sera from patients with severe acute respiratory syndrome. J Clin Microbiol 2004;42(6):2629–2635.
[22]. Liu L, Liu W, Zheng Y, et al. A preliminary study on serological assay for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in 238 admitted hospital patients. Microbes Infect 2020;22(4–5):206–211.
[23]. Guo L, Ren L, Yang S, et al. Profiling early humoral response to diagnose novel coronavirus disease (COVID-19). Clin Infect Dis 2020;71(15):778–785.
[24]. Basu BV, Brown OR. Comparative analysis of Coronaviridae nucleocapsid and surface glycoprotein sequences. Front Biosci 2020;25:1894–1900.
[25]. McAndrews KM, Dowlatshahi DP, Dai J, et al. Heterogeneous antibodies against SARS-CoV-2 spike receptor binding domain and nucleocapsid with implications on COVID-19 immunity. JCI Insight 2020;5(18):e142386.
[26]. Li T, Wang L, Wang H, et al. Serum SARS-CoV-2 nucleocapsid protein: a sensitivity and specificity early diagnostic marker for SARS-CoV-2 infection. Front Cell Infect Microbiol 2020;10:470.
[27]. Chia WN, Tan CW, Foo R, et al. Serological differentiation between COVID-19 and SARS infections. Emerg Microbes Infect 2020;9(1):1497–1505.
[28]. Liu D, Wu F, Cen Y, et al. Comparative research on nucleocapsid and spike glycoprotein as the rapid immunodetection targets of COVID-19 and establishment of immunoassay strips. Mol Immunol 2021;131:6–12.
[29]. Liu G, Hu S, Hu Y, et al. The C-terminal portion of the nucleocapsid protein demonstrates SARS-CoV antigenicity. Genomics Proteomics Bioinformatics 2003;1(3):193–197.
[30]. Dutta NK, Mazumdar K, Gordy JT. The nucleocapsid protein of SARS-CoV-2: a target for vaccine development. J Virol 2020;94(13):e00647-20.
[31]. Kumar A, Kumar P, Saumya KU, Kapuganti SK, Bhardwaj T, Giri R. Exploring the SARS-CoV-2 structural proteins for multi-epitope vaccine development: an in-silico approach. Expert Rev Vaccines 2020;19(9):887–898.
[32]. Ong E, Wong MU, Huffman A, He Y. COVID-19 coronavirus vaccine design using reverse vaccinology and machine learning. Front Immunol 2020;11:1581.
[33]. Kwarteng A, Asiedu E, Sakyi SA, Asiedu SO. Targeting the SARS-CoV2 nucleocapsid protein for potential therapeutics using immuno-informatics and structure-based drug discovery techniques. Biomed Pharmacother 2020;132:110914.
[34]. Ray M, Sarkar S, Rath SN. Druggability for COVID-19: in silico discovery of potential drug compounds against nucleocapsid (N) protein of SARS-CoV-2. Genomics Inform 2020;18(4):e43.
[35]. Tatar G, Ozyurt E, Turhan K. Computational drug repurposing study of the RNA binding domain of SARS-CoV-2 nucleocapsid protein with antiviral agents. Biotechnol Prog 2021;37(2):e3110.
[36]. Lang Y, Chen K, Li Z, Li H. The nucleocapsid protein of zoonotic betacoronaviruses is an attractive target for antiviral drug discovery. Life Sci 2020;282:118754.
[37]. McCarthy KR, Rennick LJ, Nambulli S, et al. Recurrent deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape. Science 2021;371(6534):1139–1142.
[38]. Starr TN, Greaney AJ, Addetia A, et al. Prospective mapping of viral mutations that escape antibodies used to treat COVID-19. Science 2021;371(6531):850–854.
[39]. Dos Santos WG. Impact of virus genetic variability and host immunity for the success of COVID-19 vaccines. Biomed Pharmacother 2021;136:111272.
[40]. Reynolds CJ, Swadling L, Gibbons JM, et al. Discordant neutralizing antibody and T cell responses in asymptomatic and mild SARS-CoV-2 infection. Sci Immunol 2020;5(54):eabf3698.
[41]. Hartley GE, Edwards ESJ, Aui PM, et al. Rapid generation of durable B cell memory to SARS-CoV-2 spike and nucleocapsid proteins in COVID-19 and convalescence. Sci Immunol 2020;5(54):eabf8891.
[42]. Dan JM, Mateus J, Kato Y, et al. Immunological memory to SARS-CoV-2 assessed for up to 8 months after infection. Science 2021;371(6529):eabf4063.
[43]. Wang J, Ji J, Ye J, et al. The structure analysis and antigenicity study of the N protein of SARS-CoV. Genomics Proteomics Bioinformatics 2003;1(2):145–154.
[44]. Peng TY, Lee KR, Tarn WY. Phosphorylation of the arginine/serine dipeptide-rich motif of the severe acute respiratory syndrome coronavirus nucleocapsid protein modulates its multimerization, translation inhibitory activity and cellular localization. FEBS J 2008;275(16):4152–4163.
[45]. Carlson CR, Asfaha JB, Ghent CM, et al. Phosphoregulation of phase separation by the SARS-CoV-2 N protein suggests a biophysical basis for its dual functions. Mol Cell 2020;80(6):1092–1103.e4.
[46]. Carlson CR, Asfaha JB, Ghent CM, et al. Phosphoregulation of phase separation by the SARS-CoV-2 N protein suggests a biophysical basis for its dual functions. Mol Cell 2020;80(6):1092–1103.e4.
[47]. Bouhaddou M, Memon D, Meyer B, et al. The global phosphorylation landscape of SARS-CoV-2 infection. Cell 2020;182(3):685–712.e19.
[48]. Davidson AD, Williamson MK, Lewis S, et al. Characterisation of the transcriptome and proteome of SARS-CoV-2 reveals a cell passage induced in-frame deletion of the furin-like cleavage site from the spike glycoprotein. Genome Med 2020;12(1):68.
[49]. Wang C, Xu H, Lin S, et al. GPS 5.0: an update on the prediction of kinase-specific phosphorylation sites in proteins. Genomics Proteomics Bioinformatics 2020;18(1):72–80.
[50]. Kopecky-Bromberg SA, Martinez-Sobrido L, Frieman M, Baric RA, Palese P. Severe acute respiratory syndrome coronavirus open reading frame (ORF) 3b, ORF 6, and nucleocapsid proteins function as interferon antagonists. J Virol 2007;81(2):548–557.
[51]. Surjit M, Liu B, Chow VT, Lal SK. The nucleocapsid protein of severe acute respiratory syndrome-coronavirus inhibits the activity of cyclin-cyclin-dependent kinase complex and blocks S phase progression in mammalian cells. J Biol Chem 2006;281(16):10669–10681.
[52]. Surjit M, Kumar R, Mishra RN, Reddy MK, Chow VT, Lal SK. The severe acute respiratory syndrome coronavirus nucleocapsid protein is phosphorylated and localizes in the cytoplasm by 14-3-3-mediated translocation. J Virol 2005;79(17):11476–11486.
[53]. Luo H, Chen Q, Chen J, Chen K, Shen X, Jiang H. The nucleocapsid protein of SARS coronavirus has a high binding affinity to the human cellular heterogeneous nuclear ribonucleoprotein A1. FEBS Lett 2005;579(12):2623–2628.
[54]. Choudhury NR, Heikel G, Michlewski G. TRIM25 and its emerging RNA-binding roles in antiviral defense. Wiley Interdiscip Rev RNA 2020;11(4):e1588.
[55]. El-Asmi F, McManus FP, Brantis-de-Carvalho CE, Valle-Casuso JC, Thibault P, Chelbi-Alix MK. Cross-talk between SUMOylation and ISGylation in response to interferon. Cytokine 2020;129:155025.
[56]. Yang TJ, Chang YC, Ko TP, et al. Cryo-EM analysis of a feline coronavirus spike protein reveals a unique structure and camouflaging glycans. Proc Natl Acad Sci U S A 2020;117(3):1438–1446.
[57]. Vigerust DJ, Shepherd VL. Virus glycosylation: role in virulence and immune interactions. Trends Microbiol 2007;15(5):211–218.
[58]. Raman R, Tharakaraman K, Sasisekharan V, Sasisekharan R. Glycan-protein interactions in viral pathogenesis. Curr Opin Struct Biol 2016;40:153–162.
[59]. Chen WH, Du L, Chag SM, et al. Yeast-expressed recombinant protein of the receptor-binding domain in SARS-CoV spike protein with deglycosylated forms as a SARS vaccine candidate. Hum Vaccin Immunother 2014;10(3):648–658.
[60]. Kumar S, Maurya VK, Prasad AK, Bhatt MLB, Saxena SK. Structural, glycosylation and antigenic variation between 2019 novel coronavirus (2019-nCoV) and SARS coronavirus (SARS-CoV). Virusdisease 2020;31(1):13–21.
[61]. Shajahan A, Supekar NT, Gleinich AS, Azadi P. Deducing the N- and O-glycosylation profile of the spike protein of novel coronavirus SARS-CoV-2. Glycobiology 2020;30(12):981–988.
[62]. Walls AC, Tortorici MA, Frenz B, et al. Glycan shield and epitope masking of a coronavirus spike protein observed by cryo-electron microscopy. Nat Struct Mol Biol 2016;23(10):899–905.
[63]. Watanabe Y, Allen JD, Wrapp D, McLellan JS, Crispin M. Site-specific glycan analysis of the SARS-CoV-2 spike. Science 2020;369(6501):330–333.
[64]. Sun Z, Ren K, Zhang X, et al. Mass spectrometry analysis of newly emerging coronavirus HCoV-19 spike protein and human ACE2 reveals camouflaging glycans and unique post-translational modifications. Engineering (Beijing) 2020;doi: 10.1016/j.eng.2020.07.014. [Published ahead of print August 30].
[65]. Fukushi M, Yoshinaka Y, Matsuoka Y, et al. Monitoring of S protein maturation in the endoplasmic reticulum by calnexin is important for the infectivity of severe acute respiratory syndrome coronavirus. J Virol 2012;86(21):11745–11753.
[66]. de Groot RJ. Structure, function and evolution of the hemagglutinin-esterase proteins of corona- and toroviruses. Glycoconj J 2006;23(1–2):59–72.
[67]. Chang D, Zaia J. Why glycosylation matters in building a better flu vaccine. Mol Cell Proteomics 2019;18(12):2348–2358.
[68]. Li W, Hulswit RJG, Widjaja I, et al. Identification of sialic acid-binding function for the Middle East respiratory syndrome coronavirus spike glycoprotein. Proc Natl Acad Sci U S A 2017;114(40):E8508–E8517.
[69]. Parsons LM, Bouwman KM, Azurmendi H, de Vries RP, Cipollo JF, Verheije MH. Glycosylation of the viral attachment protein of avian coronavirus is essential for host cell and receptor binding. J Biol Chem 2019;294(19):7797–7809.
[70]. Shih YP, Chen CY, Liu SJ, et al. Identifying epitopes responsible for neutralizing antibody and DC-SIGN binding on the spike glycoprotein of the severe acute respiratory syndrome coronavirus. J Virol 2006;80(21):10315–10324.
[71]. York IA, Stevens J, Alymova IV. Influenza virus N-linked glycosylation and innate immunity. Biosci Rep 2019;39(1):BSR20171505.
[72]. Zheng J, Yamada Y, Fung TS, Huang M, Chia R, Liu DX. Identification of N-linked glycosylation sites in the spike protein and their functional impact on the replication and infectivity of coronavirus infectious bronchitis virus in cell culture. Virology 2018;513:65–74.
[73]. Zhou Y, Lu K, Pfefferle S, et al. A single asparagine-linked glycosylation site of the severe acute respiratory syndrome coronavirus spike glycoprotein facilitates inhibition by mannose-binding lectin through multiple mechanisms. J Virol 2010;84(17):8753–8764.
[74]. Liang JQ, Fang S, Yuan Q, et al. N-Linked glycosylation of the membrane protein ectodomain regulates infectious bronchitis virus-induced ER stress response, apoptosis and pathogenesis. Virology 2019;531:48–56.
[75]. Ma HC, Fang CP, Hsieh YC, Chen SC, Li HC, Lo SY. Expression and membrane integration of SARS-CoV M protein. J Biomed Sci 2008;15(3):301–310.
[76]. Watanabe Y, Bowden TA, Wilson IA, Crispin M. Exploitation of glycosylation in enveloped virus pathobiology. Biochim Biophys Acta Gen Subj 2019;1863(10):1480–1497.
[77]. Rappazzo CG, Tse LV, Kaku CI, et al. Broad and potent activity against SARS-like viruses by an engineered human monoclonal antibody. Science 2021;371(6531):823–829.
[78]. Zhao P, Cao J, Zhao LJ, et al. Immune responses against SARS-coronavirus nucleocapsid protein induced by DNA vaccine. Virology 2005;331(1):128–135.
[79]. Zhu MS, Pan Y, Chen HQ, et al. Induction of SARS-nucleoprotein-specific immune response by use of DNA vaccine. Immunol Lett 2004;92(3):237–243.
[80]. Sun Z, Liu X, Jiang J, et al. Toward biomarker development in large clinical cohorts: an integrated high-throughput 96-well-plate-based sample preparation workflow for versatile downstream proteomic analyses. Anal Chem 2016;88(17):8518–8525.
[81]. Liu MQ, Zeng WF, Fang P, et al. pGlyco 2.0 enables precision N-glycoproteomics with comprehensive quality control and one-step mass spectrometry for intact glycopeptide identification. Nat Commun 2017;8(1):438.
[82]. Jespersen MC, Peters B, Nielsen M, Marcatili P. BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res 2017;45(W1):W24–W29.
[83]. Reynisson B, Alvarez B, Paul S, Peters B, Nielsen M. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res 2020;48(W1):W449–W454.
Keywords: glycosylation; mass spectrometry; nucleocapsid protein; post-translational modification; SARS-CoV-2

Copyright © 2021 the Author(s). Published by Wolters Kluwer Health, Inc.