A major component of the adaptive immune response to infection is the generation of protective and long-lasting humoral immunity. Analyses of antibody responses against different infectious agents are critical for diagnosing infectious diseases, understanding pathogenic mechanisms, and developing and evaluating vaccines. Protein microarrays are well suited to identify, quantify, and compare individual antigenic responses following exposure to infectious agents. They can now be used to evaluate antibody responses to hundreds, or even thousands, of recombinant antigens at one time. These large-scale studies have uncovered new antigenic targets, provided new insights into vaccine research, and yielded an overview of immunoreactivity against almost the entire proteome of certain pathogens. This technology can be applied to the development of improved serodiagnostic tests, discovery of subunit vaccine antigen candidates, epidemiologic research, and vaccine development, and it provides novel insights into infectious disease and the immune system. In this review, we discuss the use of protein microarrays as a powerful tool to define the humoral immune response to bacteria and viruses.
Factors governing selection of the particular antigens recognized are unclear [1,2]. It is not uncommon for viruses encoding a small number of proteins to elicit antibodies against each encoded protein. For infectious agents containing hundreds or thousands of proteins, however, only a subset of the proteome is recognized, and little is known about the extent or the characteristics of this subset of antigens. Methods for making a complete empirical accounting of the immunoproteome have limitations, particularly when the genome of the organism is large. The Protein Microarray Laboratory at the University of California, Irvine has developed a highly efficient method to determine the humoral immune response to microbial antigens. We have applied this approach to more than 30 medically important infectious microorganisms [3–33], including Mycobacterium tuberculosis, Plasmodium falciparum, Plasmodium vivax, Brucella melitensis, Chlamydia trachomatis, Francisella tularensis, Burkholderia pseudomallei, Coxiella burnetii, Borrelia burgdorferi, Salmonella enterica Typhi, Rickettsia prowazekii, Rickettsia rickettsii, Orientia tsutsugamushi, Bartonella henselae, Leptospira interrogans, Toxoplasma gondii, Candida albicans, Schistosoma mansoni, and viruses including vaccinia, herpes simplex viruses 1 and 2, varicella zoster virus, Epstein–Barr virus, human papillomaviruses, HIV, dengue, influenza, West Nile virus, yellow fever, Saint Louis encephalitis, Japanese encephalitis, and chikungunya viruses. Since launching this project 10 years ago, we have made more than 40 000 plasmids, printed the encoded proteins on 25 000 microarrays, and probed the arrays with 15 000 serum specimens to determine disease-associated antibody profiles in people infected with each agent. These chips can be probed with sera from infected patients to determine the immunodominant antigens for each agent, and the methodology is amenable to the screening of sera from very large cohorts numbering in the thousands.
When seroreactive and serodiagnostic antigen subsets from different infectious agents are printed onto the same array, the chip can discriminate between patients infected with different agents and also identify individuals with coinfections or multiple infections. We have shown that the individual proteins printed on these arrays capture antibodies present in serum from infected individuals and that the amount of captured antibody can be quantified using a fluorescent secondary antibody. In this way, a comprehensive profile of the antibodies that arise after infection or exposure can be determined, and this profile is characteristic of the type of infection and the stage of disease [9,10,31].
Here we summarize the approximate numbers of seroreactive and serodiagnostic antigens that were identified and published for 30 different organisms, and discuss the prediction of antibody responses from the classification of reactive antigens based on functional and physical properties.
PROTEIN MICROARRAY PRODUCTION, PROBING, AND ANALYSIS
Genes were amplified and cloned using a high-throughput PCR and recombination method. Open reading frames from genomic DNA or cDNA were identified and amplified using gene-specific primers carrying an approximately 20 bp extension complementary to the ends of the linearized pXT7 vector, which allows homologous recombination between the PCR product and the pXT7 vector in competent Escherichia coli cells. The resulting fusion proteins also harbored a hemagglutinin epitope at the 3′ end and polyhistidine at the 5′ end. Plasmids were expressed at 24 °C for 16 h in an E. coli in-vitro transcription/translation system (Expressway kits from Invitrogen). For no-DNA controls, the same amount of in-vitro transcription/translation reagent was used without plasmid DNA to test E. coli background reactivity. For microarrays, 10 μl of reaction was mixed with 3.3 μl of 0.2% Tween 20 to give a final concentration of 0.05% Tween 20 and printed onto nitrocellulose-coated glass FAST slides (Whatman) using an Omni Grid 100 microarray printer (Genomic Solutions). Serum samples were diluted in E. coli lysate (Mclab). Slides were incubated in biotin-conjugated secondary antibody (Jackson ImmunoResearch) and detected by incubation with streptavidin-conjugated SureLight P-3 (Columbia Biosciences). Microarray slides were scanned and analyzed using a Perkin Elmer ScanArray Express HT or GenePix microarray scanner. Spot intensities were quantified, and all signal intensities were corrected for spot-specific background. All foreground values were transformed and normalized using a robust linear model or nonlinear variance stabilizing normalization to remove systematic effects [24,34,35] (Fig. 1).
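As an illustration of the quantification steps described above, the following Python sketch subtracts spot-specific background, applies an inverse hyperbolic sine transform, and centers each array on its no-DNA control spots. It is a simplified stand-in and not the robust linear model or variance stabilizing normalization cited above; the function name, the input layout, and the median centering are assumptions made for this example.

```python
import math
from statistics import median


def normalize_array(foreground, background, no_dna_indices):
    """Background-correct, transform, and center one array (illustrative only).

    foreground, background: raw spot and local background intensities.
    no_dna_indices: positions of the no-DNA control spots (hypothetical layout).
    """
    # Spot-specific background correction.
    corrected = [f - b for f, b in zip(foreground, background)]
    # asinh behaves like a logarithm for large values but accepts
    # zero and negative values left over after background subtraction.
    transformed = [math.asinh(c) for c in corrected]
    # Center on the E. coli background measured by the no-DNA control spots.
    baseline = median(transformed[i] for i in no_dna_indices)
    return [t - baseline for t in transformed]


if __name__ == "__main__":
    signal = normalize_array(
        foreground=[1200.0, 300.0, 5000.0, 310.0],
        background=[200.0, 200.0, 200.0, 200.0],
        no_dna_indices=[1, 3],
    )
    print([round(s, 2) for s in signal])
```

After centering, values near zero indicate reactivity comparable to the E. coli background, and positive values indicate antigen-specific signal.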
PERCENTAGE OF SEROREACTIVE ANTIGENS
Discovery of novel antigens associated with infectious diseases is fundamental to the development of serodiagnostic tests and protein subunit vaccines against existing and emerging pathogens. Through more than 10 years of effort, we have identified over 1000 antigens associated with infections or vaccinations in 30 different organisms (Table 1) [3–6,9–17,23,25▪▪,31–33,36▪,37–39,40▪▪,41,42,43▪,44,45▪,46▪,47▪▪], accounting for around 2–5% of bacterial genomes, 20–57% of viral genomes, and 10–45% of parasite genomes. Antigens that are differentially reactive between infected individuals and healthy controls comprise an even smaller percentage of the genome: 0.3–3% for bacteria, 16–40% for viruses, and 2–18% for parasites. Borrelia burgdorferi is an exception among bacteria: natural infection elicits antibody responses against approximately 15% of its polypeptides, of which half are differentially reactive between naturally infected and uninfected individuals.
Antigens were classified as ‘seroreactive’ when their mean reactivity was more than 2–3 standard deviations above the mean of the negative controls in most organisms; differentially reactive antigens were classified by a Benjamini-Hochberg adjusted P value smaller than 0.05 when comparing the negative group with infected or vaccinated individuals.
Full proteome microarrays were constructed for only a limited number of bacterial species. Other data were published using partial arrays containing only part of the proteome, and these may overestimate the percentages of seroreactive and serodiagnostic antigens in the full proteome because the subset of proteins on the array was selected based on antigenic features seen previously.
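The two criteria above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is used because the text does not say which estimator was applied, and the statistical test that produces the raw P values is left out.

```python
from statistics import mean, stdev


def is_seroreactive(infected, negatives, k=2.0):
    """True if the mean infected signal exceeds mean + k*SD of the negatives.

    k is the number of standard deviations (2 to 3 in the text).
    """
    cutoff = mean(negatives) + k * stdev(negatives)
    return mean(infected) > cutoff


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy values, not data from the studies discussed here.
    print(is_seroreactive([2.0, 2.5, 1.8], [1.0, 1.2, 0.8, 1.0]))
    raw = [0.01, 0.04, 0.03, 0.005]
    adj = benjamini_hochberg(raw)
    print([p < 0.05 for p in adj])
```

An antigen would then be called differentially reactive when its adjusted P value falls below 0.05.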
ENRICHMENT ANALYSIS REVEALS PHYSICAL PROPERTIES AND CELLULAR FUNCTIONS ASSOCIATED WITH IMMUNOGENICITY
Another application for these empirical data is to train an algorithm to predict reactive antigens in silico, and several studies from our group have applied enrichment analyses to identify proteomic features that are seen more frequently in the seroreactive and serodiagnostic antigen sets [12,17,23].
Efforts to predict antigenicity have relied on a few computational algorithms that predict signal peptide sequences (SignalP), transmembrane domains (TMHMM), or subcellular localization (PSORT). The current database from this protein microarray approach contains quantitative antibody reactivity data against 40 000 proteins derived from 30 infectious microorganisms and more than 30 million data points derived from 15 000 patient sera. Interrogation of these data sets has revealed more than 10 proteomic features that are associated with antigenicity, allowing an in-silico approach based on protein sequence and functional annotation to separate the proteins least likely to be antigenic from those that are more likely to be antigenic.
These proteomic enrichment features (Table 2) are: functionally annotated Clusters of Orthologous Groups of proteins (U, M, N, and O) or gene ontology function and process; computationally predicted features (TMHMM, signal peptide, PSORT outer membrane, PSORT periplasmic, and isoelectric point (pI) less than 5 for bacteria, and pI 7–9 for parasites); and abundance of expression. Applied to B. melitensis, this approach selects 37% of the bacterial proteome, which contains 91% of the antigens empirically identified by probing proteome microarrays.
Proteins of the parasite Toxoplasma gondii were classified by gene ontology function. Proteins involved in protein binding, catalytic activity, transporter activity, and transferase activity were significantly enriched. Proteins with enzymatic activity other than kinase activity were enriched 2.0-fold, and proteins with enzyme regulator activity, structural molecule activity, and ion channel activity were also highly enriched. Proteins with no assigned gene ontology function, or involved in nucleotide and nucleic acid binding, were underrepresented.
Proteins were also classified by gene ontology process. Proteins involved in the ATP biosynthetic process were enriched. Several classes of proteins involved in transport were also significantly enriched, including ion transport, protein transport, vesicle-mediated transport, and other transport functions. Proteins involved in metabolic process, proteolysis, and signal peptide processing were also enriched. Conversely, proteins not assigned to any gene ontology process category were significantly underrepresented (0.5-fold; P value 3.301 × 10⁻²¹).
An examination of the Plasmodium falciparum (Pf) proteins on the microarray by gene ontology analysis revealed that approximately 40% of the immunogenic proteins are expressed in the membrane of the parasite or host erythrocyte and that they are overrepresented in the biological process categories of ‘pathogenesis,’ ‘cytoadherence to microvasculature,’ ‘antigenic variation,’ and ‘rosetting’.
The data set of vaccinia viral proteins also allowed us to identify properties of viral proteins that were associated with immunogenicity. We found that membrane and core proteins, proteins with late or early/late temporal expression, and proteins with transmembrane domains were overrepresented in the immunoreactive antigen set relative to the whole proteome. These predictors are strongest in modified vaccinia virus Ankara (MVA) profiles, as the antibody profile to MVA is more heavily skewed toward structural proteins. In contrast, early proteins were underrepresented relative to the whole proteome, and there was negligible influence of molecular weight, pI, or the presence of a signal sequence on immunogenicity. Vaccinia antigens are either abundant components of mature virion (MV) particles, such as A10 and L4, or are expressed at high levels in infected cells, such as I1 and WR148 [49,50]. Their abundance may contribute to immunogenicity once released from infected cells, particularly if, like D13, such proteins have a propensity for self-assembly into macromolecular structures.
The herpes simplex virus-1 antibody profile was also analyzed using gene ontology component classifiers from the database at www.uniprot.org. The percentage of genes assigned to each gene ontology component was determined both for the proteome and for the seroreactive antigens, and the ratio of the two percentages was taken as the fold enrichment. The analysis revealed 12 proteins on the array that were assigned the gene ontology component virion membrane, of which nine were seroreactive. Tegument proteins were not enriched in the seroreactive antigen set.
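The fold-enrichment calculation described above can be sketched in Python. The counts of 12 virion membrane proteins and nine seroreactive ones come from the text; the totals of 80 proteins on the array and 30 seroreactive antigens are hypothetical placeholders chosen only to make the example run. The hypergeometric tail probability is one common way to attach a P value to such an enrichment and is an assumption here, since the text does not state which test was used.

```python
from math import comb


def fold_enrichment(cat_hits, total_hits, cat_all, total_all):
    """Ratio of the category's share among hits to its share in the proteome."""
    return (cat_hits / total_hits) / (cat_all / total_all)


def hypergeom_upper_tail(cat_hits, total_hits, cat_all, total_all):
    """P(X >= cat_hits) when total_hits proteins are drawn at random."""
    upper = min(cat_all, total_hits)
    favorable = sum(
        comb(cat_all, i) * comb(total_all - cat_all, total_hits - i)
        for i in range(cat_hits, upper + 1)
    )
    return favorable / comb(total_all, total_hits)


if __name__ == "__main__":
    # 12 virion membrane proteins, 9 of them seroreactive (from the text);
    # 80 arrayed proteins and 30 seroreactive antigens are hypothetical.
    print(fold_enrichment(9, 30, 12, 80))        # about 2.0
    print(hypergeom_upper_tail(9, 30, 12, 80))   # small tail probability
```

A fold enrichment above 1 means the category is overrepresented among seroreactive antigens; a value below 1 means it is underrepresented.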
Overall, our data show that the antibody profile is not a random assortment of specificities but is strongly biased toward the recognition of certain proteomic features. Why we do not observe antibodies to all intracellular proteins expressed from infected cells remains unclear. It is also interesting to note that the rules that determine immunogenicity might be different from those that define protection.
NAIVE BAYES CLASSIFICATION
Individual proteomic features provide some information about the likelihood of a protein being seroreactive; however, using all of these features together leads to a better segregation of the hits from the rest of the proteome. To analyze the relationship between all of these features and the seroreactivity of the proteins in a rigorous manner, we used a naive Bayes formulation.
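To make the idea of combining features concrete, the Python sketch below scores proteins with a naive Bayes log-odds over binary features. It is an assumption-laden illustration and not the published model: the feature names, the binary encoding, and the add-one (Laplace) smoothing are all choices made for this example.

```python
import math


def train_naive_bayes(features, labels, alpha=1.0):
    """Estimate per-feature log-likelihood ratios from labeled proteins.

    features: list of dicts mapping feature name -> bool.
    labels: list of bools (True = seroreactive).
    alpha: smoothing pseudo-count (an assumption of this sketch).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    weights = {}
    for name in sorted(features[0]):
        pos_with = sum(1 for f, y in zip(features, labels) if y and f[name])
        neg_with = sum(1 for f, y in zip(features, labels) if not y and f[name])
        p_pos = (pos_with + alpha) / (n_pos + 2 * alpha)
        p_neg = (neg_with + alpha) / (n_neg + 2 * alpha)
        # Weight if the feature is present, weight if it is absent.
        weights[name] = (
            math.log(p_pos / p_neg),
            math.log((1 - p_pos) / (1 - p_neg)),
        )
    prior = math.log((n_pos + alpha) / (n_neg + alpha))
    return prior, weights


def score(protein, prior, weights):
    """Relative log-odds that a protein is seroreactive."""
    return prior + sum(
        present if protein[name] else absent
        for name, (present, absent) in weights.items()
    )


if __name__ == "__main__":
    # Toy training set with hypothetical features.
    X = [
        {"signal_peptide": True, "transmembrane": False},
        {"signal_peptide": True, "transmembrane": True},
        {"signal_peptide": False, "transmembrane": False},
        {"signal_peptide": False, "transmembrane": True},
        {"signal_peptide": False, "transmembrane": False},
        {"signal_peptide": True, "transmembrane": False},
    ]
    y = [True, True, False, False, False, False]
    prior, weights = train_naive_bayes(X, y)
    print(score({"signal_peptide": True, "transmembrane": True}, prior, weights))
```

Because the per-feature terms are simply added, a protein carrying several enriched features receives a higher score than a protein carrying only one.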
We applied a naive Bayes classification approach to assign a relative numerical score to each antigen in the B. melitensis (Bm) proteome. This score reflects the relative likelihood that a protein will be reactive based on its functionally annotated or computationally predicted features. Our analysis indicates that 91% of serodiagnostic antigens are predictable from the top 20% of the genome ranked by this naive Bayes classification approach, and the antigens with enriched features in the top 20% of the genome account for 100% of serodiagnostic antigens with these features. Without this naive Bayes classification approach, we would have to clone the 37% of the genome with enriching features to obtain 91% of serodiagnostic antigens. This analysis greatly enhances the predictive efficiency compared with previous studies, will provide a basis for targeted screens of entire proteomes based on likelihood of seroreactivity, and will help determine trends in the humoral immune response to gram-negative bacteria. The same approach has been applied to S. enterica and revealed that we would need to screen only 25% of the genome to identify 72% of serodiagnostic antigens (Table 3).
The development of protein arrays for profiling the antibody response generated upon exposure to an infectious agent has allowed for new insight into the humoral immune response and the identification of potential subunit vaccine candidates and new diagnostics. No other existing approach can provide such a thorough perspective of the humoral immune response to infection. Moreover, it provides a systematic foundation of information on the proteomic features (functional and physical properties) of seroreactive and serodiagnostic antigens. The information presented here will allow future protein microarray screening to focus efforts on portions of the proteome that most likely contain seroreactive proteins, and may also be useful for understanding the antibody responses to bacteria, viruses, and parasites.
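Statements such as "91% of serodiagnostic antigens fall in the top 20% of the ranked genome" correspond to a simple capture calculation, sketched below in Python. The function name and the toy scores are hypothetical and are not data from the B. melitensis or S. enterica studies.

```python
def fraction_captured(scores, is_antigen, top_fraction):
    """Share of known antigens found in the top-ranked part of the proteome.

    scores: one classifier score per protein (higher = more likely reactive).
    is_antigen: one bool per protein (True = empirically serodiagnostic).
    top_fraction: portion of the proteome to screen, e.g. 0.2 for 20%.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_top = max(1, round(top_fraction * len(scores)))
    captured = sum(1 for i in ranked[:n_top] if is_antigen[i])
    return captured / sum(is_antigen)


if __name__ == "__main__":
    # Ten hypothetical proteins, three of which are known antigens.
    scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.05, 0.4, 0.6, 0.15]
    antigen = [True, False, True, False, False, True, False, False, False, False]
    for frac in (0.2, 0.3):
        print(frac, fraction_captured(scores, antigen, frac))
```

Evaluating this function over a range of top_fraction values gives a capture curve, which shows how much of the proteome must be cloned to recover a chosen share of the antigens.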
Financial support and sponsorship
This work was supported by NIH grants U01AI078213, AI089686 and AI095916.
Conflicts of interest
There are no conflicts of interest.
REFERENCES AND RECOMMENDED READING
Papers of particular interest, published within the annual period of review, have been highlighted as:
- ▪ of special interest
- ▪▪ of outstanding interest
1. Mayers C, Duffield M, Rowe S, et al. Analysis of known bacterial protein vaccine antigens reveals biased physical properties and amino acid composition. Comp Funct Genomics 2003; 4:468–478.
2. Rappuoli R. Reverse vaccinology, a genome-based approach to vaccine development. Vaccine 2001; 19:2688–2691.
3. Molina DM, Pal S, Kayala MA, et al. Identification of immunodominant antigens of Chlamydia trachomatis using proteome microarrays. Vaccine 2010; 28:3014–3024.
4. Driguez P, Doolan DL, Loukas A, et al. Schistosomiasis vaccine discovery using immunomics. Parasit Vectors 2010; 3:4.
5. Crompton PD, Kayala MA, Traore B, et al. A prospective analysis of the Ab response to Plasmodium falciparum before and after a malaria season by protein microarray. Proc Natl Acad Sci U S A 2010; 107:6958–6963.
6. Felgner PL, Kayala MA, Vigil A, et al. A Burkholderia pseudomallei protein microarray reveals serodiagnostic and cross-reactive antigens. Proc Natl Acad Sci U S A 2009; 106:13499–13504.
7. Chen C, Bouman TJ, Beare PA, et al. A systematic approach to evaluate humoral and cellular immune responses to Coxiella burnetii immunoreactive antigens. Clin Microbiol Infect 2009; 15 (Suppl 2):156–157.
8. Doolan DL, Mu Y, Unal B, et al. Profiling humoral immune responses to P. falciparum infection with protein microarrays. Proteomics 2008; 8:4680–4694.
9. Davies DH, Wyatt LS, Newman FK, et al. Antibody profiling by proteome microarray reveals the immunogenicity of the attenuated smallpox vaccine modified vaccinia virus Ankara is comparable to that of Dryvax. J Virol 2008; 82:652–663.
10. Barbour AG, Jasinskas A, Kayala MA, et al. A genome-wide proteome array reveals a limited set of immunogens in natural infections of humans and white-footed mice with Borrelia burgdorferi. Infect Immun 2008; 76:3374–3389.
11. Sundaresh S, Randall A, Unal B, et al. From protein microarrays to diagnostic antigen discovery: a study of the pathogen Francisella tularensis. Bioinformatics 2007; 23:i508–i518.
12. Liang L, Tan X, Juarez S, et al. Systems biology approach predicts antibody signature associated with Brucella melitensis infection in humans. J Proteome Res 2011; 10:4813–4824.
13. Liang L, Doskaya M, Juarez S, et al. Identification of potential serodiagnostic and subunit vaccine antigens by antibody profiling of toxoplasmosis cases in Turkey. Mol Cell Proteomics 2011; 10:M110.006916.
14. Liang L, Leng D, Burk C, et al. Large scale immune profiling of infected humans and goats reveals differential recognition of Brucella melitensis antigens. PLoS Negl Trop Dis 2010; 4:e673.
15. Vigil A, Chen C, Jain A, et al. Profiling the humoral immune response of acute and chronic Q fever by protein microarray. Mol Cell Proteomics 2011; 10:M110.006304.
16. Kalantari-Dehaghi M, Molina DM, Farhadieh M, et al. New targets of pemphigus vulgaris antibodies identified by protein array technology. Exp Dermatol 2011; 20:154–156.
17. Vigil A, Ortega R, Jain A, et al. Identification of the feline humoral immune response to Bartonella henselae infection by protein microarray. PLoS One 2010; 5:e11447.
18. Yang ZR, Lertmemongkolchai G, Tan G, et al. A genetic programming approach for Burkholderia pseudomallei diagnostic pattern discovery. Bioinformatics 2009; 25:2256–2262.
19. Tippayawat P, Saenwongsa W, Mahawantung J, et al. Phenotypic and functional characterization of human memory T cell responses to Burkholderia pseudomallei. PLoS Negl Trop Dis 2009; 3:e407.
20. Cannella AP, Lin JC, Liang L, et al. Serial kinetics of the antibody response against the complete Brucella melitensis ORFeome in focal vertebral brucellosis. J Clin Microbiol 2012; 50:922–926.
21. Tan X, Traore B, Kayentao K, et al. Hemoglobin S and C heterozygosity enhances neither the magnitude nor breadth of antibody responses to a diverse array of Plasmodium falciparum antigens. J Infect Dis 2011; 204:1750–1761.
22. Barry AE, Trieu A, Fowkes FJ, et al. The stability and complexity of antibody responses to the major surface antigen of Plasmodium falciparum are associated with age in a malaria endemic area. Mol Cell Proteomics 2011; 10:M111.008326.
23. Eyles JE, Unal B, Hartley MG, et al. Immunodominant Francisella tularensis antigens identified using proteome microarray. Proteomics 2007; 7:2172–2183.
24. Sundaresh S, Doolan DL, Hirst S, et al. Identification of humoral immune responses in protein microarrays using DNA microarray data analysis techniques. Bioinformatics 2006; 22:1760–1766.
25▪▪. Patton DL, Teng A, Randall A, et al. Whole genome identification of C. trachomatis immunodominant antigens after genital tract infections and effect of antibiotic treatment of pigtailed macaques. J Proteomics 2014; 108:99–109.
This is the first time that Chlamydia trachomatis immunodominant and potential vaccine antigens have been identified in nonhuman primates following infection.
26. Beare PA, Chen C, Bouman T, et al. Candidate antigens for Q fever serodiagnosis revealed by immunoscreening of a Coxiella burnetii protein microarray. Clin Vaccine Immunol 2008; 15:1771–1779.
27. Doskaya M, Kalantari-Dehaghi M, Walsh CM, et al. GRA1 protein vaccine confers better immune response compared to codon-optimized GRA1 DNA vaccine. Vaccine 2007; 25:1824–1837.
28. Mochon AB, Jin Y, Kayala MA, et al. Serological profiling of a Candida albicans protein microarray reveals permanent host-pathogen interplay and stage-specific responses during candidemia. PLoS Pathog 2010; 6:e1000827.
29. Davies DH, Liang X, Hernandez JE, et al. Profiling the humoral immune response to infection by using proteome microarrays: high-throughput vaccine and diagnostic antigen discovery. Proc Natl Acad Sci U S A 2005; 102:547–552.
30. Davies DH, McCausland MM, Valdez C, et al. Vaccinia virus H3L envelope protein is a major target of neutralizing antibodies in humans and elicits protection against lethal challenge in mice. J Virol 2005; 79:11724–11733.
31. Davies DH, Molina DM, Wrammert J, et al. Proteome-wide analysis of the serological response to vaccinia and smallpox. Proteomics 2007; 7:1678–1686.
32. Luevano M, Bernard HU, Barrera-Saldana HA, et al. High-throughput profiling of the humoral immune responses against thirteen human papillomavirus types by proteome microarrays. Virology 2010; 405:31–40.
33. Kunnath-Velayudhan S, Salamon H, Wang HY, et al. Dynamic antibody responses to the Mycobacterium tuberculosis proteome. Proc Natl Acad Sci U S A 2010; 107:14703–14708.
34. Sboner A, Karpikov A, Chen G, et al. Robust-linear-model normalization to reduce technical variability in functional protein microarrays. J Proteome Res 2009; 8:5451–5464.
35. Huber W, von Heydebreck A, Sultmann H, et al. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 2002; 18 (Suppl 1):S96–S104.
36▪. Gerns Storey HL, Richardson BA, Singa B, et al. Use of principal components analysis and protein microarray to explore the association of HIV-1-specific IgG responses with disease progression. AIDS Res Hum Retroviruses 2014; 30:37–44.
A collection of HIV-specific antibody responses that together were associated with reduced disease progression, by PCA and microarray analyses.
37. Kalantari-Dehaghi M, Chun S, Chentoufi AA, et al. Discovery of potential diagnostic and vaccine antigens in herpes simplex virus 1 and 2 by proteome-wide antibody profiling. J Virol 2012; 86:4328–4339.
38. Vigil A, Ortega R, Nakajima-Sasaki R, et al. Genome-wide profiling of humoral immune response to Coxiella burnetii infection by protein microarray. Proteomics 2010; 10:2259–2269.
39. Lessa-Aquino C, Borges Rodrigues C, Pablo J, et al. Identification of seroreactive proteins of Leptospira interrogans serovar copenhageni using a high-density protein microarray approach. PLoS Negl Trop Dis 2013; 7:e2499.
40▪▪. Lessa-Aquino C, Wunder EA Jr, Lindow JC, et al. Proteomic features predict seroreactivity against leptospiral antigens in leptospirosis patients. J Proteome Res 2015; 14:549–556.
This is the first full proteome Leptospira interrogans microarray that was constructed for identifying serodiagnostic antigens, and it also provides an empirical basis for predicting antigenicity from Gram-negative bacteria.
41. Liang L, Juarez S, Nga TV, et al. Immune profiling with a Salmonella Typhi antigen microarray identifies new diagnostic biomarkers of human typhoid. Sci Rep 2013; 3:1043.
42. Baum E, Badu K, Molina DM, et al. Protein microarray analysis of antibody responses to Plasmodium falciparum in western Kenyan highland sites with differing transmission levels. PLoS One 2013; 8:e82246.
43▪. Campo JJ, Aponte JJ, Skinner J, et al. RTS,S vaccination is associated with serologic evidence of decreased exposure to Plasmodium falciparum liver- and blood-stage parasites. Mol Cell Proteomics 2015; 14:519–531.
These microarray data provide insight into the mechanism by which the RTS,S vaccine protects from malaria.
44. Molina DM, Finney OC, Arevalo-Herrera M, et al. Plasmodium vivax preerythrocytic-stage antigen discovery: exploiting naturally acquired humoral responses. Am J Trop Med Hyg 2012; 87:460–469.
45▪. Baum E, Sattabongkot J, Sirichaisinthop J, et al. Submicroscopic and asymptomatic Plasmodium falciparum and Plasmodium vivax infections are common in western Thailand: molecular and serological evidence. Malar J 2015; 14:95.
These microarray data provide an empirical basis for understanding the prevalence of Plasmodium in western Thailand.
46▪. Gaze S, Driguez P, Pearson MS, et al. An immunomics approach to schistosome antigen discovery: antibody signatures of naturally resistant and chronically infected individuals from endemic areas. PLoS Pathog 2014; 10:e1004033.
Several potentially protective and well tolerated schistosomiasis vaccine antigens were identified.
47▪▪. Tang YT, Gao X, Rosa BA, et al. Genome of the human hookworm Necator americanus. Nat Genet 2014; 46:261–269.
The authors report sequencing and assembly of the N. americanus genome, and provide an invaluable resource to boost ongoing efforts toward fundamental and applied postgenomic research.
48. Chung CS, Chen CH, Ho MY, et al. Vaccinia virus proteome: identification of proteins in vaccinia virus intracellular mature virion particles. J Virol 2006; 80:2127–2140.
49. Patel DD, Pickup DJ, Joklik WK. Isolation of cowpox virus A-type inclusions and characterization of their major protein component. Virology 1986; 149:174–189.
50. Liu X, Kremer M, Broyles SS. A natural vaccinia virus promoter with exceptional capacity to direct protein synthesis. J Virol Methods 2004; 122:141–145.
51. Szajner P, Weisberg AS, Lebowitz J, et al. External scaffold of spherical immature poxvirus particles is made of protein trimers, forming a honeycomb lattice. J Cell Biol 2005; 170:971–981.
52. Witten IH, Frank E. Data mining: practical machine learning tools and techniques. 2nd ed. Amsterdam; San Francisco: Morgan Kaufmann; 2005.