Ovarian cancer is a severe disease that threatens the health and life of many women; the incidence and mortality rates of ovarian cancer are ranked fifth and fourth, respectively, among malignancies in women in the United States.1 The 5-year survival rate after diagnosis of ovarian cancer is approximately 30%, but 85% or more of those who survive more than 5 years were diagnosed with stage I ovarian cancer.2 To increase the survival rate and quality of life of women with ovarian cancer, it is necessary to improve the rate at which early-stage ovarian cancer is diagnosed.
In the past decades, serum biomarkers such as CA 125, HE4, CA72-4, CA15-3, glycodelin, MMP7, SLP1, Plau-R, and Muc-1 have been studied in the diagnosis of ovarian cancer. CA 125, which is approved in the United States for monitoring the response to therapy and for detecting recurrences, is the most extensively studied predictive marker for ovarian cancer.3–5 Unfortunately, CA 125 is elevated in only approximately 50% to 60% of patients with early-stage ovarian cancer; furthermore, it has a low specificity,6 and its positive predictive value is less than 10% as a single marker (the addition of ultrasound screening to measurements of CA 125 improves its positive predictive value to approximately 20%).7 HE4 has been shown to be effective for ovarian cancer detection8,9 and received approval by the Food and Drug Administration as a recurrence monitoring marker. Limited information suggests that rising HE4 could detect a recurrence earlier than CA 125.9,10 Because of the limited sensitivities and specificities constraining the use of CA 125, HE4, and other biomarkers, new technologies for the detection of early-stage ovarian cancer are needed.
In 2002, Petricoin et al5 reported a surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS)–generated proteomic model that had 100% sensitivity and 95% specificity for the detection of ovarian cancer. SELDI-TOF MS is designed for reliable quantitative precision over a wide range of masses. The ProteinChip Reader (Ciphergen Biosystems, Fremont, Calif) is especially adapted to achieve high-sensitivity quantification and good reproducibility.
In this study, we analyzed serum samples from women with ovarian cancer or benign ovarian tumors and from healthy controls using SELDI-TOF MS and the ProteinChip. SELDI-ProteinChip technology is an effective approach for building diagnostic proteomic models and for discovering new biomarkers in serum samples. It is suitable for large-scale, high-throughput clinical research and applications. Because uncalibrated relative molecular weights can be inaccurate, we applied both internal and external calibration to correct them. We also standardized the processing of serum samples to improve repeatability.
Our objective was to use this technology, combined with the artificial neural network (ANN) method of biomedical informatics, to establish an ovarian cancer proteomic model by identifying differential protein peaks in serum samples from women with ovarian cancer, women with benign ovarian tumors, and healthy controls.
MATERIALS AND METHODS
The research protocol was approved by the ethics committee of the Peking University Third Hospital. Informed consent was obtained from each of the patients and control subjects.
The study was divided into 3 sets. The first set (data set 1) included 25 patients with primary ovarian cancer and 20 healthy women and was used to develop a proteomic model that discriminated cancer from healthy controls. A blind test set, including 23 new cases (12 patients with primary ovarian cancer and 11 healthy women), was used to validate the sensitivity and specificity of this multivariate model. The second set (data set 2) included 29 patients with ovarian cancer and 41 noncancer controls (22 patients with benign tumors and 19 healthy women) and was used to develop a proteomic model that discriminated cancer from noncancer controls. A blind test set, including 34 new cases (13 patients with primary ovarian cancer, 11 women with benign tumors, and 10 healthy women), was used to validate the sensitivity and specificity of the model. The third set (data set 3) included 26 patients with ovarian cancer and 23 benign controls (patients with benign tumors) and was used to develop a proteomic model that discriminated cancer from benign controls. A blind test set, including 25 new cases (13 patients with primary ovarian cancer and 12 women with benign tumors), was used to validate the sensitivity and specificity of the model.
Samples from 118 women with epithelial ovarian cancer admitted to the Peking University Third Hospital from January 2003 to December 2009 were included in the study. The demographics and clinical characteristics of the study population are indicated in Table 1. There were 20 International Federation of Gynecology and Obstetrics (FIGO) stage I cases, 35 FIGO stage II cases, 46 FIGO stage III cases, and 17 FIGO stage IV cases. The patients with ovarian cancer had different histologic types: serous papillary carcinoma (n = 87), endometrioid carcinoma (n = 4), mucinous carcinoma (n = 11), clear cell carcinoma (n = 13), and mixed cystadenocarcinoma (n = 3). Sixty-eight women with benign ovarian tumors were recruited for the benign ovarian tumor control group. The patients with benign ovarian tumors had different histologic types: serous cystadenoma (n = 25), mucinous cystadenoma (n = 12), mixed cystadenoma (n = 2), and simple ovarian cyst (n = 29). Sixty healthy, age-matched individuals who had undergone medical examinations in our hospital were recruited for the healthy control group. Diagnoses of ovarian cancer were made by pathologists after surgery.
Five milliliters of fasting venous blood was drawn into a sterile empty tube in the morning and left standing in the tube for approximately 40 minutes. After the blood had clotted, it was centrifuged at 1500g for 10 minutes at 4°C. The serum was then quickly removed, aliquoted, and frozen at −80°C.
Instruments and Chips
We used the Protein Biology System PBS-IIc SELDI-TOF MS system (Ciphergen Biosystems), which includes the ProteinChip System (H4), TOF MS (PBSII series), and analysis software. The supercentrifuge, acetonitrile, 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES, pH 7.4), ProteinChip α-cyano-4-hydroxycinnamic acid energy-absorbing molecules (MW 189.2), and 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS, c3023) are products of Ciphergen.
Serum samples were thawed in an ice bath and centrifuged at 1500g for 5 minutes at 4°C. Ten microliters of supernatant was added to 90 μL of 0.5% CHAPS and centrifuged at 3500g for 5 minutes. Fifty microliters of the supernatant was added to 20 mM HEPES to a final volume of 200 μL. This sample was then added to the H4 ProteinChip, which was vortexed in the bioprocessor for 60 minutes. The supernatant was discarded, and the ProteinChip was washed 3 times with 200 μL of 20 mM HEPES for 5 minutes each time. The ProteinChip was then washed twice with deionized water for 1 minute each time. After the ProteinChip, to which the proteins from the serum sample had bound, had air dried, α-cyano-4-hydroxycinnamic acid energy-absorbing molecules (0.5 μL) were added. α-Cyano-4-hydroxycinnamic acid was added again after drying had been confirmed.
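The dilution arithmetic implied by this preparation can be checked with a short calculation. This is only a sketch: the variable names are illustrative, and it assumes ideal mixing, no volume lost during centrifugation, and that the full 200 μL is applied to the chip, none of which is stated explicitly in the protocol.

```python
from fractions import Fraction

# Step 1: 10 uL of serum supernatant added to 90 uL of 0.5% CHAPS.
step1 = Fraction(10, 10 + 90)      # serum fraction after step 1 (1/10)

# Step 2: 50 uL of that mixture brought to a final volume of 200 uL with HEPES.
step2 = Fraction(50, 200)          # fraction carried into the final volume (1/4)

overall = step1 * step2            # overall serum fraction in the applied sample
serum_on_chip_ul = 200 * overall   # serum equivalent in 200 uL (assumed fully applied)

print(f"overall dilution 1:{overall.denominator}")
print(f"serum equivalent applied: {float(serum_on_chip_ul)} uL")
```

Under these assumptions the serum is diluted 1:40 overall, which corresponds to 5 μL of serum in the 200 μL sample.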
For the parameter settings, the highest molecular weight was set to 30,000 d, the priming range to 2000 to 20,000 d, the laser intensity to 157, and the detection sensitivity to 5. Each sample was activated 65 times. The error range for molecular weight was controlled to less than 0.1%. Intrachip and interchip quality controls were set. The coefficient of variation was controlled to less than 0.05% for molecular weight and to less than 25% for peak amplitude.
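The coefficient-of-variation limits above can be expressed as a simple quality-control check. The following is a minimal sketch, not the procedure used in the study: the replicate values are hypothetical, the function name `cv_percent` is illustrative, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical replicate readings of one quality-control peak (not study data).
qc_mass_d = [5766.1, 5766.4, 5766.9, 5765.8]
qc_intensity = [12.1, 10.4, 13.0, 11.2]

MASS_CV_LIMIT = 0.05        # percent, limit stated for molecular weight
INTENSITY_CV_LIMIT = 25.0   # percent, limit stated for peak amplitude

mass_ok = cv_percent(qc_mass_d) < MASS_CV_LIMIT
intensity_ok = cv_percent(qc_intensity) < INTENSITY_CV_LIMIT
print(mass_ok, intensity_ok)
```

A chip or run whose quality-control peak failed either check would be a candidate for repetition under this kind of rule.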
The H4 ProteinChip with bound protein was read by the protein reader. A laser was used to stimulate the protein on the chip surface. Ten laser spots were then collected; each spot was stably activated 5 to 10 times. The TOF MS peaks formed a 2-dimensional mass spectrum according to their masses and amplitudes. The y-axis indicates the abundance (ie, intensity), and the x-axis indicates the mass-to-charge ratio (m/z; ie, molecular weight).
An ANN was used to analyze data obtained by MS analysis11 as follows. (1) Raw data were filtered with tools provided by Ciphergen ProteinChip Software 3.0. m/z peaks of 2000 to 30,000 d were filtered.12 The U test was performed on means from the ovarian cancer and control groups to compare the degree of significant differences in each of the m/z peaks between the groups. (2) The ANN was trained with the initially screened m/z peaks to validate whether the m/z peaks contained the information for diagnostic prediction of diseases.13 (3) Candidate biomarkers were then selected from the initially screened m/z peaks: the variables (ie, the m/z peaks) were ranked from lowest to highest P value, and the ANN was trained on a growing set of variables, starting from the first-ranked variable. Once the ANN was stable, the minimum set of variables needed for the predictive model was retained.14
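The ranking step in (1) and (3) can be illustrated with a small sketch. It is not the software used in the study: the intensities are hypothetical, the function names are illustrative, and the U test is implemented here with a normal approximation and no tie correction, which is an assumption (the text does not specify the variant used). The ANN training itself is not shown.

```python
import math

def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U test P value (normal approximation, no tie correction)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def rank_peaks(cancer, control):
    """Return peak labels ordered from lowest to highest P value."""
    pvals = {mz: mann_whitney_p(cancer[mz], control[mz]) for mz in cancer}
    return sorted(pvals, key=pvals.get)

# Hypothetical peak intensities (illustrative only, not study data).
cancer = {
    "peak_A": [9.1, 8.7, 9.5, 8.9, 9.3],
    "peak_B": [4.0, 5.2, 4.6, 5.0, 4.4],
    "peak_C": [2.1, 2.0, 2.3, 1.9, 2.2],
}
control = {
    "peak_A": [3.2, 3.0, 3.5, 2.9, 3.3],
    "peak_B": [4.1, 4.9, 4.5, 5.1, 4.3],
    "peak_C": [1.5, 1.8, 2.05, 1.6, 1.7],
}

ranked = rank_peaks(cancer, control)
# Nested variable sets that would be passed to the classifier in turn.
subsets = [ranked[:k] for k in range(1, len(ranked) + 1)]
print(ranked)
```

In a workflow of this kind, each entry of `subsets` would be used to train the network in turn, and the smallest subset at which performance stops changing would be kept.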
Comparisons between groups of the initially screened m/z peaks were performed with Student t test. Data were analyzed with the SPSS for Windows statistical package version 13.0 (SPSS, Chicago, Ill). P < 0.05 was considered to be statistically significant.
RESULTS
Comparisons Between Ovarian Cancer and Healthy Controls
We initially identified 208 m/z peaks as candidate ovarian cancer markers. Further screening of the 208 peaks identified 7 m/z peaks (2269.474 d, 4109.436 d, 4492.997 d, 5796.426 d, 5766.379 d, 5912.586 d, and 11,695.560 d) with P values of less than 0.01. These 7 peaks comprised the proteomic model (model 1). The 7 m/z peaks represented markers that were highly expressed in the serum samples from women with ovarian cancer. The sensitivity, specificity, and positive rate of the model were optimized. We established a stable model by using the 7 m/z peaks; there were significant differences in these peaks in the ovarian cancer group compared with the healthy control group. The model was validated by a blind test. The training groups used in the ANN included 25 ovarian cancer cases and 20 healthy controls; the blind test group included the remaining 12 ovarian cancer cases and 11 healthy controls (Table 2).
For the ovarian cancer group, the sensitivity was 100%. In the control group, 1 case was misclassified (an outpatient who could not be traced for follow-up and in whom ultrasonography indicated no obvious abnormalities in the ovaries); the specificity was 90.91% (Table 3).
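These figures follow directly from the blind test counts. A minimal sketch of the calculation is shown below; the function names are illustrative, and the true-positive and true-negative counts are inferred from the reported 100% sensitivity and the single misclassified control among the 12 cancer cases and 11 healthy controls.

```python
def sensitivity(tp, fn):
    """Proportion of cancer cases correctly classified as cancer."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of controls correctly classified as noncancer."""
    return tn / (tn + fp)

# Model 1 blind test: 12 cancer cases, 11 healthy controls, 1 control misclassified.
sens = sensitivity(tp=12, fn=0)
spec = specificity(tn=10, fp=1)

print(f"sensitivity {sens:.2%}, specificity {spec:.2%}")
# prints "sensitivity 100.00%, specificity 90.91%"
```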
Comparison Between Ovarian Cancer and Noncancer Control Groups
We established a second proteomic model (model 2) comprising 17 m/z peaks (5765.297 d, 11,696.760 d, 5911.936 d, 11,539.001 d, 6442.042 d, 4474.677 d, 2197.418 d, 6488.885 d, 3899.376 d, 6111.820 d, 5808.813 d, 14,623.310 d, 2122.078 d, 4255.193 d, 5845.059 d, 4851.276 d, and 4466.399 d) by the same means used in model 1. There were significant differences in every peak between the ovarian cancer and control groups (P < 0.01). This model was also validated by a blind test. The groups used in the ANN training included serum samples from 29 patients with ovarian cancer and 41 noncancer controls. The blind test groups included 11 patients with ovarian cancer and 21 noncancer controls. The sensitivity of the second model was 90.90%, the specificity was 95.00%, and the positive rate was 90.90% (Table 4).
Three peaks, at 5766.379 d, 5912.586 d, and 11,695.560 d, appeared in both models 1 and 2. Differences in the values of these 3 peaks between the cancer and noncancer groups were analyzed by the U test; the P values were 0.000060, 0.000169, and 0.000123, respectively, which demonstrates that these 3 candidate biomarkers were highly and specifically expressed in the serum of patients with ovarian cancer.
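The peak lists of models 1 and 2 do not contain identical m/z values, so matching them requires a mass tolerance. The sketch below uses a relative tolerance of 0.1%, borrowed from the stated error range for molecular weight; applying that figure as the matching criterion is an assumption, as is the function name `shared_peaks`.

```python
# m/z values (in d) reported for models 1 and 2.
MODEL_1 = [2269.474, 4109.436, 4492.997, 5796.426, 5766.379, 5912.586, 11695.560]
MODEL_2 = [5765.297, 11696.760, 5911.936, 11539.001, 6442.042, 4474.677,
           2197.418, 6488.885, 3899.376, 6111.820, 5808.813, 14623.310,
           2122.078, 4255.193, 5845.059, 4851.276, 4466.399]

def shared_peaks(peaks_a, peaks_b, rel_tol=0.001):
    """Pair peaks whose masses differ by no more than rel_tol (0.001 = 0.1%)."""
    return [(a, b) for a in peaks_a for b in peaks_b
            if abs(a - b) / a <= rel_tol]

shared = shared_peaks(MODEL_1, MODEL_2)
for a, b in shared:
    print(f"{a:.3f} d ~ {b:.3f} d")
```

With this tolerance, exactly 3 pairs are found: 5766.379/5765.297 d, 5912.586/5911.936 d, and 11,695.560/11,696.760 d, which agrees with the 3 shared peaks reported above.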
Comparison Between Ovarian Cancer and Benign Ovarian Tumor Groups
We established a third diagnostic model (model 3) with 184 candidate biomarkers using the same methods that were used to generate models 1 and 2. Model 3 was also validated by a blind test. The training groups used for the ANN included 26 ovarian cancer cases and 20 benign controls, and the blind test group included 13 ovarian cancer cases and 12 benign controls. The sensitivity of the model was 100%, and the specificity was 83.33% (Table 5).
The same 3 peaks seen in models 1 and 2, at 5766.379 d, 5912.586 d, and 11,695.560 d, also appeared in model 3. Differences in the values of these 3 peaks between the cancer and benign control groups were analyzed by the U test; the P values were 0.00239, 0.000988, and 0.000552, respectively, which demonstrates that these 3 candidate biomarkers were highly and specifically expressed in the serum of patients with ovarian cancer.
Comparison of Peaks Corresponding to 3 Potential Biomarkers Among Healthy Control, Benign Ovarian Tumor Control, and Ovarian Cancer Groups
The corresponding peaks of the 3 biomarkers increased with the degree of malignancy. There were significant differences among the groups (P < 0.01; Table 6).
Measurement of Serum CA 125
We also measured serum CA 125 levels. The positive rate for CA 125 elevation (>35 U/mL) in the serum of all patients with ovarian cancer was 72.9% (86/118), whereas the positive rate for CA 125 elevation in patients with stage I or II ovarian cancer was 58.2% (32/55).
DISCUSSION
Neoplasms are complex diseases that result from interactions between genetic and environmental factors. Ovarian neoplasms comprise benign, premalignant, and malignant types. The type and stage of ovarian cancer have important implications for the screening, diagnosis, treatment, and follow-up. Highly sensitive and specific neoplasm biomarkers are valuable because they can ensure early detection and, therefore, a better prognosis. Surface-enhanced laser desorption/ionization mass spectrometry has been used extensively to screen for neoplasm biomarkers and in studies of antineoplastic drug resistance because of its rapidity, convenience, small required samples, high-throughput ability, repeatability, and high sensitivity (markers can be detected at the femtomolar level).15,16 We used ANN to analyze the data produced by MS. By establishing proteomic models, ANN can increase the specificity and reliability of prediction.17
Compared with our measurements of serum CA 125 levels, the proteomic model for ovarian cancer generated by SELDI-TOF MS and ANN in this study had superior sensitivity, specificity, and positive rates. The proteomic model was also able to discriminate accurately between early- and later-stage ovarian cancer. In a previous investigation using proteomic models to detect ovarian cancer, subjects were divided into ovarian cancer and healthy control groups, but the healthy control group included patients with ovarian cysts and endometriosis as well as those with no ovarian abnormalities. In our study, we made 3 comparisons: ovarian cancer group versus healthy control, ovarian cancer group versus noncancer control, and ovarian cancer group versus benign ovarian tumor control. Thus, by avoiding the negative effects of benign ovarian tumors seen in the previous study, we obtained more specific biomarkers for ovarian cancer. Of the 7 m/z peaks in the proteomic model generated by our study, the 5766.379 d, 5912.586 d, and 11,695.560 d biomarkers were highly expressed only in the serum of patients with ovarian cancer, indicating that they are highly specific to ovarian cancer.
To determine their relevance, it is important to identify the biomarkers represented by the 3 peaks. In 2003, Ye et al18 identified an 11.7 kd biomarker as a haptoglobin-α subunit, a potential marker for ovarian cancer that is complementary to CA 125. In 2005, Moshkovskii et al19 detected an 11.7 kd potential cancer biomarker in thermostable plasma fractions using ProteinChip SELDI-TOF MS. This peak invariably appeared with another close peak of approximately 11.5 kd. The 2 peaks were identified by MS as serum amyloid A1 (11.68 kd) and its N-terminal arginine-truncated form (11.5 kd). In another study in 2007, Moshkovskii et al20 showed that acute-phase serum amyloid A is an important component of discriminatory cancer protein profiles. Among the 8 known ovarian cancer SELDI profile components, acute-phase serum amyloid A is the most relevant to the molecular pathogenesis of cancer, and it has the highest degree of up-regulation in disease.20 A 2008 study by Helleman et al21 also identified an 11.7 kd protein as serum amyloid A1, a potential biomarker for ovarian cancer.
We think that the 11.7 kd biomarker identified in our study is serum amyloid A1 because this protein has also been identified as a tumor biomarker in studies of several other types of malignancies. For example, Yokoi et al22 demonstrated that an 11.7 kd protein, which was identified as serum amyloid A, is a tumor biomarker in the serum of nude mice with orthotopic human pancreatic cancer and in the plasma of patients with pancreatic cancer. This protein was also identified as a tumor biomarker in conventional renal cell carcinoma23 and lung cancer.24
There are many differences among laboratories performing SELDI-TOF MS in the types of chips, bioinformatic analytical methods, and experimental materials used. Thus, a set of standard operating procedures for SELDI-TOF MS is of critical importance. In addition, further studies should evaluate how patients are chosen, what standards are used, and the role of laboratory error.25
In summary, the purpose of discovering tumor biomarkers is to improve the early diagnosis of ovarian cancer. The positive rate for the prediction of ovarian cancer stages I and II by the proteomic model identified by SELDI-TOF MS and ANN was 100%; this suggests that serum protein MS may be highly useful for the early diagnosis of ovarian cancer. In our study, the peak at 11,695.560 d was a potential biomarker for the early diagnosis of ovarian cancer; the other 2 potential biomarkers, at 5766.379 d and 5912.586 d, need to be identified and further investigated. Furthermore, larger samples will be required to confirm the diagnostic proteomic models identified in our study. The identification of highly sensitive and specific proteomic models for screening ovarian cancer biomarkers, in combination with ANN, is an effective method for the early diagnosis of ovarian cancer. As proteomic and ANN technologies continue to improve, they will become increasingly useful for exploring the mechanisms of cancer pathogenesis and for the early diagnosis of neoplasms, such as ovarian cancer.
REFERENCES
1. Jemal A, Tiwari RC, Murray T, et al. Cancer statistics, 2004. CA Cancer J Clin. 2004; 54: 8–29.
2. Young RC, Walton LA, Ellenberg SS, et al. Adjuvant therapy in stage I and stage II epithelial ovarian cancer. N Engl J Med. 1990; 322: 1021–1027.
3. Bast RC, Klug TL, St John E, et al. A radioimmunoassay using a monoclonal antibody to monitor the course of epithelial ovarian cancer. N Engl J Med. 1983; 309: 883–887.
4. Jacobs IJ, Skates SJ, MacDonald N, et al. Screening for ovarian cancer: a pilot randomised controlled trial. Lancet. 1999; 353: 1207–1210.
5. Petricoin EF, Ardekani AM, Hitt BA, et al. Use of proteomic patterns in serum to identify ovarian cancer. Lancet. 2002; 359: 572–577.
6. Sasaroli D, Coukos G, Scholler N. Beyond CA125: the coming of age of ovarian cancer biomarkers. Are we there yet? Biomark Med. 2009; 3: 275–288.
7. Cohen LS, Escobar PF, Scharm C, et al. Three dimensional power Doppler ultrasound improves the diagnostic accuracy for ovarian cancer prediction. Gynecol Oncol. 2001; 82: 40–48.
8. Hellström I, Raycraft J, Hayden-Ledbetter M, et al. The HE4 (WFDC2) protein is a biomarker for ovarian carcinoma. Cancer Res. 2003; 63: 3695–3700.
9. Havrilesky LJ, Whitehead CM, Rubatt JM, et al. Evaluation of biomarker panels for early stage ovarian cancer detection and monitoring for disease recurrence. Gynecol Oncol. 2008; 110: 374–382.
10. Anastasi E, Marchei GG, Viggiani V, et al. HE4: a new potential early biomarker for the recurrence of ovarian cancer. Tumour Biol. 2010; 31: 113–119.
11. Izawa N, Kishimoto M, Konishi M, et al. Recognition of culture state using two-dimensional gel electrophoresis with an artificial neural network. Proteomics. 2006; 6: 3730–3738.
12. Li J, Zhang Z, Rosenzweig J, et al. Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin Chem. 2002; 48: 1296–1304.
13. Khan J, Wei JS, Ringnér M, et al. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med. 2001; 7: 673–679.
14. Ball G, Mian S, Holding F, et al. An integrated approach utilizing artificial neural networks and SELDI mass spectrometry for the classification of human tumours and rapid identification of potential biomarkers. Bioinformatics. 2002; 18: 395–404.
15. Azad NS, Rasool N, Annunziata CM, et al. Proteomics in clinical trials and practice: present uses and future promise. Mol Cell Proteomics. 2006; 5: 1819–1829.
16. Poon TC. Opportunities and limitations of SELDI-TOF-MS in biomedical research: practical advices. Expert Rev Proteomics. 2007; 4: 51–65.
17. Wang L, Zheng W, Yu JK, et al. Artificial neural networks combined with surface-enhanced laser desorption/ionization mass spectra distinguish endometriosis from healthy population. Fertil Steril. 2007; 88: 1700–1702.
18. Ye B, Cramer DW, Skates SJ, et al. Haptoglobin-alpha subunit as potential serum biomarker in ovarian cancer: identification and characterization using proteomic profiling and mass spectrometry. Clin Cancer Res. 2003; 9: 2904–2911.
19. Moshkovskii SA, Serebryakova MV, Kuteykin-Teplyakov KB, et al. Ovarian cancer marker of 11.7 kd detected by proteomics is a serum amyloid A1. Proteomics. 2005; 5: 3790–3797.
20. Moshkovskii SA, Vlasova MA, Pyatnitskiy MA, et al. Acute phase serum amyloid A in ovarian cancer as an important component of proteome diagnostic profiling. Proteomics Clin Appl. 2007; 1: 107–117.
21. Helleman J, van der Vlies D, Jansen MP, et al. Serum proteomic models for ovarian cancer monitoring. Int J Gynecol Cancer. 2008; 18: 985–995.
22. Yokoi K, Shih LC, Kobayashi R, et al. Serum amyloid A as a tumor marker in sera of nude mice with orthotopic human pancreatic cancer and in plasma of patients with pancreatic cancer. Int J Oncol. 2005; 27: 1361–1369.
23. Paret C, Schön Z, Szponar A, et al. Inflammatory protein serum amyloid A1 marks a subset of conventional renal cell carcinomas with fatal outcome. Eur Urol. 2010; 57: 859–866.
24. Sreseli RT, Binder H, Kuhn M, et al. Identification of a 17-protein signature in the serum of lung cancer patients. Oncol Rep. 2010; 24: 263–270.
25. Dijkstra M, Vonk RJ, Jansen RC. SELDI-TOF mass spectra: a view on sources of variation. J Chromatogr B Analyt Technol Biomed Life Sci. 2007; 847: 12–23.