Materials and Methods
This prospective cohort study was conducted at Prentice Women's Hospital of Northwestern Memorial Hospital from March 1, 1997, to March 15, 1998, with institutional review board approval. Women who presented for either laparoscopy or laparotomy consented to have pelvic examinations under anesthesia by an attending gynecologist, gynecology resident, or medical student who had no knowledge of subjects' symptoms and indications for surgery. Examiners were selected by experience and immediate availability to do pelvic examinations when subjects were brought to the operating room. No one physician or medical student accounted for more than 9% of the total examinations.
Before surgery, women provided information on weight, height, lower abdominal surgeries and corresponding abdominal wall scars (vertical and transverse, all below the umbilicus), and pelvic organs removed. Body mass index (BMI) was calculated for each woman as weight in kilograms per height in meters squared (kg/m2), and obesity was defined as BMI at least 30 kg/m2.
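The BMI calculation and obesity cutoff described above can be sketched as follows. This is only an illustration; the function names and the example subject are hypothetical and not taken from the study data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi: float, threshold: float = 30.0) -> bool:
    """Obesity as defined in the text: BMI of at least 30 kg/m^2."""
    return bmi >= threshold


# Hypothetical subject (assumed values, not study data).
bmi = body_mass_index(weight_kg=70.0, height_m=1.65)
print(f"BMI = {bmi:.1f} kg/m^2, obese: {is_obese(bmi)}")
```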
The subjects had a range of indications for surgery, from diagnostic or sterilization procedures to laparotomy for suspected pelvic malignancy. The conditions for pelvic examination were standardized across examinations. After general anesthesia was given, women were placed in the dorsal lithotomy position by using Allen stirrups (Allen Medical Systems, Bedford Heights, OH). Their bladders were emptied by straight catheterization. Each examiner recorded pelvic examination findings, including adnexal (size and presence of adnexal masses), uterine (position, size, mobility, and contour), and rectovaginal (cul-de-sac obliteration, external rectal compression, and uterosacral nodularity) findings. Examiners recorded years of postgraduate training for attending physicians and residents and medical school year for students. Fifty-two board-certified obstetrician-gynecologists made up the attending physician group. The resident group included 30 residents from all four postgraduate years. Forty third- and fourth-year medical students participated. Examiners' dominant and examining hands were recorded. Postoperatively, surgeons completed forms describing surgical findings for the same variables assessed during pelvic examinations.
An adnexal mass was defined as approximately 5 cm or more in greatest diameter, and uterine enlargement was defined as at least 8 weeks' gestation. Left and right adnexa were considered separately for analysis.
Data were analyzed with a standard statistical package (SPSS 9.0; SPSS, Inc., Chicago, IL). Sensitivity, specificity, and positive predictive value were expressed as proportions. The 95% confidence intervals (CIs) were computed using normal approximation to the binomial distribution for moderate values (eg, a value at least 0.3 and no more than 0.7). For values near zero (less than 0.3) or near unity (greater than 0.7), the upper and lower confidence limits were computed with an alternative method, described by Fleiss, based on the two roots of the quadratic equation.13 Continuous variables were tested for statistically significant differences using two-tailed paired t tests.
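The two-regime confidence interval rule described above might be sketched as follows. This is an illustration only, not the SPSS procedure used in the study. As a stated assumption, the quadratic-root method is written here as the Wilson score interval without continuity correction; the formula in Fleiss may include a continuity correction that this sketch omits. All function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_interval(successes: int, n: int, z: float = Z_95):
    """Normal approximation to the binomial: p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_interval(successes: int, n: int, z: float = Z_95):
    """Score interval: the two roots of the quadratic (p_hat - p)^2 = z^2 p(1-p)/n.

    Assumption: no continuity correction is applied here.
    """
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def proportion_ci(successes: int, n: int, z: float = Z_95):
    """Use the normal approximation for 0.3 <= p <= 0.7, the quadratic method otherwise."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    if 0.3 <= p <= 0.7:
        return wald_interval(successes, n, z)
    return wilson_interval(successes, n, z)


# Hypothetical counts (not study data): a moderate and an extreme proportion.
print(proportion_ci(50, 100))
print(proportion_ci(9, 10))
```

The quadratic method keeps the limits inside the interval from 0 to 1 when the observed proportion is near either end, which the plain normal approximation does not guarantee.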
We calculated Youden J statistic, a summary index that combines sensitivity and specificity, assuming both have equal importance.14 A perfectly valid test has a J value of one, whereas a J value of zero suggests that the test performs no better than chance.15 The likelihood ratio is a statistical alternative proposed by Sackett et al16 as a more stable measure than sensitivity and specificity. The positive predictive value takes both measures into account but varies greatly depending on the prevalence of the disorder in the population. As an alternative, we calculated the likelihood ratio for pelvic examination as the odds that a positive or negative pelvic examination would be expected in a woman with an adnexal mass (as opposed to one without an adnexal mass).
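The summary measures named above can be sketched from a 2 × 2 table of examination findings against surgical findings. The class name and the example counts are hypothetical and are not the study data; the definitions used are the standard ones (J = sensitivity + specificity − 1; positive likelihood ratio = sensitivity / (1 − specificity); negative likelihood ratio = (1 − sensitivity) / specificity).

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of examination result versus surgical finding (the reference standard)."""
    tp: int  # mass at surgery, examination positive
    fn: int  # mass at surgery, examination negative
    fp: int  # no mass at surgery, examination positive
    tn: int  # no mass at surgery, examination negative

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def youden_j(self) -> float:
        # 1 for a perfect test, 0 when the test does no better than chance.
        return self.sensitivity + self.specificity - 1

    @property
    def lr_positive(self) -> float:
        false_positive_rate = 1 - self.specificity
        return math.inf if false_positive_rate == 0 else self.sensitivity / false_positive_rate

    @property
    def lr_negative(self) -> float:
        return math.inf if self.specificity == 0 else (1 - self.sensitivity) / self.specificity


# Hypothetical counts chosen only for illustration.
table = TwoByTwo(tp=20, fn=30, fp=10, tn=90)
print(f"sensitivity={table.sensitivity:.2f} specificity={table.specificity:.2f}")
print(f"J={table.youden_j:.2f} LR+={table.lr_positive:.2f} LR-={table.lr_negative:.2f}")
```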
Results
Table 1 shows the characteristics of the 140 women who participated in the study. They were predominantly premenopausal, and only 14 (10%) were 50 years or older. Mean BMI was 26.1 kg/m2, and nearly one third had prior surgeries.
At surgery, 49 left and 33 right adnexal masses were found. The three examiner groups performed 361 examinations. Detection of adnexal masses is reported in terms of sensitivity, specificity, Youden J statistic, positive predictive value, and likelihood ratio in Table 2. Sensitivity of pelvic examinations was low irrespective of adnexal laterality or examiner experience. Sensitivity tended to be better for attending physicians and residents than for medical students, but the difference was not statistically significant. Specificity was uniformly high for both adnexa. Except for gynecology residents' examinations of the left adnexa, the ability of bimanual examination to detect adnexal masses was no better than chance alone (ie, the sum of sensitivity and specificity equal to 100%). The greater prevalence of left adnexal masses in our cohort was associated with higher positive predictive value for left adnexa. When the likelihood ratio was analyzed, the resident group had the best, most consistent performance among the examiner groups. For example, if a resident described the left adnexa as abnormal, the odds of finding an adnexal mass at surgery were more than five times greater than those of not finding one. When attending physicians described right adnexal masses by examination, the odds of detecting one at surgery were the same as those of not finding one.
We compared examiners' estimated size of adnexa with surgically confirmed adnexal size using a paired t test. Examiners systematically underestimated the size of the adnexa by bimanual examination. We found no statistically significant differences in estimated size of adnexa between attending physicians and residents. In separate comparisons, attending physicians and gynecology residents were statistically significantly more accurate than medical students in determining adnexal size. On average, attending physicians (mean ± standard error of mean [SEM]; left −1.49 ± 0.30; right −1.61 ± 0.31) and residents (left −1.42 ± 0.24; right −1.01 ± 0.26) underestimated sizes of adnexa by approximately 1–1.5 cm, whereas students' (left −2.54 ± 0.29; right −2.34 ± 0.24) estimates were 2–2.5 cm less than surgically confirmed values.
The effects of patient characteristics on accuracy of pelvic examinations are shown in Table 3. No statistically significant differences were found for sensitivity, specificity, and positive predictive values for lean versus obese women, those with normal versus enlarged uteri, and absence or presence of abdominal wall scars. However, overall accuracy of examinations, expressed as the J statistic, was statistically significantly better than chance for left adnexal examinations among leaner women, those with normal-sized uteri, and those without abdominal scars. There were fewer right adnexal masses among those women, so there was only a trend toward comparable findings with the right adnexa. Using the dominant hand as the examining hand was unrelated (not shown) to overall accuracy of the bimanual examinations.
Discussion
The need for bimanual pelvic examination as part of routine gynecologic care has been questioned.5,17 The United States Department of Health and Human Services Public Health Service, in the Clinician's Handbook of Preventive Services, and the National Cancer Institute do not endorse pelvic examination as a screening test for adnexal disease, particularly ovarian cancer, because of a lack of information on sensitivity, specificity, and positive and negative predictive values.18,19 No study has shown that routine pelvic examination increases detection of adnexal disease, whether benign or malignant.
This study design eliminated examination limitations inherent in an awake patient (anxiety, pain, and bladder distention), so we expected sensitivity and specificity of examinations to improve. Nevertheless, sensitivity of pelvic examination to detect adnexal masses equal to or greater than 5 cm was low. Our findings confirmed those in the literature that pelvic examination for detecting adnexal disease is unreliable. Roman et al20 compared pelvic examination to tumor marker levels and ultrasound for predicting pelvic cancer in women with adnexal masses. In their study, sensitivity and positive predictive value of pelvic examination were only 51% and 43.8%, respectively. They did not endorse pelvic examination for that application because half the patients with cancer had unsuspicious examinations. A retrospective review of adnexal masses in postmenopausal women by Rulin and Preston21 found that multiple masses equal to or greater than 5 cm (and even as large as 10 cm) frequently were missed by gynecologists at an academic center. In a population screening study by Andolf et al,22 only 23% of persistent adnexal masses found by ultrasound were detected by pelvic examination and none of the borderline or malignant ovarian lesions were found by pelvic examination. A follow-up study reported that the pelvic examination missed 43% of adnexal masses up to 5 cm and 19% of those equal to or greater than 5 cm in postmenopausal women.23
In the current study, the specificity (79–92%) of pelvic examination to detect adnexal masses was acceptable. Those values should be considered in the context of the cohort of women with relatively high prevalence of adnexal masses (greater than 20%) in the study group. Most women were premenopausal, a group at higher risk of developing ovarian neoplasms compared with postmenopausal women.2,24 Prevalence of adnexal masses in the general population is 0.17–5.9% in asymptomatic women23,25–30 and 7.1–12% in symptomatic women.23,31–33 Therefore, prevalence of adnexal masses in our study was at least twice that reported in the general population. If specificity and negative predictive value were extrapolated to prevalence of adnexal masses in the general population, both measures would improve significantly.
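The dependence of predictive values on prevalence can be illustrated with Bayes' rule. This is a sketch under stated assumptions: sensitivity and specificity are treated as fixed properties of the examination across populations (which may not hold in practice), the sensitivity of 0.30 is a hypothetical value, and the specificity of 0.85 is simply a value chosen inside the 79–92% range reported above. The function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a given prevalence, holding sensitivity and specificity fixed."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed inputs for illustration only: sensitivity 0.30 is hypothetical,
# specificity 0.85 lies within the range reported in the text.
for prevalence in (0.03, 0.10, 0.20):
    ppv, npv = predictive_values(0.30, 0.85, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumptions, lowering the prevalence raises the negative predictive value and lowers the positive predictive value.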
Clinicians armed with the likelihood ratio for a screening test and the expected odds for a disorder can better interpret the results of a screening test. For example, the expected odds of an adnexal mass in the general population were reported as 1:32 (approximately 3%).20 Multiplying the odds by the likelihood ratio for the screening test result (likelihood ratio for attending physicians 2.8) gives an estimate of the odds that women have surgically confirmed adnexal masses. In that example, the computed odds, 2.8:32, can be converted to a probability using the following formula: probability = odds/(odds + 1). A woman with a positive finding for an adnexal mass by bimanual examination would have a surgically confirmed mass slightly more than 8% of the time, which compares with ultrasound, for which a positive test for an adnexal mass would be confirmed slightly more than 17% of the time.20
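The arithmetic in this example can be checked with a short sketch. The function names are hypothetical; the inputs (pretest odds of 1:32 and a likelihood ratio of 2.8) are the values given in the text.

```python
def odds_to_probability(odds: float) -> float:
    """probability = odds / (odds + 1), as given in the text."""
    return odds / (odds + 1)


def post_test_probability(pre_test_odds: float, likelihood_ratio: float) -> float:
    """Multiply pretest odds by the likelihood ratio, then convert to a probability."""
    post_test_odds = pre_test_odds * likelihood_ratio
    return odds_to_probability(post_test_odds)


pre_test_odds = 1 / 32  # expected odds of an adnexal mass, from the text
lr_attending = 2.8      # likelihood ratio for attending physicians, from the text

print(f"pretest probability:  {odds_to_probability(pre_test_odds):.3f}")
print(f"posttest probability: {post_test_probability(pre_test_odds, lr_attending):.3f}")
```

The posttest odds are 2.8/32 = 0.0875, so the posttest probability is 0.0875/1.0875, or about 0.080, in line with the figure of slightly more than 8% quoted above.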
Whether the left or right adnexa can be better assessed by the bimanual pelvic examination was another variable in our study. Stovall et al34 found that pelvic examination assessment of the right adnexa was more accurate than that of the left adnexa. They proposed that predominance of examinations with the right hand and interference of the sigmoid colon with palpation of the left adnexa could account for the difference. In contrast, we found that the left adnexal assessment had better sensitivity and positive predictive value, which was consistent for all examiner groups. Specificity was similar for both adnexa, suggesting that the ability to rule out adnexal disease is independent of adnexal laterality.
Other studies looked at the effect of resident education on diagnostic accuracy,35,36 but our study compared groups with different levels of experience. Residents did slightly better than attending physicians in all aspects assessed, but that difference was not statistically significant. Medical students were less accurate, and for estimation of adnexal size that difference was statistically significant. This supports the hypothesis that postgraduate training in gynecology and obstetrics improves pelvic examination skills. Level of training beyond medical school did not increase accuracy any further, so it is possible that pelvic examination skills reach a plateau after a certain number of examinations have been done. In that regard, we could expect that primary care physicians who do not perform pelvic examinations routinely,37 often because of insufficient training,8,9 will have lower sensitivity and specificity to detect adnexal masses. Therefore, routine reliance on ancillary testing would seem justified.
The limiting effect of obesity on detection of adnexal masses was only marginal. Uterine enlargement, frequently mentioned as a limitation in assessing the adnexa by ultrasonography, was also found to decrease detection of adnexal masses by pelvic examination.
The results of this study increase concerns about the value of the pelvic examination for detecting adnexal disease; however, adnexal assessment is only one component of pelvic examination. Inspection of the lower genital tract coupled with routine cervical cytology allows preclinical epithelial disease to be detected. Uterine assessment correlates well with ultrasound and pathologic weight.38,39 Rectal examination for fecal occult blood might detect colorectal cancer. Pelvic examination could remain a screening technique as long as examiners acknowledge its significant limitations in assessing the adnexa.
References
1. Finkler NJ, Benacerraf B, Lavin P, Wojciechowski C, Knapp RC. Comparison of serum CA 125, clinical impression, and ultrasound in the preoperative evaluation of ovarian masses. Obstet Gynecol 1988;72:659–64.
2. Koonings PP, Campbell K, Mishell DR, Grimes DA. Relative frequency of primary ovarian neoplasms: A 10-year review. Obstet Gynecol 1989;74:921–6.
3. Vasilev S, Schlaerth JB, Campeau J, Morrow CP. Serum CA 125 levels in the preoperative evaluation of pelvic masses. Obstet Gynecol 1988;71:751–6.
4. Creasman WT. Ovarian cancer screening. ACOG clinical review 1997;2:1–15.
5. Grover SR, Quinn MA. Is there any value in bimanual pelvic examination as a screening test? Med J Aust 1995;162:408–10.
6. Jacobs I, Stabile I, Bridges J, Reynolds C, Kemsley P, Grudzinskas J, et al. Multimodal approach to screening for ovarian cancer. Lancet 1988;1:268–71.
7. Russell DJ. The female pelvic mass: Diagnosis and management. Med Clin North Am 1995;79:1481–93.
8. Day SC, Grosso LJ, Norcini JJ, Blank LL, Swanson DB, Horne MH. Residents' perception of evaluation procedures used by their training program. J Gen Intern Med 1990;5:424–6.
9. Neinstein LS, Shapiro JR. Pediatrician's self-evaluation of adolescent health care training, skills, and interest. J Adolesc Health Care 1986;7:18–21.
10. Tolmas HC. Adolescent pelvic examination: An effective practical approach. Am J Dis Child 1991;145:1269–71.
11. Campbell KA, Shaughnessy AF. Diagnostic utility of the digital rectal examination as part of the routine pelvic examination. J Fam Pract 1998;46:165–7.
12. Sturman MF. Pelvic examination versus fiberoptic laparoscopy. J Clin Gastroenterol 1988;10:612–3.
13. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York: John Wiley & Sons, 1981.
14. Youden WJ. Index for rating diagnostic tests. Cancer 1950;3:32–5.
15. Szklo M, Nieto FJ. Epidemiology: Beyond the basics. 1st ed. Gaithersburg, Maryland: Aspen Publication, 2000.
16. Sackett DL, Haynes RB, Guyatt GH, Tugwell P. Clinical epidemiology: A basic science for clinical medicine. 2nd ed. Boston: Little, Brown, and Company, 1991.
17. Oboler SK, LaForce FM. The periodic physical examination in asymptomatic adults. Ann Intern Med 1989;110:214–26.
18. Kramer BS, Gohagan J, Prorok PC, Smart C. A National Cancer Institute sponsored screening trial for prostatic, lung, colorectal, and ovarian cancers. Cancer 1992;71 (Suppl):589–93.
19. United States Department of Health and Human Services Public Health Service. Put prevention into practice. Clinician's handbook of preventive services. Washington, DC: United States Government Printing Office, 1998.
20. Roman LD, Muderspach LI, Stein SM, Laifer-Narin S, Groshen S, Morrow PC. Pelvic examination, tumor marker level, and grayscale and Doppler sonography in the prediction of pelvic cancer. Obstet Gynecol 1997;89:493–500.
21. Rulin MC, Preston AL. Adnexal masses in postmenopausal women. Obstet Gynecol 1987;70:578–81.
22. Andolf E, Svalenius E, Astedt B. Ultrasonography for early detection of ovarian carcinoma. Br J Obstet Gynaecol 1986;93:1286–9.
23. Andolf E, Jorgensen C. Cystic lesions in elderly women, diagnosed by ultrasound. Br J Obstet Gynaecol 1989;96:1076–9.
24. Bennington JL, Ferguson BR, Haber SL. Incidence and relative frequency of benign and malignant ovarian neoplasms. Obstet Gynecol 1968;32:627–32.
25. van Nagell JR, DePriest PD, Puls LE, Donaldson ES, Gallion HH, Pavlik EJ, et al. Ovarian cancer screening in asymptomatic postmenopausal women by transvaginal sonography. Cancer 1991;68:458–62.
26. Jacobs I, Prys A, Bridges J, Stabile I, Fay T, Lower A, et al. Prevalence screening for ovarian cancer in postmenopausal women by CA 125 measurement and ultrasonography. BMJ 1993;306:1030–4.
27. Bourne T, Campbell S, Reynolds KM, Whitehead MI, Hampson J, Royston P, et al. Screening for early familial ovarian cancer with transvaginal ultrasonography and colour blood flow imaging. BMJ 1993;306:1025–9.
28. Hayashi H, Yaginuma Y, Kitamura S, Saitou Y, Miyamoto T, Komori H, et al. Bilateral oophorectomy in asymptomatic women over 50 years old selected by ovarian cancer screening. Gynecol Obstet Invest 1999;47:58–64.
29. Bahn V, Amso N, Whitehead MI, Campbell S, Royston P, Collins WP. Characteristics of persistent ovarian masses in asymptomatic women. Br J Obstet Gynaecol 1989;96:1384–91.
30. Sasaki H, Oda M, Ohmura M, Akiyama M, Liu C, Tsugane S, et al. Follow up of women with simple ovarian cysts detected by transvaginal sonography in the Tokyo metropolitan area. Br J Obstet Gynaecol 1999;106:415–20.
31. Reuss LM, Kolton S, Tharakan T. Transvaginal ultrasonography in gynecologic office practice: Assessment in premenopausal women. Am J Obstet Gynecol 1996;175:1189–94.
32. Kurjak A, Zalud I, Schulman H, Sosic A, Shalan H. Transvaginal ultrasound, color flow, and doppler waveform of the postmenopausal adnexal mass. Obstet Gynecol 1992;80:917–21.
33. Schoenfeld A, Levavi H, Hirsch M, Pardo J, Ovadia J. Transvaginal sonography in postmenopausal women. J Clin Ultrasound 1990;18:350–8.
34. Stovall TG, Elder RF, Ling FW. Predictors of pelvic adhesions. J Reprod Med 1989;34:345–8.
35. Carter J, Fowler J, Carson L, Carlson J, Twiggs LB. How accurate is the pelvic examination as compared to transvaginal sonography?: A prospective, comparative study. J Reprod Med 1994;39:32–4.
36. Frederick JL, Paulson RJ, Sauer MV. Routine use of vaginal ultrasonography in the preoperative evaluation of gynecologic patients: An adjunct to resident education. J Reprod Med 1991;36:779–82.
37. Morris PD, Morris ER. Family practice residents' compliance with preventive medicine recommendations. Am J Prev Med 1988;4:161–5.
38. Cantuaria GH, Angioli R, Frost L, Duncan R, Penalver MA. Comparison of bimanual examination with ultrasound examination before hysterectomy for uterine leiomyoma. Obstet Gynecol 1998;92:109–12.
39. Killackey MA, Neuwirth RS. Evaluation and management of the pelvic mass: A review of 540 cases. Obstet Gynecol 1988;71:319–21.