Assessment of Vulvovaginal Complaints: Accuracy of Telephone Triage and In‐Office Diagnosis

Allen‐Davis, Jandel T. MD; Beck, Arne PhD; Parker, Ruth CNM; Ellis, Jennifer L. MBA; Polley, Dana MA

Obstetrics & Gynecology:
Original Research

OBJECTIVE: To examine the agreement between telephone and office management of vulvovaginal complaints and to assess the accuracy of diagnosis of vulvovaginitis.

METHODS: Nurses conducted prospective structured telephone interviews of all patients with vulvovaginal complaints who called the Kaiser Permanente Telephone Call Center. Patients were then scheduled for office evaluation by a physician, nurse midwife, or physician's assistant. Both groups (nurses and practitioners) made independent diagnostic and treatment decisions. κ coefficients were used to evaluate the interexaminer agreement between telephone nurses and practitioners, and between practitioners and traditional diagnostic tests.

RESULTS: A total of 485 patients underwent telephone interviews, and 253 (52%) completed the study protocol. κ values showed poor agreement between nurses and practitioners for bacterial vaginosis (0.12), candidiasis (0.22), and trichomoniasis (−0.05). Agreement between practitioners' diagnoses and laboratory findings was also poor. There was likewise poor agreement between telephone nurses and practitioners regarding the necessity of an office visit (0.14).

CONCLUSION: This prospective study challenges the notion that the telephone is an effective tool to diagnose and treat vulvovaginal complaints. Moreover, given the poor agreement between practitioners' diagnoses and microbiologic and microscopic data, further study into optimal diagnosis of vulvovaginitis is needed.

In Brief

Telephone triage is not an effective tool for diagnosing and treating vulvovaginal complaints.

Author Information

Department of Obstetrics and Gynecology, and Clinical Research Unit, Kaiser Permanente, Wheat Ridge, Colorado.

Address reprint requests to: Jandel T. Allen‐Davis, MD, Department of Obstetrics and Gynecology, Kaiser Permanente, 4803 Ward Road, Wheat Ridge, CO 80033; E‐mail: jandel.c.allen‐

This work was supported by a Research & Development grant from Kaiser Permanente, Colorado.

We thank James McGregor, MD, for his assistance in this study.

Received May 30, 2001. Received in revised form September 4, 2001. Accepted September 17, 2001.

The telephone is a point of entry in many modern health care delivery systems. In an effort to control costs, to avoid unnecessary appointments, and to maximize efficiency for patients and offices, nurses may diagnose and manage many common conditions via the telephone, usually following established protocols. These systems range from simple office‐based phone triage to large generic telephone call centers that operate around the clock. Despite this common practice, there are very few published studies evaluating clinical outcomes of care delivered in this fashion.1–4 Most published research evaluates patient satisfaction, and there are no studies evaluating either satisfaction or clinical outcomes with phone management of common gynecologic complaints.

Vulvovaginitis, the most common reason why women seek gynecologic help, is commonly managed by phone, often by office nurses. With the introduction of over‐the‐counter antifungals, treatment of vulvovaginal complaints is more readily available, but it relies on a patient's self‐diagnosis. If this self‐assessment is erroneous, several suboptimal outcomes could occur. First, patients may be selecting for more resistant strains of Candida species while failing to receive the correct diagnosis and treatment when candidiasis is not the cause of the symptoms; this could complicate an otherwise straightforward case of vulvovaginitis. Second, incorrect diagnosis potentially increases costs, results in more office visits, and erodes any efficiency that may be gained by phone management.

Additionally, the diagnostic accuracy of providers has not been scrutinized in an evidence‐based fashion. Specifically, no published study has looked at microscopy, pH, and culture and correlated them with final diagnosis or patient outcomes. This prospective study attempts to address these complex issues.

Methods

From June 1996 to August 1996, any female patient who called the Kaiser Permanente Phone Call Center with vulvovaginal complaints was referred to a registered nurse present in the Call Center for phone evaluation. This time frame was used to obtain a sample size of 500 patients, which was estimated to be sufficient for yielding reliable estimates of interrater agreement. Using consecutive sampling, phone assessments of 485 women were obtained during this period. Exclusion criteria included pregnancy, patients already entered into the study, and those who refused to participate. The nurse asked the patient a uniform battery of questions, made an assessment and treatment plan based on the information received, and judged whether she would have treated the patient over the phone. Patients were given a same‐day appointment with a physician, nurse midwife, or physician's assistant, who was blinded to the nursing assessment and plan. To accurately assess their current level of knowledge and practice, practitioners and nurses received no prestudy education in the assessment of vulvovaginal complaints.

In the office, patients answered the same questions the phone nurse had asked them via a self‐administered questionnaire. Practitioners reviewed this questionnaire and performed a physical examination. Cervical cultures for Neisseria gonorrhoeae were obtained via Thayer Martin plates. Chlamydia trachomatis testing was obtained using Chlamydiazyme enzyme immunoassay (Abbott Laboratories, Diagnostics Division, Abbott Park, IL). Additionally, vaginal pH and microscopic evaluation of vaginal discharge were done. Vaginal fluid was cultured for Trichomonas vaginalis using the In‐Pouch system (BioMed Diagnostics, San Jose, CA) (tests were read at 4 days and 7 days). Candida species was cultured using a BAP/MAC Choc Ml Sab plate, and aerobic cultures were also done. The practitioners made a diagnosis and prescribed treatment. They made a judgment regarding whether they would have treated the patient over the phone. Call Center nurses were blinded to this information. An attempt was made to contact patients 2 weeks after treatment to ascertain whether they completed the treatment and whether their symptoms had resolved.

Analyses were done using SAS 6.12 (SAS System, Cary, NC). κ coefficients were used to evaluate the interexaminer agreement between phone nurses' and providers' diagnoses, and between laboratory and provider office diagnoses. The κ statistic is a measure of reproducibility between repeated assessments. In general, values of κ greater than 0.75 denote excellent agreement beyond chance, values below 0.40 indicate poor agreement beyond chance, and values in between represent fair‐to‐good agreement beyond chance.5
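As an illustration of the κ statistic described above, the following minimal sketch computes Cohen's κ for two raters making a yes/no call on the same patients and applies the interpretation bands quoted from Fleiss. It is not the study's SAS analysis; the function names and all counts are hypothetical and were chosen only to show the arithmetic.

```python
def cohen_kappa(both_yes, nurse_only, practitioner_only, both_no):
    """Cohen's kappa for a 2x2 agreement table (two raters, yes/no diagnosis)."""
    n = both_yes + nurse_only + practitioner_only + both_no
    if n == 0:
        raise ValueError("agreement table is empty")
    # Proportion of patients on whom the two raters agreed.
    observed = (both_yes + both_no) / n
    # Marginal "yes" rates for each rater.
    nurse_yes = (both_yes + nurse_only) / n
    practitioner_yes = (both_yes + practitioner_only) / n
    # Agreement expected by chance alone, given those marginal rates.
    expected = (nurse_yes * practitioner_yes
                + (1 - nurse_yes) * (1 - practitioner_yes))
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def interpret(kappa):
    """Interpretation bands as quoted in the text (Fleiss)."""
    if kappa > 0.75:
        return "excellent"
    if kappa < 0.40:
        return "poor"
    return "fair to good"


# Hypothetical counts, NOT data from this study.
examples = {
    "mostly concordant": (40, 10, 10, 40),
    "worse than chance": (20, 30, 25, 25),
}
for label, counts in examples.items():
    k = cohen_kappa(*counts)
    print(f"{label}: kappa = {k:.2f} ({interpret(k)})")
```

The second hypothetical table shows how a slightly negative κ, such as the value reported for trichomoniasis, can arise: observed agreement falls just below what the raters' marginal rates would produce by chance.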

Results

A total of 485 patients underwent telephone evaluation and were appointed to see a physician, nurse midwife, or physician's assistant. Office staff relied on the Call Center to alert them to a patient who had been evaluated by phone by indicating “Vaginitis Study Patient” in the appointment comments. However, because of busy schedules and other distractions, Call Center nurses recorded this information intermittently. Hence, this comment was entered for only 253 (52%) of the 485 patients; these patients constituted the subjects included in the final analysis. Demographic characteristics are not available for the 232 patients dropped from the sample, but, because the omissions were random, there is no reason to suspect any selection bias. Fifty‐one patients (20%) were evaluated by a physician, and 202 (80%) by a midlevel practitioner. One hundred fifty‐one (60%) patients had no previous history of similar vaginal symptoms, and 66 (26%) had been treated for similar symptoms in the 4 months before entry in the study.
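The sample accounting in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable names and the `pct` helper are illustrative.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number, as reported in the text."""
    return round(100 * part / whole)


interviewed = 485   # patients who underwent telephone evaluation
analyzed = 253      # patients flagged as study patients and analyzed
dropped = interviewed - analyzed

# Counts reported within the analyzed group.
subgroups = {
    "seen by a physician": 51,
    "seen by a midlevel practitioner": 202,
    "no previous similar symptoms": 151,
    "treated for similar symptoms in prior 4 months": 66,
}

print(f"analyzed: {analyzed}/{interviewed} = {pct(analyzed, interviewed)}%")
print(f"dropped: {dropped}")
for label, count in subgroups.items():
    print(f"{label}: {count}/{analyzed} = {pct(count, analyzed)}%")
```

The computed values match the percentages given in the paragraph, and the physician and midlevel counts sum to the analyzed total.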

Table 1 and Table 2 present pertinent historic data and clinical findings. There were statistically significant differences between Call Center‐elicited responses and patient self‐reports of condom use, sexual activity, and symptoms, even though patients were seen within hours of the phone interview. Moreover, there was a statistically significant difference between patient self‐reports and provider descriptions of vaginal discharge color.

Table 3 presents microscopic findings. Clue cells were seen in 30% of the examinations, whereas fungal elements were seen 40% of the time. Lactobacilli, the predominant organisms in normal vaginal secretions, were absent 52% of the time. Trichomonads were present in eight (4%) patients by microscopy. However, with respect to microbiologic findings (Table 4), Candida species was cultured in 31% of the cases, and there were five (2%) positive Trichomonas vaginalis cultures. There were no cases of Chlamydia trachomatis or Neisseria gonorrhoeae.

Among the diagnostic assessments made by patients, nurses, and practitioners, candidiasis was the diagnosis cited most frequently by all three groups, whereas trichomoniasis was rarely diagnosed. Practitioners diagnosed bacterial vaginosis 23% of the time, whereas nurses were split evenly between bacterial vaginosis and candidiasis. Patients infrequently cited bacterial vaginosis as a cause of their symptoms. They tended either to think that they had vaginal candidiasis or not to know the cause.

κ values for diagnoses (Table 5) showed poor agreement between nurses and providers for bacterial vaginosis (0.12), vulvovaginal candidiasis (0.22), and trichomoniasis (−0.05). Additionally, poor agreement was seen between providers and clinical findings consistent with bacterial vaginosis (0.31). Office providers' accuracy was poor when compared with culture (Table 6), and was worse for nurses.

Responses to the question “Would you treat this patient over the phone?” showed poor agreement between Call Center nurses and providers (κ = 0.14). Thirty‐six percent of nurses believed phone treatment was appropriate for a given patient versus 28% of providers.

One hundred fifteen (45%) patients were reached by phone at 2 weeks post‐treatment. Most patients (82%) completed therapy, and 81% reported symptom resolution.

Discussion

Most studies evaluating telephone triage systems have examined patient and practitioner satisfaction, that is, service parameters. Few studies have looked at outcomes or have prospectively attempted to answer questions of accuracy. This study prospectively evaluated accuracy and outcomes of phone management of vulvovaginal complaints, a common problem in gynecologic and primary care practices. The results show that telephone assessment of vulvovaginal complaints is poor, and as such, telephone management should be discouraged. Additionally, this study indicates that diagnosis by office practitioners lacks accuracy as well. There are probably a number of factors that contribute to these inaccuracies.

The current path to the diagnosis of vaginitis relies on accurate reporting of symptoms, coupled with the use of physical examination and microscopy. In this study, eliciting patients' symptoms over the phone yielded answers that were different from those self‐reported in the office to the same questions, and most of the differences were statistically significant. The effect of interview mode has been evaluated in the psychiatric literature, where phone and face‐to‐face interviews appear to agree well, but no one has compared phone interviews with self‐administered assessment of symptoms.6–8 One must wonder if there are other areas of medicine in general, and gynecology in particular, wherein patients are reluctant to share details. The statistically significant difference between phone‐reported sexual activity and what patients disclosed to providers in the office is an example of a reticence to share intimate or seemingly embarrassing details. Non‐disclosure of this type could influence a phone triage nurse's ability to accurately assess a patient's complaints.

Although most women are aware of vulvovaginal candidiasis, few are aware of the symptoms of bacterial vaginosis or trichomoniasis. In this study, most women did not know the cause of their symptoms. There was also a difference between how patients described their discharge (ie, color) and what providers found on examination. If phone nurses use this information to decide triage, there is the possibility that this inaccuracy could affect their treatment decisions. This is a strong reason to argue against self‐treatment.

There was a lack of agreement between phone nurses and office practitioners in diagnosing vulvovaginitis. This may reflect a relative lack of knowledge on the part of the nurses, but it may also indicate the nonspecific nature of vulvovaginal symptoms, which renders making accurate diagnoses more difficult. We were unable to point to a specific constellation of symptoms that would allow nurses or providers to arrive at a correct diagnosis with a high degree of certainty. Vulvovaginitis is a nonurgent problem, and Leprohon and Patel,1 in a paper exploring decision‐making strategies in emergency medical services, found that decisions tended to be more accurate in high‐urgency situations. Conversely, when problems became more complex and less urgent, decisions tended to be inaccurate and to show a decoupling of knowledge and action. The lack of agreement between those making a diagnosis without the benefit of a physical examination and those who see patients and do the examinations is striking, but the lack of agreement between providers and currently widely available laboratory tests to augment the diagnosis of vaginitis indicates that current clinical practice might be lacking as well.

The majority of providers in this study were nurse practitioners, but when we controlled for the knowledge level (assuming physicians would be more highly skilled and therefore would arrive at more accurate diagnoses), we found no difference in results. Given this, one must wonder whether the basic tools for diagnosing vaginitis (ie, microscopy and physical examination) are sufficient or whether there are deficiencies in provider knowledge regarding the use of all available tools, including culture, pH, and amine testing. In teaching programs, microscopy of vaginal fluids is taught early in training by those who may lack clinical experience diagnosing vaginitis, perhaps with the unintentional outcome that students and junior residents assume vaginitis is a simple condition to diagnose and treat. The results of this study question that assumption.

Telephone nurses were more likely than practitioners to be comfortable with treatment over the phone, a finding that questions their perception of their knowledge base, the reliability of protocols, or their assumptions about the connections between symptoms and diagnoses. No prestudy nurse or provider education regarding the diagnosis and management of vaginitis was done, as the purpose of the study was to evaluate typical practice patterns in a system that uses phones extensively for triage and treatment. It is difficult to assess the degree to which this approach affected our results without repeating the study after an intensive educational program in the diagnosis and treatment of vulvovaginitis.

One limitation of the study involves the lack of agreement regarding the optimal way to diagnose vaginitis. Our use of laboratory data might fall short of the definition of a gold standard. Bacterial vaginosis is diagnosed by clinical parameters (eg, Amsel's criteria, presence of clue cells, pH, or a positive whiff test). We used pH and microscopy but did not perform the whiff test, which might have changed the agreement data between providers and clinical findings in a more positive direction. Additionally, as pH results were not recorded in 20% of the examinations, pH was excluded from the final analysis.

Candida species can be found as part of the normal vaginal flora, so its presence does not necessarily mean that it is the cause of vaginal complaints. Further, microscopy is a suboptimal way to diagnose vaginitis in this case, especially in the face of extreme inflammation that can obscure the true clinical condition or in the absence of training in the preparation and interpretation of a wet mount.9 Practitioners were treating on the basis of the microscopy, which tended to yield more diagnoses of vulvovaginal candidiasis and trichomoniasis than did culture, which has a higher degree of accuracy. This raises the possibility that providers go to the microscope with a clinical bias based on history and physical findings, so they might be overdiagnosing these two conditions.

Most patients reported symptom improvement after 2 weeks despite the diagnostic incongruence. It might be that the follow‐up phone calls should have been done later than they were (ie, 4–6 weeks) to assess patients for resolution and recurrence. It is doubtful that this symptom resolution would be sustained in the case of improper treatment, although without a greater lag time in follow‐up, long‐term outcomes are unknown.

Our study points out the need to investigate outcomes of telephone triage systems more intensively and of telephone triage of vulvovaginitis specifically. It might be that, after an intensive training program, accuracy could be improved. However, the lack of accurate diagnosis of vulvovaginal complaints in the office argues against spending a great deal of time training nurses to diagnose over the phone. Focus needs to be on educating providers in the expanded role of culture, pH, and microscopic skills. Research efforts should focus on the development of accurate diagnostic modalities that are efficient, cost‐effective, and relatively inexpensive.

References

1. Leprohon J, Patel VL. Decision-making strategies for telephone triage in emergency medical services. Med Decis Making 1995;15:240–53.
2. Feldman-Naim S, Myers FS, Clark CH, Turner EH, Leibenluft E. Agreement between face-to-face and telephone-administered mood ratings in patients with rapid cycling bipolar disorder. Psych Res 1997;71:129–32.
3. Poole SR, Schmitt BD, Carruth T, Peterson-Smith A, Slusarski M. After-hours telephone coverage: The application of an area-wide telephone triage and advice system for pediatric practices. Pediatrics 1993;92:670–9.
4. Sramek M, Post W, Koster RW. Telephone triage of cardiac emergency calls by dispatchers: A prospective study of 1,386 emergency calls. Br Heart J 1994;71:440–5.
5. Fleiss JL. Statistical methodology for rates and proportions. 2nd ed. New York: John Wiley and Sons, 1981.
6. Donovan RJ, Holman CD, Corti B, Jalleh G. Face-to-face household interviews versus telephone interviews for health surveys. Aust N Z J Public Health 1997;21:131–40.
7. Revicki DA, Tohen M, Gyulai L, Thompson C, Pike S, Davis-Vogel A, et al. Telephone versus in-person clinical and health assessment interviews in patients with bipolar disorder. Harv Rev Psych 1997;5:75–81.
8. Rohde P, Lewinsohn PM, Seeley JR. Comparability of telephone and face-to-face interviews in assessing axis I and II disorders. Am J Psych 1997;154:1593–8.
9. Iglesias EA, Alderman E, Fox AS. Use of wet smears to screen for sexually transmitted diseases. Infect Med 2000;17:175–85.
© 2002 by The American College of Obstetricians and Gynecologists.