Increasing the US cervical cancer screening intervals by introducing human papillomavirus (HPV) testing1 triggered concerns about implementation2 and increased risks for cancer.3 Widespread adoption of guidelines is lagging and inefficient.4 Colposcopy with biopsy links screening and diagnosis/treatment of precancerous lesions but is diagnostically poor and subjective.5–8 To increase sensitivity, recent evidence advocates extensive biopsying,9 directed10 or random,11,12 compromising patient experience13 and increasing health care costs, but it is uncertain to what extent these practices have been adopted by community-based clinics. Furthermore, even with multiple biopsies, almost half of precancers are undetected/underdetected, indicating that biopsy placement is inaccurate.14 Detection of initially missed disease during follow-up visits depends on access and compliance, and furthermore, subsequent cytology results are also subjective,15 increasing associated risks and costs and prolonging patient anxiety. The need to improve colposcopy and the detection of high-grade cervical intraepithelial neoplasia (CIN) is widely recognized.8,9,12
Dynamic spectral imaging (DSI) cervical mapping standardizes colposcopy and quantifies cervical acetowhitening, the most sensitive indicator of CIN 2+,16 to introduce objectivity and assist colposcopic assessment and biopsy placement. International, academic institution-based trials demonstrated that using the DSI map increases the sensitivity of colposcopy to identify CIN 2+.17,18 These were investigational studies performed in controlled settings, where colposcopists were required to biopsy the strongest DSI indications and also had to take random biopsies to reduce verification bias. They did not use control groups for comparisons of accuracy and demonstrated “proof of principle” rather than “real-world” performance. The effectiveness of colposcopy with DSI in routine practice and wider settings has not been confirmed to date, except for a single-colposcopist report from Spain.19
Following the findings of these earlier studies, the IMproved PRactice Outcomes and Value Excellence in Colposcopy (IMPROVE-COLPO) study was conducted to expand the scope to a US population. The primary objective of the study was to observe changes in colposcopy practice and CIN 2+ detection in a large representative group of US community-based clinics, after the introduction of the commercial digital colposcope (DYSIS; DYSIS Medical, Edinburgh, United Kingdom) that integrates DSI, using previous standard practice for control. In this article, we present findings on women having colposcopy preceded by low-grade abnormalities, the subgroup that represents most colposcopy patients and harbors most CIN.20
MATERIALS AND METHODS
The IMPROVE-COLPO study is a multicenter, observational, two-arm cross-sectional study in patients undergoing colposcopy based on current US guidelines.1 The study was designed to capture routine colposcopy at US community-based clinics, was approved by a central institutional review board (E&I Review Services, Independence, MO) and local institutional review boards as required, and was conducted according to the International Conference on Harmonization Guideline for Good Clinical Practice. Subject refusal to participate was minimized by using separate arms; a randomized study would likely have been subject to a higher refusal rate, thus limiting the generalizability of study conclusions.
Facilities adopting the DSI technology, ranging from single-provider private practices to teaching hospitals, were invited to participate, without further selection criteria. At each facility, consecutive women who were having colposcopy with the DSI digital colposcope and were eligible for participation according to the study inclusion/exclusion criteria were approached by site staff and informed about the study. Those who agreed were enrolled in the prospective arm. No data were collected from patients who refused to consent and participate. Patients from consecutive historical examinations performed by the same colposcopists as in the prospective arm in the period directly preceding the study device installation, but with standard colposcopes (any type) and methods, were enrolled in the retrospective control arm. Retrospective arm subject data were compiled by chart review. The numbers of patients across the two arms were matched so that each colposcopist contributed an equal number of cases in each arm (1:1) to reduce bias due to variability in training/expertise levels. The study colposcopists were those conducting colposcopies at the participating sites, so as to reflect colposcopy practice in US community-based clinics. There was no further quality control or conditions for selection, other than their willingness to participate and availability of retrospective cases for matching. Colposcopists involved included gynecologic oncologists, obstetrician-gynecologists, nurse practitioners, and physician assistants. They were all adequately trained in using the device before recruiting prospective patients.
Inclusion criteria for both arms were age of 21 years and older and an abnormal screening test result.1,21 Although current US guidelines recommend against it, women younger than 29 years were often co-tested for HPV, so they were included if they were HPV+ and had atypical squamous cells of undetermined significance (ASC-US) cytology. Women aged 21 to 24 years were included also with a single low-grade squamous intraepithelial lesion (LSIL) result. Women undergoing colposcopy for no specified indication, after a single ASC-US or with a single HPV+ result (unless they were >25 years with HPV 16/18 from primary screening)21 were excluded. Other exclusions were for known pregnancy, HIV infection or AIDS, previous hysterectomy, and receiving (or having previously received) radiation treatment or chemotherapy for cancers concurrent with cervical disease. Women in the prospective arm signed informed consent before any study procedure; consenting was waived for retrospective arm patients.
The study device is a high-resolution digital colposcope offering magnification, green and enhanced-contrast filters, biopsy annotations for guidance, dynamic playback for image comparisons during the course of the examination, and DSI mapping.17 The DSI map is based on analyzing a baseline (pre-acetic acid) image and consecutive images captured after the homogenous application of acetic acid with an integrated applicator. Acetowhite changes are quantified and highlighted for assessment and directed biopsy with a color scale (see Figure 1). The device use followed its indications cleared by the US Food and Drug Administration, whereby the DSI map is used adjunctively, after a thorough standard colposcopy visualization.
All prospective-arm examinations were performed with the study device. Colposcopists assessed morphology and acetowhitening, forming their clinical impression and identifying biopsy sites, before seeing the DSI map. To evaluate routine colposcopy with pragmatic use of DSI, rather than choices dictated by protocol, clinical decisions were the responsibility of the colposcopist: to perform a biopsy on a patient or not, number of biopsies, and where to perform a biopsy. Therefore, although available, the DSI map was interpreted and then followed or overridden for biopsy at the colposcopists' discretion, as considered appropriate for each patient. Similarly, the collection of random biopsies, which would help reduce verification bias, was not requested but was left to clinical judgment.
For each woman, we collected basic demographics, number of biopsies, whether endocervical sampling was performed, and all relevant histopathology results at the biopsy level, except when multiple samples were collected in the same jar. Histopathology readings, the gold standard for analyses, were performed at the laboratories collaborating with the participating facilities, following routine practice.
This article discusses women with lesser/low-grade abnormalities from screening,22 encompassing LSIL, combinations of ASC-US and HPV+ results (co-testing, reflex, or persistent findings), and referrals based on HPV testing (persistent infection, HPV 16/18 after an earlier ASC-US or HPV+ result, or HPV 16/18 result from HPV primary screening). We compare the findings of colposcopic biopsy between the two arms, without detailing the origin of biopsies within the prospective arm (standard-directed or DSI-assisted), which will be studied separately.
The main outcome measures are the number of women detected with CIN 2+ (CIN 2, CIN 3, CGIN 2–3, (adeno)carcinoma in situ, and invasive (adeno)carcinoma), the number of biopsied women, and the number of biopsies. Cervical intraepithelial neoplasia grade 2 was used as the threshold, because this is the current standard for management decisions.22 To investigate the clinical importance of differences in disease detection, we also analyzed for CIN 3+, a better surrogate for cervical cancer,23 and for different age groups, as in younger women with low-grade cytologic abnormalities, CIN 2 often regresses.24 The study sample size was determined so that it would be sufficient to detect a 2% absolute increase in the percent rate of patients found with CIN 2+. Assuming that the rate of patients found with CIN 2+ in the retrospective arm was 8%, detecting a 2% absolute increase (25% relative increase) in the prospective arm, with 80% power, and a 5% two-sided type I error would require 3,350 subjects per arm.
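The sample-size reasoning above can be sketched with the textbook normal-approximation formula for comparing two independent proportions. This is only an illustration: the article does not state which formula or software produced the figure of 3,350, so the function name `n_per_arm` and the choice of an uncorrected two-sided z-test are assumptions. Under them the result comes out somewhat below the reported figure, as expected if a correction or rounding margin was applied.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided comparison of two proportions.

    Assumption: normal approximation without continuity correction;
    the article does not say which method was actually used.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the type I error
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2                          # pooled proportion under the null
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil(numerator ** 2 / (p2 - p1) ** 2)


# 8% CIN 2+ rate assumed in the retrospective arm, 10% in the prospective arm
n = n_per_arm(0.08, 0.10)
print(f"subjects per arm (uncorrected approximation): {n}")
```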
We conducted subject- and biopsy-level analyses. The number of subjects with undetected CIN 2+ is unknown, so the sensitivity and specificity could not be directly calculated. On the subject-level analysis, we characterized each patient as true positive (“TP,” had at least one CIN 2+ biopsy), false positive (“FP,” had biopsy/biopsies that found only < CIN 2), or not biopsied. To evaluate and compare outcomes we determined, for each arm separately, the number of TP and FP patients and calculated the TP rate (defined as TP/N) as a measure of detection and the FP rate (defined as FP/N) as a measure of specificity, where N is the total number of women in each arm.
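As a minimal sketch of these subject-level definitions, the snippet below computes TP/N and FP/N for each arm from the counts reported in the Results (129 and 176 women with CIN 2+, 1,152 women in each arm with biopsies that did not find CIN 2+, and arm sizes of 1,788 and 1,857). The function name `subject_level_rates` is illustrative and not part of the study's analysis code.

```python
def subject_level_rates(tp, fp, n):
    """Return TP rate, FP rate, and biopsied fraction for one study arm.

    tp: women with at least one CIN 2+ biopsy
    fp: women whose biopsies found only < CIN 2
    n:  all women in the arm (biopsied or not)
    """
    return {
        "tp_rate": tp / n,
        "fp_rate": fp / n,
        "biopsied": (tp + fp) / n,
    }


# Counts taken from the Results section of this article
retro = subject_level_rates(tp=129, fp=1152, n=1788)
prosp = subject_level_rates(tp=176, fp=1152, n=1857)

for name, arm in (("retrospective", retro), ("prospective", prosp)):
    print(name, {key: f"{value:.2%}" for key, value in arm.items()})
```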
To compare the accuracy of biopsy, we calculated the biopsy-level positive predictive value (PPV), defined as the number of CIN 2+ biopsies divided by the number of biopsies taken. To model the relationship between the number of biopsies that found CIN 2+ relative to the number of biopsies that were taken per patient, we used a linear regression model with the number of biopsies taken for each patient as the independent variable and the number of CIN 2+ biopsies among them as the dependent variable to calculate the slope of the best linear fit.
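The two biopsy-level quantities can be illustrated as follows. The PPV uses the biopsy counts reported in the Results; the per-patient data behind the regression are not published in this article, so the `taken` and `positive` lists are hypothetical values chosen only to show how the least-squares slope is obtained. The helper names `ppv` and `ols_slope` are likewise illustrative.

```python
def ppv(cin2plus_biopsies, biopsies_taken):
    """Biopsy-level positive predictive value: CIN 2+ biopsies / all biopsies."""
    return cin2plus_biopsies / biopsies_taken


def ols_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Biopsy counts reported in the Results section
retro_ppv = ppv(148, 1834)
prosp_ppv = ppv(229, 2330)
print(f"PPV retrospective {retro_ppv:.2%}, prospective {prosp_ppv:.2%}")

# HYPOTHETICAL per-patient data (not from the study):
# biopsies taken per patient and how many of them showed CIN 2+
taken = [1, 1, 2, 2, 3, 3]
positive = [0, 0, 0, 1, 0, 1]
slope = ols_slope(taken, positive)
print(f"slope on hypothetical data: {slope:.4f}")
```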
The TP and FP rates and the biopsy PPV differences were compared using 95% confidence intervals (95% CIs) and p values (considered statistically significant at p ≤ .05) calculated using the two-sided Fisher exact test for absolute differences and the two-sided Miettinen test for ratios; an unpaired t test was used to compare the detection slopes.
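For readers who want to retrace the subject-level comparison, the sketch below implements a two-sided Fisher exact test with the standard library and applies it to the CIN 2+ counts reported in the Results. Two points are assumptions rather than statements from the article: the two-sided p value is taken as the sum of the probabilities of all tables no more likely than the observed one (the most common convention), and the confidence interval uses a simple Wald formula, because the article does not say how its intervals were computed. The Miettinen test for ratios is not reproduced here.

```python
from math import exp, lgamma, sqrt


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    log_total = _log_comb(row1 + row2, col1)

    def log_prob(k):
        # Hypergeometric probability of k positives in row 1, margins fixed
        return _log_comb(row1, k) + _log_comb(row2, col1 - k) - log_total

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    log_observed = log_prob(a)
    # Sum every table that is no more likely than the observed one
    p = sum(exp(log_prob(k)) for k in range(k_min, k_max + 1)
            if log_prob(k) <= log_observed + 1e-7)
    return min(1.0, p)


def wald_ci_difference(x1, n1, x2, n2, z=1.959964):
    """Wald 95% CI for p2 - p1 (assumed method, not confirmed by the article)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p2 - p1) - z * se, (p2 - p1) + z * se


# Women with CIN 2+ versus without, per arm (counts from the Results)
p_value = fisher_exact_two_sided(129, 1788 - 129, 176, 1857 - 176)
ci_low, ci_high = wald_ci_difference(129, 1788, 176, 1857)
print(f"Fisher exact p = {p_value:.3f}")
print(f"difference in TP rate, 95% CI: {ci_low:.2%} to {ci_high:.2%}")
```

A dedicated statistics package would normally be preferred over hand-written code; the standard-library version is used here only to keep the example self-contained.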
The study sponsor was involved in the study design, collection, analysis, and interpretation of data, in the writing of the report, and in the decision to submit the article for publication.
RESULTS
Initiation for the 39 facilities was between September 2014 and December 2015, and prospective arm patient recruitment was between September 2014 and May 2016. The matched retrospective control examinations span from 2007 to 2016, with 98% of them coming from 2012 to 2015, so after the introduction of current screening guidelines,1 minimizing the potential for selection bias. Older retrospective cases were required in some cases to ensure that the provider had equal numbers in each arm. For practical reasons, no records of demographics (e.g., age, race) or medical information (e.g., cytology/HPV test results) were kept of patients who refused to consent and participate. However, study sites reported that most (estimated >95%) of eligible women who were approached and informed about the study agreed to participate. Data from 3,660 women with a low-grade referral collected by 148 providers among the two arms were available for analysis.
Fifteen women were excluded, the reasons being: younger than 21 years (n = 6), previous hysterectomy (n = 8), and pregnancy (n = 1). A total of 1,788 women were included in the retrospective and 1,857 in the prospective arm (see Table 1). The median age was 34.0 years in both arms and baseline characteristics were comparable and also similar to those of large US cervical cancer screening studies.25 No study or study device related adverse events were reported.
The percentage of women who received biopsy was equivalent, 71.6% in the retrospective and 71.5% in the prospective arm, but more biopsies were taken in the prospective arm. The average number of biopsies per patient was 1.032 in the retrospective and 1.256 in the prospective arm, a statistically significant relative increase of 21.63% (95% CI of ratio = 1.19 to 1.24), corresponding to roughly one extra biopsy per five women. Colposcopic biopsy identified CIN 2+ in 129 women in the retrospective and 176 in the prospective arm (see Table 2), including two with invasive cancer (both in the prospective arm). Endocervical sampling was performed in 70.2% of the women in the retrospective and 67.4% in the prospective arm (p = .068, Fisher exact test) and detected an additional 16 and 15 cases of CIN 2+, respectively, which are not considered further in the analyses.
The TP rates are 7.21% in the retrospective and 9.48% in the prospective arm (see Table 2), a difference of 2.27% (95% CI = 0.47% to 4.07%) that is statistically significant (p = .014, Fisher exact test). It corresponds to 31.4% more CIN 2+ patients in the prospective DSI arm compared with the retrospective standard colposcopy arm (95% CI of ratio = 1.06 to 1.63, p = .014, Miettinen test).
In auxiliary analyses, biopsies found 37 women with CIN 3+ in the retrospective and 60 in the prospective arm, corresponding to TP rates of 2.07% and 3.23%, a difference of 1.16% (95% CI = 0.12% to 2.24%) that is statistically significant (Fisher exact test p = .031) and corresponds to 56.1% more CIN 3+ patients in the prospective arm (95% CI of ratio = 1.045 to 2.334, Miettinen test p = .029). The increased detection was pronounced in women 30 years and older, where the CIN 2+ rate increased from 6.29% (95% CI = 4.99% to 7.88%) in the retrospective to 9.48% (95% CI = 7.93% to 11.29%) in the prospective arm (p = .004, Fisher exact test), and the CIN 3+ rate from 1.81% (95% CI = 1.15% to 2.8%) to 3.57% (95% CI = 2.64% to 4.8%) (p = .008, Fisher exact test).
In the retrospective arm, 1,152 women had biopsy(ies) that did not find CIN 2+ (64.43% FP rate). In the prospective arm, 1,152 women (the identical number is a coincidence) had biopsy(ies) that did not find CIN 2+ (62.04% FP rate). This difference of 2.39% (95% CI = −0.74% to 5.52%, p = .14, Fisher exact test) corresponds to a relative decrease of biopsied women with a non-CIN 2+ result by 3.72% (p = .134, Miettinen test) in the prospective arm. Although not statistically significant, this result indicates that the increased TP rate in the prospective arm did not compromise the FP rate.
The observed increase in the number of biopsies in the prospective arm compared with the retrospective arm (~21% or approximately one extra biopsy per 5 patients) is lower than the increase in the number of CIN 2+/CIN 3+ cases (31% and 56%, respectively). Due to the close similarity of the population characteristics between the two groups in terms of demographics/referral background (see Table 1) and the large number of subjects, no multivariable analyses were performed. To investigate whether the increased detection could simply be explained by the higher number of biopsies taken in the prospective arm rather than an improved diagnostic ability of colposcopy integrating the new technology to improve biopsy selections, we analyzed the data on the biopsy level.26 From this analysis, we excluded women with CIN 2+ who had multiple biopsies and all samples were reviewed together, because it is unknown how many of these biopsies were CIN 2+. There were five women in the retrospective and one woman in the prospective arm, leaving 1,834 biopsies in the retrospective and 2,330 biopsies in the prospective arm to analyze.
There were 148 biopsies with CIN 2+ in the retrospective and 229 in the prospective arm, and the biopsy-level PPV was 8.07% and 9.83%, respectively. The 1.76% difference in PPV was statistically significant (95% CI = 0% to 3.49%, p = .05, Fisher exact test), corresponding to a relative increase in the prospective arm of 21.8% (95% CI of ratio = 1 to 1.49, p = .05, Miettinen test) and indicating a higher biopsy accuracy with the study device and DSI.
Finally, we calculated the slope that describes the number of biopsies that found CIN 2+ as the number of biopsies taken from a patient increased (see Figure 2). The slope is 0.0681 for the retrospective and 0.1145 for the prospective arm. Their difference, compared with an unpaired two-sided t test, is statistically significant (p = .042). The steeper slope for the prospective arm indicates that as more biopsies were taken, CIN 2+ detection was seen earlier (i.e., requiring fewer biopsies) than in the retrospective arm. The number of biopsies that would need to be added to standard colposcopy practice to reach the number of CIN 2+ found in the prospective arm would have to be significantly higher than the number taken in the prospective arm. Therefore, the increased detection of women with CIN 2+ in the prospective arm cannot be explained solely by the increased number of biopsies, but it is also a result of a higher efficiency/accuracy of biopsy to find CIN 2+.
DISCUSSION
Colposcopy and biopsy with the study device and adjunctive DSI mapping increased the proportion of women detected with CIN 2+ by 2.27 percentage points (1.16 for CIN 3+) compared with standard colposcopy, a relative increase of 31.4% (56.1% for CIN 3+). Analysis by age group suggests that the difference is likely to be clinically important, because it is highest for CIN 3+ in women 30 years and older. Results were achieved with a similar percentage of women undergoing biopsy and a comparable rate of false positives, but with a higher number of biopsies taken per patient and an increased efficiency of biopsy to detect CIN 2+. These findings complement and confirm the conclusions of previous studies, conducted in academic and controlled settings, on increased sensitivity.17,18
This study is one of the largest in terms of the number of participating clinics, colposcopists, and patients and has the advantage that it describes “real-world” colposcopy as practiced in US community-based clinics. In the control arm, data (being retrospective) are not affected by participation in a study and are thus an unbiased representation of standard practice. In the prospective arm, the effect of the technology is demonstrated with its realistic use by colposcopists. Historical controls were favored over a randomized controlled trial, a design whose suitability for diagnostic imaging studies has been questioned,27 because randomization would introduce “cross-talk” and bias into the control arm (colposcopists would contribute to both arms in parallel, and colposcopy depends on their individual diagnostic judgment) and because it would be impractical to execute in community clinics.
The limitations of the observational design are that without additional random biopsies, the absolute sensitivity cannot be calculated because of the underlying verification bias and that the number of biopsies was not controlled. Furthermore, measuring the full diagnostic potential of the technology is not possible because biopsy placement at DSI indications was not forced by protocol and the use of DSI was strictly adjunctive, so that the number of biopsies could only be expected to increase. Histology was not adjudicated, because this was practically impossible; however, because the same laboratories and methodology were used for both arms, and their time spans are relatively close, this should not affect results and conclusions. Real-world practice data were collected because all colposcopists offered routine care in both arms, only with the use of the new digital colposcope in the prospective arm, minimizing the potential for bias due to varying levels of attentiveness between the two arms.
Colposcopic sensitivity is 50% to 60%,5–7,28 biopsy placement is not optimized,14,29 and interobserver agreement is poor.30 Taking multiple biopsies increases detection of disease,10–12,31 but such protocols lead to large numbers of biopsies and poorer specificity, reducing the efficiency of biopsy to detect high-grade CIN and increasing the number of women who receive biopsy unnecessarily. Biopsy is not a trivial intervention for women and carries a substantial risk of after effects, such as pain, discharge or bleeding,13 and increased pathology costs. Furthermore, the direct applicability of academic study findings in “real-world” settings has been questioned.32 The retrospective arm data indicate that despite previous evidence and recommendations to take multiple biopsies per patient, the standard colposcopy practice in US community-based clinics is to perform a biopsy at rates lower than those in academic center trials. Previous studies17,18 demonstrated an increased sensitivity for high-grade CIN with DSI, and the results of this study corroborate this in a “real-world” setting.
A study to directly compare biopsy protocols in academic versus community-based practice, for example, to include multiple directed, random, and DSI-assisted biopsies, is complicated and virtually impossible to execute in community-based clinics. In addition, if the incidence of high-grade disease is significantly lower in a well-screened community population, intensive biopsy protocols may do more harm than good.13 Our data and analyses cannot provide a definite answer on the possible increase in detection if colposcopists added directed/random biopsies to their standard practice rather than used the study device and DSI. However, the efficiency of biopsy in the control arm was significantly lower than that in the prospective arm. Therefore, even if the number of biopsies in standard practice increased to equal the number taken in the prospective arm, as they would have to detect disease of subtler appearance, it is unlikely that they would have a profound effect.
Using the study device, a digital colposcope with adjunctive DSI mapping in a wide “real-world” setting led to an increased efficiency of biopsy to detect high-grade CIN and resulted in identifying significantly more women with CIN 2+ and CIN 3+ when compared with standard practice colposcopy. This impacts on sensitivity, allows a more timely management of women with precancerous lesions, and can be expected to improve cost-effectiveness.33
The authors thank the women who participated in the study and the clinicians who performed the examinations.
REFERENCES
1. Saslow D, Solomon D, Lawson HW, et al. American Cancer Society, American Society for Colposcopy and Cervical Pathology, and American Society for Clinical Pathology screening guidelines for the prevention and early detection of cervical cancer. J Low Genit Tract Dis
2. Schiffman M, Wentzensen N. A suggested approach to simplify and improve cervical screening in the United States. J Low Genit Tract Dis
3. Kinney W, Wright TC, Dinkelspiel HE, et al. Increased cervical cancer risk associated with screening at longer intervals. Obstet Gynecol
4. Kim JJ, Campos NG, Sy S, et al. Inefficiencies and high-value improvements in U.S. cervical cancer screening practice: a cost-effectiveness analysis. Ann Intern Med
5. Massad LS, Collins YC. Strength of correlations between colposcopic impression and biopsy histology. Gynecol Oncol
6. ALTS Group. Results of a randomized trial on the management of cytology interpretations of atypical squamous cells of undetermined significance. Am J Obstet Gynecol
7. ALTS Group. A randomized trial on the management of low-grade squamous intraepithelial lesion cytology interpretations. Am J Obstet Gynecol
8. Jeronimo J, Schiffman M. Colposcopy at a crossroads. Am J Obstet Gynecol
9. ACOG. Practice Bulletin No. 140: management of abnormal cervical cancer screening test results and cervical cancer precursors. Obstet Gynecol
10. Wentzensen N, Walker JL, Gold MA, et al. Multiple biopsies and detection of cervical cancer precursors at colposcopy. J Clin Oncol
11. Pretorius RG, Belinson JL, Azizi F, et al. Utility of random cervical biopsy and endocervical curettage in a low-risk population. J Low Genit Tract Dis
12. Huh WK, Sideri M, Stoler M, et al. Relevance of random biopsy at the transformation zone when colposcopy is negative. Obstet Gynecol
13. The Tombola Group. After-effects reported by women following colposcopy, cervical biopsies and LLETZ: results from the TOMBOLA trial. BJOG
14. Stoler MH, Vichnin MD, Ferenczy A, et al. The accuracy of colposcopic biopsy: analyses from the placebo arm of the Gardasil clinical trials. Int J Cancer
15. Stoler MH, Schiffman M. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL triage study. JAMA
16. van der Marel J, van Baars R, Quint WG, et al. The impact of human papillomavirus genotype on colposcopic appearance: a cross-sectional analysis. BJOG
17. Soutter WP, Diakomanolis E, Lyons D, et al. Dynamic spectral imaging: improving colposcopy. Clin Cancer Res
18. Louwers JA, Zaal A, Kocken M, et al. Dynamic spectral imaging colposcopy: higher sensitivity for detection of premalignant cervical lesions. BJOG
19. Coronado PJ, Fasero M. Colposcopy combined with dynamic spectral imaging. A prospective clinical study. Eur J Obstet Gynecol Reprod Biol
20. Kinney WK, Manos MM, Hurley LB, et al. Where's the high-grade cervical neoplasia? The importance of minimally abnormal Papanicolaou diagnoses. Obstet Gynecol
21. Huh WK, Ault KA, Chelmow D, et al. Use of primary high-risk human papillomavirus testing for cervical cancer screening: interim clinical guidance. Gynecol Oncol
22. Massad LS, Einstein MH, Huh WK, et al. 2012 updated consensus guidelines for the management of abnormal cervical cancer screening tests and cancer precursors. Obstet Gynecol
23. Schiffman M, Rodríguez AC. Heterogeneity in CIN3 diagnosis. Lancet Oncol
24. Castle PE, Schiffman M, Wheeler CM, et al. Evidence for frequent regression of cervical intraepithelial neoplasia-grade 2. Obstet Gynecol
25. Castle PE, Stoler MH, Wright TC Jr, et al. Performance of carcinogenic human papillomavirus (HPV) testing and HPV16 or HPV18 genotyping for cervical cancer screening of women aged 25 years and older: a subanalysis of the ATHENA study. Lancet Oncol
26. Alvarez RD, Wright TC. Effective cervical neoplasia detection with a novel optical detection system: a randomized trial. Gynecol Oncol
27. Valk PE. Randomized controlled trials are not appropriate for imaging technology evaluation. J Nucl Med
28. Massad LS, Jeronimo J, Katki HA, et al; National Institutes of Health/American Society for Colposcopy and Cervical Pathology Research Group. The accuracy of colposcopic grading for detection of high-grade cervical intraepithelial neoplasia. J Low Genit Tract Dis
29. Massad LS, Halperin CJ, Bitterman P. Correlation between colposcopically directed biopsy and cervical loop excision. Gynecol Oncol
30. Ferris DG, Litaker M. Interobserver agreement for colposcopy quality control using digitized colposcopic images during the ALTS trial. J Low Genit Tract Dis
31. Gage JC, Hanson VW, Abbey K, et al. Number of cervical biopsies and sensitivity of colposcopy. Obstet Gynecol
32. Monk BJ, Brewster WR. Does the ALTS trial apply to the community-based practitioner? Am J Obstet Gynecol
33. Wade R, Spackman E, Corbett M, et al. Adjunctive colposcopy technologies for examination of the uterine cervix: DySIS, LuViva Advanced Cervical Scan and Niris Imaging System: a systematic review and economic evaluation. Health Technol Assess