Evaluation of 3 Approaches for Assessing Adherence to Vaginal Gel Application in Clinical Trials

Original Study

Background Accurate measurement of adherence to product use is an ongoing challenge in microbicide trials.

Methods We compared adherence estimates using 2 applicator tests (a dye stain assay [DSA] and an ultraviolet light assay [UVA]), the Wisebag (an applicator container that electronically tracks container openings), and self-reported adherence (ability, frequency, and percent missed doses). Healthy, HIV-negative, nonpregnant US women aged 23 to 45 years received a Wisebag and 32 applicators filled with placebo gel, were instructed to insert 1 applicator daily for 30 days, returned the Wisebag and all applicators, and completed an exit interview. Emptied applicators were tested by UVA and then DSA and scored by 2 blinded readers. Positive and negative controls were randomly included in applicator batches.

Results Among 42 women enrolled, 39 completed the study. Both DSA and UVA yielded similar sensitivity (97% and 95%) and specificity (79% and 79%). Two participants had fully inoperable Wisebags, and 9 had partially inoperable Wisebags. The proportion of participants considered to have high adherence (≥80%) varied: 43% (Wisebag), 46% (UVA), 49% (DSA), and 62% to 82% (self-reports). For estimating high adherence, Wisebag had a sensitivity of 76% (95% confidence interval, 50%–93%) and a specificity of 85% (95% confidence interval, 62%–97%) compared with DSA. Although 28% of participants reported forgetting to open the Wisebag daily, 59% said that it helped them remember gel use.

Conclusions Dye stain assay and UVA performed similarly. Compared with these tests, self-reports overestimated and Wisebag underestimated adherence. Although Wisebag may encourage gel use, the applicator tests currently seem more useful for measuring use in clinical trials.

A study assessing methods of vaginal gel application adherence found that the Wisebag, an electronic adherence monitor, encouraged gel use, but the 2 applicator tests were more accurate for measuring use in clinical trials.

Accurately measuring adherence to study products is an ongoing challenge in microbicide trials. The Centre for the AIDS Programme of Research in South Africa (CAPRISA) 004 trial, which tested 1% tenofovir gel used pericoitally, is the only trial to date to demonstrate a significant, albeit modest, protective effect against HIV acquisition.1 Other trials reported no effect or futility, likely because of low adherence.2 For instance, in the recently completed Vaginal and Oral Interventions to Control the Epidemic (VOICE) trial, which assessed daily application of 1% tenofovir gel alongside daily oral preexposure prophylaxis (tenofovir and emtricitabine/tenofovir), retrospectively measured plasma drug levels indicated low adherence.3

Self-reports typically overestimate adherence, yet reliable and accurate measures to assess microbicide use are limited.4,5 Measuring drug levels can be useful but has limitations (eg, cost and burden on participants and clinic staff, only possible with products that are systemically absorbed).5 Consequently, alternative measures are critical for the conduct and interpretation of efficacy in clinical trials.

Several tests have been developed based on the presence of vaginal mucins on the surface of applicators.6–9 The dye stain assay (DSA) uses a food dye to detect mucus on the applicator after vaginal insertion; it has shown variable sensitivity (81%–100%) and specificity (41%–94%) by dosing regimen and applicator type.7–10 A test using ultraviolet light fluorescence (ultraviolet assay [UVA]) to examine daily vaginal insertion of hydroxyethylcellulose (HEC)-filled polypropylene applicators showed 84% sensitivity and 83% specificity.9 Both assays are inexpensive, can be implemented onsite with active and placebo products, and can determine vaginal insertion. However, neither test is anatomically specific; consequently, they cannot definitively establish whether gel was expelled into the vagina, the timing of use, or the amount of product used.7

An alternative approach is the Wisebag, an electronic adherence monitor (or events monitoring system [EMS]) that transmits data via cellular networks. The Wisebag is a lunch-bag style container designed for applicator storage that is equipped with a self-contained battery-operated electronic device with a chip that generates an electronic signal each time the bag is opened (“opening event”). Users are instructed to open the Wisebag solely to retrieve gel applicators, and each opening event is intended to provide individual-level data on applicator use in real time. Wisepill, a more common version of this EMS for monitoring pill use, has been validated with viral load in HIV treatment settings.11,12 The Wisebag assesses an “adherence-execution behavior” (opening the bag to remove an applicator), but it does not indicate whether the applicator was taken out or inserted, or if gel was expelled vaginally. To date, the Wisebag has been piloted in 2 studies in South Africa,10,13 but it has not been validated against objective (ie, respondent-independent) measures.

This study assessed the Wisebag for measuring adherence, compared with applicator tests and self-report.

Site and Population

The study was conducted at the Albert Einstein College of Medicine and Montefiore Medical Center (Einstein-Montefiore), Bronx, NY, to compare the DSA, UVA, and Wisebag methods of measuring adherence to daily vaginal application of HEC placebo gel. Dye stain assay was the referent method because it had been validated and used in several prior studies.7,14–16

Volunteers were recruited from the patient, student, and faculty populations at Einstein-Montefiore and from the surrounding community. Eligible participants were healthy, HIV negative, nonpregnant women aged 22 to 45 years who, if sexually active, were using an effective, nonbarrier contraceptive method.

The protocol was approved by the institutional review boards of Einstein-Montefiore and the Population Council (New York, NY). Before undergoing any procedures, all participants provided written informed consent according to ethical standards for human subjects research.

Design and Procedures

Eligibility was determined by medical history (including review of contraceptive methods and sexually transmitted infection history), pelvic examination, a urine pregnancy test, and a rapid oral HIV test or serum HIV enzyme-linked immunosorbent assay.

At enrollment, participants were issued a Wisebag and 32 Microlax-type low-density polyethylene vaginal applicators (Tectubes, Åstorp, Sweden). Applicators were filled with 7 mL of 2.7% HEC placebo gel17 (Clean Chemical Sweden AB, Borlange, Sweden) and delivered approximately 3.5 mL of gel. After a demonstration, participants inserted their first applicator witnessed by a clinic staff member; this applicator served as the participant’s positive control for DSA and UVA validation. Participants were instructed to insert 1 applicator daily for 30 days, at approximately the same time each day, and to return all used and unused applicators. Participants were instructed to open the Wisebag once daily, only for the purpose of retrieving an applicator. A study staff member explained that each time the Wisebag was opened, it would trigger a wireless signal (ie, opening event) that would be sent to a database. Participants were shown how to open and close the Wisebag, which generated the first opening event and served as each participant’s positive control. Participants were told that applicators returned opened would be tested to determine use and compared with Wisebag opening events. Participants were issued prelabeled plastic bags to store used applicators, tips of cotton swabs to plug used applicators to prevent leakage, and a diary card that served as a memory and accountability tool to record product use and problems. At study exit, staff reviewed the diary card with the participant, counted empty and unused applicators, and clarified any discrepancies between applicator counts and the diary card. Adverse events or changes in participants’ health since enrollment were documented, and a brief, structured questionnaire was administered to capture women’s experiences using the gel and the Wisebag.

Applicator Test Procedures

Returned empty applicators, positive controls, and negative controls (generated in the laboratory by staff dispensing gel ex vivo) were randomly placed into a test rack by 1 researcher and then evaluated and scored independently as positive or negative, first by UVA and then by DSA, by 2 other researchers in a blinded fashion. The UVA does not interfere with the performance of the DSA because the only manipulation required is the placement of the applicators on a tray for viewing under the UV light box. Ultraviolet light assay testing was optimized, using previously described methods,9 with the use of a single 60 × 385-nm ultraviolet wide-angle LED bulb. Applicators with a streaked pattern of blue-green fluorescence were scored as “positive.” Applicators tested with DSA were scored as positive if they exhibited the characteristic grainy/streaked pattern.7 Applicators were classified as “indeterminate” if there was discordance between the 2 independent readers. The cost per participant in materials and supplies was estimated at $177 for Wisebag (bags and device plus recurrent SIM and data hosting costs), $14 for UVA (primarily the ultraviolet viewing box), and $5 for DSA (primarily the food dye). Labor costs were not calculated.
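The two-reader scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the study's software: the function name `consensus_score` and the example rack are hypothetical.

```python
from typing import List, Tuple


def consensus_score(reader_a: str, reader_b: str) -> str:
    """Return the shared score, or 'indeterminate' when the 2 readers disagree."""
    return reader_a if reader_a == reader_b else "indeterminate"


# Hypothetical rack of 4 applicators, each scored independently by 2 blinded readers.
rack: List[Tuple[str, str]] = [
    ("positive", "positive"),
    ("negative", "negative"),
    ("positive", "negative"),
    ("negative", "positive"),
]

results = [consensus_score(a, b) for a, b in rack]
print(results)
```

The same rule would be applied separately to the UVA reads and to the DSA reads of each applicator.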

Sample Size and Data Analysis

Sample Size

Assuming 60% of participants would achieve high (≥80%) adherence and a 90% sensitivity of the Wisebag compared with DSA, a sample size of 40 was sufficient to achieve a 95% confidence interval (CI) around the estimate of sensitivity with a half-width of 0.15 or less with a probability of 0.91.
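The calculation above can be approximated with a short Python sketch. Two assumptions are made here that the text does not state: the number of highly adherent participants is fixed at its expected value (60% of 40, ie, 24), and the CI is a normal-approximation (Wald) interval. The function name `prob_halfwidth_within` is hypothetical.

```python
from math import comb, sqrt


def prob_halfwidth_within(n: int, p: float, max_halfwidth: float, z: float = 1.96) -> float:
    """Probability that a Wald CI for a binomial proportion has half-width <= max_halfwidth.

    Sums the binomial probability of every outcome x (out of n) whose
    observed proportion x/n yields a sufficiently narrow interval.
    """
    total = 0.0
    for x in range(n + 1):
        p_hat = x / n
        halfwidth = z * sqrt(p_hat * (1 - p_hat) / n)
        if halfwidth <= max_halfwidth:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


n_high = round(0.60 * 40)  # assumed: 24 highly adherent participants per DSA
prob = prob_halfwidth_within(n_high, p=0.90, max_halfwidth=0.15)
print(f"{prob:.3f}")
```

Under these assumptions the result lands close to the stated probability of 0.91. A different interval method (eg, an exact interval) would give a different value, so the sketch should not be read as the authors' exact procedure.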

Adherence Measures

The sensitivity and specificity of the applicator tests were calculated using the positive and negative controls. Returned emptied applicators were tested and determined to be positive, negative, or indeterminate. For UVA and DSA, adherence was assessed by dividing days with positive applicator tests by total expected days of use; unreturned applicators were conservatively considered unused. For Wisebag, adherence was assessed by calculating the percentage of operable days on which at least 1 opening event occurred. High adherence was defined as an adherence level of 80% or greater; low adherence was defined as an adherence level of 20% or less. Sensitivity of Wisebag compared with DSA was defined as the proportion of participants highly adherent per DSA, who were also highly adherent per Wisebag. Specificity was similarly defined for participants with low adherence by DSA.
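As an illustration of these definitions, the following Python sketch computes per-participant adherence, applies the 80% and 20% cutoffs, and compares high-adherence classifications between a test measure and the DSA referent. All function names and data are hypothetical. Specificity is computed here among participants who were not highly adherent per DSA, which is one reading of the definition above and is an assumption.

```python
from typing import List, Tuple

HIGH_CUTOFF = 0.80  # high adherence: 80% or greater
LOW_CUTOFF = 0.20   # low adherence: 20% or less


def applicator_adherence(positive_days: int, expected_days: int) -> float:
    """Days with a positive applicator test divided by total expected days of use.

    Unreturned applicators never add to positive_days, so they count as unused.
    """
    return positive_days / expected_days


def wisebag_adherence(days_with_opening: int, operable_days: int) -> float:
    """Share of operable days on which at least 1 opening event occurred."""
    return days_with_opening / operable_days


def classify(adherence: float) -> str:
    if adherence >= HIGH_CUTOFF:
        return "high"
    if adherence <= LOW_CUTOFF:
        return "low"
    return "intermediate"


def sensitivity_specificity(referent_high: List[bool], test_high: List[bool]) -> Tuple[float, float]:
    """Sensitivity and specificity of the test's 'high adherence' call vs the referent."""
    pairs = list(zip(referent_high, test_high))
    tp = sum(r and t for r, t in pairs)
    fn = sum(r and not t for r, t in pairs)
    tn = sum(not r and not t for r, t in pairs)
    fp = sum(not r and t for r, t in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical adherence values for 5 participants.
dsa = [0.90, 0.85, 0.50, 0.10, 0.95]
wisebag = [0.83, 0.60, 0.40, 0.07, 0.90]
sens, spec = sensitivity_specificity(
    [classify(a) == "high" for a in dsa],
    [classify(a) == "high" for a in wisebag],
)
print(sens, spec)
```

In this toy example, 2 of the 3 participants who were highly adherent per DSA were also highly adherent per Wisebag, and neither of the remaining 2 was classified as highly adherent by Wisebag.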

Self-reported adherence was recorded at exit and included a retrospective assessment of the number of days with missed doses (subsequently converted to the percentage of days with no missed doses, “percent dose taken”) and 2 categorical measures: a validated 6-point scale rating the ability to insert gel as instructed (rating)18 and a 6-point Likert scale asking how often gel was inserted (frequency). The 6 categories of each scale were assigned scores of 0% to 100% adherence, following Lu et al.18 Associations among all 6 adherence measures were assessed with Spearman rank correlation coefficients.
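A minimal Python sketch of the scale conversion and rank correlation follows. The equally spaced mapping of the 6 categories (0%, 20%, ..., 100%) is an assumption, because the text gives only the end points; the names `CATEGORY_TO_PERCENT`, `average_ranks`, and `spearman`, and all data, are hypothetical. The study itself used SAS, not this code.

```python
from math import sqrt
from typing import List, Sequence

# Assumed equal spacing of the 6 response categories (0 = lowest, 5 = highest).
CATEGORY_TO_PERCENT = {0: 0, 1: 20, 2: 40, 3: 60, 4: 80, 5: 100}


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: self-rated category and DSA adherence for 5 participants.
rating_pct = [CATEGORY_TO_PERCENT[c] for c in [5, 4, 5, 2, 3]]
dsa = [0.97, 0.77, 0.90, 0.10, 0.50]
rho = spearman(rating_pct, dsa)
print(round(rho, 3))
```

Because categorical self-reports produce many ties, the sketch ranks ties by their average position rather than using the shortcut formula that assumes distinct ranks.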

All analyses were conducted using SAS v9.3 (SAS Institute, Cary, NC).

Participant Characteristics

Between May and December 2012, 51 HIV-negative, low-risk, monogamous women were screened, 42 women enrolled, and 39 completed the study (93%), constituting the analytical sample (Fig. 1). The average age of participants was 34 years, 44% self-identified as black/African, all completed high school, 36% were married or living with a partner, and 79% were sexually active and in a monogamous relationship (Table 1).



Figure 1

Performance of the Applicator Tests

The 39 positive control applicators (inserted at the first clinic visit) and 43 negative control applicators (emptied ex vivo) yielded similar results for DSA (sensitivity 97%, specificity 79%) and UVA (sensitivity 95%, specificity 79%; Table 2). The average accuracy of the assays across readers was the same (92%); by reader, sensitivity ranged from 96% to 100% for DSA and 88% to 100% for UVA, and specificity ranged from 80% to 92% for DSA and 80% to 90% for UVA. Readers contributed equally to false positives and false negatives.



Among the applicators distributed, 4% (n = 49) were never returned, 21% (n = 244) were returned unused, and 75% (n = 872) were returned emptied (corresponding to a median of 83% emptied applicators per participant). Of the 872 emptied applicators, 90% were positive, 6% were negative, and 5% were indeterminate according to DSA (Table 2). Indeterminate applicators were removed from the subsequent analyses when DSA was treated as the referent. The proportions were similar with UVA (88% positive, 4% negative, 8% indeterminate).

Performance and Acceptability of Wisebag

Among the 39 participants who completed the study, 36 (92%) had detectable Wisebag opening events (positive controls) and 3 received inoperable devices; 2 of these devices never worked, and the corresponding participants were removed from all Wisebag analyses (Fig. 1). Among all 39 participants, 28 (72%) had a Wisebag that was always functional and 9 (23%) had devices that were functional during part of the study (mean, 73% of days; range, 48%–90%). On 63% of the days with functional Wisebag devices, at least 1 opening event was recorded (median, 1; range, 1–10); on 89% of these days, exactly 1 opening was recorded.

Overall, participants reported that the Wisebag was acceptable (Table 3), although a majority reported not opening it every day and 11 (28%) participants indicated it was difficult to remember to open it daily. The 3 most cited reasons for not opening the Wisebag were forgetting (54%), traveling (33%), and not returning home or returning late (21%). Most participants (90%) reported opening the Wisebag only to retrieve applicators, although 15% reported they had retrieved more than 1 applicator at a time (termed “pocket dosing”) because they could not or did not want to carry the Wisebag with them. Two participants reported that a spouse or a child opened the Wisebag, and 10% reported extra openings without retrieving an applicator (mostly to retrieve the diary card). More than half (59%) reported that the Wisebag served as a visual cue, helping them to remember to apply gel daily. Although a majority stated they could use the Wisebag in a future microbicide study, 18% said that they would not use it because it was impractical or lacked discretion.



Adherence Level and Concordance Between Measures

Participants’ adherence was assessed by DSA, UVA, Wisebag, and self-report during exit interviews. The proportion of participants achieving high adherence (≥80%) varied by measure: 43% per Wisebag (16/37), 46% for UVA (18/39), 49% for DSA (19/39), and 62% (24/39) to 82% (32/39) for self-report. Fewer than 50% of participants were classified as high adherers by any of the 3 objective measures. For estimating high adherence, Wisebag had a sensitivity of 76% (95% CI, 50%–93%) and a specificity of 85% (95% CI, 62%–97%) compared with DSA. For estimating low adherence (≤20%), similar variations were noted across measures, with 0% to 5% of participants classified as low adherers by self-report, 11% per Wisebag, 8% per DSA, and 10% per UVA (Fig. 2A).
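The confidence intervals for Wisebag sensitivity and specificity can be illustrated with an exact (Clopper-Pearson) binomial interval in Python. Two assumptions are made: the interval method is not stated in the text, and the counts used below (13 of 17 and 17 of 20) are not reported; they are chosen only because they are consistent with the reported percentages among the 37 participants with Wisebag data. The function names are hypothetical.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func: Callable[[float], float], target: float, increasing: bool) -> float:
    """Solve func(p) = target for p in [0, 1], given that func is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided CI for a binomial proportion x/n."""
    # Lower limit: the p at which P(X >= x) equals alpha/2 (increasing in p).
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    # Upper limit: the p at which P(X <= x) equals alpha/2 (decreasing in p).
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


sens_ci = clopper_pearson(13, 17)  # assumed counts behind the 76% sensitivity
spec_ci = clopper_pearson(17, 20)  # assumed counts behind the 85% specificity
print([round(v, 2) for v in sens_ci], [round(v, 2) for v in spec_ci])
```

With these assumed counts, the intervals come out close to the reported 50%–93% and 62%–97%. Their width reflects how little precision a sample of this size offers for either estimate.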

Figure 2

Similar patterns were noted when adherence was estimated as a continuous measure. Median adherence was 77% by both DSA (range, 0%–100%) and UVA (range, 0%–97%), and 67% by Wisebag (range, 7%–100%; Fig. 2B). By self-reports, median adherence was “very good” for the rating scale, “always” for the frequency scale, and 90% for percent dose taken (Fig. 2B). All measures were significantly correlated: DSA and UVA had the highest correlation (0.92), whereas Wisebag was moderately correlated with DSA (0.58) and UVA (0.59; Table 4).



In this 30-day study of daily placebo gel use with polyethylene applicators, conducted among healthy, sexually active, HIV-negative low-risk US women, the Wisebag was compared with 2 applicator tests and self-reported adherence measures. Similar sensitivity and specificity were found between the 2 applicator assays, within the range of studies testing polyethylene or polypropylene applicators.7–9,19–21 The sensitivity with UVA (calculated with positive control applicators at the clinic without prior gel application) was higher in this study (95%) than the 65% sensitivity previously reported with first-time use of gel delivered by HTI polypropylene applicators.9 This difference may be attributable to the optimization of the UV source or to the different type of applicators.

Both DSA and UVA are simple and inexpensive to implement. Moreover, as previously reported,9 UVA is faster because it does not require 5 hours for drying after staining. With UVA, it would be feasible to give immediate adherence feedback to participants during study visits. In contrast, the time lag between staining and reading with DSA may limit the opportunity for immediate feedback to participants. Of note, neither test is anatomically specific; consequently, they cannot definitively establish whether the applicator was inserted into the vagina.

Disagreement between readers, especially for negative control applicators, led to a higher-than-expected number of indeterminate applicators (14%, DSA; 16%, UVA). No pattern of disagreement was found between readers for the same assay or between assays for the same applicator. One way to reduce the number of indeterminate results may be to test all applicators with UVA first and then apply DSA only to applicators found to be indeterminate by UVA. When this approach was tested with control applicators, it increased the combined accuracy of the applicator tests to 100% sensitivity and 91% specificity. Both DSA and UVA are promising methods to measure applicator insertion in clinical trials. Consequently, they should be further evaluated in parallel or in combination for pericoital and daily regimens with polypropylene applicators and for rectal insertion.
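The sequential approach described above (UVA for every applicator, DSA only as a tiebreaker) amounts to a simple decision rule. The Python sketch below illustrates it with hypothetical consensus results; the function name `combined_score` is not from the study.

```python
def combined_score(uva_result: str, dsa_result: str) -> str:
    """Keep the UVA consensus result unless it is indeterminate; then use DSA."""
    return dsa_result if uva_result == "indeterminate" else uva_result


# Hypothetical consensus results (UVA, DSA) for 4 control applicators.
controls = [
    ("positive", "positive"),
    ("indeterminate", "positive"),
    ("indeterminate", "negative"),
    ("negative", "indeterminate"),
]

final = [combined_score(uva, dsa) for uva, dsa in controls]
print(final)
```

An applicator stays indeterminate under this rule only when both assays are indeterminate, which is why the combination can reduce the number of unresolved results.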

The Wisebag was deployed to electronically monitor applicator use during study follow-up, using only the “passive” monitoring feature of the system (opening events sent in real time to a database). The Wisebag system can also send Short Message Service reminders when no opening event is detected,10,12 although that feature was not used because the aim of this study was to compare different measures of adherence, not to evaluate adherence interventions. Nevertheless, more than half of the participants said the Wisebag served as a visual cue reminding them to use the gel. Events monitoring systems may function as adherence-enhancing tools, although without reminders, their effects seem to diminish over time.22,23 Consequently, the Wisebag Short Message Service reminder for active monitoring should be further evaluated because forgetting is a commonly cited reason for microbicide nonadherence24 and was the most common reason for not opening the Wisebag in this study.

Compared with DSA, Wisebag provided moderate sensitivity and specificity to estimate the proportion of high adherers during the study. Several technical difficulties occurred with the system, including premature device battery depletion suspected to be caused by 3G versus 4G network incompatibility, and these further limited study power for comparing adherence measures in this pilot study. This unexpected problem was communicated to the manufacturer, which has since made several upgrades (L. Marshall, written communication, December 2012). Aside from cost, other challenges with the Wisebag, as reported by users, include bulkiness, impracticality, and low portability. Different container designs may help address these issues.

Fewer than 50% of participants were high adherers by any of the 3 objective measures; with DSA as the referent, adherence by UVA was virtually the same. Two of the 3 self-reported measures (frequency and percent dose taken) overestimated use compared with DSA, whereas rating provided the closest estimate of adherence to DSA, which is consistent with the reported performance of this questionnaire item in studies of HIV treatment adherence.18,25,26 Correlation between the Wisebag and the other adherence measures was moderate, similar to previous reports of EMS.12,27,28 Wisebag underestimated adherence compared with the applicator tests, a known issue with EMS typically caused by pocket dosing.29 Here, retrieving more than 1 applicator at a time from the Wisebag, as well as extra openings by participants or by family members, was reported by a small minority of participants; both behaviors bias estimates of product use. Other studies have used composite scores or algorithms to correct for misestimation of adherence with EMS.22,28,29 Wisebag or other EMS can complement use of an applicator test to generate a composite adherence measure.9

This study has several limitations. First, this was a short study with a small sample size. Because 2 Wisebag devices were completely inoperable, the overall sample with any Wisebag measures was further reduced to 37. Second, the study used a placebo product; consequently, the results might not be generalizable to a longer study or one with an active product. Third, the Wisebag was not used to its full potential (real-time active monitoring) because this study focused on adherence measurement, not on optimizing adherence in the study. Therefore, the potential benefit of the Wisebag was not fully explored.

In summary, the DSA and UVA performed similarly and should be considered as objective measures of product use in future microbicide gel trials. In addition, they are inexpensive and relatively simple to implement on-site without risk of unblinding. To increase test accuracy, combining UVA followed by DSA as a tiebreaker may be optimal but will require further study. Assessing the feasibility of UVA and the combination of both tests in low-resource settings would be important. Technical challenges experienced with the Wisebag should be reevaluated with the upgraded devices. Furthermore, the design of more discreet and portable containers may facilitate use by participants. Although Wisebag alone may not be optimal for accurate adherence measurement, further study of this or similar EMS should be pursued to evaluate its potential in concert with other measures and for promoting adherence through its active mode with reminder messaging and real-time alerts to investigators to intervene.

