THE DESIGN OF CONDOM EFFECTIVENESS studies1,2 usually hinges on the users’ self-report of their experience with condom use during follow-up. Measures of condom use and condom use problems (e.g., mechanical failure) that are based on self-report vary considerably in quality and level of detail. These differences may explain why estimates of the protective effect of condom use vary across studies.3–5
Employment of objective biomarkers of condom use and failure6–12 may strengthen the validity of intervention research in the area of STD/HIV prevention. We have developed a technique based on detecting prostate-specific antigen (PSA) in vaginal fluid as an objective marker of exposure to semen6,7 and used it in condom effectiveness studies.8,9,12 This technique provides an objective counterpart to self-reported condom failure.
In this article, we analyze data from 2 clinical trials to address the following research question: “Does the frequency of self-reported problems and rate of PSA detection vary across population groups?” We compare the rates of self-reported user problems and the rates of semen exposure as measured by PSA during use of the female condom (FC) and the male condom (MC). We use data from 2 studies with similar design, data collection, and laboratory procedures that were conducted in the United States of America (USA) and in Brazil.
Materials and Methods
The University of Alabama at Birmingham (UAB) study was a randomized crossover trial of FC and MC use conducted at a reproductive health outpatient clinic in Birmingham, AL. The eligible participants were women who were aged 19 years or older, were in a mutually monogamous relationship, had had vaginal intercourse on 4 or more occasions during the past 30 days, and had no recent history of STDs. At baseline, women provided informed consent, participated in an interview, were randomly assigned to begin the study using either 10 FCs (n = 55) or 10 MCs (n = 53) (latex, prelubricated), and were given a brief motivational intervention and instruction (with anatomical models) on the correct use of the first set of 10 condoms. They were instructed to fill out a brief questionnaire after the use of each condom and to return the questionnaire, the used condom, and the precoital and postcoital samples. After using the first 10 condoms, the participants returned for a follow-up visit, completed a questionnaire, and received 10 condoms of the second type as well as instructions for correct usage. When the participants had used all of the condoms, they returned to the study clinic for an exit interview.
The Universidade Estadual de Campinas (UNICAMP) study12 was also a randomized crossover trial of FC versus MC use among 400 women attending the family planning clinic at the Universidade Estadual de Campinas in the state of São Paulo, Brazil. The design, eligibility criteria, and data-collection procedures were very similar to those described above for the UAB study. In the UNICAMP study, participants were randomly assigned to receive either in-clinic educational instruction on FC and MC use or the recommendation to read the condom package inserts, depending on the day that the participants attended the first study visit. Within each educational arm, the participants were randomly assigned to begin the study using 2 FCs or 2 MCs. Once a participant was assigned to an education arm (in clinic vs. package insert), the same educational strategy was maintained for both FC and MC (i.e., there was no crossover between education arms). After a participant had used the first 2 condoms, she came back to the clinic to return her vaginal samples, met briefly with the nurse to review her experience, and received 2 condoms of the other type, self-sampling kits, questionnaires, and written instructions. After testing the second set of condoms, the participant returned to the clinic for a second and final follow-up visit, in which she reviewed her experience and condom data forms and participated in a brief exit interview.
Self-Report of Condom Use Problems and Self-Sampling
Participants filled out a questionnaire after each condom use. The instruments used in the UNICAMP and the UAB studies were similar, with the UAB study instrument representing a refinement of the form developed for the UNICAMP study. The participants in both of the studies were trained by a nurse in collecting precoital and postcoital samples of vaginal fluid using a gynecologic swab protected by a cardboard tampon tube.6 The training session emphasized the high sensitivity of the test for semen and that it was imperative that the tip of the swab be kept inside the cardboard tube until inserted and then retracted into the tube before removing the sampling device from the vagina. In the UAB study, the participants were instructed to take a vaginal sample immediately before and after intercourse, to place the swabs in prelabeled bags with desiccant, and to return them to the study clinic on the next business day. Because PSA is stable in dry specimens, lack of compliance with this requirement was not a threat to the validity of the study, and adherence to this requirement was not closely monitored. In the UNICAMP study, the participants took 2 swabs before and 2 swabs after each condom use, followed identical sampling and storage instructions, and returned to the study clinic at the time of the next interview.
The dried swab samples were stored at room temperature and then extracted in 0.9% saline and frozen. Eluents were later thawed and tested with a PSA immunoassay. The UAB study used the IMx immunoassay (Abbott Laboratories, Abbott Park, IL), which laboratory studies have shown to be a sensitive marker for semen in vaginal specimens.6,7 The UNICAMP study employed the Immulite chemiluminescent immunoassay (Diagnostic Products Corporation, Los Angeles, CA). In a pilot study, 74 samples from the UNICAMP study were tested using both immunoassays, and the results showed a high degree of agreement.12 The range of detection was 0.01 to 100 ng/mL for IMx and 0.01 to 150 ng/mL for Immulite. Both assays have very high sensitivity, nearly 100% specificity, and small measurement error (published coefficients of variation of about 5%). Because the UNICAMP and UAB studies differed slightly in sampling and laboratory procedures (number of swabs and volume of diluent used in the extraction process), a correction factor of 5:9 was used to adjust the UNICAMP PSA values to the level that would have been observed using the UAB procedures.
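As an illustration of the adjustment described above, the following minimal sketch rescales a UNICAMP PSA value. It assumes that the 5:9 correction means multiplying each UNICAMP value by 5/9; the text does not spell out the formula, and the function name is hypothetical.

```python
# Assumption: the "5:9" correction is read as multiplying each UNICAMP
# PSA value by 5/9. The source does not state the formula explicitly.
CORRECTION_NUMERATOR = 5.0
CORRECTION_DENOMINATOR = 9.0


def adjust_unicamp_psa(psa_ng_ml: float) -> float:
    """Rescale a UNICAMP PSA value (ng/mL) toward the UAB procedure scale."""
    if psa_ng_ml < 0:
        raise ValueError("PSA concentration cannot be negative")
    return psa_ng_ml * CORRECTION_NUMERATOR / CORRECTION_DENOMINATOR


if __name__ == "__main__":
    # Illustrative raw values only; these are not study data.
    for raw in (1.8, 39.6, 180.0):
        adjusted = adjust_unicamp_psa(raw)
        print(f"raw {raw:6.1f} ng/mL -> adjusted {adjusted:6.1f} ng/mL")
```

Under this reading, a sensitivity analysis on the uncorrected values amounts to skipping the call and using the raw concentrations directly.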
All postcoital samples were tested for PSA. Postcoital samples with a PSA value ≤1 ng/mL were considered as negative (no exposure to semen6,7). In these instances, we did not test the precoital sample. If the postcoital PSA value was >1 ng/mL, we tested the precoital sample to assess whether the woman had been exposed to semen before using the present condom. If the precoital sample was PSA-positive (≥1 ng/mL), we concluded that exposure to semen could have occurred before the present condom use and excluded both pre- and postcoital results from the analysis (Refs. 8,12, and Macaluso et al., unpublished data). We classified informative results into 4 categories according to the postcoital PSA value: (1) nonexposed (≤1 ng/mL); (2) low (>1 ng/mL but <22 ng/mL, may be due to low exposure or random variability in self-sampling)8; (3) moderate (22–99 ng/mL); and (4) high (≥100 ng/mL). Under these self-sampling and assay conditions, PSA results ≤1 ng/mL were incompatible with recent exposure to semen.6,7 Also, the Abbott assay system employed in the UAB study has a ceiling of detection of 100 ng/mL. Finally, postcoital values lower than 22 ng/mL may be due to low semen exposure levels but may also be due to random variability in the difference between pre- and postcoital levels (i.e., by chance alone variability in self-sampling may lead to a negative precoital sample while the postcoital sample is in the 1–22 ng/mL range), whereas postcoital PSA levels above 22 ng/mL following precoital levels of less than 1 ng/mL are statistically incompatible with self-sampling variability (in a previous study, 22 ng/mL was the 95th percentile of the difference between any 2 samples taken in rapid succession by the same woman 24 hours after exposure to 1 mL of semen).7
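The classification rules above can be summarized in a short sketch. The function name and the category labels are hypothetical, and values between 99 and 100 ng/mL are assigned to the moderate category here, which is an assumption because the text gives the moderate range as 22–99 ng/mL.

```python
from typing import Optional


def classify_psa(postcoital: float, precoital: Optional[float] = None) -> str:
    """Classify one condom use from its PSA results (ng/mL).

    Returns one of: "nonexposed", "low", "moderate", "high", "excluded".
    """
    # A postcoital value of 1 ng/mL or less is negative; in that case the
    # precoital sample was not tested at all.
    if postcoital <= 1.0:
        return "nonexposed"

    # A positive postcoital result is only informative if the precoital
    # sample shows no exposure before this condom use.
    if precoital is None:
        raise ValueError("precoital PSA is required when postcoital PSA > 1 ng/mL")
    if precoital >= 1.0:
        return "excluded"

    if postcoital < 22.0:
        return "low"       # possibly self-sampling variability
    if postcoital < 100.0:
        return "moderate"  # assumption: 99-100 ng/mL counted as moderate
    return "high"


if __name__ == "__main__":
    # Illustrative (postcoital, precoital) pairs; not study data.
    for post, pre in [(0.4, None), (8.0, 0.1), (45.0, 0.3), (100.0, 0.0), (60.0, 2.5)]:
        print(post, pre, classify_psa(post, pre))
```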
The main objective of the analysis was to compare the frequency of self-reported mechanical problems and of semen exposure between condoms (FC, MC) and between studies. To ensure comparability of the study results, we limited the UNICAMP study results to data from the group of women who received the in-clinic training (similar to the procedures used with all of the participants of the UAB study). Between-study comparisons of the proportion of PSA-positive condom uses were carried out separately for instances of condom use in which a mechanical problem was or was not reported.
For FCs, the following events reported in the condom questionnaires were classified as self-reported mechanical problems: (1) the condom broke during intercourse, (2) the condom came out of the vagina, (3) the penis entered to the side of the condom, (4) the outer ring was pushed inside the vagina, or (5) semen leaked on the woman’s body. For MCs, self-reported mechanical problems included breakage, partial slippage, total slippage, and semen leakage.
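A minimal sketch of how questionnaire events could be collapsed into the "any mechanical problem" indicator follows. The event codes are hypothetical labels chosen for illustration; the actual questionnaire coding is not given in the text.

```python
# Hypothetical event codes for the problems listed in the text.
FC_PROBLEMS = frozenset({
    "breakage",              # condom broke during intercourse
    "came_out",              # condom came out of the vagina
    "penis_misrouted",       # penis entered to the side of the condom
    "outer_ring_pushed_in",  # outer ring pushed inside the vagina
    "semen_leakage",         # semen leaked on the woman's body
})
MC_PROBLEMS = frozenset({
    "breakage",
    "partial_slippage",
    "total_slippage",
    "semen_leakage",
})
PROBLEMS_BY_TYPE = {"FC": FC_PROBLEMS, "MC": MC_PROBLEMS}


def has_mechanical_problem(condom_type: str, reported_events) -> bool:
    """Return True if any reported event counts as a mechanical problem."""
    recognized = PROBLEMS_BY_TYPE[condom_type]
    return any(event in recognized for event in reported_events)


if __name__ == "__main__":
    print(has_mechanical_problem("FC", ["outer_ring_pushed_in"]))
    print(has_mechanical_problem("MC", []))
```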
Frequency distributions and simple univariate statistics were used to describe and compare the study groups. A χ2 test with one degree of freedom was computed to test the null hypothesis of no difference between condom types or between studies in the proportion of uses in which a positive postcoital PSA test was observed or any mechanical problems were reported. Fisher’s exact test was used when appropriate. A χ2 test for linear trend was used to compare the distribution of postcoital PSA values between studies or between condom types.
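The two χ2 tests can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' code: the 2 × 2 test is computed without continuity correction, the trend test is written in the Cochran-Armitage form (N times the squared correlation between group and category score), and equally spaced scores are assumed. The text does not say which variants were used. The counts in the usage section are taken from the figures reported in the results text; the three-level PSA distribution is a collapse of the four published categories, so its P value is not expected to match the published trend test.

```python
import math


def chi2_sf_1df(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    stat = numerator / denominator
    return stat, chi2_sf_1df(stat)


def chi2_trend(group1, group0, scores=None):
    """Chi-square test for linear trend across ordered categories.

    group1 and group0 hold the counts per ordered category in each group.
    """
    scores = list(range(len(group1))) if scores is None else list(scores)
    n1, n0 = sum(group1), sum(group0)
    n = n1 + n0
    totals = [g1 + g0 for g1, g0 in zip(group1, group0)]
    score_mean = sum(s * t for s, t in zip(scores, totals)) / n
    score_var = sum(t * (s - score_mean) ** 2 for s, t in zip(scores, totals)) / n
    p1 = n1 / n
    cov = sum(s * g for s, g in zip(scores, group1)) / n - score_mean * p1
    stat = n * cov ** 2 / (score_var * p1 * (1.0 - p1))
    return stat, chi2_sf_1df(stat)


if __name__ == "__main__":
    # FC uses with postcoital PSA > 1 ng/mL: 69 of 371 (UNICAMP), 100 of 599 (UAB).
    stat, p = chi2_2x2(69, 371 - 69, 100, 599 - 100)
    print(f"FC PSA-positive, UNICAMP vs UAB: chi2 = {stat:.2f}, P = {p:.2f}")

    # FC uses with a self-reported mechanical problem: 18 of 371 vs 171 of 599.
    stat, p = chi2_2x2(18, 371 - 18, 171, 599 - 171)
    print(f"FC problems, UNICAMP vs UAB: chi2 = {stat:.1f}, P = {p:.2g}")

    # Collapsed FC distribution (nonexposed, low, moderate-high); an
    # illustration only, since the published test used four categories.
    stat, p = chi2_trend([302, 37, 32], [499, 73, 27])
    print(f"FC PSA trend (3 levels): chi2 = {stat:.2f}, P = {p:.2f}")
```

With the first pair of counts, the sketch gives a P value of about 0.45 for the comparison of PSA-positive FC uses between studies, consistent with the value quoted in the discussion.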
We ran additional sensitivity analyses to ensure that the comparison was not influenced by differences in laboratory procedures and in the number of condoms used. In 1 set of analyses, we replaced the corrected PSA values from the UNICAMP study with the original results. In a second set, we restricted UAB study results to data pertaining to the first 2 FCs or MCs used in the study to ensure comparability with the UNICAMP study, where 2 condoms of each type were used. The results of these additional analyses are not shown in detail but were very similar to the results of the main analyses presented in this report.
The human subject protocol of the UNICAMP study was reviewed and approved by institutional review boards at the UAB, the Population Council, and the Centers for Disease Control and Prevention (CDC). The human subject protocol of the UAB study was reviewed and approved by institutional review boards at the UAB and the CDC.
Results
In the UNICAMP study, 400 women used a total of 800 FCs and 800 MCs (an unspecified number of women agreed to participate but never returned any materials, and enrollment continued until the desired group size was reached). In the UAB study, 108 women were enrolled in the study, but during the observation period, only 85 women used a total of 678 FCs and 700 MCs. This analysis considered only 199 women in the in-clinic training arm of the UNICAMP study and 84 women in the UAB study who used at least 1 condom, returned completed forms and swabs, and had a valid postcoital PSA result after a precoital PSA <1 ng/mL. We excluded condom uses with precoital PSA ≥1 ng/mL because such a result indicates recent semen exposure.
Of the 199 women in the UNICAMP study, 195 contributed data to 371 FCs, 194 contributed data to 376 MCs, and 190 contributed data to both types of condoms. Of the 84 women in the UAB study, 71 contributed data to 599 FCs, 77 contributed data to 635 MCs, and 64 contributed data to both types of condoms.
On average, the women in the UNICAMP study were 3 years younger than the women in the UAB study (mean age, 30 vs. 33 years, respectively, Table 1). For both the UNICAMP and the UAB studies, most of the women were married or cohabiting (89% and 79%, respectively) and were in long-term relationships (8.5 and 7.4 years, respectively). No race/ethnicity information was available in the UNICAMP study. In the UAB study, 77% of the participants were white and 21% were black (data not shown). The 2 groups differed considerably in current contraceptive method. Most noticeably, intrauterine devices were used by 20% in the UNICAMP study and by none in the UAB study. Male condoms were also more commonly used as contraceptives by the participants at the UAB. The mean coital frequency during the past 30 days was similar (9.8 in the UNICAMP study and 10.5 in the UAB study), but the mean number of pregnancies and the mean number of live births were both higher in the UNICAMP study than in the UAB study (pregnancies: 1.9 vs. 1.5; live births: 1.7 vs. 1.2, respectively).
Postcoital PSA >1 ng/mL was detected in 69 FC uses in the UNICAMP study (19%) and in 100 (17%) in the UAB study (Table 2). Moderate-high PSA (≥22 ng/mL) was detected in 32 FC uses in the UNICAMP study (9%) compared with 27 (5%) in the UAB study (Table 2). The distribution of PSA values after FC use was significantly different (shifted toward higher values) in the UNICAMP study (trend test P = 0.03). In the UNICAMP study, 48 MC uses were PSA positive (13%) compared with 86 (14%) in the UAB study. The corresponding rates of moderate-high PSA were 6% for the UNICAMP study and 3% for the UAB study (Table 2). In both studies, the frequency of postcoital PSA values >1 ng/mL was higher for FC use than for MC use, although the comparison reached statistical significance only in the UNICAMP study.
The pattern of self-reported problems differed between studies. The UNICAMP study group reported a mechanical problem in 18 FC uses, for a rate of 5 per 100 condom uses (Table 3). By contrast, the UAB study group reported a mechanical problem in 171 FC uses (29%) (P <0.0001). UNICAMP study participants reported a mechanical problem in 13 MC uses (3%), whereas UAB study participants reported a mechanical problem in 52 (8%) (P = 0.003). The increased rate of self-reported problems in the UAB study was not associated with any particular type of mechanical problem, and the difference between studies achieved statistical significance in 7 out of 10 specific categories (Table 3).
Table 4 shows the distribution of PSA values by study, condom type, and whether a mechanical problem was reported. The proportion of PSA-positive FC uses in the UNICAMP study was 28% when a mechanical problem was reported compared with 18% when no mechanical problem was reported. In the UAB study, the proportion of PSA-positive FC uses was 25% when a mechanical problem was reported compared with 13% when no mechanical problem was reported. The proportion of PSA-positive MC uses in the UNICAMP study was 38% when a mechanical problem was reported compared with 12% when no mechanical problem was reported. In the UAB study, the proportion of PSA-positive MC uses was 19% when a mechanical problem was reported compared with 13% when no mechanical problem was reported. A similar pattern of results holds for moderate-high exposure (≥22 ng/mL) for women with a mechanical problem. Overall, the distribution of postcoital PSA values was similar between studies when a mechanical problem was reported with either MC or FC use. When no mechanical problem was reported, the distribution of PSA values was similar in the 2 studies for MC use, but not for FC use. The distribution of postcoital PSA values after FC use was shifted toward higher values in the UNICAMP study when no problem was reported (Table 4). Thus, whereas self-reported mechanical problems are associated with elevated PSA levels in both studies and for both condom types, the increased frequency of self-reported problems in the UAB study observed in Table 3 does not translate into an overall increase in the frequency of PSA-positive condom uses.
Discussion
In a previous 1-arm trial of the female condom that employed the same methods, we documented that PSA detection rates are associated with self-report of mechanical problems, but PSA detection often occurs in condom uses for which no problems are reported.8 It is possible that undetected or unreported problems during condom use are associated with PSA detection. Self-report of problems during use depends on several factors, including the user’s knowledge of potential problems and the user’s judgment on whether a problem occurred during condom use. Although training in recognizing and reporting problems during actual use may reduce inconsistency across studies, there is a wide “gray area” that still depends on the subjective assessment of the individual study participants. This, in turn, may be influenced by psychosocial and cultural factors and by experience with condom use. By contrast, self-sampling of vaginal fluid before and after condom use is easy and highly acceptable, and the laboratory methods employed to measure PSA values are highly standardized, possibly leading to less between-group variability. The present analysis offered an important opportunity to compare self-report and PSA detection in 2 different populations, using comparable study design and procedures.
The rates of self-reported mechanical problems with the FC could be interpreted as indicating that use of the FC was more effective in the UNICAMP study than in the UAB study (rate difference: −24%; P <0.0001). By contrast, the PSA detection rate was similar (rate difference: 2%, P = 0.45), suggesting that the overall effectiveness of FC use was similar. In fact, the distribution of PSA values was slightly, but significantly shifted toward higher values in the UNICAMP study, making the contrast between self-reported problems and detection of semen exposure even more striking. The higher PSA values in the UNICAMP study, however, result from 2 FC uses per person, whereas UAB study participants could use up to 10 FCs. Because postcoital PSA values tended to decline with repeated FC use in the UAB study,15 we interpret these findings as consistent with similar semen exposure rates in the 2 studies.
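The rate differences quoted above follow directly from the counts reported in the results. The short sketch below recomputes them; the function name is hypothetical, and the counts are those given in the text for FC use (18 of 371 vs. 171 of 599 uses with a self-reported problem; 69 of 371 vs. 100 of 599 PSA-positive uses).

```python
def rate_difference_pct(events_a: int, uses_a: int, events_b: int, uses_b: int) -> float:
    """Difference between two rates (group A minus group B), in percentage points."""
    return 100.0 * (events_a / uses_a - events_b / uses_b)


if __name__ == "__main__":
    # Group A = UNICAMP, group B = UAB, FC use only.
    problems = rate_difference_pct(18, 371, 171, 599)
    psa_positive = rate_difference_pct(69, 371, 100, 599)
    print(f"self-reported problems: {problems:+.1f} percentage points")
    print(f"PSA-positive uses:      {psa_positive:+.1f} percentage points")
```

Rounded to whole percentage points, the two differences are −24 and 2, as stated in the text.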
A plausible explanation for the apparent contradiction between substantially different patterns of self-reported problems and similar PSA detection rates is that whereas the study groups differed in their self-reporting behavior, their experiences were similar. In fact, the data are compatible with a scenario in which (1) the true rate of mechanical problems was the same in both studies, (2) problem-specific PSA detection rates were similar, and (3) the UNICAMP study subjects systematically underreported condom use problems.
Using the UAB study subjects as the reference group, we would expect the UNICAMP subjects to have reported about 108 (29% of 371) FC uses with mechanical problems rather than the 18 that were actually observed. The number of PSA-positive FC uses with mechanical problems would have been 30 (28% of 108) rather than 5. Thus, 263 uses would be classified as having no mechanical problems, including 39 PSA-positive uses. In this scenario, the total number of PSA detections in the UNICAMP study (N = 69) would be the same as observed, but would reflect the experience of mechanical problems reported in the UAB study. Similar results would be obtained if the threshold for PSA were set at 22 ng/mL (results not shown).
The 2 studies also differ with respect to MC use: the rate of self-reported mechanical problems is lower in the UNICAMP study than in the UAB study (rate difference: −5%; P = 0.003). Overall, the proportion of MC uses that were PSA positive was very similar in the 2 studies (1% difference, P = 0.72). As for the FC results, using the UAB study subjects as the reference group, we would expect the UNICAMP subjects to have reported about 30 (8% of 376) MC uses with mechanical problems rather than 13. The number of PSA-positive MC uses with mechanical problems would have been 11 (38% of 30), rather than the 5 observed. Thus, 346 uses would be classified as having no mechanical problems, including 37 PSA-positive uses. Again, this scenario yields the same number of PSA detections in the UNICAMP study (N = 48), with identical rates of mechanical problems in both studies.
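The arithmetic behind the underreporting scenario for both condom types can be retraced with a short sketch. The function and field names are hypothetical. The inputs are the rounded percentages quoted in the text, and each intermediate count is rounded to a whole number of condom uses, which is an assumption about how the published figures were obtained.

```python
from typing import NamedTuple


class Scenario(NamedTuple):
    expected_problem_uses: int    # uses expected to carry a reported problem
    psa_pos_with_problem: int     # PSA-positive uses among those
    no_problem_uses: int          # remaining uses
    psa_pos_without_problem: int  # PSA-positive uses among the remainder


def underreporting_scenario(
    n_uses: int,
    psa_positive_total: int,
    reference_problem_rate: float,
    psa_rate_given_problem: float,
) -> Scenario:
    """Redistribute observed PSA detections under the reference problem rate."""
    expected_problems = round(n_uses * reference_problem_rate)
    psa_with_problem = round(expected_problems * psa_rate_given_problem)
    no_problem = n_uses - expected_problems
    # The total number of PSA detections is held at its observed value.
    psa_without_problem = psa_positive_total - psa_with_problem
    return Scenario(expected_problems, psa_with_problem, no_problem, psa_without_problem)


if __name__ == "__main__":
    # FC: 371 UNICAMP uses, 69 PSA-positive; UAB problem rate 29%;
    # 28% PSA-positive when a problem was reported.
    print("FC:", underreporting_scenario(371, 69, 0.29, 0.28))
    # MC: 376 UNICAMP uses, 48 PSA-positive; UAB problem rate 8%;
    # 38% PSA-positive when a problem was reported.
    print("MC:", underreporting_scenario(376, 48, 0.08, 0.38))
```

The sketch returns 108, 30, 263, and 39 for the FC and 30, 11, 346, and 37 for the MC, matching the counts given in the text.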
The findings of this study should be interpreted in light of some limitations. First, the characteristics of the participants were slightly different in the 2 study groups: the women in the UNICAMP study were younger, more often used intrauterine devices, more often relied on sterilization, and less often used the MCs as their current contraceptive method than the women in the UAB study. PSA detection rates varied appreciably across strata defined by age, but not by contraceptive method. Analyses that were stratified by age or by contraceptive methods yielded results similar to those reported here. Thus, it is unlikely that the different patterns of contraceptive use in the 2 studies confound the comparison of the distribution of PSA levels between studies. Second, the 2 studies differed slightly in certain design aspects (intervention, number of condoms used, sample extraction procedure). To minimize these concerns, we restricted the analysis to the most comparable subgroups and adjusted the UNICAMP PSA values. We also carried out sensitivity analyses to evaluate the potential influence of differences in design, and found similar results. Third, as noted in the methods section and in previous reports and correspondence,8,13,14 not all positive postcoital PSA results should be interpreted as indicating true semen exposure. The category of PSA values >22 ng/mL may be more specific for true semen exposure, while the category 1–21 ng/mL may be influenced by random self-sampling variation. Analyses based on setting the threshold of PSA detection at 22 ng/mL yield results that are similar to those presented in this article. Furthermore, the intermediate category of 1–21 ng/mL provides additional insight into the generalizability of PSA results. Even if all low PSA values are due to self-sampling variability, the similar proportions of women with these results in the UNICAMP and UAB study groups indicate that variability in self-sampling is similar across studies. In other words, the results suggest that we can expect the same level of imprecision in PSA detection across studies, strengthening the impression that PSA-based methods can be reliably applied to a broad range of population groups. Fourth, despite the large numbers of condom uses, mechanical problems were sufficiently uncommon to make some statistical comparisons imprecise. Nevertheless, several of the contrasts of interest were large enough to be statistically significant. Finally, in both of the studies, experience with the FC was minimal at entry, whereas many participants had experience using the MC and more of the UAB participants were regular users of the MC. Thus, the comparison between failure rates may be somewhat “unfair” to the FC, especially for the first few uses. Although this limitation deserves consideration in assessing the relative effectiveness of the 2 devices, we believe that it should have little effect on the main purpose of this analysis.
The limitations discussed above are offset by considerable strengths. Both of the studies used almost identical procedures and forms. Both of the studies employed a randomized crossover assignment of the participants to the condom-use groups, so that each participant used both condom types. This feature ensures that the comparison of condom-specific use experiences is highly standardized. Most important, both of the studies employed standardized methods for gathering information on the participants’ experience with each condom use and for gathering objective evidence of semen exposure during condom use. Finally, the number of condom uses in the analysis was large enough for key contrasts to have adequate power.
In conclusion, our findings lend support to the hypothesis that objective methods for assessing condom failure, such as self-sampling and PSA testing, may be less prone to bias than self-report, and yield more consistent and accurate information about condom failure rates. Methods based on biomarkers of semen exposure may help strengthen the validity of intervention research and provide for clearer interpretation of the results of studies promoting behavior change for the prevention of STD, including HIV.