Evaluating AIDS Prevention Interventions Using Behavioral and Biological Outcome Measures

Fishbein, Martin PhD*; Pequegnat, Willo PhD



From a public health perspective, the most meaningful and appropriate measure for evaluating the efficacy or effectiveness of an AIDS prevention intervention is an assessment of the intervention's impact on HIV transmission. But using HIV incidence as an outcome measure, particularly in the United States, is rarely feasible. Because of the low incidence of seroconversions in most segments of the US population, a very large sample size is necessary to detect a statistically significant effect of an intervention on HIV incidence. Therefore, evaluating interventions using HIV incidence as the primary outcome measure becomes expensive and often impractical.
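The sample-size problem can be made concrete with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two independent proportions; it is an illustration only, not a method taken from the studies cited here. The incidence figures, the 50% reduction, and the function name `sample_size_per_arm` are hypothetical assumptions chosen to show how the required sample grows as the outcome becomes rarer.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect a difference between
    two independent proportions with a two-sided test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Hypothetical incidences over the follow-up period (assumptions, not data):
# a rare outcome (0.5% vs 0.25%) and a more common one (10% vs 5%).
rare_outcome = sample_size_per_arm(0.005, 0.0025)
common_outcome = sample_size_per_arm(0.10, 0.05)
print(rare_outcome, common_outcome)
```

Under these assumed values, the same proportional reduction requires a far larger sample for the rare outcome than for the common one, which is the practical obstacle described above.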

As the statement from the Consensus Development Conference on Interventions to Prevent HIV Risk Behaviors suggested, “Direct measurement of HIV infection is a feasible and desirable outcome variable for some programs. However, practical, ethical, and fiscal barriers often make reliance on measured seroconversion undesirable. In these instances, proxy indices-including other biological markers or modeled estimates of seroincidence based on behavioral outcomes-can be used to estimate the effects of prevention programs on seroincidence.”1

Although the public health community has largely accepted sexually transmitted disease (STD) prevalence and incidence data as valid indicators of the effectiveness of intervention programs in reducing HIV incidence, there has been considerable resistance to accepting behavioral measures-particularly behavioral self-reports-as valid indicators of intervention effectiveness. Many investigators have argued that biological measures such as STDs should be used as outcome measures in behavior-change programs.2–6 This call for biological outcomes reflects the assumption that behavior change that can prevent HIV seroconversion should also prevent STD incidence; however, it also reflects a general skepticism concerning the validity of behavioral self-reports. Indeed, the veracity of self-reports of sexual behavior has often been questioned, and it has been argued that biological measures are needed to “validate” behavioral self-reports.5,7

This article examines the literature on the “validity” of self-reports. We also examine the extent to which other biological markers or modeled estimates of seroincidence based on behavioral outcomes are likely to provide acceptable estimates of the effects of prevention programs on HIV incidence. More specifically, we will consider the relationships between behavior change and both STD and HIV incidence, and the relationship between decreases in STD incidence and HIV incidence.

A primary concern of this article is with the use of either behavioral or biological measures as “surrogates” for HIV incidence. Clearly, if the purpose of an intervention is to reduce the incidence of an STD, it is appropriate and meaningful to use STD incidence as the primary outcome measure, providing one has enough power (i.e., a large enough sample) to detect meaningful differences. Similarly, if one is attempting to demonstrate that a behavioral intervention has interrupted the sexual transmission of one or more pathogens, then it is also appropriate and meaningful to assess STD incidence. But if the purpose of an intervention is to reduce HIV incidence, it is not clear that biological outcomes other than HIV incidence provide more or less information about HIV transmission than behavioral outcomes. Just as there are conditions under which “true” changes in behavior may not be related to STD or HIV incidence, there are also conditions under which “true” changes in STD incidence may not be related to HIV incidence.

The Use of STD Incidence Measures

Although we have witnessed major advances in STD diagnostics, biochemical and biological measures are not perfect, and their sensitivity and specificity must be established before they are used. Sensitivity refers to the ability of a diagnostic test to identify the true positives, or persons that are actually infected, whereas specificity refers to the ability of the test to identify the true negatives, or persons that are not infected. The importance of considering the sensitivity and specificity of diagnostic tests for STDs was clearly illustrated by Schachter and Chow.6 According to these authors, “There are no perfect diagnostic tests…if isolation of the bacteria or virus is the test being performed, then sensitivity is the problem because no culture test is 100% sensitive. However, specificity is not a problem with this kind of test. The modern non-culture tests may cause problems because of both specificity and sensitivity.”
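As a minimal illustration of these two definitions, the sketch below computes sensitivity and specificity from the four cells of a test-versus-reference comparison. The counts are hypothetical and are not drawn from any study cited in this article.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of truly infected persons whom the test identifies as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of truly uninfected persons whom the test identifies as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (assumptions for illustration only):
# 50 infected persons, of whom the test detects 45;
# 950 uninfected persons, of whom the test correctly clears 930.
sens = sensitivity(true_pos=45, false_neg=5)
spec = specificity(true_neg=930, false_pos=20)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```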

Although some tests for some STDs are good, tests for other STDs are problematic, making it essential that one look closely at the validity of the diagnostic test being used. In attempting to explain why an STD control program seemed to reduce HIV incidence in one study8 but not in another,9 some investigators have raised questions about the “validity” of the measures used in these studies. Specifically, in the Rakai study, ligase chain reaction (LCR), a very sensitive test, was used to diagnose both Neisseria gonorrhoeae and Chlamydia trachomatis, whereas in Mwanza, these STDs were diagnosed using Gram stain for gonorrhea and an antigen capture enzyme immunoassay for chlamydia. Thus, it has been argued that although the obtained rates of gonorrhea and chlamydia were similar in these two studies, there were actually more cases of STDs in Mwanza, because Gram stain and antigen capture enzyme immunoassay are less sensitive tests than ligase chain reaction.
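The logic of this argument can be sketched numerically. The code below relates observed prevalence to true prevalence through a test's sensitivity and specificity, and inverts that relation with the Rogan-Gladen correction. The observed rate and the two sensitivity values are hypothetical assumptions; they are not the figures from the Rakai or Mwanza studies.

```python
def observed_prevalence(true_prev, sens, spec=1.0):
    """Expected proportion testing positive: true positives plus false positives."""
    return sens * true_prev + (1 - spec) * (1 - true_prev)


def implied_true_prevalence(observed, sens, spec=1.0):
    """Rogan-Gladen correction: true prevalence implied by an observed rate."""
    return (observed + spec - 1) / (sens + spec - 1)


# Hypothetical values (assumptions): both sites observe 5% positives,
# but one uses a highly sensitive test and the other a less sensitive one.
observed = 0.05
high_sens_site = implied_true_prevalence(observed, sens=0.95)
low_sens_site = implied_true_prevalence(observed, sens=0.60)
print(f"{high_sens_site:.3f} vs {low_sens_site:.3f}")
```

With identical observed rates, the site using the less sensitive test has the higher implied true prevalence, which is the point made about the two studies.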

But even if one assumes that a diagnostic test is “valid,” there are other issues that must be considered before using such a test in prevention research. For example, because some biological and biochemical tests are invasive, their use may increase refusal rates. In addition, it must be recognized that there is a difference between determining whether a person has or does not have an STD-the primary purpose of a diagnostic test-and determining whether the person has a new or incident infection. If the STD is incurable (e.g., viral STDs), a test cannot determine whether this STD represents a new infection or a recurrence of a previous infection. Even in the case of bacterial STDs, it is not always easy to determine whether one is identifying a new infection or a previous infection that was either not initially identified or did not respond to treatment.

Biochemical measures often identify only recent or current behaviors. Many sensitive behaviors such as sexual behavior, particularly among adolescents, occur only rarely. Moreover, the prevalence and incidence of some STDs are low, requiring very large sample sizes to detect small but meaningful changes in STD rates. Thus, several studies have combined a number of STDs into a single outcome measure. Unfortunately, this practice may be problematic, because different STDs (e.g., gonorrhea, syphilis, chlamydia) have different prevalence rates (among men and women and among different age groups), different transmission rates, and different durations, and are differentially affected by condom use and other forms of contraception.10,11

Finally, biochemical measures are often costly, limiting their feasibility in studies with limited resources. For example, some tests require special equipment or specially built dedicated space. Nevertheless, when there are grounds for assuming an isomorphic relation between a biological assessment and a self-reported behavior, and when the biochemical measure is relatively noninvasive, biological and biochemical measures may provide the best evidence for the validity-or lack of validity-of behavioral self-reports, and for the efficacy and public health impact of prevention programs.

The Validity of Self-Reports

According to Catania et al,12 “sex research lacks a gold standard for validating self-reported sexual behavior.” Although the available evidence strongly supports the validity of behavioral self-reports, even in sensitive areas (e.g., those concerning private behaviors that have been implicated in acquiring and transmitting HIV and other STDs13,14), it must be recognized that there is no acceptable method that can be used to validate self-reports of sexual activity. Nevertheless, there are five ways in which investigators have tried to demonstrate the validity of self-reports, four of which involve a comparison of self-reports with presumably more objective measures: (1) biological or biochemical measures; (2) official records; (3) proxy reports; and (4) behavioral logs and diaries. The fifth way to demonstrate validity compares self-reports obtained at different times.

Before considering these approaches, it is important to recognize that there are different types of validity. Validity refers to the question of whether we are measuring what we think we are measuring. Content or “face” validity, which is the least meaningful type of validity, is a subjective judgment that the items used to assess the construct appear relevant or related to the construct. Convergent or “construct” validity assesses the degree to which two different measures of the same construct are equivalent. Discriminant validity assesses the degree to which a measure is “pure” (i.e., measures only what it is supposed to measure). Predictive validity assesses the extent to which the construct is related to other nonequivalent criteria that are assumed to be influenced or determined by the construct.

Evidence For the Validity of Self-Reported Sexual Behavior

Correlations with biological and biochemical measures and medical records. To the best of our knowledge only one study has used a biochemical measure to evaluate the construct validity of self-reported sexual behavior. Udry and Morris15 assessed the convergent validity of self-reported sexual behavior in 58 blue-collar African American women by testing their urine for the presence of sperm. During a period of 90 days, Udry and Morris obtained daily urine specimens and a daily report slip indicating whether the respondent had performed several different behaviors, including vaginal intercourse. Because negative lab results do not imply that vaginal sex did not occur-perhaps condoms were used-Udry and Morris only analyzed the self-report data for the 15 women with at least one positive laboratory result. Despite this small sample, the data indicated high levels of consistency between lab reports and self-reports of coitus. Of these 15 women, 12 yielded perfect agreement between lab reports and self-reports. Of the remaining three women, two showed one discrepancy during the 90-day observation period, and one woman showed three discrepancies.

The convergent validity of self-reported STDs and biochemical STD assessments has also been explored. For example, Kleyn et al16 interviewed 134 injecting drug users about their STD history (i.e., “have you ever been told by a doctor or nurse that you had…?”). Of the 134 persons interviewed, 69 returned for counseling and testing; serology was used to test for the presence of hepatitis B, syphilis, oral herpes, and genital herpes. Among the 69 who returned, 62 records could be linked to self-reports. Generally, major discrepancies were observed between the respondents' self-reports and the diagnostic tests. For all STDs except syphilis, there were a large number of positive tests among those reporting that they had not been told they had the disease. Conversely, among the five respondents who reported a history of syphilis, none were confirmed by serology. The authors attributed these poor findings to a number of factors, including the lack of veracity of some respondents, the respondents' lack of knowledge (including a lack of understanding of what was told to them by health personnel), and the wording of the question (i.e., were people with asymptomatic disease ever told they had an STD?). The authors also recognized that some of the discrepancy could be due to the poor predictive value of some of the diagnostic tests.

In marked contrast to these findings, Millstein and Moscicki17 found high agreement between self-reports of STD histories and medical records. A multiethnic sample of 571 sexually active female adolescents (13-19 years) was recruited from family planning clinics to report their STD histories. “Attempts to validate the self-reports on STD history were made by reviewing the medical record of those subjects who had previously attended the clinics (n = 293). These analyses indicated 93% agreement between positive and negative self-reports and medical records.”17

Although these two studies are not comparable (i.e., one cannot equate the self-reports of male and female injecting drug users with the self-reports of female adolescents attending a family planning clinic), the point to be made here is that, at least under certain circumstances, one can obtain convergent validity between behavioral self-reports and medical records. One must also question the extent to which the low “validity” in the Kleyn et al16 study indicates a lack of veracity or a lack of knowledge. For example, the difference between these two studies may reflect the commitment of most family planning clinics to counseling and exploring options, thus providing clients with a clearer understanding of their symptoms and of STDs in general.

Other attempts to validate self-reported sexual behavior with biological markers have used STD incidence as a gold standard for validating self-reports of condom use.7 Unfortunately, particularly in this behavioral domain, biological or biochemical measures are often not isomorphic with the behavioral self-report they are supposed to validate. For example, STDs do not have a one-to-one relationship with frequency of unprotected sexual behavior. Indeed, one of the strongest arguments against using STDs or HIV as an indicator of whether unprotected sex has occurred is the finding that not all wives of HIV-infected male hemophiliacs have seroconverted, even after years of unprotected vaginal intercourse.18 Thus, “STDs are not a gold standard by which to judge the validity of self-reported condom use.”12 As Schachter & Chow6 have acknowledged, “In a number of venues, we and others have emphasized the need for biological endpoints in evaluating research in behavioral modification programs to reduce risk taking behaviors and thus reduce STD acquisition. For a variety of reasons, it is clear that the analysis of questionnaire data will not be an adequate means of determining the efficacy of such trials. However, it is naive to assume that a positive change in behaviors introduced in intervention trials necessarily will result in a detectable reduction in STD acquisition rates.”

Correlations with proxy measures and behavioral diaries and logs. Some additional evidence for the convergent validity of self-reports comes from couple studies in which each person independently responds to a questionnaire, and the responses of the two members of the couple are compared. These studies have shown good agreement with respect to some behaviors (e.g., frequency of vaginal sex, frequency of condom use), but relatively poor agreement with respect to other behaviors (e.g., frequency of masturbation, frequency of anal sex).

Studies comparing self-reports with diaries and/or behavioral logs have also indicated that high correlations between the diary entries and self-reports can be obtained.14 However, the strength of the correlation depends upon many factors, the most important of which appears to be the length of the recall period. It must be recognized that this method, like the method of obtaining repeated measures at different points in time, may not address the validity issue; ultimately, one is simply correlating one set of self-reports with another. Nevertheless, the more one can demonstrate that daily or weekly diaries provide accurate reports of behavior, the more this methodology approaches a test of validity.

This review suggests that self-reports have reasonable utility1; however, as Sudman and Bradburn19 have pointed out, “there is always the possibility that responses may not be truthful.” Spanier20 has suggested that when retrospective accounts are in error, one or both of two different processes may be involved: faulty recall and falsified accounts. Faulty recall will be defined as unintentional false reporting of the past due to the inability to remember accurately or completely, or to the respondent's changing perception of past reality. Falsified accounts are conscious and intentional false reporting of past or present behavior and beliefs due to fear of being honest with the interviewer, conscious and intentional falsification of the truth for the purpose of ego enhancement, or the desire of presenting a false image to the interviewer.

Fortunately, a number of techniques have been developed to improve recall and reduce the likelihood of falsified accounts. For example, recall can be improved by providing calendars with significant events in the respondent's life (e.g., birthdays, holidays, specific experiences).21,22 Jaccard and Wan14 have suggested that three factors appear to combine in an interactive way to influence accuracy of recall: length of recall, question format, and frequency of behavioral occurrence. For example, respondents may attempt to answer frequency questions either by focusing on specific instances and counting them up (episodic memory) or by accessing general principles (e.g., I have sex three times a week) and deriving an answer (semantic memory). Based on this distinction between episodic and semantic memory, Jaccard and Wan14 argue that (1) accurate recall of behaviors over a relatively long period (e.g., ≥ 3 months) will be facilitated by question formats that discourage the use of episodic memory and encourage the use of semantic memory; and (2) accurate recall of behaviors over a relatively short period (e.g., ≤ 1 month) will be facilitated by question formats that encourage the use of episodic memory and discourage the use of semantic memory.

The above hypotheses can be further qualified by the relative frequency of behavioral performance. Recall by persons who tend to perform the behavior in question relatively more often will be facilitated by strategies that encourage semantic memory, whereas recall by persons who tend to perform the behavior in question relatively less often will be facilitated by strategies that encourage episodic memory. By paying careful attention to these factors, it is possible to greatly reduce faulty recall. However, those questioning the validity of self-reported behavior measures have typically been more concerned with falsified accounts than with faulty recall. Fortunately, it may be easier to reduce falsified accounts than to eliminate faulty recall. Falsified accounts can be greatly reduced or eliminated by (1) assuring confidentiality and ultimate anonymity; (2) stressing the importance of honest answers for the scientific integrity of the project; (3) using methods (e.g., audio computer-assisted self-interviews) that eliminate the need for respondents to report socially undesirable answers face-to-face; and (4) asking respondents to sign a statement that they will give honest answers.

Another procedure used in psychological research to minimize falsified accounts is the “bogus pipeline” methodology, in which respondents are led to believe that their responses are being monitored physiologically (i.e., that these physiological measures make it possible to tell when the respondent is lying). Research has shown that when individuals believe they are being monitored in this fashion, “they are more likely to provide honest answers to socially loaded questions.”14 Interestingly, the introduction of the bogus pipeline methodology did not influence responses to sensitive questions when the above steps for minimizing falsified accounts had been taken.14

Summary. Despite some potential problems, there appears to be reasonable evidence for the validity of self-reports and for the validity (i.e., sensitivity and specificity) of STD diagnostic tests. It is therefore important to understand why valid self-reports of sexual behavior are not always systematically related to valid assessments of HIV seroconversion or STD incidence. Similarly, it is important to understand why measures of STD prevalence and incidence are not always systematically related to measures of HIV.

Factors Influencing HIV Transmission

To better understand the role of behavior change and STD prevention as factors influencing HIV seroconversion, consider May and Anderson's23 model of the reproductive rate for STDs, including HIV: Ro = βcD, where Ro indicates the reproductive rate of infection, “typically interpreted as the expected total number of secondary infections arising from a single primary infection early in the epidemic when virtually all individuals are susceptible.”24 When the reproductive rate is greater than one, the epidemic is growing; when Ro is less than one, the epidemic is dying out; and when Ro equals one, the epidemic is in a state of equilibrium. Beta (β) indicates transmission efficiency, or the ease with which an infected person can transmit the disease to an uninfected partner; c indicates the rate of partner exchange; and D indicates the length of time a person is infectious.

Each of the parameters on the right side of the equation can be influenced by behavior or behavior change. For example, transmission efficiency (β) can be reduced by increasing condom use or by delaying the onset of sexual activity; the rate of partner exchange (c) can be influenced by decreasing the number of partners; and the length of time a person is infectious (D) can be influenced by increasing the likelihood that one will seek care at the first sign of symptoms. Biomedical interventions can influence transmission efficiency (β) and infectiousness (D). For example, reducing the incidence or prevalence of other STDs will reduce β for HIV. Similarly, the development of an effective vaccine will also lower transmission efficiency (β), whereas STD screening and treatment will reduce infectiousness (D), at least with respect to STDs other than HIV.

Whether one takes a biomedical or behavioral approach, the impact of a change in any one parameter on the reproductive rate will depend on the values of the other two parameters. For example, if one attempted to lower the reproductive rate of HIV by reducing transmission efficiency, either by reducing STDs or by increasing condom use, the impact of such a reduction would depend on both the prevalence of the disease and the sexual mixing patterns of that population. Clearly, if there is no disease in the population, changes in transmission efficiency can have little to do with the spread of the disease. Similarly, a reduction in STD rates or an increase in condom use in those who are at low risk of exposure to partners with HIV will have little or no impact on the epidemic. In contrast, a reduction in STDs or an increase in condom use by members of the population who are most likely to transmit or acquire HIV (i.e., in the so-called core group), can have a big impact on the epidemic, depending upon prevalence of the disease in the population.25
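A small numerical sketch may help show why the effect of changing one parameter depends on the others. It takes the relation Ro = βcD given earlier at face value and compares the effect of halving transmission efficiency in two hypothetical populations that differ only in their rate of partner exchange. All parameter values and function names below are assumptions for illustration; none are estimates from the literature.

```python
def reproductive_rate(beta, c, d):
    """Ro = beta * c * D: transmission efficiency x partner exchange rate x duration."""
    return beta * c * d


def epidemic_trend(r0):
    """Classify the epidemic using the thresholds described for Ro."""
    if r0 > 1:
        return "growing"
    if r0 < 1:
        return "dying out"
    return "equilibrium"


# Hypothetical populations (assumed values): same beta and D, different c.
populations = {"high partner exchange": 8, "low partner exchange": 2}
for label, c in populations.items():
    before = reproductive_rate(beta=0.10, c=c, d=2)
    after = reproductive_rate(beta=0.05, c=c, d=2)  # transmission efficiency halved
    print(f"{label}: Ro {before:.1f} ({epidemic_trend(before)}) -> "
          f"{after:.1f} ({epidemic_trend(after)})")
```

In this sketch the same halving of β moves the high-exchange population from above one to below one, while the low-exchange population was already below one, so the identical intervention has very different consequences for the course of the epidemic.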

To complicate matters further, it must also be recognized that changes in one parameter may directly or indirectly influence one of the other parameters. For example, some have argued that an intervention program that successfully increased condom use could also lead to an increase in number of partners, perhaps because now one felt safer. If this were the case, an increase in condom use or a reduced prevalence of STDs would not necessarily lead to a decrease in the reproductive rate. The impact on HIV incidence of a change in STD incidence or of a change in condom use will differ, depending on the values of the other parameters in the model. Condom use behaviors are very different with partners perceived as “safe” than with partners perceived as “risky.” Thus, one should not expect to find a simple correlation between decreases in transmission efficiency and reductions in HIV seroconversions. Moreover, it should be recognized that many other factors may influence transmission efficiency (e.g., degree of infectivity of the donor, characteristics of the host, type and frequency of sexual practices), and variations in these factors will also influence the nature of a relationship between decreased STD rates, increased condom use, and HIV seroincidence.2,3

These conclusions follow from other models of HIV transmission. For example, consider Reiss and Leik's25 model of the probability of being infected with HIV: $p = 1 - [1 - \pi + \pi(1 - \phi\alpha)^{n/s}]^{s}$, where π indicates prevalence, α indicates infectivity, φ indicates condom failure probability, n indicates number of sex acts, and s indicates number of partners. Condom use is built directly into this model; moreover, the impact of condom use will depend on the degree of infectivity, the prevalence of the disease in the population, the number and type of sex acts, and the number of partners.
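The model can be written as a short function, reading n/s and s in the formula as exponents (n/s sex acts with each of s partners). This is a sketch of the formula as stated here, not a reproduction of the authors' own calculations; the parameter values and the function name are hypothetical.

```python
def infection_probability(prevalence, infectivity, condom_failure, n_acts, n_partners):
    """p = 1 - [1 - pi + pi * (1 - phi * alpha) ** (n / s)] ** s

    prevalence     (pi):    probability that a given partner is infected
    infectivity    (alpha): per-act transmission probability without protection
    condom_failure (phi):   probability that protection fails (1.0 = no condom use)
    n_acts         (n):     total number of sex acts
    n_partners     (s):     number of partners, with acts spread evenly across them
    """
    acts_per_partner = n_acts / n_partners
    # Probability of escaping infection from one partner: the partner is either
    # uninfected, or infected but no act with that partner transmits.
    escape_one_partner = (1 - prevalence) + prevalence * (
        1 - condom_failure * infectivity
    ) ** acts_per_partner
    return 1 - escape_one_partner ** n_partners


# Hypothetical scenario (assumed values): 5% prevalence, 1% per-act infectivity,
# 100 acts spread over 5 partners, without condoms and with a 10% failure rate.
no_condoms = infection_probability(0.05, 0.01, 1.0, 100, 5)
with_condoms = infection_probability(0.05, 0.01, 0.1, 100, 5)
print(f"{no_condoms:.4f} vs {with_condoms:.4f}")
```

Varying one argument while holding the others fixed shows how strongly the effect of condom use depends on prevalence, infectivity, and the number of acts and partners.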

Although this model, like the model proposed by May and Anderson, clearly indicates that there is no reason to expect a simple relationship between a behavioral change and the reproductive rate of HIV, the two models make different predictions about the effect of a behavior change. For example, while the May and Anderson model suggests that changing number of partners and sexual-mixing patterns will have the strongest effect on HIV incidence, the Reiss and Leik model suggests that reductions in number of partners would be relatively ineffective, whereas increases in condom use should have maximal impact on the epidemic. These differential predictions occur, in part, because one model assumes that the probability of infection is defined per sexual act, whereas the other model assumes that the probability of infection is defined per sexual partnership. Depending on the assumptions made, then, a model may lead to very different predictions of the relationships among behaviors, STDs, and HIV. Thus, although the Consensus Development Conference on Interventions to Prevent HIV Risk Behaviors1 suggested that “modeled estimates of seroincidence based on behavioral outcomes can be used to estimate the effects of prevention programs on seroincidence,” the variability in models and assumptions must be recognized, and considerably more work is necessary before models will be able to provide valid estimates.

Factors Influencing Transmission of Other STDs

The May and Anderson model also suggests that one should not expect to find simple relationships between increased condom use and STD incidence. As in the case of HIV, the effect of increasing condom use (i.e., decreasing transmissibility) on the incidence of an STD depends on many factors. Moreover, different STDs have very different transmissibility rates, and condoms are not equally effective in preventing all STDs. Although correct and consistent condom use can prevent HIV, gonorrhea, syphilis, and probably chlamydia, condoms are less effective in interrupting transmission of herpes and genital warts.26 Thus, although one is always better off using a condom than not using a condom, the impact of condom use is expected to vary by disease. In addition, for many STDs, transmission from men to women is more efficient than transmission from women to men. For example, with one unprotected coital episode with a person with gonorrhea, there is approximately a 50% to 90% chance of transmission from male to female, but only a 20% chance of transmission from female to male. Similarly, with HIV, male-to-female transmission is approximately 0.1% to 20%, whereas female-to-male transmission is approximately 0.01% to 10%.


There are a number of questions concerning the use of behavioral or biological measures as markers or “surrogates” for HIV seroconversion, and the use of STDs as an indicator or validator of behavior change. At the same time, both behavior change and STD control can, under certain circumstances, help to reduce the transmission of HIV and other STDs.

Nevertheless, to the best of our knowledge, there is no way of knowing to what extent an increase in condom use or a decrease in the prevalence or incidence of an STD will impact on the transmissibility of HIV. Moreover, little is known about the impact of reducing the prevalence of different STDs. For example, does a reduction in syphilis have the same impact on HIV transmission as a reduction in chancroid, gonorrhea, or chlamydia?

Although we cannot answer these questions, it is important to investigate and understand the relationships between behavioral and biological measures; in particular, to understand when and under what circumstances one can expect to find a relationship between behavioral and biological outcomes. Unfortunately, it is unlikely that we will be able to do this until behaviors are assessed more precisely, and new or incident STDs can be more accurately identified. From the behavioral-science perspective, our most pressing problem and our greatest challenge is assessing correct as well as consistent condom use.

Assessing Correct and Consistent Condom Use

Condom use is most often assessed by asking respondents how many times they have engaged in sex, and then asking them to indicate how many of these times they used a condom. First, it is important to note that different answers to these questions will be obtained depending on the time frame used (e.g., lifetime, past year, past 3 months, past month, past week, last time), and the extent to which the type of sex (e.g., vaginal, anal, or oral) and type of partner (e.g., steady, occasional, paying client) is assessed. Second, no matter what the time frame, type of partner, or type of sex, these numbers (i.e., the number of sex acts and the number of times condoms were used) can be used in at least two different ways. In most of the literature, particularly in the social-psychological literature, the most common outcome measure is the percentage of times the respondent reports condom use. For each subject, one divides the number of times condoms were used by the number of sex acts. Perhaps a more appropriate measure would be to subtract the number of times condoms were used from the number of sex acts, which would yield a measure of the number of unprotected sex acts. Clearly, if one is truly interested in preventing disease or pregnancy, it is the number of unprotected sex acts and not the percentage of times condoms are used that should be the critical variable. Obviously, there is a difference in one's risk of acquiring an STD if one has sex 1000 times and uses a condom 900 times than if one has sex 10 times and uses a condom 9 times; both persons will have used a condom 90% of the time, but the former will have engaged in 100 unprotected sex acts whereas the latter will have engaged in only one unprotected sex act.
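The two outcome measures can be compared directly using the example just given. The sketch below is illustrative only; the function names are hypothetical, and the inputs are the two cases from the preceding paragraph.

```python
def percent_condom_use(sex_acts, condom_acts):
    """Percentage of sex acts in which a condom was used."""
    return 100 * condom_acts / sex_acts


def unprotected_acts(sex_acts, condom_acts):
    """Number of sex acts in which no condom was used."""
    return sex_acts - condom_acts


# The two respondents described in the text: (sex acts, acts with a condom).
respondents = {"A": (1000, 900), "B": (10, 9)}
for name, (acts, with_condom) in respondents.items():
    pct = percent_condom_use(acts, with_condom)
    unprotected = unprotected_acts(acts, with_condom)
    print(f"Respondent {name}: {pct:.0f}% condom use, {unprotected} unprotected acts")
```

Both respondents score 90% on the percentage measure, yet their numbers of unprotected acts differ by a factor of 100, which is why the two measures need not be strongly correlated.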

Table 1 shows the relations between these two measures for vaginal and anal sex and for men and women in a sample of STD clinic patients. Although these two measures are significantly correlated, the relationship between them is not strong. Correlations are higher for men than for women and for vaginal sex than for anal sex. Clearly, if we are going to investigate the impact of self-reported condom use on unplanned pregnancy or disease, we should be looking at the number of unprotected sex acts rather than the percentage of times condoms are used.

Table 1. Correlations Between Number of Unprotected Sex Acts and Percent of Times a Condom Was Used

If the number of unprotected sex acts is more important than the percentage of times condoms are used, it is necessary to address the next parameter: correct condom use. Consistent condom use is not necessarily correct condom use, and incorrect condom use almost always equates to unprotected sex. Correct condom use involves a number of steps. One should check the expiration date before using any condom; old condoms are likely to leak or break. One should know when and how to wear a condom. The “when” is fairly simple: the condom should be put on before any penile-vaginal or penile-anal contact, and it should stay on until the penis has been completely withdrawn. The “how” is also fairly simple. One should open the package carefully so as not to tear or puncture the condom. One should place the condom correctly on the penis so that it can be easily unrolled down the shaft. One should hold the tip and squeeze out the air before unrolling to leave a reservoir for the semen. Finally, one should unroll the condom to cover the entire shaft of the penis. For maximum safety, one should withdraw immediately after orgasm while the penis is still erect. One should hold the base of the condom during withdrawal, so that the condom cannot slip and the semen cannot leak out. One should use only water-based lubricants, because other lubricants can damage the latex. One should never use the same condom more than once, and when a condom is put on incorrectly (i.e., so that it does not unroll correctly), one should never simply flip the condom over and unroll it. If a man has an STD such as gonorrhea, there is likely to be some discharge containing the pathogen (Neisseria gonorrhoeae). If the condom is put on backwards, the outside tip of the condom becomes coated with the pathogen. Flipping the condom over and then having sex puts the pathogen directly in contact with the cervix, which may be an efficient way of transmitting an STD.

How Often Does Incorrect Condom Use Occur?

According to Warner et al,27 there is a great deal of incorrect condom use, at least among college men. Warner and his colleagues asked 47 sexually active male college students (age range 18-29 years) who had used condoms at least five times in their lifetime and who reported using condoms in the month before the study to report the number of times they had vaginal intercourse and the number of times they used condoms during the last month. In addition, the subjects were asked to quantify the number of times they experienced several problems while using condoms in the last month (e.g., breakage, slippage). The 47 men used a total of 270 condoms in the month preceding the study; 31.9% of the men reported that the condom was put on inside out and then flipped over; 17% reported that intercourse was started without a condom but they then stopped to put one on; 12.8% reported breaking a condom during intercourse or withdrawal; 8.5% started intercourse with a condom but then removed it and continued intercourse; and 6.4% reported that the condom fell off during intercourse or withdrawal. Note that all of these men could report always using a condom, yet all could have transmitted or acquired an STD.

However, these are college students, many of whom have had relatively little experience with using a condom or with sex in general. So let us consider a more sexually experienced population. Project RESPECT,28 a multisite, randomized controlled trial, evaluated the effectiveness of three types of counseling for STD clinic patients. As part of the three interventions, many respondents were given condom-skills training, and as part of a 12-month follow-up, a subset of respondents were asked to put a condom on a model of a penis. Table 2 shows the percentage of women and men correctly performing each step while completing this task. Note that whereas more than 90% of both men and women correctly opened the condom package, placed the condom correctly, and unrolled the condom completely, more than 10% forgot to leave a space at the tip, and more than 35% did not remove the air from the tip. Further, only 51% of women and 58% of men correctly performed all five steps. Those who received condom-skills training did, in fact, perform better on this test than those who did not receive training (data not shown); however, even those receiving training were far from perfect and were most likely to make the same two mistakes.

Table 2. Condom Use by STD Clinic Clients

The failure to leave a space at the tip or to remove air from the tip are mistakes that are likely to lead to condom breakage. But as indicated above, there are other mistakes that can be assessed, and questions about these errors were also asked at the 12-month follow-up visit. Although more than 90% of STD clinic patients reported that they had ever used a condom, only 25% reported that they have always checked the condom's expiration date. Moreover, only 25% of the men and 15% of the women reported that either they or their partners withdraw while the penis is still erect, and almost one third of both men and women did not report holding the rim of the condom on withdrawal. Sixteen percent of women and 5% of men reported that they have had an allergic reaction to condoms; almost 75% of the men and slightly more than half the women attributed this reaction to the spermicide, although the reaction could be due to the latex. Most respondents reported using a lubricated condom sometime during the past year, with more than 60% of respondents using a lubricated condom in the past 3 months. Among those who added lubrication to nonlubricated or lubricated condoms, the most common lubricant was K-Y, which is a water-based lubricant that will not deteriorate the latex in the condom. However, Vaseline was used by approximately 6.5% of men and women and, as discussed previously, this oil-based lubricant can lead to deterioration of the condom and increase breakage. Finally, almost 23% of women and 32% of men reported putting lubricant on the shaft of the penis, whereas 9% of women and approximately 6% of men explicitly stated that they put lubricant inside the condom; both of these practices are likely to increase slippage and thus leakage.

Consistent with the above analysis, both men and women reported high rates of slippage and breakage. For example, 34% of women and 36% of men reported condom breakage during the past 12 months, with 11% of women and 15% of men reporting condom breakage in the past 3 months. Similarly, 31% of women and 28% of men reported that a condom fell off in the past 12 months, whereas 8% of both men and women reported slippage in the past 3 months. Perhaps not surprisingly, women are significantly more likely than men to report condom leakage (17% versus 9%) and significantly less likely to report flipping the condom or putting it on backwards (29% versus 39%). Again, the remarkable findings are the high proportion of both men and women reporting different types of condom mistakes. For example, 31% of the men and 36% of the women reported starting sex without a condom and putting one on later during intercourse, whereas 26% of men and 23% of women reported starting sex with a condom and then taking it off and continuing intercourse. These data probably reflect the fundamental conflict between using a condom for family-planning purposes and using one for the prevention of STDs, including HIV.

Not only do we know that people are unlikely to use more than one form of contraception, but even among condom users, those who use condoms for family-planning purposes use them differently than those who use condoms for disease prevention. Indeed, the practice of not wearing a condom during the entire sex act probably reflects incorrect beliefs about how pregnancy is prevented. Regardless of the reason for these behaviors, all of these persons could have transmitted or acquired an STD during sex, despite the fact that they had used a condom.

Within the constraints of one's ability to accurately recall past events, people do appear to be honest in reporting their sexual behaviors, including their condom use. Behavioral scientists must obtain better measures, not of condom use per se, but of correct and consistent condom use, or perhaps even more important, of the number of unprotected sex acts in which a person engages. These types of measures will better elucidate the relationships between behavioral and biological measures.


Although the ultimate goal of a behavioral intervention to prevent the spread of HIV is to reduce HIV seroincidence, the use of HIV seroincidence rates as a criterion is rarely feasible, particularly in the United States. Thus, many have called for the use of other biological measures, such as incident STDs, to evaluate the effectiveness of behavioral interventions. Unfortunately, STD incidence may be no more strongly related to HIV seroincidence than self-reports of condom use. Just as there are reasons to question the validity of some behavioral self-reports, there are also reasons to question the validity of some STD diagnostic tests, particularly in field settings. In addition, as a number of investigators have now pointed out, it is inappropriate to assume a simple linear relationship between self-reported condom use and STD incidence. For example, the impact of an increase in condom use on the incidence of an STD will depend on the prevalence of the STD in the population being considered, the sexual mixing patterns in that population, characteristics of the host, and the transmissibility of the STD under consideration. Thus, the same 10% increase in condom use could lead to significant reductions in STD incidence in some situations but not in others.
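One way to see why the relationship need not be linear is a deliberately simplified Bernoulli-process sketch. Everything in it is an assumption made for illustration: a single partner who is infected with probability equal to the population prevalence, independent sex acts, condoms that block transmission completely when used, and hypothetical prevalence and per-act transmission values. Sexual mixing patterns and host characteristics are ignored, and the "10% increase" is modeled as a rise from 50% to 60% of acts protected.

```python
def infection_risk(prevalence, per_act_risk, sex_acts, condom_fraction):
    """Cumulative infection probability under a simple Bernoulli model.

    Assumes one partner who is infected with probability `prevalence`,
    independent acts, and condoms that fully block transmission when used.
    """
    unprotected = sex_acts * (1.0 - condom_fraction)
    return prevalence * (1.0 - (1.0 - per_act_risk) ** unprotected)


# Hypothetical scenarios: (label, prevalence, per-act transmission probability).
scenarios = [
    ("highly transmissible STD", 0.20, 0.50),
    ("poorly transmissible STD", 0.20, 0.01),
]

for label, prevalence, per_act_risk in scenarios:
    before = infection_risk(prevalence, per_act_risk, sex_acts=100, condom_fraction=0.50)
    after = infection_risk(prevalence, per_act_risk, sex_acts=100, condom_fraction=0.60)
    relative_drop = (before - after) / before
    print(f"{label}: {before:.4f} -> {after:.4f} ({relative_drop:.1%} lower)")
```

Under these assumed values, the same change in condom use leaves the risk of the highly transmissible infection essentially unchanged, because 40 unprotected acts are already enough to make transmission from an infected partner nearly certain, whereas the risk of the poorly transmissible infection falls by roughly 16%.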

In the same fashion, increases or decreases in STD prevalence or incidence may tell us very little about the impact of an STD-control program on HIV incidence. Here too, the impact of reducing STD incidence on HIV incidence will depend on the prevalence of both HIV and STD, the sexual mixing patterns in the population, and other previously described variables. As the Mwanza study8 has shown, we can have reductions in HIV incidence with little or no change in STD incidence; and as the Rakai study9 has demonstrated, we can have significant, albeit small, decreases in STD incidence with little or no change in HIV incidence. Clearly, if we are interested in HIV prevention, then it will be important to assess changes in STD, HIV, and behaviors wherever possible. Indeed, if we only had STD incidence data from Rakai and Mwanza, Mwanza would have been judged a failure whereas Rakai would have been judged a success. Unfortunately, as mentioned previously, it is not always possible to obtain HIV-incidence data. In such cases, STD-incidence data provide no more insight into the success of an HIV prevention program than do behavioral data. STD-incidence and behavior change are relatively independent outcome measures that provide different insights into understanding the effectiveness of a behavior-change intervention. Although it is always possible to obtain behavioral data, it may not always be possible or appropriate to obtain STD-incidence data. For example, STD outcome measures may be inappropriate criteria for development and initial testing of behavior-change interventions or when interventions are designed for primary prevention in populations with low STD prevalence. Requiring biological markers in these cases will increase the cost and complexity of these studies without yielding additional information.

It is also important to understand when and under what circumstances behavioral self-reports and biological or biochemical measures of sexual activity will be related; this can only be done when both types of measures are assessed. In particular, it would be important to understand how increases in condom use or decreases in incident STDs translate into reduced incidence of HIV or other STDs. Before we can begin to answer this question, it will be necessary to improve our assessments of condom use. In particular, we must move from simple measures of condom use frequency, or consistency of use, to measures of correct condom use. We must begin to recognize that 100% condom use does not always mean 100% protected sex, and we must begin to assess the number of unprotected sex acts and condom-use errors. We will also have to improve our understanding of the sensitivity and specificity of diagnostic tests, and recognize that each STD is unique and may be differentially affected by increased condom use.
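The role of test sensitivity and specificity can be illustrated with the standard relationships between true prevalence, the fraction of people who test positive, and the positive predictive value. The sketch below is illustrative only; the sensitivity, specificity, and prevalence values are hypothetical and do not describe any particular STD test.

```python
def observed_positive_fraction(prevalence, sensitivity, specificity):
    """Fraction of a tested population expected to test positive."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives + false_positives


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a person who tests positive is truly infected."""
    true_positives = sensitivity * prevalence
    return true_positives / observed_positive_fraction(
        prevalence, sensitivity, specificity
    )


# Hypothetical test characteristics applied at two hypothetical prevalences.
SENSITIVITY, SPECIFICITY = 0.90, 0.98

for prevalence in (0.02, 0.20):
    observed = observed_positive_fraction(prevalence, SENSITIVITY, SPECIFICITY)
    ppv = positive_predictive_value(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"prevalence={prevalence:.2f} observed={observed:.4f} PPV={ppv:.3f}")
```

With these assumed values, fewer than half of the positive results in the low-prevalence population reflect true infections, which is one reason a biological outcome measured with an imperfect test cannot simply be treated as the truth against which self-reports are judged.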

Thus, it will be useful to assess the effects of a behavioral intervention on both behavioral and biological measures. Whether STD incidence declines may not provide the best data for determining whether there was behavioral change, or whether the intervention had an impact on HIV seroconversions. Unfortunately, in the STD/HIV domain, we are almost always talking about behaviors that cannot be directly observed, and we must therefore rely on behavioral self-reports. Although there is no doubt that self-reports can be inaccurate as a result of faulty recall or falsified accounts, the available data suggest that when proper procedures are followed, behavioral self-reports, even those concerning sensitive sexual behaviors, can be accurate and valid. Thus, even in the absence of biological outcomes, behavioral self-reports can provide useful information about the efficacy and effectiveness of behavior-change interventions.


1. NIH Consensus Development Panel. NIH Consensus Statement. Statement from the Consensus Development Conference on Interventions to Prevent HIV Risk Behaviors. 1997; 15:1–41.
2. Aral SO, Peterman TA. Do we know the effectiveness of behavioural interventions? Lancet 1998; 351(suppl 3):33–36.
3. Aral SO, Peterman TA. Measuring outcomes of behavioural interventions for STD/HIV prevention. Int J STD AIDS 1996; 7(suppl 2):30–38.
4. NIMH Multisite HIV Prevention Trial. Endpoints and other measures in a multisite HIV prevention trial: rationale and psychometric properties. AIDS 1997; 11(suppl 2):37–47.
5. O'Leary A. Why use STD incidence to evaluate behavioral interventions? Paper presented at: CDC Colloquium, Atlanta, 1998.
6. Schachter J, Chow JM. The fallibility of diagnostic tests for sexually transmitted disease: the impact on behavioral and epidemiologic studies. Sex Transm Dis 1995; 22:191–196.
7. Zenilman JM, Weisman CS, Rompalo AM, et al. Condom use to prevent incident STDs: the validity of self-reported condom use. Sex Transm Dis 1995; 22:15–21.
8. Grosskurth H, Mosha F, Todd J, et al. Impact of improved treatment of sexually transmitted diseases on HIV infection in rural Tanzania: randomized control trial. Lancet 1995; 346:530–536.
9. Wawer MJ. The Rakai randomized community-based trial of STD control for AIDS Prevention: no effect on HIV incidence despite reductions in STDs (abstract no. 9). Paper presented at: the International AIDS Conference, 1998.
10. Cates W. Sexually transmitted diseases: epidemiologic measures. Paper prepared for: EPID 731, Graduate Summer Session in Epidemiology, University of Michigan, Ann Arbor, 1996.
11. Stone KM, Timyan J, Thomas EL. Barrier methods for the prevention of sexually transmitted diseases. In: Holmes KK, Mardh P-A, Sparling PF, Lemon SM, Stamm WE, Piot P, Wasserheit JN, eds. Sexually Transmitted Diseases. 3rd ed. New York: McGraw-Hill, Inc, 1999:1307–1321.
12. Catania JA, Gibson DR, Chitwood DD, Coates TJ. Methodological problems in AIDS behavioral research: influences on measurement error and participation bias in studies of sexual behavior. Psychol Bull 1990; 108:339–362.
13. Durante R, Sorenson S. Literature review for the design of the multimethod validation study of self-reported sexual behavior among in-school adolescents. Document prepared by Battelle for the Centers for Disease Control, Atlanta, 1995.
14. Jaccard J, Wan CK. A paradigm for studying the accuracy of self-reports of risk behavior relevant to AIDS: Empirical perspectives on stability, recall bias, and transitory influences. J Appl Soc Psychol 1995; 25:1831–1858.
15. Udry J, Morris N. A method for validation of reported sexual data. J Marriage Fam 1967; 29:442–446.
16. Kleyn J, Schwebke J, Holmes KK. The validity of injecting drug users' self-reports about sexually transmitted diseases: a comparison of survey and serological data. Addiction 1993; 88:673–680.
17. Millstein SG, Moscicki A-B. Sexually transmitted disease in female adolescents: effects of psychosocial factors and high risk behaviors. J Adolesc Health 1995; 17:83–90.
18. Lawrence D, Jason J, Holman R, Murphy J. HIV transmission from hemophilic men to their heterosexual wives. In: Alexander N, Gabelnick H, Hodgen G, Spieler J, eds. The Heterosexual Transmission of AIDS: Proceedings of the CONRAD 2nd International Workshop. New York: Alan R. Liss, (in press).
19. Sudman S, Bradburn N. Response Effects in Surveys. Chicago: Aldine, 1974.
20. Spanier GB. Use of recall data in survey research on human sexual behavior. Soc Biol 1976; 23:244–253.
21. Carey MP, Weinhardt LS, Carey KB, Maisto SA, Gordon CM, Gleason JR. Measuring sexual and substance use behavior: the timeline method. Paper presented at: the NIMH Exploratory Conference on the Interactions of Comorbid Mental Health Factors with High Risk Sexual Behaviors: Implications for HIV/STD Transmission. Bethesda, MD: 1998.
22. Weinhardt LS, Carey MP, Maisto SA, Carey KB, Cohen MM, Wickramasinghe SM. Reliability of the timeline followback sexual behavior interview. Ann Behav Med 1998; 20:15–30.
23. May RM, Anderson RM. Transmission dynamics of HIV infection. Nature 1987; 326:137–142.
24. Pinkerton SD, Abramson PR. An alternative model of the reproductive rate of HIV infection: formulation, evaluation, and implications for risk reduction interventions. Eval Rev 1994; 18:371–388.
25. Reiss IL, Leik RK. Evaluating strategies to avoid AIDS: number of partners vs. use of condoms. J Sex Res 1989; 26:411–433.
26. Cates W, Holmes KK. Re: Condom efficacy against gonorrhea and nongonococcal urethritis. Am J Epidemiol 1996; 143:843–844.
27. Warner L, Clay-Warner J, Boles J, Williamson J. Assessing condom use practices: implications for evaluating method and user effectiveness. Sex Transm Dis 1998; 25:273–277.
28. Kamb ML, Fishbein M, Douglas JM, et al, for the Project RESPECT Study Group. HIV/STD prevention counseling for high-risk behaviors: results from a multicenter, randomized controlled trial. JAMA 1998; 280:1161–1167.
© Copyright 2000 American Sexually Transmitted Diseases Association