As early as 2001 there was evidence from prospective experiments that condoms, if used consistently, reduced the rate of HIV incidence.1,2 Condom use at sexual debut is important because of its documented link with subsequent consistent condom use.3,4 Thus, protecting oneself from infection at sexual debut improves the chances of avoiding infection, by subsequent behavior, at a later point. Research has also confirmed a relationship between condom use at sexual debut and reduced risk of chlamydia, gonorrhea, and other sexually transmitted infections.5
Over the past 20 years, communication programs in South Africa have educated the public about AIDS, emphasizing use of condoms. Over this period, HIV prevalence in antenatal clinics increased from 10% in 1995 to around 20% in 2000 and then to 29% by 2008.6 HIV prevalence among adults, as estimated from national surveys, leveled off: 15.6% in 2002, 16.2% in 2005, and 16.9% in 2008. Condom use at last sex increased from less than 10% before 2000 to 27.3% in 2002, 35.4% in 2005, and 62.4% in 2008.7 Some of this stabilization came from a decrease in HIV incidence among youth younger than 20 years (from 1.0% to 0.6% for 15-year-olds, from 1.2% to 0.5% for 16-year-olds, and from 1.5% to 0.6% for 17-year-olds), which one study suggested was “probably due to increased condom use.”7 Mathematical modeling suggested that declines in HIV incidence between 2000 and 2008 were probably because of increased condom use.8 A systematic review of HIV/AIDS mass communication campaigns has shown that they have increased condom use.9 Secondary analysis of the 2008 Human Sciences Research Council survey found a positive association between exposure to HIV communication programs and condom use at last sex.10
Yet doubts remain that this change can be attributed to condom use, primarily because, at the individual level of measurement in population surveys, the correlation between condom use at sexual debut and HIV-positive status can be positive rather than negative. Many people may already be infected before they start using condoms consistently, or may become infected because condoms are used inconsistently and/or incorrectly.
The role of condom use at sexual debut has been overlooked. Figure 1 presents an analysis of 3 national surveys that shows the upward trend in condom use at sexual debut by “sexual generation,” the difference between current age and age at sexual debut, grouped into 3-year cohorts.11,12 In the 2005 survey, condom use at sexual debut 34–45 years ago was 6%. It remained at that level until around 1995, when it suddenly increased over the next 10 years to 55%. Subsequent surveys found the same trend, but with condom use at sexual debut reaching 68% by 2008. The consistency among the 3 national surveys indicates high measurement reliability. Each new cohort is presumably HIV negative up until the time of the first sexual encounter, with the exception of mother-to-child transmission and infected needle use. Later infection would result from irregular or improper condom use after sexual debut.
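The "sexual generation" grouping described above can be sketched in a few lines. This is a minimal illustration rather than the surveys' actual coding: the function name and the choice to start each 3-year cohort at a multiple of 3 are assumptions.

```python
from typing import Tuple


def sexual_generation(current_age: int, age_at_debut: int, width: int = 3) -> Tuple[int, int]:
    """Return the (start, end) bounds, in years since sexual debut, of a cohort.

    The cohort boundaries (multiples of `width`) are an assumption made for
    illustration; the source only states that cohorts span 3 years.
    """
    years_since_debut = current_age - age_at_debut
    if years_since_debut < 0:
        raise ValueError("age at debut cannot exceed current age")
    start = (years_since_debut // width) * width
    return start, start + width - 1
```

For example, a 30-year-old respondent whose sexual debut was at age 17 falls in the cohort covering 12 to 14 years since debut.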
Statistical analysis of this relationship at the individual level is missing, due in part to the lack of cohort data at the population level that can track individual behavior over time as well as exposure to the programs that promote it. Also overlooked are advances in statistical analysis that allow for enhanced causal arguments from cross-sectional data if conducted after interventions have been implemented and specific assumptions hold. This article presents a secondary statistical analysis of the 2005 national South African Human Sciences Research Council survey to examine whether protection from HIV infection (remaining HIV negative) might be attributed to condom use at sexual debut and other prevention behaviors.11 Two interrelated hypotheses are tested: that HIV communication programs have an indirect effect on HIV status through their direct effect on condom use at sexual debut and that condom use at sexual debut assists in the prevention of HIV infection.
All persons over 2 years of age living in households in South Africa were sampled by means of a multi-stage disproportionate, stratified sampling approach, using a sampling frame provided by Statistics South Africa for the 2001 census. The data were adjusted for nonresponse and weighted by gender, age, race, locality type, and province to produce a representative sample of the population. Our analysis used a subsample of 6829 adults aged 15–84 years who reported having sexual intercourse during the previous 12 months (sexually active), answered the question about condom use at sexual debut, and agreed to be tested for HIV. This subsample represents a frequency-weighted population of 18,922,667 adults.
During household interviews, blood specimens were collected on absorbent paper and then tested for HIV using the Vironostika HIV-1 Uniform II Plus O assay (bioMerieux, Durham, NC); all HIV-positive samples were then retested with a second enzyme-linked immunosorbent assay test (Vitros ECI; Ortho Clinical Diagnostics, Rochester, NY). Condom use was measured with 2 yes/no questions: “Did you use a condom the first time you had sexual intercourse?” and “Did you use a condom the last time you had sex?” The question “Have you ever used any drug by injection?” measured the risk from using injectable drugs. Faithfulness to 1 sex partner was ascertained from those who said being faithful or trusting their partner was a reason they did not feel at risk for HIV infection.
Exposure to AIDS communication programs was measured by asking respondents how many of 8 specific HIV/AIDS communication programs they were aware of—the government's Khomanani program (38% aware), loveLife (52%), Soul City (64%), Soul Buddyz (46%), provincial government campaigns (23%), Gazlam (49%), Tsha Tsha (45%), and Takalani Sesame (55%)—summed to create a continuous scale from 0 to 8 (Cronbach alpha = 0.88). For the analysis, we used multiple linear regression to obtain a predicted measure of “program awareness.” This created a measure that is exogenous to condom use at sexual debut and thus eliminates the potential threat of reverse causality, a requirement for a causal inference.13 To account for contextual influences on condom use behavior, we measured burden of disease as the average level of HIV infection in one's sampling cluster, excluding oneself (non-self mean). The natural log of this measure was used to adjust for skewness.
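The construction of the awareness scale, its reliability coefficient, and the non-self cluster mean can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, items are assumed to be coded 0/1, and the handling of single-member clusters (returned as NaN) and of zero prevalence before taking the natural log is not described in the source.

```python
import math
from statistics import pvariance
from typing import List, Sequence


def awareness_score(items: Sequence[int]) -> int:
    """Sum of 0/1 awareness indicators (0 to 8 when eight programs are asked about)."""
    return sum(items)


def cronbach_alpha(rows: List[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(rows[0])
    item_variances = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def non_self_mean(hiv_positive: Sequence[int], cluster: Sequence[str]) -> List[float]:
    """Mean HIV status in each respondent's cluster, excluding the respondent."""
    totals, counts = {}, {}
    for y, c in zip(hiv_positive, cluster):
        totals[c] = totals.get(c, 0) + y
        counts[c] = counts.get(c, 0) + 1
    result = []
    for y, c in zip(hiv_positive, cluster):
        n_others = counts[c] - 1
        # Assumption: a cluster with one member has no defined non-self mean.
        result.append((totals[c] - y) / n_others if n_others > 0 else math.nan)
    return result
```

The natural log described in the text would then be applied to the non-self mean; any adjustment for clusters with a mean of zero would be an additional assumption.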
We used a combination of existing statistical methods to test the assumptions required for enhancing causal arguments. These include structural equation modeling with multivariate probit regression to test the path from communication awareness to condom use to HIV status, and propensity score matching (PSM) with sensitivity analysis to estimate the proportion of HIV-negative status that can be attributed to condom use at sexual debut. These complementary statistical methods provide mutually reinforcing support for the argument that statistical associations might have a causal element.14–16 Such a multi-method approach provides a useful alternative to the randomized controlled trial (RCT), which may not be feasible at the national level and may lack generalizability when research participants are not representative.
Structural equation modeling (SEM) examines two or more endogenous (dependent) variables, which are covariates in the other equations. It requires testing for exogeneity (independence) of these covariates in their respective equations and controlling for potential confounding variables. A successful exogeneity test helps reduce the threat of reciprocal causality and self-selection bias (voluntary exposure to treatment rather than random assignment). It also requires that theoretical assumptions be made to define the equations in the SEM. In this analysis, 3 interrelated equations are used, corresponding to the following testable theoretical path model:
where y1it is HIV status, y2it is condom use at sexual debut, and y3it is awareness of the 8 communication programs for subject i measured at time t. Xit, Zit, and Wit are matrices of exogenous, socioeconomic, and demographic control variables and other explanatory variables. Coefficients β, δ, and γ are parameters to be estimated from the data with regression analysis, and ε1it, ε2it, and ε3it are the disturbance terms (residuals from the fit of each model to the data). Multivariate probit regression is used to estimate the parameters of HIV-negative status and condom use at sexual debut. The continuous variable, awareness of communication programs, is analyzed by ordinary least squares (linear) regression to obtain the predicted variable used in the other 2 equations.17
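The 3 equations of the path model are not reproduced in this copy of the text. One form consistent with the definitions above is sketched here as an assumption, not as the authors' exact specification: the pairing of X, Z, and W with particular equations, the scalar coefficients β1, β2, and δ1, the latent-variable (starred) notation for the 2 probit equations, and the hat on predicted awareness are all introduced for illustration.

```latex
\begin{aligned}
y^{*}_{1it} &= \beta_{1}\, y_{2it} + \beta_{2}\, \hat{y}_{3it} + X_{it}\,\beta + \varepsilon_{1it}
  && \text{(HIV-negative status, probit)} \\
y^{*}_{2it} &= \delta_{1}\, \hat{y}_{3it} + Z_{it}\,\delta + \varepsilon_{2it}
  && \text{(condom use at sexual debut, probit)} \\
y_{3it} &= W_{it}\,\gamma + \varepsilon_{3it}
  && \text{(program awareness, ordinary least squares)}
\end{aligned}
```

Under this reading, the indirect effect of communication on HIV status runs through δ1 and then β1, and the exogeneity test concerns the correlation between ε1it and ε2it.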
The test for exogeneity of condom use at sexual debut indicates whether there are unobserved/omitted variables that may account for its observed relationship with HIV status and whether the relationship could be reciprocal (reverse causality). We use multivariate probit regression to test whether the correlation (rho coefficient) of the 2 residual terms, ε2it and ε1it, is significantly different from zero. If it is not, y2it (condom use at sexual debut) can be considered exogenous in the equation for HIV status and a causal inference would be strengthened.18
PSM seeks to emulate the random assignment characteristic of RCTs by constructing an untreated, counterfactual control group that is statistically equivalent to those treated.19 As with RCTs, the average estimated difference between the treatment and constructed control might be interpreted as causal. Logistic regression is used to calculate the propensity score for each respondent, the probability of being treated (condom use at sexual debut), that is conditional on a set of the measured confounding variables. Matching treated and untreated cases based on this propensity score creates a control group in which all included covariates are balanced (statistically equivalent) between those treated and those not treated.20
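The matching step can be illustrated with a small sketch of nearest neighbor matching on an already estimated propensity score. It assumes the scores have been obtained elsewhere (for example, from a logistic regression), matches with replacement, and uses hypothetical names; it is not the authors' implementation.

```python
from typing import Sequence


def att_nearest_neighbor(score: Sequence[float], treated: Sequence[int],
                         outcome: Sequence[float]) -> float:
    """Average treatment effect on the treated via 1-to-1 nearest neighbor matching.

    Each treated case is paired (with replacement) with the untreated case
    whose propensity score is closest; the ATT is the mean outcome difference.
    """
    controls = [(s, y) for s, d, y in zip(score, treated, outcome) if not d]
    if not controls:
        raise ValueError("no untreated cases to match against")
    differences = []
    for s, d, y in zip(score, treated, outcome):
        if d:
            _, matched_outcome = min(controls, key=lambda c: abs(c[0] - s))
            differences.append(y - matched_outcome)
    if not differences:
        raise ValueError("no treated cases")
    return sum(differences) / len(differences)
```

In this setting, `treated` would flag condom use at sexual debut and `outcome` would flag HIV-negative status.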
PSM assignment is said to be “strongly ignorable” (conditionally independent) when, conditional on the observed covariates, there are no systematic, unobserved pretreatment differences between exposed and unexposed subjects that are related to the outcome being studied.21,22 If the multivariate probit regression shows that the residual terms are systematically related (rho statistically significant), then unobserved pretreatment differences are likely to exist. If the correlation of the residuals (rho) is confirmed as not statistically significant, these differences are less likely to exist. Residual terms whose correlation is not statistically significant are more likely to be randomly rather than systematically related; hence, the results of the PSM mimic what would be obtained if random assignment were used.23 Sensitivity analysis supports the robustness of PSM by statistically simulating a potential binary confounding variable and then adding it to the propensity score calculation to see how much it would affect the original conditional independence assumption.24,25
The descriptive statistics for HIV status and the 17 socioeconomic variables used in the regressions for the SEM are reported in column 2 of Table 1. The average rate of HIV-negative status for those sexually active in the past 12 months (N = 6829) is 83.3% (conversely, 16.7% HIV positive). The next 3 columns present the estimated parameters of the SEM for HIV communication awareness, condom use at sexual debut, and HIV-negative status, and their 95% confidence intervals (CIs). Likelihood ratio tests were performed to justify exclusion of nonsignificant variables from each equation, required to identify the model.
Fit of the Structural Equation Model to the Data
The variance explained by the model for communication program awareness is 0.41 (adjusted R2); it is 0.20 for condom use at sexual debut and 0.18 for HIV-negative status (pseudo-R2). Fit of the models to the data was estimated with the Stukel likelihood ratio χ2 goodness-of-fit test.26,27 A model that fits well has a χ2 statistic small enough that it is not statistically significant, indicating no difference between the model and the data.
The model for condom use at sexual debut satisfies this criterion (χ2 = 0.82, P > 0.66) as does the model for HIV-negative status (χ2 = 5.13, P > 0.08). The multivariate probit analysis shows that the correlation of the residual terms, rho, from the condom use and HIV status equations is not statistically significant [rho = −0.048; χ2 = 1.06 (1df), P > 0.30].
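The P values quoted above can be checked against the χ2 distribution with a short calculation. The sketch below uses closed-form upper-tail probabilities for 1 and 2 degrees of freedom only. The 1 df for the rho test is stated in the text; treating the Stukel test as having 2 df is an assumption, because the text does not report its degrees of freedom.

```python
import math


def chi2_upper_tail(statistic: float, df: int) -> float:
    """Upper-tail probability of a chi-square statistic for df = 1 or df = 2."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    if df == 1:
        # For 1 df the statistic is a squared standard normal deviate.
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        # For 2 df the chi-square is an exponential distribution with mean 2.
        return math.exp(-statistic / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


p_condom_model = chi2_upper_tail(0.82, df=2)  # Stukel test, 2 df assumed
p_hiv_model = chi2_upper_tail(5.13, df=2)     # Stukel test, 2 df assumed
p_rho = chi2_upper_tail(1.06, df=1)           # residual correlation test, 1 df
```

Under these assumptions the three probabilities are all above the conventional 0.05 threshold, which is the criterion the text applies for adequate fit and for a nonsignificant rho.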
Condom Use at Sexual Debut
The strongest predictors of awareness of communication programs are frequency of television viewing, radio listening, newspaper reading, and discussion of AIDS in community meetings, which are generally the primary channels used to promote condom use. The multivariate probit coefficient for the impact of predicted communication awareness on condom use at sexual debut is 0.056 (P < 0.001, 95% CI: 0.02 to 0.09), adjusting for all other variables.
Figure 2 shows the adjusted marginal effects of predicted awareness of communication programs on condom use at sexual debut. The strongest impact is on youth aged 15–24 years (upper trend line). This finding is consistent with the aggregate-level trend in Figure 1, which shows an inflection point in condom use at sexual debut beginning in the mid-1990s. The graph reveals a dose–response effect of the likelihood of communication exposure on condom use at sexual debut, starting at 41.4% for the lowest level of awareness and increasing to 51.4% for the highest level. For the total sample, this response increases from 18.4% to 24.6%. It is lowest for ages 25–43 years, increasing from 13.0% to 17.9%, which is not statistically significant.
The impact of condom use at sexual debut on maintaining one's negative HIV status is positive and statistically significant (0.25; 95% CI: 0.06 to 0.44). Negative status is also significantly more likely among those who say they are not at risk for infection because they are faithful and/or trust their sexual partner, among those with tertiary or higher education, students and pensioners, and white, colored, and Indians compared with blacks. HIV-negative status is significantly less likely among those who ever used injectable drugs, females, singles, the 2 youngest age groups, and those living in communities (clusters) with a higher rate of HIV prevalence. Condom use at last sex is not statistically significant.
Propensity Score Matching
Model 2 from the SEM is used for the propensity score analysis with 2 nonconfounding variables excluded: predicted communication awareness and age at sexual debut. The propensity score (probability) ranges from 0.016 to 0.907 for the region of common support (overlap) of treated and untreated; 12 cases were excluded because of lack of common support. The analysis resulted in 9 strata (blocks) along this probability continuum that are balanced with the exception of 1 variable, secondary education, in the second stratum (N = 837) with P = 0.001. In the remaining 8 strata, there are no statistically significant differences between the treated and untreated cases for any of 7 variables used to calculate the propensity score, mimicking random assignment on these variables.
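The region of common support can be illustrated with a short sketch. It uses the common minimum/maximum rule, in which the overlap runs from the larger of the two group minimums to the smaller of the two group maximums; whether the authors applied exactly this rule is an assumption, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def common_support(score: Sequence[float],
                   treated: Sequence[int]) -> Tuple[float, float, List[bool]]:
    """Return the overlap bounds of treated and untreated propensity scores,
    plus a flag per case showing whether it lies inside the overlap."""
    treated_scores = [s for s, d in zip(score, treated) if d]
    untreated_scores = [s for s, d in zip(score, treated) if not d]
    if not treated_scores or not untreated_scores:
        raise ValueError("both treated and untreated cases are required")
    low = max(min(treated_scores), min(untreated_scores))
    high = min(max(treated_scores), max(untreated_scores))
    inside = [low <= s <= high for s in score]
    return low, high, inside
```

Cases flagged `False` would be dropped before matching, as with the 12 excluded cases reported above.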
Stratified matching over 8 balanced strata yielded an average treatment effect on the treated (ATT) of 0.034 (bootstrap, analytical standard error (SE) = 0.011; 95% CI: 0.01 to 0.06).
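Stratified matching of this kind can be sketched as a weighted average of within-stratum differences. The version below weights each stratum by its number of treated cases, which is the usual choice for an ATT; that weighting, the skipping of strata that lack either group, and all names are assumptions for illustration, and standard errors are not computed.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def att_stratified(stratum: Sequence[int], treated: Sequence[int],
                   outcome: Sequence[float]) -> float:
    """ATT as the treated-count-weighted mean of within-stratum outcome differences."""
    cells: Dict[int, Dict[int, List[float]]] = defaultdict(lambda: {0: [], 1: []})
    for s, d, y in zip(stratum, treated, outcome):
        cells[s][1 if d else 0].append(y)
    weighted_sum, n_treated = 0.0, 0
    for groups in cells.values():
        if not groups[0] or not groups[1]:
            continue  # a stratum needs both groups to yield a difference
        difference = (sum(groups[1]) / len(groups[1])
                      - sum(groups[0]) / len(groups[0]))
        weighted_sum += len(groups[1]) * difference
        n_treated += len(groups[1])
    if n_treated == 0:
        raise ValueError("no stratum contains both treated and untreated cases")
    return weighted_sum / n_treated
```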
ATT estimated with nearest neighbor matching was similar, 0.036. As a percentage of all those who used condoms at sexual debut who remained HIV negative, a difference in HIV-negative status of 4.0% (3.4/85.8) may seem small, but as a percentage of those who were infected, 23.9% (3.4/14.2), the difference is substantial. It reflects a 24% lower rate of HIV-positive status for those who used condoms at sexual debut.
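The two percentages in this paragraph follow from dividing the same ATT by two different denominators. A minimal check, using only figures quoted in the text (the function name is hypothetical):

```python
from typing import Tuple


def relative_differences(att: float, hiv_negative_rate: float,
                         hiv_positive_rate: float) -> Tuple[float, float]:
    """Express the ATT relative to the HIV-negative and HIV-positive rates."""
    return att / hiv_negative_rate, att / hiv_positive_rate


# Figures quoted in the text: ATT of 3.4 points, denominators 85.8% and 14.2%.
share_of_negative, share_of_positive = relative_differences(0.034, 0.858, 0.142)
```

Rounded to one decimal place, these are the 4.0% and 23.9% reported above.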
Among the population of 18,922,667 (frequency weighted) sexually active adults in South Africa to whom these results apply, 20.5% (3,884,302) reported using condoms the first time they had sex. The PSM estimate means that 3.4% of them would have become HIV positive if they had not used condoms. This amounts to an estimated 132,066 HIV infections that were averted (0.034 × 3,884,302) because of condom use at sexual debut.
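The population figures in this paragraph can be reproduced directly from the numbers quoted in the text; the variable and function names below are hypothetical.

```python
def infections_averted(att: float, n_treated: int) -> float:
    """Estimated infections averted: the ATT applied to everyone treated."""
    return att * n_treated


population = 18_922_667        # frequency-weighted sexually active adults
used_condom_at_debut = 3_884_302
share_treated = used_condom_at_debut / population
averted = infections_averted(0.034, used_condom_at_debut)
```

Rounding `share_treated` to one decimal place in percent and `averted` to the nearest whole number gives the 20.5% and 132,066 reported above.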
Results from the sensitivity analysis (data not shown) of this PSM imply that the estimate is robust to potential violations in the assumption of strong ignorability (conditional independence).28 A simulated omitted variable similar in magnitude to that of single marital status when added to the PSM reveals that the causal effect of condom use at sexual debut on HIV-negative status would still be large and significant. An omitted variable capable of nullifying the estimated causal effect would have to be able to double the odds of condom use at sexual debut and increase the odds of HIV-negative status almost 8-fold. It is reasonable to assume that variables of such magnitude are rare; hence, the estimate from PSM with the 7 measured variables is not likely to be threatened by potential missing variables.
The aggregate and individual levels of analyses indicate that HIV communication programs have made important contributions to the national AIDS response since 1995. The SEM results support the hypothesis that communication has an indirect effect on HIV infection by inducing condom use at sexual debut. SEM and PSM support the hypothesis that using a condom at sexual debut increases the likelihood of maintaining one's HIV-negative status.
Some who used condoms at sexual debut were not protected over time. Follow-up regression analysis on just the subsample that used condoms at sexual debut (N = 1480) revealed that those who used a condom the last time they had sex were 0.61 less likely to have become infected. Those who were subsequently infected were 2.2 times more likely to be women than men, 6.5 times more likely to be 25–43 years old, 2.5 times more likely to be from a sampling cluster with HIV infection, 3.7 times more likely to be living in KwaZulu Natal province, and 1.8 times more likely to say they did not “trust a partner as a reason for feeling at risk” for infection.
The principal limitation of this kind of study is that we cannot rule out the possibility that those who are aware of HIV communication programs and who used condoms at sexual debut differ systematically from others, such that the observed associations with condom use and HIV-negative status, respectively, are not causal but instead reflect unmeasured behavioral and other confounding variables.
The statistical methods used in this analysis, however, were designed to test and minimize this threat. PSM helps by constructing a statistically equivalent control group for condom use at sexual debut, but its effectiveness depends on variables explicitly included in the analysis; omitted variables may still be a threat. If the regression models fit the data well, SEM can increase confidence that the PSM assumption of strong ignorability—no difference between treated and untreated in ways that would affect the outcome—has been satisfied, but it cannot completely rule out this threat. It, too, is only as effective as the variables that are included in the regressions. Multivariate probit regression provides tests for this possibility statistically, but even nonsignificant relationships might still bias the results. The PSM simulation of potential missing variables offers some support that this would be unlikely: The effect size has to be relatively high for omitted variables to bias the observed results, and they would also have to be independent of the measured variables.
The self-reporting of data is another limitation of this study, although HIV status is based on a laboratory test. Memory of one's first sexual experience is expected to be good, but social desirability may bias the report of condom use. The consistency of this measure across 3 surveys (Fig. 1) indicates a high degree of reliability in terms of stability over time. The reliability of the remaining variables is unknown, but the fact that their relationships with the 3 dependent variables are consistent with theory and results from other surveys implies a minimally acceptable level of reliability and validity.
The study has found evidence that 3 HIV prevention methods offered some protection in the population by 2005, especially condom use at sexual debut, which increased dramatically among youth during the previous 10 years. Further analysis revealed condom use at last sex adds to that initial protection. The multivariate nature of the study sheds light on many other factors that contribute to, or hinder, disease prevention—information that may help improve prevention programs. To maintain and increase the protective effect of condom use at sexual debut, prevention programs should place emphasis on condom use at sexual debut and on consistent and correct use thereafter, noting that it has become the social norm for South Africa.
The authors acknowledge the role of the Human Sciences Research Council (HSRC) of South Africa, which conducted the survey on which this secondary analysis was performed, and all those who participated in that study. They also thank Olive Shisana, President and Chief Executive Officer of HSRC, for making the data from this survey available and Leickness Simbayi, Research Director, for encouraging them to undertake this analysis.
1. Ahmed S, Lutalo T, Wawer M, et al. HIV incidence and sexually transmitted disease prevalence associated with condom use: a population study in Rakai, Uganda. AIDS. 2001;15:2171–2179.
2. Weller SC, Davis-Beaty K. Condom effectiveness in reducing heterosexual HIV transmission. Cochrane Database Syst Rev. 2002;1:CD003255.
3. Shafii T, Stovel K, Davis R, et al. Is condom use habit forming? Condom use at sexual debut and subsequent condom use. Sex Transm Dis. 2004;31:366–372.
4. Hendriksen ES, Pettifor A, Lee SJ, et al. Predictors of condom use among young adults in South Africa: the Reproductive Health and HIV Research Unit National Youth Survey. Am J Pub Health. 2007;97:1241–1248.
5. Shafii T, Stovel K, Holmes K. Association between condom use at sexual debut and subsequent sexual trajectories: a longitudinal study using biomarkers. Am J Pub Health. 2007;97:1090.
6. Shisana O, Rehle T, Simbayi LC, et al. South African National HIV Prevalence, Incidence, Behavior and Communication Survey 2008: Turning the Tide among Teenagers? Johannesburg: Human Sciences Research Council; 2009.
7. Rehle TM, Hallett TB, Shisana O, et al. A decline in new HIV infections in South Africa: estimating HIV incidence from three national HIV surveys in 2002, 2005 and 2008. PLoS ONE. 2010;5:e11094.
8. Johnson LF, Hallett TB, Rehle TM, et al. The effect of changes in condom usage and antiretroviral treatment coverage on human immunodeficiency virus incidence in South Africa: a model-based analysis. J R Soc Interf. 2012;9(72):1544–1554.
9. Noar SM, Palmgreen P, Chabot M, et al. A 10-year systematic review of HIV/AIDS mass communication campaigns: have we made progress? J Health Commun. 2009;14:15–42.
10. Peltzer K, Parker W, Maboso M, et al. Impact of national HIV and AIDS communication campaigns in South Africa to reduce HIV risk behaviour. Scientific World J. 2012;2012:1–6.
11. Shisana O, Rehle T, Simbayi LC, et al. South African National HIV Prevalence, HIV Incidence, Behavior and Communication Survey, 2005. [Computer file]. SABSSM 2005 Adult-youth. Cape Town, South Africa: Human Sciences Research Council Press; 2005.
12. Johnson S, Kincaid DL, Laurence S, et al. Second National HIV Communication Survey 2010. Pretoria, South Africa: Johns Hopkins Health and Education South Africa; 2010.
13. Hill AB. The environment and disease: association or causation. Proc R Soc Med. 1965;58:295–300.
14. Kincaid DL, Do MP. Multivariate causal attribution and cost-effectiveness of a national mass media campaign in the Philippines. J Health Commun. 2006;11(suppl 2):1–21.
15. Babalola S, Kincaid DL. New methods for estimating the impact of health communication programs. Commun Methods Meas. 2009;3:61–83.
16. Pearl J. Causal inferences in statistics: an overview. Stat Surv. 2009;3:96–146.
17. Bollen KA, Guilkey DK, Mroz TA. Binary outcomes and endogenous explanatory variables: tests and solutions with an application to the demand for contraceptive use in Tunisia. Demography. 1995;32:111–131.
18. Greene WH. Econometric Analysis. New York, NY: MacMillan; 1993.
19. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974;66:688–701.
20. Heckman JJ, Ichimura H, Todd P. Matching as an econometric evaluation estimator. Rev Econ Stud. 1998;65:261–294.
21. Rosenbaum P, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.
22. Joffe MM, Rosenbaum PR. Invited commentary: propensity scores. Am J Epidemiol. 1999;150:327–333.
23. Dehejia RH, Wahba S. Propensity score matching methods for non-experimental causal studies. Rev Econ Stat. 2002;84:151–161.
24. Ichino A, Mealli F, Nannicini T. From temporary help jobs to permanent employment: What can we learn from matching estimators and their sensitivity? J Appl Econ. 2008;23:305–327.
25. Rosenbaum PR. Sensitivity analysis to certain permutation inferences in matched observational studies. Biometrika. 1987;74:13–26.
26. Stukel TA. Generalized logistic models. J Am Stat Assoc. 1988;83:426–431.
27. Hosmer DW, Hosmer T, Le Cessie S, et al. A comparison of goodness-of-fit tests for the logistic regression model. Stat Med. 1997;16:965–980.
28. Nannicini T. Simulation-based sensitivity analysis for matching estimators. Stata J. 2007;7:334.