Correcting HIV Prevalence Estimates for Survey Nonparticipation Using Heckman-type Selection Models

Bärnighausen, Till (a, b); Bor, Jacob (a); Wandira-Kazibwe, Speciosa (c); Canning, David (a)

doi: 10.1097/EDE.0b013e3181ffa201
Methods: Original Article

Background: HIV prevalence estimates from population-based surveys are vulnerable to selection bias if HIV status is missing for a proportion of the eligible population. Standard approaches to correcting prevalence estimates for selective nonparticipation, such as imputation, assume that data are “missing at random.” These approaches lead to biased estimates if unobserved factors are associated with both survey participation and HIV status.

Methods: We use Heckman-type selection models to test and correct for selection on unobserved factors (separately for men and women) in the 2007 Zambia Demographic and Health Survey, in which 28% of the 7146 eligible men and 23% of the 7408 eligible women did not participate in HIV testing. Performance of these models depends crucially on selection variables that determine survey participation but do not independently affect HIV status.
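As a reading aid, one common way to write a Heckman-type selection model for a binary outcome is the bivariate probit with sample selection sketched below. The notation is an assumption made for illustration and is not taken from the article: x_i stands for observed covariates, z_i for the selection variables excluded from the HIV equation, s_i for consent to testing, and h_i for HIV status.

```latex
\begin{align*}
s_i^{*} &= x_i^{\top}\gamma + z_i^{\top}\delta + u_i, & s_i &= \mathbf{1}[s_i^{*} > 0] && \text{(consent to HIV testing)}\\
h_i^{*} &= x_i^{\top}\beta + \varepsilon_i, & h_i &= \mathbf{1}[h_i^{*} > 0] && \text{(HIV status, observed only if } s_i = 1)\\
(u_i, \varepsilon_i) &\sim \mathcal{N}\!\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\right)
\end{align*}
```

In this formulation, ρ = 0 corresponds to no selection on unobserved factors, and the exclusion of z_i from the HIV equation is what the requirement that selection variables "do not independently affect HIV status" amounts to.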

Results: We identify 2 highly plausible selection variables that are statistically significant determinants of survey participation: interviewer identity and a visit on the first day of fieldwork in a survey cluster. HIV-positive status was negatively correlated with consent to test in men (ρ = −0.75 [95% confidence interval = −0.94 to −0.18]), but not in women. Adjusting for selection on unobserved variables substantially increased the HIV prevalence estimate for men, from 12% (based on measured HIV status alone) and 12% (based on imputation) to 21%. In addition, the adjustment for selection substantially changed the estimated effects of HIV risk factors.
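The direction of this bias can be illustrated with a small simulation. The sketch below is not the authors' estimation procedure and does not use the survey data: it draws two correlated standard normal errors, ignores covariates and selection variables, and borrows ρ = −0.75, a true prevalence of 21%, and a participation rate of 72% from the figures above purely as illustrative parameters. The function name `simulate` and its arguments are hypothetical.

```python
import random
from statistics import NormalDist


def simulate(n=100_000, rho=-0.75, true_prev=0.21, participation=0.72, seed=0):
    """Simulate HIV status and test consent driven by correlated latent errors.

    Returns (true prevalence, participation rate, prevalence among participants).
    All parameter values are illustrative assumptions, not survey estimates.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Latent thresholds chosen so the marginal rates match the targets.
    hiv_cut = std_normal.inv_cdf(1 - true_prev)
    consent_cut = std_normal.inv_cdf(1 - participation)

    n_pos = n_tested = n_pos_tested = 0
    for _ in range(n):
        eps = rng.gauss(0, 1)  # unobserved factor in the HIV equation
        # Consent error correlated with eps at level rho.
        u = rho * eps + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        positive = eps > hiv_cut
        tested = u > consent_cut
        n_pos += positive
        n_tested += tested
        n_pos_tested += positive and tested
    return n_pos / n, n_tested / n, n_pos_tested / n_tested


if __name__ == "__main__":
    truth, part, observed = simulate()
    print(f"true prevalence:           {truth:.3f}")
    print(f"participation rate:        {part:.3f}")
    print(f"prevalence among tested:   {observed:.3f}")
```

With a negative ρ, people who are HIV-positive are less likely to consent, so the prevalence among those tested falls below the true prevalence. Setting `rho=0` removes the gap, which is the situation in which complete-case and imputation-based estimates would be adequate.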

Conclusions: Studies of HIV prevalence and risk factors based on surveys with substantial nonparticipation should routinely use Heckman-type selection models to correct for selection on unobserved variables.


From the (a) Department of Global Health and Population, Harvard School of Public Health, Boston, MA; (b) Africa Centre for Health and Population Studies, University of KwaZulu-Natal, Mtubatuba, South Africa; and (c) Concave International, Kampala, Uganda.

Submitted 18 March 2010; accepted 16 July 2010.

Supported by Grant 1R01-HD058482-01 from the National Institutes of Health/National Institute of Child Health and Human Development (NIH/NICHD), and the William F. Milton Fund, Harvard University (to T.B.), and by Grant 2008-2302 from the William and Flora Hewlett Foundation, Grant 5 P30 AG024409 from NIH/National Institute of Aging (NIA), and Grant 1R21AG032572-01 from NIH/NIA (to D.C.).

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article.

Editors' note: A commentary on this article appears on page 36.

Correspondence: Till Bärnighausen, Department of Global Health and Population, Harvard School of Public Health, 665 Huntington Ave, Boston, MA 02115.

© 2011 Lippincott Williams & Wilkins, Inc.