# Methods to Estimate the Number of Orphans as a Result of AIDS and Other Causes in Sub-Saharan Africa

Objective: To derive methods to estimate and project the fraction of children orphaned by AIDS and other causes.

Methods: HIV/AIDS affects orphan numbers through increased adult and child mortality and reduced fertility of HIV-positive women. We extend to paternal orphans an epidemiologic and demographic model used previously to estimate maternal orphans. We account for the impact of HIV/AIDS on child survival by modeling the HIV status of the partners of men who die of AIDS or other causes based on data on the concordance of heterosexual partners. Subsequently, the proportion of orphans whose parents have both died is predicted by a regression model fitted to orphanhood data from 34 national demographic and health surveys (DHSs). The approach is illustrated with an application to Tanzania and compared with DHS estimates for the years 1992 and 1999.

Results: Projections of the number and age distribution of orphans using these methods agree with survey data for Tanzania. They show the rise in orphanhood over the last decade that has resulted from the HIV epidemic.

Conclusions: The methods allow estimation of the numbers of children whose mother, father, or both parents have died for countries with generalized heterosexual HIV epidemics. These methods have been used to produce orphan estimates for high-prevalence countries published by the Joint United Nations Program on HIV/AIDS, the World Health Organization, the United Nations Children's Fund, and the US Agency for International Development in 2002 and 2004.

From the *Department of Infectious Disease Epidemiology, Imperial College Faculty of Medicine, St. Mary's Campus, London, United Kingdom; and †Centre for Population Studies, Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, United Kingdom.

Received for publication January 16, 2003; accepted January 7, 2005.

Supported by the Joint United Nations Program on HIV/AIDS.

Reprints: Nicholas C. Grassly, Department of Infectious Disease Epidemiology, Imperial College Faculty of Medicine, St. Mary's Campus, Norfolk Place, London W2 1PG, United Kingdom (e-mail: n.grassly@imperial.ac.uk).

The death from AIDS of large numbers of young and middle-aged adults in Africa and elsewhere is producing a parallel rise in the number of orphaned children. Estimates and projections of the numbers of children whose parents have died of AIDS or other causes are needed to inform policy and programmatic decisions. In the past, several agencies have produced estimates of AIDS orphans. These have differed because they have used different definitions and methods and have been based on different assumptions about HIV prevalence, epidemiology, and natural history. Because estimation of the number of children whose father has died (paternal orphans) or whose parents have both died (dual or double orphans) is relatively complex, some studies only present statistics on children whose mother has died (maternal orphans).^{1} Other studies have used simple assumptions to estimate paternal orphanhood that produce approximate estimates.^{2}

Clearly, the death of children's fathers as well as their mothers may have an adverse impact on their welfare. Moreover, dual orphans are particularly disadvantaged. This article proposes methods for making estimates and projections of maternal, paternal, and dual orphans as a result of AIDS and other causes. An “AIDS orphan” is defined as a child who has at least 1 parent dead as a result of AIDS, and a dual AIDS orphan is a child whose mother and father have both died, at least 1 as a result of AIDS.

Orphan numbers can be calculated from statistics on the deaths of adults by estimating how many children were born to those adults who have died and whether these children remain alive. In countries with an AIDS epidemic, the calculations need to allow for the impact of HIV infection on mortality and women's fertility while accounting for the transmission of HIV from mother to child and between parents. Key inputs to the estimation procedure are thus AIDS and other-cause mortality data and fertility data by age and sex. These may be derived from demographic projections, census data, and/or household surveys. We make no attempt to derive such data here. The methods we describe have been used by the US Census Bureau to produce estimates of AIDS and other-cause orphan numbers that were adopted by the Joint United Nations Program on HIV/AIDS (UNAIDS), United Nations Children's Fund (UNICEF) and US Agency for International Development (USAID) since 2002.^{3,4}

## METHODS

### Maternal Orphanhood

The relation between mortality of women and orphanhood of children in a stable population is well understood.^{5-7} It depends largely on the ages of women giving birth. Extension of the analysis to nonstable populations is straightforward.^{8} These relations are sufficiently insensitive to variation in most aspects of a population's demography that orphanhood data can be used to estimate adult women's mortality and vice versa. We describe methods to estimate maternal orphans as a result of AIDS and other causes in the next 2 sections.

#### AIDS Orphans

The probability that a child survives to age *a* given that his or her mother died of AIDS τ years ago can be estimated from the average of the schedules for HIV-positive and HIV-negative child survival weighted by the probability of perinatal transmission:

where *Y*_{s}(*a* − τ) is the proportion of adults who died of AIDS τ years ago that were in stage *s* of HIV infection *a* − τ years before death (ie, at the time of the child's birth), *n* is the number of stages of HIV infection considered, ξ_{s} is the probability of vertical transmission for a woman in HIV stage *s* (we assume an instantaneous probability of transmission at birth, because most postnatal transmission occurs within 6 months of birth^{9,10}), and ι_{a} and ι′_{a} are the probability of an uninfected or infected orphan, respectively, surviving until exact age *a*. The latter is given by

is the fraction of children who have not died specifically from AIDS by age *a*.^{11}
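The weighted average described for Equation 1 can be illustrated with a minimal sketch. It assumes the equation sums, over the mother's HIV stage at the child's birth, the stage proportion multiplied by a mixture of uninfected and infected child survival; the function and argument names are hypothetical and are not taken from the original model.

```python
def survival_given_maternal_aids_death(stage_props, transmission_probs,
                                       surv_uninfected, surv_infected):
    """Probability that a child is alive at exact age a, given the mother died of AIDS.

    stage_props        -- Y_s: proportion of mothers in each HIV stage at the child's birth
    transmission_probs -- xi_s: probability of vertical transmission in each stage
    surv_uninfected    -- survival to age a of an uninfected orphan
    surv_infected      -- survival to age a of an infected orphan
    (All names are illustrative assumptions.)
    """
    if len(stage_props) != len(transmission_probs):
        raise ValueError("one transmission probability is needed per stage")
    total = 0.0
    for y_s, xi_s in zip(stage_props, transmission_probs):
        # Within a stage, a child is uninfected with probability (1 - xi_s)
        # and infected with probability xi_s.
        total += y_s * ((1.0 - xi_s) * surv_uninfected + xi_s * surv_infected)
    return total
```

With made-up inputs, 20% of mothers uninfected at the birth (no transmission) and 80% infected with a 30% transmission probability, the function returns a value between the two survival schedules, closer to the uninfected schedule.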

Recent analysis of mortality in children less than 5 years of age from cohort studies in sub-Saharan Africa reveals higher mortality among orphans compared with other children in the year preceding the mother's death as well as the year after.^{12-14} This is the case even after adjusting for the HIV status of the mother and, hence, potential perinatal HIV transmission. The adjusted mortality hazard ratios for the year before and the year after the mother's death were 3.2 (95% confidence interval [CI]: 2.0-5.2) and 6.2 (95% CI: 3.7-10.2), respectively. This increased hazard was independent of the child's age, although only data on children less than 5 years of age were included in the analysis. In the absence of further information, we apply an increased mortality hazard over and above background mortality for orphans up to the age of 15 years in the year before and the year after their mother's death.
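One way to apply such an increased hazard to an annual survival probability is sketched below. It assumes proportional hazards that are constant within the year, so that survival under a hazard ratio HR equals the background survival raised to the power HR. Using the reported hazard ratios directly as multipliers is an assumption made for illustration; the text does not state the exact multipliers applied in the model.

```python
# Adjusted hazard ratios reported in the text for children under 5 years.
HR_YEAR_BEFORE_DEATH = 3.2
HR_YEAR_AFTER_DEATH = 6.2


def adjusted_annual_survival(background_survival, hazard_ratio):
    """Annual survival when the background mortality hazard is multiplied by hazard_ratio.

    With a constant hazard h over the year, S = exp(-h), so multiplying the
    hazard by HR gives exp(-HR * h) = S ** HR.
    """
    if not 0.0 < background_survival <= 1.0:
        raise ValueError("background_survival must lie in (0, 1]")
    if hazard_ratio < 0.0:
        raise ValueError("hazard_ratio must be non-negative")
    return background_survival ** hazard_ratio
```

For example, `adjusted_annual_survival(0.99, HR_YEAR_AFTER_DEATH)` gives the survival over the year after the mother's death for a child whose background annual survival is 0.99 (a made-up value).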

The proportion of adults in stage *s* of HIV infection *x* years before death can be obtained by solving the following:

where γ_{s} is the rate of progression from stage *s* to *s* − 1, which is equal to the reciprocal of the mean length of time spent in stage *s* (note γ_{0} = γ_{n+1} = 0), with the boundary condition *x* = 0, *Y*_{s=n}(0) = 1, and *Y*_{s≠n}(0) = 0.
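The back-calculation can be sketched numerically. The sketch assumes that Equation 2 is the linear system d*Y*_{s}/d*x* = γ_{s+1}*Y*_{s+1} − γ_{s}*Y*_{s}, which is one reading consistent with the stated rates and boundary conditions; the function name, the Euler scheme, and the step size are illustrative choices, not part of the original method.

```python
def stage_proportions_before_death(stage_durations, years_before_death, dt=0.001):
    """Proportion of adults in each HIV stage x years before an AIDS death.

    stage_durations    -- mean years spent in stages 1..n (illustrative values)
    years_before_death -- x, time counted backward from death
    Returns [Y_0, Y_1, ..., Y_n], where stage 0 means not yet infected.

    Assumed system (backward in time): dY_s/dx = g[s+1]*Y[s+1] - g[s]*Y[s],
    with g[0] = g[n+1] = 0, Y_n(0) = 1 and all other Y_s(0) = 0.
    """
    n = len(stage_durations)
    gamma = [0.0] + [1.0 / d for d in stage_durations] + [0.0]
    y = [0.0] * (n + 1)
    y[n] = 1.0  # everyone is in the final stage at the moment of death
    steps = int(round(years_before_death / dt))
    for _ in range(steps):
        updated = []
        for s in range(n + 1):
            inflow = gamma[s + 1] * y[s + 1] if s < n else 0.0
            outflow = gamma[s] * y[s]
            updated.append(y[s] + dt * (inflow - outflow))
        y = updated
    return y
```

The proportions always sum to 1 because every outflow from one stage is an inflow to the stage below it, and the mass accumulating in stage 0 represents adults who were still uninfected *x* years before death.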

The probability that a woman who died of AIDS had a child at a given time is determined by her fertility history. This is a function of her age and time spent in different stages of HIV infection. If

is the number of women of age *i* who died from AIDS at time *t*, elaborating Equation 1, the number of maternal AIDS orphans of exact age *a* at time *t* whose mother died τ years ago can be defined as follows:

where *m*_{i,s,t−a} is the fertility rate of a woman of age *i* in HIV stage *s* at time *t* − *a*. Figure 1 clarifies the relative timing and notation used for the child's birth and mother's death. Summing over τ gives the number of maternal AIDS orphans aged exactly *a* at time *t*:

The number of maternal orphans aged *a* at their last birthday is approximated by the arithmetic mean of the numbers at neighboring exact ages:
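The averaging step can be sketched as follows. The function name and the list-based representation of orphan numbers by exact age are illustrative assumptions.

```python
def orphans_by_age_last_birthday(orphans_by_exact_age):
    """Approximate numbers by age at last birthday from numbers by exact age.

    Entry a of the result is the arithmetic mean of the numbers at exact
    ages a and a + 1, so the result is one element shorter than the input.
    """
    return [(lower + upper) / 2.0
            for lower, upper in zip(orphans_by_exact_age, orphans_by_exact_age[1:])]
```

For instance, made-up numbers of 0, 10, and 30 orphans at exact ages 0, 1, and 2 give 5 orphans aged 0 and 20 orphans aged 1 at their last birthday.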

#### Non-AIDS Orphans

Estimating the number of maternal orphans whose mother died of causes other than AIDS is complicated in the presence of an HIV epidemic, because the prevalence of HIV infection among these women is rarely known. The relation of prevalence among these women to prevalence in the female population is complicated by correlations between risk of HIV infection, lifestyle, and environmental factors and risk of death as a result of causes other than AIDS. Furthermore, because the stage of progression of HIV disease in women dying as a result of other causes is unknown, it is not possible to back-calculate HIV status using Equation 2. Most non-AIDS deaths are of older women among whom the prevalence of HIV is much lower than in young women. If we make the crude assumption that women who die as a result of causes other than AIDS are uninfected, the number of maternal orphans as a result of causes other than AIDS is as follows:

where *mI,s,t-a* is the number of women of age *i* who died of other causes at time *t*. Although the prevalence of HIV infection among women dying as a result of causes other than AIDS is unlikely to be near 0, particularly for young women, the impact of failing to allow for this on projections of orphan numbers is small. Estimates of non-AIDS maternal orphans in populations with high HIV prevalence (up to 40%) differ by less than 10% between the extreme assumptions that (1) no woman who dies from causes other than AIDS has HIV and (2) HIV prevalence among these women during their fertile years matches that among all women tested at antenatal clinics (ANCs). Because the truth lies between these extremes, the error arising from assumption 1 is likely to be 5% or less even for countries with high HIV prevalence.

As for AIDS orphans, numbers of orphans as a result of causes other than AIDS by age at last birthday,

are approximated by the arithmetic mean of the neighboring age classes of the estimates by exact age,

(see Equation 5).

#### All-Cause Orphans

The total number of maternal orphans aged *a* at their last birthday at time *t*, irrespective of cause, is simply:

Note that this is the number of maternal orphans, irrespective of the fathers' survival status, and thus includes dual orphans.

### Paternal Orphanhood

The formal relations that define the prevalence of maternal orphanhood also define the prevalence of paternal orphanhood. They have seldom been used to model numbers of AIDS orphans, because data on the fertility of men are rare. However, the literature on indirect estimation of mortality demonstrates that the relation is robust to variation in the shape of male fertility distributions. Models of these distributions exist^{15} and have been used in the estimation of mortality from data on paternal orphanhood.^{16} They can also be used to address the reverse problem.

#### AIDS Orphans

The main difficulty involved in calculating the number of paternal AIDS orphans is that the HIV status of the mothers must be estimated to determine the probability of survival of the paternal orphans. Thus, estimating paternal orphans requires assumptions about concordance of the HIV status of the mother and father. Let

be the number of children of exact age *a* at time *t* whose father died of AIDS τ years ago. This is described by an analogous summation to that in Equation 3 for maternal AIDS orphans:

where

is the number of men of age *i* who died of AIDS at time *t*, *P*(*s*|*r*, *t* − *a*) is the probability that a pregnant woman is in HIV stage *s* if the father is in stage *r* at time *t* − *a*,

is the fertility of uninfected men aged *i* at time *t* − *a*, and *k*_{r,s} is the relative fertility of a man in HIV stage *r* with a partner in stage *s*. We currently assume that the HIV status of the man does not affect his fecundity. Therefore, *k*_{r,s} = *k*_{s}, where *k*_{s} is the relative fertility of women infected with HIV and in stage *s*. All other variables are defined as before with the exception of child mortality, which is assumed to be no higher for children whose father is dead than for children whose father is alive other than as a result of pediatric AIDS.

*P*(*s*|*r*, *t* − *a*) can be estimated from data on the concordance of the partners of HIV-positive men. For simplicity and because of data constraints, we use a paternal orphanhood model that distinguishes HIV-negative and HIV-positive women only (*s* = 0 or 1, respectively) and ignore the complications that might arise from the impact of AIDS illness on concordance and fertility.

The concordance of HIV status of the partner of an HIV-positive man depends on the prevalence of HIV among all women, which indicates how likely it is that a woman became infected in another partnership, any additional risk arising from the HIV-positive men assortatively selecting high-risk women for partnerships, and the probability that the woman was infected by her partner. This latter transmission probability depends on the length of partnership, stage of the man's HIV infection and thus viral load, numbers and types of sex acts, prevalence of cofactors that enhance the transmission of HIV, and, possibly, viral genetic factors. Data exist on all these diverse factors for few, if any, countries. Therefore, a logistic regression was carried out predicting prevalence among women with HIV-positive partners from HIV prevalence as measured at ANCs. It is assumed that partnerships across the studies and countries reflect some kind of “average heterosexual partnership” and variation with the age of the man is ignored. Data were obtained by searching the US Census Bureau *HIV/AIDS Surveillance Database* and published literature.^{17-28} A total of 23 studies reporting prevalence among women with positive partners were identified, 8 from sub-Saharan Africa, 10 from South America, 4 from Asia, and 1 from the Middle East. Twelve studies reported prevalence among women at ANCs greater than 1%. The regression equation relates the probability that a woman is HIV-positive, given that her partner is positive, to HIV prevalence as measured at an ANC (*p*_{clinic}), such that:

where α and β are the regression coefficients. We discard the subscript referring to time for notational clarity.

The prevalence of HIV infection among women with HIV-negative partners is lower than that assessed at ANCs, because infected women tend to be in partnerships with infected men as a result of transmission within that partnership. Data on the HIV status of the spouses of HIV-negative men are rarely collected because such women do not define a specific risk group.^{22,27,29} The probability that a woman with an HIV-negative partner is herself positive can be derived from Equation 9, however. The fraction of partnerships where both partners are positive is given by the product *P*(*s* > 0|*r* > 0) · *P*(*r* > 0). If the latter term, *P*(*r* > 0), is approximated by adult male HIV prevalence *x*, this is simply *fx*, with *f* = *P*(*s* > 0|*r* > 0). If the probability of HIV infection among women in partnerships is approximated by adult female HIV prevalence such that *P*(*s* > 0) = *y*, the probability that a woman in a partnership is HIV-positive given that the man is negative is as follows:

where κ is the female-to-male ratio of adult HIV prevalence. Adult HIV prevalence is usually well approximated by prevalence at an ANC, with a female-to-male prevalence ratio of 1.0 to 1.4 in a mature generalized epidemic.^{1}
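The derivation can be checked with a short sketch. It assumes that Equation 10 follows from the law of total probability, *y* = *fx* + *P*(*s* > 0|*r* = 0)(1 − *x*), with *y* = κ*x*; this form is reconstructed from the definitions in the text rather than copied from the original equation, and the function name is hypothetical.

```python
def prob_woman_positive_given_partner_negative(f, x, kappa):
    """P(s > 0 | r = 0): probability a woman is HIV-positive when her partner is negative.

    f     -- P(s > 0 | r > 0), concordance among partners of HIV-positive men
    x     -- adult male HIV prevalence, approximating P(r > 0)
    kappa -- female-to-male ratio of adult HIV prevalence, so y = kappa * x

    Assumed derivation: y = f*x + P(s > 0 | r = 0) * (1 - x), which rearranges to
    P(s > 0 | r = 0) = (kappa*x - f*x) / (1 - x) = x * (kappa - f) / (1 - x).
    """
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    return x * (kappa - f) / (1.0 - x)
```

With illustrative inputs *f* = 0.6, *x* = 0.1, and κ = 1.2, the result is approximately 0.067, well below the female prevalence of 0.12, in line with the statement that prevalence among women with HIV-negative partners is lower than in the female population as a whole.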

#### Non-AIDS Orphans

In the presence of an HIV epidemic, estimates of paternal non-AIDS orphan numbers should account for the possible transmission of HIV from mother to child. Because of the increasing hazard with age of mortality as a result of causes other than AIDS, men who die of other causes are likely to have relatively old female partners. In the absence of information about age differences in partnerships, we assume that the prevalence of HIV among their partners reflects that seen among women attending ANCs. In this case, the number of paternal orphans whose father dies as a result of causes other than AIDS is as follows:

where

is the number of deaths from causes other than AIDS of men aged *i* at time *t*, and *y*_{s,t} is the prevalence of HIV stage *s* in the female population at time *t*.

#### All-Cause Orphans

As for maternal orphans, estimates of paternal AIDS and non-AIDS orphans by age at last birthday (

respectively) are obtained from the arithmetic mean of the neighboring age classes of the estimates by exact age (analogous to Equation 5). The total number of paternal orphans aged *a* at their last birthday at time *t* is therefore:

Note that this is the number of paternal orphans, irrespective of the mothers' survival status and thus includes dual orphans.

### Dual Orphanhood

Dual orphanhood should be more common where the prevalence of maternal and/or paternal orphanhood is high. If the risk of a child's mother and father dying were independent, the expected proportion of dual orphans among children of any age would be simply the proportion with dead mothers multiplied by the proportion with dead fathers. In practice, the observed prevalence of dual orphanhood is always higher than this, because the 2 parents of a child are usually of much the same socioeconomic status, exposed to many of the same environmental risks, and at risk for dying together in an accident or episode of violence. Direct transmission of infections such as tuberculosis and HIV between a child's parents also occurs. Explicit models of these processes to parallel those for maternal and paternal orphans would be extremely complex, and the data required to set parameters for the model are beyond those available. We therefore analyze empiric data on the relation between dual orphanhood and maternal and paternal orphanhood from household surveys in sub-Saharan Africa.

Large numbers of nationally representative household surveys that collected data on orphaned children were conducted in sub-Saharan Africa during the 1990s. Most of them were part of the demographic and health survey (DHS) program sponsored by USAID. The surveys use a tabular schedule to collect information on members of the households surveyed. In most African surveys since 1990, this has included the questions “Is this child's mother alive?” and “Is this child's father alive?” for every child aged 0 to 14 years.

This study analyzes the orphanhood data from 34 DHS surveys conducted in 25 countries, each of which collected data on between 8000 and 31,000 children aged 0 to 14 years (Table 1). This database does not represent a random sample of African countries or points in time. Nevertheless, the surveys cover countries from all parts of the region and include 8 recent surveys of countries with severe AIDS epidemics. Data from the DHS survey conducted in Nigeria in 1999 were excluded from the analysis because the survey had a 6.5% nonresponse rate for questions about parental survival and internal consistency checks suggest that a disproportionate number of the nonresponders had dead fathers. The 1993 Ghana data were also excluded, because this survey reported an implausibly high level of dual orphanhood that the 1998 survey of the same country failed to confirm.

We use a regression model to predict dual orphanhood as the product of its expected prevalence, given the prevalence of maternal and paternal orphanhood, and the excess risk of dual orphanhood relative to this expected risk. The excess risk at age *a*, *E*_{a}, is the ratio of the observed (*O*_{t,a}) to the expected (*Õ*_{t,a}) number of dual orphans aged *a* at time *t*. If *C*_{t,a} is the total number of children aged *a* at time *t*:

where the ω_{t,a} and ψ_{t,a} used to fit the model are measured in the household surveys but are calculated from Equations 7 and 12, respectively, in applications of the model to forecasting.
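The excess-risk calculation can be sketched as follows. The sketch takes maternal and paternal orphans as counts and converts them to proportions internally, because the text does not make clear whether ω_{t,a} and ψ_{t,a} denote counts or proportions; the function names and the example numbers are illustrative assumptions.

```python
def expected_dual_orphans(children, maternal_orphans, paternal_orphans):
    """Expected number of dual orphans if maternal and paternal deaths were independent."""
    if children <= 0:
        raise ValueError("children must be positive")
    maternal_prev = maternal_orphans / children
    paternal_prev = paternal_orphans / children
    # Under independence, the proportion of dual orphans is the product
    # of the maternal and paternal orphan proportions.
    return children * maternal_prev * paternal_prev


def excess_risk(observed_dual, children, maternal_orphans, paternal_orphans):
    """Ratio of observed to expected dual orphans (the excess risk E_a)."""
    expected = expected_dual_orphans(children, maternal_orphans, paternal_orphans)
    if expected <= 0:
        raise ValueError("expected number of dual orphans must be positive")
    return observed_dual / expected
```

For example, among 10,000 children with 500 maternal and 1000 paternal orphans (made-up numbers), independence implies 50 dual orphans; observing 150 would give an excess risk of 3.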

We assume that the number of dual orphans of a given age in a particular population is a Poisson random variable with the mean equal to the number of children of that age multiplied by an underlying risk of being a dual orphan, λ_{t,a}*C*_{t,a}. If the equivalent risk of dual orphanhood, assuming independent risks of maternal and paternal orphanhood, is

The last term of the latter equation is the log excess risk of dual orphanhood relative to the risk if the probabilities of a child's mother and father being dead were independent. To establish whether the fitted model can be used to make projections, it is important to establish the extent to which the excess risk of dual orphanhood varies between countries and over time and whether the relation between this risk and age varies between countries. Therefore, we model the log excess risk using additive multilevel or random-effects Poisson regression:

where the subscripts *i* and *j* refer to individual children and the household surveys, respectively, the θ's are coefficients to be estimated, *X*_{j} is a covariate measured at the population level, *u*_{j} is the error in the random intercept (assumed to be normally distributed with variance σ^{2}_{j}), indicating the degree to which the excess risk of dual orphanhood varies between countries, and γ_{i} is an individual-level error term that is permitted to have extra-Poisson variation, ω. We model DHS data on orphanhood in the age groups 0 to 2 years of age, 3 to 5 years of age, 6 to 9 years of age, and 10 to 14 years of age but treat age as a continuous covariate so that the fitted model can be used to predict dual orphanhood by single years of age. The model was fitted using the package MLWIN 1.10, and the final parameter estimates were obtained using the Markov Chain Monte Carlo option to eliminate bias in the iterative generalized least squares estimates. We experimented with various specifications of the regression model, including different assumptions about the population-level and individual error distributions, and fitted it using the negative-binomial random-effects regression procedure in Stata 7 as well as in MLWIN. All the approaches yielded similar conclusions and parameter estimates to those presented here.

Covariates that seem likely to affect the excess risk of dual orphanhood in a population include the ages of the children's parents, the severity of the AIDS epidemic in the country, and the background level of mortality. The model includes only simple aggregate indicators of these factors that should be possible to estimate for any African population. As proxies for the parents' ages, we use 2 indices of marriage patterns: the proportion of women aged 15 to 19 years who are married and the proportion of married women in monogamous unions. These indices were calculated from data collected in the DHSs, which also measured orphanhood. Second, we use UNAIDS' estimates of adult HIV prevalence to indicate the severity of the AIDS epidemic affecting the population. Finally, the World Health Organization's estimates of mortality for children less than 5 years of age for the quinquennium in which the data were collected are used as indicators of the background level of mortality.^{30} Although mortality in children less than 5 years of age is affected by an HIV epidemic, the contribution of HIV-related mortality is relatively small.^{31}

This study uses the UNAIDS' definition of a dual AIDS orphan as a child whose parents are both dead, with at least 1 dead as a result of AIDS. This can be calculated from the estimated number of dual orphans as a result of all causes by subtracting the number of dual orphans whose parents both died of causes other than AIDS (Fig. 2). Thus, the number of dual AIDS orphans aged *a* at their last birthday at time *t* is as follows:

### Total AIDS Orphans

Because maternal and paternal AIDS orphan numbers include children whose parents have both died of AIDS, estimates of the total number of orphans as a result of AIDS need to account for this overlap (see Fig. 2). Our approach does not distinguish those dual AIDS orphans whose parents have both died of AIDS from the smaller number whose parents have both died, but only 1 as a result of AIDS. It seems reasonable to assume that 1 parent's probability of dying of AIDS is independent of the other parent's probability of dying of some unrelated disease or condition. On this basis, the total number of AIDS orphans aged *a* at time *t* can be estimated as follows:

### Projections of the Number of Orphans

To illustrate the methods described, we project orphan numbers for Tanzania and compare these with estimates from the 1992 and 1999 DHSs. Population projections were obtained using the EasyProj function of the Spectrum System of Policy Models (Spectrum; Futures Group International) software package.^{32} This package uses United Nations Population Division demographic data^{33} and estimates the impact of HIV/AIDS on mortality and fertility using UNAIDS' estimates of HIV prevalence and the recommended methods of the UNAIDS Reference Group on Estimates, Modeling, and Projections.^{11} We reduced male and female non-AIDS mortality rates to reflect a more realistic level of adult mortality for Tanzania than the default Coale-Demeny “West” model life table used in the EasyProj function of Spectrum. Further details of the parameter estimates are provided in Table A1.

## RESULTS

### Concordance of HIV Status of Partners

The regression of prevalence of HIV among the female partners of HIV-positive men against prevalence in the general population as measured at ANCs is highly significant (α = −0.84, β = 9.1; *P* < 0.001; Fig. 3a). In the absence of HIV in the general population, 30% of women with a positive partner are predicted to be positive themselves as a result of transmission within that partnership. As prevalence in the general population rises, a greater fraction of these women are predicted to be positive, mainly because of the increased risk of preexisting infection. Significant scatter exists about the regression line; in particular, a relatively large number of sub-Saharan African populations seem to have low female HIV prevalence (early in the epidemic) but high rates of concordance. This is confirmed by the logistic regression of just the 8 sub-Saharan African populations (α = −0.22, β = 4.2; *P* = 0.005; see Fig. 3a). This may reflect a higher prevalence of other sexually transmitted infections (STIs) that enhance the transmission of HIV. This is not seen for the sub-Saharan African countries with high HIV prevalence, however, particularly Zambia and Uganda (Rakai). Because data on STI prevalence are only available for some countries, we model concordance as a function of female HIV prevalence alone.
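The fitted relation can be evaluated with a short sketch. It assumes the regression has the standard logistic form, with log-odds α + β*p*_{clinic} and *p*_{clinic} expressed as a proportion; this assumption reproduces the 30% figure quoted above for zero prevalence. The function and constant names are illustrative.

```python
import math

# Regression coefficients reported in the text.
ALPHA_ALL, BETA_ALL = -0.84, 9.1   # all 23 studies
ALPHA_SSA, BETA_SSA = -0.22, 4.2   # 8 sub-Saharan African studies


def concordance(p_clinic, alpha=ALPHA_ALL, beta=BETA_ALL):
    """Predicted probability that the partner of an HIV-positive man is HIV-positive.

    p_clinic -- HIV prevalence measured at antenatal clinics, as a proportion
    Assumes log-odds = alpha + beta * p_clinic (standard logistic form).
    """
    if not 0.0 <= p_clinic <= 1.0:
        raise ValueError("p_clinic must be a proportion between 0 and 1")
    return 1.0 / (1.0 + math.exp(-(alpha + beta * p_clinic)))
```

Calling `concordance(0.0)` gives roughly 0.30, and passing `ALPHA_SSA` and `BETA_SSA` gives a higher predicted concordance at low prevalence, which matches the pattern described for the sub-Saharan African populations.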

The prevalence of HIV infection among women with uninfected partners is lower than in the general female population (see Fig. 3b). Estimates based on Equation 10 show good agreement with the data when regression coefficients are used from the analysis of all data and just the sub-Saharan African populations. As HIV prevalence for all women rises, so does the prevalence of HIV infection among women with uninfected partners. The rate of increase in discordance declines as prevalence increases; in fact, the regression coefficients from the analysis of all data suggest that discordance actually declines for extremely high antenatal HIV prevalence (>20%). A decline in discordance is not likely to be a result of only high HIV prevalence per se. High prevalence implies a relatively low proportion of recent infections and thus discordance. It is also a proxy for high STI prevalence and other risk factors that facilitate transmission of HIV and thereby reduce partner discordance.

In the analyses that follow, we use the estimated concordance relation based on all the available data rather than that on the 8 sub-Saharan African countries alone.

### Prevalence and Determinants of Dual Orphanhood

Before the onset of the HIV epidemic, approximately 5% of the fathers of children aged less than 15 years had died and approximately 2.25% of their mothers had died in most African countries. The difference reflects the higher mortality of men and the fact that they tend to be older than their wives. The prevalence of orphanhood rose to approximately twice this level in Uganda and Zambia by the mid-1990s and in Zimbabwe by 1999 (see Table 1). Orphanhood is also common in countries that have recently experienced wars or civil wars, such as Eritrea and Mozambique. In general, the countries in which orphanhood is most common are those that have high levels of HIV infection. The probability that a child is an orphan is low just after birth and rises rapidly with age. For example, by the end of the 1990s, 18% of children aged 10 to 14 years in Uganda and Zimbabwe had dead fathers.

If the risk of death of the 2 parents were independent, only approximately 1 in 500 African children aged 0 to 14 years would have been a dual orphan before the rise in mortality from AIDS. Even in Zimbabwe in 1999, one would only expect 1 in 180 children to have lost both parents. In fact, in Zimbabwe, 1 in 47 children was a dual orphan, whereas across the 34 surveys, this is true of 1 in 138 children (see Table 1). The actual risk of being a dual orphan for children aged 0 to 14 years varies from twice the independent risk in Eritrea (1995) to 5.7 times the independent risk in Burkina Faso (1992).

The final models of the excess risk of dual orphanhood are shown in Table 2. The first includes those characteristics of a population that were found to affect dual orphanhood; the other predicts the risk based on the age of the child alone. We investigated whether the age of a child is a random effect, that is, whether the relation between age and the risk of orphanhood varies between African populations, but we found no evidence of this. Thus, age is modeled as a straightforward fixed effect. Several other variables were included in the regression analysis initially but were found to have no net effect on the excess risk of dual orphanhood. They include the mortality rate for children less than 5 years of age, the date when the survey was conducted, and the contraceptive prevalence rate, which we hypothesized might affect age patterns of fertility and thus the ages of the parents. In addition, we tested for other relations between HIV prevalence and the risk of dual orphanhood and for all the possible 2-way interactions between the explanatory variables before settling on the simpler models shown in Table 2. The overdispersion of the residual individual-level errors is unsurprising, because we have made no attempt to model individual heterogeneity in risk except that arising from age.

Together, the 2 age coefficients imply that the excess risk of dual orphanhood is greatest soon after birth, when relatively few children are orphaned. Because the parents of young children tend to be young themselves, it is likely that a relatively high proportion of these deaths involve a catastrophic event that affects both parents, such as an accident, act of violence, or fatal infection. As the children get older and more of them become orphaned, their excess risk of dual orphanhood falls rapidly before leveling off at approximately 8 to 9 years.

The excess risk of dual orphanhood varies significantly with the severity of the HIV epidemic. Although the proportion of children who are dual orphans rises rapidly with HIV prevalence, as maternal and paternal orphanhood become more common, the excess risk of dual orphanhood rises only moderately relative to the independent risk. After some experimentation to find the best specification of the model, we lagged adult HIV prevalence by 5 years to reflect the fact that changes in orphanhood lag behind those in HIV prevalence just as rises in AIDS deaths lag behind rises in HIV incidence. The natural log of HIV prevalence was selected as the predictor, because untransformed prevalence tends to overestimate the excess risk in countries with severe epidemics.

Our regression model is intended for use in countries with generalized epidemics and is based on data from such countries. Although log adult HIV prevalence is a good predictor of excess risk in African populations, this specification of the regression model implies that the risk of dual orphanhood tends to 0 as HIV prevalence drops to extremely low levels. In practice, the HIV epidemic has a negligible impact on the excess risk of dual orphanhood until adult HIV prevalence rises above 1%, and the predicted excess risk at this prevalence can be taken as applying to populations in which HIV prevalence is less than 1%.
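The construction of the HIV predictor described above (prevalence lagged by 5 years, log-transformed, with the value at 1% prevalence applied to lower-prevalence populations) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hiv_predictor`, the year-indexed dictionary, and the expression of prevalence as a proportion are hypothetical choices made for the example. The units must match those used to fit the coefficients in Table 2, which are not reproduced here.

```python
import math


def hiv_predictor(prevalence_by_year, year, lag=5, floor=0.01):
    """Return the natural log of adult HIV prevalence lagged by `lag` years.

    Prevalence is assumed to be a proportion (0.08 means 8%). Values
    below `floor` are replaced by `floor`, so that populations with
    prevalence under 1% receive the prediction made at 1%.
    """
    lagged = prevalence_by_year[year - lag]
    return math.log(max(lagged, floor))


# Hypothetical prevalence series, for illustration only.
series = {1990: 0.002, 1994: 0.08}
low = hiv_predictor(series, 1995)   # 1990 value is below the floor, so ln(0.01)
high = hiv_predictor(series, 1999)  # ln(0.08)
```

Flooring the input rather than the output keeps the predictor monotone in prevalence and avoids the logarithm diverging as prevalence approaches 0.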

Both indicators of marriage patterns affect the prevalence of dual orphanhood. The excess risk of dual orphanhood is high in populations in which women marry late, because mothers tend to be older than if women marry early, and in populations where polygyny is common, because fathers' ages tend to exceed those of their wives by more than in a monogamously marrying population.

The random intercept coefficient measures the variation between African populations in the excess risk of dual orphanhood. In both models, the coefficient is statistically significant; the unexplained variation between populations in excess risk is unlikely to be a chance result of sampling the populations. The substantive significance of the coefficient is easier to gauge from Table 3. This refers to the model that includes population-level covariates. It shows the expected excess risk of dual orphanhood for 4 age groups and 2 levels of HIV infection and the range within which the actual excess risk is predicted to fall for 95% of populations. Caution is required in the interpretation of these ranges because they have been calculated not from a random sample of African populations but from those that conducted DHSs, a disproportionate number of which occurred in either 1992 or 1998. Bearing this in mind, one can tentatively conclude that the model usually predicts the excess risk of dual orphanhood to within ±10%. Errors in the estimated excess risk map directly into errors in estimates of the prevalence of dual orphanhood and the number of dual orphans. Thus, if the numbers of children and maternal and paternal orphans are known accurately, one can usually produce an estimate of how many of these children are dual orphans to within ±10% of the actual number. With the same caveats, the random intercept of the second model suggests that even if one knows nothing about the population in question except the numbers of maternal and paternal orphans, one can estimate how many of them are dual orphans to within ±15%.
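The way an error in the excess risk carries through to the number of dual orphans can be illustrated with a short sketch. It assumes that the excess risk acts as a multiplier on the prevalence of dual orphanhood expected if maternal and paternal deaths were independent, that is, on the product of the maternal and paternal orphan prevalences. The function name and all numeric inputs are hypothetical and are not taken from Table 2 or Table 3.

```python
def dual_orphans(children, maternal_prev, paternal_prev, excess_risk):
    """Estimate the number of dual orphans.

    `maternal_prev` and `paternal_prev` are the proportions of children
    whose mother or father has died. Under independence, the expected
    proportion with both parents dead is their product; `excess_risk`
    scales that expectation (assumed multiplicative here).
    """
    independent = maternal_prev * paternal_prev
    return children * independent * excess_risk


# Hypothetical inputs: 1 million children, 5% maternal and 10% paternal orphans.
base = dual_orphans(1_000_000, 0.05, 0.10, excess_risk=2.0)
# The same population with the excess risk overestimated by 10%.
high = dual_orphans(1_000_000, 0.05, 0.10, excess_risk=2.2)
relative_error = high / base - 1
```

Because the excess risk enters as a single multiplicative factor in this sketch, a 10% error in it yields a 10% error in the estimated number of dual orphans when the other inputs are exact.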

### Projections of the Number of Orphans

Comparison of orphan projections for Tanzania with the 1992 and 1999 DHSs reveals good agreement, suggesting that the methods described here are appropriate (Fig. 4). The absolute levels of all types of orphanhood projected by the model are slightly higher than those indicated in the surveys. This may be a result of inaccuracies in the underlying demographic projections. For instance, AIDS and other-cause mortality or fertility may be overestimated. Alternatively, the DHS may fail to enumerate some orphans, such as street children or children living in institutions, or may misclassify foster children who are orphans as living with their biologic parent. The patterns of orphanhood are in close agreement. Paternal orphans are most common, numbering approximately twice the maternal orphans. The fraction of children who are orphans increases with age, because the older they are, the more time has elapsed since their birth during which a parent could have died.

The model projections of orphan numbers can be broken down by cause, indicating the significance of the HIV epidemic in the orphaning of children (see Figs. 4 a, b). By 1999, half of all maternal and paternal orphans in Tanzania are estimated to have been orphaned by AIDS. For dual orphans, this fraction increases to three quarters, resulting from the transmission of HIV between parents.

## DISCUSSION

The production of reliable estimates of paternal AIDS and other-cause orphans has been obstructed by difficulties in estimating the ages of fathers and the survival of their children in populations where men and some of their spouses may be infected with HIV. This study proposes solutions to these problems based on an existing model of male fertility and regression analysis of concordance of heterosexual partners to estimate maternal HIV status. In this way, paternal AIDS orphans and non-AIDS orphans can be calculated using an extension of established methods for projection of maternal orphans. Dual orphans can then be estimated from the number of maternal and paternal orphans using a regression model fitted to DHS data. This predicts the excess risk of dual orphanhood over and above that expected if the death of parents were independent. Available DHS data suggest that the regression equation estimates the number of dual orphans to within ±10% if the prevalence of maternal and paternal orphanhood is known without error.

As an alternative to our approach, one could use a regression analysis of the DHS data to model paternal as well as dual orphan numbers. We attempted this, modeling paternal and dual orphanhood as a function of maternal orphanhood, the children's age, and characteristics of the population (eg, HIV prevalence, background mortality, levels of polygyny). Even with maternal orphanhood given directly by the surveys, our best-fitting model failed to predict the number of paternal orphans to within 20% of the correct value for 30% of the data points.

Comparison of orphan estimates based on our methodology with DHS data for Tanzania in 1992 and 1999 reveals good agreement (see Fig. 4). The relation between the fraction of children who are maternal orphans and the fraction who are paternal or dual orphans is similar in the model estimates and the survey data, and both sets of estimates show similar increases in the prevalence of orphanhood with age. Because model estimates give the prevalence of orphanhood by cause, they are able to indicate the extent to which HIV/AIDS has driven up orphan numbers in Tanzania over the last decade.

Of course, estimates of orphan numbers derived using the methods described in this study are only as accurate as the underlying demographic estimates and assumptions. The estimated numbers of maternal and paternal orphans are most sensitive to mortality and fertility estimates. Examining Equation 3 reveals a direct linear relation between changes in the numbers of adult deaths or births and the number of maternal orphans. Estimates of the number of paternal orphans show a similar relationship (Equation 8). Child survival is also directly related to the prevalence of orphanhood, but observed variation in child survival is small compared with that in the adult mortality and fertility estimates. Variation in other parameters, such as concordance of parental HIV status or the probability of mother-to-child transmission of HIV, has a much smaller impact on orphan numbers.

Estimates of dual orphanhood depend on the validity of the DHS data used to derive the regression coefficients in Table 2. One limitation of the DHS data is that household surveys do not provide information on children outside households, notably those living in institutions and street children. Such children seem more likely than children living in private households to be orphans, especially dual orphans. In most of Africa, however, relatively few children are institutionalized or homeless. The DHS may also be failing to enumerate orphans because of misclassification of foster children as living with their biologic parent or failure to interview child-headed households in which orphanhood is common. No “gold standard” exists against which DHS data on orphanhood can be evaluated. It is encouraging that both questions about the survival of parents were answered for more than 97% of children in the 34-survey database, with answers missing to either or both questions for fewer than 4% of children in every survey. In addition, the results pass crude tests of their plausibility. The proportion of children who are orphaned uniformly rises with age, paternal orphanhood is more common than maternal orphanhood, and the prevalence of orphanhood tends to be high in countries with severe AIDS epidemics or a recent history of warfare. Finally, the reported proportions of children in a survey with a dead mother or father are not associated with the proportions of living children who live with the parent in question. Thus, no evidence exists that absent parents tend to be reported as dead or vice versa.

A more extensive assessment of the validity of orphan estimates for sub-Saharan Africa produced by UNAIDS and WHO using these methods is presented elsewhere.^{34} Comparison of model estimates of the prevalence of orphanhood with estimates from 19 DHSs and 24 UNICEF-sponsored multiple indicator cluster surveys (MICSs) in 39 countries found good agreement after adjusting earlier estimates of mortality from causes other than AIDS. Significantly, the discrepancies in countries with extensive HIV epidemics were no larger than those in countries with limited HIV prevalence, and in 80% of the comparisons, the model estimates of maternal and paternal orphans fell within ±40% of the survey estimates.
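The agreement measure quoted here (the share of comparisons in which the model estimate falls within a given relative tolerance of the survey estimate) can be computed as in the sketch below. The function name and the paired values are hypothetical and serve only to show the calculation; they are not data from the validation study.

```python
def share_within(model, survey, tolerance):
    """Fraction of paired estimates where the model value lies within
    plus or minus `tolerance` (a proportion) of the survey value."""
    hits = sum(abs(m - s) <= tolerance * s for m, s in zip(model, survey))
    return hits / len(model)


# Hypothetical orphan prevalences for five country-survey comparisons.
model_est = [0.10, 0.065, 0.09, 0.20, 0.05]
survey_est = [0.08, 0.05, 0.10, 0.10, 0.05]
share = share_within(model_est, survey_est, tolerance=0.40)
```

In this made-up example, four of the five model values fall within ±40% of the survey value; the fourth pair, where the model value is double the survey value, does not.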

Without doubt, estimating and projecting the numbers of AIDS and other orphans in Africa involves numerous assumptions about demographic and epidemiologic processes and rates that remain poorly understood and inadequately quantified. Nevertheless, assumptions have to be made about most of these parameters to produce population projections for any African population experiencing an AIDS epidemic. Estimating orphanhood involves additional calculations, but the extra information required is rather limited and the assumptions involved no more heroic than those that have to be made for population projections. The methods we describe for the estimation of orphan numbers replace conjecture with explicit data-driven models that can support evidence-based policy and planning in response to the AIDS-related orphan crisis.

## ACKNOWLEDGMENTS

*The authors thank Mohammed Ali and Christl Donnelly for their statistical advice and members of the UNAIDS Epidemiology Reference Group for their suggestions and assistance. In particular, Basia Zaba shared her estimates of excess mortality among orphans in Africa, Hania Zlotnik provided us with access to details of the United Nations Population Division's demographic estimates, and John Stover and Neff Walker supplied demographic projections from Spectrum that incorporate UNAIDS estimates and projections of HIV prevalence*.

## REFERENCES

*Report on the Global HIV/AIDS Epidemic, June 2000*. Geneva: UNAIDS; 2000.

*Children on the Brink 2000: Executive Summary, Updated Estimates and Recommendations for Intervention*. Washington, DC: USAID; 2000.

*Children on the Brink 2002: A Joint Report on Orphan Estimates and Program Strategies*. Washington, DC: USAID; 2002.

*Report on the Global HIV/AIDS Epidemic, July 2002*. Geneva: UNAIDS; 2002.

*Théorie Analytique des Associations Biologiques*. Paris: Hermann et Cie; 1939.

*International Population Conference, Liège, 1973*. Liège: International Union for the Scientific Study of Population; 1973:111-123.

*Popul Stud (Camb)*. 1984;38:255-279.

*Popul Stud (Camb)*. 1994;48:435-458.

*AIDS*. 2000;14(Suppl):S57-S74.

*JAMA*. 2000;283:1167-1174.

*AIDS*. 2002;16(Suppl):W1-W16.

*AIDS*. 2003;17:1827-1834.

*J Acquir Immune Defic Syndr*. 2003;33:393-404.

*AIDS*. 2003;17:389-397.

*Popul Stud (Camb)*. 1994;48:333-340.

*Popul Bull UN*. 1992;33:47-63.

*HIV/AIDS Surveillance Database*. Washington, DC: International Programs Center, Population Division, US Census Bureau; 2001.

*Health Transit Rev*. 1997;7:113-126.

*J Acquir Immune Defic Syndr*. 1990;3:83-86.

*Scand J Immunol Suppl*. 1992;11:81-83.

*AIDS*. 1995;9:951-954.

*AIDS*. 1991;5:61-67.

*AIDS*. 1989;3:519-523.

*AIDS*. 1997;11:1765-1772.

*AIDS*. 1995;9:951-954.

*Int J STD AIDS*. 2000;11:468-473.

*AIDS*. 1995;9:745-750.

*Biomed Environ Sci*. 1993;6:348-351.

*AIDS*. 1999;13:1083-1089.

*Bull World Health Organ*. 2000;78:1175-1191.

*Lancet*. 2002;360:284-289.

*Spectrum System of Policy Models* [computer program]. Glastonbury, CT: Futures Group International; 2001.

*World Population Prospects: The 2000 Revision*, vol. 1: *Comprehensive Tables*. New York: United Nations; 2002.

*Popul Stud (Camb)*. 2004;58:207-217.

*AIDS*. 1998;12(Suppl):S41-S50.

*Lancet*. 1998;351:98-103.

*AIDS*. 1999;13:2133-2141.

*Sex Transm Dis*. 2000;27:243-248.

## APPENDIX

The data sources and assumptions for parameter estimates are presented in Table A1.

**Keywords:** orphanhood; AIDS orphans; AIDS impact; projections; sub-Saharan Africa