In 2001 the number of people living with diagnosed HIV infection in England, Wales and Northern Ireland rose by 16% from the previous year. This rise is the largest proportional annual increase since 1995, when comparable data were first collected through the annual Survey of Prevalent HIV Infections Diagnosed (SOPHID). To predict the future HIV-related care burden, the data were extrapolated.
The reduction in AIDS incidence and HIV-related deaths, due to the widespread use of anti-retroviral drugs, and the growing impact of the global epidemic have contributed to a curvilinear rate of growth in the number of HIV patients seen for care by 2001. This has led us to explore models other than the previously used linear regression model to extrapolate the data. In this paper we present the methodology used to adjust the annual survey results and to extrapolate the data, and present estimates for 2002 to 2004 using a negative binomial model and, for comparison, a linear model.
The SOPHID survey was established in 1995 to satisfy the growing need for residence-based data for individuals with diagnosed HIV infection [2,3]. It aims to include every individual in England, Wales and Northern Ireland with diagnosed HIV infection who has attended statutory services for HIV-related treatment or care within a calendar year.
The SOPHID survey collects information on patient residence and provides local health planners with estimates of the number of individuals within their local population who have diagnosed HIV infection. No names are collected. De-duplication is achieved by use of soundex code of surname, sex and date of birth. Strict attention to confidentiality at every stage of data collection, analysis and storage has established and maintained the confidence of participants in the surveys.
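The de-duplication key described above can be sketched as follows. This is a minimal illustration that assumes the standard American Soundex variant; the source does not state which Soundex variant SOPHID uses, and the function names `soundex`, `dedup_key` and `deduplicate` and the record layout are hypothetical. Because no names are collected, the code would in practice be derived at the reporting site and only the code submitted.

```python
from datetime import date

# Standard American Soundex letter-to-digit mapping (an assumed variant).
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(surname: str) -> str:
    """Return the 4-character American Soundex code of a surname."""
    letters = [c for c in surname.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        # H and W do not separate letters sharing a code; vowels do.
        if c not in "HW":
            prev = digit
    return (code + "000")[:4]


def dedup_key(surname: str, sex: str, dob: date) -> tuple:
    """Hypothetical matching key: Soundex code, sex and date of birth."""
    return (soundex(surname), sex.upper(), dob.isoformat())


def deduplicate(records: list) -> list:
    """Keep the first record seen for each matching key."""
    unique = {}
    for rec in records:
        key = dedup_key(rec["surname"], rec["sex"], rec["dob"])
        unique.setdefault(key, rec)
    return list(unique.values())


# Invented example records, for illustration only.
records = [
    {"surname": "Robert", "sex": "M", "dob": date(1970, 1, 1)},
    {"surname": "Rupert", "sex": "m", "dob": date(1970, 1, 1)},
    {"surname": "Robert", "sex": "F", "dob": date(1970, 1, 1)},
]
unique_records = deduplicate(records)
```

The first two invented records share a key and collapse to one, which also shows the trade-off of this approach: distinct surnames with the same code, sex and date of birth are treated as the same person.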
Adjustments to observed data 1996–2001
The annual survey figures for 1996 to 2001 were adjusted to estimate the number of people with diagnosed HIV infection for each year and used as the basis for extrapolation to 2004. Only adults (15 years of age or above) resident in England, Wales or Northern Ireland were included in the analysis. For the 5% of records where route of exposure to HIV infection was not reported, exposures were proportionally allocated on the basis of sex, region and ethnicity to the exposure groups ‘sex between men’, ‘heterosexual sex’ and ‘other’. Patient residence was summarized as ‘within’ or ‘outside’ London. Estimates by exposure and patient residence combination were produced. Matching to estimate both under-reporting and non-attendance was based on soundex code, date of birth and sex.
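Proportional allocation of records with unreported exposure can be illustrated with a short sketch. It assumes the allocation is done separately within each stratum of sex, region and ethnicity, in proportion to the records with known exposure in that stratum; the function name `allocate_unknown` and the counts are hypothetical and are not taken from the survey.

```python
def allocate_unknown(known_counts: dict, n_unknown: float) -> dict:
    """Distribute n_unknown records across exposure groups in proportion
    to the known counts within a single stratum."""
    total_known = sum(known_counts.values())
    if total_known == 0:
        raise ValueError("no records with known exposure in this stratum")
    return {
        group: count + n_unknown * count / total_known
        for group, count in known_counts.items()
    }


# Invented counts for one stratum, for illustration only.
stratum_known = {"sex between men": 600, "heterosexual sex": 300, "other": 100}
allocated = allocate_unknown(stratum_known, n_unknown=50)
```

In this invented stratum the 50 unreported records are split 30, 15 and 5, so the stratum total rises from 1000 to 1050 while the exposure proportions are unchanged.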
An annual adjustment for under-reporting to SOPHID was calculated by comparing the SOPHID data to other sources of HIV reporting data that indicate that a patient has attended treatment and care services during the year and therefore qualified for inclusion in SOPHID. The two sources used were the national reports of new diagnoses of HIV infections or AIDS, and the national CD4 surveillance data.
An adjustment for the number of individuals with diagnosed HIV not attending statutory services in a single year is calculated at the end of each SOPHID survey, using the data for that year and the two previous ones. The 2001 non-attendance adjustment, for example, was based on the number of individuals seen for care in both 1999 and 2001 but not 2000, as a proportion of the 2000 SOPHID total. The proportion not seen for care in 2000 was then applied to the 2001 data on the assumption that it provided the best available estimate of non-attendance for that survey.
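The non-attendance proportion described above can be sketched with set operations on de-duplicated patient keys. The function name `non_attendance_proportion` and the patient identifiers are hypothetical; the sketch only computes the proportion and makes no assumption about how it is then applied to the current survey total.

```python
def non_attendance_proportion(seen_two_years_ago: set,
                              seen_last_year: set,
                              seen_this_year: set) -> float:
    """Individuals seen two years ago and this year but not last year,
    as a proportion of last year's survey total."""
    missed_last_year = (seen_two_years_ago & seen_this_year) - seen_last_year
    return len(missed_last_year) / len(seen_last_year)


# Invented patient keys, for illustration only.
seen_1999 = {"A", "B", "C", "D"}
seen_2000 = {"A", "B", "E", "F"}
seen_2001 = {"A", "C", "D", "E", "G"}
proportion = non_attendance_proportion(seen_1999, seen_2000, seen_2001)
```

Here patients C and D were seen in 1999 and 2001 but not in 2000, giving 2 non-attenders against a 2000 total of 4.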
Using the annual adjustments, a combined adjustment for under-reporting and non-attendance of 13.5% was calculated for 1996 to 2001. Combined adjustment figures were then calculated for each exposure by patient residence group by applying proportional allocation. To obtain end-of-year prevalence (instead of the period prevalence for the whole year), individuals who were reported to have died, and any additional deaths identified through the HIV/AIDS reporting system, were removed from the final adjusted datasets.
Statistical modelling: extrapolations 2002–2004
STATA 7.0 (Stata Corp., College Station, Texas, USA) was used for statistical analysis. The plot of the adjusted counts, particularly for the heterosexually infected group, looked exponential in character, rather than the previously seen linear relationship.
A number of model diagnostic and fitting techniques were applied to the 1996 to 2001 adjusted counts to find a model that yielded a good fit. The linear regression model, previously used to provide extrapolation estimates, was shown to no longer adequately fit the data. The curvature was modelled using quadratic, Poisson and negative binomial models. A negative binomial model was found to be the most appropriate to model the temporal trend for each of the exposure by patient residence groups.
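The contrast between the two models can be written out explicitly. The following is a sketch under stated assumptions: it uses the common mean-dispersion parameterisation of the negative binomial model with a log link and calendar year as the only covariate. The source does not give the exact specification, and the symbols $Y_t$, $\mu_t$, $\alpha$, $\beta_0$, $\beta_1$, $\gamma_0$ and $\gamma_1$ are introduced here for illustration.

```latex
% Negative binomial model for the adjusted count Y_t in year t (assumed form)
Y_t \sim \mathrm{NegBin}(\mu_t, \alpha), \qquad
\log \mu_t = \beta_0 + \beta_1 (t - 1996), \qquad
\operatorname{Var}(Y_t) = \mu_t + \alpha \mu_t^2

% Implied constant proportional growth per year
\frac{\mu_{t+1}}{\mu_t} = e^{\beta_1}

% Linear model, for comparison: constant absolute growth per year
\operatorname{E}(Y_t) = \gamma_0 + \gamma_1 (t - 1996), \qquad
\operatorname{E}(Y_{t+1}) - \operatorname{E}(Y_t) = \gamma_1
```

Under this form, setting $\alpha = 0$ recovers the Poisson model, and an exponential-looking plot of counts against year corresponds to a positive $\beta_1$.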
Model adequacy was explored by plotting residuals against the predicted values, and the residuals were tested for autocorrelation. Point and interval estimates of the extrapolated counts from the negative binomial model and, for comparison, from linear regression are presented. These have been projected only to 2004 because it is uncertain whether the large increase in new diagnoses seen in 2001 will continue.
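One common way to check residuals for first-order autocorrelation is the Durbin-Watson statistic. The source does not say which test was used, so the following is only an illustrative sketch; the function name `durbin_watson` and the residual values are hypothetical.

```python
def durbin_watson(residuals: list) -> float:
    """Durbin-Watson statistic: the sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    numerator = sum(
        (later - earlier) ** 2
        for earlier, later in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Invented residuals for six annual observations, for illustration only.
example_residuals = [0.4, -0.3, 0.1, -0.2, 0.3, -0.3]
dw = durbin_watson(example_residuals)
```

Values near 2 suggest little first-order autocorrelation, values towards 0 suggest positive autocorrelation and values towards 4 suggest negative autocorrelation. With only six annual observations any such test has very limited power.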
Adjusted data 1996–2001
Table 1 shows the adjusted data (1996 to 2001) and the extrapolated data (2002 to 2004) by route of infection for within and outside London and for England, Wales and Northern Ireland as a whole. The increase in diagnosed HIV prevalence between 1996 and 2001 in London was 89% (8529 to 16 136) and outside London 103% (5351 to 10 839). For the whole of England, Wales and Northern Ireland the increase was 94% (13 880 to 26 975). During this period the number of diagnosed HIV infections ascribed to sex between men increased by 62% (8791 to 14 234), and to heterosexual sex by 213% (3550 to 11 129). The number of infections attributed to other routes, including injecting drug use, increased by only 5% (1539 to 1612).
Extrapolations 2001–2004: 2001 adjusted totals compared to 2004 negative binomial and linear estimate totals
The negative binomial model predicted an increase of 56% (26 975 to 42 047) in diagnosed HIV prevalence in England, Wales and Northern Ireland between 2001 and 2004. In comparison the linear model would have predicted an increase of 25% (26 975 to 33 680). Figure 1 shows the best-fit curves, using the negative binomial model, by route of infection within London and outside London for 1996 to 2004.
Within London and outside London
The negative binomial model predicted a lower rate of increase between 2001 and 2004 in the number of individuals with HIV infection resident within London [54% (16 136 to 24 805)] than outside London [60% (10 839 to 17 367)]. In contrast, the linear model would have predicted a higher rate of increase within London [26% (16 136 to 20 270)] than outside London [24% (10 839 to 13 410)].
Route of infection
The negative binomial model predicted a 34% increase in prevalent diagnosed HIV infections among individuals infected through sex between men (14 234 to 19 011) between 2001 and 2004 and a 92% increase in those infected heterosexually (11 129 to 21 335). The linear model would have predicted a 22% increase in individuals infected through sex between men (14 234 to 17 344) and a 32% increase in those infected heterosexually (11 129 to 14 638).
The negative binomial model predicted that in 2004, for the first time, there will be in total more heterosexually infected individuals (21 335) receiving HIV-related care than individuals infected through sex between men (19 011). The linear model would have predicted diagnosed HIV infections acquired through sex between men to remain the larger number in 2004 (17 344 compared with 14 638 heterosexually acquired infections).
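The percentage increases and the predicted crossover reported above follow directly from the 2001 adjusted totals and the 2004 estimates. The short check below uses only figures quoted in the text; the variable and function names are hypothetical.

```python
# Figures quoted in the text: 2001 adjusted totals and 2004 estimates.
ADJUSTED_2001 = {"sex between men": 14234, "heterosexual sex": 11129}
NEG_BINOMIAL_2004 = {"sex between men": 19011, "heterosexual sex": 21335}
LINEAR_2004 = {"sex between men": 17344, "heterosexual sex": 14638}


def pct_increase(start: int, end: int) -> int:
    """Percentage increase from start to end, rounded to a whole percent."""
    return round(100 * (end - start) / start)


def increases(estimates_2004: dict) -> dict:
    """Percentage increase 2001 to 2004 for each exposure group."""
    return {
        group: pct_increase(ADJUSTED_2001[group], estimates_2004[group])
        for group in ADJUSTED_2001
    }


def heterosexual_exceeds_msm(estimates: dict) -> bool:
    """True if heterosexually acquired infections outnumber those
    acquired through sex between men."""
    return estimates["heterosexual sex"] > estimates["sex between men"]


neg_binomial_increases = increases(NEG_BINOMIAL_2004)
linear_increases = increases(LINEAR_2004)
crossover_neg_binomial = heterosexual_exceeds_msm(NEG_BINOMIAL_2004)
crossover_linear = heterosexual_exceeds_msm(LINEAR_2004)
```

The crossover appears only under the negative binomial estimates, which is the difference between the two models that matters most for planning.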
The observed increase in diagnosed HIV prevalence in the survey data 1996 to 2001 can be partially explained by the continuing role of anti-retroviral drugs in delaying death. In England, Wales and Northern Ireland the reported deaths of HIV-infected adults were at their highest in 1995 at 1530, compared with 332 deaths which were identified through the HIV/AIDS reporting system for 2001 (deaths reported by end of March 2003). However, the main explanation for the increase is the large rise in the number of new diagnoses, in particular individuals who have acquired HIV heterosexually.
Among 4724 adult individuals newly diagnosed in England, Wales and Northern Ireland in 2001 (reported by end of March 2003), 34% (1605) were reported as infected through sex between men and 58% (2747) as infected heterosexually. Of the individuals infected heterosexually, and for whom a probable country of infection had been recorded (2614), 90% (2348) acquired HIV outside the United Kingdom, and for 81% (2117) this acquisition was in an African country.
A large rise in heterosexually acquired HIV infections, as compared with a steadier rise in infections acquired through sex between men, resulted in reports of newly diagnosed individuals infected through heterosexual sex overtaking those infected through sex between men in 1999. The negative binomial model predicts that the same will happen for prevalent diagnosed HIV infections in 2004. The linear model would not have predicted this change.
Alongside the large increase in the number of heterosexually acquired HIV infections, the SOPHID data show another trend. Between 1996 and 2001 there was a small decrease in the proportion of HIV-infected individuals who were resident in London compared with those resident outside London, falling from 62% to 60%. The negative binomial model predicts that this gradual decrease in the proportion of individuals with HIV infection resident in London will continue, falling to 59% in 2004. However, any change in proportional distribution of residence is dwarfed by the large increases in actual numbers of diagnosed HIV-infected individuals.
We have chosen a simple approach to categorizing geography. Within both areas of residence the prevalence of diagnosed HIV infections varies. However, adjusting the observed data at a finer geographical level would be problematic because of small numbers.
Inspection of the 2001 values found that, when compared with the adjusted data, both the negative binomial and linear models underestimate the prevalence of diagnosed heterosexually acquired HIV infections, in particular for individuals resident outside London. The negative binomial model was shown to provide a better fit than the linear model to the adjusted data for the heterosexual group. Therefore, the 92% increase in heterosexually acquired HIV infections predicted by the negative binomial model is probably more realistic than the 32% increase predicted by the linear model. For infections acquired through sex between men and other routes the results from the two models were similar.
Extrapolations for future numbers of diagnosed HIV infections, and their geographical distribution, are subject to many uncertainties. These have been emphasized previously [1,8] and, having been reviewed, currently are:
The movement of HIV-infected people between countries with a high prevalence of HIV (particularly from sub-Saharan Africa) and the UK.
The dispersal of new migrants, including asylum seekers, to different parts of the country.
How long highly active anti-retroviral therapy (HAART) will continue to delay death.
The incidence of new HIV infections within the UK.
The number of those with previously undiagnosed infection that are recognized as a result of developing an HIV-related illness.
The number of those with previously undiagnosed infection that are recognized as a result of promotion of HIV testing, including antenatal diagnoses.
The increases predicted by either model will have serious implications for service providers. In particular, the predicted increases in diagnosed HIV will have an impact on genitourinary medicine (GUM) clinics in terms of workload, adding to the burden caused by the rising numbers of other sexually transmitted infections.
The SOPHID survey would not be possible without the SOPHID advisory team, the active collaboration of nominated facilitators and service providers, and funding from the Department of Health. Their participation is gratefully acknowledged. The help and advice of Janet Mortimer and the statistical support of Andre Charlett and Daniella De Angelis are also gratefully acknowledged.