#### Introduction

Since the HIV pandemic began, monitoring trends in HIV incidence has been critical as this indicator reflects the rate of new infection and helps to evaluate prevention efforts. Sustained or increasing HIV incidence has been of serious concern in many populations and has driven new approaches to controlling the epidemic.

Despite its importance, HIV incidence is difficult to measure. Classically, one follows a cohort of uninfected persons to measure the rate of new infection. Cohort studies, however, are expensive, time-consuming and subject to bias introduced by selective recruitment and attrition, study effect and aging ^{[1]}. Other indirect approaches to determine incidence such as serial seroprevalence studies or seroprevalence amongst young persons may be useful but their interpretation can be problematic ^{[2]}.

In July 1998, Janssen *et al*. ^{[3]} reported a remarkable new approach to measure HIV incidence. The technique, termed the Serological Testing Algorithm for Recent HIV Seroconversion (STARHS), uses a laboratory assay that differentiates recent from long-standing HIV infection. Janssen *et al*. described a ‘detuned assay’ based on the principle that HIV antibody titre is low early in infection and increases subsequently. If the time to develop the antibody response is characterized, one can calculate HIV incidence from a single specimen. This approach is efficient as only specimens confirmed positive by the diagnostic assay are tested; also, STARHS assays are generally inexpensive. This is in contrast to approaches in which specimens are tested for p24 antigen ^{[4]} or viral RNA ^{[5]} to identify recently infected persons in whom these markers are present and antibody is still undetectable. These approaches are expensive as both positive and negative specimens must be tested and imprecise when HIV incidence is low or the preseroconversion period is short.

Measuring HIV incidence from single specimens is extremely attractive and has been likened to alchemy, in which the ‘philosopher's stone’ transforms lead into gold. Nevertheless, there are potentially serious problems in the application of STARHS, which have received limited attention to date. When incidence is determined from STARHS in specimens collected for diagnostic purposes, HIV test-seeking behaviour may bias incidence estimates. Persons may present for HIV testing due to factors associated with HIV infection such as symptoms of primary HIV infection. This illness has been termed ‘acute retroviral syndrome’ or ‘seroconversion illness’; it commonly presents as a flu-like illness 2–4 weeks following infection ^{[6–8]} and may be severe enough to require hospitalization ^{[6,9]}. Men who have sex with men (MSM), in particular, are aware of seroconversion illness and its significance, and the occurrence of influenza-like illness may precipitate a decision to be tested for HIV earlier than they might otherwise have been ^{[10,11]}. Recently infected persons may also present soon after exposure because of anxiety related to high-risk sexual behaviours or other exposures. Finally, patients recently infected by HIV may present because they acquire a symptomatic sexually transmitted infection (STI), leading them to seek treatment. We have named the phenomenon of earlier HIV testing the ‘seroconversion effect’ (SCE).

In Ontario, we have been using STARHS since 1999 but were concerned that the resulting HIV incidence estimates may be difficult to interpret. We observed incidence densities substantially higher ^{[12,13]} than those from cohort studies in similar populations and higher than estimates derived from independent modelling ^{[14]}. We therefore wished to better conceptualize and quantify the potential bias introduced by HIV testing behaviours.

In preparatory work, we empirically observed the impact of testing bias due to SCE. If HIV testing were independent of events related to infection, measured HIV incidence should be independent of the mean window period used in the calculation, that is, calculated HIV incidence would be the same at different mean window periods. Using data on testing from the Ontario HIV diagnostic service, we observed that HIV incidence was substantially lower using longer mean window periods. We also realized that it was theoretically possible to quantify the magnitude of this bias from the actual data. This is potentially important as it could provide the basis for adjusting for testing bias introduced by SCE.

#### Methods

##### Empirical evidence of seroconversion effect

To determine whether there is empirical evidence for SCE, we analysed data from the Ontario HIV Laboratory, which performs essentially all diagnostic HIV testing in Ontario ^{[15]}. We calculated HIV incidence varying the standardized optical density (SOD) cutoff values corresponding to a window period of 70–336 days using the basic formula described by Janssen *et al*. ^{[3]}.

##### Theoretical basis for the analytic approach

Though our study initially used a simulation approach ^{[16]}, we later realized that an exact solution was possible and developed a formula to express the measured incidence as a function of several input parameters.

We began with the basic formula to calculate incidence ^{[3]}. We then modified this expression, not only incorporating terms that included the true incidence but also incorporating other parameters that may introduce bias. Our formula incorporated HIV incidence, study duration, window period of the detuned assay and the intertest interval. We were, in particular, interested in determining to what extent measured HIV incidence varied as a function of these parameters. The output was the bias factor, that is, the ratio of the measured incidence (i.e. by STARHS) to the true incidence input in the formula.

We incorporated two potentially important sources of bias related to HIV testing. The first is the SCE as described above. We also hypothesized that persons at higher risk of HIV infection may test for HIV more frequently. This could potentially increase HIV incidence determined through STARHS.

For the purpose of our analysis, we set the period of increased probability of an HIV test related to seroconversion illness or high-risk behaviour at 90 days (about 3 months) following infection, because an event that incites a person to be tested would most likely do so within this period.

##### Derivation of the formula incorporating seroconversion effect

We developed an algebraic expression of the model parameters to calculate the estimated incidence in situations with nonuniform testing intervals and to evaluate the extent of bias in incidence estimates. We began with the basic formula used to calculate incidence from STARHS as described by Janssen *et al*.:

In which *I*_{est}, estimated (calculated) incidence; *N*_{dis}, number of discordant STARHS results (diagnostic assay + and STARHS assay −); *N*_{test}, total number of tests in the period of observation; *T*_{win}, mean window period of the detuned assay (time from diagnostic + to detuned +).
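Written out with the symbols just defined, the basic estimator takes the following form. This is a reconstruction from the definitions above rather than a quotation, and it assumes that *T*_{win} is expressed in the time unit in which incidence is reported (for example, years for incidence per person-year):

$$
I_{\mathrm{est}} = \frac{N_{\mathrm{dis}}}{N_{\mathrm{test}} \times T_{\mathrm{win}}}
$$

As a purely hypothetical illustration, 10 discordant results amongst 1000 tests with a window period of 133 days (133/365 ≈ 0.364 years) would give 10/(1000 × 0.364) ≈ 0.027 infections per person-year, or about 2.7 per 100 person-years.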

The derivation of the formula for the measured HIV incidence as a function of the true incidence and SCE is described in detail in the Appendix. It is expressed as follows:

In which *I*_{true}, true incidence; *I*_{est}, calculated incidence; *T*_{win}, window period of the sensitive/less sensitive (S/LS) assay; *N*, number of persons tested; *T*_{test}, mean time between HIV tests (intertest interval); *T*_{obs}, study period (period under observation); *T*_{sce}, time period used to define *P*_{sce}; *P*_{sce}, proportion of newly infected persons having ‘additional’ testing in *T*_{sce}.

All time period parameters (*T*) are expressed in the same units (days or years). *N* represents the number of persons and *I* represents the incidence expressed as the number of new HIV infections per person per unit time, in the same unit as *T*. It is assumed that *I*_{true} and *T*_{test} are constant (uniform) across *T*_{obs}, *T*_{obs} is greater than *T*_{win} and *P*_{sce} is uniform across *T*_{sce}.
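To make the roles of these parameters concrete, the following Python sketch computes a bias factor under a deliberately simplified model. It is not the formula derived in the Appendix: the function name `bias_factor` and its assumptions (regular routine testing, one extra test per SCE-affected infection, low incidence) are introduced here for illustration only.

```python
def bias_factor(i_true, t_win, t_test, p_sce):
    """Ratio of STARHS-measured to true incidence under a simplified model.

    Assumptions (illustrative only, not the Appendix formula):
      * everyone tests routinely every t_test, with t_test >= t_win;
      * a fraction p_sce of new infections triggers one extra test
        within t_sce <= t_win, so those infections are all classified recent;
      * incidence is low enough to ignore depletion of uninfected persons.
    All times share one unit; i_true is infections per person per that unit.
    The number of persons and the study duration cancel out of the ratio.
    """
    if t_test < t_win:
        raise ValueError("this sketch assumes t_test >= t_win")
    # Recent (discordant) results per person per unit time of observation:
    # SCE-driven tests always fall in the window; routine tests do so with
    # probability t_win / t_test.
    recent = i_true * (p_sce + (1.0 - p_sce) * t_win / t_test)
    # Tests per person per unit time: routine tests plus SCE-driven extras.
    tests = 1.0 / t_test + i_true * p_sce
    i_est = recent / (tests * t_win)
    return i_est / i_true


# Hypothetical inputs: 1 infection per 100 person-years, yearly testing, P_sce = 20%.
for t_win_days in (97, 133, 244):
    print(t_win_days, round(bias_factor(0.01, t_win_days / 365.0, 1.0, 0.2), 2))
```

Under these assumptions the factor equals 1 when *P*_{sce} is zero, grows as the window period shortens or the intertest interval lengthens, and does not depend on study duration.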

##### Parameter values

The formula was evaluated for parameter values amongst MSM in Ontario, a province in Canada with intermediate levels of HIV incidence and prevalence. The initial calculations were carried out using base-case values for each of four parameters. We varied the values of these parameters over a range from minimum to maximum values for each combination of the four parameters. The parameter values (see Table 1 for base-case, minimum and maximum values of model parameters) and the methods used to derive their values were as follows.

##### Incidence

We carried out a literature search of cohort studies and modelled estimates of HIV incidence amongst MSM ^{[14,17–20]}.

##### Study duration

We examined the range of likely scenarios on the basis of data availability and timeliness and reviewed a number of studies that have been carried out on diagnostic data.

##### Window period

We used the window period as determined by the US Centers for Disease Control using the Vironostika HIV-1 Microelisa System (Bio-Mérieux, Inc, Durham, North Carolina, USA) ^{[21]}. We used a base case of 133 days, which is the window period of the HIVAB HIV-1 EIA (Abbott Laboratories, Abbott Park, Illinois, USA) used in the original detuned assay. The Vironostika assay allows the use of a variable window period from 70 to 336 days, each with a mean value for SOD established by the developers of the test.

##### Intertest interval

There are limited data on the mean intertest interval amongst groups at risk for HIV. We derived initial estimates by determining the estimated number of tests amongst MSM in Ontario ^{[22]} and the estimated number of MSM in Ontario ^{[14]}. We also carried out a literature search on testing patterns to determine whether our initial estimates were plausible.

We also wished to explore the potential impact of an association between HIV incidence and intertest interval. Persons at high risk for HIV infection may test more frequently, potentially introducing bias into the measurement of incidence. Although this may not always be the case, empirical data from the Argus study on MSM in Montreal revealed such an association (Lambert G., personal communication) as did a study from London, England ^{[23]}. We termed this ‘*I*_{true}–*T*_{test} interaction’. To determine the impact of such an association, we divided the population into five strata of incidence varying on a monotonic increasing scale and varied the intertest intervals inversely, that is, the strata with the highest incidence had the shortest intertest interval.
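The effect of such an interaction can be sketched in Python as follows. The stratum sizes, incidences and intertest intervals are hypothetical, and the calculation assumes no SCE, intertest intervals no shorter than the window period and low incidence; under those assumptions the measured incidence reduces to a test-frequency-weighted mean of the stratum incidences.

```python
def stratified_bias(sizes, incidences, intertest_intervals):
    """Measured/true incidence ratio when strata test at different rates.

    Assumes no seroconversion effect, each interval >= the window period,
    and low incidence, so a stratum contributes tests in proportion to
    size / interval and recent infections in proportion to
    size * incidence / interval (window period and study duration cancel).
    """
    test_rates = [n / t for n, t in zip(sizes, intertest_intervals)]
    measured = sum(r * i for r, i in zip(test_rates, incidences)) / sum(test_rates)
    true = sum(n * i for n, i in zip(sizes, incidences)) / sum(sizes)
    return measured / true


# Hypothetical five strata: incidence (per person-year) rising,
# intertest interval (years) falling.
sizes = [1000] * 5
incidences = [0.002, 0.005, 0.010, 0.015, 0.020]
intervals = [3.0, 2.5, 2.0, 1.5, 1.0]
ratio = stratified_bias(sizes, incidences, intervals)
print(round(ratio, 2))
```

With equal intertest intervals in every stratum the ratio is 1; it rises above 1 only when the higher-incidence strata test more often.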

##### Adjusting HIV incidence for seroconversion effect

The algebraic formula given above expresses the calculated HIV incidence as a function of true incidence and the SCE. Though we had no assurance that this formula would incorporate all potential sources of uncertainty, we observed from empirical data that the variation in HIV incidence observed as a function of mean window period was expressed closely by our algebraic formula with fitted values of true incidence and *P*_{SCE}. Therefore, we developed a procedure using a goodness-of-fit approach to determine the values of true incidence and *P*_{SCE} that generated values of incidence most precisely fitting the observed data. Using custom software written in APL+Win Version 4.0 (APL2000, Rockville, Maryland, USA), we varied the values of incidence over the range from zero to 20 per 100 person-years in increments of 0.01 and the value of *P*_{SCE} from zero to 40% in increments of 1%. The values of incidence density and *P*_{SCE}, which generated a modelled curve of incidence as a function of window period, were varied to minimize the sum of the squares of the residuals between the modelled and observed incidence curves. The software to carry out the bias adjustment is available from the authors at www.phs.utoronto.ca/ohemu. To test this application and adjust actual data, we used data from MSM and injection drug users (IDUs) tested in the diagnostic service of the Ontario HIV Laboratory to calculate HIV incidence at five different values of the window period from 97 to 244 days.
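The grid search described above can be sketched in Python as follows. This is not the authors' APL software. The function `toy_model` is a hypothetical stand-in for the algebraic formula (which is not reproduced here), the one-year intertest interval is assumed, and the three intermediate window periods are assumed values between the stated endpoints of 97 and 244 days.

```python
def toy_model(i_true, p_sce, t_win, t_test=365.0):
    """Hypothetical stand-in for the measured-incidence formula.

    i_true is in infections per 100 person-years, p_sce is a proportion,
    and t_win and t_test are in days. Illustrative assumption only.
    """
    return i_true * (p_sce * t_test / t_win + (1.0 - p_sce))


def fit_grid(windows, observed, model):
    """Exhaustive least-squares search over the grid described in the text."""
    best = None
    for i in range(0, 2001):        # incidence 0.00 to 20.00 per 100 person-years
        i_true = i / 100.0
        for j in range(0, 41):      # P_SCE 0% to 40% in steps of 1%
            p_sce = j / 100.0
            sse = sum((model(i_true, p_sce, w) - obs) ** 2
                      for w, obs in zip(windows, observed))
            if best is None or sse < best[0]:
                best = (sse, i_true, p_sce)
    return best


# Synthetic check: generate "observed" values from known parameters,
# then confirm that the search recovers them.
windows = [97, 133, 170, 207, 244]   # days; intermediate values are assumed
observed = [toy_model(1.0, 0.2, w) for w in windows]
sse, fitted_incidence, fitted_p_sce = fit_grid(windows, observed, toy_model)
```

Passing the model in as an argument keeps the search independent of the particular formula, so a full implementation of the algebraic expression could be substituted for `toy_model` without changing `fit_grid`.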

#### Results

##### Empirical evidence of seroconversion effect

Varying the SOD cutoff values equivalent to different window periods, we found that for MSM, and to a lesser extent for IDUs, the calculated HIV incidence was substantially lower when a longer mean window period was used. For MSM, calculated incidence was 2.28 per 100 person-years for a mean window period of 133 days; 1.69 per 100 person-years for 170 days and 1.16 per 100 person-years for 336 days. For IDUs, the calculated incidence for the same three window periods was 0.29, 0.26 and 0.21 per 100 person-years. These results provide empirical evidence for the occurrence of SCE.
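The size of this window-period dependence can be made explicit with a short Python calculation on the figures just quoted. The helper name `relative_to_longest` is introduced here for illustration; the data are those reported in the paragraph above.

```python
# Calculated incidence (per 100 person-years) by mean window period (days),
# taken from the text.
observed = {
    "MSM": {133: 2.28, 170: 1.69, 336: 1.16},
    "IDU": {133: 0.29, 170: 0.26, 336: 0.21},
}


def relative_to_longest(by_window):
    """Express each incidence as a ratio to the value at the longest window."""
    longest = max(by_window)
    return {w: v / by_window[longest] for w, v in sorted(by_window.items())}


ratios = {group: relative_to_longest(vals) for group, vals in observed.items()}
for group, by_window in ratios.items():
    print(group, {w: round(r, 2) for w, r in by_window.items()})
```

The incidence calculated at 133 days is about 1.97 times that at 336 days for MSM and about 1.38 times for IDUs; if testing were independent of infection, both ratios would be close to 1.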

##### Modelling the impact of testing bias

As indicated above, we quantified the bias as a ratio of calculated incidence to the true incidence over a wide range of combinations of the four parameters indicated above. For *P*_{SCE} equal to zero, and no interaction between incidence and intertest interval, we observed no bias. However, bias was present and, in some cases, to an important extent when *P*_{SCE} was not zero and *I*_{true}–*T*_{test} interaction was present. Table 2 shows a summary of the results for zero and two nonzero levels of *P*_{SCE} and *I*_{true}–*T*_{test} interaction. Clearly, SCE was a greater source of bias than *I*_{true}–*T*_{test} interaction. In the absence of SCE, the maximum bias was 1.35. However, in the presence of a high level of *P*_{SCE}, the mean and maximum biases were considerable, with bias as high as 7.26 at the highest tested level of *P*_{SCE} and *I*_{true}–*T*_{test} interaction. In general, in the presence of SCE, mean biases varied in the range of 2.12–3.57.

The bias observed over the range of values for each of the four parameters examined individually is summarized in Fig. 1(a–d). Figure 1a shows the effect of incidence and there appears to be a slightly decreased bias at increasing levels of incidence. There was no bias associated with study duration (see Fig. 1b). However, for the window period (see Fig. 1c), there was substantial bias, which was higher at shorter window periods. Finally, Fig. 1d displays the effect of varying intertest interval and its impact on bias; bias increased linearly and steeply with increasing intertest interval and the slope was dependent on the *P*_{SCE}.

##### Adjusting HIV incidence for seroconversion effect

Figure 2 displays observed and modelled data using the adjustment procedure as described above. The observed curve represents incidence as a function of mean window period amongst MSM in Toronto using the original formula provided by Janssen *et al*. ^{[3]} and the modelled curve represents incidence calculated using the more complex formula incorporating true incidence and *P*_{SCE} with the values of these parameters generated by our custom application. Table 3 presents some typical results of modelled and empirical values of incidence density at different window periods for several exposure categories over the period 2001–2006. Thus, the iterative goodness-of-fit procedure achieved a relatively close fit to the empirical data for the exposure categories examined.


#### Discussion

In an analysis of potential biases in calculating HIV incidence from STARHS, we found that bias due to test-seeking behaviours was considerable. The values we used were plausible and based mostly on empirical data from MSM in Ontario. It is possible that biases may be different in other study populations though the range of values we used probably incorporated those from most other study populations. One limitation of our study is that we did not investigate the bias for different values of the duration of SCE, which, in our analysis, was taken as 90 days. For some parameters, good data are not readily available, especially for the interaction of incidence and testing frequency. Available data from both diagnostic services and special studies should be examined to further characterize these parameters. Clearly, however, anecdotal evidence and our results suggest that SCE is an important source of bias. A recent study ^{[6]} in Ontario found that a substantial majority of MSM who undergo HIV testing had symptoms suggestive of a primary HIV infection. It is unclear to what extent other populations at risk are aware of the occurrence of flu-like illness accompanying primary HIV infection. In some populations at risk or in special situations such as the blood donor setting in which symptoms and temperature are the basis for deferral, such behaviour could result in a bias in the opposite direction.

We incorporated two possible sources of bias into our model but there is a third possible source of bias, which might also be important in interpreting incidence estimates derived from STARHS. Participants included in analyses to determine incidence may not be representative of the population for which the HIV incidence estimate is desired. In this regard, data collected from diagnostic testing databases may be particularly problematic. A good example is incidence based on STARHS using specimens collected amongst STI clinic patients. Thus, nonrepresentativeness introduces a further, potentially important, source of bias and must be taken into account in interpreting the results from STARHS.

To determine whether there was empirical evidence of bias, we reviewed three types of studies, which used STARHS to calculate HIV incidence, namely those using specimens collected for diagnostic purposes, epidemiologic studies and those examining blood donors. We hypothesized that bias would be present in the first group but to a lesser extent or not at all in the latter two groups.

Several studies ^{[24–28]} used diagnostic samples collected at STI clinics. In these studies, investigators observed very high incidence rates. A study by Weinstock *et al*. ^{[24]} in nine US STI clinics yielded an overall HIV incidence amongst MSM of 7.1 per 100 person-years. In Los Angeles, for example, the rate was 9.6 per 100 person-years as compared with a population-based estimate of 1 per 100 person-years at about the same time ^{[20]}. Such results are especially difficult to interpret because of the bias introduced by examining unrepresentative, high-risk persons.

Studies ^{[29–31]} from other diagnostic sites are also biased but probably to a lesser extent than those from STI clinics. Studies carried out at anonymous test centres from 2000 to 2001 in San Francisco yielded estimates of 2.5–3.3 per 100 person-years, about double the 1.4 per 100 person-years population-based estimate ^{[20]}. In Ontario, HIV incidence amongst MSM based on STARHS was 2.1 per 100 person-years ^{[13]}; this was considerably higher than our population-based estimate of one per 100 person-years ^{[14]}.

We reviewed two epidemiologic studies; these would be less likely to show the bias we describe above since the HIV test is proposed by the investigators rather than sought by the participant. A study of IDUs by Kral *et al*. ^{[32]} observed an HIV incidence of 1.2 per 100 person-years in 1987–1998, actually slightly lower than the population-based estimate of 1.9 per 100 person-years ^{[20]}. Nevertheless, the 95% confidence limits were 0.7–2.0 per 100 person-years and thus included the latter estimate. Martindale *et al*. ^{[33]} used STARHS to calculate HIV incidence from baseline positive sera collected in a cohort study of MSM in Vancouver, Canada. Incorporating the STARHS data had no significant impact on overall HIV incidence in their cohort.

One might expect that incidence calculated from STARHS amongst blood donors would be relatively unbiased as the decision to donate blood is less likely associated with the risk of HIV infection (some persons may donate blood to obtain an HIV test following a high-risk exposure but this is probably exceptional). In fact, studies ^{[34,35]} estimating HIV incidence amongst blood donors, both repeat and first time, yield very similar results to those from other methods ^{[36]}. Thus, as expected, there was little evidence of bias in this group. The results from STARHS are comparable to those from other methods based on seroconversion rates from repeat donors or RNA-yield rates from nucleic acid screening ^{[37]}.

The US Centers for Disease Control recently published a study using back calculation and a novel approach using the STARHS result to estimate HIV incidence in the United States ^{[38,39]}. They concluded that HIV incidence was substantially higher than had been previously estimated, especially in men having sex with men and African–Americans. They examined the possibility that ‘motivated testing’ (akin to our ‘SCE’) might introduce bias into their results and lead to an overestimate of HIV incidence and concluded that the impact of such bias was about 7%. This is considerably lower than our empirical findings amongst MSM in Ontario. The difference may be due to the fact that their results were based on populations with different HIV testing patterns. Also, they based their estimate on self-reported reasons for HIV testing, which may not reliably capture the full impact of testing bias. A study ^{[40]} of MSM undergoing HIV testing in California revealed that men significantly downplayed risk behaviours that led them to be tested.

As incidence derived from STARHS is subject to important bias, it is useful to explore approaches through which this bias could be estimated and removed. Specific population-based studies could help us to better understand the patterns of HIV test-seeking behaviour and potentially estimate the values of *P*_{SCE} and *I*_{true}–*T*_{test} interaction. In particular, it would be useful to know the level of knowledge about seroconversion illness and the likelihood that a person would seek an HIV test because of symptoms compatible with primary HIV infection. Their effects could then be removed in the analysis.

In summary, studies reporting calculated incidence using STARHS, especially if based on specimens collected for diagnostic purposes, must be interpreted with caution because of the considerable bias introduced by HIV test-seeking behaviour. Our study revealed that bias due to early testing related to the SCE was important and could lead to marked overestimation of HIV incidence.

We also developed a procedure to remove the SCE. We believe that our approach provides a valid solution to control for this bias. The close fit of our algebraic formula expressing measured incidence as a function of true incidence and testing bias compared with empiric data over a range of mean window periods provides reassurance that our approach appropriately captures the functional relationship.

Our adjustment does not control for the association between testing frequency and HIV incidence. Nevertheless, in our analyses, we found this to be relatively unimportant. Also, the results we obtained can be generalized only to persons undergoing HIV testing. Most persons in highly affected populations undergo HIV testing though there may be some persons who do not undergo testing, either because they perceive themselves at low risk or are at high risk but fearful of being diagnosed HIV positive. It would be useful to explore the potential impact of nonrepresentativeness of the testing populations on HIV incidence calculated from STARHS.

#### Acknowledgements

We wish to thank the following persons at the Ontario Ministry of Health and Long-Term Care: Frank McGee, Coordinator, AIDS Bureau for core funding; Carol Swantee, Carol Major, Lynda Healey and Denise Soliven, HIV Laboratory for carrying out the detuned assay testing and Mark Vandennoort, Instructional Media Centre for graphic support. Janet Raboud provided helpful advice during the development of the methods. We would also like to thank Neil Hershfield, Vancouver for developing the software for adjusting incidence for bias and Juan Liu, University of Toronto for carrying out the bias adjustment calculations. The Laboratory Enhancement Study was funded by the Ontario HIV Treatment Network from 1999 to 2001 and the Public Health Agency of Canada from 2001 to 2008.

R.S.R. developed the theoretical conception of the biases inherent in the application of the STARHS assay in estimating HIV incidence and the general approach to modelling it. He also identified the empirical evidence for the presence of the SCE. R.W.H.P. developed the statistical model to assess bias under different assumptions of parameter values and derived the algebraic formula to express measured HIV incidence as a function of true incidence and SCE.

#### Derivation of the formula

Equation 1, below, has been used to calculate the estimate of HIV incidence (*I*_{est}) using the sensitive/less sensitive assay (S/LS; also referred to as the STARHS or detuned assay) in screened populations ^{[3]}.

In a hypothetical study of duration *T*_{obs} of an HIV testing population of size *N* that tests at a mean intertest interval of *T*_{test}, has a true HIV incidence of *I*_{true} and is tested with a detuned assay with a window period of *T*_{win}, Equation 1 can also be expressed by another equation in which *N*_{dis} and *N*_{test} are replaced using only those variables (*N*, *T*_{test}, *T*_{obs}, *T*_{win} and *I*_{true}). This assumes that all temporal parameters (*T*) are expressed in the same units, that the true incidence, *I*_{true}, and the mean intertest interval, *T*_{test}, are constant (uniform) across the study period (*T*_{obs}) and that the duration of the study period is greater than or equal to the window period of the detuned assay (*T*_{obs} ≥ *T*_{win}).

In the absence of SCE:

From ^{[3]}