Optimal timing of highly active antiretroviral therapy (HAART) initiation in HIV-infected persons is unclear.1 Current guidelines recommend starting HAART when CD4+ lymphocyte count (CD4) falls below 350 cells/mm3, although these guidelines are based on nonrandomized studies potentially subject to lead-time and selection biases.2–4
Two recent studies from the Antiretroviral Therapy Cohort Collaboration5 and the North American AIDS Cohort Collaboration on Research and Design6 addressed the optimal time to initiate HAART. Both studies used large, multisite cohorts of observational data and applied statistical methods to avoid lead-time bias and minimize confounding. The former study5 made use of historical data from the pre-HAART era to account for deaths and AIDS-defining events before HAART initiation. These researchers concluded that HAART should be initiated some time before CD4 drops below 350. The latter study6 employed a method proposed by Robins et al7,8 to analyze the observational data as if they were from a randomized trial, comparing treatment strategies for starting HAART when CD4 >500 versus ≤500 and starting HAART when CD4 = 351–500 versus ≤350. This study concluded that HAART should be initiated at CD4 >500.
These studies improve our understanding of the optimal timing of HAART initiation, but they have limitations. First, both studies were nonrandomized and used retrospective data. The Antiretroviral Therapy Cohort Collaboration study5 assumed that rates of death due to AIDS among untreated patients in the pre-HAART era were similar to those in the HAART era within particular CD4 strata—a strong assumption given temporal changes in AIDS care and outcomes.9,10 The other study6 compared broad CD4 ranges and was based on a rule defined as starting HAART within 6 months of a particular CD4 level; those who died within the 6 months before HAART initiation were assigned to the delayed treatment group, which could be a potential source of bias.11 Neither study directly considered serious non-AIDS defining events12 or incorporated the health of asymptomatic persons without events at the end of follow-up.
In the present study, we estimated the optimal CD4 threshold for HAART initiation for a cohort of HIV-infected persons in Nashville, Tennessee. We used a method proposed by Robins et al,8 which is similar to the approach applied by the North American AIDS Cohort Collaboration on Research and Design,6 in that it analyzes the observational data as if they were from a randomized trial. However, rather than comparing the impact of starting HAART in one stratum (defined using a broad range of CD4 counts) versus another CD4 stratum, our approach directly estimates the optimal CD4 at which to start therapy to maximize health at a specified time in the future. For example, in the earlier analysis6 the researchers treated their data as if they were from a 2-armed clinical trial comparing the rules “start HAART when CD4 >500” versus “start HAART when CD4 ≤500.” The method we used analyzes the data as if they came from a multiarmed clinical trial, where a study subject is assigned to 1 of 550 possible treatment rules corresponding to “starting HAART within 3 months of the first CD4 measurement below 201, 202, …, 750.” Rather than computing a single hazard ratio, this approach computes expected health for each treatment rule and then estimates which treatment rule yields the best expected health. We studied multiple measures of health based on mortality, AIDS-defining events, non-AIDS-defining events, CD4, and quality of life.
Study Population and Definitions
We conducted a retrospective observational cohort study among persons treated at the Comprehensive Care Center, an outpatient clinic in Nashville, TN. The study population included all patients who established care and had at least 2 provider visits between 1998 and 2007. Eligible individuals were those with no prior antiretroviral use at the Center who had at least one CD4 measurement in the range 200–749 before experiencing an AIDS-defining event or initiating HAART. Study entry was the date of the first CD4 level between 200 and 749. Follow-up ended at death, 31 December 2007, or the last clinic visit for persons lost to follow-up. During the study period, healthcare coverage was available to virtually all HIV-infected Tennesseans.13
AIDS-defining events were based on US Centers for Disease Control and Prevention classification criteria, excluding CD4 <200 cells/mm3.14 Non-AIDS events were based on recommendations of an endpoints committee composed of infectious disease clinicians. Examples include acute myocardial infarction and cirrhosis of the liver; a complete list of non-AIDS events considered for this study is found in the eAppendix (https://links.lww.com/EDE/A410). Subjects were considered lost to follow-up if they had no clinic encounter for more than a year before the date of death or 31 December 2007, whichever came first. All times were measured to a precision of days. HAART was defined as any regimen containing 3 or more active antiretroviral therapy agents.3 The study was approved by the Vanderbilt University Medical Center Institutional Review Board.
Given a set of candidate rules, “start HAART within 3 months of first CD4 measured below x,” where x = 201, 202, …, 750, we sought to estimate which value of x would result in the best expected patient health k months after study entry. To address this question, we followed the methodology of Robins et al.8 This methodology requires specifying a measurement of patient health at k months (the outcome variable), and the treatment rules, x, that are compatible with each patient's CD4 and HAART initiation history (the explanatory variable). We considered k = 6, 12, 24, and 36 months and employed inverse probability weights to account for potential bias due to nonrandom assignment of treatment rules and patient dropout.
Our analyses required specifying a metric of each patient's health at time k months. This outcome, y, is termed a “utility” or “health metric” and in our analysis was a function of death, AIDS-defining events, non-AIDS-defining events, and CD4, if asymptomatic. The utility was an arbitrary but reasoned quantity; in our analyses we used the following utilities:
- CD4 count-based: y = most recent CD4 at or before month k if the patient was alive and asymptomatic at month k; y = 100 if the patient had an AIDS or non-AIDS event by month k but was not deceased; and y = 100 × (t − 12 − k)/18 if the patient died before month k, where t = months to death (Fig. 1A).
- Quality of life: Based on a validated quality-of-life scale incorporating death, type of AIDS or non-AIDS event, and CD4 at month k (see Fig. 1B and eAppendix [https://links.lww.com/EDE/A410] for formula).15 In this utility death was assigned y = 0, AIDS or non-AIDS events were assigned y between 0.56 and 0.65 depending on the specific type of event, and asymptomatic patients were assigned y between 0.78 and 0.95 depending on their most recent CD4 at or before month k.
For both utilities, a low value of y corresponded to a poor outcome. Utility 1 was elicited a priori from consultations with the Vanderbilt-Meharry Center for AIDS Research Epidemiology/Outcomes group. This utility was based on CD4 count and assigned a patient with an AIDS or non-AIDS event by month k the same utility as an asymptomatic patient with CD4 = 100; individuals who died were assigned negative utility scores, with those who died earliest given larger negative values. Utility 2 also incorporated death, AIDS and non-AIDS events, and CD4, but was based on a published quality-of-life scale15 and used type of first AIDS event. In both utilities, if a patient had either type of event and subsequently died before time k, then the death was recorded. If both AIDS and non-AIDS events occurred, the earlier took precedence.
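As an illustration only, the three cases of utility 1 can be written as a small function. This is a sketch based on the definition given above, not the study's posted analysis code; the function and argument names are hypothetical, and treating a death exactly at month k as "not before month k" is an assumption.

```python
from typing import Optional


def cd4_utility(k: float, last_cd4: Optional[float], had_event: bool,
                death_month: Optional[float]) -> float:
    """Utility 1 at month k (hypothetical helper; names are not from the paper).

    last_cd4:    most recent CD4 at or before month k (None if unavailable)
    had_event:   True if an AIDS or non-AIDS event occurred by month k
    death_month: months from study entry to death (None if alive)
    """
    if death_month is not None and death_month < k:
        # Death takes precedence over any earlier event; the score is
        # negative and becomes more negative the earlier the death.
        return 100.0 * (death_month - 12 - k) / 18.0
    if had_event:
        # Same utility as an asymptomatic patient with CD4 = 100.
        return 100.0
    if last_cd4 is None:
        raise ValueError("asymptomatic patient needs a CD4 at or before month k")
    return float(last_cd4)
```

For k = 12, a death at month 6 gives 100 × (6 − 12 − 12)/18 = −100, and a death at month 3 gives a lower value, consistent with earlier deaths receiving larger negative scores.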
Following Robins et al,8 consider a multiarmed trial where each patient is randomly assigned a value of x between 201 and 750, and then asked to follow the rule “start HAART within 3 months of first CD4 measurement below x.” In such a trial, suppose a patient was assigned the rule with x = 400. If this patient's first CD4 measurement below (but not equal to) 400 was 350, and if he started HAART within 3 months of this measurement, then this patient was adherent to his assigned rule. Notice that although this patient was randomized to the rule “start HAART within 3 months of the first CD4 measured below 400,” his CD4/treatment history was also consistent with the rules “start HAART within 3 months of the first CD4 measured below 399, 398, …, 351.” In contrast, if this patient had not started HAART within 3 months of his CD4 measurement of 350 or had started HAART before his CD4 was measured below 400, he would have been nonadherent to his assigned rule.
Using this model, we examined each patient's CD4 and HAART initiation history and determined compatible rules for each patient. It should be noted that for the purpose of computing rules, follow-up stopped at the earliest date of HAART initiation, first AIDS event, last visit, death, or k months.
Table 1 contains some hypothetical examples matching treatment histories to regimen rules. Consider Patient A: his first CD4 measurement was 400, his next was 350 at month 3, and he then started HAART in month 4. This patient's data were compatible with the rules “start HAART within 3 months of first CD4 measured below x = 351, …, 400.” Had this patient been assigned the rule with x = 400, he would have been compliant because the first CD4 measured below (but not equal to) 400 was 350, and he started HAART 1 month after this observation. However, this patient's data were not compatible with the rule “start HAART within 3 months of first CD4 measured below x = 401,” because his first CD4 measurement below 401 (CD4 = 400 at month 0) was taken more than 3 months before he started HAART. Similarly, this patient's data were not compatible with the rule “start HAART within 3 months of first CD4 measured below x = 350” because he started HAART without ever having a CD4 measured below 350.
Patients whose data were not compatible with any regimen rule were artificially censored at the date their data became incompatible. For example, Patient B did not start HAART within 3 months of CD4 = 250, but then started HAART in month 4, 1 month after CD4 = 300. Therefore, his data were not consistent with any x, and he was artificially censored when he started HAART. Patient C's data were also inconsistent with all rules as he started HAART more than 3 months after his last CD4 measurement. Patient D failed to start HAART within 3 months of his first CD4 below 201, so his data were not consistent with any rule and he was artificially censored 3 months after his first CD4 below 201. In contrast, Patient E was consistent with x = 201, …, 350.
It should be noted that regimen rules were based on measured rather than actual CD4. For example, Patient F started HAART within 3 months of his first CD4 measured below 750, although it is likely that his actual CD4 dropped below 750 more than 3 months before he initiated HAART but was not observed. Patient G's history is similar to Patient F's with the additional CD4 measurement of 250 taken at month 1. Although this measurement may have prompted the initiation of HAART, Patient G also started HAART within 3 months of his first CD4 measured below 750. Similarly, Patient H was assigned all rules x = 201, …, 750. Patient I was also assigned all treatment rules because the study ended less than 3 months after his first CD4 measurement, and it is therefore unclear whether he was deferring treatment until a lower CD4 or preparing to start. Finally, Patients J, K, and L never started HAART, but were consistent with the rules “start HAART within 3 months of the first CD4 measured below x = 201, …, 250,” Patient J because he never had a CD4 measurement below 250 and Patients K and L because their next measurements were less than 3 months before k = 6. The complete algorithm used for determining compatible rules is given in the eAppendix (https://links.lww.com/EDE/A410).
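The matching of histories to rules can be sketched in a few lines. The following is a simplified illustration, not the complete algorithm from the eAppendix: it ignores AIDS events and the timing of artificial censoring, the function and argument names are hypothetical, and the handling of the window boundaries (start counted as "within 3 months" if it occurs no more than 3 months after the triggering measurement) is an assumption.

```python
from typing import List, Optional, Sequence, Tuple

WINDOW = 3  # months allowed between the triggering CD4 and HAART start


def compatible_rules(cd4: Sequence[Tuple[float, float]],
                     haart_month: Optional[float],
                     end_month: float,
                     rules=range(201, 751)) -> List[int]:
    """Thresholds x whose rule is compatible with one patient's history.

    cd4:         (month, value) pairs sorted by month
    haart_month: month HAART started, or None if never started
    end_month:   end of follow-up used for rule assignment
    """
    stop = end_month if haart_month is None else haart_month
    out = []
    for x in rules:
        # Month of the first CD4 measured strictly below x, if any.
        trigger = next((m for m, v in cd4 if v < x and m <= stop), None)
        if haart_month is None:
            # Never started: compatible if nothing has triggered the rule
            # yet, or the 3-month window was still open at end of follow-up.
            ok = trigger is None or end_month - trigger < WINDOW
        else:
            # Started: a trigger must exist and the start must fall in its window.
            ok = trigger is not None and haart_month - trigger <= WINDOW
        if ok:
            out.append(x)
    return out
```

With Patient A's history (CD4 of 400 at month 0, 350 at month 3, HAART at month 4) this returns x = 351, …, 400, and with Patient B's history (CD4 of 250 at month 0, 300 at month 3, HAART at month 4) it returns no rules, matching the descriptions above.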
Weighted Regression Models
To find the rule for starting HAART that maximized health at k months, we regressed y on x and found the value of x that achieved the maximum predicted value of y. Each individual contributed as many values of (x, y) as the number of rules compatible with their data. For example, a person who had data compatible with starting HAART within 3 months of their first CD4 below x = 201, …, 250 contributed 50 data points (x, y) to the analysis: (201, y), (202, y), …, (250, y); their outcome y was the same for all x. We fit a curve to the (x, y) pairs of all eligible patients using weighted least squares regression, with x expanded using restricted cubic splines to allow the relationship between x and y to be nonlinear. Our splines used 6 knots, at default positions of the Design package16 of R statistical software version 2.8.1 (http://www.r-project.org).
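The expansion of each person into one (x, y) pair per compatible rule, followed by a weighted fit and a search for the maximizing x, might look like the sketch below. Two simplifications are assumptions of this illustration rather than features of the analysis: a quadratic in x stands in for the 6-knot restricted cubic spline, and each patient carries a single weight instead of one weight per compatible rule. All names are hypothetical.

```python
import numpy as np


def expand(patients):
    """patients: iterable of (compatible_rules, y, weight) -> arrays x, y, w.

    Each patient contributes one row per compatible rule, with the same y.
    """
    xs, ys, ws = [], [], []
    for rules, y, w in patients:
        for x in rules:
            xs.append(x)
            ys.append(y)
            ws.append(w)
    return np.array(xs, float), np.array(ys, float), np.array(ws, float)


def best_rule(x, y, w, grid=None):
    """Weighted least squares of y on a quadratic in x; return argmax on grid."""
    if grid is None:
        grid = np.arange(201, 751)

    def basis(v):
        z = (v - 475.0) / 275.0  # rescale x to keep the fit well conditioned
        return np.column_stack([np.ones_like(z), z, z ** 2])

    sw = np.sqrt(w)  # WLS is ordinary least squares on sqrt(w)-scaled rows
    beta, *_ = np.linalg.lstsq(basis(x) * sw[:, None], y * sw, rcond=None)
    fitted = basis(grid.astype(float)) @ beta
    return int(grid[np.argmax(fitted)])
```

A patient compatible with x = 201, …, 250 yields 50 rows from `expand`, all sharing the same y, as described above.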
Inverse Probability Weights
Persons whose data were compatible with a given rule may have had characteristics different from those whose data were compatible with other rules. To address this potential source of bias, we used inverse probability weighting methods in the manner described by Cain et al.17 Briefly, for months 0, 1, …, k, we estimated the probability of initiating HAART using logistic regression and the covariates age, sex, race, injection drug use as HIV risk factor, most recent CD4, CD4%, and HIV-1 RNA, time since most recent laboratory measurements, and time in care. Months were included in the model using restricted cubic splines. For each compatible treatment rule x per person, we then computed the predicted probability of remaining compatible with x for each month of follow-up. Based on CD4 history, if a patient could not be artificially censored from a particular rule x at a given month, then the probability of remaining compatible with x for that patient in that month was 1, otherwise it was 1 minus the probability of initiating HAART. Weights were computed as the product of the inverse of these probabilities over the k months of follow-up.
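The final step of this weighting, multiplying the inverse monthly probabilities of remaining compatible with a rule, can be illustrated as follows. This is a minimal sketch: the monthly probabilities of starting HAART are taken as given (in the analysis they come from the logistic model), and the function and argument names are hypothetical.

```python
from math import prod
from typing import Sequence


def rule_weight(p_start: Sequence[float],
                can_be_censored: Sequence[bool]) -> float:
    """Unstabilized inverse probability weight for one patient and one rule x.

    p_start[m]:         modeled probability of initiating HAART in month m
    can_be_censored[m]: True if starting HAART in month m would make the
                        patient incompatible with rule x
    """
    # Probability of remaining compatible is 1 when censoring is impossible,
    # otherwise 1 minus the probability of initiating HAART.
    p_remain = [1.0 - p if at_risk else 1.0
                for p, at_risk in zip(p_start, can_be_censored)]
    return prod(1.0 / q for q in p_remain)
```

For monthly probabilities 0.1, 0.2, and 0.5, where the patient could be censored only in the first and third months, the weight is 1/(0.9 × 1 × 0.5) ≈ 2.22.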
Some patients' health at time k months was unknown, due either to loss of follow-up or end-of-study censoring. Separate stabilized inverse-probability weights to address loss to follow-up and end-of-study censoring were computed. Our final weights were the product of the multiple inverse-probability weights. To reduce variability, the product of the weights was truncated at the 2.5th and 97.5th percentiles.18
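Truncating the combined weights at the 2.5th and 97.5th percentiles amounts to clipping them to those two values. A minimal sketch with NumPy follows; the function name is hypothetical, and NumPy's default linear-interpolation percentile may differ slightly from the percentile definition used in the analysis.

```python
import numpy as np


def truncate_weights(w, lower_pct=2.5, upper_pct=97.5):
    """Clip weights to the given lower and upper percentiles of their distribution."""
    w = np.asarray(w, dtype=float)
    lo, hi = np.percentile(w, [lower_pct, upper_pct])
    return np.clip(w, lo, hi)
```

Weights inside the two percentiles are unchanged; only the extreme tails are pulled in, which limits the influence of a few very large weights.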
Confidence intervals (CIs) for the optimal rule were constructed from the 2.5th and 97.5th percentiles of 500 bootstrap replications. All model-fitting procedures were repeated for each bootstrap replication. Details of all models are in the eAppendix (https://links.lww.com/EDE/A410), and analysis scripts are posted at http://biostat.mc.vanderbilt.edu/WhenToStartHaartCode.
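The percentile bootstrap described here can be sketched generically. In this illustration `estimator` is a hypothetical stand-in for the whole pipeline (weight models, weighted regression, and maximization), which would need to be rerun on each resample of patients; the seed and function names are assumptions.

```python
import numpy as np


def bootstrap_ci(data, estimator, n_boot=500, seed=0):
    """Percentile bootstrap interval: 2.5th and 97.5th percentiles of n_boot re-estimates.

    data:      array with one entry (or row) per patient
    estimator: function mapping a resampled array to a single number
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    estimates = [
        estimator(data[rng.integers(0, n, size=n)])  # resample patients with replacement
        for _ in range(n_boot)
    ]
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    return float(lo), float(hi)
```

Because the optimal rule is restricted to 201–750, bootstrap estimates that pile up at the upper limit produce intervals ending at 750.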
Of 2011 patients with at least 2 provider visits, 1034 met inclusion criteria. Four hundred thirty were excluded because their first pre-HAART CD4 <750 was under 200, 42 had no pre-HAART CD4 <750, 232 had a prior AIDS-defining event, 240 had been on prior non-HAART antiretroviral therapy, and 33 had both a prior AIDS-defining event and prior antiretroviral therapy.
Of the 1034 included patients, 73% were male, 42% African-American, and 8% had injection drug use as probable infection route. At study entry, the median age was 35 years (interquartile range (IQR) = 28–42), and the median CD4 was 403 (301–528). The median follow-up was 35 months (14–65), and the median number of visits per year was 6.4 (4.5–9.5). Sixty percent started HAART during follow-up; among those initiating HAART, the median time to initiation was 4.1 months (1.7–17.3). The median CD4 before HAART initiation was 342 (264–462). Male sex, high CD4, and low HIV-1 RNA were associated with lower odds of starting HAART.
During follow-up, 93 patients died (9%), 82 experienced at least 1 AIDS event (8%) (25 of these patients later died), and 20 had a non-AIDS event (2%) (7 of these patients later died). Table 2 contains the number of patients who had an event within 6, 12, 24, and 36 months of study entry. Table 2 also includes the number of persons without an outcome at 6, 12, 24, and 36 months due to loss to follow-up, end-of-study censoring, or artificial censoring because their data were incompatible with all treatment rules. Male sex, younger age, and lower CD4% were generally associated with more loss to follow-up.
Optimal CD4 to Initiate HAART
Figure 2 demonstrates estimation of the optimal CD4 to initiate HAART to maximize health 12 months after study entry, based on utility 1 (Fig. 2A–D) and utility 2 (Fig. 2A,B,E,F). We estimated that health 12 months after study entry was maximized by following the rules “start HAART within 3 months of first CD4 measurement below 554” (95% CI = 459–750) and 354 (288–386) for utilities 1 and 2, respectively. Notice that the confidence intervals correspond to rules in Figures 2D and 2F where the best-fitting curves were similar to their maximum levels. Note also that most patients were asymptomatic after 12 months. Therefore, most utility 2 scores were between 0.9 and 0.95 (Fig. 2E), and a relatively small number of deaths had a large influence on estimates.
Similar analyses were performed for both utilities at k = 6, 12, 24, and 36 months. Estimates and 95% confidence intervals for the optimal CD4 level to start HAART for both utilities and at all time points are given in Figure 3. Results were dependent on the choice of utility and the follow-up period.
To maximize health as defined by utility 1 at k months, the optimal rule was to “start HAART within 3 months of first CD4 measurement below” 495 (95% CI = 468–522), 554 (459–750), 489 (427–750), and 509 (460–750) for k = 6, 12, 24, and 36 months, respectively. The confidence intervals widened for increasing k because fewer people were followed for the longer periods of time. In contrast, to maximize utility 2 (quality-of-life) at k months, the optimal rule for starting HAART was to “start within 3 months of first CD4 measurement below” 337 (201–442), 354 (288–386), 358 (294–750), and 475 (287–750) for k = 6, 12, 24, and 36 months, respectively.
Secondary analyses considering alternative utilities and regimen rules are reported in the eAppendix (https://links.lww.com/EDE/A410). Briefly, analyses that did not include non-AIDS events in the utility gave results similar to those presented earlier (eg, optimal rule estimated as “start HAART within 3 months of CD4 measured below 563” instead of 554 for CD4-based utility at k = 12 months). Analyses that assigned worse health metrics to individuals who had changed regimens favored starting HAART at somewhat lower CD4 counts (eg, 414 for CD4-based utility at k = 12 months). Analyses that included only candidate rules in the range x = 201–500 yielded estimates that were generally a little lower than the primary estimates (eg, 449 for CD4-based utility at k = 12 months). Analyses comparing the candidate rules x = 201–500 but restricted to those with at least one pre-HAART CD4 ≥500 generally favored starting HAART at slightly lower CD4 levels (eg, 404 for CD4-based utility at k = 12 months). We also performed secondary analyses to estimate the optimal rule for starting a modern, efavirenz-based regimen, artificially censoring subjects who started other regimens. Our estimated CD4 thresholds for starting efavirenz-based HAART were slightly lower (eg, 509 for CD4-based utility at k = 12 months). The results of other secondary analyses using utilities based on survival, AIDS-defining event (ADE)-free survival, and AIDS/non-AIDS-events-free survival were quite variable, presumably due to small numbers of events.
We have applied a novel approach to directly estimate the optimal CD4 level for initiating HAART. The approach mimicked a series of randomized trials and defined patient health by more than just death and ADE. Our estimates were sensitive to the choice of patient health metric. Our health metric, which substantially differentiated between asymptomatic patients with widely different CD4 at the end of follow-up, favored starting HAART early, at CD4 levels around 500. In contrast, the quality-of-life health metric, which distinguished very little between asymptomatic patients with low and high CD4 at the end of follow-up, tended to favor initiating HAART later, at lower CD4 levels.
There are several advantages to this analytic approach. Our analyses account for lead-time bias without incorporating historical controls. Prior studies have classified patients into categories based on CD4, and then estimated the optimal CD4 at which to start HAART, using hazard ratios comparing the different categories. In contrast, we have directly estimated the optimal CD4 and computed confidence intervals for this estimate. This approach therefore does not require arbitrary categorization.19 Our treatment rules were based on starting HAART within 3 months of CD4 measured below a particular level, in contrast to the 6 months used recently by the North American AIDS Cohort Collaboration on Research and Design.6 Three months was chosen because it is the typical length of time between visits at our clinic. It is worth noting that with this analytic approach, pre-HAART deaths within 3 months after study entry do not bias results11 (see eAppendix, https://links.lww.com/EDE/A410). Finally, we also directly incorporated non-AIDS events into our analysis, which may be associated with HAART.20,21
An apparent disadvantage with our analysis was that it required defining a health metric at a specified time of follow-up, and the choice of this metric greatly affected our conclusions. Estimates based on utility 1 were closer to results from the North American AIDS Cohort Collaboration on Research and Design study6: start HAART at high CD4 counts, possibly more than 500. In contrast, estimates based on utility 2 using the same outcomes (death, AIDS events, non-AIDS events, and CD4 if asymptomatic) favored starting HAART at lower CD4 levels, which is more consistent with results from the Antiretroviral Therapy Cohort Collaboration study5 and current guidelines. We believe both utilities are reasonable measures for defining health, and thus we present both sets of results. Utility 2 was based on a previously published quality-of-life score, where there was substantial separation in the metric between those who died and those who were living but little separation between those who were alive with various CD4 levels. Hence, the relatively few deaths in our study had a large impact on analyses with this utility (see Fig. 2). In contrast, utility 1 was elicited from our clinicians/study investigators a priori and put more emphasis on differences between CD4 outcomes in asymptomatic patients. Perhaps if we had data from more patients, particularly those who subsequently died, estimates for the optimal CD4 to start HAART using the 2 different utilities would converge. But this may not be the case; indeed, how one defines health plausibly has great impact on when one chooses to initiate HAART. We cannot give results under all possible utilities. The arbitrary nature of utility choice may therefore lead one to favor analyses that consider only the hard end point of death—which implicitly assigns equal health scores to all living patients. 
However, keeping patients alive is not the only goal of modern HIV-therapy; it is also important to consider the impact of the timing of treatment initiation on other outcomes.
Our study included all patients who had at least 1 CD4 within the range 200–750. A randomized controlled trial may include only those who had a CD4 measurement above a specific threshold (eg, 750) and then subsequently dropped below that threshold. In a secondary analysis we limited our study to persons who had pre-HAART CD4 ≥500, and we defined study entry as the date of their first CD4 <500. Most patients do not present to care with CD4 ≥500; therefore, the number of patients was limited for this analysis, and results may not be as generalizable. We believe the question of when to start HAART should not be limited to the small subset of individuals who enter care with CD4 ≥500 (discussed in eAppendix, https://links.lww.com/EDE/A410).
Our study has other limitations. First, the number of patients was relatively small (particularly the number of deaths), and follow-up time was limited. Analyzing data from more patients followed for a longer time period would improve the precision of our estimates. In addition, the impact of HAART on non-AIDS events may be greater after many years of use. Second, we included patient data only from the southeastern United States, and thus conclusions may not be applicable to other populations. Third, although we controlled for many clinically important factors, as with all observational studies there may have been residual confounding or model misspecification, thus biasing results. Fourth, our goal was to estimate the rule that maximized health; estimates of maxima are generally quite variable and tend to favor the lower and upper limits of the allowable rules (201 and 750) (see the eAppendix, https://links.lww.com/EDE/A410). In addition, our candidate rules for starting HAART were defined using only CD4, whereas other factors (eg, injection drug use) should likely be included in such a decision. Because our rules for initiating HAART are based on measured CD4, not actual CD4 counts, the frequency of CD4 measurements might have affected results. Finally, our analyses considered only the health of infected patients and ignored the impact of the timing of HAART on HIV transmission.
In conclusion, we have applied a novel method to estimate the optimal CD4 for initiating HAART. Similar analyses should be performed in larger observational datasets. Ongoing and future randomized trials should consider the effect of the timing of HAART initiation on various health-metrics in addition to AIDS and death.
1. Hammer SM, Eron JJ Jr, Reiss P, et al. Antiretroviral treatment of adult HIV infection: 2008 recommendations of the International AIDS Society-USA panel. JAMA.
2. United States Department of Health and Human Services. Guidelines for the use of antiretroviral agents in HIV-1-infected adults and adolescents (AIDS Info, US Dept. HHS). November 3, 2008. Available at: http://www.aidsinfo.nih.gov/guidelines/. Accessed September 29, 2009.
3. Egger M, May M, Chene G, et al. Prognosis of HIV-1-infected patients starting highly active antiretroviral therapy: a collaborative analysis of prospective studies. Lancet.
4. Cole SR, Li R, Anastos K, et al. Accounting for leadtime in cohort studies: evaluating when to initiate HIV therapies. Stat Med.
5. Sterne JA, May M, Costagliola D, et al. Timing of initiation of antiretroviral therapy in AIDS-free HIV-1-infected patients: a collaborative analysis of 18 HIV cohort studies. Lancet.
6. Kitahata MM, Gange SJ, Abraham A, et al. Effect of early versus deferred antiretroviral therapy for HIV on survival. N Engl J Med.
7. Hernán MA, Lanoy E, Costagliola D, Robins JM. Comparison of dynamic treatment regimes via inverse probability weighting. Basic Clin Pharmacol Toxicol.
8. Robins J, Orellana L, Rotnitzky A. Estimation and extrapolation of optimal treatment and testing strategies. Stat Med.
9. Palella FJ Jr, Delaney KM, Moorman AC, et al; HIV Outpatient Study Investigators. Declining morbidity and mortality among patients with advanced human immunodeficiency virus infection. N Engl J Med.
10. Mocroft A, Ledergerber B, Katlama C, et al; EuroSIDA Study Group. Decline in the AIDS and death rates in the EuroSIDA study: an observational study. Lancet.
11. Hernán MA, Robins JM. Early versus deferred antiretroviral therapy for HIV [letter]. N Engl J Med.
12. The Strategies for Management of Antiretroviral Therapy (SMART) Study Group. CD4+ count-guided interruption of antiretroviral treatment. N Engl J Med.
13. Bailey JE, Van Brunt DL, Raffanti SP, Long WJ, Jenkins PH. Improvements in access to care for HIV and AIDS in a statewide Medicaid managed care system. Am J Manag Care.
14. Centers for Disease Control and Prevention (CDC). 1993 revised classification system for HIV infection and expanded surveillance case definition for AIDS among adolescents and adults. MMWR Recomm Rep.
15. Freedberg KA, Scharfstein JA, Seage GR III, et al. The cost-effectiveness of preventing AIDS-related opportunistic infections. JAMA.
16. Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med.
17. Cain LE, Robins JM, Lanoy E, et al. When to start treatment? a systematic approach to the comparison of dynamic regimes using observational data. Int J Biostat.
18. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models. Am J Epidemiol.
19. Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ.
20. Palella FJ Jr, Baker RK, Moorman AC, et al; HIV Outpatient Study Investigators. Mortality in the highly active antiretroviral therapy era: changing causes of death and disease in the HIV outpatient study. J Acquir Immune Defic Syndr.
21. Lau B, Gange SJ, Moore RD. Risk of non-AIDS-related mortality may exceed risk of AIDS-related mortality among individuals enrolling into care with CD4+ counts greater than 200 cells/mm3. J Acquir Immune Defic Syndr.