Reynolds, Nancy R PhD*; Sun, Junfeng PhD†; Nagaraja, Haikady N PhD‡; Gifford, Allen L MD§; Wu, Albert W MD, MPH∥; Chesney, Margaret A PhD¶
It is well established that a high degree of antiretroviral adherence is necessary for successful virologic and clinical response.1 Unfortunately, rates of adherence are often less than optimal. Effectively supporting adherence to antiretroviral medication regimens remains a critical clinical challenge.
Accurate assessment of adherence behavior is an essential component in efforts to understand and improve adherence to antiretroviral medications.1 In lieu of an established “gold standard,” several different methods have been developed to measure antiretroviral adherence.1 Electronic “real-time” measures are generally regarded as the most accurate measures currently available but are expensive and impractical to use in many clinical settings. Self-report measures are valued and popular for their convenience and practicality but have been found to overestimate rate of adherence and reduce the amount of variation captured.
The AIDS Clinical Trials Group (ACTG) Adherence Questionnaire,2 a 5-item self-report measure of adherence, has been used extensively to measure adherence to antiretroviral medications in the United States and internationally.3-5 Although investigators have employed different methods to summarize data collected with the questionnaire, the most common approach is simply to average responses to the first item (“How many doses did you miss yesterday/2 days ago/3 days ago/4 days ago?”). The other 4 questionnaire items that measure other aspects of adherence behavior (a longer period of time and adherence to antiretroviral schedule and instructions) have been analyzed to a lesser extent or disregarded completely in analyses of antiretroviral adherence. Because the full questionnaire captures multiple dimensions of adherence, we hypothesized that an estimate of adherence taking all items of the questionnaire into account would provide a stronger measure of adherence. We conducted a series of analyses to estimate the reliability of the 5-item scale, identify questionnaire items that did not add information to adherence with doses, and compare alternate methods of summarizing data collected with the ACTG Adherence Questionnaire.2 We anticipated that findings from this analysis would be useful to other investigators currently using the ACTG Adherence Questionnaire or seeking a valid and reliable self-report measure.
Study Design Overview
This was a secondary analysis of data collected prospectively from 640 subjects in 2 ACTG trials (370 and 398). We assumed that missing data were missing completely at random (MCAR)6 and only included the 464 subjects (2160 visits) with complete questionnaire data for all 5 items. This included 124 subjects from ACTG 370 (35 excluded) and 340 from ACTG 398 (141 excluded). We used principal component (PC) analysis to construct summary measures of adherence. Plasma HIV RNA level was used as the principal outcome variable for comparing the predictive validity of different combinations of adherence items. We also compared the predictive validity of questionnaire and medication event monitoring system (MEMS) data.
ACTG 370 (reference 7) was a rollover study of 159 subjects who had taken lamivudine plus zidovudine, stavudine, or didanosine in ACTG 306. Participants with viral loads >500 copies/mL were randomized to stavudine + delavirdine + indinavir, to zidovudine + lamivudine + indinavir, or to zidovudine + delavirdine + indinavir.
ACTG 398 (reference 8) was a randomized placebo-controlled trial of saquinavir, indinavir, or nelfinavir in combination with amprenavir (APV), abacavir, efavirenz (EFV), and adefovir in 481 subjects with protease inhibitor failure (plasma HIV RNA level >1000 copies/mL).
ACTG 370 and ACTG 398 obtained periodic measures of plasma HIV RNA concentration and of self-reported adherence using the ACTG Adherence Questionnaire. ACTG 398 also obtained an electronic measure of adherence to once-daily EFV and twice-daily APV using the medication event monitoring system, MEMS (Aardex, LTD, Union City, CA). Methods used in these trials have been described previously.7,8
Self-reported adherence was assessed with the ACTG Adherence Questionnaire.2 The questionnaire queried the patient on the number of doses of each medication missed during each of the 4 days before a clinic visit (eg, “How many doses did you miss yesterday, the day before yesterday, 3 days ago, and 4 days ago?”). Because patients usually took >1 drug, the adherence ratio for each of the 4 days prior was calculated as 1 − (number of doses missed for the day/number of doses prescribed), as shown in Table 1. By using the adherence ratio, we took the number of different drugs and the number of pills/doses into consideration. In addition, other adherence behavior was assessed with 4 questions regarding adherence with the daily schedule (see Table 1):
“Most study medications need to be taken on a schedule… How closely did you follow your specific schedule over the last 4 days?”
“Do any of your medications have special instructions… If so, how often did you follow those instructions over the last 4 days?”
“When was the last time you missed any of your medications?”
“Some people find that they forget to take their pills on the weekend days. Did you miss any of your medications last weekend (last Saturday or Sunday)?”
The ACTG Adherence Questionnaire was administered in each of the trials used in this study at baseline and at 4, 8, 16, 24, 32, 40, and 48 weeks (ACTG 370 had an extra measurement at 12 weeks).
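The daily adherence ratio defined above, and the simple 4-day average that is commonly reported from item 1, can be sketched as follows. This is a minimal illustration, not the trial's scoring code: the function names are hypothetical, and pooling missed and prescribed doses across all of a patient's drugs for each day is an assumption about how the ratio handles multiple drugs.

```python
from typing import Sequence


def adherence_ratio(doses_missed: int, doses_prescribed: int) -> float:
    """Adherence ratio for one recall day: 1 - (missed / prescribed)."""
    if doses_prescribed <= 0:
        raise ValueError("doses_prescribed must be positive")
    if not 0 <= doses_missed <= doses_prescribed:
        raise ValueError("doses_missed must lie between 0 and doses_prescribed")
    return 1 - doses_missed / doses_prescribed


def mean_four_day_recall(missed_by_day: Sequence[int], doses_prescribed: int) -> float:
    """Simple average of the daily ratios (the common item-1 summary)."""
    ratios = [adherence_ratio(m, doses_prescribed) for m in missed_by_day]
    return sum(ratios) / len(ratios)


# Hypothetical patient: 6 doses per day pooled over all drugs (assumption),
# with 1 dose missed yesterday and 2 doses missed 3 days ago.
missed = [1, 0, 2, 0]
daily_ratios = [adherence_ratio(m, 6) for m in missed]
average_recall = mean_four_day_recall(missed, 6)
```

Expressing each day as a ratio rather than a raw count keeps patients on regimens with different pill burdens on the same 0-to-1 scale.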
Plasma HIV RNA Concentration
Plasma HIV RNA concentration was used as the clinical criterion and primary outcome variable in the analyses. Plasma HIV RNA concentration was assessed sequentially in each of the trials at a minimum of baseline and 4, 8, 16, 24, 32, 40, and 48 weeks (ACTG 370 had an extra measurement at 12 weeks). Plasma HIV RNA concentration was measured in each trial in copies per milliliter. The threshold for detectability of HIV viral load was different for the 2 studies (50 copies/mL for ACTG 370 and 200 copies/mL for ACTG 398). To make the analysis of those 2 studies more comparable, however, we used 200 copies/mL as the threshold for dichotomizing the plasma HIV RNA copy numbers for both studies.
Electronic monitoring data of adherence for ACTG 398 were available for weeks 4, 8, 16, 24, 32, and 48. We used 2 summary measures of adherence based on MEMS data:9 (1) the proportion of compliant days (PCD) and (2) the proportion of doses taken (PDT). A compliant day (CD) was defined as a day on which the patient took the prescribed number of doses (1 for EFV and 2 for APV). If a patient took more than the number of prescribed doses in a day, that day was still considered a CD (CD = 1), but the proportion of doses taken for that day would be >1.
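As a rough sketch of the two MEMS summaries, the following code computes PCD and PDT from a list of daily cap-opening counts. The function names and the sample week are hypothetical, and aggregating PDT as total openings divided by total prescribed doses over the monitoring period is an assumption; the text defines the per-day quantities only.

```python
from typing import Sequence


def proportion_compliant_days(openings_per_day: Sequence[int], prescribed_per_day: int) -> float:
    """PCD: share of days with at least the prescribed number of doses.

    Days with extra doses still count as compliant (CD = 1).
    """
    compliant = sum(1 for n in openings_per_day if n >= prescribed_per_day)
    return compliant / len(openings_per_day)


def proportion_doses_taken(openings_per_day: Sequence[int], prescribed_per_day: int) -> float:
    """PDT: doses taken divided by doses prescribed over the period.

    Extra doses are not capped, so a single day can contribute more than 1.
    """
    prescribed_total = prescribed_per_day * len(openings_per_day)
    return sum(openings_per_day) / prescribed_total


# Hypothetical week of twice-daily APV monitoring (prescribed = 2 per day).
apv_week = [2, 2, 1, 3, 2, 0, 2]
pcd = proportion_compliant_days(apv_week, 2)
pdt = proportion_doses_taken(apv_week, 2)
```

In this toy week the extra dose on day 4 raises PDT but not PCD, which illustrates why PDT retains more of the dose-level information than PCD does.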
Stage 1 of the analyses was conducted to (1) estimate reliability of the ACTG Adherence Questionnaire items and (2) discern whether there were any items that did not provide information in addition to that provided by other questions (redundant, unnecessary items) for purposes of data reduction. The Cronbach α10 and correlational and logistic regression techniques were used. Variables were first standardized (divided by the sample standard deviation) before the Cronbach α was calculated, and acceptable reliability was set as values of 0.70 or greater.10 The presence of redundant items was assessed using correlational and logistic regression techniques. Logistic regression analysis was used to establish how well items were predicted by other variables, with an area >0.90 under the receiver operating characteristic (ROC) curve considered as producing outstanding discrimination.11
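The reliability step can be illustrated with a short sketch of Cronbach α computed on variables that are first divided by their sample standard deviation, as described above. It uses the usual formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score); the function name and the toy data are hypothetical and are not drawn from the trials.

```python
from statistics import stdev, variance
from typing import List, Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha for k variables, each given as a column of scores.

    Each column is divided by its sample standard deviation first, so every
    standardized item has variance 1.
    """
    k = len(items)
    standardized: List[List[float]] = []
    for column in items:
        sd = stdev(column)
        standardized.append([x / sd for x in column])
    # Total score per respondent across the standardized items.
    totals = [sum(row) for row in zip(*standardized)]
    item_variance_sum = sum(variance(column) for column in standardized)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy data: two variables with a Pearson correlation of 0.8.
item_a = [1, 2, 3, 4]
item_b = [1, 3, 2, 4]
alpha = cronbach_alpha([item_a, item_b])
meets_criterion = alpha >= 0.70  # a priori reliability threshold from the text
```

Because α rises with both the number of items and their average intercorrelation, the same threshold is easier to reach with 8 variables than with the 2 used here.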
In stage 2, we used PC analysis to construct summary measures using items retained from the stage 1 analyses. PC analysis is a technique used to construct a linear combination of the variables to explain as much variability of the data as possible. The PCs were then used (along with other variables) in logistic regression models to determine predictive validity of different adherence scoring methods using plasma HIV RNA concentration (greater than/less than the threshold) as the principal outcome variable. In these logistic regression models, the plasma HIV RNA of the previous visit was included as a predictor. Including this variable automatically adjusts the model for differing overall levels of HIV RNA in different subjects, such that the adherence information can directly inform about changes in HIV RNA concentration rather than informing about absolute levels of HIV RNA.
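To make the idea of a PC concrete, the sketch below extracts the first PC of a small data set by power iteration on the sample covariance matrix. This is an illustrative stand-in, not the software or algorithm used in the analysis; the function names and toy data are hypothetical. Power iteration returns the weight vector of the linear combination with the largest variance, together with the share of total variance that it explains.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def covariance_matrix(columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sample covariance matrix of variables given as columns."""
    n = len(columns[0])
    centered = []
    for column in columns:
        m = mean(column)
        centered.append([x - m for x in column])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
        for ci in centered
    ]


def first_principal_component(
    columns: Sequence[Sequence[float]], iterations: int = 200
) -> Tuple[List[float], float]:
    """Return (weights, share of variance explained) for the first PC.

    Power iteration starts from a vector of ones; this assumes that start
    vector is not orthogonal to the leading eigenvector.
    """
    cov = covariance_matrix(columns)
    p = len(cov)
    weights = [1.0] * p
    for _ in range(iterations):
        product = [sum(cov[i][j] * weights[j] for j in range(p)) for i in range(p)]
        norm = sum(v * v for v in product) ** 0.5
        weights = [v / norm for v in product]
    # Rayleigh quotient gives the variance along the component.
    eigenvalue = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(p) for j in range(p)
    )
    total_variance = sum(cov[i][i] for i in range(p))
    return weights, eigenvalue / total_variance


# Toy data: two positively correlated variables (hypothetical values).
recall = [1, 2, 3, 4]
schedule = [1, 3, 2, 4]
pc1_weights, pc1_share = first_principal_component([recall, schedule])
```

With two equally variable, positively correlated inputs the first component weights them equally, which mirrors the "average adherence" reading of a first PC with roughly equal weights.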
We modeled the odds of the plasma HIV RNA measurement being lower than the threshold of 200 copies/mL in our analyses. Because of missing plasma HIV RNA measurements, only 461 subjects (with 1671 visits) were included for the validation part of the analysis. For ACTG 370, 479 (78%) of 616 visits had plasma HIV RNA measurements lower than the threshold of 200 copies/mL, whereas for ACTG 398, 448 (42%) of 1055 visits had plasma HIV RNA measurement <200 copies/mL. The plasma HIV RNA status of the previous visit (RNAp) and drug regimen were included as covariates in all our logistic regression models irrespective of whether they turned out to be significant. It is necessary to adjust for the effect of drug regimen on plasma HIV RNA concentration because different drug regimens produce different levels of plasma HIV RNA suppression or some regimens are harder for patients to comply with (eg, because of the number of daily doses, adverse effects). The potentially different effect of these treatments on RNA level was thus taken into consideration by including treatment group effect in each of the models.
We then examined how well the PCs correlated with adherence indices (PDT and PCD) based on adherence data collected with MEMS in a subset of subjects (ACTG 398).9
Finally, using weights derived from the PC analyses, we produced a formula for calculating an adherence index standardized on a scale from 0 to 100. The distribution of adherence scores calculated using the adherence formula (all weighted questionnaire items included) was then compared with the standard approach (mean adherence calculated with 4-day recall only).
Stage 1: Reliability Assessment and Data Reduction
Cronbach α coefficients for the standardized scores for the ACTG Adherence Questionnaire items (5 items/8 variables) were all >0.80, demonstrating good reliability per a priori criteria (see Table 1).
Correlation analyses demonstrated that 3 groups of ACTG Adherence Questionnaire items had high correlations within each group: group 1 (item 1: adherence ratio yesterday [Adhyest], 2 days ago [Adh2dago], 3 days ago [Adh3dago], and 4 days ago [Adh4dago]) had pairwise Spearman correlations between 0.44 and 0.53; group 2 (items 2 to 3: how closely schedule followed [Schedule] and how closely instructions followed [Instructions]) had a Spearman correlation of 0.50; and group 3 (items 4 to 5: whether any medication was missed last weekend [Weekend] and when any medication was last skipped [Lastskip]) had a Spearman correlation of −0.53.
Logistic regression analysis demonstrated that the dichotomous variable Weekend was well predicted by the other variables, with an area of 0.91 under the ROC curve,10 which is considered as producing outstanding discrimination. It was omitted from subsequent analyses.
Stage 2: Summary Measure of Adherence and Validity Assessment
PC analysis was used to generate summary measures based on the combined data set. The first 3 PCs (PC1, PC2, and PC3 shown in Table 2 in the original/nonstandardized scale) explained 50%, 17%, and 10% of the variability in self-reported adherence data, respectively. Note that although PC1 assigns weights that are equal for the first 3 factors and dominate it, the standardized scale “weights” also shown in Table 2 make evident the importance of the other items.
Interpreting the PC scores is simplest on the standardized scale, where measures are independent of the variables' units of measurement. The first PC (PC1) assigns approximately equal “weights” to all variables; thus, the component can be thought of as an average adherence component across the items. The second PC (PC2) shows small scores for the common 4-day recall variables but high scores for the last 3 variables. This pattern suggests that after accounting for average adherence, the next dimension that explains variations in people's patterns is along a global adherence component. Finally, the last PC (PC3) puts heavy weight on the last variable, the time when medication was last skipped (time of Lastskip component), suggesting that this variable further differentiates people who score similarly along the first 2 components. Although all these components have some interpretation, it can be argued that only the first PC is of real importance because it explains much more variance than the other 2. It should be noted that we obtained almost identical PCs when analyzing the 2 studies separately (not shown).
To see how well the PCs correlated with the outcome measure (plasma HIV RNA), we used logistic regression models. As shown in Table 3, PC1 was highly significant in both studies, whereas neither PC2 nor PC3 was significant in either ACTG 370 or ACTG 398. After adjusting for the treatment regimen and the plasma HIV RNA level of the previous visit, an increase of 1 unit in the standardized PC1 score was associated with an average of 29% (95% confidence interval [CI]: 13% to 47%) higher odds of having a plasma HIV RNA level <200 copies/mL in ACTG 370. In ACTG 398, a 1-unit increase in the standardized PC1 score was associated with approximately 15% (95% CI: 5% to 26%) higher odds of an undetectable plasma HIV RNA level. These results are consistent with the composition of these PCs, as seen in Table 2.
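The percentage statements above follow from the usual reading of a logistic regression coefficient: the odds ratio (OR) for a 1-unit increase is exp(β), and the percent change in odds is (OR − 1) × 100. The sketch below shows that arithmetic. The coefficient is back-calculated from the reported 29% figure for illustration only; it is not taken from Table 3, and the function names are hypothetical.

```python
import math


def odds_ratio_from_coefficient(beta: float, units: float = 1.0) -> float:
    """Odds ratio for an increase of `units` in the predictor."""
    return math.exp(beta * units)


def percent_change_in_odds(odds_ratio: float) -> float:
    """Express an odds ratio as a percent change in odds."""
    return (odds_ratio - 1) * 100


# Coefficient implied by the reported 29% higher odds in ACTG 370
# (back-calculated assumption, not a value reported in the paper).
beta_pc1 = math.log(1.29)

one_unit = percent_change_in_odds(odds_ratio_from_coefficient(beta_pc1))
two_units = percent_change_in_odds(odds_ratio_from_coefficient(beta_pc1, units=2))
```

Odds ratios multiply rather than add, so under this model a 2-unit increase corresponds to 1.29² ≈ 1.66, or roughly 66% higher odds, rather than 58%.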
To compare the predictive power of the PCs with a simple average of item 1 (4-day adherence recall [ADHWKALL]), separate logistic regression models were fit for the 2 studies and are summarized in Table 4. The plasma HIV RNA level of the previous visit and treatment regimen variables were included in all models but not reported in Table 4. The models were compared using the Akaike information criterion (AIC),12 for which a smaller value indicates better fit, and the estimated area under the ROC curve (c statistic), for which a larger value indicates better discrimination. According to the AIC, the model with PC1 alone fit better than the model with ADHWKALL alone in both studies. The model with PC1 alone had a larger c value than the model with ADHWKALL alone for ACTG 398, but the values were identical for ACTG 370. Taken together, these results indicate that PC1 is superior to ADHWKALL.
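The two comparison criteria can be sketched in a few lines. The AIC is computed as 2k − 2 ln L for a model with k estimated parameters, and the c statistic is computed as the proportion of (event, non-event) pairs in which the event received the higher predicted probability, with ties counted as one half. The function names, outcomes, and predicted probabilities below are hypothetical and are not values from Table 4.

```python
import math
from typing import Sequence


def bernoulli_log_likelihood(outcomes: Sequence[int], probabilities: Sequence[float]) -> float:
    """Log-likelihood of binary outcomes given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1 - p)
        for y, p in zip(outcomes, probabilities)
    )


def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike information criterion; smaller values indicate better fit."""
    return 2 * n_parameters - 2 * log_likelihood


def c_statistic(outcomes: Sequence[int], probabilities: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise concordance."""
    events = [p for y, p in zip(outcomes, probabilities) if y == 1]
    non_events = [p for y, p in zip(outcomes, probabilities) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical visits: 1 = plasma HIV RNA <200 copies/mL at the visit.
suppressed = [1, 1, 0, 0, 1]
predicted = [0.9, 0.6, 0.4, 0.7, 0.8]

model_aic = aic(bernoulli_log_likelihood(suppressed, predicted), n_parameters=3)
model_c = c_statistic(suppressed, predicted)
```

The two criteria answer different questions: AIC penalizes extra parameters when judging fit, whereas the c statistic only reflects how well the predicted probabilities rank the visits.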
We then compared 2-predictor models. The models were approximately the same in terms of the AIC and c statistic. Both 2-predictor models had a larger AIC than the PC1 alone model, however, and had an almost identical c value for ACTG 370. In ACTG 398, both 2-predictor models had a slightly smaller AIC and a slightly larger c value than the PC1 alone model, indicating that ADHWKALL or PC2 did contribute some additional information on top of PC1. Because PC1 assigns equal dominant weights to the past 4 days response, ADHWKALL plays a key role in its construction. This is also highlighted by the fact that ADHWKALL is highly correlated with PC1 (r = 0.93 for ACTG 370, r = 0.95 for ACTG 398) and only moderately correlated with PC2 (r = 0.28 for ACTG 370, r = 0.37 for ACTG 398). In summary, PC1 is the predictor that contains the most information for improving the prediction of plasma HIV RNA loads. ADHWKALL is collinear with PC1, and it contributes approximately the same amount of information on top of PC1 as PC2 does.
For the ACTG 398 MEMS data, similar logistic regression models were used to assess whether the PC or MEMS indices (PDT, PCD) were more strongly associated with plasma HIV RNA. Table 5 shows the results for the predictors of interest for the various models examined. Again, the common covariates (plasma HIV RNA of the previous visit and treatment regimen) were included in all models but are not shown in Table 5.
As single predictors, higher PC1, PDT, and PCD scores are all significantly associated with greater odds of having a plasma HIV RNA level <200 copies/mL. Among the 3 models, that with PC1 alone has the smallest AIC and largest c value, indicating that it fits the data better than the other 2 models. An increase of PC1 by 1 unit (unstandardized score) is associated with an average increase of 17% (95% CI: 5% to 31%) in terms of the odds of a plasma HIV RNA level <200 copies/mL. Because of the different scales, this cannot be directly compared with the estimated OR for PCD and PDT. All 3 of the 2-predictor models (with PC1 already in the model) provided some moderate decrease of AIC and increase in c statistic. The greatest improvement was seen with PDT, and the least was seen with PCD. This suggests that PDT, PCD, and PC2 do provide some (limited) information on top of PC1 for predicting the odds of an undetectable plasma HIV RNA level. PDT provides the most improvement, but its slight superiority over PC2 may not justify the effort to collect daily dose count data. Therefore, if we were to choose a 2-predictor model, we would select PC2 in addition to PC1.
Next, we looked at a few 3-predictor models with PC1 and PC2. None of the 3-predictor models were better than the model with PC1 and PDT. Interestingly, except for the PC1 and PDT model, only the estimated odds ratio (OR) for PC1 was significant in any model with 2 or more predictors. The estimated OR for PC1 and its 95% CI remain stable across all the models, whereas those of PDT and PCD decrease whenever PC1 is present. Note here that the data used to calculate Table 5 are only a subset of those used for Table 4 because of missing MEMS data. Between PDT and PCD, the former has more predictive power. This is expected, because PDT preserves more information from the MEMS data. In conclusion, if we were to choose a single index of adherence, we would choose PC1.
Because PC1 accounts for a large amount of variation in the data, the values derived from this analysis were used to create a formula for calculating an adherence index normed to be between 0 and 100. Using the unstandardized regression weights, PC1 has a range from −0.65 (worst adherence) to 11.34 (best adherence) depending on the values reported in the adherence questionnaire. To make this scale easier to interpret, we converted it to an interval between 0 and 100 by adding 0.65, dividing the sum by 11.99 (the full length of the interval), and multiplying by 100. The formula for calculating an adherence index using all the weighted ACTG Adherence Questionnaire items (less the Weekend variable) and normalized between 0 and 100 is thus:
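\[
\text{Adherence index} = \frac{\mathrm{PC1} + 0.65}{11.99} \times 100
\]
where PC1 is the weighted sum of the retained questionnaire items, using the unstandardized weights shown in Table 2.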
Note that because this is a linear transformation of PC1, this adherence index (as applied to ACTG 370 and 398 data) retains the same relation with viral load and MEMS data as described previously.
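The rescaling step described above is a simple linear map and can be sketched as follows. The function takes an already computed unstandardized PC1 score as input, because the item weights come from Table 2 and are not reproduced here; the function and constant names are hypothetical.

```python
PC1_MIN = -0.65  # worst adherence on the unstandardized PC1 scale (from the text)
PC1_MAX = 11.34  # best adherence on the unstandardized PC1 scale (from the text)


def adherence_index(pc1_score: float) -> float:
    """Map an unstandardized PC1 score onto the 0-to-100 adherence index.

    Equivalent to (PC1 + 0.65) / 11.99 * 100.
    """
    if not PC1_MIN <= pc1_score <= PC1_MAX:
        raise ValueError("PC1 score lies outside the range reported for the questionnaire")
    return (pc1_score - PC1_MIN) / (PC1_MAX - PC1_MIN) * 100


worst = adherence_index(-0.65)
best = adherence_index(11.34)
midpoint = adherence_index(5.345)  # halfway along the PC1 range
```

Because the map is linear, rank order and correlations with other measures are unchanged; only the units become easier to read.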
Comparing the distribution of adherence scores calculated with the adherence formula against the standard approach (mean 4-day recall) showed substantially greater variability when all items were taken into account (Fig. 1). Likewise, in a post hoc analysis using data from a more recent trial with contemporary antiretroviral medication regimens (A5095; references 13 and 14), a similar improvement in variability was observed (see Fig. 1).
Accurate measures of adherence are essential in efforts to understand and improve adherence to antiretroviral medications. A self-report measure of adherence, the ACTG Adherence Questionnaire, has been used extensively, but investigators have frequently analyzed only data collected with the first item of the questionnaire (4-day recall). This study suggests that a superior assessment of adherence may be obtained by including all ACTG Adherence Questionnaire items except 1 (the Weekend item) and summarizing them with the method used in this analysis.
A summary measure of adherence using items 1 to 4 of the ACTG Adherence Questionnaire was found to be strongly associated with plasma HIV RNA outcome, to have greater variability than 4-day recall alone, and to compare favorably with adherence estimates based on discrete MEMS indices. This is a potentially important finding, because the ACTG Adherence Questionnaire is a less expensive and much more convenient method for collecting adherence data.
Using PC analyses, an examination of the weights in the first PC assigned to each of the questionnaire items revealed the strong uniform role of the most recent past (4-day recall) modulated by other longer term dimensions of adherence inherent in subjects (items 2 to 4). Findings demonstrate that the original questionnaire items are reliable and that each contributes unique information about antiretroviral adherence, with 1 exception. The variable Weekend (whether any medication missed last weekend) was found to provide redundant information that may create problems with multicollinearity if included in measures of adherence that take each of the other ACTG Adherence Questionnaire items into account.
Our findings are consistent with recent analyses demonstrating that adherence rate is only one component of antiretroviral adherence that is important to clinical outcomes. For example, recent analyses of adherence data collected with electronic measures indicate that dose timing is an important aspect of adherence behavior.9
Although results from these analyses are promising, the data used in the analyses were collected in ACTG trials testing regimens that were more complex and burdensome than those in current use. It is possible that inclusion of the extra items of the ACTG Adherence Questionnaire may add less to the measurement of adherence to current simplified antiretroviral regimens, which have fewer temporal and dietary restrictions than the regimens used in this analysis. Although the adherence formula needs to be further studied in current trials, our preliminary analysis of adherence data collected in a trial that used more contemporary antiretroviral medications (A5095) shows promise. Adherence calculated using the adherence formula showed that the weights and normalized constants derived from ACTG 370 and 398 produced an index within 0 to 100 as desired. It also produced a wider distribution of adherence scores that tempered the ceiling effect found when adherence was calculated with 4-day adherence recall alone.
In summary, we conducted a series of analyses to estimate the reliability of the 5-item ACTG Adherence Questionnaire, identify questionnaire items that did not provide additional information, and compare methods of summarizing data collected with the ACTG Adherence Questionnaire. Our findings suggest that measuring antiretroviral adherence using the first item of the ACTG Adherence Questionnaire provides a valid and reliable measure of adherence. In this series of analyses, however, the strength of the adherence estimate was improved with inclusion of all items of the questionnaire (less the Weekend item). A formula for calculating an adherence index derived from PC analyses and weighting each of the ACTG Adherence Questionnaire items produced a less volatile scale, which, in contrast to 4-day recall alone, is less skewed and compares favorably with 2 different MEMS indices in association with plasma HIV RNA. The formula shows promise for retaining its value for use with other ACTG questionnaire data, but additional analyses are warranted to substantiate the stability of the index further across a range of data sets.
The authors thank the ACTG 370, ACTG 398, and A5095 teams for data used in this analysis and Christopher Holloman, PhD, Director of the Statistical Consulting Service, The Ohio State University, for his helpful suggestions and contributions to this project. In remembrance, the authors express their appreciation to Lewis B. Sheiner, MD, PhD, Departments of Laboratory Medicine and Biopharmaceutical Sciences, University of California at San Francisco, and Robert A. Zackin, ScD, Harvard School of Public Health, for their contributions to the design of this project.
© 2007 Lippincott Williams & Wilkins, Inc.