
The development of an expert system to predict virological response to HIV therapy as part of an online treatment support tool

Revell, Andrew D.a; Wang, Dechaoa; Boyd, Mark A.b,c; Emery, Seanb; Pozniak, Anton L.d; De Wolf, Franke; Harrigan, Richardf; Montaner, Julio S.G.f; Lane, Cliffordg; Larder, Brendan A.a on behalf of the RDI Study Group

AIDS 25(15):1855–1863, September 24, 2011. DOI: 10.1097/QAD.0b013e328349a9c2



The optimum selection and sequencing of combination antiretroviral therapy to maintain viral suppression can be challenging. The HIV Resistance Response Database Initiative has pioneered the development of computational models that predict the virological response to drug combinations. Here we describe the development and testing of random forest models to power an online treatment selection tool.


Five thousand, seven hundred and fifty-two treatment change episodes were selected to train a committee of 10 models to predict the probability of virological response to a new regimen. The input variables were antiretroviral treatment history, baseline CD4 cell count, viral load and genotype, drugs in the new regimen, time from treatment change to follow-up and follow-up viral load values. The models were assessed during cross-validation and with an independent set of 50 treatment change episodes by plotting receiver–operator characteristic curves and their performance compared with genotypic sensitivity scores from rules-based genotype interpretation systems.


The models achieved an area under the curve during cross-validation of 0.77–0.87 (mean = 0.82), accuracy of 72–81% (mean = 77%), sensitivity of 62–80% (mean = 67%) and specificity of 75–89% (mean = 81%). When tested with the 50 test cases, the area under the curve was 0.70–0.88, accuracy 64–82%, sensitivity 62–80% and specificity 68–95%. The genotypic sensitivity scores achieved an area under the curve of 0.51–0.52, overall accuracy of 54–56%, sensitivity of 43–64% and specificity of 41–73%.


The models achieved a consistent, high level of accuracy in predicting treatment responses, which was markedly superior to that of genotypic sensitivity scores. The models are being used to power an experimental system now available via the Internet.


Since the advent of HAART, long-term suppression of HIV and concomitant prevention of HIV disease progression has become readily achievable for the majority of patients in well resourced healthcare settings. Nevertheless, despite the availability of approximately 25 antiretroviral drugs from six classes, viral breakthrough remains a significant clinical challenge. The latter is often associated with the emergence of drug-resistant virus, necessitating a change in therapy [1,2]. Sustained re-suppression of drug-resistant virus requires optimal selection of the next regimen. The complexities of HIV drug resistance interpretation and the number of potential drug combinations available make successful individualized sequencing of antiretroviral therapy highly challenging [3]. For physicians with limited experience or resources, antiretroviral treatment decision-making can become even more problematic.

The standard of care in well resourced settings is to monitor the patient's viral load regularly, with detection of viral breakthrough triggering a re-evaluation of the efficacy of the antiretroviral drug regimen [1,2]. Once the viral breakthrough is confirmed with repeated viral load testing, a genotypic resistance test is usually performed to identify any selected viral mutations that may confer drug resistance. The interpretation of this genotype is often complex and is usually performed using rules-based interpretation software that relates point mutations to the susceptibility of the virus to single drugs [4]. However, there is no gold standard interpretation system: different systems provide different interpretations with varying degrees of agreement [5–9]. Moreover, it is difficult to relate genotypic changes and the related predicted susceptibility to individual drugs to the likely relative responses to potential drug combinations. Indeed, raw genotypic sensitivity scores have been shown to be relatively weak predictors of virological response [10–13].

Bioinformatics has most commonly been used to predict phenotype from genotype and then relate a cut-off in the predicted phenotype to a categorical response [14,15]. Again, it is difficult to relate this categorical prediction for an individual drug to the relative responses that may be achieved with different candidate combinations.

Models that provide a quantitative prediction of virological response to combination therapy, rather than to individual drugs, directly from the genotype and other clinical information may offer a potential clinical advantage. However, this can be challenging given that a very large dataset is required to accommodate a range of prognostic variables, including multiple possible drug–genotype permutations and their respective drug response data [16]. The HIV Resistance Response Database Initiative (RDI) was established in 2002 explicitly to take on this challenge and be the global repository for data, collected from clinical practice around the world, required to develop such models [17].

Currently, we have collected data from approximately 84 000 patients, predominantly from western Europe and North America, but also including some from Africa, Australia and Japan. We have previously trained computational models, including artificial neural networks, random forests and support vector machines, using subsets of these data to predict virological response to treatment from genotype, viral load, CD4 cell count and treatment history [18]. When tested with independent retrospective data, the models have proved accurate, with correlations between the predicted and actual changes in viral load in excess of 0.8 (r2 ≥ 0.65), which compares favorably with the correlations typically achieved by common rules-based genotype interpretation systems [13,19]. In addition, the models are able to identify combinations of antiretroviral drugs that are predicted to be effective for a substantial proportion of cases of virological failure in the clinic following a genotype-guided change in therapy [20,21].

In order to assess the clinical utility of the RDI tool, a Web-based user interface was developed that provided clinical investigators access to predictions of virological response to alternative antiretroviral regimens. Two multinational clinical pilot studies were initiated in which 23 participating physicians entered baseline data for 114 cases of treatment failure via the interface and then registered their treatment intention based on all the laboratory and clinical information available to them [22]. The baseline information was automatically input to the RDI models, which made predictions of response to their intended regimen plus more than 200 potential alternative combinations of antiretroviral drugs. The physician received an automated report listing the five alternative regimens that the models predicted would be most effective, plus their own treatment selection, ranked in order of predicted virological response. Having reviewed the report, the physicians entered their final treatment decision.

Overall 33% of treatment decisions were changed following review of the report. The final treatment decisions and the best of the RDI alternatives were predicted to produce significantly greater virological responses and involve fewer drugs than the physicians’ original selections. The system was found to be easy to use and positively rated as a useful aid to clinical practice. Participating physicians also submitted their suggestions for maximizing the utility of the system for current clinical practice.

An alternative system for predicting short-term treatment responses specifically at 8 weeks after a change in antiretroviral treatment, using a combination of three different computational models trained with information from a European dataset, has recently been evaluated and shown to be comparable or superior to estimates of response provided by physicians [23].

Encouraged by our results, we set out to develop a new set of models to power a version of the online system that would incorporate the suggestions made by the physicians and be made available over the Internet. Here, we describe the development and evaluation of computational models trained to predict the probability of a regimen reducing the viral load to below 50 copies/ml and the use of these models to power the RDI's online treatment selection aid that was launched in October 2010.


Data selection

The unit of data used to train computational models is the treatment change episode (TCE), as first described by the RDI in 2003 [21]. This comprises the following on-treatment data collected immediately prior to and then following a change in antiretroviral therapy guided by a genotype (as illustrated in Fig. 1):

  1. Plasma viral load from a sample taken no more than 8 weeks prior to the change in treatment.
  2. CD4 cell count and genotype from samples taken no more than 12 weeks prior to the change in treatment.
  3. The drugs in the baseline regimen.
  4. Antiretroviral treatment archive.
  5. The drugs in the new regimen.
  6. The time to follow-up.
  7. A follow-up plasma viral load taken between 4 and 48 weeks following introduction of the new regimen.
Fig. 1:
The treatment change episode. VL, viral load.
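For illustration, the TCE described above can be sketched as a simple data structure; the field names here are ours, not the RDI's actual schema.

```python
from dataclasses import dataclass
from typing import List

# A minimal sketch of the treatment change episode (TCE) described above.
# Field names are illustrative, not the RDI's actual schema.
@dataclass
class TCE:
    baseline_vl_log10: float       # viral load <=8 weeks before the change
    baseline_cd4: int              # CD4 count <=12 weeks before the change
    baseline_mutations: List[str]  # genotype, e.g. ["M184V", "K103N"]
    baseline_regimen: List[str]    # drugs in the failing regimen
    treatment_archive: List[str]   # previously used antiretrovirals
    new_regimen: List[str]         # drugs in the new regimen
    followup_days: int             # time from change to follow-up VL
    followup_vl_log10: float       # follow-up VL, 4-48 weeks post-change
```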

Secondary treatment change episode selection rules

TCEs with all the above data were extracted and then edited according to the following additional rules:

  1. No more than three TCEs from the same change in therapy (using multiple follow-up viral loads) were extracted for use in any modeling. All TCEs from the same treatment change must have follow-up viral load determinations more than 4 weeks (>28 days) apart.
  2. TCEs involving the following drugs that are no longer in current use in clinical practice, in either the failing regimen or the new regimen, were excluded: zalcitabine, delavirdine, loviride, emivirine, capravirine, atevirdine and adefovir. These drugs were permitted in the treatment archive position, however.
  3. Any TCEs involving the following drugs, which were not adequately represented in the RDI database, were excluded: tipranavir, raltegravir and maraviroc.
  4. Any TCEs that included a protease inhibitor (other than nelfinavir) without ritonavir as a booster, in the failing or new regimen positions, were excluded. Any TCEs that had ritonavir as the only protease inhibitor in the failing or new regimen were also excluded.
  5. Any TCEs without any resistance mutations were excluded from modeling.
  6. TCEs with viral load values of the form ‘<X’ where X was greater than 50 or 1.7 log copies (e.g. ‘<400’ copies) were excluded as the true values were not known.
  7. Three alternative filters were initially applied related to the inclusion or exclusion of TCEs involving treatment with fewer than three full-dose drugs that were not part of a deliberate treatment simplification strategy (‘suboptimal treatment’): permitted in the treatment archive position only; permitted in the archive and baseline positions but not in the new regimen; and permitted in any position. Single models were developed using each of these filters and the filter associated with the best model performance was then taken forward for the main round of modeling.

Computational model development

Random forest models were developed to predict the probability of the follow-up viral load being less than 50 copies/ml, using the TCEs that met all the above criteria. A random forest model (see the following subsection of the same name for details) is a predictor consisting of a collection of decision trees. Each decision tree is a decision support tool that uses a tree-like graph of decisions and their possible consequences. The inputs to the trees are the values of the input variables used to train the random forest model. The 85 input variables used to train the models in this study were selected on the basis of previous modeling studies and were:

  1. the baseline viral load (log10 copies HIV RNA/ml),
  2. the baseline CD4 cell count (number of cells/μl),
  3. the treatment history up to the point of treatment change (five variables determined by previous research to have a significant impact on the accuracy of models, coded as 1 = exposure, 0 = no exposure): zidovudine; lamivudine/emtricitabine; any non-nucleoside reverse transcriptase inhibitors (NNRTIs); any protease inhibitors; and enfuvirtide,
  4. the following 59 baseline mutations in the HIV RNA regions encoding reverse transcriptase and protease, coded as binary variables (present = 1, absent = 0): reverse transcriptase (n = 32; M41L, E44D, A62V, K65R, D67N, 69 insert, T69D/N, K70R, L74V, V75I, F77L, V90I, A98G, L100I, K101I/E/P, K103N, V106A/I, V108I, Y115F, F116Y, V118I, Q151M, V179D/F, Y181C/I/V, M184V, Y188C/L/H, G190S/A, L210W, T215Y, T215F, K219Q/E, P236L); protease (n = 27; L10F/I/R/V, V11I, K20M/R, L24I, D30N, V32I, L33F, M36I, M46I/L, I47V, G48V, I50V, I50L, F53L, I54V/L/M, L63P, A71V/T, G73S/A, T74P, L76V, V77I, V82A/F/S, V82T, I84V/A/C, N88D/S, L89V, L90M),
  5. the following 18 antiretroviral drugs in the new regimen (present = 1, not present = 0): zidovudine, didanosine, stavudine, abacavir, lamivudine/emtricitabine, tenofovir DF, efavirenz, nevirapine, etravirine, indinavir, nelfinavir, saquinavir, (fos)amprenavir, lopinavir, atazanavir, darunavir, ritonavir (as a protease inhibitor booster), enfuvirtide,
  6. time from the change of treatment to the follow-up viral load (number of days).

The output from the trees was the follow-up viral load coded as a binary variable such that an undetectable viral load (value ≤1.7 log or 50 copies/ml) is coded as 1 and a detectable viral load (any value above 1.7 log or 50 copies/ml) as 0. The models were trained to produce an estimate of the probability of the follow-up viral load being less than 50 copies/ml.
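This input/output coding can be sketched as follows, using deliberately truncated drug and mutation lists (the real models use the full 85-variable set described above):

```python
# Illustrative encoding of one TCE into the binary input/output coding
# described above. The drug and mutation lists are truncated examples;
# the real models use the full 85-variable set.
HISTORY_DRUGS = ["zidovudine", "lamivudine/emtricitabine", "any_nnrti",
                 "any_pi", "enfuvirtide"]
MUTATIONS = ["M41L", "K65R", "K103N", "M184V", "L90M"]        # subset of 59
NEW_REGIMEN_DRUGS = ["zidovudine", "tenofovir", "efavirenz",
                     "lopinavir", "ritonavir"]                # subset of 18

def encode_tce(baseline_vl_log10, baseline_cd4, history, mutations,
               new_regimen, followup_days, followup_vl_log10):
    x = [baseline_vl_log10, baseline_cd4]
    x += [1 if d in history else 0 for d in HISTORY_DRUGS]
    x += [1 if m in mutations else 0 for m in MUTATIONS]
    x += [1 if d in new_regimen else 0 for d in NEW_REGIMEN_DRUGS]
    x.append(followup_days)
    # outcome: 1 if follow-up VL undetectable (<=1.7 log10 = 50 copies/ml)
    y = 1 if followup_vl_log10 <= 1.7 else 0
    return x, y
```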

Each random forest model was trained by building individual trees using bootstrap samples that were drawn from the training set. The trees were built to predict the out-of-bag (OOB) samples, which were not present in the bootstrap samples. A randomly selected subset of input variables (covariates) was used to build an optimized tree with each node splitting the data into finer branches, which resulted in a classification of patients into a number of clusters. As there are many covariates and treatment change outcomes, it is computationally too intensive to find the optimal tree model. Therefore, a very large number (200–300) of trees were built using random selections of subsets of covariates at the nodes. The predictions from these decision trees (a random forest) were averaged across the forest.
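The bootstrap/out-of-bag procedure described above is the standard random forest fit; the sketch below reproduces it on synthetic data, with scikit-learn's RandomForestClassifier standing in for the R randomForest package the authors used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Sketch of the forest-building procedure on synthetic data: each tree is
# grown on a bootstrap sample, splitting on a random subset of covariates
# at each node, with the out-of-bag (OOB) cases giving an internal accuracy
# estimate. scikit-learn stands in here for the R randomForest package the
# authors used; the data are random, not real TCEs.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 85)).astype(float)  # 85 mostly binary inputs
y = (X[:, :10].sum(axis=1) > 5).astype(int)           # synthetic response

forest = RandomForestClassifier(
    n_estimators=250,       # within the paper's 200-300 tree range
    max_features="sqrt",    # random covariate subset at each split
    bootstrap=True,         # bootstrap sample per tree (~63% of cases)
    oob_score=True,         # score on the ~37% left out of each bootstrap
    random_state=0,
).fit(X, y)

proba = forest.predict_proba(X)[:, 1]  # probability averaged across the forest
```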

Initially, three random forest models were trained using the TCEs obtained by applying each of the filters described in point 7 of the section ‘Secondary TCE selection rules’ above relating to suboptimal treatments. A common independent set of 200 TCEs was randomly selected for testing these random forest models, with the constraints that no patient could have TCEs in both the training and test sets and only one TCE per patient was used in the test set. The results of this modeling were then used to select the filter used for the development of a committee of 10 random forest models.

The performance of the models as predictors of virological response was evaluated by plotting receiver–operator characteristic (ROC) curves and assessing the area under the ROC curve, the overall accuracy, the sensitivity and the specificity.
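These four performance measures can be computed directly from predicted probabilities and observed outcomes; a minimal sketch, using a 0.5 probability cut-off for the class-based measures and the rank (Mann–Whitney) formulation for the AUC:

```python
# Illustrative computation of the evaluation measures used above, from
# predicted probabilities and observed outcomes (1 = undetectable VL).
def evaluate(probs, outcomes, threshold=0.5):
    pairs = list(zip(probs, outcomes))
    tp = sum(1 for p, o in pairs if p >= threshold and o == 1)
    tn = sum(1 for p, o in pairs if p < threshold and o == 0)
    fp = sum(1 for p, o in pairs if p >= threshold and o == 0)
    fn = sum(1 for p, o in pairs if p < threshold and o == 1)
    return {"accuracy": (tp + tn) / len(pairs),
            "sensitivity": tp / (tp + fn),   # responses correctly predicted
            "specificity": tn / (tn + fp)}   # failures correctly predicted

def auc(probs, outcomes):
    # rank-based (Mann-Whitney) formulation of the area under the ROC curve
    pos = [p for p, o in zip(probs, outcomes) if o == 1]
    neg = [p for p, o in zip(probs, outcomes) if o == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

probs = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
outcomes = [1, 1, 0, 0, 0, 1]
stats = evaluate(probs, outcomes)
area = auc(probs, outcomes)
```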

Random forest models

A random forest model is a group of tree predictors $\{h(\mathbf{x}, \theta_t),\ t = 1, 2, \ldots, T\}$, where $\mathbf{x}$ represents the $m$ observed input variables with associated random vector $X$, and the $\theta_t$ are independent and identically distributed outputs of random variables, which are used to determine how the successive cuts are performed when building the individual trees, such as selection of the coordinate to split and the position of the split. The training dataset is assumed to be independently drawn from the joint distribution of $(X, Y)$, where $Y$ is the probability of viral load being undetectable under the current/new regimen; that is, it is taken from patterns $(\mathbf{x}_i, y_i)$, $i = 1, 2, \ldots, N$ ($N$ is the total number of TCEs). The 'patterns' are rows of data representing TCEs with $m + 1$ variables consisting of $m$ observed input variables, denoted by $\mathbf{x}$, and one outcome, the probability of viral load being undetectable under the current/new regimen, denoted by $y$. In fact, individual trees are fitted using the training dataset, which is a subsample ($K$ TCEs) of the whole dataset ($N$ TCEs), where $K \approx (1 - e^{-1})N \approx 0.63N$, because the sampling is done with replacement, so about 37% of TCEs are left out of the bootstrap sample and not used in the construction of trees. The random forest prediction is calculated by (1):

$$\hat{y}(\mathbf{x}) = \frac{1}{T} \sum_{t=1}^{T} h(\mathbf{x}, \theta_t) \qquad (1)$$

According to the law of large numbers, as $T \rightarrow \infty$, $\frac{1}{T} \sum_{t=1}^{T} h(\mathbf{x}, \theta_t) \rightarrow E_{\theta}[h(\mathbf{x}, \theta)]$. The training procedure of random forest models included the following steps. First, a bootstrap sample is drawn from the whole training dataset. Second, a tree is built for each bootstrap sample. At each node, the best split among a randomly selected subset of input variables is chosen. The tree building is stopped when the tree is grown to the maximum size (the number of cases in a node is below a threshold of 5). Third, these steps are repeated to generate a sufficiently large number of trees. The random forest model is trained using the randomForest package in R [25]. The predicted probability of viral load being undetectable under the current/new regimen from the committee of random forest models is estimated by (2):

$$\bar{y}(\mathbf{x}) = \frac{1}{L} \sum_{l=1}^{L} \hat{y}_l(\mathbf{x}) \qquad (2)$$

where $L$ is the size of the committee of random forest models.

Internal cross-validation

The committee of 10 random forest models was developed using a 10-fold cross-validation scheme whereby 10% of the TCEs were selected at random for validation and the remainder were used to train numerous models, whose performance was gauged against the 10% that had been 'left out'. Model development continued until further models failed to yield improved accuracy. This process was repeated 10 times until all the TCEs had appeared in the validation set once. With each cross-validation partition, the best performing random forest model was selected as a member of the final committee of models.
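The partitioning logic of this scheme can be sketched as follows; `train_and_score` is a hypothetical placeholder for fitting candidate forests and scoring them on the held-out fold, not any function the authors describe.

```python
import random

# Skeleton of the 10-fold scheme described above: each TCE appears in a
# held-out validation fold exactly once, and the best of several candidate
# models per fold joins the committee. train_and_score is a hypothetical
# placeholder for fitting a random forest and scoring it on the held-out fold.
def train_and_score(train_idx, val_idx):
    return {"auc": random.random(),         # placeholder validation AUC
            "train_size": len(train_idx)}

def build_committee(tces, n_folds=10, n_candidates=3, seed=0):
    ids = list(range(len(tces)))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::n_folds] for i in range(n_folds)]   # disjoint 10% folds
    committee = []
    for held_out in folds:
        train = [i for i in ids if i not in held_out]   # remaining 90%
        candidates = [train_and_score(train, held_out)
                      for _ in range(n_candidates)]
        committee.append(max(candidates, key=lambda m: m["auc"]))
    return committee

committee = build_committee(list(range(100)))
```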

External validation

The random forest models were also validated using external data. The three initial random forest models were tested using the independent set of 200 TCEs from patients partitioned at random from the initial set of available TCEs. Following internal cross-validation, the final committee of 10 random forest models was tested using an independent set of 50 TCEs from two clinics in Sydney, Australia (Immunology B Ambulatory Care Service at St Vincent's Hospital and Taylors Square Private Clinic). A smaller test set was used for this purpose in order to maximize the TCEs available for training and because the accuracy of the 10 random forest models had been established during cross-validation.

In addition to the performance of the 10 individual random forest models, the committee average performance (CAP) was evaluated using the mean of the predictions of the 10 models for each of the 50 test TCEs. It has been shown that the average prediction across the forests, known as the committee vote, is usually more accurate than the prediction from a particular forest when the system is used to make predictions for external data [24].

Performance of the models for regimens including etravirine, the newest drug to be included in the RDI's modeling, was evaluated separately in order to check for acceptable performance because of the relatively small number of TCEs available with this drug.

The random forest models were compared with genotypic sensitivity scores (GSSs) derived using three interpretation systems in common use (Stanford HIVDB 6.0.10, REGA V8.0.2 and ANRS V2010.07, accessed via the Stanford Web site 03 February 2011) in terms of the accuracy of their predictions for the 50 test TCEs. The full list of mutations derived from population-based sequencing, rather than the subset used for the development of the computational models, was used to obtain these scores. In each case, the GSS for each regimen was derived by adding the score for each constituent drug and using the total score for the regimen as a predictor of response. The basic version of each system (which classified the virus as being sensitive, intermediate or resistant) was used, with sensitive scored as 1, intermediate as 0.5 and resistant as 0. In addition, the expanded versions of HIVDB, with five categories scored from 0 to 1 in 0.25 intervals, and REGA, with six categories scored from 0 to 1.5, were used.
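The regimen-level GSS described here is a simple sum of per-drug scores; a minimal sketch, in which the per-drug susceptibility calls are invented examples rather than the output of any real interpretation system:

```python
# Illustrative genotypic sensitivity score (GSS) for a regimen: the basic
# three-category scoring described above, summed over constituent drugs.
# The per-drug calls are invented examples, not the output of any real
# interpretation system.
SCORE = {"sensitive": 1.0, "intermediate": 0.5, "resistant": 0.0}

def regimen_gss(per_drug_calls):
    """Sum the per-drug susceptibility scores for a candidate regimen."""
    return sum(SCORE[call] for call in per_drug_calls.values())

gss = regimen_gss({"tenofovir": "sensitive",
                   "lamivudine": "resistant",       # e.g. M184V present
                   "lopinavir/r": "intermediate"})
```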


Characteristics of the training and test datasets

After the application of the three alternative TCE selection filters relating to the inclusion or exclusion of TCEs with suboptimal treatments, training sets of 3692, 5334 and 6136 TCEs were obtained. The single random forest models developed using these datasets were tested with the independent, randomly selected test set of 200 TCEs, producing ROC curves with area under the curve (AUC) values of 0.74, 0.78 and 0.79, respectively. The overall accuracy was 71, 74 and 71%, respectively. Sensitivity (percentage of responses correctly predicted) was 73, 63 and 73% and specificity (percentage of failures correctly predicted) was 70, 79 and 70%. Performance of the models with 25 etravirine-containing TCEs gave AUC values of 0.74, 0.80 and 0.76, overall accuracy of 72, 80 and 72%, with sensitivity of 89, 88 and 89% and specificity of 63, 81 and 63%, respectively.

On the basis of the figures for overall accuracy, AUC and specificity with the entire test set and the etravirine TCEs, the second filter (suboptimal TCEs permitted in the treatment archive and baseline positions) was selected and used for the selection of TCEs for the main round of modeling. This resulted in 5752 TCEs, of which 553 included etravirine-based regimens.

The selected data came from 24 sources: five cohorts, nine individual clinics and 10 clinical trials, with data from more than 15 countries. The characteristics of the two sets of test TCEs are summarized in Table 1.

Table 1:
Characteristics of the treatment change episodes in the training and test sets.

Results of the modeling

The performance characteristics from the ROC curves of the 10 individual models during cross-validation and testing with the independent test set of 50 TCEs from Sydney, Australia, are summarized in Table 2. The 10 models achieved an AUC during cross-validation ranging from 0.77 to 0.87, with a mean of 0.82. The overall accuracy ranged from 72 to 81% (mean = 77%), the sensitivity from 62 to 80% (mean = 67%) and the specificity from 75 to 89% (mean = 81%). The ROC curve for the best performing model during cross-validation is presented in Fig. 2.

Table 2:
Performance parameters for the 10 random forest models and the committee average.
Fig. 2:
Receiver–operator characteristic curve from the best performing random forest model during cross-validation. AUC, area under the curve.

When tested with the 50 independent test TCEs, the 10 models achieved an AUC ranging from 0.70 to 0.88, with a mean of 0.79 and a CAP value of 0.83. The overall accuracy ranged from 64 to 82% (mean = 71%, CAP = 76%), the sensitivity from 57 to 71% (mean = 66%, CAP = 71%) and the specificity from 68 to 95% (mean = 79%, CAP = 82%).

The performance of the 10 models with etravirine-containing regimens during cross-validation gave an AUC of 0.84, overall accuracy of 80%, sensitivity of 71% and specificity of 91%.

When the GSSs for the 50 test TCEs from the basic ANRS, HIVDB and REGA genotype interpretation systems were used as predictors of response, predictions were close to chance. The AUC values for the three systems were 0.53, 0.56 and 0.55, respectively (Table 2). Overall accuracy was 46, 52 and 50%; sensitivity 50, 50 and 46%; and specificity 41, 55 and 55%, respectively. The expanded versions of HIVDB and REGA gave AUC values of 0.57 and 0.55, with overall accuracy of 58 and 52%, sensitivity of 71 and 50% and specificity of 41 and 55%, respectively.


These results demonstrate that the random forest models achieved a consistent, high level of accuracy in predicting virological responses to combination antiretroviral treatment, which was markedly superior to that of GSSs.

The results of the initial round of modeling, using test data selected using alternative filters for suboptimal regimens, suggested that the inclusion of TCEs with monotherapy or dual therapy in the treatment history was associated with better performing models. In selecting this filter for the main round of modeling, more emphasis was given to specificity (percentage of failures correctly predicted) rather than sensitivity, as clinically, it is more critical to predict treatment failure reliably than success.

The 10 random forest models that were subsequently developed achieved consistently accurate predictions of responses to treatment, whichever measure was considered. Specificity was consistently higher than sensitivity during cross-validation and with the test set, averaging approximately 80 vs. 67%. The random forest committee as a whole (CAP) performed better than all but one or two individual models.

The performance of all the models was markedly superior to that of GSSs from rules-based genotype interpretation systems in common use. This finding is consistent with previous studies [13]. This may in part reflect the inherently superior accuracy of a system developed to predict virological response to combination therapy compared with one that makes categorical predictions of sensitivity or resistance to individual drugs. The GSSs performed unusually poorly here: historically, GSSs from these systems have predicted treatment response with accuracy typically in the region of 60–65%. However, it should be pointed out that the test set was too small to be considered adequate to test these systems, the outputs from which are not truly continuous variables.

On the basis of the overall accuracy of the 10 random forest models during cross-validation and independent testing, it was decided that these models could be used to power an experimental treatment support system made available for open testing via the Internet. HIV-TRePS was launched in October 2010. Long-term evaluation of this Internet-based system is currently underway. The main shortcoming of the current models is that they do not include some of the newest drugs (maraviroc, raltegravir and tipranavir). New models are under development to include these drugs and to replace the current models in 2011.


This project has been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. HHSN261200800001E. This research was supported by the National Institute of Allergy and Infectious Diseases. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government.

RDI data and study group: The RDI wishes to thank all the following individuals and institutions for providing the data used in training and testing these models.

Cohorts: Frank De Wolf and Joep Lange (ATHENA, the Netherlands); Julio Montaner and Richard Harrigan (BC Center for Excellence in HIV & AIDS, Canada); Brian Agan, Vincent Marconi and Scott Wegner (US Department of Defense); Wataru Sugiura (National Institute of Health, Japan); Maurizio Zazzi (MASTER, Italy).

Clinics: Jose Gatell and Elisa Lazzari (University Hospital, Barcelona, Spain); Brian Gazzard, Mark Nelson, Anton Pozniak and Sundhiya Mandalia (Chelsea and Westminster Hospital, London, UK); Lidia Ruiz and Bonaventura Clotet (Fundacion IrsiCaixa, Badelona, Spain); Schlomo Staszewski (Hospital of the Johann Wolfgang Goethe-University, Frankfurt, Germany); Carlo Torti (University of Brescia); Cliff Lane and Julie Metcalf (National Institutes of Health Clinic, Rockville, USA); Maria-Jesus Perez-Elias (Instituto Ramón y Cajal de Investigación Sanitaria, Madrid, Spain);

Andrew Carr, Richard Norris and Karl Hesse (Immunology B Ambulatory Care Service, St. Vincent's Hospital, Sydney, NSW, Australia); Dr Emanuel Vlahakis (Taylor's Square Private Clinic, Darlinghurst, NSW, Australia); Ms Emma Fist (University of NSW).

Clinical trials: Sean Emery and David Cooper (CREST); Carlo Torti (GenPherex); John Baxter (GART, MDR); Laura Monno and Carlo Torti (PhenGen); Jose Gatell and Bonaventura Clotet (HAVANA); Gaston Picchio and Marie-Pierre deBethune (DUET 1 & 2 and POWER 3); Maria-Jesus Perez-Elias (RealVirfen).

Role of authors: A.D.R.: experimental design, some statistical analysis and primary manuscript development; D.W.: mathematical modeling and statistics; M.A.B.: identification, extraction and provision of the Sydney dataset, input into manuscript; S.E.: supervision of Sydney group involvement, extraction and provision of the Sydney dataset, input into experimental design and input into the development of the online system; A.L.P.: provision of a substantial amount of the data used to develop the models, input into experimental design and manuscript development; F.D.W.: provision of a substantial amount of the data used to develop the models, input into experimental design; R.H.: provision of a substantial amount of the data used to develop the models, input into experimental design and the development of the online system; J.S.G.M.: provision of a substantial amount of the data used to develop the models, significant contributions to the experimental design, manuscript and the development of the online system; H.C.L.: provision of a substantial amount of the data used to develop the models, significant input into experimental design and into the development of the online system; B.A.L.: overall supervision of and involvement in the design and conduct of the study, the manuscript and the development of the online system.

Conflicts of interest

There are no conflicts of interest.


1. Panel on Antiretroviral Guidelines for Adults and Adolescents. Guidelines for the use of antiretroviral agents in HIV-1-infected adults and adolescents.Department of Health and Human Services; 10 January 2011. pp. 1–166.
2. Thomson MA, Aberg JA, Cahn P, Montaner JS, Rizzardini G, Telenti A, et al. Antiretroviral treatment of adult HIV Infection: 2010 Recommendations of the International AIDS Society-USA Panel. JAMA 2010; 304:321–333.
3. Hirsch MS, Günthard HF, Schapiro JM, Brun-Vézinet F, Clotet B, Hammer SM, et al. Antiretroviral drug resistance testing in adult HIV-1 infection: 2008 recommendations of an International AIDS Society-USA panel. Clin Infect Dis 2008; 47:266–285.
4. Sturmer M, Doerr HW, Preiser W. Variety of interpretation systems for human immunodeficiency virus type 1 genotyping: confirmatory information or additional confusion? Curr Drug Targets Infect Disord 2003; 3:373–382.
5. Schapiro JM, De Luca A, Harrigan R, Hellman N, McCreedy B, Pillay D, et al. Resistance assay interpretation systems vary widely in method and approach. Antivir Ther 2001; 6 (Suppl 1):131.
6. Shafer RW, Gonzales MJ, Brun-Vezinet F. Online comparison of HIV-1 drug resistance algorithms identifies rates and causes of discordant interpretations. Antivir Ther 2001; 6:101.
7. Torti C, Quiros-Roldan E, Keulen W, Scudeller L, Caputo SL, Boucher C, et al., for the GenPherex Group of the MaSTeR Cohort. Comparison between rules-based human immunodeficiency virus type 1 genotype interpretations and real or virtual phenotype: concordance analysis and correlation with clinical outcome in heavily treated patients. J Infect Dis 2003; 188:194–201.
8. Sturmer M, Doerr HW, Staszewski S, Preiser W. Comparison of nine resistance interpretation systems for HIV-1 genotyping. Antivir Ther 2003; 8:239–244.
9. De Luca A, Cingolani A, Di Giambenedetto S, Trotta MP, Baldini F, Rizzo MG, et al. Variable prediction of antiretroviral treatment outcome by different systems for interpreting genotypic human immunodeficiency virus type 1 drug resistance. J Infect Dis 2003; 187:1934–1943.
10. DeGruttola V, Dix L, D’Aquila R, Holder D, Phillips A, Ait-Khaled M, et al. The relation between baseline HIV drug resistance and response to antiretroviral therapy: re-analysis of retrospective and prospective studies using a standardized data analysis plan. Antivir Ther 2000; 5:41–48.
11. Frentz D, Boucher CAB, Assel M, De Luca A, Fabbiani M, Incardona F, et al. Comparison of HIV-1 genotypic resistance test interpretation systems in predicting virological outcomes over time. PLoS One 2010; 5:e11505. doi: 10.1371/journal.pone.0011505.
12. Gallego O, Martin-Carbonero L, Aguero J, de Mendoza C, Corral A, Soriano V. Correlation between rules-based interpretation and virtual phenotype interpretation of HIV-1 genotypes for predicting drug resistance in HIV-infected individuals. J Virol Methods 2004; 121:115–118.
13. Larder BA, Revell D, Wang D, Harrigan R, Montaner J, Wegner S. Neural networks are more accurate predictors of virological response to antiretroviral therapy than rules-based genotype interpretation systems [abstract 653]. In: Proceedings of 13th Conference on Retroviruses and Opportunistic Infections; Denver, CO, USA; 5–8 February 2006.
14. Beerenwinkel N, Daumer M, Oette M, Korn K, Hoffman D, Kaiser R, et al. Geno2pheno: estimating phenotypic drug resistance from HIV-1 genotypes. Nucleic Acids Res 2003; 31:3850–3855.
15. Beerenwinkel N, Sing T, Lengauer T, Rahnenführer J, Roomp K, Savenkov I, et al. Computational methods for the design of effective therapies against drug resistant HIV strains. Bioinformatics 2005; 21:3943–3950.
16. DiRienzo G, DeGruttola V. Collaborative HIV resistance-response database: sample size for detection of relationships between HIV-1 genotype and HIV-1 RNA response using a nonparametric approach. Antivir Ther 2002; 7:S93.
17. Larder BA, DeGruttola V, Hammer S, Harrigan R, Wegner S, Winslow D, et al. The international HIV resistance response database initiative: a new global collaborative approach to relating viral genotype treatment to clinical outcome. Antivir Ther 2002; 7:S111.
18. Wang D, Larder BA, Revell AD, Montaner J, Harrigan R, De Wolf F, et al. A comparison of three computational modelling methods for the prediction of virological response to combination HIV therapy. Artif Intell Med 2009; 47:63–74.
19. Larder BA, Wang D, Revell A, Montaner J, Harrigan R, De Wolf F, et al. The development of artificial neural networks to predict virological response to combination HIV therapy. Antivir Ther 2007; 12:15–24.
20. Revell AD, Wang D, Harrigan R, Hamers RL, Wensing AMJ, De Wolf F, et al. Modelling response to HIV therapy without a genotype: an argument for viral load monitoring in resource-limited settings. J Antimicrob Chemother 2010; 65:605–607.
21. Wang D, Larder BA, Revell A, Harrigan R, Montaner J. A neural network model using clinical cohort data accurately predicts virological response and identifies regimens with increased probability of success in treatment failures [abstract 102]. In: Proceedings of VII International HIV Drug Resistance Workshop; 10–14 June 2003; Los Cabos, Mexico.
22. Larder BA, Revell AD, Mican J, Agan BK, Harris M, Torti C, et al. Clinical evaluation of the potential utility of computational modeling as an HIV treatment selection tool by physicians with considerable HIV experience. AIDS Patient Care STDs 2011; 25:29–36.
23. Zazzi M, Kaiser R, Sönnerborg A, Struck D, Altmann A, Prosperi M, et al. Prediction of response to antiretroviral therapy by human experts and by the EuResist data-driven expert system (the EVE study). HIV Med 2011; 12:211–218.
24. Gatnar E. Randomization in aggregated classification trees. In: Baier D, Wernecke K-D, editors. Innovations in classification, data science, and information systems. Berlin: Springer-Verlag; 2005. pp. 207–216.
25. Liaw A, Wiener M. Breiman and Cutler's random forests for classification and regression [R package documentation]. [Accessed 30 November 2009]

antiretroviral therapy; computer models; HIV drug resistance; predictions; treatment outcome

© 2011 Lippincott Williams & Wilkins, Inc.