Chronic lymphocytic leukemia (CLL) is characterized by a highly variable clinical course and prognosis: some patients show no or minimal symptoms over many years, whereas others are symptomatic at diagnosis and need treatment within a short time. To facilitate prediction of the individual risk of a patient-relevant outcome in newly diagnosed CLL patients, several prognostic models have been developed that combine established prognostic factors. To date, it remains unclear which of these models qualify for use in clinical practice, because their performance in external cohorts has not been compared.
This systematic review aims to identify all prognostic models developed in untreated patients with CLL, as well as their validation studies in external cohorts, to assess and pool the performance of these models. In addition, we aim to assess the reporting of prognostic model studies.
Based on an a priori Cochrane protocol, we systematically searched MEDLINE via OvidSP until September 2018 for primary studies developing or externally validating a prognostic model for untreated CLL patients to predict overall survival (OS), time-to-first-treatment (TTFT) or progression-free survival (PFS). Two review authors independently assessed the publications for eligibility using pre-defined inclusion and exclusion criteria, extracted the relevant performance measures and assessed the methodological quality of the studies with the Prediction model Risk Of Bias ASsessment Tool (PROBAST). Where sufficient data were available, the performance measures for calibration and discrimination were pooled. This project was funded by the German Federal Ministry of Education and Research (grant number 01KG1711).
From the 16,047 results of the sensitive search strategy (search date 09/2018), we identified 47 studies that aimed to develop a model or score, of which only nine were externally validated. Of these validated models, five aimed to predict OS, three TTFT and one PFS. For three models that predict OS (the CLL-IPI, the MDACC 2007 index score and the Barcelona-Brno score), the concordance index was reported adequately in the validation studies to allow calculation of a pooled measure of discrimination. The pooled concordance indexes were 0.73 (95% CI: 0.68 – 0.76, eight external cohorts, see Figure 1), 0.67 (95% CI: 0.61 – 0.72, six external cohorts) and 0.64 (95% CI: 0.60 – 0.68, four external cohorts), respectively. Only the CLL-IPI passed the recommended threshold of 0.70 for making individualized prognoses, with a 95% prediction interval ranging from 0.61 to 0.82. This prediction interval, which describes the range in which the model's performance is expected to fall in a new validation study, is relatively wide and suggests a degree of between-study heterogeneity. Calibration was rarely reported. Therefore, as all three models assign point scores and create risk groups, survival per risk group was extracted ad hoc and summarized in separate graphs.
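To illustrate how a pooled concordance index, its confidence interval and a prediction interval relate to one another, the following is a minimal sketch of a random-effects meta-analysis. It rests on several assumptions that are not stated in this abstract: the pooling is done on the logit scale with the DerSimonian-Laird estimator of between-study variance, each validation study supplies a c-index with a standard error, and both intervals use a normal quantile (a t-distribution with k - 2 degrees of freedom is commonly used for the prediction interval and gives a wider range). The function name `pool_c_indexes` and the example inputs are hypothetical and are not data from this review.

```python
import math
from statistics import NormalDist


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool_c_indexes(c_values, standard_errors, level=0.95):
    """Pool c-indexes with a DerSimonian-Laird random-effects model
    on the logit scale (an assumed method, see the lead-in)."""
    k = len(c_values)
    if k < 2 or k != len(standard_errors):
        raise ValueError("need at least two studies with matching standard errors")

    # Transform to the logit scale; delta method: SE(logit c) = SE(c) / (c * (1 - c))
    y = [logit(c) for c in c_values]
    v = [(se / (c * (1.0 - c))) ** 2 for c, se in zip(c_values, standard_errors)]

    # Fixed-effect (inverse-variance) estimate and Cochran's Q
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird estimate of the between-study variance tau^2
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / scale)

    # Random-effects estimate and its standard error
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))

    # Confidence interval: uncertainty about the mean performance.
    # Prediction interval: adds tau^2, i.e. the spread expected for a new study.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    pi_half = z * math.sqrt(tau2 + se_mu ** 2)
    return {
        "pooled": inv_logit(mu),
        "ci": (inv_logit(mu - z * se_mu), inv_logit(mu + z * se_mu)),
        "prediction_interval": (inv_logit(mu - pi_half), inv_logit(mu + pi_half)),
        "tau2": tau2,
    }


# Hypothetical inputs for illustration only; NOT values from this review.
example = pool_c_indexes([0.65, 0.72, 0.80], [0.02, 0.02, 0.02])
print(example)
```

In this sketch the prediction interval is wider than the confidence interval whenever the estimated between-study variance is greater than zero, which is why a wide prediction interval points to heterogeneity between validation cohorts.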
Only the CLL-IPI achieved the discrimination recommended for predicting individual outcome probabilities. Because calibration was rarely reported, no conclusion can be drawn about the calibration of the models in the external cohorts.