Left ventricular assist devices (LVADs) have become the standard of care for carefully selected patients with end-stage heart failure (HF). Over the past decade, the rate of LVAD implantation in the United States has grown consistently each year.1 However, LVAD implantation continues to be associated with significant morbidity and mortality. Preoperative risk estimation allows for appropriate patient selection but is difficult to generalize, given the heterogeneity of the HF population.2
The designation of patients with advanced HF as New York Heart Association (NYHA) class III or IV is not sufficiently discriminatory, as these NYHA classes are subjective and represent a large spectrum of illness. Similarly, the Seattle Heart Failure Risk Model (SHFM) and the Heart Failure Survival Score (HFSS) have limited applicability in patients with class IV HF. Because of this limitation, the Interagency Registry for Mechanically Assisted Circulatory Support (INTERMACS, or IM) defined seven clinical profiles to give a more granular description of the level of illness at the time of implantation.3 These range from IM profile 1, which includes patients who are in critical cardiogenic shock and would need intervention within hours, to IM profile 7, which includes patients with NYHA class IIIb symptoms, who may not have severe enough HF to currently be considered implant candidates. Although not designed as a tool for risk stratification, IM profiles at the time of LVAD implantation have been shown to correlate with mortality post LVAD implant.4 The HeartMate Risk Score (HMRS), derived from patients enrolled in the HMII trials, had a receiver operating characteristic area under the curve (ROC AUC) of 0.62 and 0.60 at 3 months and 2 years, respectively, when applied to the IM population,5 which has limited its widespread applicability in clinical practice. There is, therefore, an unmet need for an accurate, flexible, and improved risk stratification model that accounts for the heterogeneity of end-stage HF patients.
We have previously published algorithms based on Bayesian modeling to predict mortality at various time points post-LVAD implantation.6 In the current study, we validate the performance of these mortality models across IM profiles, as a function of disease severity. We also evaluate the performance of the models by device type (axial versus centrifugal) and strategy (bridge to transplantation [BTT] versus destination therapy [DT]).
The IM database is a United States registry for patients who received a durable LVAD to treat end-stage HF. We have previously published the methodology for derivation of the Bayesian-based mortality models from the IM registry6 (see Appendix, Supplemental Digital Content, http://links.lww.com/ASAIO/A361). Briefly, modeling was performed using preimplant variables captured within IM from January 2012 to December 2015, for adults who received a primary continuous flow LVAD or an LVAD and right ventricular assist device (RVAD) in combination (n = 10,206). We chose this time frame to include current generation, continuous flow LVADs with a minimum of 1 year of follow-up. Total artificial heart, pulsatile flow LVAD, and RVAD-only recipients were excluded from this study. Survival while on mechanical circulatory support was calculated as the interval from implant to death. Patients who had a heart transplant between the time endpoints were noted as “alive” for the subsequent time point and censored thereafter.
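The endpoint definition above can be illustrated with a minimal Python sketch. The function name `label_at`, the use of months as the time unit, and the exact handling of transplanted patients are assumptions made for illustration; they are one reading of the rule described here, not the registry's or the authors' actual coding logic.

```python
HORIZONS = (1, 3, 12)  # model time points, in months after implant


def label_at(horizon, death_month=None, transplant_month=None):
    """Return 'dead', 'alive', or None (censored) at one model time point.

    death_month: month of death while on device support, or None.
    transplant_month: month of heart transplant, or None.
    """
    if horizon not in HORIZONS:
        raise ValueError("horizon must be one of the model time points")
    # Death on support at or before the horizon counts as an event.
    if death_month is not None and death_month <= horizon:
        return "dead"
    # A transplanted patient is 'alive' at the first time point on or after
    # the transplant and censored (None) at any later time point.
    if transplant_month is not None and transplant_month <= horizon:
        first_after = min(h for h in HORIZONS if h >= transplant_month)
        return "alive" if horizon == first_after else None
    return "alive"
```

Under these assumptions, a patient transplanted at 2 months is labeled alive at 1 and 3 months and is censored at 12 months.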
All variables (e.g., clinical events and interventions during hospitalization) used to create the models were replicated from the INTERMACS user's guide and limited to preoperative interventions only. Variables were presented across three categories: demographics, medical history, and test results (laboratory, exercise, and imaging). Those with over 40% missing data were excluded from the analysis (n = 42). After variable preprocessing, 203 preimplant variables were used in the model construction.
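The missing-data exclusion can be sketched as follows. This is a minimal illustration, assuming each patient record is a dictionary in which `None` marks a missing value; the function name `drop_sparse_variables` and the example variable names are hypothetical.

```python
def drop_sparse_variables(records, max_missing=0.40):
    """Return the variable names whose missing fraction is at most max_missing.

    records: list of dicts mapping variable name -> value (None = missing).
    A variable absent from a record is also treated as missing.
    """
    names = sorted({name for record in records for name in record})
    n = len(records)
    kept = []
    for name in names:
        missing = sum(1 for record in records if record.get(name) is None)
        # Variables with over 40% missing data are excluded.
        if missing / n <= max_missing:
            kept.append(name)
    return kept


# Hypothetical example: 'albumin' is missing in 3 of 5 records (60%).
example = [
    {"age": 61, "albumin": None},
    {"age": 54, "albumin": 3.4},
    {"age": None, "albumin": None},
    {"age": 47, "albumin": None},
    {"age": None, "albumin": 3.9},
]
kept_variables = drop_sparse_variables(example)
```

In this toy example, `age` is missing in exactly 40% of records and is therefore retained, while `albumin` is dropped.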
Subsequently, this dataset was randomly divided into a training dataset (derivation cohort) consisting of 80% of the data records selected at random (n = 8,222) and a test dataset (validation cohort) consisting of the remaining 20% of patients (n = 2,055) using the Weka test/train split function. The proportions of validation data were chosen to optimize the model learning without sacrificing validation accuracy.7 All continuous variables were discretized using a hybrid approach of expert binning, equal frequency binning, and equal width binning. To select variables for inclusion in the model, information gain was run in a 10-fold cross-validation on the training data, with the recurring top variables being selected for model inclusion. The cutoff for selection was information gain > 0.003 for all three model time points. The selected features from the training sets were used to learn both Tree-augmented Naïve Bayes and Naïve Bayes (NB) graphical models with GeNIe software (BayesFusion, Pittsburgh, Pennsylvania). Each model was optimized by running 10-fold cross-validation and removing or adding variables that either had low diagnostic value (as calculated in GeNIe) or were on the cusp of the information gain cutoff. At all three time points, the NB models had superior performance, as measured by the area under the ROC curve.
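The discretization and feature selection steps described above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names are hypothetical, expert binning and the 10-fold cross-validation loop are omitted, and the code is not the Weka or GeNIe implementation used in the study. Only the 0.003 cutoff is taken from the text.

```python
import math
from collections import Counter

IG_CUTOFF = 0.003  # information gain threshold stated in the text


def equal_width_bins(values, k):
    """Assign each value to one of k bins of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts.

    Tied values may land in different bins in this simple version.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in outcome entropy after splitting on a discrete feature."""
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(features, labels, cutoff=IG_CUTOFF):
    """Return the names of features whose information gain exceeds cutoff."""
    return sorted(
        name for name, column in features.items()
        if information_gain(column, labels) > cutoff
    )
```

A variable that separates the outcome classes perfectly has an information gain equal to the entropy of the outcome, whereas a variable that is independent of the outcome has an information gain of zero and falls below the cutoff.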
Mortality post-LVAD implantation was modeled using NB at 1, 3, and 12 months using the training dataset. The final models used 28, 26, and 21 independent variables, respectively. Individual variables at each time point were assessed for diagnostic value and given a “weight of influence” as calculated in GeNIe. These models were collectively termed the Cardiac Outcomes Risk Assessment (CORA) mortality models. For the current study, the derived models were assessed using the 2,055 patients in the test cohort, subdivided by their IM profile. The IM profile is documented by the implanting surgeon or the advanced HF team and captured within the registry at the time of implant. The performance of the three mortality models was also compared across device strategies (BTT versus DT, as documented in IM) and LVAD types (centrifugal versus axial) to assess their applicability to these specific patient populations. Lastly, mortality predictions by IM profile at 12 months post implant were assessed using the SHFM, HFSS, and HMRS in the validation cohort to compare their performance to the NB model. This study was approved by the IM Data, Access, Analysis, and Publication Committee.
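For readers unfamiliar with NB classifiers, the following minimal Python sketch shows how such a model turns discretized preimplant variables into a predicted probability of death. It is a generic categorical NB with Laplace smoothing (an assumption; the smoothing used in GeNIe is not described here), and the class name and example variable are hypothetical. It is not the CORA model itself.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical Naive Bayes with additive (Laplace) smoothing."""

    def fit(self, rows, labels, alpha=1.0):
        self.alpha = alpha
        self.n = len(labels)
        self.class_counts = Counter(labels)
        self.counts = defaultdict(Counter)  # (feature index, class) -> value counts
        self.values = defaultdict(set)      # feature index -> values seen in training
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.counts[(j, label)][value] += 1
                self.values[j].add(value)
        return self

    def predict_proba(self, row):
        """Return a dict mapping each class to its posterior probability."""
        log_post = {}
        for label, n_label in self.class_counts.items():
            lp = math.log(n_label / self.n)  # log prior
            for j, value in enumerate(row):
                # Features are treated as conditionally independent given the class.
                num = self.counts[(j, label)][value] + self.alpha
                den = n_label + self.alpha * len(self.values[j])
                lp += math.log(num / den)
            log_post[label] = lp
        # Normalize in log space for numerical stability.
        top = max(log_post.values())
        total = sum(math.exp(lp - top) for lp in log_post.values())
        return {label: math.exp(lp - top) / total for label, lp in log_post.items()}


# Hypothetical toy data: one binned variable and a binary outcome.
rows = [("low",), ("low",), ("high",), ("high",)]
labels = ["alive", "alive", "dead", "dead"]
model = NaiveBayes().fit(rows, labels)
risk = model.predict_proba(("high",))
```

The independence assumption is what allows an NB model to combine many variables with relatively few parameters; the predicted probability of death is then the score on which discrimination is assessed.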
Differences in the patient cohort by IM level were measured by Student’s t-test for continuous variables and by χ² analysis otherwise. The overall performance of the models was measured by ROC AUC.
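The ROC AUC can be read as the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The sketch below computes it in this pairwise form and then repeats the calculation within subgroups such as IM profile. The function names are hypothetical, and this is an illustration of the metric rather than the software used in the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk per patient; labels: 1 = died, 0 = survived.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one death and one survivor")
    # Count concordant pairs; tied pairs contribute one half.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(scores, labels, groups):
    """Compute the AUC separately within each subgroup (e.g., IM profile)."""
    result = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        result[group] = roc_auc([scores[i] for i in idx], [labels[i] for i in idx])
    return result
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of patients who died from those who survived.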
The baseline parameters of the derivation and validation cohorts were similar and are shown in Table 1. The average age of IM profile 1 patients was 53.5 years, whereas the profile 4–7 patients had an average age of 60.8 years. The majority of profile 1 patients received their LVAD as BTT (54%), while the majority of profile 4–7 patients received their LVAD as DT (52%). Ischemic disease was listed as the cause of cardiomyopathy in the majority of the profile 4–7 patients (55.5%).
Performance of Bayesian Algorithms across IM Profiles
The Kaplan-Meier survival curves, stratified by IM profile at the time of implant, are shown in Figure 1. The Bayesian models for predicting 1-month, 3-month, and 12-month mortality yielded an AUC ranging between 0.63 and 0.74 for patients designated as IM profiles 1–3 (Table 2 and Figure 2A, 2B, and 2C). The models exhibited less discrimination in the less sick HF population, IM profiles 4–7, with C-statistics between 0.61 and 0.69.
Performance of Bayesian Algorithms Segregated by Device Type
The performance of the Bayesian models for 1-month, 3-month, and 1-year mortality was very similar for centrifugal and axial flow pumps. The corresponding AUC ranged from 0.68 to 0.74 (Table 3 and Figure 3A, 3B, and 3C).
Performance of Bayesian Algorithms Based on Device Strategy
The performance of the Bayesian models was better for BTT indication (AUC range, 0.70–0.73) relative to DT (AUC range, 0.66–0.69), across different time points (Table 4 and Figure 4A, 4B, and 4C).
Performance of Comparator Risk Scores across IM Profiles
The performance of the Bayesian model (AUC range, 0.64–0.71) at 12 months postimplant was superior to the SHFM (AUC range, 0.51–0.55), HFSS (AUC range, 0.49–0.54) and HMRS (AUC range, 0.49–0.69) for all four IM categories (Table 5 and Figures 5A, 5B, 5C, and 5D).
Risk stratification of patients with end-stage HF is key to optimizing outcomes with LVAD therapy.8 Current guidelines from the International Society for Heart and Lung Transplantation recommend that HF patients who are at “high-risk for 1-year mortality using prognostic models” should be referred for advanced therapies as appropriate (class IIa, level of evidence C).9 Although the guidelines do not specify any particular prognostic models, there are very few available. Validated tools such as the SHFM10 and HFSS11 tend to underestimate absolute risk in high-risk populations and overestimate survival in end-stage HF. Similarly, the NYHA classification, although commonly used in HF, is subjective, inconsistent, and susceptible to significant interobserver variability.8 More recently, patients have been categorized into IM profiles based on their provider’s clinical assessment.3 Between 2008 and 2010, 16% of patients in the IM registry who received a durable LVAD were classified as IM profile 1. In recent years, the majority of patients receiving a durable LVAD were classified as profile 2 or 3. Risk stratification in these sicker patient profiles (1–3) becomes even more relevant to allow clinicians to select patients who are most likely to benefit from this highly invasive and expensive treatment option, while recommending against LVAD therapy for those who are too sick to derive benefit. Other considerations at the time of implant include device type, device strategy, and assessment of the short- and long-term risk of mortality and major morbidities. The HMRS, derived from a clinical trial population who all received the same axial flow device, had a C-index of 0.62 and 0.60 at 3 months and 2 years, respectively, when applied to the IM population.5 Given its limited applicability, the need for improved models derived from a diverse patient population cannot be overemphasized.
We have previously published the application of Bayesian analysis in stratifying the risk of mortality and right ventricular failure in the IM population.12,13 Our Bayesian-based CORA models outperformed the SHFM, HFSS, and HMRS predictions at 12 months post-LVAD across all IM profiles. Moreover, they performed comparably across device types and implant strategies.
Bayesian models provide a considerable advantage over traditional risk models in their ability to account for a wide range of clinical variables and their interplay with each other. These include multiple preoperative characteristics such as demographics, medical history, and test results (laboratory, exercise, and imaging). In fact, IM profile at the time of implant was one of the variables predictive of 1- and 3-month mortality.6 This becomes increasingly relevant when risk stratifying outcomes in patients with end-stage HF, given their inherent heterogeneity and associated medical comorbidities. In addition, our models have been derived from a large, multicenter “real life” patient population undergoing LVAD implantation (IM) and subsequently validated across all IM profiles, device types, and device indications. The Bayesian risk stratification models have the potential to provide clinicians with decision support to sharpen and individualize their decisions regarding LVAD implantation. Despite some pitfalls, it is becoming increasingly evident that the best way to make data-driven decisions in medicine going forward will be through the application of techniques drawn from artificial intelligence and machine learning.14 Once the data feeding these models can be imported directly from electronic medical records, this risk stratification tool can be integrated seamlessly into clinical practice.
The Bayesian models become increasingly accurate when applied to categories of patient populations that constituted a higher percentage of the derivation cohorts. In other words, the more data that are entered into the algorithms, the “smarter” their predictions become. We believe that this accounts for the higher AUC for prediction in the lower IM profiles, given that the vast majority of patients who died post LVAD fall into these categories. Moreover, the clinical course of patients in the higher profiles (4–7) is more variable and hence less predictable. Similarly, a higher percentage of patients in the BTT category were in profiles 1 and 2, resulting in a higher AUC for prediction in the BTT category relative to DT. This suggests that the models have the potential to improve over time with increasing quality and quantity of data.
We acknowledge that individual risk stratification in end-stage HF patients relies heavily on clinical experience and cannot be fully captured in an algorithm. IM profiles do not have strict definitions and may be applied differently between centers and even by different providers within the same institution. Post-LVAD implant mortality depends on several factors (both medical and others) that cannot be captured or predicted in the preimplant setting. Importantly, our algorithms have been derived in a retrospective fashion from registry data, with the understanding that these events (mortality) have already occurred after the LVAD implant in patients who were deemed to be candidates for the device by their respective healthcare teams. Any risk assessment tool derived from registry data will be handicapped by its inherent limitations, such as incomplete, missing, or incorrectly entered data elements. Our analysis did not include HeartMate III implants, since they were performed in a clinical trial setting at the time and were therefore not included in the IM registry. However, despite these limitations, the Bayesian models draw on a large, comprehensive database, encompassing the reported experience in over 10,000 LVAD patients with various device types and strategies.
As we continue to learn how to use this tool in medical decision making, we hope that it will help us effectively shape our decisions for patient selection with an aim for optimal clinical outcomes.
We validated the Bayesian models for risk stratifying mortality in a large, multicenter cohort of patients. We demonstrated that these models correlate with short-term and long-term mortality across IM profiles, regardless of device type or strategy. This allows clinicians to use a single tool to aid in decision making and appropriate patient selection for LVAD therapy. When supplemented with patient and caregiver engagement and risk estimates of quality of life, these tools can improve the shared decision-making abilities of the LVAD team.
1. Kirklin JK, Pagani FD, Kormos RL, et al. Eighth annual INTERMACS report: Special focus on framing the impact of adverse events. J Heart Lung Transplant 2017.36: 1080–1086.
2. Cowger J, Shah P, Stulak J, et al. INTERMACS profiles and modifiers: Heterogeneity of patient classification and the impact of modifiers on predicting patient outcome. J Heart Lung Transplant 2016.35: 440–448.
3. Stevenson LW, Pagani FD, Young JB, et al. INTERMACS profiles of advanced heart failure: the current picture. J Heart Lung Transplant 2009.28: 535–541.
4. Boyle AJ, Ascheim DD, Russo MJ, et al. Clinical outcomes for continuous-flow left ventricular assist device patients stratified by pre-operative INTERMACS classification. J Heart Lung Transplant 2011.30: 402–407.
5. Adamo L, Nassif M, Tibrewala A, et al. The Heartmate Risk Score predicts morbidity and mortality in unselected left ventricular assist device recipients and risk stratifies INTERMACS class 1 patients. JACC Heart Fail 2015.3: 283–290.
6. Kanwar MK, Lohmueller LC, Kormos RL, et al. Low accuracy of the HeartMate risk score for predicting mortality using the INTERMACS registry data. ASAIO J 2017.63: 251–256.
7. Guyon I. A scaling law for the validation-set training-set size ratio. AT&T Bell Laboratories, 1996. Berkeley, Calif, USA.
8. Miller LW, Guglin M. Patient selection for ventricular assist devices: A moving target. J Am Coll Cardiol 2013.61: 1209–1221.
9. Feldman D, Pamboukian SV, Teuteberg JJ, et al; International Society for Heart and Lung Transplantation: The 2013 International Society for Heart and Lung Transplantation guidelines for mechanical circulatory support: Executive summary. J Heart Lung Transplant 2013.32: 157–187.
10. Levy WC, Mozaffarian D, Linker DT, et al. The Seattle Heart Failure Model: Prediction of survival in heart failure. Circulation 2006.113: 1424–1433.
11. Aaronson KD, Schwartz JS, Chen TM, Wong KL, Goin JE, Mancini DM. Development and prospective validation of a clinical index to predict survival in ambulatory patients referred for cardiac transplant evaluation. Circulation 1997.95: 2660–2667.
12. Loghmanpour NA, Kormos RL, Kanwar MK, Teuteberg JJ, Murali S, Antaki JF. A Bayesian model to predict right ventricular failure following left ventricular assist device therapy. JACC Heart Fail 2016.4: 711–721.
13. Loghmanpour NA, Kanwar MK, Druzdzel MJ, Benza RL, Murali S, Antaki JF. A new Bayesian network-based risk stratification model for prediction of short-term and long-term LVAD mortality. ASAIO J 2015.61: 313–323.
14. Johnson KW, Torres Soto J, Glicksberg BS, et al. Artificial intelligence in cardiology. J Am Coll Cardiol 2018.71: 2668–2679.
LVAD; risk stratification; mortality; INTERMACS profile
Copyright © 2019 by the American Society for Artificial Internal Organs