As demonstrated by Fried et al1 more than a decade ago, there is a measurable frailty phenotype. Affected individuals can be reproducibly identified with standardized measures that can be used in clinical and research settings. The 5 measures that define the phenotype are unintentional weight loss of 10 pounds in the past year, self-reported exhaustion, weakness as measured with grip strength, slow walking speed, and low physical activity. This phenotype has been validated as predictive of mortality and disability.2–4 Individuals with this phenotype, however, cannot be easily identified in data that do not have these specialized, although simple, performance measures. Although these measures may be frequently generated in research and in geriatric practice settings, they are not routinely measured in most clinical encounters.
It would be broadly useful to identify individuals with a frailty phenotype using only administrative data. Given the ubiquity of claims data, an indicator of frailty with good positive and negative predictive values may be widely used as an exposure variable to understand the health outcomes of individuals with frailty.5 In addition, it may be used as an outcome variable for evaluating the impact of interventions designed to prevent or delay frailty, or to assess frailty as an unwanted outcome of interventions such as hospitalization or surgery. A claims-based index may also be used as an effect modifier in investigations of treatment effectiveness where differences in treatment response due to frailty are expected. A claims-based index may also be valuable for health services delivery planning or emergency disaster preparedness, where identification of high concentrations of frail individuals may be essential for public safety. With these as motivators, we aimed to develop a tool, the Claims-based Frailty Indicator (CFI), that can be generated easily with claims data alone, and that has good operating characteristics. We considered the Fried frailty phenotype to be the reference standard and developed a parsimonious model to approximate the frailty phenotype and thus identify frail individuals using claims data alone.
The Johns Hopkins Institutional Review Board approved the conduct of this study, which also received approval from the Cardiovascular Health Study (CHS) investigators and from the Centers for Medicare & Medicaid Services.
The data were from the CHS, which had been previously linked to Medicare claims.6 The CHS is a population-based longitudinal cohort that recruited 5201 older people from 4 US communities, beginning in 1989. An additional 687 African Americans were recruited later. Those eligible included all persons living in the household of each individual sampled from the Health Care Financing Administration sampling frame who were 65 years or older at the time of examination, were noninstitutionalized, were expected to remain in the area for the next 3 years, were able to give informed consent, and did not require a proxy respondent at baseline. Potentially eligible individuals who were wheelchair-bound in the home at baseline or who were receiving hospice treatment, radiation therapy, or chemotherapy for cancer were excluded. Cohort participants were seen for annual examinations through 1999, and then followed with phone calls every 6 months. This dataset includes Medicare claims data for the enrolled participants from 1991 through 2009. For inclusion in our study population, the participants needed to have continuous enrollment in Medicare parts A and B during the 6-month windows from which we used their claims data for construction of the CFI.
In the CHS, Fried et al1 defined a phenotypic measure of frailty consisting of 5 criteria: slow walking, weak grip strength, low physical activity, exhaustion, and weight loss. An individual is said to be frail if ≥3 of these criteria are satisfied, prefrail if 1 or 2 criteria are met, and nonfrail if none of the criteria are met. This frailty phenotype was rigorously validated by Bandeen-Roche et al,7 and it is the most widely used instrument for assessing frailty.8 We considered both the prefrail and nonfrail cohort members as not frail for this study.
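The classification rule above can be written down compactly. The following is a minimal sketch, assuming the 5 criteria have already been scored and summed for each participant; the function names are illustrative and are not taken from the CHS data dictionary.

```python
def classify_frailty(criteria_met: int) -> str:
    """Map a count of positive Fried criteria (0 to 5) to a frailty category."""
    if not 0 <= criteria_met <= 5:
        raise ValueError("criteria_met must be between 0 and 5")
    if criteria_met >= 3:
        return "frail"
    if criteria_met >= 1:
        return "prefrail"
    return "nonfrail"


def is_frail_binary(criteria_met: int) -> bool:
    """Binary outcome used in this study: prefrail and nonfrail both count as not frail."""
    return classify_frailty(criteria_met) == "frail"
```

For example, a participant meeting 2 criteria is prefrail under the phenotype and therefore enters the model as not frail.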
Candidate Variable Selection
To identify relevant candidate variables for inclusion in a model, we searched PubMed and Web of Science for relevant published literature. A combination of subject headings and key words was used, including: “claims based frailty measurement,” “frailty,” “frailty variables,” “frailty index,” “phenotype frailty indicators,” and “definition of frailty.” In addition, references cited in the articles were scanned to identify other relevant articles not found by the initial search.
We extracted lists of clinical conditions that had been used in the identified articles for classifying individuals as frail or disabled, either with claims data or with electronic medical records. We reviewed the list to assess the feasibility of operationalizing each condition with claims data alone. The list was reviewed by clinical experts in geriatrics who made recommendations about categorizing the variables by their proximity to the hypothesized underlying feature of frailty (ie, inflammation), then by the consequences of frailty (eg, falls), and then as other conditions of aging that are probably not frailty related (eg, cataracts). To these, we added all of the variables classified by the Agency for Healthcare Research and Quality in their Clinical Classifications Software (CCS), for use with International Classification of Diseases (ICD-9-CM) codes, that had not already been identified by the process above.9 We used the 268 single-level diagnosis labels from the CCS. For the clinical conditions identified in the literature that did not have a CCS diagnosis level, a clinician (J.B.S.) identified the relevant ICD-9 codes using Flash Code medical coding software (Flash Code Solutions, LLC).
The independent variables that were considered as possible correlates with frailty were derived from the individuals’ billed claims from up to 6 months before the fifth and ninth year study visits. Therefore, each individual could contribute data up to 2 times to the cohort. We restricted the claims to those from inpatient and outpatient encounters, excluding claims from nursing home care or skilled nursing care. Individuals needed to be enrolled in both Medicare parts A and B continuously during the 6-month windows of interest.
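The windowing rule described above can be sketched as follows. This is an illustration only: the record layout (`service_date`, `setting`), the setting labels, and the approximation of 6 months as 183 days are assumptions made for the example and do not reflect the actual structure of the linked Medicare files.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 183  # assumption: "6 months" approximated as 183 days
EXCLUDED_SETTINGS = {"nursing_home", "skilled_nursing"}  # hypothetical labels


def claims_in_window(claims, visit_date, lookback_days=LOOKBACK_DAYS):
    """Keep claims dated within the lookback window before a study visit,
    dropping nursing home and skilled nursing claims."""
    start = visit_date - timedelta(days=lookback_days)
    return [
        c for c in claims
        if start <= c["service_date"] < visit_date
        and c["setting"] not in EXCLUDED_SETTINGS
    ]
```

Applied once for the fifth-year visit and once for the ninth-year visit, a function of this kind would yield up to 2 observation windows per participant.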
We aimed to identify the best linear combination of variables, obtainable from claims data alone, which can predict whether an individual is frail. We approached this as a predictive learning problem. Our goal was to have a claims-based index that is: (1) both sensitive and specific for frailty; and (2) reasonably parsimonious so that it can be operationalized easily across different data. We considered 3 techniques widely used in the machine-learning field: logistic regression with lasso penalty, gradient boosting machine, and random forests. In logistic regression with lasso, a logistic regression model is fit to the data with a penalty on the sum of the absolute magnitudes of the regression coefficients.10 This has the consequence of eliminating variables with tiny effects as well as shrinking the magnitude of those with large, but uncertain effects. Gradient boosting and random forests are similar in that they produce collections (ensembles) of tree learners. The difference is in how they arrive at the final aggregate tree learner. Boosting combines simple prediction models (eg, decision trees with 1 or 2 branches), each of which has large bias and small variance, to produce a complex aggregate model,11 whereas the random forest starts by generating multiple, uncorrelated decision trees and combines them using a technique called “bagging” to produce an aggregate (forest) of trees.12 The degree of complexity in all the methods is determined by cross-validation on leave-out samples.
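For readers less familiar with the lasso, the penalized objective described above can be stated explicitly. The notation is introduced here for illustration and is not the authors': y_i is the binary frailty indicator for observation i, x_i is the vector of p claims-derived covariates, and λ is the penalty weight chosen by cross-validation. This is the standard (non-adaptive) form; an adaptive lasso would additionally weight each |β_j| term.

```latex
\hat{\beta} \;=\; \arg\min_{\beta_0,\,\beta}
\left\{
  -\frac{1}{n}\sum_{i=1}^{n}
  \Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
        - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\right\}
```

As λ increases, more coefficients are set exactly to zero, which is what produces a parsimonious model.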
Our method of choice was the logistic regression with lasso penalty for model building, for reasons to be explained later. The optimal lasso penalty was chosen via 10-fold cross-validation. Our goal was to maximize the area under the receiver operating characteristic curve in the cross-validation samples to yield a prediction model that, depending on the chosen cutoff, can identify frail individuals with good sensitivity and specificity. The optimal algorithm, based on this cross-validation, was the algorithm with no interaction terms. Hence, age by sex interactions are not included in the final model. We also tested the alternative modeling techniques including the random forest approach, the gradient boosting model, logistic regression with forward and backward selection of variables for retention in the model, and an approach that uses a count of comorbid diagnoses.13 In our tree learning algorithms, we tried different interaction depths: 1 (no interactions), 2 (2-way interactions), and 3 (3-way interactions). The models were estimated using the glmnet package in R.14 To test whether the addition of these claims-based indicators provides added value beyond age and sex alone for predicting the frailty phenotype, we evaluated the statistical significance of the incremental area under the curve (AUC) using the DeLong test.15
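The quantity being maximized, the AUC, has a simple interpretation: it is the probability that a randomly chosen frail individual receives a higher predicted score than a randomly chosen not-frail individual. The sketch below computes it directly from that definition. It is a plain-Python illustration, not the routine used in the analysis (which was run in R), and its quadratic running time makes it suitable only for small examples.

```python
def auc(labels, scores):
    """Area under the ROC curve as the fraction of (positive, negative) pairs
    in which the positive has the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In a 10-fold cross-validation, a function like this would be evaluated on each held-out fold and the values averaged for every candidate penalty.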
Assessment of Validity
The adaptive lasso approach included a cross-validation procedure so that the area under the receiver operating characteristic (ROC) curve reflects performance in the validation set. We further assessed the predictive validity of the CFI by examining the rates of events among individuals classified as frail or nonfrail with the CFI, choosing a cutoff that yielded a high positive predictive value. We explored several cutoffs including one chosen to maximize specificity and one to simultaneously maximize sensitivity and specificity. We used the predicted frailty, with our chosen cutoff, at the fifth-year visit of cohort participation and quantified the frequency of hospitalizations, fractures, nursing home admission, disability (impairment in at least 1 activity of daily living) and death for individuals classified as frail and nonfrail over the subsequent 5-year time window. Death and hospitalization were obtained from the Medicare claims data. Fractures, nursing home admission, and impairment in activities of daily living were taken from the CHS interview data.
We calculated hazard ratios, odds ratios, and incidence rate ratios for these events, as appropriate, both without and with adjustment for age and sex. We compared these measures of association to those generated using the frailty phenotype for prediction of outcomes. We also compared these measures of association to those from using the Charlson Comorbidity Index alone for risk prediction.16,17
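As an illustration of the simplest of these measures, an unadjusted odds ratio and its 95% confidence interval can be obtained from a 2×2 table of CFI classification by outcome. The sketch below uses the standard log-odds (Wald) interval; it is an assumption that this matches the interval method used in the analysis, and the adjusted estimates would require a regression model rather than this calculation.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: frail with event       b: frail without event
    c: not frail with event   d: not frail without event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```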
We reviewed 9 articles in which investigators aimed to identify frail or disabled individuals using knowledge of their clinical conditions.5,18–25 In total, 99 unique clinical or functional variables were extracted from primarily 3 articles.5,21,25 Of these, 44 variables were deemed as feasible to operationalize with claims data alone (Appendix Table 1, Supplemental Digital Content 1, http://links.lww.com/MLR/B369). Most variables were discarded as they required measured clinical data (such as body mass index) or results from physical examination (such as bradykinesia, or diminished vibratory sense). Others were discarded because they would only be known from interview (problems cooking, difficulty with stairs). Others could not be retained because they required demographic information that is typically unavailable in claims data (marital status, level of education). As stated, we also included 268 CCS variables operationalized by the Agency for Healthcare Research and Quality in the modeling.
Our CHS study cohort included the 5888 participants from 4 clinical sites. As we required continuous fee-for-service enrollment in the two 6-month windows of interest, our final analytic cohort included 4454 people. The individuals who were not continuously enrolled differed modestly from those continuously enrolled (Appendix Table 2, Supplemental Digital Content 1, http://links.lww.com/MLR/B369). Our cohort participants were 84% white; 59% were women and their mean age was 75 years at enrollment in CHS. Approximately 11% of the cohort was frail at visits 5 and 9 (Table 1). This population has been described in detail.1
The adaptive lasso regression yielded a model that identified individuals as frail, using their claims data from the 6 months preceding their assignment of a frailty phenotype, with an area under the ROC curve of 0.75 (Fig. 1). From the initial large candidate set of variables, the lasso approach selected 21 variables, including age and sex (Table 2). A model including only age and sex had an area under the ROC curve of 0.69. Thus the incremental AUC for our final model was 0.06, a statistically significant improvement (P<1e–15) in prediction of the phenotype over a model with age and sex alone. A model that was unconstrained by the lasso technique (and therefore included many variables) yielded an AUC of 0.74 for out-of-sample prediction.
Using the model with 21 variables and a predicted probability cutoff of 0.12 to classify individuals as frail, the sensitivity and specificity were both maximized with a sensitivity of 66% and a specificity of 73%. At a cutoff of 0.20, the sensitivity and specificity were 35% and 91%, respectively.
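The trade-off between cutoffs can be made concrete with a short sketch that computes sensitivity and specificity at a given predicted-probability cutoff. Interpreting "both maximized" as maximizing the sum of sensitivity and specificity (Youden's index) is an assumption made for this example; the function names are illustrative.

```python
def sens_spec(labels, probs, cutoff):
    """Sensitivity and specificity when predicted probability >= cutoff means frail."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted_frail = p >= cutoff
        if y == 1 and predicted_frail:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_frail:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one frail and one not-frail label")
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(labels, probs, candidates):
    """Candidate cutoff with the largest sensitivity + specificity."""
    return max(candidates, key=lambda c: sum(sens_spec(labels, probs, c)))
```

Raising the cutoff moves individuals from the frail to the not-frail group, which can only lower or preserve sensitivity while raising or preserving specificity, as in the move from 0.12 to 0.20 reported above.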
The alternative modeling techniques yielded similar results but were thought to be less interpretable. The random forest model yielded an area under the ROC curve of 0.73. The gradient boosted model had an area under the ROC curve of 0.76. The model which included a simple count of any 40 comorbidities yielded an area under the ROC of 0.72. The logistic regression models with forward and backward stepwise selection selected >40 covariates, and had areas under the ROC curve of 0.73 (Appendix Table 3, Supplemental Digital Content 1, http://links.lww.com/MLR/B369).
The CFI, using a probability cutoff of 0.20, significantly predicted death in 5 years [odds ratio (OR), 3.8; 95% confidence interval (CI), 3.2–4.6], time to death (hazard ratio, 3.2; 95% CI, 2.7–3.7), hospital admission in 5 years (OR, 2.2; 95% CI, 1.7–2.9), and nursing home admission (OR, 3.8; 95% CI, 3.0–4.9) in unadjusted models (Table 3). Adjustment for sex and age attenuated the strength of the association considerably as these variables are already included in the CFI.
More patients classified as frail had ≥2 hospital admissions relative to those not classified as frail (Fig. 2). The CFI’s ability to predict outcomes was slightly weaker than that of the frailty phenotype, but with markedly overlapping CIs surrounding the measures of association (Figs. 3A, B).
Our model, the CFI, classifies individuals as frail or not frail using only administrative claims. The area under the ROC curve was 0.75. The cutoff for classifying individuals as frail or not frail can be chosen based on the intended use. If the desire is to specifically identify frail individuals, perhaps for enrollment in a care management program, a higher cutoff would be selected, for example, 0.25. If the desire is to more inclusively identify frail individuals, such as for planning evacuation services for a natural disaster, a lower cutoff may be more appropriate, for example, 0.12. With this unique linked data, we are the first to build a claims-based index which is validated against an accepted reference standard for assessing the frailty syndrome.8,26
The field of frailty assessment is complex due to the rapid proliferation of instruments to assess frailty. Buta et al8 identified 67 frailty instruments, of which the frailty phenotype developed by Fried et al1 in 2001 was by far the most widely used. The second most common frailty instrument is the deficit accumulation index of Rockwood and colleagues, which takes a different approach to assessing frailty.13,25 Here we chose to use the Fried phenotype for 2 major reasons: first, the phenotype instrument was developed in the CHS, which we used for this study, and second, the CHS is linked to Medicare claims, making it feasible for us to develop a claims-only frailty index.
We found that the CFI more accurately classifies individuals as frail in comparison to a count of accumulated age-related deficits, referred to as the Frailty Index.13,25,27 However, we recognize the utility of the Frailty Index as there have been many published comparisons of the frailty phenotype and the Frailty Index, which largely suggest reasonable concordance between these measures,28,29 as individuals with a frailty phenotype have accumulated more aging-related deficits. However, the Frailty Index is often constructed using measures that can only be known from clinical encounters; no one has previously demonstrated that frailty can be predicted with reasonable accuracy solely with information from administrative data. Our model had an area under the ROC curve of 0.75 in comparison to 0.72 with the Frailty Index approach applied to claims data alone.
Our attention to using alternative methods of deriving a prediction model suggests that there may be models with just slightly better predictive accuracy (ie, with higher areas under the curve), but at the cost of parsimony and transparency. For example, with random forests and gradient boosting, two of the most popular and powerful prediction approaches, we do not obtain a parsimonious model and, in addition, the prediction model is opaque and cannot be easily communicated to other stakeholders. In fact, the area under the ROC curve for random forest was the same as that of lasso, and for gradient boosting it was only larger by 0.01. We expect that a transparent regression model, which can be implemented with access to 6 months of claims data, is more broadly useful than the black-box machine-learning algorithms that would require distribution of specialized software to reproduce the model. Similarly, our choice to use lasso regression to identify a parsimonious model is driven by our interest in having a model that will generalize well to new datasets.
For this first construction of the CFI, we collapsed the prefrail and nonfrail cohort members into a not-frail category. We intentionally opted to categorize prefrail individuals with nonfrail individuals for this exercise because we were interested in maximizing the positive predictive value of the CFI; if the CFI categorizes an individual as frail (using a chosen cutoff), we expect them to be truly frail. If we had been most interested in identifying a robust population, we would have categorized the prefrail with the frail individuals to assure that the model could identify a population free of frailty.
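The link between the choice of cutoff and positive predictive value can be illustrated with Bayes' rule. The sketch below is a back-of-envelope calculation, not a result of this study: it treats the sensitivity, specificity, and approximate prevalence figures quoted in this article as fixed inputs and ignores their sampling uncertainty.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that an individual classified as frail is truly frail."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Inputs taken from the operating points and prevalence quoted in the text.
ppv_cutoff_012 = positive_predictive_value(0.66, 0.73, 0.11)
ppv_cutoff_020 = positive_predictive_value(0.35, 0.91, 0.11)
```

Under these inputs the higher cutoff gives the higher positive predictive value, which is consistent with the rationale for favoring specificity when the aim is to be confident that flagged individuals are truly frail.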
Our validation processes were 3-fold: (1) the adaptive lasso technique includes a cross-validation process so that the reported prediction accuracy, area under the ROC, is from out-of-sample validation sets; (2) the face validity of the included covariates is good—many of the included covariates have been previously identified as being correlated, cross-sectionally, with frailty, including congestive heart failure, depression, and orthopedic issues that contribute to slow gait30,31; and (3) the predictive validity of the CFI for aging-related clinical outcomes including hospitalization, disability and mortality that are known to occur at higher rates among individuals who have been classified as frail with the phenotype.
We modeled frailty with claims data using an accepted measure of frailty—the phenotype developed in the CHS. The challenge of this study was that it required access to a dataset that has both measured frailty (such as the phenotype) and claims. Increasingly, there are appropriate datasets available including the National Health and Aging Trends Study (NHATS), which is linked to Medicare data,32 and the Atherosclerosis Risk in Communities (ARIC) cohort that is linked to Medicare claims and whose investigators have recently begun measuring frailty.33 These provide opportunities for testing the reproducibility of this work—that is, will the same variables be selected in these data?
As described briefly above, the CFI is expected to have many uses for research, for population health management, and for emergency preparedness. A claims-based index may also be used as an effect modifier in investigations of treatment effectiveness where differences in treatment response due to frailty are expected. This may be valuable for analyses of trial or cohort study data where the risk-benefit balance of treatments may or may not differ from that seen in nonfrail individuals. For example, clinicians are often reluctant to offer chemotherapy to older adults with cancer, as they think that the risks exceed the potential benefits. If we were to analyze claims data (perhaps registry-linked claims data), and find that individuals with leukemia and frailty as identified with the CFI have survival outcomes following chemotherapy comparable to nonfrail individuals, this may encourage oncologists to offer therapy and payers to provide coverage. Presently, such investigations require the measurement of frailty in a clinical setting.
The use of frailty as an effect modifier is likely more appropriate in many situations than using age as an effect modifier, although this remains to be tested. A claims-based index may be valuable for population health planning—the identification of a population of frail individuals may help in distributing services to those most likely to benefit, such as care management services, transition of care services, or other high-intensity management services aimed at high-value care delivery and best use of resources.
A claims-based index may also be valuable for emergency disaster preparedness where identification of high concentrations of frail individuals may be essential for public safety. It was recognized most poignantly during Hurricane Katrina, but also during more recent storms, that the most vulnerable individuals are likely to suffer severely during natural disasters.34,35 There is increasing attention to planning for nursing home evacuations,36,37 but less planning for service delivery to frail individuals living in the community. Although voluntary registration of frail or disabled individuals may be helpful, a method to identify these individuals, through their Medicare claims, may be valuable in an emergency situation. More likely, identification of communities that are enriched with frail individuals will allow for better preparedness in these communities.
Claims data alone can be used to classify individuals as frail and nonfrail. Although other models may exist or be developed that have higher predictive validity for clinical outcomes, they have not been designed to specifically identify a frail population, validated against a well-established assessment of frailty. Many investigators believe that frailty is a distinct clinical condition and that interventions for primary prevention of frailty may be developed.38 Similarly, the interventions required for prevention of adverse outcomes among those with frailty may differ from those needed to prevent adverse outcomes among other populations who are known to be at high risk (such as a multimorbid population). Therefore, this CFI was developed to identify specifically a frail population.
1. Fried LP, Tangen CM, Walston J, et al. Frailty in older adults: evidence for a phenotype. J Gerontol A Biol Sci Med Sci. 2001;56:M146–M156.
2. Chang SF, Lin PL. Frail phenotype and mortality prediction: a systematic review and meta-analysis of prospective cohort studies. Int J Nurs Stud. 2015;52:1362–1374.
3. Frisoli A Jr, Ingham SJ, Paes ÂT, et al. Frailty predictors and outcomes among older patients with cardiovascular disease: data from Fragicor. Arch Gerontol Geriatr. 2015;61:1–7.
4. Macklai NS, Spagnoli J, Junod J, et al. Prospective association of the SHARE-operationalized frailty phenotype with adverse health outcomes: evidence from 60+ community-dwelling Europeans living in 11 countries. BMC Geriatr. 2013;13:3.
5. Kim DH, Schneeweiss S. Measuring frailty using claims data for pharmacoepidemiologic studies of mortality in older adults: evidence and recommendations. Pharmacoepidemiol Drug Saf. 2014;23:891–901.
6. Fried LP, Borhani NO, Enright P, et al. The Cardiovascular Health Study: design and rationale. Ann Epidemiol. 1991;1:263–276.
7. Bandeen-Roche K, Xue Q-L, Ferrucci L, et al. Phenotype of frailty: characterization in the Women’s Health and Aging Studies. J Gerontol A Biol Sci Med Sci. 2006;61:262–266.
8. Buta BJ, Walston JD, Godino JG, et al. Frailty assessment instruments: systematic characterization of the uses and contexts of highly-cited instruments. Ageing Res Rev. 2016;26:53–56.
9. Healthcare Cost and Utilization Project. Clinical Classifications Software. Healthcare Cost and Utilization Project (HCUP). Rockville, MD: Agency for Healthcare Research and Quality; 2015.
10. Tibshirani R. Regression shrinkage and selection via the lasso. J R Statistic Soc B. 1996;58:267–288.
11. Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29:1189–1232.
12. Breiman L. Random forests. Machine Learning. 2001;45:5–32.
13. Mitnitski AB, Mogilner AJ, Rockwood K. Accumulation of deficits as a proxy measure of aging. Sci World J. 2001;1:323–336.
14. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33:1–22.
15. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–845.
16. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol. 1992;45:613–619.
17. Romano PS, Roos LL, Jollis JG. Adapting a clinical comorbidity index for use with ICD-9-CM administrative data: differing perspectives. J Clin Epidemiol. 1993;46:1075–1079.
18. De Vries NM, Staal JB, Van Ravensberg CD, et al. Outcome instruments to measure frailty: a systematic review. Ageing Res Rev. 2011;10:104–114.
19. Faurot KR, Jonsson Funk M, Pate V, et al. Using claims data to predict dependency in activities of daily living as a proxy for frailty. Pharmacoepidemiol Drug Saf. 2015;24:59–66.
20. Fried LP, Ferrucci L, Darer J, et al. Untangling the concepts of disability, frailty, and comorbidity: implications for improved targeting and care. J Gerontol A Biol Sci Med Sci. 2004;59:M255–M263.
21. Kamaruzzaman S, Ploubidis GB, Fletcher A, et al. A reliable measure of frailty for a community dwelling older population. Health Qual Life Outcomes. 2010;8:123.
22. Makary MA, Segev DL, Pronovost PJ, et al. Frailty as a predictor of surgical outcomes in older patients. J Am Coll Surg. 2010;210:901–908.
23. Pialoux T, Goyard J, Lesourd B. Screening tools for frailty in primary health care: a systematic review. Geriatr Gerontol Int. 2012;12:189–197.
24. Pijpers E, Ferreira I, Stehouwer CDA, et al. The frailty dilemma. Review of the predictive accuracy of major frailty scores. Eur J Int Med. 2012;23:118–123.
25. Rockwood K, Andrew M, Mitnitski A. Unconventional views of frailty: a comparison of two approaches to measuring frailty in elderly people. J Gerontol A Biol Sci Med Sci. 2007;62:738–743.
26. Afilalo J. The road to frailty is paved with good intentions. Circ Cardiovasc Qual Outcomes. 2016;9:194–196.
27. Rockwood K, Song X, MacKnight C, et al. A global clinical measure of fitness and frailty in elderly people. CMAJ. 2005;173:489–495.
28. Blodgett J, Theou O, Kirkland S, et al. Frailty in NHANES: comparing the frailty index and phenotype. Arch Gerontol Geriatr. 2015;60:464–470.
29. Hoogendijk EO, Van Kan GA, Guyonnet S, et al. Components of the frailty phenotype in relation to the frailty index: results from the Toulouse frailty platform. J Am Med Dir Assoc. 2015;16:855–859.
30. Op het Veld LPM, Van Rossum E, Kempen G, et al. Fried phenotype of frailty: cross-sectional comparison of three frailty stages on various health domains. BMC Geriatr. 2015;15:77.
31. Bandeen-Roche K, Seplaki CL, Huang J, et al. Frailty in older adults: a nationally representative profile in the United States. J Gerontol A Biol Sci Med Sci. 2015;70:1427–1434.
32. Sensitive and restricted data files. National Health and Aging Trends Study. 2015. Available at: www.nhatsdata.org/ResDataFiles.aspx. Accessed February 15, 2016.
33. Atherosclerosis risk in communities study. Available at: www2.cscc.unc.edu/aric/. Accessed February 15, 2016.
34. Benson WF, Aldrich N. CDC’s Disaster Planning Goal: Protect Vulnerable Older Adults (Media Contact). Washington, DC: CDC Healthy Aging Program; 2007.
35. Sakauye KM, Streim JE, Kennedy GJ, et al. AAGP position statement: disaster preparedness for older Americans: critical issues for the preservation of mental health. Am J Geriatr Psychiatry. 2009;17:916–924.
36. Claver M, Dobalian A, Fickel JJ, et al. Comprehensive care for vulnerable elderly veterans during disasters. Arch Gerontol Geriatr. 2013;56:205–213.
37. Dobalian A, Claver M, Fickel JJ. Hurricanes Katrina and Rita and the Department of Veterans Affairs: a conceptual model for understanding the evacuation of nursing homes. Gerontology. 2010;56:581–588.
38. Chen X, Mao G, Leng SX. Frailty syndrome: an overview. Clin Interv Aging. 2014;9:433–441.