Single-item measures for patient care
My colleagues and I began developing patient-reported measures in the 1980s after we documented a significant mismatch between what matters to primary care patients and what clinicians knew (Nelson et al., 1983). These single items measured physical, emotional, and social health, pain, overall health, and quality of life (QOL) (Nelson et al., 1990). Ease of administration and interpretation in patient care prompted international dissemination (de Azevedo-Marques & Zuardi, 2011; van Weel et al., 1995). A controlled trial involving 45 primary care clinicians and patients 70 years or older showed that when our measures were linked to patient education materials and explicit directives to clinicians, care improved (Wasson et al., 1999). We developed similar measures for other clinical situations (Wasson et al., 1994, 2000).
We recently conducted a prospective test of a “What Matters Index” (WMI) that unambiguously identifies fundamental, remediable needs for each patient and directs the delivery of services to patient categories based on their risk for subsequent costly care (Wasson et al., 2018a, 2018b). The WMI is based on only 5 patient-reported measures: pain, emotional issues, polypharmacy, adverse medication effects, and low confidence in managing health problems. The methods for scoring each measure and the reasons for its inclusion are given in Table 1 (Wasson, 2017).
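As a minimal sketch of how an index of this kind might be tallied, the Python below assumes that each of the 5 measures has already been reduced to a yes/no flag and that the index is a simple count of flagged measures. Both are assumptions made for illustration. The actual scoring rules are those in Table 1 and are not reproduced here, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class WMIResponses:
    # Hypothetical field names. Each flag is True when the patient's response
    # meets the threshold for that measure (thresholds are NOT defined here).
    pain: bool
    emotional_issues: bool
    polypharmacy: bool
    adverse_medication_effects: bool
    low_health_confidence: bool


def flagged_needs(responses: WMIResponses) -> List[str]:
    """Return the names of the measures that are flagged for this patient."""
    return [f.name for f in fields(responses) if getattr(responses, f.name)]


def wmi_count(responses: WMIResponses) -> int:
    """Count the flagged measures (0 to 5). Assumes a simple unweighted sum."""
    return len(flagged_needs(responses))
```

A patient flagged for pain and polypharmacy only would receive a count of 2 under this assumption, and `flagged_needs` would list those two measures as the ones to address.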
HowsYourHealth.org is an Internet-based application that integrates our single-item measures and the WMI into a comprehensive health checkup. It automatically provides clinicians a report about patients' function, diagnosis, symptoms, health habits, preventive needs, capacity to self-manage chronic conditions, and their experiences of care (Nelson et al., 2015; Wasson et al., 2011). It has been freely available for almost 2 decades and provides the data for this report.
Multi-item measures for outcome assessment
Over a similar time, multi-item measures for assessing outcomes of health care have gained widespread acceptance. The earliest examples include various “short-form” (SF) measures used in the Rand Health Insurance Experiment and the Medical Outcomes Study (Brook et al., 1979; Tarlov et al., 1989). The Consumer Assessment of Healthcare Providers and Systems (CAHPS) is commonly used for judging health care quality (Agency for Healthcare Research and Quality, n.d.). Innumerable multi-item measures claim specificity for diseases, symptoms, and components of health services.
Multiple items generally distinguish smaller differences between interventions than single items. Therefore, they have been widely disseminated in the United States for health services evaluation and reimbursement. However, the price of this increased “granularity” is the cost of distribution and the nuisance of computation and interpretation. As an example, in their detailed analysis of SF-36 variants, Selim et al. (2018) needed complex mathematical techniques to compare management strategies: scoring, transforming, norming, imputation, and variance assessment. The authors then assumed that their calculated effect sizes reflect an important improvement in a patient's QOL. This assumption is highly debatable. Multi-item measures principally designed for psychometric integrity and outcome assessments have long been known to impose implementation and interpretive problems that run counter to what matters to a patient and a clinician trying to serve that patient's needs (Feinstein, 1992).
PURPOSE OF THE REPORT
The purpose of this report is to illustrate measurement compromises for QOL assessment and management.
Responses to HowsYourHealth.org from 9068 patients aged 65+ years are the basis for this report. Fifty-four percent of the patients are female. Eleven percent are poor: they self-report at least some of the time having difficulty paying for essentials such as food, clothing, or housing. In total, 1706 patients have 3 or more of 5 common chronic conditions and 3499 are taking 3 or more prescription medications. The 5 chronic conditions are hypertension (n = 4793), arthritis (n = 4026), atherosclerotic cardiovascular disease (n = 1804), diabetes (n = 1673), and respiratory disease (n = 1325).
Quality of life
Patients self-report their overall QOL: “How have things been going for you in the past four weeks: very well—could hardly be better (27%), pretty good (48%), good and bad parts about equal (19%), pretty bad (5%), very bad—could hardly be worse (1%).”
- The WMI measures.
- Multi-item comparison measures. A mix of measures is typically included in the SF-36 variants. For example, the SF-12 contains measures for overall health, emotional state, performance of physical activities, engagement in daily and social activities, and severity of pain (RAND Health, 2018). For this analysis, indicators for social determinants of health—a measure for social support and another for poverty—are added to the customary domains. This multi-item comparator is abbreviated in the following results as “6 FNX limitations, OH, Poverty.”
- A sum of the 5 chronic conditions listed previously.
I present the Medicare patients' “pretty bad” and “very bad” responses about their QOL as an indication of “poor QOL” (vs all other QOL responses).
I display the area under the receiver operating characteristic curve (AUROC) to examine the association of prototypic, multi-item measures with patient-reported QOL. The AUROC represents the probability that a randomly chosen patient who reports poor QOL is rated worse by a prototypic, multi-item measure than a randomly chosen patient who reports neutral or positive QOL. When the AUROC is 0.70 to 0.79, discrimination is considered fair; when it is 0.80 to 0.89, discrimination is considered good.
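The probability interpretation of the AUROC can be sketched directly by comparing every patient with poor QOL against every patient without it. The Python below is an illustration only, not the software used for this report. It assumes that a higher score on a measure means more limitations, and the short response labels and function names are hypothetical.

```python
from typing import Sequence

# Hypothetical short labels for the two responses treated as "poor QOL".
POOR_QOL_RESPONSES = {"pretty bad", "very bad"}


def is_poor_qol(response: str) -> bool:
    """Dichotomize a QOL response: poor vs all other responses."""
    return response in POOR_QOL_RESPONSES


def auroc(scores: Sequence[float], poor: Sequence[bool]) -> float:
    """Pairwise AUROC: the share of (poor, not-poor) patient pairs in which
    the poor-QOL patient has the higher score. Ties count as half."""
    poor_scores = [s for s, p in zip(scores, poor) if p]
    other_scores = [s for s, p in zip(scores, poor) if not p]
    if not poor_scores or not other_scores:
        raise ValueError("need at least one patient in each QOL group")
    wins = 0.0
    for sp in poor_scores:
        for so in other_scores:
            if sp > so:
                wins += 1.0
            elif sp == so:
                wins += 0.5
    return wins / (len(poor_scores) * len(other_scores))
```

For a toy set of 6 patients with scores 4, 3, 2, 1, 0, 3, of whom only the first two report poor QOL, there are 8 pairs; the poor-QOL patient scores higher in 7 and ties in 1, so the AUROC is 7.5/8 = 0.9375. The pairwise loop is quadratic in the number of patients, which is acceptable for illustration but not the most efficient approach for thousands of respondents.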
I include in the AUROC displays only 4 prototypic measures because their range subsumes that observed for other combinations. I also limit the displays to “poor QOL” because the results of an analysis for “good QOL” (“very well” and “good” responses vs all others) are not different.
Poor QOL and other self-reported items
Table 2 displays the percentage of Medicare patients responding to each item when they rate their QOL either “poor” or “not poor.” As expected, negative responses to any item are strongly associated with poor overall QOL. Also listed is the Spearman correlation between the full scale of an item and the 5 possible responses for QOL ranging from “very well—could hardly be better” to “very bad—could hardly be worse.”
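For readers unfamiliar with the Spearman correlation, the sketch below shows one standard way to compute it: rank each variable, giving tied values their average rank, and then take the Pearson correlation of the ranks. It is an illustration with hypothetical function names, not the code used to produce Table 2. Ties matter here because both the items and the QOL question have only a few response levels.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)
```

Coding both an item and QOL so that higher numbers mean worse status would make a positive coefficient indicate that worse responses on the item accompany worse QOL.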
QOL and prototypic comparators
The Figure (left diagram) illustrates the association of poor QOL with the 5-item WMI, a sum of 5 chronic conditions, and an 8-item comparator modeled after the SF-36. Better discriminators between QOL states have a higher AUROC. The 8-item comparator is the best discriminator for poor QOL and the sum of 5 diagnoses is clearly the worst. The correlation with the full 5-response range of QOL reflects this difference (r = 0.60 vs r = 0.28).
An attribute of an AUROC is the insight it provides about the marginal advantage (or disadvantage) of adding or omitting measures. For example, if any additional functional limitation is added to the WMI (from a response for a significant limitation in social support, social activity, daily activity, or physical activity), the AUROC increases only marginally, from 0.87 to 0.90. In other words, so many measures are correlated with QOL (Table 2), and therefore with each other, that after a point the addition of measures is not meaningful.
Past hospital use and comparators
The Figure (right diagram) also includes the AUROC for identifying the 1884 Medicare patients who used the hospital in the past year. Although they discriminate better than chance, none of the comparators is acceptable for predicting hospital utilization by individual patients. This finding, based on retrospective data, is consistent with prospective prediction of hospital and emergency care in Medicaid and private practice patients. Regardless of the measures or algorithms used for prediction, designations of being at risk for costly care are inherently inaccurate (Wasson et al., 2018a).
This brief report illustrates several different methods for assessing and improving, whenever possible, a patient's QOL.
- Simply ask patients about the quality of their life and inquire about what they would need to maintain or improve it. These efficient, direct queries bring to the fore a concern for every patient's QOL.
- Administer a small set of memorable measures for which a consistent remedy can be predefined (Wasson et al., 2018b). This efficient approach, represented by the WMI, can immediately address important patient needs and indirectly measure and monitor a patient's QOL.
- Use longer multi-item measures that require additional scoring and interpretation before action can be taken. An association with QOL is likely if the right measures are chosen.
My colleagues and I previously documented better value—lower costs for similar results—of single-item compared with multi-item instruments for assessing practice quality and patient engagement (Ho et al., 2013; Wasson, 2013). This brief report illustrates broader advantages of parsimony in measurement of what matters to patients.
Agency for Healthcare Research and Quality. (n.d.). CAHPS measures of patient experience. Retrieved October 1, 2018, from https://www.ahrq.gov/cahps/consumer-reporting/measures/index.html
Brook R. H., Ware J. E. Jr., Davies-Avery A., Stewart A. L., Donald C. A., Rogers W. H., Johnston S. A. (1979). Overview of adult health status measures fielded in RAND's Health Insurance Study. Medical Care, 17(Suppl.), 1–131.
de Azevedo-Marques J. M., Zuardi A. W. (2011). COOP/WONCA Charts as a screen for mental disorders in primary care. The Annals of Family Medicine, 9(4), 359–365. doi:10.1370/afm.1267
Feinstein A. R. (1992). Benefits and obstacles for development of health status measures in clinical settings. Medical Care, 30(5, Suppl.), MS50–MS56.
Ho L., Swartz A., Wasson J. H.; Dartmouth Primary Care Practice Network and Ideal Medical Practices. (2013). The right tool for the right job: The value of alternative patient experience measures. The Journal of Ambulatory Care Management, 36(3), 241–244.
Nelson E. C., Conger B., Douglass R., Gephart D., Kirk J., Page R., Zubkoff M. (1983). Functional health status levels of primary care patients. JAMA, 249(24), 3331–3338.
Nelson E. C., Eftimovska E., Lind C., Hager A., Wasson J. H., Lindblad S. (2015, February 10). Patient reported outcome measures in practice. BMJ, 350, g7818. doi:10.1136/bmj.g7818
Nelson E. C., Landgraf J. M., Hays R. D., Wasson J. H., Kirk J. W. (1990). The functional status of patients: How can it be measured in physicians' offices? Medical Care, 28(12), 1111–1126.
RAND Health. (2018). 36-Item Short Form Survey (SF-36). Retrieved April 27, 2018, from https://www.rand.org/health/surveys_tools/mos/36-item-short-form.html
Selim A. J., Qian S. X., Rogers W., Arya D., Simmons K. (2018). Health status in adults with chronic conditions: Intervention strategies for improving patient reported outcomes. The Journal of Ambulatory Care Management, 42(1), 2–20.
Tarlov A. R., Ware J. E., Greenfield S., Nelson E. C., Perrin E., Zubkoff M. (1989). Medical Outcomes Study: An application of methods for evaluating the results of medical care. JAMA, 262, 925–930.
van Weel C., Konig-Zahn C., Touw-Otten F. W. M. M., van Duijn N. P., Meyboom-de Jong B. (1995). Measuring functional health status with the COOP/WONCA Charts. A manual. Groningen, the Netherlands: WONCA, ERGHO, and NCH-University of Groningen.
Wasson J. H. (2013). A patient-reported spectrum of adverse health care experiences: Harms, unnecessary care, medication illness, and low health confidence. The Journal of Ambulatory Care Management, 36(3), 245–250.
Wasson J. H. (2017). A troubled asset relief program for the patient-centered medical home. The Journal of Ambulatory Care Management, 40(2), 89–100.
Wasson J. H., Benjamin R., Johnson D., Moore L. G., Mackenzie T. (2011). Patients use the internet to enter the medical home. The Journal of Ambulatory Care Management, 34, 38–46.
Wasson J. H., Ho L., Soloway L., Moore L. G. (2018a). Validation of the What Matters Index: A brief, patient-reported index that guides care for chronic conditions and can substitute for computer-generated risk models. PLoS One, 13(2), e0192475. doi:10.1371/journal.pone.0192475
Wasson J. H., Jette A. M., Anderson J., Johnson D. J., Nelson E. C., Kilo C. M. (2000). Routine, single-item screening to identify abusive relationships in women. The Journal of Family Practice, 49(11), 1017–1024.
Wasson J. H., Kairys S. W., Nelson E. C., Kalishman N., Baribeau P. (1994). A short survey for assessing health and social problems of adolescents. The Journal of Family Practice, 38(5), 489–494.
Wasson J. H., Soloway L., Moore L. G., Labrec P., Ho L. (2018b). Development of a care guidance index based on what matters to patients. Quality of Life Research, 27(1), 51–58. doi:10.1007/s11136-017-1573-x
Wasson J. H., Stukel T. A., Weiss J. E., Hays R. D., Jette A. M., Nelson E. C. (1999). A randomized trial of using patient self-assessment data to improve community practices. Effective Clinical Practice, 2, 1–10.