The Consumer Assessment of Healthcare Providers and Systems (CAHPS®) is a comprehensive and evolving family of survey instruments designed to capture and report various aspects of consumers’ experiences with care, such as access to care and experiences with clinicians. In addition to capturing reliable and valid data, CAHPS surveys seek to measure what consumers consider important and for which consumers are the best or only source of information.1 CAHPS survey data are used for public reporting of health care quality information to inform consumer choice and provider quality improvement initiatives and, more recently, as input into pay-for-performance awards to hospitals.
The development of a survey that measures the experiences of family members of nursing home residents appears to represent a departure from the CAHPS surveys because CAHPS principles of survey development discourage proxy response. Research has shown that proxy responses, when used instead of consumer responses, produce biased results.2–8 In general, agreement between proxies and consumers is higher for observable events such as ambulation and quality of speech, and lower for events internal to the patient such as depression and general feelings of well-being. These findings would suggest that the perspectives of nursing home residents are the best measure of the quality of care and quality of life in a nursing home. However, approximately two thirds of nursing home residents have some degree of cognitive impairment as measured by the Cognitive Performance Scale, and the extent of this cognitive impairment prevents a substantial portion of nursing home residents from assessing the quality of care provided to them.9 To avoid proxy response, while at the same time providing assessments of quality for residents too cognitively impaired to respond, the objective of the family member survey was to ask about nursing home experiences that could be directly experienced by the resident’s family or caretakers. Thus, the Agency for Healthcare Research and Quality (AHRQ) and the Centers for Medicare and Medicaid Services (CMS) sponsored the development of the CAHPS Nursing Home Survey: Family Member (family member survey) to complement the existing CAHPS Nursing Home Survey: Resident Survey.10 Although family members do not receive care from the nursing home, their experiences at the nursing home and with staff are important in describing the quality of the nursing home. Family members often help choose the nursing home and continue to be involved with their family member after entry, participating in care planning and making decisions on behalf of the resident.
Like resident-reported experiences from the CAHPS Nursing Home Resident Survey, family-reported experiences of care complement other publicly reported data such as health deficiencies, staffing, and other quality indicators (eg, percent of residents with restraints), and thus contribute to a comprehensive view of quality of care.11
The survey development process followed CAHPS principles: asking questions for which the family member is the best, if not the only, source of information regarding the quality of care in a nursing home; asking respondents to report on actual experiences, rather than perceptions of care provided to residents; and, finally, including topics that stakeholders, particularly family members, have identified as being important. One challenge in developing a survey that adheres to these principles is the need to address variation in the degree of engagement exhibited by residents’ families. Some family members rarely see nursing home residents, perhaps as infrequently as once a year.12 Other residents either have no family members or have legal guardians who make decisions on their behalf. In such situations, the guardian is the person best positioned to answer the survey questions, so we developed a survey process that includes a means of determining the most appropriate respondent.
This paper describes the procedures that were used, the issues that were identified, and the bases on which candidate questions were developed, eliminated, retained, or revised to finalize the CAHPS Nursing Home Survey: Family Member instrument.
METHODS
The development of the Nursing Home Family Survey followed a process similar to that used for all other CAHPS survey development projects: formative research to inform the content of the draft survey, cognitive interviews and an expert review of the draft survey, a field test of the draft survey, psychometric analysis of the field test data, further revisions based on that analysis, expert review of these revisions, and production of the final version of the survey (Fig. 1). Findings from the Nursing Home Resident Survey also informed the development of the family member survey.
FIGURE 1: CAHPS survey development process.
Formative Research
The formative research included the original literature review from the resident survey study, literature review updates, responses to the call for measures in the Federal Register, and focus groups.13,14
The team conducted focus groups with family members to identify aspects of the nursing home experience that are most important to them. Focus groups illuminate the concepts most important to consumers, thereby contributing to defining quality of care. They also help to confirm findings from the literature review, determine additional constructs,15 and identify the language consumers use to describe quality of care. Twelve focus groups were conducted in 2005 with participants who either already had placed a family member into a nursing home or were considering placing a family member into one. Groups included a mix of men and women of different races, ethnicities, and levels of educational attainment.14
Cognitive Testing and Expert Review
The team cognitively tested the survey with 27 participants in June 2005. Cognitive testing illuminates whether respondents understand survey items consistently with their intended meaning.16 Cognitive testing identifies imprecise wording, poor ordering of items, and mismatches between survey content and respondent knowledge and literacy, thus minimizing respondent cognitive burden. After this initial testing round, a panel of experts provided guidance and confirmed which items were valuable and measurable indicators of quality of care. The experts included representatives from the nursing home industry, regulators, quality improvement organizations, consumer advocates (including nursing home ombudsmen), providers, and long-term care researchers. The team revised the items and conducted additional testing (n=27) in June 2006. The instrument was revised again for the field test.
Field Test
The survey was field tested from October 2006 to January 2007 in Texas, using a convenience sample of 15 Texas nursing homes that varied in quality as indicated by Texas’s Long Term Care Quality Reporting System, derived from deficiency and quality measures from CMS. Researchers on our team had a history of working with the Texas State Long Term Care Ombudsman program, which facilitated our work with these nursing homes on the field test. Five of the 15 participating nursing homes had Quality Reporting System scores below the statewide mean, 1 facility was at the mean, and 9 exceeded the mean.
We identified an eligible respondent as the person listed by the nursing home as the party responsible for a resident who was at the facility for ≥30 days. In addition to family members and friends, guardians, medical powers of attorney, and attorneys were considered to be eligible responsible parties. Some nursing homes had >1 responsible party listed. In those cases, we selected the one with complete contact information, or if more than 1 had complete contact information, we randomly selected one. On the basis of a power analysis of several sampling scenarios, we determined that we would need approximately 50 complete responses per facility to detect a moderate difference between facilities (effect size of 0.50 as per Cohen17). Assuming an 80% eligibility rate and a response rate of 50%, only facilities with at least 120 beds would yield samples of this size. To provide a margin of error in our sampling design, we increased this target to 150. We selected every resident in facilities with fewer than 150 beds, and for those with >150 beds, we drew a random sample of 150 residents.
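The sampling targets above follow from simple arithmetic. A minimal sketch, with the 50-complete-response target, 80% eligibility, and 50% response figures taken from the text and a hypothetical facility size:

```python
# Back-of-the-envelope sampling calculation for the field test design.
# Figures taken from the text: ~50 complete responses per facility to
# detect an effect size of 0.50, an expected 80% eligibility rate, and
# a 50% response rate.

import math

target_completes = 50
eligibility_rate = 0.80
response_rate = 0.50

# Listed residents needed so that eligible responders yield >= 50 completes.
needed_sample = math.ceil(target_completes / (eligibility_rate * response_rate))
print(needed_sample)  # 125, roughly the "at least 120 beds" threshold

# With a margin of error built in, the per-facility target rose to 150;
# smaller facilities were sampled in full.
sampling_target = 150
beds = 130  # hypothetical facility size
sample_size = min(beds, sampling_target)
print(sample_size)  # 130: every resident selected in this facility
```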
The data collection process included mailing a questionnaire to respondents, mailing a reminder postcard 2 weeks later, mailing a second survey 2 weeks after the postcard, and calling nonrespondents 2 weeks after the second survey mailing to attempt computer-assisted telephone interviews.
Psychometric Analysis
To use all available data for the respondent-level factor analysis, a multiple imputation procedure (SAS/STAT version 9.2) was used to impute data missing due to structured item nonresponse (5 imputations were requested to reflect variability in estimates). This procedure calculates the maximum-likelihood estimates of the covariance matrix under the missing-at-random assumption.18,19 The first correlation matrix produced by PROC MI was used as an input for the factor analysis models, and the analyses were repeated on each of the remaining 4 matrices to evaluate the similarity of conclusions based on the results of analyses for each imputation. Results from these subsequent analyses are only reported if they differ from the initial analysis.
Thirty-two substantive survey items were originally hypothesized as indicators of 5 composites: getting care quickly/availability of staff (5 items), quality of care from nurses and aides (8 items), communication with nurses and aides (4 items), communication with other staff and administrators (8 items), and the nursing home environment (7 items). To assess the validity of the hypothesized factor structure, we examined goodness-of-fit results from a confirmatory factor analysis (CFA) using PROC CALIS in SAS/STAT software. As nontrivial amounts of the data were imputed, and imputed values tend not to be whole numbers, we determined that it would be best to treat the data as continuous instead of ordinal. Recent simulation studies indicate that rounding imputed values to make them integers can bias estimations.20,21 We examined 3 indices of how well the data fit the hypothesized structure: the root mean square error of approximation (RMSEA),22 which describes how well the model fits the population covariance matrix; the normed-fit index (NFI), which compares the hypothesized model to a “worst case scenario” model where all composite items are uncorrelated; and the comparative fit index (CFI),23 which is a variant of the NFI that takes into account sample size and performs well even with small samples.
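PROC CALIS reports these indices directly, but they can also be computed from the model and null (independence-model) chi-square statistics with the standard formulas. The sketch below uses hypothetical chi-square values purely to illustrate the calculations:

```python
# Standard formulas for the three fit indices named in the text, computed
# from model and null-model chi-square statistics. All numeric inputs
# below are hypothetical, chosen to illustrate a poorly fitting model.

def rmsea(chi2, df, n):
    """Root mean square error of approximation (Steiger)."""
    return (max(chi2 - df, 0.0) / (df * (n - 1))) ** 0.5

def nfi(chi2_model, chi2_null):
    """Normed-fit index: improvement over the uncorrelated-items model."""
    return (chi2_null - chi2_model) / chi2_null

def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative fit index: an NFI variant adjusting for df and sample size."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, 0.0)
    return 1.0 - d_model / max(d_model, d_null)

# Hypothetical values for a misfitting 32-item model:
chi2_m, df_m, n = 4500.0, 454, 850
chi2_0, df_0 = 9000.0, 496

print(round(rmsea(chi2_m, df_m, n), 3))   # 0.102 (> 0.10: poor fit)
print(round(nfi(chi2_m, chi2_0), 3))      # 0.5   (< 0.90: poor fit)
print(round(cfi(chi2_m, df_m, chi2_0, df_0), 3))  # 0.524 (< 0.90: poor fit)
```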
The initial hypothesized factor structure did not fit the data (details provided in the Results section below), and thus we followed the CFA with an exploratory factor analysis using PROC FACTOR in SAS. These models used the principal factor method with squared multiple correlations as initial communality estimates and oblique rotation (promax) with Kaiser normalization. The number of factors was determined by Guttman’s weakest lower bound24 (number of factors with eigenvalues >1) in conjunction with a scree plot of the eigenvalues25 and examining the pattern of factor loadings upon rotation for simple structure26 (ie, assessing the degree to which the number of factors extracted based on the first 2 criteria suggested a composite structure that was conceptually interpretable).
Multitrait, multi-item analyses provided supporting evidence for alternative factor structures.27–29 This approach entails comparing the correlations of items with their composite total (correcting for overlap30) to the correlations of items with competing composites. The “scaling success” statistic is one of a number of pieces of evidence that bears on the construct validity of the proposed composites—scaling success of 100% indicates that all items correlate more highly (at least 1 SE higher) with their own composites than with competing composites. The logic for this analysis as an evaluation of construct validity follows that laid out by Campbell and Fiske.31 In addition, to evaluate the relative importance of the items and composites in predicting the overall evaluations of the nursing homes, we examined the correlations between the composites and 3 global ratings of the nursing home: (1) ever unhappy about the care; (2) recommend the nursing home; and (3) rating of all of the care on a scale from 0 to 10.
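The corrected item-total correlation and the scaling-success comparison can be sketched as follows. The data are fabricated 2-item composites purely for illustration, and the 1-SE refinement used in the actual analysis is omitted for brevity:

```python
# Sketch of the multitrait, multi-item "scaling success" check: an item
# should correlate more strongly with its own composite (with the item
# itself removed, i.e., corrected for overlap) than with any competing
# composite. All data below are fabricated for illustration.

from statistics import mean, pstdev

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))

# Two hypothetical 2-item composites measured on 6 respondents.
composite_a = [[4, 3, 4, 2, 4, 1], [4, 4, 3, 2, 4, 2]]
composite_b = [[1, 2, 1, 4, 2, 4], [2, 2, 1, 4, 1, 4]]

def corrected_item_total(item, composite):
    # Correct for overlap by totaling only the *other* items in the composite.
    rest = [i for i in composite if i is not item]
    total = [sum(vals) for vals in zip(*rest)]
    return pearson(item, total)

item = composite_a[0]
own = corrected_item_total(item, composite_a)
competing = pearson(item, [sum(v) for v in zip(*composite_b)])

# A "scaling success": the item tracks its own composite, not the competitor.
assert own > competing
print(round(own, 2), round(competing, 2))  # 0.8 -0.95
```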
At the respondent level, Cronbach α32 was used as an indicator of internal consistency reliability of the composites. We also calculated interunit reliability (IUR) for each item and composite. The IUR indicates the degree to which these measures discriminate across nursing homes, and is computed as the between-group variance minus the within-group variance over the between-group variance.33,34 As IUR is sensitive to sample size, we also calculated the intraclass correlation coefficient (ICC), which is the between-group variance minus the within-group variance over the total variance adjusted for the average number of respondents per nursing home.35 We considered an IUR of 0.70 as the desired minimum for indicating good reliability.36 Using the ICC and the Spearman-Brown prophecy formula,37,38 and taking into consideration the item-level response rates because of skip instructions, we estimated the number of complete responses, as well as the total sample size, needed for a composite or item to yield an IUR of 0.70.
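The Spearman-Brown step can be sketched as a short calculation; the ICC and item-level response rate below are hypothetical:

```python
# Sketch of the Spearman-Brown prophecy step described in the text:
# given an item's intraclass correlation coefficient (ICC), how many
# complete responses per nursing home are needed for an interunit
# reliability (IUR) of 0.70? The ICC and response rate are hypothetical.

import math

def iur(icc, n):
    """Reliability of a nursing-home mean based on n respondents."""
    return n * icc / (1 + (n - 1) * icc)

def respondents_needed(icc, target=0.70):
    """Invert the Spearman-Brown formula for the required n per home."""
    return math.ceil(target * (1 - icc) / (icc * (1 - target)))

icc = 0.05  # hypothetical item-level ICC
n = respondents_needed(icc)
print(n, round(iur(icc, n), 3))  # 45 0.703

# Skip instructions reduce item-level responses, so the total sample
# per home must be inflated by the item-level response rate.
item_response_rate = 0.60  # hypothetical
total_needed = math.ceil(n / item_response_rate)
print(total_needed)  # 75
```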
RESULTS
Focus Groups
The focus groups provided information on quality of care indicators important to family members. These included cleanliness, staff training, staffing levels, quality of interactions between staff and residents, quality of food, social engagement, and privacy (Table 1). Participants were interested in measures of both good and poor quality of care, so the team created items that measured both. Indicators of good quality included concepts such as how caring and approachable the staff is (“I want to see how the staff treats the patient”), treating the resident as a “real person,” and physical aspects of the nursing home, such as cleanliness, appearance, and size of rooms. Indicators of poor quality included concepts such as “no screaming, yelling, or crying [at the home].”
TABLE 1: Decisions About Quality of Care Constructs Emerging From Focus Groups
Participants also identified numerous concepts about which family members could not have sufficient experience or knowledge to form a judgment, such as the quality of food, providing medications on time, and quality of clinical care. Ultimately, concepts that were considered for item development included those that respondents could observe or experience and a few proxy concepts.
Cognitive Testing and Expert Review
One challenge was to determine whether respondents had enough information or experience to make meaningful judgments. Candidate items included both obvious proxy items and items not initially perceived as proxy items (hereafter termed “hidden proxy”) (Table 2). Obvious proxy items were eliminated or revised. Formative research findings suggested measuring the taste and nutrition of food, an obvious proxy, but the team discarded these concepts because family members would not be eating the food and would not be able to judge it (and, in some cases, residents may be on a low-sodium diet, which may be perceived as poor tasting). The team developed and tested several hidden proxy items, for example, items asking whether the staff showed courtesy and respect, cared about the resident, or treated the resident with kindness. Cognitive testing participants were able to provide concrete examples of observing staff performing these behaviors.
TABLE 2: Obvious and Hidden Proxy Items
Other hidden proxy items did not fare as well. Cognitive testing participants could not provide meaningful answers about seeing staff encourage residents to participate in care decisions and encourage residents to be as independent as possible. These items were abandoned because respondents did not observe the behaviors, were not able to identify the behaviors, or, in the case of encouraging residents to be independent, found the item inapplicable because the resident was too frail to be independent at all.
Formative research indicated that adequate staffing is critical to family members, but developing items that captured the construct while being observable by the family member was a challenge. To measure adequate staffing, the team developed items that measured unmet needs. Three items asked about whether the family member performed an activity (help with eating, drinking, and toileting) because no staff was available. An additional item was developed that asked how often the respondent was able to find a nurse or aide when he or she wanted one.
Even with these staff adequacy measures, stakeholders requested an item measuring overall nursing home staffing. The team decided to add the item “…did you feel there were enough nurses and aides?” although this item did not follow usual CAHPS principles. First, family members would not necessarily know if there were enough nurses and aides and thus would not be the best source of this information. Second, it was unclear if this was really a proxy for administrative staffing records. This item was placed after all of the substantive items to minimize its influence/order effect.
Field Test Analysis
The total sample size across all homes was 1471, with homes averaging 90 eligible respondents. The final response rate was 63% (n=885), 732 by mail and 153 by phone. More than half of the respondents were adult children of the resident (55%), 11% were siblings, 9% were spouses, 14% were other family members, and 11% were someone else. Most respondents (69%) visited their family member >20 times in the last 6 months; 11% visited 11–20 times; 10% visited 6–10 times; and 7% visited 2–5 times; only 3% visited ≤1 time (and their responses were excluded from further analysis).
Most respondents reported that their family members were long-stay: 85% of respondents had family members who lived in the nursing home for ≥6 months and 84% of the respondents expected their family member to live there permanently. Finally, 63% of respondents reported that their family member had a serious memory problem.
The CFA showed that the data did not fit the hypothesized 5-factor structure (CFI and NFI <0.90; RMSEA >0.10), and so the team identified an alternative structure and a revised set of composites. The exploratory factor analyses results also did not demonstrate a clear underlying factor structure for these 32 items. Accordingly, the team conducted iterative analyses that included additional factor analysis and multitrait multi-item analysis, along with input from the team, AHRQ, CMS, and the experts.
After reviewing the analytic results, the team agreed on a final set of 21 items organized into 4 composites (Table 3). The Cronbach coefficient α indicated a high level of internal consistency for all 4 composites. The floor effects (the percentage of respondents at the lowest possible rating/score) were quite low (0%–<1%). Likewise, the ceiling effects (the percentage of respondents who gave the highest possible rating/score) were generally low (5%–14%) for 3 of the 4 composites. The “Basic Needs” composite had substantially higher ceiling effects, which is not unusual for a topic made up entirely of dichotomous items. The correlations between the composites and the global rating of the nursing home supported the concurrent validity of all composites. Composites 1, 2, and 3 have good discriminant validity with 100% scaling success. The fourth composite’s scaling success was lower, but it was retained because the team and stakeholders considered the items important to measure.
TABLE 3: Final Composites and Dispositions
Three of the 4 composites (2, 3, and 4) had high nursing-home–level reliabilities; that is, they discriminated across nursing homes. The basic needs composite had relatively low IUR because about two thirds of the respondents legitimately skipped the survey items based on their responses to screener questions. Further, all 3 items were dichotomous, and thus the variance in the composite is naturally limited.
Table 4 summarizes the statistical and stakeholder considerations taken into account in determining which of the field test items should be included in the final instrument. Ten of the 32 items were abandoned. In every case, these items had one or more of the following problems: low item-total correlations, high correlations with >1 composite, low IUR, or content covered elsewhere in the survey (Table 4). One item (“did you ever have a problem?”) was changed to a global measure.
TABLE 4: Final Items and Dispositions
Surprisingly, the item “…did you feel there were enough nurses and aides in this nursing home?” had the highest IUR at 0.88, with an ICC of 0.12. Although this item was considered a nonstandard CAHPS item as described earlier, it was kept because of its ability to discriminate among nursing homes.
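As a rough consistency check, the reported IUR and ICC for this item are linked by the Spearman-Brown relation described in the Methods. The sketch below assumes roughly equal numbers of responses per home, which is only approximately true of the field test:

```python
# Back-of-the-envelope check (a sketch): the reported IUR of 0.88 and
# ICC of 0.12 are linked by the Spearman-Brown relation
#   IUR = n * icc / (1 + (n - 1) * icc),
# where n is the average number of respondents per nursing home.

icc, iur_reported = 0.12, 0.88

# Solve the relation for the implied n.
n_implied = iur_reported * (1 - icc) / (icc * (1 - iur_reported))
print(round(n_implied, 1))  # 53.8 respondents per home

# Roughly in line with the field test: 885 responses across 15 homes.
print(round(885 / 15, 1))  # 59.0
```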
DISCUSSION
Unlike other CAHPS surveys, the CAHPS nursing home family member survey was developed to solicit information from respondents who do not directly receive care. This survey complements the resident survey and provides a way to address the care experience of the most vulnerable residents in nursing homes: those who are too cognitively impaired to respond for themselves.
The challenges inherent in creating such a survey are to include constructs important to consumers and to ask about concepts with which family member respondents have direct experience. Focus group findings indicated that family members defined quality of care in some ways for which they would not have the relevant experience to make meaningful judgments. The survey therefore addressed constructs a family member could judge.
Cognitive testing identified issues such as interpreting items inconsistently, ambiguity of terms or items, lack of direct experience relating to the question, and redundant concepts in the eyes of respondents. Testing also identified items that were proxies for constructs for which nursing home residents would be the best source of information. Items that were deemed important by family members and stakeholders were refined through cognitive testing so that respondents could evaluate the concept based on their own experiences.
All composites but “Meeting Basic Needs” had high levels of discrimination. Surprisingly, an atypical CAHPS item (how often respondent “felt” there were enough nurses and aides) provided the highest discriminatory value. This item may have effectively captured an important construct by filtering an objective measure through feelings.
Developed using formative research to draft items, cognitive testing to refine them, psychometric analyses, and technical expert input, the final family member survey is well tested, valid, and reliable. Providers and health policy makers can use it to measure the quality of nursing home care from the family member’s perspective. Its mail or phone administration is an advantage—it is less expensive than the in-person administration required for the long-stay resident survey. Family members often have a large decision-making role, thus information obtained from those with sufficient experience with the home provides value for individuals and other family members who will make similar decisions.
These 3 CAHPS surveys—the family member survey and the 2 resident surveys for long-stay and discharged short-stay residents—measure different perspectives on nursing homes.39 The long-stay survey provides the perspective of residents able to respond about their care; the short-stay survey presents a view from persons receiving rehabilitation. The family member survey represents the experiences of those residents receiving the most days of care who would otherwise be unable to voice their experiences.
ACKNOWLEDGMENT
The authors thank Dr Ron Hays for his suggestions that improved the quality of this manuscript.
REFERENCES
1. Crofton C, Lubalin J, Darby C. Foreword. Med Care. 1999;37:MS1–MS9
2. Epstein AM, Hall JA, Tognetti J, et al. Using proxies to evaluate quality of life: can they provide valid information about patients’ health status and satisfaction with medical care? Med Care. 1989;27:S91–S98
3. Hays RD, Vickrey B, Hermann B, et al. Agreement between self-reports and proxy reports of quality of life in epilepsy patients. Qual Life Res. 1995;4:159–168
4. Neumann PJ, Araki SS, Gutterman EM. The use of proxy respondents in studies of older adults: lessons, challenges, and opportunities. JAGS. 2000;48:1646–1654
5. Todorov A, Kirchner C. Bias in proxies’ reports of disability: data from the National Health Interview Survey on disability. Am J Public Health. 2000;90:1248–1253
6. Kane RA, Kling KC, Bershadsky B, et al. Quality of life measures for nursing home residents. J Gerontol A Biol Sci Med Sci. 2003;58:240–248
7. Castle N. Are family members suitable proxies for transitional care unit residents when collecting satisfaction information? Int J Qual Health Care. 2005;17:439–445
8. Kane RL, Kane RA, Bershadsky B, et al. Proxy sources for information on nursing home residents’ quality of life. J Gerontol B Psychol Sci Soc Sci. 2005;60:S318–S325
9. Centers for Medicare and Medicaid Services. Nursing Home Data Compendium, 2010 edition. Available at: http://www.cms.gov/CertificationandComplianc/Downloads/nursinghomedatacompendium_508.pdf. Accessed December 7, 2011
10. Sangl J, Buchanan J, Cosenza C, et al. The development of a CAHPS instrument for nursing home residents (NHCAHPS). J Aging Soc Policy. 2007;19:63–82
11. Çalikoğlu Ş, Christmyer C, Kozlowski B. My eyes, your eyes-the relationship between CMS five star rating of nursing home and family rating of experience of care in Maryland. J Health Q. 2011 doi: 10.1111/j.1945-1474.2011.00159.x
12. Frentzel E, Evensen C, Keller S, et al. CAHPS Survey for Family Members of Nursing Home Residents. Report Submitted to AHRQ Under Grant Number 1 U18 HS13193-01, the CAHPS II Grant. 2008 Washington, DC American Institutes for Research
13. Congdon JG, Magilvey JK, Jones KR, et al. Quality Factors in Nursing Home Choice. Report Submitted to AHRQ Under Grant # R18 HS10926-03. 2004 Denver, Colorado University of Colorado Health Sciences Center
14. Frentzel E, Dardess P, Carman K, et al. Reporting nursing home quality: results from focus groups with family members involved in choosing a nursing home. Report to AHRQ under contract grant number 1 U18 HS13193-01; 2005
15. Morgan DL Successful Focus Groups: Advancing the State of the Art. 1993 Thousand Oaks, CA Sage Publications
16. Levine R, Fowler F, Brown J. Role of cognitive testing in the development of CAHPS Hospital survey. Health Serv Res. 2005;40:2037–2057
17. Cohen J Statistical Power Analysis for the Behavioral Sciences. 1988 2nd ed Hillsdale, NJ Lawrence Erlbaum Associates
18. Rubin D. Inference and missing data. Biometrika. 1976;63:581–592
19. Rubin D Multiple Imputation for Nonresponse in Surveys. 1987 New York John Wiley & Sons Inc.
20. Allison PD. Imputation of categorical variables with PROC MI. Paper 113-30. Proceedings of the Thirtieth Annual SAS® Users Group International Conference. 2005 Cary, NC SAS Institute Inc
21. Horton NJ, Lipsitz SR, Parzen M. A potential for bias when rounding in multiple imputation. Am Stat. 2003;57:229–232
22. Steiger JH. Structural model evaluation and modification. Multivar Behav Res. 1990;25:173–180
23. Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling. 1999;6:1–55
24. Guttman L. Some necessary and sufficient conditions for common factor analysis. Psychometrika. 1954;19:149–162
25. Cattell RB. The scree test for the number of factors. Multivar Behav Res. 1966;1:245–246
26. Thurstone LL The Vectors of the Mind: Multiple Factor Analysis for the Isolation of Primary Traits. 1935 Chicago University of Chicago Press
27. Hays RD, Hayashi T, Carson S, et al. User’s Guide for the Multitrait Analysis Program (MAP). 1988 Santa Monica, CA The RAND Corporation, N-2786-RC, 1088
28. Hays RD, Hayashi T. Beyond internal consistency reliability: rationale and user’s guide for multitrait scaling analysis program on the microcomputer. Behav Res Meth Instr Comp. 1990;22:167–175
29. Ware JE, Harris WJ, Gandek B, et al. MAP-R Multitrait/Multi-Item Analysis Program—Revised. 1997 Boston, MA Health Assessment Lab
30. Howard KI, Forehand GG. A method for correcting item-total correlations for the effect of relevant item inclusion. Educ Psychol Meas. 1962;22:731–735
31. Campbell DT, Fiske DW. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol Bull. 1959;56:81–105
32. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297–334
33. Winer BJ Statistical Principles in Experimental Design. 1970 New York McGraw-Hill
34. Zaslavsky AM, Buntin MJB. Using survey measures to assess risk selection among Medicare Managed care plans. Inquiry. 2002;39:138–151
35. Hays RD, Revicki D. Reliability and validity (including responsiveness). In: Fayers P, Hays R, eds. Assessing Quality of Life in Clinical Trials: Methods and Practices. 2005 2nd ed Oxford Oxford University Press:41–53
36. Nunnally JC Psychometric Theory. 1978 2nd ed New York McGraw-Hill
37. Spearman C. Correlation calculated from faulty data. Br J Psychol. 1910;3:271–295
38. Brown W. Some experimental results in the correlation of mental abilities. Br J Psychol. 1910;3:296–322
39. Kasper J Who Stays and Who Goes Home: Using National Data on Nursing Home Discharges and Long-stay Residents to Draw Implications for Nursing Home Transition Programs. 2005 Washington, DC Kaiser Commission on Medicaid and the Uninsured