To demonstrate high standards of medical practice that protect the public, regulators and policy makers should use a multidimensional assessment of professional competence that is valid and reliable.1,2 To this end, the American Board of Medical Specialties developed a comprehensive framework for practicing physicians called maintenance of certification (MOC) that integrates self-assessment of lifelong learning and practice performance with a secure examination evaluating medical knowledge and judgment.1 Assessments of medical knowledge are known to be reliable and valid, but assessments of physicians’ practice face many methodological challenges.3,4 Yet consumers seek better ways of distinguishing among physicians in light of studies showing inadequate quality of health care in the United States.5 A better understanding of the relationship between practice systems and quality of care, together with study designs that account for physician-level clustering and patient case mix, has addressed some of these methodological challenges.6,7 However, challenges remain in understanding the complex interactions in an office setting and in producing a reliable and valid assessment of physician clinical practice.3
The American Board of Internal Medicine developed a tool called the practice improvement module (PIM) to help physicians assess and improve performance in practice, including diabetic patient care.8 A sample PIM can be viewed at www.abim.org/online/pim/demo.aspx. The Diabetes-PIM provides a comprehensive assessment of a physician’s practice using three data-collection components that examine patient satisfaction with diabetes care, processes and outcomes of diabetes care, and the nature of the office’s practice system. The primary objectives of the study are to assess the psychometric properties of the three components of the Diabetes-PIM, to explore whether composite measures used to assess physician performance are more reliable than individual measures, and to assess the validity of the composite measures.
The Diabetes-PIM, an elective module of MOC, contains three data sources: (1) a survey of patients’ views regarding quality of and access to medical care (patient survey), (2) an audit of medical records (chart audit), and (3) a questionnaire about the practice’s microsystem (practice systems survey). Questions in the patient survey were adapted from the Picker patient and CAHPS surveys9,10; the chart audit measures were selected from evidence-based guidelines11; and questions in the practice systems survey were based on Wagner’s Chronic Care Model and the Institute for Healthcare Improvement’s Idealized Office Design project.12,13
Physician-level data were reported for all three data sources, and patient-level data were reported for the patient survey and chart audit. Physicians were instructed to sample diabetic patients seen within the past 12 months who had been with the practice for at least one year. A prospective, sequential sample was recommended: physicians asked the next 50 eligible patients to complete the survey and audited the medical charts of 25 of these patients. Between-sample differences in patient characteristics were compared using Student t tests adjusted for the clustering of patients within physicians. Power for tests of model coefficients ranged from 22% to 98% for 620 physicians with about 20 patients each, at P = .01.
For the patient survey and chart audit, principal factor analysis was used to aggregate items into composites, using the physician as the unit of analysis. Although factor analysis was initially attempted for the practice systems survey, the resulting structure was too complex, even with 626 physicians. Instead, subsets of items were organized based on Wagner’s Chronic Care Model.12 Cronbach’s α was used to assess the internal consistency reliability of each composite. Each composite was calculated as the equally weighted mean of its items.
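The internal consistency calculation described above can be sketched as follows. This is a minimal illustration of Cronbach’s α with made-up ratings, not the study’s actual items or data.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha. item_scores: list of k lists, each holding one
    item's scores across the same n respondents (n >= 2)."""
    k = len(item_scores)
    # Total score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_var_sum = sum(variance(scores) for scores in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical composite of three satisfaction items rated by four patients.
items = [
    [4, 3, 3, 1],
    [4, 4, 3, 2],
    [5, 3, 4, 1],
]
alpha = cronbach_alpha(items)  # about 0.93 for these illustrative data
```

Because the items covary strongly across respondents, α is high; unrelated items would drive the total-score variance toward the sum of item variances and α toward zero.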
Reliability of physician means from the patient survey and chart audit was calculated using the patient sample size within physicians and estimates of the intraclass correlation coefficient (ICC). High ICC values mean that patient measures within a physician’s practice tend to be more similar than measures compared across physician practices. Two-sided 95% interval estimates for the ICCs assessed the magnitude of physician effects; ranges that included zero were deemed unreliable.
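One common ICC estimator is the one-way random-effects ANOVA form for a balanced design; the paper does not specify which estimator was used, so the sketch below is illustrative only, with hypothetical goal-attainment data.

```python
from statistics import mean

def icc_one_way(groups):
    """ICC(1) from a balanced one-way random-effects ANOVA.
    groups: list of physicians, each a list of n patient measures."""
    k = len(groups)          # number of physicians
    n = len(groups[0])       # patients per physician (balanced design)
    grand = mean(m for g in groups for m in g)
    # Between-physician and within-physician mean squares.
    msb = n * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    msw = sum((m - mean(g)) ** 2 for g in groups for m in g) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)

# Hypothetical HbA1c goal attainment (1 = at goal): 3 physicians x 4 patients.
data = [[1, 1, 1, 0], [0, 0, 1, 0], [1, 1, 1, 1]]
icc = icc_one_way(data)
```

A larger between-physician mean square relative to the within-physician mean square yields a larger ICC, matching the interpretation in the text: patients within a practice resemble one another more than patients across practices.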
Mixed linear (hierarchical) models, which handle the complexities of patients clustered within physicians, were used to predict the satisfaction with care and clinical outcomes composites, using patient- and practice-level adjusters. Stepwise regression determined inclusion in the model, starting with the full model and stopping at the minimum Akaike information criterion (AIC) value among a series of nested models.14 All analyses were conducted using SAS statistical software.
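The AIC-guided comparison of nested models can be sketched in a deliberately simplified form: ordinary least squares on simulated data, ignoring the patient-within-physician clustering that the study’s mixed models handle. All names and data below are illustrative, not the study’s models.

```python
import math
import random

def ols_rss(x, y, with_slope):
    """Residual sum of squares for y ~ 1 (intercept only) or y ~ 1 + x."""
    n = len(y)
    ybar = sum(y) / n
    if not with_slope:
        return sum((yi - ybar) ** 2 for yi in y)
    xbar = sum(x) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def aic(rss, n, k):
    """Gaussian AIC up to an additive constant shared by nested models."""
    return n * math.log(rss / n) + 2 * k

rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [2 * xi + rng.gauss(0, 1) for xi in x]  # predictor carries real signal

n = len(y)
aic_null = aic(ols_rss(x, y, False), n, k=1)  # intercept only
aic_full = aic(ols_rss(x, y, True), n, k=2)   # intercept + predictor
# The informative predictor reduces RSS far more than its 2-point
# parameter penalty, so the fuller model wins on AIC.
```

Stepwise selection simply repeats this comparison across a series of nested models, keeping the model with the minimum AIC.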
Six factors resulted from the 27 patient survey questions: (1) satisfaction with care (very good or excellent) for patient’s view of physician’s capacity for understanding living with diabetes, encouraging questions, providing information on diet, medication, side effects, and foot care, teaching blood sugar monitoring, and overall diabetes care, (2) access to care (no problem) for patient’s view on effort required to schedule, reach practice, obtain prescription refills, referrals, and test results, (3) knowledge of care received (yes) for patient’s knowledge of cholesterol and HbA1c test, eye and foot exam, blood pressure, and flu shot, (4) patient’s overall health using a self-rating of health status (1–5 scale), (5) self-care knowledge (known) for patients’ knowledge of regular blood sugar monitoring and three blood sugar levels/symptoms, and (6) diet habit for patient’s adherence to eating plan (followed) and reading nutrition facts (yes).
Four factors resulted from the 16 chart audit variables: (1) clinical outcomes and treatment for patients who met the following goals: HbA1c <7%, LDL <100 mg/dl, triglycerides <150 mg/dl, BP <130/80 mm Hg, and aspirin and statin treatment appropriately prescribed for eligible patients. This factor was split into separate composites because goals and treatment are conceptually different. Clinical outcomes goals were based on excellent rather than competent care, to emphasize quality improvement. The other factors were (2) clinical processes for patients receiving annual HbA1c, lipids, eye, and foot exams, (3) exercise/nutrition plan for patients who received medical nutrition therapy and a physical activity plan, and (4) factors limiting self-care for patients who had psychiatric illness or cognitive impairment, problems with adherence, other medical conditions, and other social factors.
The practice systems survey, comprising 94 questions, addressed whether the practice implemented elements important to idealized practice.12,13 Questions were scored as 1 if yes and 0 otherwise. The six composite measures were (1) care management—use of work aids (protocols and reminders) to help with planning and monitoring patient care, (2) patient activation and communication—directed communication with patients by staff and physician and assessment of patients’ readiness to change, (3) modes of communication—system capacity to track scheduling issues, encourage patient contact with the practice, and give guidance for providing care in urgent situations, (4) practice-based learning and improvement—how the practice team can improve outcomes and processes of care, (5) information management—quality of patient record-keeping system and referral tracking, and (6) environment—staff teamwork and practice efficiency.
From the 810 physicians who elected to complete the Diabetes-PIM by the study date, 626 usable modules were returned for a 77% response rate. The mean age of the 626 physicians was 43.8 years (SD = 6.1) with an average of 17.0 years (SD = 6.0) since medical school graduation. Most physicians were in group practice (74%), and 35% were female. Most were general internists (84%); 10% were endocrinologists or nephrologists. Mean age and gender for this sample did not differ substantially from the 30,183 physicians enrolled in MOC. Data include 626 practice systems surveys, 13,965 chart audits, and 12,927 patient surveys. Although patients in the patient survey and chart audit were not necessarily the same, their mean age (62.0 for survey and 61.8 for chart; P = .56) and gender (female = 50.9% for survey and 50.3% for chart; P = .62) were similar. However, the prevalence of smokers in the patient survey (12.5%) was greater than in the chart audit (10.0%) (P < .001).
Table 1 (part 1a) shows means and reliabilities for three patient survey composites. On average, 92% of patients rated access to care as not being a problem, and 77% rated satisfaction with care as very good or excellent. ICCs of the three composites ranged from 0.11 to 0.18; not surprisingly, each is higher than any single item within a composite. For example, the ICCs of items comprising the knowledge of care received composite range from 0.05 to 0.13, but the composite’s value is 0.18.
Table 1 (part 1b) presents the four chart audit composites. On average, 51% of patients are achieving excellent outcomes of care, 73% are receiving the required processes of care, and 24% have some factors limiting self-care. Exercise/nutrition plan has a high ICC (0.54), as does clinical processes (0.46), in contrast to a low ICC (0.11) for clinical outcomes. The last two columns of Table 1 show the estimated reliability of the mean if all measures were based on a random sample of 25 patients and an estimate of the patient sample size to obtain a high level of reliability (i.e., ρ = 0.85). Values of less than 25 suggest minimal measurement error, because this was about the sample size used. For clinical outcomes, where the ICC is low, 45 patients per physician are needed to achieve a respectable reliability, whereas clinical processes requires only seven patients. Table 1 (part 2) shows the mean composites and reliabilities of the practice system survey. Each composite represents the mean percentage of quality medical care and organizational features reported implemented in practice. Reliabilities range from 0.65 to 0.89, with an overall reliability of 0.84.
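The sample-size figures above follow from the Spearman-Brown relationship between the ICC and the reliability of a physician’s mean over n patients. The sketch below reproduces them from the reported ICCs (0.11 for clinical outcomes, 0.46 for clinical processes) and the target reliability ρ = 0.85; the small discrepancy with the reported 45 patients presumably reflects rounding of the published ICC.

```python
def reliability_of_mean(icc, n):
    """Spearman-Brown: reliability of a physician's mean over n patients."""
    return n * icc / (1 + (n - 1) * icc)

def patients_needed(icc, target):
    """Patients per physician needed to reach a target reliability."""
    return target * (1 - icc) / ((1 - target) * icc)

# ICCs reported in Table 1, target rho = 0.85.
n_outcomes = patients_needed(0.11, 0.85)   # ~45.8 (reported as 45)
n_processes = patients_needed(0.46, 0.85)  # ~6.7, i.e., 7 patients
rel_25 = reliability_of_mean(0.46, 25)     # well above 0.85 at n = 25
```

This also explains why composites with ICCs near 0.5 show "minimal measurement error" at the audited sample of about 25 charts, while low-ICC outcomes composites do not.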
Results for the patient survey model (Table 2) indicate that, adjusting for patient and physician characteristics, there are significant associations between satisfaction with care and access to care (P < .001), knowledge of care received (P < .001), patient activation and communication (P = .004), and environment (P = .005). Quite respectably, these composites explain 29% of the variance in the model (R = 0.54). Results for the chart audit model (Table 2) indicate that, adjusting for patient and physician characteristics, clinical outcomes is associated with clinical processes (P < .001), and treatment (P < .001). These composites explain only about 6% of the variance in the model (R = 0.24), possibly because patient adjusters are inadequate and because other patient variables (e.g., self-care knowledge) were not available for these patients.
The purpose of the study was to assess the psychometric properties of three components of the Diabetes-PIM, explore the effect of creating composites from individual items, and assess the validity of these composites. Aggregating items into composites improves the reliability for both the patient survey and chart audit, yielding respectable reliabilities ranging from 0.76 for satisfaction with care to 0.97 for exercise/nutrition plan. Had we used single items as measures, these reliabilities would be substantially lower. Clinical processes and exercise/nutrition plan have high ICCs because they are more under the physician’s control, so smaller sample sizes are required to achieve a reliable estimate. Clinical outcomes ICCs are much lower, indicating that outcomes are more complex and likely influenced by patient adherence to recommendations.
Patient adjusters (e.g., overall health or factors limiting self-care) are important for both patient survey and chart audit models, whereas physician characteristics seem less important. The validity of the composites is demonstrated by their meaningful relationships with other measures. Patients demonstrated greater satisfaction with care if a practice was accessed without a problem (e.g., scheduling, obtaining prescription refills), provided better information to patients, patients were aware of the care they received, and there were fewer distinct staff roles (possibly suggestive of a smaller practice setting). Patients had better clinical outcomes if a practice had better clinical processes of care and appropriate treatment. Surprisingly, none of the practice system measures were strongly related to clinical outcomes.
These results should be interpreted with caution. First, participants may not be representative of all physicians caring for diabetic patients, because they are younger and have demonstrated initiative by enrolling in MOC. Second, the chart audit and practice systems survey are self-report data, but because physicians do not need to meet a specific standard of performance, there is no incentive to misrepresent performance. Additionally, prior work has shown strong agreement between trained abstractors and physician audits of the same medical records.8 Third, we cannot ensure that patient surveys were distributed to the same patients whose charts were audited, although patient age and gender were similar for the two data sources. Fourth, the available adjusters are inadequate; socioeconomic and race classifications and a more robust measure of comorbidity are needed. Fifth, the composites were calculated by equally weighting the items; research on an optimal way of weighting items to maximize the sensitivity and specificity of these measures is needed.
Despite these limitations, our initial findings demonstrate that composites require smaller patient sample sizes and result in meaningful and more reliable measures than individual items. In addition, the data show meaningful relationships between composites such as between physician-directed components (e.g., clinical processes and treatments) and clinical outcomes. Patients were clearly more satisfied with care if it was easily accessible and communication about care was good. Future research can explore optimal methods for composite development and a clearer understanding of the structural relationships among composites. It is hoped this analysis will provide a stepping stone to the development of a fair, reliable, and valid assessment of practice performance.
1. Wasserman SI, Kimball HR, Duffy FD. Recertification in internal medicine: a program of continuous professional development. Task Force on Recertification. Ann Intern Med. 2000;133:202–208.
2. Leape LL, Fromson JA. Problem doctors: is there a system-level solution? Ann Intern Med. 2006;144:107–115.
3. Brennan TA, Horwitz RI, Duffy FD, Cassel CK, Goode LD, Lipner RS. The role of physician specialty board certification status in the quality movement. JAMA. 2004;292:1038–1043.
4. Landon BE, Normand ST, Blumenthal D, Daley J. Physician clinical performance assessment: prospects and barriers. JAMA. 2003;290:1183–1189.
5. McGlynn EA, Asch SM, Adams J, et al. Quality of health care delivered to adults in the United States. N Engl J Med. 2003;348:2635–2645.
6. Greenfield S, Kaplan SH, Kahn R, Ninomiya J, Griffith JL. Profiling care provided by different groups of physicians: effects of patient case-mix (bias) and physician-level clustering on quality assessment results. Ann Intern Med. 2002;136:111–121.
7. Nutting PA, Dickinson WP, Dickinson LM, et al. Use of chronic care model elements is associated with higher-quality care for diabetes. Ann Fam Med. 2007;5:14–20.
8. Holmboe ES, Meehan TP, Lynn L, Doyle P, Sherwin T, Duffy FD. The ABIM diabetes practice improvement module: a new method for self assessment. J Contin Educ Health Prof. 2006;26:109–119.
9. Cleary PD, Edgman-Levitan S. Health care quality. JAMA. 1997;278:1608–1612.
11. National Committee on Quality Assurance and American Diabetes Association. Diabetes physician recognition program. Available at: http://web.ncqa.org. Accessed May 1, 2007.
12. Wagner EH, Austin BT, Von Korff M. Organizing care for patients with chronic illness. Milbank Q. 1996;74:511–544.
13. Moen R. A Guide to Idealized Design. Cambridge, Mass: Institute for Healthcare Improvement; 2002.
14. Demidenko E. Mixed Models: Theory and Applications. Hoboken, NJ: Wiley; 2004.