Next, the program directors were instructed to identify sources from which they could collect data to track their clinical performance around the selected measures. The program directors required significant assistance with data source identification, as many, if not most, presumed that they would have to initiate or create their own manual data-collection processes and that each program would have to marshal personnel and time resources to accomplish such a task. Program directors and faculty were often overwhelmed when considering quality measures because they did not know how or by whom the large volumes of available data were collected in hospitals and clinics. Further, they often had trouble seeing how data collection can be built into their daily work or that, in many cases, it already is. An important part of beginning the data collection process was orienting the program directors to the extent of data that already exist in the health care delivery system and connecting them to the appropriate data sources—especially appropriately constructed electronic data queries. In November 2006, faculty proceeded with clinical quality data collection, on the basis of the indicators and data sources the program directors had previously identified.
Because neither medical education nor health care delivery is done in isolation, clinical outcomes in resident evaluation should be used to assess a resident’s performance as reflective of his or her participation in the health care delivery team. The data collected for the selected clinical quality indicators provide additional inputs for resident assessment at both midyear and end-of-year evaluations. Here, the program directors have struggled with the challenge of using data reporting and analysis that does not identify the individual resident provider. In a separate initiative, our hospitals have moved from reporting on quality measures at department or clinical service levels to individual faculty and staff levels. However, without the ability to query an electronic medical record, performance data reported at the resident-specific level are currently not available. Another issue that makes it difficult to track resident performance is the lack of clarity in assigning responsibility for work and decisions within a team of residents. For example, if an intern writes an order for aspirin for a patient with acute myocardial infarction, who gets the credit and feedback—the intern who writes the order, or the senior resident who tells the intern to write the order? Here, we have begun to provide education and guidance to the program directors on how to use aggregate data for the service at the team level to inform and assist the residents in understanding their individual performance and improvement in performance over time.
Programmatic improvements, for instance, in the form of curriculum modifications, are driven by clinical outcomes that are below benchmark across the residents. In this case, data for the selected clinical quality indicators provide additional inputs to the annual educational effectiveness evaluation for a particular program, as well as to the program assessments in the ACGME-required midaccreditation cycle internal review process and the continuous quality improvement monitoring that follows the internal review. Our institution’s process for tracking progress on issues identified at internal reviews and/or site visits has been expanded to include discussion of the program’s selected clinical measures. It gives the program director opportunity to have feedback on the measures selected, the data collected, and the application of both in resident and program evaluation, and it allows the program director the opportunity to ask questions and get advice and assistance for integrating the clinical indicators in the educational process.
The Tiered Strategy for Indicator Selection
Selecting indicators from the first tier was most preferable, but program directors could move through the four tiers, considering the availability of measures from each tier, to ensure that they selected the most widely agreed-on and appropriate indicators of success in their particular program or specialty. We describe each tier in detail below.
National consensus standards
Preferably, a set of clinical indicators for educational programs would always be aligned with the set of national consensus standards already selected for a clinical specialty, major diagnostic group, or area of care. To start, a subset of indicators may be selected for a particular program on the basis of national standards while program leaders identify data sources and data-collection processes and test and refine reporting methods to find those that work best for their program and institution.
Working with indicators that are consistent with known consensus standards serves several purposes. First, it puts the program in concert with other programs on a national level, using the same definitions, criteria, and comparable benchmarking. Second, it places the institution and its faculty in a ready or more competitive position for the data and reporting that pay for performance will require. Third, it exposes the trainees to the quality indicators, data feedback, and performance framework with which they will be working for much, if not all, of the rest of their professional lives. Therefore, part of our duty in training them is to give them the data analysis and quality improvement tools they will need to apply to their practice-based learning and systems-based practice.
The National Quality Forum (NQF) is a quasi-governmental organization that rigorously evaluates performance measures and that is regarded as the gold standard for performance measure acceptance, representing national endorsement. The NQF has already published consensus standards for one specialty (cardiac surgery) and one major diagnosis (adult diabetes), with cancer care consensus standards under development. In addition, the NQF has endorsed quality consensus standards by location of care delivery—hospital care,2 ambulatory care,3 nursing home care, and home health care. Child health care measures are also under consideration, among others.4
The AQA Alliance (formerly the Ambulatory Care Quality Alliance) is another national leadership entity involved in establishing performance standards. This organization, which evaluates each set of performance measures, has the broadest array of stakeholders and the strong support of the Centers for Medicare & Medicaid Services (CMS) and the Joint Commission. If a set of performance measures is approved by the AQA Alliance, insurers have agreed to use the measure set in any quality initiative they develop, which ensures that physicians are not bombarded with different rating schemes and different criteria from different insurers. The AQA Alliance has also formed a liaison with the Hospital Quality Alliance, which focuses entirely on quality measurement at the hospital level. These two alliances form a group that meets regularly with the secretary of health and human services.
CMS is also now contributing to the identification of quality measures by way of its initial foray into identification of quality indicators that will be held up as national standards in the Physician Quality Reporting Initiative—the voluntary reporting initiative described as the precursor to “pay for performance.”5
National specialty society-selected measures
There is a good deal of work underway at the national societal level to identify or develop standards or standardized indicators for quality of care, building on the evidence of the literature. Ideally, it is with input from and representation of the specialty societies that the NQF is able to endorse sound consensus standards that make good sense clinically and facilitate the needs and demands of other stakeholders such as patients, payers, and accreditation bodies. So, when the NQF has not yet had the opportunity to see to the indicators for a given specialty or diagnostic area or area of care pertaining to a given GME program, then that program should look next to the national quality leadership within its own society.
The American Medical Association Physician Consortium for Performance Improvement is charged with developing performance measures for the medical specialties. In contrast to the AQA Alliance, it consists entirely of physicians and American Medical Association staff. The consortium works at the level of the science of performance measure development and guides a specialty society through the process of identifying fair and meaningful measures for use in measuring quality.
The Surgical Quality Alliance (SQA) is the quality arm of the American College of Surgeons (ACS). Its purpose is to shepherd surgical specialty societies through the process of developing methods of quality measurement and applying those methods to improve quality. At present, all but two surgical specialties are represented on the SQA, and this organization also consists entirely of physicians and ACS staff.
Examples of specialty societal leadership in quality measurement endeavors include, but are not limited to, the ACS and the American Gastroenterological Association.6,7 In addition, there are other bodies of leadership in the clinical specialty arena that have developed and tested quality indicators. A premier example of such efforts is the Veterans Administration (VA) work on its National Surgical Quality Improvement Program (NSQIP). The ACS is now collaborating with VA surgical leaders to build on the work done through NSQIP to apply these quality indicators and standards beyond the VA.8
Local, institutional, or regional initiatives
Lacking established national consensus standards and well-developed specialty society work in quality indicators and measurement standards, program and institution leaders would do well to explore what quality- and performance-improvement endeavors are in place at the local, institutional, or regional levels.
The University of Florida College of Medicine and Shands Health Care Corporation facilities established in 2004 a formal agreement known as the Academic Quality Support Agreement. This alliance tracked and reported 69 indicators reflecting a broad spectrum of quality measures. These indicators reflect quality of care across inpatient and outpatient/ambulatory care, and across specialties, with a number of interdisciplinary or shared indicators, as well as a number of indicators that apply to all physicians. The endeavor provided a platform to drive protocol development, standardization of care processes, and system efficiencies, and it also provided feedback on mortality and major morbidities for selected diagnoses and major procedures.
It is useful to investigate whether one’s institution already participates in a local or regional reporting effort for benchmarking performance against like institutions or those in proximity. This is an appropriate place to start when national consensus or specialty society standards do not exist. If program leadership is not aware of the institutional quality measures and audits underway, then it is appropriate to explore this with the institution’s quality management and compliance staff.
Or select what matters …
Should a program director be unable to identify clinical quality indicators through any of the aforementioned avenues, then it falls to the program director, with the assistance of fellow faculty and the designated institutional official, to select quality indicators for the program and specialty that make clinical “sense.”
The first step in selecting quality measures to represent an educational program is identifying the major diagnostic areas of the specialty—the top three to five high-frequency, high-risk, or high-volume features of the specialty. These features represent some of the major “must haves” of the training program, as applies to expectations for resident or fellow competence and accomplishment and knowledge during training. After these top priorities have been identified, the faculty and program director can identify appropriate process and outcome measures, or proxy measures for those desired.
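The tiered strategy described above can be sketched as a simple preference-ordered lookup. This is a minimal illustration, assuming that a program takes measures from the most preferred tier that has any available; in practice, program directors may weigh measures from several tiers. The function name, the data layout, and the example measure names are hypothetical and do not come from the text.

```python
from typing import Dict, List, Optional, Tuple

# The four tiers named in this section, in order of preference.
TIERS: List[str] = [
    "national consensus standards",
    "national specialty society-selected measures",
    "local, institutional, or regional initiatives",
    "program-selected measures",
]


def select_indicators(
    available: Dict[str, List[str]]
) -> Tuple[Optional[str], List[str]]:
    """Return the most preferred tier that offers measures, with its measures.

    `available` maps a tier name to the measures a program director has
    found for that tier (hypothetical structure, for illustration only).
    """
    for tier in TIERS:
        measures = available.get(tier, [])
        if measures:
            return tier, measures
    # No tier yielded any measure.
    return None, []


# Hypothetical program with no national or specialty society measures yet.
example = {
    "local, institutional, or regional initiatives": ["example institutional measure"],
    "program-selected measures": ["example faculty-chosen measure"],
}
chosen_tier, chosen_measures = select_indicators(example)
```

In this sketch the faculty-chosen measure is ignored because a higher tier already supplies one, which mirrors the stated preference for the most widely agreed-on indicators.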
Identify Data Sources and Data Collection Processes
In identifying appropriate data sources, program directors should assess the national or regional resources that are already available and, perhaps, even already in use. If a specialty-specific validated national or regional clinical database or registry exists, participating in it is paramount. Doing so provides a vehicle for validated data collection from which appropriate risk-adjusted clinical outcomes can be derived, and a large enough dataset for solid, critical study and research. Another value of a large database or registry is the substantially greater potential for complete and validated data. Access to these data can support studies with sufficient statistical power to draw strong conclusions about the impact of care processes on outcomes of interest.
Many institutions and/or departments have internal quality audits and performance improvement endeavors that are already tracking and reporting selected quality measures. Most institutions and their quality management departments have extensive data collection and auditing processes already in place. It is important to realize that a program may already be collecting data for clinical quality assessment and review that can readily be applied to the educational mission as well.
Local or institutional data collection can be limited by the relatively small numbers in the dataset. Because of this, it is difficult to provide data feedback with any statistically significant conclusions on variance. The labor-intensive nature of data collection, where data are not available via an electronic database or health record, often translates into data only available by an audit of a sample of patients’ records. This methodology may be simply the best currently available for the time and circumstances, but it must be recognized that such a methodology can provide only incomplete information on the performance by all caregivers involved in the measure and that statistical performance is easily affected by the sample selection.
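The effect of small numbers on data feedback can be illustrated by comparing confidence intervals for the same observed compliance rate at two sample sizes. The sketch below uses the Wilson score interval, which is our choice of method and is not named in the text; the audit counts are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: the same 90% observed rate from a chart audit of
# 20 records and from a registry extract of 1,000 records.
audit_low, audit_high = wilson_interval(18, 20)
registry_low, registry_high = wilson_interval(900, 1000)

audit_width = audit_high - audit_low
registry_width = registry_high - registry_low
```

The interval from the 20-record audit is several times wider than the one from the larger dataset, so a difference between two services or two time periods is much harder to distinguish from sampling variation.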
Data for quality measures, in cases of inadequate clinical volume for demonstrating satisfactory process or outcomes, may be provided by simulation as an alternative to or in combination with clinical data. Simulation is beginning to evolve as a training tool and is undergoing increasing study and validation for its effectiveness in training and in testing skills, judgment, and teamwork aspects of quality performance.
Challenges of Implementation
Whose performance is really being measured?
Program directors commonly express concern about not being able to directly attribute a selected process or outcome quality measure to a particular resident or fellow. However, virtually all of health care delivery is a team activity and, to varying degrees, relies on multiple stakeholders. This concept is reinforced by the study of one’s own microsystem of health care delivery9 and by the study and application of systems-based practice. It is our experience that, whether discussing clinical outcomes and performance at a medical staff or faculty level or at a GME level, clinicians regularly discount or express dissatisfaction with data that are not reported at the individual physician level. Using aggregate data to study and improve performance of the team as a whole is still a paradigm to be embraced and taught.
Medical education does not occur in isolation, and most process and outcomes measures represent the group milieu in which teaching and learning occur. GME, like clinical care delivery, involves teams and groups of various sizes and compositions to effect the delivery of each specialty’s care and to facilitate interaction and collaboration with other caregivers as consultants and multidisciplinary care teams. So, it follows that quality measures applied to the educational process would also reflect the individual’s roles as part of a team and microsystem, all of which are part of the clinical specialty learning process. Recognizing one’s role and responsibility in that team and microsystem also helps the physician attach value to participation and leadership in the team, and contribution to and influence on the microsystem to drive improvement.
How do we effectively apply general or service data?
Even though practicing clinicians may have become familiar with quality measures and performance data feedback in recent years in terms of their own practices, few have yet become used to tying those measures and data to the GME process. More than new measures and data, this will take a new way of thinking about the data we already have. It will require that we recognize and reinforce the connection between clinical care and the educational curriculum and evaluation process. This is especially true for broadly stated measures, such as patient satisfaction. Patient satisfaction reports by clinical service or hospital unit usually report patients’ responses to questions about physicians in general or as a group, but do not specify satisfaction about each physician separately. Similarly, some key clinical indicators, such as pain management selected by medical oncology, are multifactorial, influenced by the activities of numerous types of providers—physicians, nurses, pharmacists, and therapists, to name a few. Though not resident specific, these types of indicators are still very useful to the GME evaluation process. Such indicators introduce the residents to thinking about their individual responsibility for and contribution to systems-based practice and measurement thereof. At evaluation, the program director and resident or fellow have opportunity to discuss the development of the trainee’s role as physician leader in performance improvement of care delivery.
Data Feedback and Utilization—Measuring What Matters
Once quality indicators are selected, data sources are identified, and data collection is underway, program directors must address the application of data feedback. In other words, how will the data be reported and used as part of educational evaluation in GME? In our experience, collected data have a twofold application to educational effectiveness evaluation.
First, we incorporate data feedback into the resident’s or fellow’s regular evaluation, which takes place at least every six months. The data report on clinical outcomes gives physicians-in-training feedback on the patient outcome and satisfaction evidence for their performance in the six general competencies. Thus, performance evaluation extends beyond the assessment of the trainee’s knowledge, work ethic, communication, and contribution to discussion and conferences. Providing clinical outcomes feedback to trainees begins to instill in them the sense of personal ownership of their role in those outcomes, and it also provides information on which practice-based learning and system performance improvement can and should be based. At each evaluation, besides assessing performance during a specific period of time, the program director and resident or fellow should be able to track improvement throughout training in the data trends over time.
The second utility of clinical outcomes applied to medical education is the context in which the strength of a program’s curriculum can be assessed. It is critical to identify gaps in care. Measures that are consistently not meeting target should signal areas of weakness in the curricular plan or the venue and means by which a key portion of the curriculum (as reflected by the corresponding clinical measure) is presented. Additional or different educational processes can then be applied—for instance, additional didactic lectures related to that topic of care, or simulation scenarios to enhance the educational experience and foster better integration of knowledge and judgment. Program-wide clinical indicator monitoring also identifies those individuals who are struggling in multiple or all measures, and it can direct individualized counseling, remediation, and development assessment. The service- or team-level clinical outcomes measured when a resident is on a particular rotation provide the basis for individual resident feedback, even when the specific contribution of a resident to a measure may not be quantifiable. Figure 5 displays both utilities in programmatic evaluation, illustrating identification of need for curricular changes as identified by one measure that is low across multiple trainees, versus individual trainee counseling and remediation when one trainee scores lower than others on multiple measures.
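The two uses just described can be sketched as a pair of checks over a table of scores. This is a minimal illustration, assuming scores can be associated with each trainee (for example, through the service- or team-level data for the rotations the trainee worked on) and that "low across multiple trainees" and "lower on multiple measures" are read as "below target for more than half." The threshold, the function name, and all values are hypothetical.

```python
from typing import Dict, List, Tuple


def flag_gaps(
    scores: Dict[str, Dict[str, float]],
    targets: Dict[str, float],
    share: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Separate possible curricular gaps from individual concerns.

    scores[trainee][measure] is an observed rate; targets[measure] is the
    benchmark. A measure is flagged when more than `share` of trainees fall
    below target; a trainee is flagged when below target on more than
    `share` of the measures.
    """
    trainees = list(scores)
    measures = list(targets)
    # For each trainee, the set of measures on which they are below target.
    below = {
        t: {m for m in measures if m in scores[t] and scores[t][m] < targets[m]}
        for t in trainees
    }
    curricular = [
        m for m in measures
        if sum(m in below[t] for t in trainees) > share * len(trainees)
    ]
    individual = [t for t in trainees if len(below[t]) > share * len(measures)]
    return curricular, individual


# Hypothetical data: measure_1 is low for most trainees, while resident_D
# is low on most of the other measures.
targets = {"measure_1": 0.85, "measure_2": 0.85, "measure_3": 0.85, "measure_4": 0.85}
scores = {
    "resident_A": {"measure_1": 0.70, "measure_2": 0.95, "measure_3": 0.92, "measure_4": 0.90},
    "resident_B": {"measure_1": 0.72, "measure_2": 0.91, "measure_3": 0.94, "measure_4": 0.88},
    "resident_C": {"measure_1": 0.68, "measure_2": 0.93, "measure_3": 0.90, "measure_4": 0.89},
    "resident_D": {"measure_1": 0.93, "measure_2": 0.60, "measure_3": 0.65, "measure_4": 0.55},
}
curricular_gaps, trainees_to_counsel = flag_gaps(scores, targets)
```

With these made-up values, measure_1 is flagged as a possible curricular gap and resident_D as a candidate for individual counseling; a real program would set targets and thresholds from its own benchmarks.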
There is much work yet to do in refining the selection of optimal quality indicators and benchmarked targets. It is, therefore, important for physicians, both clinician leaders and education leaders, to be sure that they, or their specialty society representatives, have a “seat at the table” when CMS, the NQF, or both are determining their specialty’s consensus standards. It is imperative that physicians be leaders in the process of selecting the measures and definitions that make good clinical sense to practitioners and that measure what matters. It is far better to be a leader or participant in the process than to be a passive victim. Academic clinicians now act not only on behalf of themselves and their patients but also on behalf of the future providers they are training. This is the ultimate opportunity for clinicians to impact quality of care and quality improvement through health care advocacy and influence on health policy.
The ongoing challenge for leaders and educators is to identify how a resident’s action and judgment can be realistically linked with a patient outcome. We propose that this effort is an important aspect of orienting trainees to using data for monitoring and improving care processes and outcomes throughout their careers. Furthermore, this is an important first step to preparing medical trainees to “own their data,” as familiarity and facility in working with data will impact their lifelong practice-based learning and systems-based practice and data-driven clinical decision making, maintenance of certification, and likely, eventually, their reimbursement in the form of pay for performance. This will foster the integration of quality of care and quality improvement with resident practice-based learning and faculty scholarship in clinical teaching. We must train not just for medical knowledge, but for medical practice.
7 Brotman M, Allen JI, Bickston SJ, et al. AGA Task Force on Quality in Practice: A national overview and implications for GI practice. Gastroenterology. 2005;129:361–369.
© 2008 Association of American Medical Colleges
9 Nelson EC, Batalden PB, Huber TP, et al. Microsystems in health care: Part 1. Learning from high-performing front-line clinical units. Jt Comm J Qual Improv. 2002;28:472–493.