Implementing Achievable Benchmarks in Preventive Health: A Controlled Trial in Residency Education

Houston, Thomas K., MD, MPH; Wall, Terry, MD, MPH; Allison, Jeroan J., MD, MSEpi; Palonen, Katri, MD; Willett, Lisa L., MD; Kiefe, Catarina I., PhD, MD; Massie, F Stanford, MD; Benton, E Cason, MD; Heudebert, Gustavo R., MD, MDP

doi: 10.1097/01.ACM.0000232410.97399.8f
Innovative Curricula

Purpose To evaluate the Preventive Health Achievable Benchmarks Curriculum, a multifaceted improvement intervention that included an objective, practice-based performance evaluation of internal medicine and pediatric residents’ delivery of preventive services.

Method The authors conducted a nonrandomized experiment of intervention versus control group residents with baseline and follow-up of performance audited for 2001-2004. All 130 internal medicine and 78 pediatric residents at two continuity clinics at the University of Alabama School of Medicine, Birmingham, participated. Performance of preventive care was assessed by structured chart review. The multifaceted feedback curriculum included individualized performance feedback, academic detailing by faculty, and collective didactic sessions. The main outcome was difference in receipt of preventive care for patients seen by intervention and control residents, comparing baseline and follow-up.

Results Charts were reviewed for 3,958 patients. Receipt of preventive care increased for patients of intervention residents, but not for patients of control residents. For the intervention group, significant increases occurred for five of six indicators in internal medicine: smoking screening, quit smoking advice, colon cancer screening, pneumonia vaccine, and lipid screening; and four of six in pediatrics: parental quit smoking advice, car seats, car restraints, and eye alignment (p < .05 for all). For control residents, no consistent improvements were seen. There was greater improvement for intervention than for control residents for four of six indicators in internal medicine, and two of six in pediatrics.

Conclusions Using a multifaceted feedback curriculum, the authors taught residents about the care they provide and improved documented patient care.

Dr. Houston is assistant professor of medicine, Divisions of General Internal Medicine and Preventive Medicine, University of Alabama at Birmingham School of Medicine; scientist, Deep South Center for Effectiveness Research, Birmingham VA Medical Center; and scientist, Center for Outcomes and Effectiveness Research, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Dr. Wall is assistant professor of medicine, Division of General Pediatrics, University of Alabama at Birmingham School of Medicine; and scientist, Center for Outcomes and Effectiveness Research, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Dr. Allison is associate professor of medicine, Divisions of General Internal Medicine and Preventive Medicine, University of Alabama at Birmingham School of Medicine; scientist, Deep South Center for Effectiveness Research, Birmingham VA Medical Center; and scientist, Center for Outcomes and Effectiveness Research, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Dr. Palonen is assistant professor of medicine, Division of General Internal Medicine, University of Alabama at Birmingham School of Medicine; and scientist, Center for Outcomes and Effectiveness Research, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Dr. Willett is assistant professor of medicine, Division of General Internal Medicine, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Dr. Kiefe is professor of medicine, Division of Preventive Medicine, University of Alabama at Birmingham School of Medicine; director, Deep South Center for Effectiveness Research, Birmingham VA Medical Center; and co-director, Center for Outcomes and Effectiveness Research, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Dr. Massie is assistant professor of medicine, Division of General Internal Medicine, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Dr. Benton is assistant professor of medicine, Division of General Pediatrics, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Dr. Heudebert is associate professor of medicine, Division of General Internal Medicine, University of Alabama at Birmingham School of Medicine; and scientist, Center for Outcomes and Effectiveness Research, University of Alabama at Birmingham School of Medicine, Birmingham, Alabama.

Please see the end of this report for information about the authors.

Correspondence should be addressed to Dr. Houston, 1530 Third Ave South, FOT 720, University of Alabama at Birmingham, Birmingham, AL 35294; telephone: (205) 934-7997; fax: (205) 975-7797.

Currently, during physicians’ formative training when feedback may be most valuable, the evaluation of their clinical performance is mostly subjective.1 Incongruously, practicing physicians’ performance is often objectively evaluated and compared with peers for quality improvement.2–5 Although there has been considerable research on the effectiveness of multifaceted quality improvement tools such as academic detailing, opinion leaders, and performance audit with feedback among physicians-in-practice,6–9 to our knowledge few studies have addressed these within the context of residency training, and those that have, have shown mixed results.10–16 Today’s trainees will be exposed to quality improvement during their subsequent careers; accordingly, these tools need to be adapted and integrated into residency training.17

Recently, introducing physicians-in-training to objective practice-based performance evaluation became a priority for the Accreditation Council for Graduate Medical Education (ACGME). The new ACGME guidelines require practice-based learning and improvement for all residents.18 Evaluation strategies and curricula developed to meet this requirement assist residents with identifying target areas for improvement, expose them to aspects of their future work environment, and possibly improve the quality of patient care. However, the ACGME has not provided specific guidelines on how to implement practice-based learning within residency training.19,20

We developed and implemented a multifaceted intervention that included an objective practice-based performance evaluation of internal medicine and pediatric residents’ delivery of preventive services. We designed the Preventive Health Achievable Benchmarks (PHAB) Curriculum to provide performance data feedback in comparison with peer-based benchmarks, to teach residents about practice-based learning and quality improvement, and to help residents strategize for improvement. Our goal was to adapt techniques used in outcomes research to residency training. We evaluated the intervention using a controlled trial and hypothesized that the proportion of patients seen by our residents with documentation of receipt of appropriate services would increase from the baseline audit to the follow-up audit. Because other changes in performance may occur as residents advance through residency training, we further hypothesized that improvements in performance would be greater among residents who participated in the curriculum than among those in the control group who did not.

Method

Study design and participants

We used a quasi-experimental design.21 For both the medicine and pediatrics residencies at the University of Alabama School of Medicine, Birmingham, our design included two groups: an intervention group of 112 residents who participated in the PHAB curriculum, and a matched control group of 96 residents who did not. The control group residents graduated from the residency programs at least one year prior to the intervention implementation. We compared the change in receipt of preventive services (follow-up versus baseline) among patients seen by residents in the intervention group with the change for patients seen by the control group of residents. We included all postgraduate year one and two residents in the 2002–03 academic year (internal medicine, pediatric residents, and medicine–pediatrics residents) participating in the PHAB curriculum in our evaluation, as well as all control residents (2000–01 academic year), matched by residency track and postgraduate year, with available medical records.

Because this was an evaluation of a quality improvement performance audit with a feedback curriculum, the institutional review board at our institution allowed exemption of informed consent from the resident subjects, and because patient data were collected without unique identifiers from existing records, a Health Insurance Portability and Accountability Act (HIPAA) waiver and exemption of informed consent were approved at the patient level.

Setting and patient population

The internal medicine and pediatric clinics serve a low income, urban patient population. Residents attend clinic, on average, one half-day per week. Dictated clinic notes for medicine and standardized well-child visit forms for pediatrics, as well as lab and procedure reports, are stored as paper records in the clinics.

Identification of targeted preventive health performance indicators

We developed a list of 18 potential indicators, including those adopted by the Health Plan Employer Data and Information Set (HEDIS) and those related to the U.S. Preventive Services Task Force.22,23 The investigators then used six standard criteria24 to rate each indicator: supported by a strong evidence base, able to be reliably abstracted from medical record data, dependent on resident performance (not based on system or nursing protocols), perceived to have variable performance (i.e., room for improvement), capable of being improved by changes in residents’ performance, and relevant to residency education.24 We then met together to reach consensus, and thus identified a subset of six indicators each for medicine (smoking screening, quit smoking advice, colon cancer screening, breast cancer screening, pneumonia vaccine, and lipid screening) and for pediatrics (parental smoking screening, parental quit smoking advice, car seats [4 years or younger], car restraints [over 4 years], immunization up to date, and eye alignment) (see Appendix 1 for more detail). We refined the indicator definitions after developing the abstraction instrument and performing the initial pilot audit.

Each indicator was calculated for appropriate patients. Appropriate patients were defined as those patients most eligible for the intervention by the HEDIS and U.S. Preventive Services Task Force guideline definitions.22,23 When applying these guidelines to determine whether residents were in compliance, we erred on the conservative side of the guidelines. For example, for breast cancer screening, we set the cutoff age as over 50, and did not include those patients between 40 and 50 (Appendix 1).

Recent studies have criticized some performance indicators because they measure system factors, and not individual providers’ performance.25 With this in mind, we gave residents credit not only if a mammogram had been performed, for example, but also if it had been scheduled but not yet performed, or if they had offered the procedure and the patient had declined. The combination of treatments that had been performed, scheduled, or offered was a better indicator of residents’ performance than simply treatments performed, which may vary because of system issues (e.g., waiting time for a mammogram).26
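
The crediting rule described above can be sketched in a few lines. This is a minimal illustration, not the study's abstraction tool: the record layout and the field names (`performed`, `scheduled`, `offered_declined`) are hypothetical and chosen only to mirror the three ways a resident could receive credit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServiceRecord:
    """One eligible patient's documentation for one indicator (hypothetical layout)."""
    performed: bool = False         # service documented as done
    scheduled: bool = False         # ordered or scheduled, not yet done
    offered_declined: bool = False  # offered by the resident, declined by the patient


def resident_gets_credit(rec: ServiceRecord) -> bool:
    # Credit is given for any of the three documented actions.
    return rec.performed or rec.scheduled or rec.offered_declined


def indicator_rate(records: List[ServiceRecord]) -> Optional[float]:
    """Proportion of eligible patients for whom the resident gets credit."""
    if not records:
        return None  # no eligible patients, so the indicator is undefined
    return sum(resident_gets_credit(r) for r in records) / len(records)


# Illustrative data only: three credited patients out of four eligible.
example_records = [
    ServiceRecord(performed=True),
    ServiceRecord(scheduled=True),
    ServiceRecord(offered_declined=True),
    ServiceRecord(),
]
example_rate = indicator_rate(example_records)
```

Counting only `performed` in this example would give 1 of 4, which shows how much a system-dependent definition could understate what the resident actually did.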

For all patients seen by intervention and control residents at baseline and follow-up, the measurement period for each preventive service measure was held constant (Appendix 1). Because of the long time window for some of the indicator measurement periods, the assessment period for some follow-up overlaps with the baseline. Follow-up is thus a conservative comparison and represents the marginal improvement in documented receipt of services.

Chart selection procedure

We abstracted a comprehensive sample of all patients seen in continuity within specific time intervals. In the medicine clinic, continuity was defined as a patient being seen for two or more visits in the time interval, and for pediatrics, having one or more well-child visits with the provider of interest in the past year. We abstracted records for all children ages two months and older seen by intervention and control residents for well-child visits in the pediatric clinic. In the medicine clinic, all medical records for men ages 35–80 years and women ages 45–80 years seen in continuity by intervention and control residents were abstracted during the time intervals.
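
The selection rules above amount to two simple filters. The sketch below encodes them as stated in this section; the function names, argument names, and the "M"/"F" sex coding are assumptions made for illustration, and the indicator-specific eligibility rules in Appendix 1 are not reproduced here.

```python
def eligible_medicine(sex: str, age_years: float, visits_in_interval: int) -> bool:
    """Medicine clinic: men 35-80 or women 45-80, seen two or more times in the interval."""
    min_age = 35 if sex == "M" else 45  # assumes sex is coded "M" or "F"
    in_age_range = min_age <= age_years <= 80
    seen_in_continuity = visits_in_interval >= 2
    return in_age_range and seen_in_continuity


def eligible_pediatrics(age_months: float, well_child_visits: int) -> bool:
    """Pediatric clinic: age two months or older with one or more well-child visits
    with the resident of interest."""
    return age_months >= 2 and well_child_visits >= 1
```

For instance, a 40-year-old woman with three visits would not enter the medicine sample, whereas a 40-year-old man with the same visits would.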

We selected patients within discrete, symmetric time intervals for both the intervention and control groups. In the medicine clinic, we included all continuity patients seen over a 12-month period for control baseline and follow-up, and for intervention baseline. Because some internal medicine residents were graduating, our postintervention follow-up interval was only six months. Thus, the number of follow-up internal medicine medical records available to abstract was fewer than available at baseline. Because pediatric residents saw a higher volume of patients, we were able to achieve an adequate sample of patients per resident in a four-month interval, from March through June of each evaluation year, and thus all intervals were the same.

Patients’ medical records were linked to the resident provider, and the performance indicators as well as patients’ characteristics (age, gender, ethnicity, and health insurance status) were abstracted. For medicine, we also collected data on patients’ number of medications and significant comorbidities (diabetes, coronary artery disease, heart failure, and pulmonary diseases).

Medical record performance audit techniques

We developed a standard medical record abstraction tool using a customization of the public domain MedQuest program.27,28 The tool allows for branching logic of questions, range limits, and readily accessible variable definitions, which help to enhance abstraction fidelity. Research assistants were trained using a sample of 15 medical records. We then refined the medical record abstraction tool, and the research assistants were further trained on a second sample of 15 medical records. These same research assistants abstracted all patient records for the intervention and control groups in this study. For quality assurance during the study, on a bimonthly basis, a 5% sample of charts was double-abstracted, with any errors adjudicated by group review. After review, variable-level error rates for the primary abstractor averaged 3% over the course of abstraction.

Feedback: the Preventive Health Achievable Benchmarks Curriculum

We then developed a three-component curriculum to provide feedback to residents in the intervention group: a feedback lecture, an individual feedback form, and individual performance review sessions. Our objectives for the PHAB curriculum were to expose residents to concepts of performance audit, encourage residents to reflect on their own performance deficits, and plan for residents’ improvement.

First, we discussed with residents the principles of performance audit and aggregate performance results for each residency program at a noon conference. Second, those residents received a confidential “performance feedback” form with individualized feedback on their personal performance as it compared with benchmark performance for each indicator. Based on our objective performance audit of the six preventive health performance indicators for each residency program, we calculated the proportion of appropriate patients receiving the preventive service. Using the Achievable Benchmarks of Care (ABC) method, the benchmark performance among the residents was defined.4,5 In essence, this method represents the average performance for the top 10% of the residents being assessed.
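
A benchmark of this kind can be computed roughly as follows. This is a sketch of one common reading of the ABC approach described in the cited methodology papers,4,5 not the code used in this study. Two details are assumptions not stated in this article: ranking residents by an adjusted performance fraction, (x + 1) / (n + 2), so that a resident with very few patients cannot dominate the ranking, and taking top-ranked residents until they account for at least 10% of all eligible patients. The function name and input layout are hypothetical.

```python
from typing import List, Tuple


def abc_benchmark(providers: List[Tuple[int, int]],
                  min_patient_fraction: float = 0.10) -> float:
    """providers: one (patients_receiving_service, eligible_patients) pair per resident.

    Returns the pooled performance of the top-ranked residents who together
    account for at least `min_patient_fraction` of all eligible patients.
    """
    providers = [(x, n) for x, n in providers if n > 0]
    total_patients = sum(n for _, n in providers)
    if total_patients == 0:
        raise ValueError("no eligible patients")

    # Rank by adjusted performance fraction (assumed small-sample adjustment).
    ranked = sorted(providers, key=lambda p: (p[0] + 1) / (p[1] + 2), reverse=True)

    numerator = denominator = 0
    for x, n in ranked:
        numerator += x
        denominator += n
        if denominator >= min_patient_fraction * total_patients:
            break
    return numerator / denominator


# Illustrative data only: six residents, 100 eligible patients in total.
example_providers = [(11, 12), (1, 1), (5, 10), (2, 10), (20, 30), (10, 37)]
example_benchmark = abc_benchmark(example_providers)
```

In the example, the resident with 1 of 1 patients has the highest raw rate, but the adjusted ranking places the resident with 11 of 12 first, and that resident alone covers more than 10% of patients, so the benchmark is 11/12.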

Last, the performance feedback was reviewed with individual residents at their midyear evaluation. The feedback session was conducted by program directors in internal medicine and by clinical faculty in pediatrics acting as “local opinion leaders.”8 We based the individual feedback session on academic detailing strategies.9,29,30 All program directors and clinical faculty participated in a training session prior to conducting the midyear feedback sessions. The purpose of the feedback was discussed, and strategies for reviewing the feedback with residents were solicited from the faculty. To standardize feedback, we provided the faculty with a written feedback script using academic detailing principles (investigate baseline performance, provide positive reinforcement, define clear behavioral objectives).29 In the script, faculty were instructed to review the ACGME requirements and then, in sequence, review the areas in which the resident performed well, giving positive feedback; review the areas in which the resident performance was suboptimal or needed improvement; and ask residents, “What is your plan for improvement?” The faculty member was instructed to work with each resident to identify concrete strategies.

Statistical analysis

To assess the impact of our curriculum, we conducted a series of patient-level analyses. All analyses were conducted using Stata statistical software, version 8.0 (StataCorp, College Station, TX). First, we calculated the unadjusted proportion of intervention and control patients receiving each appropriate preventive service, at baseline and at follow-up. Differences (baseline versus follow-up) were assessed using chi-square tests. These raw proportions are easy to interpret, but do not adjust for other patient and provider characteristics.
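
For readers who want to see the unadjusted comparison concretely, the following sketch computes a Pearson chi-square test for a 2 x 2 table of baseline versus follow-up receipt of one service. It is not the Stata code used in the study; the function name and the counts are hypothetical, and no continuity correction is applied. With one degree of freedom, the p value equals erfc(sqrt(chi2 / 2)).

```python
import math
from typing import Tuple


def chi_square_2x2(base_yes: int, base_no: int,
                   follow_yes: int, follow_no: int) -> Tuple[float, float]:
    """Pearson chi-square (1 df, no continuity correction) for a 2 x 2 table.

    Rows are baseline and follow-up; columns are received / did not receive.
    Returns (chi-square statistic, two-sided p value).
    """
    a, b, c, d = base_yes, base_no, follow_yes, follow_no
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Illustrative counts only: 50 of 100 at baseline, 70 of 100 at follow-up.
example_chi2, example_p = chi_square_2x2(50, 50, 70, 30)
```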

Next, we assessed differences (baseline versus follow-up) in odds of receiving services, comparing patients seen by PHAB intervention residents versus control residents using logistic regression models. The models were adjusted for both patient characteristics (age, health insurance status, ethnicity, and, when appropriate, gender) and provider characteristics (gender, training year, and track [primary care versus categorical for medicine]), and for the number of patient visits to the residents in each chart selection interval as a marker for the varying length of follow-up. However, this patient-level approach may have resulted in an artificial inflation of statistical significance because the intervention was conducted at the resident level, and each resident may not have behaved independently for each patient. Therefore, we further used generalized estimating equations with an exchangeable correlation matrix to adjust for clustering of patients within providers.
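
The concern about inflated significance can be illustrated with the standard design-effect approximation for clustered data. This is a simpler device than the generalized estimating equations used in the analysis and is shown only to convey the idea; the intraclass correlation and sample sizes below are hypothetical, and equal cluster sizes are assumed.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation from clustering: 1 + (m - 1) * ICC, assuming equal cluster sizes."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n_patients: int, mean_cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is 'worth'."""
    return n_patients / design_effect(mean_cluster_size, icc)


# Illustrative values only: 300 patients, 11 per resident, ICC of 0.05.
example_deff = design_effect(11, 0.05)
example_n_eff = effective_sample_size(300, 11, 0.05)
```

With these hypothetical values the design effect is about 1.5, so 300 clustered patients carry roughly the information of 200 independent ones. Treating them as independent would make standard errors too small.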

We first conducted separate adjusted models for the intervention and control groups. These models provided easily interpretable odds ratios for change from baseline to follow-up, but did not directly compare the difference in change between the two groups. Finally, we developed adjusted models with both groups to assess the difference between the intervention and control groups in documented performance change using the p value of a group-by-time interaction term. This model provides an interpretable p value for the change.
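
The group-by-time interaction can be illustrated without covariates. In a logistic model containing only group, time, and their product, the interaction coefficient equals the difference between the two groups' log odds ratios for change, and its Wald standard error is the square root of the summed reciprocals of the eight cell counts. The sketch below implements that unadjusted version only; it ignores the patient and provider adjustments and the clustering correction described above, and the function name and counts are hypothetical.

```python
import math
from typing import Tuple

Counts = Tuple[int, int, int, int]  # (base_yes, base_no, follow_yes, follow_no)


def interaction_test(intervention: Counts, control: Counts) -> Tuple[float, float, float]:
    """Unadjusted group-by-time interaction on the log-odds scale.

    Returns (ratio of odds ratios, Wald z statistic, two-sided p value).
    """
    cells = list(intervention) + list(control)
    if any(c <= 0 for c in cells):
        raise ValueError("all eight cell counts must be positive")

    def log_or_change(group: Counts) -> float:
        base_yes, base_no, follow_yes, follow_no = group
        # Log odds ratio for receiving the service at follow-up versus baseline.
        return math.log((follow_yes * base_no) / (follow_no * base_yes))

    diff = log_or_change(intervention) - log_or_change(control)
    se = math.sqrt(sum(1 / c for c in cells))
    z = diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return math.exp(diff), z, p_value


# Illustrative counts only: improvement from 50% to 70% with intervention, none in controls.
example_ratio, example_z, example_p_int = interaction_test((50, 50, 70, 30), (50, 50, 50, 50))
```

A ratio of odds ratios above 1 means the odds of receiving the service rose more in the intervention group than in the control group.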

Results

A total of 130 internal medicine residents and 78 pediatric residents were included in the study. The mean resident age was 27.6 years (standard deviation [SD] 2.2), with a narrow interquartile range (26–29 years). More intervention internal medicine residents were female (34%) than were those in the control group (25%). More control group residents were primary care track (32%) than were intervention group residents (16%), reflecting secular trends in group composition. There were more female pediatric residents in the control group (72%) than in the intervention group (55%). However, none of these differences were significant at p < .05.

The mean number of patients was 8.5 (SD 1.2) per medicine resident and 14.6 (SD 6.2) per pediatric resident. There were few patient-level differences comparing intervention and control (Table 1). The mean number of patient visits during follow-up was lower than at baseline for patients seen by medicine residents, as was to be expected because of the difference in follow-up chart selection interval.

Table 1

Changes in receipt of preventive services

For internal medicine residents participating in the intervention curriculum, the aggregate proportion of their patients receiving each preventive service improved after the curriculum. The difference from baseline to follow-up was statistically significant for smoking screening, quit smoking advice, colon cancer screening, pneumonia vaccine, and lipid screening (Table 2). In comparison, for control group internal medicine residents, there was less improvement for quit smoking advice and lipid screening, no change for colon cancer screening, and actually a decline in performance for smoking screening, breast cancer screening, and pneumonia vaccine. None of the baseline versus follow-up changes was statistically significant for the control group. The difference in performance over time for four of the six indicators was significantly better for the intervention group compared with control group residents after adjustment (p < .05 based on group-by-time term, Table 3).

Table 2

Table 3

For pediatric residents, the proportion of appropriate patients receiving services significantly improved for those seen by intervention residents for parental quit smoking advice, car seats, car restraints, and eye alignment (Table 2). Of note, comparing baseline values (intervention versus control), there was a temporal trend in improvement between intervention and control (62% versus 77%) for car seats, and also for car restraints in older children. After adjustment, the difference in improvement (baseline versus follow-up) favored the intervention group for car restraints and quit smoking advice, with marginal significance (p = .07) for car seats (Table 3).

Discussion

We demonstrated that practice-based evaluation and individualized feedback and education, using cutting-edge quality improvement techniques developed within outcomes research, were beneficial in improving documented performance for six preventive health services among residents training in two specialties. Overall, the residents were performing well at baseline on some indicators, but other indicators needed considerable improvement. Because documented performance improved, the innovative evaluation and surrounding curriculum not only benefited residents but also improved the quality of care for patients. Although our evaluation focused only on documented quality, we feel that our assertion that quality of care was improved is justified in that documentation is a part of quality care. Also, because direct observation of performance is difficult, chart-audited performance is frequently considered a surrogate for quality and is frequently the basis for evaluating providers in practice.

Systematic reviews of implementation strategies including performance feedback for practicing clinicians have noted variable and modest results at best.6,7 The reported effects of audit and feedback vary between 1% and 16%,33 and several studies of practice-based performance audit in residency have shown limited effects of such audits.10,15

There are several possible reasons for the relative success of our intervention. First, we know of no other trials of audit and feedback in residency that have used nationally recognized standard methods of peer comparison. The Achievable Benchmarks of Care model has been demonstrated as a robust method of performance feedback, resulting in significant changes across several indicators in prior studies.4 Comparing providers with a locally relevant, “real-world” benchmark is likely a more appropriate comparison and more acceptable to providers than comparison with a national standard.

Second, the curriculum we designed to provide the feedback may have been more intense than some previously reported audit and feedback interventions. We not only provided written feedback, but also met individually with each resident to identify areas to target and to plan strategies for improvement. In a previous randomized trial of performance feedback in residency with negative results, Kogan et al. noted that variability in the implementation of the performance feedback by faculty may have limited the impact of the intervention.10 We were careful to provide a detailed script and train pediatric clinic faculty and internal medicine program directors to deliver the academic detailing in a standard, scripted way, reviewing residents’ needs for improvement, encouraging improvement, and strategizing for change.9

Third, physicians-in-training may simply be both more amenable and more motivated to change than practicing physicians. In addition, we were careful in the initial lecture to emphasize that this performance feedback was being provided for educational purposes and self-reflection, not as summative evaluation to mark in their record. This approach may have reduced some defensiveness, especially from junior providers, noted in previous research on feedback.34–38

The impact of the curriculum appeared most consistent among the internal medicine group, although pediatric residents also significantly improved on two indicators. The most obvious difference in implementation was that the individualized feedback sessions were conducted by the residency program directors in internal medicine, but by clinical faculty in pediatrics. The program directors may have been more effective opinion leaders. Also, although we attempted to have symmetry in the indicators, they were somewhat different. Of note, the change in the quit smoking advice indicator was strong and consistent for both specialties.

There are several limitations to our study. First, our design was not randomized. Although we attempted to adjust for confounding by physician and patient characteristics in multivariable analysis, unmeasured and residual confounding may exist. Because the control group was removed in calendar time from the intervention group, other temporal changes occurring between intervention baseline and follow-up may have influenced the results. Also of note, the difference in assessment intervals for medicine limited our power and may have resulted in differences. Because of the risk of temporal changes, we focused on stable indicators, and had to reject some indicators for which guidelines had changed during the interval (e.g., Pap test).39 And although we had residents training in two specialties, the study was conducted at only one university. We chose not to randomize residents because this allowed a waiver of informed consent from the residents and enabled us to include all residents practicing in these clinics, not just those interested in receiving feedback. By including all residents, we avoided selection bias and thus increased the external validity of the study.40

Although we were able to detect significant differences for some indicators, several factors may have biased us towards the null. We crafted the performance indicators to conservatively focus on the most appropriate patients. Thus, some patients who may have potentially received the indicator were excluded from our denominator. For some indicators, the power to detect small effects was limited, but these limitations in power were somewhat similar for the intervention and control groups. Also, many of the indicators had long interval windows of measurement. The longest was colon cancer (up to ten years). Of note, the indicators with the shortest measurement windows (tobacco control and car safety) demonstrated the greatest improvements. Finally, patients with fewer visits less frequently received services. Because our postintervention follow-up period was only six months for medicine residents, patients seen in this shorter follow-up period had fewer scheduled visits and thus we may have underestimated the change in performance.

The strengths of our study also include the large number of medical records we abstracted, which gave us significant power to detect differences despite the above limitations, and the effective training of abstractors, as demonstrated by the low error rate noted on double-abstraction. We also instituted significant quality control (e.g., standardizing feedback forms, training faculty to conduct feedback review) to assure as much fidelity in delivering the multifaceted curriculum as possible.

We successfully overcame several challenges to residency educational research, including curricular constraints and costs.14 Several previous evaluations of practice-based learning in residency have had a pre-post test only design.11–13,16 We balanced the need of the residency program to implement practice-based learning for all residents with the need for more rigorous evaluation by using a historical control group design to account for patterns of performance improvement as residents progress through training. This design has been commonly used in elementary and other educational settings with turnover of trainees.21 The development of the indicators and implementation of the project required significant cost and effort. The indicators were developed over five investigator meetings over a two-month period and then needed to be refined after pilot abstraction. Although the chart audit was somewhat costly (training abstractors and baseline chart audit cost approximately $107.00 per trainee), we received significant support from the residency training programs and internal funding to support the project. Because of the perceived value of the PHAB curriculum, its cost is now being sustained by the internal medicine residency training program as a regular part of training.

Experts in medical education research have increasingly called for implementing outcomes research methods into educational interventions and evaluations.17,40,41 We have demonstrated, in two training programs, the feasibility of implementing a curriculum based on quality improvement methods. We speculate that careful attention to the feedback methods used, fidelity and systematic delivery of the feedback and education, and especially using a nonjudgmental, reflective approach likely enhanced the impact of our curriculum. Residents responded well, with overall positive emotional responses to the feedback, based on reports after the individual feedback sessions.

The ACGME-sponsored Outcomes Project is entering its third phase in July 2006. This new phase is characterized by full integration of the competencies and their assessment with learning and clinical care. Specifically, the ACGME is looking for institutions to obtain resident performance data to use as a tool to improve clinical care and eventually to benchmark against national quality of care indicators. We believe the approach of audit and feedback is a valuable resource within the practice-based learning and improvement assessment toolbox available to training programs, although it might not be applicable to all residency training programs, and whether chart audit alone is enough to meet the ACGME requirements is unknown.

Performance evaluation of practicing physicians is increasing, and, in fact, the American Board of Internal Medicine allows providers to audit their own performance to fulfill one of the required modules for recertification.42 As performance audit for practicing physicians is likely to increase in the future, we feel that it is critical that trainees, even medical students, be exposed to these concepts during training. Trainees who are better equipped to reflect on their performance and improve their processes will likely provide better care for patients as they make the transition to clinical practice.



This research received funding through a grant from the University of Alabama at Birmingham Health Services Foundation General Endowment Fund. Drs. Palonen and Kiefe were supported by the Veterans Affairs National Quality Scholars Program.



1Epstein RM, Hundert EM. Defining and assessing professional competence. JAMA. 2002;287:226–35.
2Holden JD. Systematic review of published multi-practice audits from British general practice. J Eval Clin Pract. 2004;10:247–72.
3Weiss KB, Wagner R. Performance measurement through audit, feedback, and profiling as tools for improving clinical care. Chest. 2000;118(2 suppl):53S–58S.
4Kiefe CI, Allison JJ, Williams OD, Person SD, Weaver MT, Weissman NW. Improving quality improvement using achievable benchmarks for physician feedback: a randomized controlled trial. JAMA. 2001;285:2871–79.
5Kiefe CI, Weissman NW, Allison JJ, Farmer R, Weaver M, Williams OD. Identifying achievable benchmarks of care: concepts and methodology. Int J Qual Health Care. 1998;10:443–47.
6Jamtvedt G, Young JM, Kristoffersen DT, Thomson O’Brien MA, Oxman AD. Audit and feedback: effects on professional practice and health care outcomes. Cochrane Database Syst Rev. 2003 (3):CD000259.
7Thomson O’Brien MA, Oxman AD, Davis DA, Haynes RB, Freemantle N, Harvey EL. Audit and feedback versus alternative strategies: effects on professional practice and health care outcomes. Cochrane Database Syst Rev. 2000 (2):CD000260.
8Thomson O’Brien MA, Oxman AD, Haynes RB, Davis DA, Freemantle N, Harvey EL. Local opinion leaders: effects on professional practice and health care outcomes. Cochrane Database Syst Rev. 2000 (2):CD000125.
9Thomson O’Brien MA, Oxman AD, Davis DA, Haynes RB, Freemantle N, Harvey EL. Educational outreach visits: effects on professional practice and health care outcomes. Cochrane Database Syst Rev. 2000 (2):CD000409.
10Kogan JR, Reynolds EE, Shea JA. Effectiveness of report cards based on chart audits of residents’ adherence to practice guidelines on practice performance: a randomized controlled trial. Teach Learn Med. 2003;15:25–30.
11Holmboe E, Scranton R, Sumption K, Hawkins R. Effect of medical record audit and feedback on residents’ compliance with preventive health care guidelines. Acad Med. 1998;73:901–3.
12Paukert JL, Chumley-Jones HS, Littlefield JH. Do peer chart audits improve residents’ performance in providing preventive care? Acad Med. 2003;78(10 suppl):S39–S41.
13Kern DE, Harris WL, Boekeloo BO, Barker LR, Hogeland P. Use of an outpatient medical record audit to achieve educational objectives: changes in residents’ performances over six years. J Gen Intern Med. 1990;5:218–24.
14Golub RM. Theme issue on medical education: call for papers. JAMA. 2005;293:742.
15Mayefsky JH, Foye HR. Use of a chart audit: teaching well child care to paediatric house officers. Med Educ. 1993;27:170–74.
16Leshan LA, Fitzsimmons M, Marbella A, Gottlieb M. Increasing clinical prevention efforts in a family practice residency program through CQI methods. Jt Comm J Qual Improv. 1997;23:391–400.
17Chen FM, Bauchner H, Burstin H. A call for outcomes research in medical education. Acad Med. 2004;79:955–60.
18Accreditation Council for Graduate Medical Education. ACGME Outcome Project. Accessed 30 March 2006.
19Lynch DC, Swing SR, Horowitz SD, Holt K, Messer JV. Assessing practice-based learning and improvement. Teach Learn Med. 2004;16:85–92.
20Frohna JG, Kalet A, Kachur E, et al. Assessing residents’ competency in care management: report of a consensus conference. Teach Learn Med. 2004;16:77–84.
21Shadish W, Cook D, Campbell D. Quasi-Experimental Designs that use both control groups and pretests. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston, New York: Houghton Mifflin Company, 2002:135–69.
22Agency for Healthcare Research and Quality. Accessed 30 March 2006.
23Health Plan Employer Data and Information Set. 2002 Summary Table of Measures, Product Lines and Change. Accessed 30 March 2006.
24Shekelle PG, MacLean CH, Morton SC, Wenger NS. Assessing care of vulnerable elders: methods for developing quality indicators. Ann Intern Med. 2001;135(8 Pt 2):647–52.
25Kerr EA, Smith DM, Hogan MM, et al. Building a better quality measure: are some patients with ‘poor quality’ actually getting good care? Med Care. 2003;41:1173–82.
26Yankaskas BC, Taplin SH, Ichikawa L, et al. Association between mammography timing and measures of screening performance in the United States. Radiology. 2005;234:363–73.
27U.S. Department of Health and Human Services. Centers for Medicare and Medicaid Services. Accessed 10 March 2003.
28Allison JJ, Wall TC, Spettell CM, et al. The art and science of chart review. Jt Comm J Qual Improv. 2000;26:115–36.
29Soumerai SB, Avorn J. Principles of educational outreach (‘academic detailing’) to improve clinical decision making. JAMA. 1990;263:549–56.
30Davis DA, Taylor-Vaisey A. Translating guidelines into practice: a systematic review of theoretic concepts, practical experience and research evidence in the adoption of clinical practice guidelines. CMAJ. 1997;157:408–16.
31Hulscher ME, Laurant MG, Grol RP. Process evaluation on quality improvement interventions. Qual Saf Health Care. 2003;12:40–46.
32Walshe K, Freeman T. Effectiveness of quality improvement: learning from evaluations. Qual Saf Health Care. 2002;11:85–87.
33Grimshaw J, Eccles M, Tetroe J. Implementing clinical guidelines: current evidence and future implications. J Contin Educ Health Prof. 2004;24(1 suppl):S31–S37.
34Johnston G, Crombie IK, Davies HT, Alder EM, Millard A. Reviewing audit: barriers and facilitating factors for effective clinical audit. Qual Health Care. 2000;9:23–36.
35Gabbay J, Layton AJ. Evaluation of audit of medical inpatient records in a district general hospital. Qual Health Care. 1992;1:43–47.
36Firth-Cozens J, Storer D. Registrars’ and senior registrars’ perceptions of their audit activities. Qual Health Care. 1992;1:161–64.
37Baker R, Robertson N, Farooqi A. Audit in general practice: factors influencing participation. BMJ. 1995;311(6996):31–34.
38Black N, Thompson E. Obstacles to medical audit: British doctors speak. Soc Sci Med. 1993;36:849–56.
39U.S. Preventive Services Task Force. Screening for Cervical Cancer. Accessed 30 March 2006.
40Carney PA, Nierenberg DW, Pipas CF, Brooks WB, Stukel TA, Keller AM. Educational epidemiology: applying population-based design and analytic approaches to study medical education. JAMA. 2004;292:1044–50.
41Lim JK, Golub RM. Graduate medical education research in the 21st century and JAMA on call. JAMA. 2004;292:2913–15.
42Wasserman SI, Kimball HR, Duffy FD. Recertification in internal medicine: a program of continuous professional development. Task Force on Recertification. Ann Intern Med. 2000;133:202–8.
Appendix 1


© 2006 Association of American Medical Colleges