The Ethics of Conducting Graduate Medical Education Research on Residents

Keune, Jason D. MD, MBA; Brunsvold, Melissa E. MD, FACS; Hohmann, Elizabeth MD; Korndorffer, James R. Jr MD, FACS; Weinstein, Debra F. MD; Smink, Douglas S. MD, MPH, FACS

doi: 10.1097/ACM.0b013e3182854bef

The field of graduate medical education (GME) research is attracting increased attention and broader participation. The authors review the special ethical and methodological considerations pertaining to medical education research. Because residents are at once a convenient and captive study population, a risk of coercion exists, making the provision of consent important. The role of the institutional review board (IRB) is often difficult to discern because GME activities can have multiple simultaneous purposes, educational activities may go forward with or without a research component, and the subjects of educational research studies are not patients. The authors provide a road map for researchers with regard to research oversight by the IRB and also address issues related to research quality. The matters of whether educational research studies should have educational value for the study subject and whether to use individual information obtained when residents participate as research subjects are explored.

Dr. Keune is chief resident in general surgery, Washington University School of Medicine, St. Louis, Missouri.

Dr. Brunsvold is assistant professor, Division of Critical Care/Acute Care Surgery, University of Minnesota, Minneapolis, Minnesota.

Dr. Hohmann is associate professor of medicine, Harvard Medical School, and director, Institutional Review Board, Partners HealthCare System, Boston, Massachusetts.

Dr. Korndorffer is professor of clinical surgery, Tulane University School of Medicine, director, Tulane Center for Minimally Invasive Surgery, and associate residency program director, Tulane Department of Surgery, New Orleans, Louisiana.

Dr. Weinstein is vice president for graduate medical education, Partners Healthcare System, and assistant professor of medicine, Harvard Medical School, Boston, Massachusetts.

Dr. Smink is assistant professor of surgery, Harvard Medical School, associate medical director, STRATUS Center for Medical Simulation, and program director, General Surgery Residency Program, Brigham and Women’s Hospital, Boston, Massachusetts.

Correspondence should be addressed to Dr. Keune, Washington University School of Medicine, Department of Surgery, 660 S. Euclid Ave., Campus Box 8109, St. Louis, MO 63110; e-mail:

Research in the field of graduate medical education (GME) has attracted attention and broad participation in recent decades. The number of peer-reviewed publications regarding surgical education, for example, increased markedly in the 1990s,1 and productive research on this topic and others in GME has remained strong into the 21st century. Interest in GME has been fueled by many factors, including new knowledge and technology,2,3 advances in educational theory,3 growing emphasis on hospital and operating room efficiency,2,4 and increased focus on the interface between education and patients’ safety—including the controversy surrounding supervision requirements and limits on residents’ work hours.4,5 The revision of promotion criteria at many medical schools to value scholarship involving medical education innovation and research may also play a role.

In addition, as the process of educating residents has attracted greater scrutiny and increasing oversight from the Accreditation Council for Graduate Medical Education and other regulatory organizations, many medical educators are pursuing research to develop an evidence base to inform educational policy. Finally, mounting pressure on funding for GME makes it imperative that educational methods be efficient as well as effective. Consensus recommendations, such as those from the Macy Foundation’s recent conferences on reforming GME,6,7 have called for more research to identify optimal approaches to GME and medical education overall. Indeed, medical education research is essential for ensuring that future generations of physicians achieve the necessary expertise to provide excellent care.

Research in GME is challenging. The small, widely dispersed study population makes accumulating a sufficient sample size difficult. Research is sometimes constrained by the number of endorsements required from regulatory organizations (in addition to the research institution’s own review board). Dedicated funding is scant, so most research is done by faculty who are primarily focused on clinical care and teaching. Perhaps most difficult, however, are the ethical and methodological issues specific to research involving clinical trainees. The literature does not clearly address these issues, leaving medical education researchers without a road map to navigate these challenges. In this article, we review the special ethical and methodological issues that pertain to medical education research, including informed consent, regulatory oversight, quality of research data, and the problem of dual purpose.

A Case

The program director at a large surgery residency announces an addition to the technical skills training sessions planned for the year: a “team/trauma simulation” session. Residents are notified that they will be assigned a specific date and that attendance is mandatory, just as it is for grand rounds, didactic conferences, and skills laboratories. The program director’s announcement does not indicate that the team/trauma simulation sessions are part of a study; residents learn this when they arrive at the simulation center. They are then presented with a consent-to-participate form.

The Issues

Coercion and consent

Informed consent, based on the notion of “respect for persons,”8 is important for minimizing the coercion of research study participants. The need to obtain informed consent from residents in GME research varies from study to study and is determined by each institution’s institutional review board (IRB), but a large majority of studies likely require informed consent.

In addition to informed consent, the issue of coercion must be addressed. As Jonathan Moreno9 has pointed out, “experiments of opportunity,” in which researchers study “convenient” populations of individuals who are simply the most available subjects at the time, raise unique ethical issues. These issues are intensified if the subjects are also “captive,” that is, constrained in their movement by “explicit conditions formally imposed on them by societal decision.” The paradigm for the captive research population is prisoners, but others, such as institutionalized persons, military personnel, and, of course, residents under the constraints of mandatory attendance, can be considered captive.

The principal problem with captive populations is the risk of coercion to participate, a risk related to the status relationship between the researcher and subject. In GME research, the researcher is often a faculty member or program director, and the subjects are resident trainees, so a clear hierarchical relationship exists.

The risks associated with residents’ participation in medical education research are not negligible. For example, participation levies an opportunity cost: the loss of time that might have been better spent in other educational activities with proven value, or even in leisure activities, which can be scarce for residents. Risks may affect patients as well. If residents, whose work hours are limited, participate in GME research while on duty, the time they spend engaged in a research study may subtract from the time they can spend caring for patients. Additional risks include fatigue, stress, embarrassment, alteration of self-concept, and loss of confidence. Residents who perform poorly on the tasks under study may be more intensely scrutinized in their daily clinical work outside the study. Loss of privacy may also occur, impacting both reputation and career.

Yet, although the study population comprising residents may be convenient and captive, it is also essential, in a way that diverges from Moreno’s model, and this weighs against the other ethical concerns. GME research, such as assessment of new curricula, teaching methods, or duty hours schedules, necessarily involves the participation of residents. Using others, such as fully trained physicians or nonphysicians, even if feasible, would produce irrelevant results. The subject for a surgical education research study therefore must be a surgical resident. Recognizing that residents are essential, however, can have a positive effect. If residents understand that they and others will benefit from the research, they may wish to contribute by participating.

The question remains: Is it possible for residents to say “no” to participation in education research? In the case described above, one resident did refuse to participate and suffered no evident retaliation, but that resident’s characteristics made it easier to refuse. The resident was senior, was already matched into a competitive fellowship, and had virtually nothing to lose.

We theorize that, in GME research, the coercion effect caused by the hierarchical relationship between researcher and resident is not static but, rather, on average, decreases as the resident ascends through the program. For this reason, junior residents are more easily coerced than are senior residents. Unfortunately, although understandably, most GME research focuses on the most junior residents, in whom pedagogical resources or interventions arguably have the greatest impact.

It is important for designers of any GME research study to clarify that residents may decline to participate as identifiable subjects, while making it clear that they may still participate in all of the educational aspects of the program. In the case described above, such an option was not made sufficiently clear. That one resident did opt out reveals some degree of “voluntariness,” but it was adversarial, not consistent with respect for persons; the resident’s refusal was only tacitly accepted, not positively endorsed. The ability to opt out of the research component without negative consequences should have been made clear to all.

Regulatory oversight

A further challenge of GME research is determining how it should be evaluated by regulatory bodies that oversee research at the institutional level. There are several reasons why some argue that education research does not require institutional monitoring. First, activities with simultaneous purposes (i.e., both research and educational functions) may obscure the ethical issues typically associated with research. Second, educational interventions—unlike clinical interventions—are often implemented without prior study. If the intervention could go forward with or without research, is it necessary to apply the usual standards for research? Third, where educational research does not directly involve patients, the risks to participants may be perceived as negligible. Finally, there may be an inclination to view any GME research as an overall “good” for residents and to overlook its inherent risks, or to consider the risks so outweighed by the benefits that institutional oversight is unnecessary.

Several questions must be addressed when undertaking a GME research study (see Figure 1). First, a determination should be made as to whether the activity actually constitutes research or is more appropriately categorized as a quality improvement (QI) project.10 The distinction is important: Research studies are subject to federal regulations that require an IRB review to weigh risks and benefits and to ensure appropriate informed consent,11 whereas QI projects, designed to identify and implement best practices, may be mandated by an institution, with participation in the project considered a condition of employment. Institutions differ significantly in how they differentiate between research and QI. A central issue is whether the intent of the project is to produce generalizable knowledge. Secondary considerations include funding source, the nature of the intervention (whether standard or novel), and disposition of the information gained (whether for internal use, as in benchmarking, or for wider dissemination, such as publication). Determinations of whether projects constitute research or QI often rely on difficult judgment calls. In uncertain situations, investigators should consult the IRB to minimize the impact of investigator bias, use the expertise of IRB personnel, protect the ability to publish results, and ensure compliance with federal regulations.

Once it has been established that a proposed project constitutes research, another question is whether it represents “human subjects research.” Much of the academic literature and the legal and legislative action relating to human subjects research focuses on the relationship between physicians and study subjects who are under their care or who are recruited to test medical interventions. Physicians accustomed to that research paradigm may not appreciate that educational and sociobehavioral research that involves human subjects also requires regulatory and ethical review and oversight.

The part of the federal regulatory code (45 CFR 46.102(f)) that addresses research involving human subjects states:

Human subject means a living individual about whom an investigator (whether professional or student) conducting research obtains

1. data through intervention or interaction with the individual, or

2. identifiable private information.12

Clearly, GME research that involves patients, or that uses identifiable patient information, should be classified as human subjects research. Consider a study in which a residency program director wants to compare two methods for teaching about the management of glucose in the intensive care unit. The residents will be randomized to either a computer-based training session or to an in-person training session. The outcome measure will be the number of hypoglycemic episodes experienced by the patients under each resident’s care. Because this study would involve recording identifiable resident and patient data, it would qualify as human subjects research.

Consider, now, a different study, in which a program director wishes to see how residents’ standardized test scores (such as scores from the American Board of Surgery In-service Training Examination) have changed across 15 years. If the data come from lists of deidentified scores, the research may not be considered human subjects research. When in doubt as to whether a study constitutes human subjects research, an investigator should seek a determination from the institution’s IRB for the reasons noted above with respect to distinguishing research from QI projects.

Section 45 CFR 46.101 directly addresses the issue of possible exemption of educational research.

(b) Unless otherwise required by department or agency heads, research activities in which the only involvement of human subjects will be in one or more of the following categories are exempt from this policy:

1. Research conducted in established or commonly accepted educational settings, involving normal educational practices, such as (i) research on regular and special education instructional strategies, or (ii) research on the effectiveness of or the comparison among instructional techniques, curricula, or classroom management methods.

2. Research involving the use of educational tests (cognitive, diagnostic, aptitude, achievement), survey procedures, interview procedures or observation of public behavior, unless:

(i) information obtained is recorded in such a manner that human subjects can be identified, directly or through identifiers linked to the subjects; and (ii) any disclosure of the human subjects’ responses outside the research could reasonably place the subjects at risk of criminal or civil liability or be damaging to the subjects’ financial standing, employability, or reputation.13

We believe that GME represents a “commonly accepted educational setting, involving normal educational practices,” and thus we expect that many GME research projects may be eligible for exemption from human subjects research requirements via provision 1. Alternatively, many GME research projects could be exempt under provision 2—that is, by using information that is either deidentified or without risk to financial standing, employability, and reputation. Of course, these aspects may be difficult to assess.

“Exemption” is not automatically granted by these regulations; projects must still be submitted to an IRB. Some IRBs may be reluctant to exempt any research that collects identifiable data. Educational researchers should be familiar with the relevant federal guidance on exemptions so that they can explicitly address the employment or reputational risks of routine or inadvertent disclosures in their IRB submissions. It is interesting to consider how the recently proposed Department of Health and Human Services Advance Notice of Proposed Rulemaking related to human subjects research might affect educational research.14 Part of those proposed changes suggests that all survey and educational studies in competent adults could be exempted, provided certain data security standards are met, eliminating any consideration of employment or reputational risks or harms. Additionally, those proposed changes discuss simple registration of such projects rather than IRB review, potentially removing external oversight and mandating more self-review and internal assessment by GME program directors.

One additional problem regarding IRBs is how to address multi-institutional studies. For example, an education researcher affiliated with one residency program may develop a survey for residents at multiple institutions. In this situation, the principal investigators are only required to obtain exemption (or approval) from their own institutional IRB. In a practical sense, however, some type of endorsement from the other institutions is needed to facilitate contact with eligible subjects. The researchers often contact representatives (such as the GME director or designated institutional official) at each institution, who then consult with their own IRBs before deciding whether to forward the invitation to participate to their residents. For many institutions, local IRB review is not required when a program director or faculty member agrees to forward study information or allows researchers from other institutions to invite their residents to participate in studies.15

In summary, GME projects that are not clearly QI or implementation of known best educational practices should be evaluated by an IRB. Expedited review is often available for GME-related research, and in some cases the research will be issued a formal exemption by the IRB. The regulations themselves do not specify who at an institution may determine that research is exempt. However, the Office for Human Research Protections recommends that, because of the potential for conflict of interest, investigators not be given the authority to make an independent determination that human subjects research is exempt.16 Consultation with IRB representatives early in the process is advantageous.
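The sequence of questions described in this section can be sketched as a simple triage routine. The following Python sketch is illustrative only: the `Project` fields, the function name `suggest_pathway`, and the returned labels are hypothetical names introduced here, not part of any regulation or IRB system. It encodes the order of questions discussed above (research versus QI, human subjects research, possible exemption) under the assumption that each answer can be stated as a yes or no, which in practice often requires judgment. Every outcome that involves research still routes to the IRB, because investigators should not determine exemption on their own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Project:
    # Hypothetical fields for this sketch; real determinations need IRB judgment.
    seeks_generalizable_knowledge: Optional[bool]  # None means unclear
    interacts_with_individuals: bool               # data via intervention or interaction
    uses_identifiable_private_info: bool
    normal_educational_practice: bool              # exemption category 1
    only_tests_or_surveys: bool                    # exemption category 2 scope
    records_identifiers: bool
    disclosure_could_harm: bool                    # liability, finances, employability, reputation


def suggest_pathway(p: Project) -> str:
    """Return a suggested next step; this is a triage aid, not a determination."""
    # Question 1: research or quality improvement?
    if p.seeks_generalizable_knowledge is None:
        return "consult IRB: research versus QI is unclear"
    if not p.seeks_generalizable_knowledge:
        return "likely QI: follow institutional QI policy"

    # Question 2: human subjects research?
    if not (p.interacts_with_individuals or p.uses_identifiable_private_info):
        return "ask IRB to confirm: may not be human subjects research"

    # Question 3: might an exemption apply? The IRB, not the investigator, decides.
    if p.normal_educational_practice:
        return "submit to IRB: possible exemption (category 1)"
    if p.only_tests_or_surveys and not (
        p.records_identifiers and p.disclosure_could_harm
    ):
        return "submit to IRB: possible exemption (category 2)"
    return "submit to IRB: expedited or full review"
```

For instance, a review of deidentified in-service examination scores, with no interaction and no identifiable information, would return the "ask IRB to confirm" label, mirroring the advice above to seek a determination when in doubt.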

Quality of research data

When studies are conducted in the intimate professional environment that characterizes a residency program, several concerns arise about the quality of research data. These concerns include investigator blinding, Hawthorne effects, effects relating to a prior relationship between subject and investigator/observer, and behavioral changes stemming from the resentment that can arise from perceived coercion.

Blinding evaluators to the identities of research subjects is a fundamental way to ensure that a study has baseline validity.17 Consider, for example, a study in which a program director wishes to compare the teaching of basic surgical skills in an intern “boot camp” by senior residents with the teaching of the same skills by surgery faculty. Performance is assessed by video review of the interns performing suturing tasks. In this example, the program director is one of the “blinded” observers. Even if the video only reveals the interns’ hands, the program director may be able to identify interns by their hands, jewelry, or skin color, adversely affecting the blinding.

The study’s validity is thrown into doubt at that point. Can the program director evaluate that intern’s performance without bias? (Beyond the study’s validity, there is also concern about whether the intern’s performance within the study will affect the program director’s evaluation of the intern outside of the study.)

A certain risk of bias is inherent in any research study in which individuals are observed by others; people under scrutiny may modify their behavior simply because they know they are being observed. The potential for this bias, known as “the Hawthorne effect,” is present in such studies as the one described at the outset of this article. Residents who are being observed in a simulated trauma scenario with video and audio recordings are likely to behave differently than they would in an actual trauma resuscitation. Also, it is reasonable to postulate that residents’ behavior may be similarly affected when being observed by attendings well known to them. Consider, again, the case that was presented at the beginning of this article. The resident enters the trauma simulation room and the simulation begins. He hears the voice of his program director (a surgeon with whom he has operated many times) over the loudspeaker. Subtle vocal cues might be communicated from attending to resident because of their familiarity. Conversely, the investigators’ subjective assessments of subjects who are their residents may be biased by prior experiences.

In addition, a resident who feels some resentment from perceived coercion to participate in a study may not exert maximum effort, and this may bias the results. One can imagine a tired junior resident feeling subtly coerced into participating in the team trauma training study. The resident may recognize that he will likely not be meaningfully evaluated on his performance, understand that no patients’ lives are currently at stake, and suspect that the pedagogical value of his full engagement with the trauma scenarios is minimal. Such a resident might exert only the minimal effort needed to have his name checked off a list. Clearly, his performance might skew the overall results if compared with that of other residents who are more fully invested.

Dual purpose

Despite being told otherwise, a study subject who persists in a belief that he will benefit from participation has fallen prey to what is known as the “therapeutic misconception.” This phenomenon is most frequently observed in early studies designed to evaluate the safety, not the efficacy, of new therapies. Physicians may inadvertently encourage this by focusing inappropriately on the hope of therapeutic benefit in studies not designed to test or offer such benefit.

In education research, a type of therapeutic misconception may arise when an activity is intended to have educational value for those participating and to generate data intended to enhance GME. Given the time constraints of residents and the strong need for more education research, it is highly desirable that participating in educational research confer some educational value, but dual purpose activities can be particularly challenging.

One important question is how to evaluate and describe the potential for any personal benefit from participating in an education research study. Consider the following examples.

Example 1. Mercy Hospital has invested for many years in providing a live program to teach HIPAA, informed consent, and medical informatics. The impact of this ongoing educational activity has never been assessed, and evidence to support continued funding is now being sought. The assessment will be conducted via a pretest/posttest comparison.

In this scenario, interns will participate in the educational program whether or not they agree to participate in the research. The process of taking the tests is not expected to have educational value (correct answers are not indicated), and so the consent process should clearly indicate that no personal benefit is expected.

Example 2. Good Samaritan Hospital is considering implementing an educational program as described in example 1, but wants to evaluate the potential benefit before committing to long-term funding. A research study is being conducted in which subjects will be provided with an educational program and will complete pre- and posttesting.

In contrast to example 1, this scenario includes an educational program only as part of a research study, so residents will only experience the education as part of their participation in the research. Hence, in this example it would be appropriate to describe the education as an expected benefit for research participants.

Another key question arises when educational and research activities are blended in this way: Can individual results collected for research purposes be used for assessment of residents who participated in the study? The assessment tool under scrutiny may not have been validated, and the teaching method itself may be unproven. Although both are often the case in GME settings, the tools and teaching methods used in a study may be substituted for others that have withstood the test of time. The problem, then, of assessing residents who undergo nonstandard assessment of routine education activities (example 1) is that the assessment tool may not be well understood, and residents who participate in nonstandard educational experiences (such as example 2) may encounter a substandard educational modality. In either situation, the results of the assessment become difficult to interpret. These issues warrant consideration and the close cooperation of training program directors and educational researchers in the interest of efficiently educating and fairly evaluating residents while advancing teaching methods.

Conclusions

In this article, we consider several ethical and methodological issues specific to GME research. As resident duty hours become increasingly limited, potential cuts in GME funding are considered, and as medical and educational technologies continue to advance, the need for high-quality research to guide efficient and effective education intensifies. Investigators should be cognizant of the need to minimize coercion, to appropriately use the services of local IRBs, to ensure that data of the highest quality be produced, and to balance the dual purpose inherent in such studies.

Funding/Support: None.

Other disclosures: None.

Ethical approval: Not applicable.

Previous presentations: This article is based on a panel discussion, “Using trainees for education research: Ethical dilemma or appropriate scholarly activity,” at the 2011 Association for Surgical Education Annual Meeting on March 23, 2011, in Boston, Massachusetts.

References

1. Derossis AM, DaRosa DA, Dutta S, Dunnington GL. A ten-year analysis of surgical education research. Am J Surg. 2000;180:58–61.
2. Jwayyed S, Stiffler KA, Wilber ST, et al. Technology-assisted education in graduate medical education: A review of the literature. Int J Emerg Med. 2011;4:51.
3. Hodges BD, Kuper A. Theory and practice in the design and conduct of graduate medical education. Acad Med. 2012;87:25–33.
4. Buckley JD, Joyce B, Garcia AJ, Jordan J, Scher E. Linking residency training effectiveness to clinical outcomes: A quality improvement approach. Jt Comm J Qual Patient Saf. 2010;36:203–208.
5. Huang GC, Newman LR, Tess AV, Schwartzstein RM. Teaching patient safety: Conference proceedings and consensus statements of the Millennium Conference 2009. Teach Learn Med. 2011;23:172–178.
6. Johns MME, chair. Ensuring an Effective Physician Workforce for America: Proceedings of a Conference Sponsored by the Josiah Macy Jr. Foundation. New York, NY: Josiah Macy Jr. Foundation; 2010. Accessed December 18, 2012.
7. Weinstein DF, chair. Ensuring an Effective Physician Workforce for the United States: Recommendations for Reforming Graduate Medical Education to Meet the Needs of the Public. Proceedings of a Conference Sponsored by the Josiah Macy Jr. Foundation. New York, NY: Josiah Macy Jr. Foundation; 2011. Accessed December 18, 2012.
8. Faden R, Beauchamp T. A History and Theory of Informed Consent. Oxford, UK: Oxford University Press; 1986.
9. Moreno J. Is There an Ethicist in the House? Bloomington, Ind: Indiana University Press; 2005.
10. Johansson AC, Durning SJ, Gruppen LD, Olson ME, Schwartzstein RM, Higgins PA. Perspective: Medical education research and the institutional review board: Reexamining the process. Acad Med. 2011;86:809–817.
11. Casarett D, Karlawish JH, Sugarman J. Determining when quality improvement initiatives should be considered research: Proposed criteria and potential implications. JAMA. 2000;283:2275–2280.
12. Federal Register. Protection of Human Subjects. 45 CFR 46.102(f). (2005). Accessed December 18, 2012.
13. Federal Register. Protection of Human Subjects. 45 CFR 46.101(b). (2005). Accessed December 18, 2012.
14. Human Subjects Research Protections: Enhancing Protections for Research Subjects and Reducing Burden, Delay, and Ambiguity for Investigators. Department of Health and Human Services, Office of the Secretary, 45 CFR Parts 46, 160, and 164; Food and Drug Administration, 21 CFR Parts 50 and 56. Accessed December 18, 2012.
15. Dyrbye LN, Thomas MR, Mechaber AJ, et al. Medical education research and IRB review: An analysis and comparison of the IRB review process at six institutions. Acad Med. 2007;82:654–660.
16. U.S. Department of Health and Human Services. Exempt Research Determination—FAQ: Who may determine that research is exempt? Accessed December 18, 2012.
17. Guyatt GH, Sackett DL, Cook DJ. Users’ guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1993;270:2598–2601.
© 2013 Association of American Medical Colleges