Medical education researchers, like other academics, face increasing pressure to demonstrate, or “prove,” the impact of their work.1–6 This pressure may represent accountability, but it also results from a need to prove the relative value of one’s work in a competitive, constrained funding environment in academic health sciences.5,7–9 Historically, conventional metrics for assessing research impact have prioritized counts of grants, publications, and citations. Together, these conventional measures of research productivity are considered evidence of the significance of a researcher’s work and thus drive promotions and affect future funding successes. Increasingly, however, researchers from diverse fields question the ability of these conventional metrics to adequately represent the impact of their work.10–12 The full influence and effects of research, including societal impact and research quality, are difficult to capture using only conventional, quantitative metrics.
Metrics establish incentives that influence behavior and can change the academic system.13 What counts as impact represents what is valued and rewarded in any given context, with implications for the types of questions researchers choose to pursue. Every knowledge community has its own definitions of what counts as knowledge, how that knowledge should be produced, and how the quality of that knowledge production should be evaluated.14 Definitions of impact and knowledge shape and constrain researchers’ foci and endeavors. Therefore, metrics that meaningfully evaluate the knowledge outputs of researchers need to be defined within each field.
It is time for medical education research, as a field, to examine how it measures research impact and consider the broader implications these measures may have. In this article, we will discuss developments in research metrics more broadly, critically examine impact metrics currently used in our field, and propose an alternative to more meaningfully track and represent impact in medical education research. The term “impact” is used to discuss the effect or influence of research, within and beyond academia. Metrics, measures, and indicators are the data collated to provide evidence of research impact.
Developments in Research Metrics
Commonly, researchers are evaluated by their number of peer-reviewed publications, which may include considerations of authorship position and journal impact factor (JIF).15,16 Another conventional metric is grant capture, encompassing number of grants, dollar amount of grants, and type of grant (e.g., local, national, international, foundation based). Often, researchers with more publications are more competitive for grant funding. Citations are a conventional metric to assess research impact, the rationale being that if a paper is cited, then dissemination of knowledge is occurring, and therefore the research is having an impact on policy or future research. The most common metric is citation counts, though some researchers may take the time to delve into who is citing their work or where their work is being cited. While conventional metrics are commonly used to evaluate researchers, the pressure to publish can lead to questionable authorship practices.17 Moreover, a journal’s JIF is derived from citations for all articles in that journal and does not indicate the quality of a specific journal article15; citations are slow to accrue and can often misrepresent an article’s content.18
In an attempt to address critiques of conventional metrics,15,18–20 Priem (an information science PhD student) and colleagues21 developed altmetrics as a way of expanding definitions of research impact measurement beyond the narrow focus on peer review, JIF, and citation count. Unfortunately, altmetrics still suffer many of the same pitfalls as conventional metrics.22 They focus on counting citations, albeit from different, nonacademic sources (e.g., blogs, social media, news reports), and are measures of attention to, rather than quality of, research output.23–26 For example, Wakefield and colleagues’ (1998) now-infamous article about the measles, mumps, and rubella vaccine and predisposition to pervasive developmental disorder in children is in the top 5% of all research outputs ever tracked via Altmetric, with 1,533 tweets, 168 news reports, and 106 blog mentions as of late 2018.27 This is because Wakefield’s publication was retracted, igniting a scandal that caused the paper to be highly shared. Such an instance exemplifies how altmetrics cannot answer the criticism that conventional metrics face—that is, they are measures of attention, not quality—and that quantitative measures cannot adequately represent societal impact resulting from research.18,28,29
In an attempt to address these remaining issues, scholars in other fields (e.g., health services and policy,10 biomedical sciences,30,31 management,12 research evaluation,32 arts-based health research11) have built upon conventional metrics and altmetrics approaches in a number of unique ways (see Table 1 for a summary of trends in the development of research impact from different fields). Although these frameworks offer promising ways forward, we cannot simply adopt them in medical education. Every field has unique forms of impact that are not necessarily captured by the tracking strategies used by other fields. The frameworks can, however, serve as a starting point for a customized way of representing impact for the field. This Perspective proposes a reconceptualization of both what is captured and how to communicate research impact.
Introducing Grey Metrics
Education is a complex social process, and medical education research contributes ideas and orientations that influence perspectives, practices, and policy33 that are not fully captured by conventional metrics and altmetrics. Thus, we propose a novel indicator of research impact called “grey metrics,” which can contribute to identifying meaningful research impact not currently captured. Grey metrics may include but are not limited to nonconventional citations, informal sharing of research findings, informal and formal consults, and communications indicating appreciation or applications related to one’s research. Grey literature comprises unindexed and/or unpublished materials that are often difficult to find because they exist outside of academic databases. Yet, grey literature is a key source of knowledge (hence its inclusion in systematic and scoping reviews).34 Similarly, grey metrics are so named because they may be difficult to track, being informal and unindexed, yet they contribute crucial insights into how one’s research is making an impact. The lack of a searchable database for grey metrics should not be confused with lower importance. Aligning research metrics with the purposes of the field is a more rigorous way of demonstrating impact than relying on existing metrics developed for other fields with different purposes. Grey metrics comprise many different indicators that capture the kinds of impacts researchers wish to see in education but that are insufficiently represented by conventional metrics or altmetrics. When collected, grey metrics demonstrate the influence and effects of one’s research.
By nonconventional citations (beyond the websites or podcasts suggested by altmetrics), we are referring to sources such as peer-reviewed conference posters, presentations, and keynote addresses (e.g., mentioned in the slides); listserv mentions; inclusion in curricula or to inform innovative teaching tools; or citation on websites as seminal papers (e.g., inclusion of publication in Key Journal Articles in MedEdWorld [https://www.mededworld.org/Resources/Publications/Articles.aspx]). These instances represent well-aligned markers of impact because they illustrate how an education researcher’s work is influencing actual education practices and thinking.
Another indicator of research uptake is informal requests to share research findings. Slide sharing requests represent how one’s research is addressing knowledge gaps, as one’s research findings are used to inform perspective, practice, or organizational changes. Although one’s research may have influenced practice or policy decisions, if the use is not published (e.g., to design a curriculum or inform an innovative teaching tool), there is no formal recognition. Tracking informal research sharing produces data documenting an otherwise-undiscoverable impact of research. Capturing informal sharing may be facilitated by sites such as ResearchGate (https://www.researchgate.net), which makes it easy to contact researchers and encourages users to provide comments about why they used a research output or how they adapted it.
In addition, informal and formal consults (telephone, in-person, or electronic via email or platforms such as ResearchGate) can also indicate impact. Consults may range from providing insights on a grant citing one’s work to advising on evidence-informed curriculum change using one’s findings. One might receive email requests for more information or guidance. These communiques deepen others’ understanding of one’s work, may lead to further recognition, and exemplify how research affects future research questions and practice changes as one informs work at other departments or institutions. Invitations to consult indicate that one is considered to have expertise and is serving as a collaborator and capacity builder. This is influential work, but it is often done invisibly; if tracked, it could serve as an indicator of significant impact. These important exchanges may be sacrificed if they don’t “count.”
A separate but related and important grey metric is the communication of appreciation or applications regarding one’s research (e.g., email, direct messages on ResearchGate, in-person conversation at conferences). An assessment of the United Kingdom Research Excellence Framework (UK REF) identified that evidence of realized outcomes for specific stakeholders “is particularly powerful when … backed by data or testimony from research users.”35 Such unsolicited stakeholder feedback may provide a chance to understand how one’s findings are being used, can lead to consultations or collaborations, and should be tracked as evidence of research impact.
With unconventional or grey metrics in mind, one begins to notice new types of indicators and different connections that demonstrate the influence of one’s research. As knowledge is created, shared, and applied to practice in nonlinear ways,36 grey metrics are key in expanding thinking about “what counts” as research impact. Just as grey literature makes important contributions to a systematic review,34 grey metrics are seemingly small pieces of evidence but, when collected and combined, more fully capture the processes of medical education research and its effects, charting research’s ability to influence perspectives, practice, and policy.
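Grey-metric indicators of the kinds described above become usable evidence only if they are recorded as they occur. As a purely illustrative sketch (the record fields and category names here are our own hypothetical choices, not part of any proposed standard), a researcher could keep a simple structured log of such events and tally it when assembling evidence of impact:

```python
from dataclasses import dataclass
from datetime import date
from collections import Counter

@dataclass
class GreyMetric:
    """One grey-metric event (hypothetical record structure)."""
    kind: str    # e.g., "nonconventional_citation", "consult", "informal_share", "appreciation"
    source: str  # who or where, e.g., "keynote slide mention" or "email from course director"
    detail: str  # free-text rationale for why this event indicates impact
    when: date

def summarize(log):
    """Tally grey-metric events by kind to support an impact narrative."""
    return Counter(m.kind for m in log)

# Example log entries (invented for illustration)
log = [
    GreyMetric("consult", "curriculum committee at a partner institution",
               "advised on evidence-informed curriculum change", date(2019, 3, 1)),
    GreyMetric("informal_share", "email request for conference slides",
               "findings used to inform teaching practice", date(2019, 4, 12)),
    GreyMetric("consult", "colleague preparing a grant",
               "provided insights on a grant citing this work", date(2019, 5, 2)),
]
```

The tally itself is only raw material; as argued below, the rationale recorded in each entry is what allows these counts to be woven into a meaningful narrative rather than presented as bare numbers.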
Piecing It All Together: Impact Stories
To capture the knowledge outputs and impact of researchers meaningfully, we need not only to reconceptualize what we think of and capture as impact (e.g., expanding our definition to grey metrics) but also to rethink how we present impact. A cohesive picture of research impact is missing in quantitative measures. Rather than tracking impact as a collection of piecemeal data sources, we suggest a move toward more cohesive representations of research impact.
Drawing on the power of narrative,37 we suggest telling “impact stories,” which can provide a memorable way to depict the effects of medical education research. Impact stories link different metrics to a broader educational goal by expressing how one’s work aligns with medical education’s purposes and values. Impact stories are useful in elucidating why a chosen metric is meaningful and how one’s work has been influential, offering a cohesive and compelling narrative structure rather than strings of sentences listing conventional metrics, altmetrics, or grey metrics. Examples of practical contexts where impact stories could be used include “Most Significant Contributions” sections on Canadian Institutes of Health Research applications,38 when asked for case studies as evidence for research impact assessment (e.g., UK REF),39 research statements in curricula vitae, documents for promotions and tenure committees, and annual reports.
An impact story could encompass diverse metrics, tracked and combined into a narrative that shows why one’s research is valuable. Part of crafting an impact story is identifying and developing indicators that make sense for one’s particular context and research question. It is useful for medical education researchers to understand potential types of impact and stay attuned to the developments of new metrics, but one should not incorporate a metric simply because it is a new measure. The indicators of impact must be relevant to one’s story. Impact stories in medical education are effective when they elucidate how the research is helping achieve or transform educational goals. Lingard’s “The Writer’s Craft”40 series of papers in Perspectives on Medical Education offers many useful writing guidelines specifically tailored to medical education research; they may also be useful in writing impact stories.
An impact story should align with the educational goals of one’s department and institution, but what is emphasized will depend on the audience. Determining a static representation of value is difficult (perhaps impossible). Instead, foregrounding and backgrounding indicators that resonate with the audience can help the audience understand and remember the significance of one’s research. For example, when presenting to a hospital board, one could highlight how one’s research aligns with institutional missions and vision, or consider how medical education research might save or contribute money to the institution.
The guiding questions for crafting impact stories are many and may include the following: What are the purposes and values of education? As elucidated in Baker and colleagues’41 article about aligning and applying the paradigms of education, the answer to this question likely guides one’s work as a researcher, in the types of questions asked and pursued. Is the purpose of education to allow learners to gain knowledge and become competent? Is it to inspire lifelong learning? Is it to encourage learners to think critically? What goals does one want to accomplish with one’s research? What indicators more fully demonstrate whether one’s research has achieved those goals? Does one’s research address the moral work of education? If one’s work contributes to developing health professionals with a commitment to caring, or an awareness of social justice, include descriptions of the types of societal and educational benefits afforded by implementation of one’s research findings. What is the most meaningful story to help others understand the value of one’s research? Is the impact story told in a way that is authentic, transparent, and trustworthy (based on data, not embellishment42)?
To better represent the suggestions above, we provide an impact story example using conventional and grey metrics (see Box 1).
There are practical concerns regarding grey metrics and impact stories, including time and resources for tracking, unsystematic collection, and subjective interpretation of their significance. A limitation of grey metrics is the focus on measures within academia, with the caveat that medical education research applied to curriculum and instructional design holds the promise of effecting changes beyond academia (e.g., critical pedagogy to help learners recognize social structures and relations, challenge dominant beliefs, and become agents of change).43 All metrics are susceptible to being gamed, which can include inflation or misrepresentation, or inspire questionable practices earlier in the research process to better meet a metric (e.g., unethical authorship practices or salami slicing to increase publication counts). In the case of grey metrics, individuals are given freedom to provide their own rationale for why a measure is meaningful and indicative of impact (e.g., acting as consultant on curriculum development). This is a highly subjective activity; a grey metric could be interpreted in myriad ways and is susceptible to manipulation. Yet the act of having to interpret an indicator, to always question why an indicator is important and how it is meaningful, is an important activity researchers should engage in to ensure that metrics are not collected and used without thinking—this applies to conventional metrics, altmetrics, and grey metrics. Reconceptualizing the types of metrics that may be appropriate for medical education research is not to pit quantitative (conventional and altmetrics) against qualitative (grey metrics). Numbers can be made meaningful through sound analysis and application; the same is true of qualitative measures.
“One size is unlikely to fit all,” but carefully selected quantitative and qualitative indicators can complement one another.44 As per the impact story example (Box 1), combining quantitative and qualitative metrics constructs a more holistic, meaningful representation of why and how medical education research affects academic and societal perspectives, practices, and policy. One of the main challenges facing new metrics and ways of communicating impact is that adoption is predicated on leaders, departments, and institutions recognizing the need for and valuing measures beyond conventional metrics. We hope this article contributes to leaders’ considerations of how “inappropriate indicators create perverse incentives.”44 Individuals need organizations to create and support change if new cultural values are to be created and promoted.
Conventional and altmetrics should not be used uncritically, and neither should grey metrics and impact stories. As with all metrics, there are challenges to collecting grey metrics and ensuring their responsible use.44 Grey metrics and impact stories are not meant to supplant existing metrics but, instead, are meant to offer a starting point for medical education researchers to further investigate possible ways of understanding and communicating research impact.
It behooves those in the field of medical education to critically examine current research metrics and engage in creating and supporting indicators of research impact that align with educational philosophy, values, and goals. If one conducts medical education research to transform how curriculum, teaching, and learning occur in the field, then conventional metrics may fall short in demonstrating this type of impact. With conventional metrics and altmetrics, medical education researchers risk capturing too few, or inappropriate, forms of impact. Grey metrics advance a way to track other types of impact—not only numbers but also collaborations and processes indicative of research impact (e.g., formal and informal consults).
Numbers alone do not express the human impact of education work, but impact stories can begin to convey the richness and nuance of medical education research impact. Existing research metrics too often focus on measuring, without attending to meaning. As has been observed, “We need to ensure that we are indeed measuring what we value,”45 not just measuring what we can easily count, and thus end up valuing what we can count. Medical education researchers need to value and develop the use of stories to build credible and interesting narratives about the impact of their work. Doing so successfully can increase interest and investments in education. We hope that this article will begin a conversation and set out a research agenda to help medical education conceptualize and study metrics more appropriate for the field. This suggestion is timely and aligned with work around expanding definitions of scholarly activity in medical education.46,47
Yet we end with a note of caution: As the medical education research community works toward a broader definition of impact, researchers must be mindful about the urge to further quantify their achievements and be wary of the “impact agenda.”48 The purpose of reconceptualizing impact is not to continually invent new metrics. Our observations are not meant to be prescriptive, nor to create increasingly burdensome ways for researchers to “prove their worth.” The hope is to inspire alternative, innovative ways of seeing, thinking about, and representing medical education research impact more appropriately aligned with medical education researchers’ contexts, values, and drive for quality. Demonstrating meaningful impact matters because it has implications for individual careers and institutional reputations, but we must try to resist a full embrace of the impact agenda. We must always be mindful that the time we spend tracking impact could often be better spent doing work that has impact.
Acknowledgments: The authors wish to thank Arno Kumagai for his guidance, input, and encouragement; Ayelet Kuper for her suggestions; and Karen Leslie for her support.
1. Carpenter CR, Cone DC, Sarli CC. Using publication metrics to highlight academic productivity and research impact. Acad Emerg Med. 2014;21:1160–1172.
2. Camp M, Escott BG. Authorship proliferation in the orthopaedic literature. J Bone Joint Surg Am. 2013;95:e44.
3. Weiss AP. Reviews and overviews measuring the impact of medical research: Moving from outputs to outcomes. Am J Psychiatry. 2007;164:206–214.
4. Greenhalgh T. Research impact: Defining it, measuring it, maximising it, questioning it. BMC Health Serv Res. 2014;14(suppl 2):O30.
6. Rawat S, Meena S. Publish or perish: Where are we heading? J Res Med Sci. 2014;19:87–89.
7. Carline JD. Funding medical education research: Opportunities and issues. Acad Med. 2004;79:918–924.
8. Reed DA, Kern DE, Levine RB, Wright SM. Costs and funding for published medical education research. JAMA. 2005;106:1410.
9. Todres M, Stephenson A, Jones R. Medical education research remains the poor relation. BMJ. 2007;335:333–335.
10. Kuruvilla S, Mays N, Pleasant A, Walt G. Describing the impact of health research: A research impact framework. BMC Health Serv Res. 2006;6:134.
11. Parsons JA, Gladstone BM, Gray J, Kontos P. Re-conceptualizing “impact” in art-based health research. J Appl Arts Heal. 2017;8:155–173.
12. Aguinis H, Shapiro DL, Antonacopoulou EP, Cummings TG. Scholarly impact: A pluralist conceptualization. Acad Manag Learn Educ. 2014;13:623–639.
13. Hicks D, Wouters P, Waltman L, de Rijcke S, Rafols I. The Leiden Manifesto for research metrics. Nature. 2015;520:429–431.
14. Cetina KK. Culture in global knowledge societies: Knowledge cultures and epistemic cultures. Interdiscip Sci Rev. 2007;32:361–375.
15. PLoS Medicine Editors. The impact factor game. PLoS Med. 2006;3:e291.
16. Archambault É, Larivière V. History of the journal impact factor: Contingencies and consequences. Scientometrics. 2009;79:635–649.
17. Artino AR Jr, Driessen EW, Maggio LA. Ethical shades of gray: International frequency of scientific misconduct and questionable research practices in health professions education. Acad Med. 2019;94:76–84.
18. Gruber T. Academic sell-out: How an obsession with metrics and rankings is damaging academia. J Mark High Educ. 2014;24:165–177.
19. Martin BR. Editors’ JIF-boosting stratagems: Which are appropriate and which not? Res Policy. 2016;45:1–7.
20. Hoeffel C. Journal impact factors. Allergy. 1998;53:1225.
23. Neylon C, Wu S. Article-level metrics and the evolution of scientific impact. PLoS Biol. 2009;7:e1000242.
25. Haustein S. Grand challenges in altmetrics: Heterogeneity, data quality and dependencies. Scientometrics. 2016;108:413–423.
26. Maggio LA, Meyer HS, Artino AR Jr. Beyond citation rates: A real-time impact analysis of health professions education research using altmetrics. Acad Med. 2017;92:1449–1455.
27. Wakefield AJ, Murch SH, Anthony A, et al. RETRACTED: Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children. Altmetric.com. https://elsevier.altmetric.com/details/102093. Accessed February 20, 2019.
29. Azer SA, Holen A, Wilson I, Skokauskas N. Impact factor of medical education journals and recently developed indices: Can any of them support academic promotion criteria? J Postgrad Med. 2016;62:32–39.
31. Sarli CC, Dubinsky EK, Holmes KL. Beyond citation analysis: A model for assessment of research impact. J Med Libr Assoc. 2010;98:17–23.
32. Morton S. Progressing research impact assessment: A “contributions” approach. Res Eval. 2015;24:405–419.
33. Watson L. Developing indicators for a new ERA: Should we measure the policy impact of education research? Aust J Educ. 2008;52:117–128.
34. Schöpfel J, Farace D. Grey Literature in Library and Information Studies. 2010. New York, NY: De Gruyter Saur.
36. Greenhalgh T, Wieringa S. Is it time to drop the “knowledge translation” metaphor? A critical literature review. J R Soc Med. 2011;104:501–509.
37. Davies JE. Stories of Change: Narrative and Social Movements. 2002. Albany, NY: State University of New York Press.
40. Lingard L. The writer’s craft. Perspect Med Educ. 2015;4:79–80.
41. Baker L, Wright S, Mylopoulos M, Kulasegaram K, Ng S. Aligning and applying the paradigms and practices of education [published online ahead of print March 5, 2019]. Acad Med. doi:10.1097/ACM.0000000000002693.
43. Halman M, Baker L, Ng S. A critical review of critical consciousness in health care education. Perspect Med Educ. 2017;6:12–20.
45. Biesta G. Good education in an age of measurement: On the need to reconnect with the question of purpose in education. Educ Assess Eval Account. 2009;21:33–46.
46. Ellaway R, Topps D. METRICS: A pattern language of scholarship in medical education. MedEdPublish. 2017;6:30.
47. Cheng A, Calhoun A, Topps D, Adler MD, Ellaway R. Using the METRICS model for defining routes to scholarship in healthcare simulation. Med Teach. 2018;40:652–660.
48. Watermeyer R. Impact in the REF: Issues and obstacles. Stud High Educ. 2016;41:199–214.
Box 1 “Relationships of Power” Article Grows “Louder Than Words”: An Example of an Impact Story