Research Methodologies in Health Professions Education Publications: Breadth and Rigor : Academic Medicine


Surveying the Medical Education Landscape

Research Methodologies in Health Professions Education Publications: Breadth and Rigor

Han, Heeyoung PhD1; Youm, Julie PhD2; Tucker, Constance PhD3; Teal, Cayla R. PhD, MA4; Rougas, Steven MD, MS5; Park, Yoon Soo PhD6; J. Mooney, Christopher PhD, MPH7; L. Hanson, Janice PhD, EdS8; Berry, Andrea MPA9

Academic Medicine 97(11S):p S54-S62, November 2022. | DOI: 10.1097/ACM.0000000000004911



Research methodologies represent assumptions about knowledge and ways of knowing. Diverse research methodologies and methodological standards for rigor are essential in shaping the collective set of knowledge in health professions education (HPE). Given this relationship between methodologies and knowledge, it is important to understand the breadth of research methodologies and their rigor in HPE research publications. However, there are limited studies examining these questions. This study synthesized current trends in methodologies and rigor in HPE papers to inform how evidence is gathered and collectively shapes knowledge in HPE.


This descriptive quantitative study used stepwise stratified cluster random sampling to analyze 90 papers from 15 HPE journals published in 2018 and 2019. Using a research design codebook, the authors conducted group coding processes for fidelity, response process validity, and rater agreement; an index quantifying methodological rigor was developed and applied for each paper.


Over half of research methodologies were quantitative (51%), followed by qualitative (28%), and mixed methods (20%). No quantitative and mixed methods papers reported an epistemological approach. All qualitative papers that reported an epistemological approach (48%) used social constructivism. Most papers included participants from North America (49%) and Europe (20%). The majority of papers did not specify participant sampling strategies (56%) or a rationale for sample size (80%). Among those reported, most studies (81%) collected data within 1 year.

The average rigor score of the papers was 56% (SD = 17). Rigor scores varied by journal categories and research methodologies. Rigor scores differed between general HPE journals and discipline-specific journals. Qualitative papers had significantly higher rigor scores than quantitative and mixed methods papers.


This review of methodological breadth and rigor in HPE papers raises awareness in addressing methodological gaps and calls for future research on how the authors shape the nature of knowledge in HPE.

Research is a scientific social process that systematically synthesizes evidence to create knowledge. Foundational to research practices are assumptions underlying the nature of reality (ontology) that lead to assumptions about the nature of knowing (epistemology), which in turn, influence decisions about the nature of inquiry (methodologies). 1–3 Given the sequential alignment among ontology, epistemology, and methodologies of research, it is imperative that researchers be explicit regarding their assumptions. Yet researcher assumptions are often not explicitly stated in health professions education (HPE) research. 4 Indeed, many studies take ontological and epistemological assumptions for granted and excessively focus on methodology. 3 Further, empirical research in HPE is often limited to a specific subset of methodologies.

The habitual use of research methodologies leads to questions regarding whether our research methodologies shape the nature of knowledge in HPE rather than the reverse, and if so, how research methodologies have shaped evidence and body of knowledge in this field. Thomas and colleagues 5 argued that research methodologies and researchers’ epistemologies shape the nature of knowledge, as methodologies determine what kind of knowledge is possible, legitimate, and trustworthy. Similarly, Balmer and colleagues reported that the nature of knowledge described within longitudinal qualitative papers depends on researchers’ assumptions around the nature of time as fluid or static. 6 In addition, in an HPE context, the prevalence of a limited subset of research methodologies and our inclination to focus on specific methodologies potentially narrow the scope of our knowledge. Biesta and van Braak 7 criticized the limited medical education research methodologies, noting that currently used methodologies fail to recognize dynamic and complex educational practices. They encourage researchers to expand their methodologies to capture what happens in learning interactions. While there are expectations that the research question should guide the choice of a study methodology, 8,9 the use of specific study methodologies can also shape research questions. In this sense, perhaps there is a more complex relationship between methodology and knowledge. Therefore, it becomes relevant to consider how research methodologies have shaped knowledge in HPE.

Reviewing the history of HPE provides evidence on how researchers’ methodologies have shaped knowledge and practice in HPE. Informed by the milieu of medical education in the mid-1900s, Ham 10 described the trends and evolution of medical education research and curricula 60 years ago in the United States. The medical education research approach at that time heavily relied on a quantitative methodology that was grounded in empiricism and the scientific method of investigation. In the 1980s, qualitative research paradigms that originated in other disciplines, such as anthropology and sociology, were introduced into medical education. 2 In their commentary in the mid-1990s, Colliver and Verhulst 11 argued against a qualitative research methodology, claiming it provided weak subjective evidence. They concluded by advocating positivist research methodologies and stated, “Research should be driven by research questions, not research methods, and any attempt to legislate the use of a particular method or combination of methods is a threat to the creativity and viability of scientific research.” 11(p211) Within this rigid atmosphere that exclusively promoted quantitative methodologies, HPE research favored positivism-oriented methodologies, even when using qualitative methods, and this limited focus continues to affect opinions about research quality and rigor in HPE research. 12,13

The field’s preference for quantitative methodologies has been changing over the last 2 decades as qualitative research became better appreciated as another legitimate research methodology. 2,14–16 Increasing uptake and support for methodological diversities has changed the landscape of HPE literature with greater acceptance of subjective and interpretive constructivism paradigms. Since HPE research topics are broad, 17 the field needs such openness to diverse ontological and epistemological assumptions and methodologies. 18 Educational research is a dynamic process where researchers merge existing methodologies (between and within different types of methods) and reconceptualize ways of asking and answering questions. 19 Researchers must remain nimble, as they may need to apply different paradigms depending on the nature of knowledge about the phenomenon under study. 20–22

As HPE research methodologies evolved from positivism to include social constructivism and critical inquiry, and as methodological pluralism emerged, the education research community increasingly questioned the notion of research methodology rigor. 18 Medical educators recognized that diverse methodologies including varying epistemological assumptions and their relevant standards of rigor were needed to advance the field. 12–15,18,20,21 In the foreword to the 2020 Research in Medical Education supplement of Academic Medicine, Park and colleagues 23 noted the importance of scientific rigor in the diverse research approaches in HPE research that comprises methodological processes including epistemological assumptions, research designs, and research methods (data collection, analysis, and interpretation). Although they noted the variety of standards of rigor based on the elements of research methodologies, they argued that standards of rigor should be implemented and reported in research papers.

Diverse research methodologies and the implementation of methodological standards for rigor are essential in shaping the collective sets of knowledge we create in the field of HPE. Given the importance of a comprehensive repertoire of epistemological and methodological approaches in creating collective sets of knowledge in HPE, it is critical to understand the breadth of research methodologies and their rigor in HPE research publications. To date, however, there are limited studies examining these questions. By answering the following 2 research questions, we aim to describe current trends in research methodologies and level of rigor in the field of HPE: (1) What is the breadth of research designs reported by current HPE publications? (2) What is the level of methodological rigor described by current HPE publications?


We adopted a descriptive quantitative study using a positivist epistemology to understand the observable breadth of research designs and corresponding methodological rigor in a sample of HPE empirical research papers.

Sampling/data collection

We conducted stepwise stratified cluster random sampling of articles, where we first randomly selected journals, then issues, and then research articles. As a starting point, we used the Association of American Medical Colleges (AAMC) Group on Educational Affairs (GEA) Medical Education Scholarship Research and Evaluation (MESRE) Annotated Bibliography of Journals of Educational Scholarship in 2019. We (1) selected journals focusing on HPE (n = 44); (2) categorized the selected journals into 3 categories: general medical education (e.g., Academic Medicine), medical education in specific domains (e.g., Journal of Cancer Education), and other HPE (e.g., Journal of Nursing Education); (3) randomly selected 5 journals from each category (n = 15, 34%) and chose publications from a 2-year period between January 2018 and December 2019; and then (4) randomly selected an issue in each year and 3 research articles in each selected issue. We did not include publications in 2020 and 2021 as there were unusual patterns in submissions and publications due to the COVID-19 pandemic. This resulted in 90 articles for analysis. We decided on a sample size of 30 articles in each group (general medical education, domain-specific medical education, and other HPE journals) to enable the comparison of group differences, assuming a sufficient number in each research approach based on the IBM SPSS Statistics (version 27) calculation of a sample size with an estimated power of 0.80. The inclusion criteria included empirical studies and original research from education-focused journals. We excluded journals publishing general medical research without a specific focus on education due to feasibility concerns as we would have had to review all papers for sampling to determine whether each article was eligible for our study. We also excluded papers that were innovation studies, case reports, conceptual and literature reviews, meta-analyses, and perspectives.
The data collection was conducted in March 2021.
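The stepwise sampling described above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the procedure the authors ran: the journal identifiers, the 14/15/15 split of the 44 journals across categories, the number of issues per year, and the number of articles per issue are all hypothetical placeholders, since the source reports only the totals.

```python
import random
from collections import Counter

# Hypothetical journal pools. The source gives only the total (n = 44),
# so the 14/15/15 split across categories is an assumption.
JOURNALS = {
    "general medical education": [f"general-{i:02d}" for i in range(1, 15)],
    "domain-specific medical education": [f"domain-{i:02d}" for i in range(1, 16)],
    "other HPE": [f"other-{i:02d}" for i in range(1, 16)],
}


def sample_articles(journals_by_category, years=(2018, 2019),
                    journals_per_category=5, articles_per_issue=3,
                    issues_per_year=6, articles_in_issue=10, seed=2021):
    """Stratify by category, then randomly pick journals, issues, and articles.

    issues_per_year and articles_in_issue are assumed values used only to
    give the random draws something to choose from.
    """
    rng = random.Random(seed)
    sample = []
    for category, journals in journals_by_category.items():
        # Step 3: randomly select journals within each stratum (category).
        for journal in rng.sample(journals, journals_per_category):
            for year in years:
                # Step 4a: randomly select one issue in each year.
                issue = rng.randint(1, issues_per_year)
                # Step 4b: randomly select articles within that issue.
                picked = rng.sample(range(1, articles_in_issue + 1),
                                    articles_per_issue)
                for article in picked:
                    sample.append((category, journal, year, issue, article))
    return sample


sample = sample_articles(JOURNALS)
per_category = Counter(row[0] for row in sample)
```

With 3 categories, 5 journals per category, 2 years, 1 issue per year, and 3 articles per issue, the sketch yields 3 × 5 × 2 × 1 × 3 = 90 articles, 30 per category, matching the sample size reported in the text.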

Coding structure

We used audit methodology to analyze the sample articles. 24 An audit procedure has been introduced in social science research to investigate the quality of studies. 25,26 We applied this methodology to determine whether the articles that we sampled met specific standards of methodological rigor. For the audit process, we developed a coding structure based on the literature around research design and rigor. 12,13,16,20,21,23,27–39 The coding structure included subcategories: methodological philosophy, research design, and research methods, which we elaborated into epistemology, 40 population, sampling, data sources, data collection, and data analysis. These subcategories included specific coding questions based on 3 different research approaches: qualitative, quantitative, and mixed methods. We also included general questions such as population, sampling and recruitment methods, rationale for sample size, and data collection duration.

Questions addressing standards of rigor were added for each research methodology. For qualitative papers, these questions related to the reporting of specific qualitative study approaches, sampling strategy, source of data, data analysis process, reflexivity, and trustworthiness. 14,20,21,27,33–36 For quantitative research methodologies, the literature suggested that we tailor our rigor questions to the specific research design. 14,20,21,28–32 For example, for papers using a survey measurement for either relational or descriptive studies, we added questions regarding common method bias that could introduce validity threat in the measurement, 28 which could be addressed by statistically controlling or using different measures rather than relying on a single survey measurement. For studies using a causal inference design, we included a question regarding the use of an active control group rather than an inactive control group. 32,41 For validation or measurement development studies, we included a question inquiring about a pilot study before the main study. Last, for mixed methods studies, our rigor questions asked about a rationale for this specific design, discussing the data mixing/integration approach, and sharing insights from mixing the methods. 37,38

With a set of preliminary coding questions, we formed 3 coding groups based on study team members’ expertise: qualitative, quantitative, and mixed methods. Each group reviewed the coding questions to improve the fidelity of the coding tool and response process validity. The group coding development process included reviewing the questions, keeping notes about the questions and responses, providing corrections and clarifications on the questions, individual coding, reviewing group agreement and disagreement, reconciling discrepancies, and recommending changes in the questions. The coding questions development process was iterative: we implemented changes in the questionnaire, coded additional articles using the updated questions, and discussed further changes until no additional changes were needed. We completed 3 cycles of this codebook improvement process from March through July 2021.

Table 1 includes the overall question structure to measure rigor reported in the papers and the breadth of research methodologies in each category. Actual questions are attached in Supplemental Digital Appendix 1. We had 10 questions exclusively for the descriptions of the breadth of research design (annotated as footnote “a” in Table 1). There were 33 questions to measure rigor for mixed methods papers, which included 8 general questions, 3 mixed methods specific questions, 10 qualitative research questions, and 12 quantitative research questions. Qualitative papers had 18 questions including 8 general rigor questions. Quantitative papers had 20 rigor questions. We implemented the coding questions using SurveyMonkey.

Table 1:
Coding Questions

Coding process

After developing the aforementioned coding structure, we completed individual coding of all 90 articles by working group: quantitative, qualitative, and mixed methods. Each group had 3 members to enable 3 coding pairs for each round of group coding. When the pair disagreed on codes, the members discussed and negotiated consensus codes by providing rationale and evidence. By doing this, the groups were able to develop consensus on all codes. Each group went through 4–6 group coding processes until 9 pairs achieved > 80% agreement, 42 which resulted in completing 38 articles through the group coding processes. Once we met > 80% rater agreement, we conducted individual coding to complete the remaining 52 articles from July to September 2021.
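As a rough illustration of the agreement check, the sketch below computes simple percent agreement for every pair of coders in a 3-member group and compares it with the 80% threshold. The source does not state which agreement statistic was used, so simple percent agreement is an assumption here, and the coder labels and codes are hypothetical.

```python
from itertools import combinations


def percent_agreement(codes_a, codes_b):
    """Share of coding questions on which two coders gave the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty question set")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


def pairwise_agreement(codes_by_coder):
    """Percent agreement for every coder pair (3 coders give 3 pairs)."""
    return {
        (first, second): percent_agreement(codes_by_coder[first],
                                           codes_by_coder[second])
        for first, second in combinations(sorted(codes_by_coder), 2)
    }


# Hypothetical codes from one group for a single article (5 questions).
group_codes = {
    "coder_1": ["yes", "no", "yes", "n/a", "yes"],
    "coder_2": ["yes", "no", "yes", "n/a", "no"],
    "coder_3": ["yes", "no", "yes", "n/a", "yes"],
}

agreement = pairwise_agreement(group_codes)
# A pair moves on to individual coding only when agreement exceeds 80%.
pairs_meeting_threshold = [pair for pair, pct in agreement.items() if pct > 80]
```

In this toy example, only the pair of coders 1 and 3 exceeds the threshold; the other two pairs sit at exactly 80% and would need another round of group coding.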

Data analysis

The coding questions were designed to provide answers to the stated research questions focused on the breadth of methodologies and the level of methodological rigor in HPE publications. For the breadth of methodologies, we used descriptive statistics, relying on frequencies and response counts. For example, one question was, “What qualitative research design did the authors use?” It included 6 qualitative research design options (narrative, grounded theory, phenomenology, case study, ethnography, and conversation analysis) as well as “other” and “not explicitly specified.” We calculated the frequencies of each option to understand the breadth of research designs.
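The frequency tabulation can be sketched as follows. The option list mirrors the example question above, while the responses are hypothetical toy data, not the study's coded results.

```python
from collections import Counter

# Answer options for the example question on qualitative research design.
DESIGN_OPTIONS = [
    "narrative", "grounded theory", "phenomenology", "case study",
    "ethnography", "conversation analysis", "other",
    "not explicitly specified",
]


def design_frequencies(responses, options=DESIGN_OPTIONS):
    """Return {option: (count, percent)} for every option, including zeros."""
    if not responses:
        raise ValueError("no responses to tabulate")
    counts = Counter(responses)
    unknown = set(counts) - set(options)
    if unknown:
        raise ValueError(f"unexpected codes: {sorted(unknown)}")
    total = len(responses)
    return {opt: (counts[opt], round(100 * counts[opt] / total))
            for opt in options}


# Hypothetical coded responses for 10 papers (illustration only).
toy_responses = (["grounded theory"] * 3 + ["phenomenology"] * 2
                 + ["ethnography"] + ["not explicitly specified"] * 4)
frequencies = design_frequencies(toy_responses)
```

Reporting every option, including those with a count of zero, keeps unused designs visible in the tabulation.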

For methodological rigor, we created an index (rigor score below) reflecting rigor from the coding responses. To measure the level of rigor, we placed a score of 1 for each question unless it was marked “not explicitly specified.” Questions (annotated as footnote “a” in Table 1) that did not fall into the answer option of “No” or “Not explicitly specified” did not get counted for the level of rigor but were used for descriptions of the breadth of research design. In addition, as each paper had a different research design that required different methodological rigor standards, there were “Not Applicable (N/A)” responses, which we removed in calculating a rigor score. Therefore, we calculated the level of rigor of each paper using the formula below:

$$\text{Rigor score (\%)} = \frac{\text{gained score}^{\S}}{\text{sum total of items measuring rigor}^{\dagger\dagger}} \times 100$$

††Sum total of items measuring rigor = the number of rigor questions in each method, excluding the number of N/A questions

§Gained score = the paper-specific sum of possible rigor score, excluding “No” or “Not Explicitly Stated” responses.
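The scoring rule can be sketched in Python as below. The response labels and the example paper are hypothetical; the sketch assumes each rigor question is stored as a free-text response, that “No”, “Not explicitly specified”, and “Not explicitly stated” earn no credit, and that “N/A” items are dropped from the denominator, as described in the text.

```python
# Responses that earn no credit (labels assumed from the text's wording).
NOT_CREDITED = {"no", "not explicitly specified", "not explicitly stated"}
NOT_APPLICABLE = "n/a"


def rigor_score(responses):
    """Percent of applicable rigor questions with a credited response."""
    normalized = [r.strip().lower() for r in responses]
    # N/A questions are removed from the denominator.
    applicable = [r for r in normalized if r != NOT_APPLICABLE]
    if not applicable:
        raise ValueError("no applicable rigor questions for this paper")
    # Each applicable question scores 1 unless it is "no" or unspecified.
    gained = sum(1 for r in applicable if r not in NOT_CREDITED)
    return 100.0 * gained / len(applicable)


# Hypothetical coding of one paper across 6 rigor questions.
paper_responses = [
    "purposeful sampling", "interviews", "No",
    "Not explicitly specified", "N/A", "triangulation",
]
score = rigor_score(paper_responses)
```

Here 5 of the 6 questions are applicable and 3 of them earn credit, so the paper's rigor score is 3 / 5 × 100 = 60%. Expressing the score as a percentage is what allows papers with different numbers of applicable questions to be compared.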

To minimize bias in rigor scoring, we followed the coding structure based on existing literature and group consensus. The rigor score reinforced the study’s positivist epistemology and philosophical framework, which assumes one objective reality. To view aspects of methodology as one objective reality, the qualitative research coding group developed a concrete, agreed-upon definition of each item on the checklist—our translation of the one objective reality. The possible rigor scores ranged from 15 points to 19 points for qualitative papers, from 8 to 19 for quantitative papers, and from 18 to 28 for mixed methods papers. We used the percentage scores given each paper’s different score ranges and conducted ANOVA to see group differences. We did not pursue IRB approval.


The research team is composed of 9 current and past elected members of the MESRE section of the GEA of the AAMC. Based on our expertise in research across paradigms, our MESRE group work included promoting and improving medical education research through facilitating regional conference abstract review processes, grant proposal reviews, and national workshops on scholarship. Many team members are editorial members of journals with substantial manuscript review experience. Ethical approval was reported as not applicable.


Research question 1: What is the breadth of research design reported by current HPE publications?


Data analysis demonstrated that most research methodologies reported in the papers were quantitative (n = 46, 51%), followed by qualitative (n = 25, 28%), and mixed methods (n = 18, 20%) (Figure 1). Only one paper, a Delphi study, was categorized as “other” as it did not fit well into the defined rigor standards. None of the quantitative and mixed methods studies reported an epistemological approach. About half of the qualitative papers (n = 13, 52%) did not report one either. Those that did report an epistemological approach (n = 12, 48%) used social constructivism including postmodernism, phenomenology, and interpretivist epistemology. No papers explicitly used other epistemological approaches, such as critical theory.

Figure 1:
Research methodologies in health professions education (HPE).

Most papers included study participant populations from North America (n = 44, 49%) and Europe (n = 18, 20%). Smaller numbers of studies were conducted in other locations (Figure 2). Study participants included medical students (n = 23, 26%) and students in other HPE programs, such as nursing, dental, or veterinary medicine programs (n = 35, 39%), faculty including community physicians involved in teaching (n = 18, 20%), residents/fellows (n = 8, 9%), patients (n = 7, 8%), and staff (n = 3, 3%). Other types of data sources (n = 20, 22%) included electronic medical records, archives, websites, and community members, including policymakers and patients’ families.

Figure 2:
Locations of study participants/data sources.

More than half of the papers did not specify their participant sampling strategies (n = 50, 56%) or a rationale for the sample size (n = 72, 80%). Most studies (n = 58, 64%) occurred at a single institution or site, while 23 papers (26%) recruited participants from multiple sites. Fifty-four (60%) studies collected data within 1 year, while 7 (8%) studies collected data within 1–3 years and 6 (7%) had a data collection timeline of more than 3 years. Almost all papers (n = 82, 91%) analyzed in this study discussed the limitations of their research methodologies.

Qualitative research papers.

In the sample selected for this study, over half of the qualitative research papers specified research design approaches (n = 14, 56%). These approaches included grounded theory (n = 6, 24%) or phenomenology (n = 4, 16%) more frequently than ethnography (n = 1, 2%), other (action research, n = 1), narrative (n = 0), case study (n = 0), or conversation analysis (n = 0).

A majority of the qualitative papers explicitly used purposeful (n = 18, 72%) and/or convenience sampling (n = 9, 40%) methods to recruit participants. The dominant sources of data were interviews (n = 14, 56%) and focus groups (n = 7, 28%). While not as prominent, some papers used documents or websites (n = 3, 12%), participant and nonparticipant observations (n = 2, 8%), or surveys (n = 1, 4%).

Most of the qualitative papers analyzed data using an inductive approach (n = 21, 84%) and/or thematic analysis (n = 14, 56%). In addition, most papers explicitly described using techniques to improve trustworthiness including triangulation (n = 14, 56%), additional reviewers to confirm findings (n = 6, 24%), member checking (n = 5, 20%), prolonged observation (n = 3, 12%), or audits (n = 2, 8%). Three papers (12%) did not specify methods to improve trustworthiness.

Quantitative research papers.

Half of the quantitative research papers used a relational study design (n = 23, 50%). Other designs were causal inference (n = 17, 37%), descriptive/observational (n = 9, 20%), and validation (n = 4, 9%). Among papers with relational and/or descriptive cross-sectional designs, most papers did not address common method bias issues (n = 25, 89%). Papers with a causal inference design used an active control group (n = 10, 59%) and a pretest (n = 9, 53%). Only one paper (25%) among the validation studies conducted a pilot study.

Data collection occurred at one time point cross-sectionally (n = 24, 52%), longitudinally with the same group (n = 17, 37%), or longitudinally with different cohorts (n = 7, 15%). Most data were collected (n = 38, 83%) prospectively rather than retrospectively (n = 9, 20%). Data sources included self-reports/perceptions (n = 32, 70%) or knowledge assessment (n = 15, 33%). There were few papers with observed behaviors/performance (n = 3, 7%).

Some papers did not report a rationale for statistical tests (n = 13, 28%) or the quality of statistical analysis (n = 19, 41%), such as effect size or confidence interval. Only some papers using relational design incorporated analysis of control/confounding variables (n = 14, 30%) and moderating/mediating variables (n = 6, 29%) when expected. The papers used parametric (n = 35, 76%) and nonparametric (n = 19, 41%) data analysis techniques. Only a few papers provided discussions of missing data (n = 10, 22%), measurement validity (n = 9, 20%), and reliability (n = 14, 30%) evidence when expected.

Mixed methods papers.

Of the 18 mixed methods papers, a majority did not report a rationale for their use of a mixed methods approach (n = 16, 89%). Most of the mixed method papers used quantitative methodologies (n = 10, 56%) as their dominant inquiry approach, while only one paper (6%) had a dominant qualitative inquiry approach. There were 7 papers (39%) that used balanced quantitative and qualitative methods. Mixed methods papers mostly used triangulation design (n = 14, 78%) that complemented each data set on the same topic. 38 However, most triangulation design papers used a survey that included quantitative measures and open-ended questions rather than conducting independent qualitative and quantitative data collection methods. There were only 2 papers that used explanation design, 1 paper that used embedded design, and 1 that used exploration design. 38 Nearly all mixed methods papers did not specify a qualitative research design (n = 17, 94%). For the quantitative data component, most of the papers used descriptive/observational design (n = 14, 78%). Other quantitative components included causal inference (n = 4, 22%), relational designs (n = 1, 6%), and consensus research using a modified Delphi technique (n = 1, 6%). Only 5 studies (28%) explicitly discussed how the quantitative and qualitative approaches were linked and merged.

Research Question 2: What is the level of methodological rigor described by current HPE publications?

Rigor scores were analyzed for descriptive statistics and group differences by journal categories and research methodologies. The Kolmogorov–Smirnov and Shapiro–Wilk normality tests did not indicate a departure from a normal distribution (P > .05).

The average rigor score of the papers analyzed in the sample was 56% (SD = 17), ranging from 19% to 94% (Table 2). The rigor scores varied by journal categories and research methodologies. ANOVA showed that the group differences were statistically significant by journal categories (F [2, 87] = 5.82, P = .004, η² = .12) and research methodologies (F [2, 86] = 32.68, P < .001, η² = .43). Tukey HSD revealed group differences between general medical education journals (M = 62.92, SD = 18.94) and discipline-specific medical education journals (M = 48.78, SD = 14.04) (Table 3). For research methodologies, the post hoc analysis showed that qualitative papers had significantly higher rigor scores than quantitative and mixed methods papers. There was no statistically significant difference between quantitative and mixed methods papers.
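The reported effect sizes can be cross-checked against the F statistics. For a one-way ANOVA, eta squared equals SS_between / SS_total, which can be rewritten in terms of F and its degrees of freedom as (F × df_between) / (F × df_between + df_within). The sketch below applies that identity to the values reported above; the function name is illustrative, and the check assumes the reported η2 is the classical (not partial or adjusted) eta squared from a one-way design.

```python
def eta_squared_from_f(f_value, df_between, df_within):
    """Eta squared for a one-way ANOVA, recovered from F and its df.

    Uses eta^2 = SS_between / (SS_between + SS_within) together with
    F = (SS_between / df_between) / (SS_within / df_within).
    """
    if df_between <= 0 or df_within <= 0 or f_value < 0:
        raise ValueError("F must be non-negative and df must be positive")
    numerator = f_value * df_between
    return numerator / (numerator + df_within)


# Values reported in the text.
eta_journal = eta_squared_from_f(5.82, 2, 87)   # journal categories
eta_method = eta_squared_from_f(32.68, 2, 86)   # research methodologies
```

Rounded to 2 decimals, these evaluate to .12 and .43, consistent with the effect sizes reported for journal categories and research methodologies.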

Table 2:
Descriptive Statistics of Rigor Scores
Table 3:
Post Hoc


Although HPE has made progress in embracing diverse research paradigms, 3,4,43 this study found that there is still limited variation in the field. HPE research continues to have a modest diversity of epistemological approaches, population, sampling, and nature of data (e.g., short-term cross-sectional, perceptions based via survey or interviews), and limitations of the rigor in research methodologies reported in HPE papers—which may contribute to a limited set of knowledge in the field. Quantitative methodologies are more prevalent in HPE papers, including serving as the dominant approach to inquiry in mixed methods designs. Within qualitative methodologies, the most commonly reported are grounded theory and phenomenology. This finding is not surprising, as these 2 methods have extensive literature regarding the processes of their use. 33

The absence of reported epistemological frameworks among all the manuscripts is noteworthy; perhaps there is an inherent researcher assumption made about quantitative approaches that assume a positivist framework, but that is not a conclusion we can draw. Approximately half of the sampled qualitative papers reported a social constructivism framework without specific details of epistemological assumptions. It is striking that there were no subjectivist epistemologies and philosophical frameworks highlighting critical inquiry, including feminism, or postmodernism among any papers in the sample. Critical theory and related ideological views have a dialectic methodological nature. 1 These epistemological approaches focus on transforming misapprehension—historically shaped—into consciousness through developing and resolving contradictions. 1 Postmodernism is opposed to any epistemological stances based on grand narratives and pursues continuous reevaluation of practices and theories through deconstruction processes. 44 Diverse epistemological approaches, including critical theory and postmodernism, can provide different and unique understandings of HPE practices. However, the findings showed limited epistemological variation in our sample, even with calls to make room for other ways of knowing in HPE literature. 45,46

Reflecting on the absences

According to the findings, the HPE papers reviewed in this study were predominantly shaped by the perceptions of students from Western countries, gathered through a survey administered at one point in time, within one year, at a single institution. What does it mean that we rarely saw studies done in Africa or Asia, the use of ethnography, or longitudinal data? Paton and colleagues 47 recently published an insightful paper that investigated the absences in HPE research.

Despite our field’s compelling need to cite evidence, serious voids in the literature remain: areas in HPE where the literature fails to support educational practices. These voids are no mere “gap”; they are absences. They are important in their effects on how we construct the field of health professions education research: what we include or exclude, what we count or not, what we believe to be true or false, what we do or do not read, who speaks and who is silenced. 47(p6)

If HPE research should be varied with respect to epistemological and methodological approaches, these study findings suggest that much work remains to be done. Further, reflection is needed on the meaning of the absence of stated epistemologies and key methodological details such as participant population, sampling strategies, or rationale for design and analytic choices. Research methodology has symbolic power as it determines the legitimacy of who is included in studies and how, and implicitly gives value to ways of knowing. The absences in HPE papers loudly voice the missed opportunities to broaden exploration and innovation in HPE. We call for future studies investigating and addressing the absences to advance how we construct the collective knowledge in HPE.

Lack of reported rigor

Despite the prevalence of quantitative methodologies, there is significant room for improvement regarding methodological rigor in quantitative papers. Discipline-specific medical education journals dominantly published quantitative and mixed methods papers with lower rigor scores compared with other journals. Surprisingly, almost half of the papers with an experimental design were published without reporting the use of an active control group and/or pretest. Practices for rigor, such as controlling confounding variables, analyzing moderating or mediating variables, or providing measurement validity and reliability when expected, were rarely and variably reported in the papers. Given the historical dominance of the quantitative research paradigm and its emphasis on rigor standards, this is counterintuitive yet consistent with prior studies in HPE and other fields. 28,32,41 It may be that for qualitative research papers using a relatively newer group of methodologies, reviewers more often required specific terminology related to methodological rigor, compared with other papers based on a quantitative methodology, in which rigor was assumed. 47 The application of standards for methodological rigor should be emphasized and reinforced, even for those with arguably better understood quantitative methodological approaches.

Relationships between epistemology, research design, and methodologies

Scientific research involves an interplay among research questions, research approaches, and research answers. 23 As a research approach is also based on the dynamic relationship between epistemology, research design, and methodologies, we should not focus on one domain but rather on the interaction between them in building a collective set of knowledge in HPE. Prior literature 48 describes how common approaches to mixed methods studies focus too heavily on the methods and data, rather than the relationship between the research question and the logic of inquiry. This is consistent with our finding that the mixed methods papers reviewed in our study lacked a clear rationale. This reinforces the call for mixed methods researchers to provide a clear description and justification for design choices emphasizing the integration of data and findings from both components. 49 Additionally, it was concerning that the majority of mixed methods papers were described as using triangulation design, often based on data from a quantitative survey with several open-ended qualitative questions. This potentially oversimplifies the nuance that triangulation requires, such as distinctions between within-methods and between-methods triangulation. 50 It raises the question of whether mixed methods is a selected research approach of convenience, rather than a deliberate paradigm unto itself.

Guidelines for methodological rigor

While guidelines for rigor exist and should be followed, the reporting of rigor standards varies across HPE papers. This may be caused by different levels of attention to the reporting standards and understanding of rigor in HPE literature. While there has been an explicit effort to communicate reporting standards for qualitative 27,33 and quantitative research, 14,20,21 there are relatively few studies that establish specific guidelines for the rigor of mixed methods studies in HPE. 51 Of those noted in the literature, 13 many rely on quantitative factors (such as statistical power) or oversimplify mixing strategies 52 without clear reference to the philosophical reasoning for the choices of which methods are used. 48 Since no clear guidelines for rigor in mixed methods exist among HPE researchers, it is not surprising that mixed methods papers had the lowest rigor rating among the 3 research approaches. As Creswell and colleagues note, 38 mixed methods research is a research paradigm unto itself, not simply a mixing or joining of 2 different paradigms. The components of rigor, therefore, must also be unique to mixed methods papers, not simply a mixing or joining of quantitative and qualitative rigor. The logic and need for combining is one consideration, but not the only consideration. Further consideration should be given to how methodological rigor in mixed methods research can be determined beyond treating the qualitative and quantitative components as simply additive.

One quagmire in this exploration is how to appropriately situate Delphi studies in our analysis. Depending on the study, 53 Delphi studies may be seen as mixed methods or qualitative, 54 with little consensus on the definition and characteristics that distinguish this research approach. As such, there are few studies 55 examining criteria for rigor. 56 Given the wide array of methodological variants on the Delphi technique, it is difficult to situate this research approach in our current work. Future studies are needed to examine the breadth of research design and level of rigor specific to the Delphi technique.


Limitations

The current study has several limitations. First, the current study focused on the rigor of research methodologies explicitly stated in the papers, not the rigor of the entire studies reported in the research papers. For example, it was not within the scope of the study to evaluate whether the selected methodology (e.g., experimental design) was adequate to answer the study’s research questions. Instead, we focused on whether the chosen methodology was appropriately implemented and reported as guided in the literature (e.g., use of active control group, pretest, measurement validity and reliability, and control variables). Second, the rigor score questions were likely not exhaustive and may not cover all rigor standards for each design. We coded based on whether the authors explicitly reported that they used a particular method or not, but often disagreed on whether the remainder of the article described using the method in an appropriate and rigorous way (e.g., whether the authors stated a specific sampling design but later contradicted that design with additional language in the Method section). A future study is needed to investigate research rigor in a more holistic way, especially by adopting a qualitative or mixed methods research approach. In particular, an approach is needed that integrates the role of interpretation in the assessment of methodological rigor. There is also an opportunity to study trends in the breadth and rigor of research methodologies in HPE literature by adopting a longitudinal research design.


Conclusions

Research methodologies in the published papers in HPE journals demonstrated limited variation in epistemological approaches and research designs. The methodological rigor reported in the published papers was limited, which calls for improvement, especially in reporting quantitative and mixed methods research papers. The problems reported in this paper call for awareness and reflection from HPE journals, editors, and reviewers, as well as individual researchers, because published research papers are artifacts of co-constructed knowledge production processes. If specific epistemologies and methodologies get rejected at higher rates among HPE journals, and if reviewers are not equipped with guidelines for the rigor of each methodology, researchers will follow the limited institutional expectations, which would create a vicious cycle. The limited methodological breadth and rigor in HPE papers observed in the current study are significant and call for future research and reflections on how our habitual inquiry languages (research methodologies) shape the nature of knowledge in HPE.


References

1. Guba EG, Lincoln YS. Competing paradigms in qualitative research. Handbook of Qualitative Research. Thousand Oaks, CA: Sage Publications, Inc; 1994:105–117.
2. Ng SL, Baker L, Cristancho S, Kennedy TJ, Lingard L. Qualitative research in medical education: Methodologies and methods. In: Swanwick T, Forrest K, O’Brien BC, eds. Understanding Medical Education: Evidence, Theory, and Practice. 3rd ed. Hoboken, NJ: John Wiley & Sons, Inc; 2018:427–441.
3. Bunniss S, Kelly DR. Research paradigms in medical education research. Med Educ. 2010;44:358–366.
4. McMillan W. Theory in healthcare education research: The importance of worldview. In: Cleland J, Durning SJ, eds. Researching Medical Education. Hoboken, NJ: John Wiley & Sons, Ltd; 2015:15–24.
5. Thomas A, Lubarsky S, Varpio L, Durning SJ, Young ME. Scoping reviews in health professions education: Challenges, considerations and lessons learned about epistemology and methodology. Adv Health Sci Educ. 2020;25:989–1002.
6. Balmer DF, Varpio L, Bennett D, Teunissen PW. Longitudinal qualitative research in medical education: Time to conceptualise time. Med Educ. 2021;55:1253–1260.
7. Biesta GJJ, van Braak M. Beyond the medical model: Thinking differently about medical education and medical education research. Teach Learn Med. 2020;32:449–456.
8. Beckman TJ, Cook DA. Developing scholarly projects in education: A primer for medical teachers. Med Teach. 2007;29:210–218.
9. Turner TL, Balmer DF, Coverdale JH. Methodologies and study designs relevant to medical education research. Int Rev Psychiatry. 2013;25:301–310.
10. Ham TH. Current trends in medical education: A research approach. Acad Med. 1958;33:297–309.
11. Colliver JA, Verhulst SJ. Medical research and qualitative methods: A rational approach. Acad Med. 1996;71:211.
12. Reznik M, Ozuah PO. Trends in study designs in pediatric medical education research, 1992-2011. J Pediatr. 2013;162:222–223.
13. Baernstein A, Liss HK, Carney PA, Elmore JG. Trends in study methods used in undergraduate medical education research, 1969-2007. JAMA. 2007;298:1038–1045.
14. Shea JA, Arnold L, Mann KV. A RIME perspective on the quality and relevance of current and future medical education research. Acad Med. 2004;79:931–938.
15. Munoz-Najar Galvez S, Heiberger R, McFarland D. Paradigm wars revisited: A cartography of graduate research in the field of education (1980–2010). Am Educ Res J. 2020;57:612–652.
16. Johnson RB, Onwuegbuzie AJ. Mixed methods research: A research paradigm whose time has come. Educ Res. 2004;33:14–26.
17. Regehr G. Trends in medical education research. Acad Med. 2004;79:939–947.
18. Dauphinee WD, Wood-Dauphinee S. The need for evidence in medical education: The development of best evidence medical education as an opportunity to inform, guide, and sustain medical education research. Acad Med. 2004;79:925–930.
19. Pivovarova M, Powers JM, Fischman GE. Moving beyond the paradigm wars: Emergent approaches for education research. Rev Res Educ. 2020;44:vii–xvi.
20. Tavakol M, Sandars J. Quantitative and qualitative methods in medical education research: AMEE guide no 90: Part I. Med Teach. 2014;36:746–756.
21. Tavakol M, Sandars J. Quantitative and qualitative methods in medical education research: AMEE guide no 90: Part II. Med Teach. 2014;36:838–848.
22. Varpio L, Martimianakis MA, Mylopoulos M. Qualitative research methodologies: Embracing methodological borrowing, shifting and importing. In: Cleland J, Durning SJ, eds. Researching Medical Education. Hoboken, NJ: John Wiley & Sons, Ltd; 2015:245–256.
23. Park YS, Zaidi Z, O’Brien BC. RIME foreword: What constitutes science in educational research? Applying rigor in our research approaches. Acad Med. 2020;95:Si–Sv.
24. Colbert-Getz JM, Bierer SB, Berry A, et al. What is an innovation article? A systematic overview of innovation in health professions education journals. Acad Med. 2021;96:S39–S47.
25. Akkerman S, Admiraal W, Brekelmans M, Oost H. Auditing quality of research in social sciences. Qual Quant. 2008;42:257–274.
26. Colbert CY, French JC, Arroliga AC, Bierer SB. Best practice versus actual practice: An audit of survey pretesting practices reported in a sample of medical education journals. Med Educ Online. 2019;24:1673596.
27. O’Brien BC, Harris IB, Beckman TJ, Reed DA, Cook DA. Standards for reporting qualitative research: A synthesis of recommendations. Acad Med. 2014;89:1245–1251.
28. Nimon KF, Astakhova M. Improving the rigor of quantitative HRD research: Four recommendations in support of the general hierarchy of evidence. Human Res Dev Quart. 2015;26:231–247.
29. Maula M, Stam W. Enhancing rigor in quantitative entrepreneurship research. Entrep Theory Pract. 2020;44:1059–1090.
30. Marquart F. Methodological rigor in quantitative research. In: Matthes J, ed. The International Encyclopedia of Communication Research Methods. Hoboken, NJ: John Wiley & Sons, Inc; 2017:1–9.
31. Cook DA, Beckman TJ. Reflections on experimental research in medical education. Adv Health Sci Educ. 2010;15:455–464.
32. Kerry MJ, Huber M. Quantitative methods in interprofessional education research: Some critical reflections and ideas to improving rigor. J Interprof Care. 2018;32:254–256.
33. Hanson JL, Balmer DF, Giardino AP. Qualitative research methods for medical educators. Acad Pediatr. 2011;11:375–386.
34. Johnson JL, Adkins D, Chauvin S. A review of the quality indicators of rigor in qualitative research. Am J Pharm Educ. 2020;84:7120.
35. Morse JM. Critical analysis of strategies for determining rigor in qualitative inquiry. Qual Health Res. 2015;25:1212–1222.
36. Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): A 32-item checklist for interviews and focus groups. Int J Qual Health Care. 2007;19:349–357.
37. Harrison RL, Reilly TM, Creswell JW. Methodological rigor in mixed methods: An application in management studies. J Mixed Methods Res. 2020;14:473–495.
38. Creswell JW. Choosing a mixed methods design. In: Creswell JW, Creswell JD, eds. Research Design: Qualitative, Quantitative, and Mixed Methods Approaches. 4th ed. Thousand Oaks, CA: Sage; 2014:58–89.
39. American Educational Research Association. Standards for reporting on empirical social science research in AERA publications: American Educational Research Association. Educ Res. 2006;35:33–40.
40. Crotty M. The Foundations of Social Research. Thousand Oaks, CA: Sage; 2015.
41. Cook DA, Levinson AJ, Garside S. Method and reporting quality in health professions education research: A systematic review. Med Educ. 2011;45:227–238.
42. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37–46.
43. Irby DM. Shifting paradigms of research in medical education. Acad Med. 1990;65:622–623.
44. Han H, Kuchinke KP, Boulay DA. Postmodernism and HRD theory: Current status and prospects. Human Res Dev Rev. 2009;8:54–67.
45. Tsai J, Crawford-Roberts A. A call for Critical Race Theory in medical education. Acad Med. 2017;92:1072–1073.
46. Paradis E, Nimmon L, Wondimagegn D, Whitehead CR. Critical Theory: Broadening our thinking to explore the structural factors at play in health professions education. Acad Med. 2020;95:842–845.
47. Paton M, Kuper A, Paradis E, Feilchenfeld Z, Whitehead CR. Tackling the void: The importance of addressing absences in the field of health professions education research. Adv Health Sci Educ. 2021;26:5–18.
48. Shaw RL, Hiles DR, West K, Holland C, Gwyther H. From mixing methods to the logic(s) of inquiry: Taking a fresh look at developing mixed design studies. Health Psychol Behav Med. 2018;6:226–244.
49. O’Cathain A, Murphy E, Nicholl J. The quality of mixed methods studies in health services research. J Health Serv Res Policy. 2008;13:92–98.
50. Johnson RB, Onwuegbuzie AJ, Turner LA. Toward a definition of mixed methods research. J Mixed Methods Res. 2007;1:112–133.
51. Lavelle E, Vuk J, Barber C. Twelve tips for getting started using mixed methods in medical education research. Med Teach. 2013;35:272–276.
52. Bryman A. Integrating quantitative and qualitative research: How is it done? Qual Res. 2006;6:97–113.
53. Brady SR. Utilizing and adapting the Delphi method for use in qualitative research. Int J Qual Methods. 2015;14:1609406915621381–1609406915621386.
54. McPherson S, Reese C, Wendler MC. Methodology update: Delphi studies. Nurs Res. 2018;67:404–410.
55. Hasson F, Keeney S, McKenna H. Research guidelines for the Delphi survey technique. J Adv Nurs. 2000;32:1008–1015.
56. Niederberger M, Spranger J. Delphi technique in health sciences: A map. Front Public Health. 2020;8:457.

Copyright © 2022 by the Association of American Medical Colleges