Pushing Critical Thinking Skills With Multiple-Choice Questions: Does Bloom’s Taxonomy Work?

Zaidi, Nikki L. Bibler PhD; Grob, Karri L. EdS; Monrad, Seetha M. MD; Kurtz, Joshua B.; Tai, Andrew MD, PhD; Ahmed, Asra Z. MD; Gruppen, Larry D. PhD; Santen, Sally A. MD, PhD

Academic Medicine. 2018;93(6):856–859. DOI: 10.1097/ACM.0000000000002087


Medical schools have an obligation to train aspiring physicians to develop higher-order thinking skills that support clinical reasoning and deep learning. Clinical reasoning requires an ability to synthesize large amounts of information, apply critical thinking, and evaluate possible outcomes.1 Good assessment practices can help support medical students’ understanding of core concepts and foster their ability to integrate and synthesize information.2–4 Therefore, medical schools rely heavily on sound assessments to support learning.5–9

Multiple-choice questions (MCQs) remain the most frequently used method to measure student learning.5,7,10,11 MCQs assess a large number of concepts, cover a wide variety of domains, provide a wealth of item-level statistical data, and deliver efficient grading and feedback.10 It is generally believed that well-written MCQs call for examinees to engage higher levels of cognitive reasoning such as application or synthesis of knowledge,6,10,12–15 which can better assess their critical thinking skills.5,11,13,16,17 One method for ensuring that MCQs measure higher-order thinking, rather than an examinee’s ability to simply recall factual information, is to apply the cognitive domains of Bloom’s taxonomy when creating MCQs.10,12,18,19

Application of Bloom’s Taxonomy for Assessment

Bloom’s taxonomy involves six cognitive domains used by learners to acquire, retain, and use new information: knowledge, comprehension, application, analysis, synthesis, and evaluation.20 This cognitive hierarchy is based on the premise that to achieve higher-order levels of learning such as synthesis and evaluation, students must first apply the lower levels of learning such as recall and comprehension.21 The complex and nuanced world of clinical practice requires learners to develop higher-order thinking skills that include all levels of Bloom’s taxonomy.

While Bloom’s framework was originally developed to assist with curriculum design and development, it has also been used to inform and guide assessment.6,10,16 In fact, Bloom’s taxonomy has been used to identify MCQs that assess students’ critical thinking skills, with evidence suggesting that higher-order MCQs support a deeper conceptual understanding of scientific process skills.16 In line with this reasoning, the National Board of Medical Examiners’ item-writing guide recommends that basic science and clinical MCQs begin with a vignette that contextualizes a topic and compels examinees to determine which information is relevant. This, in turn, is meant to assess higher-level cognitive processes.13

At the University of Michigan Medical School, a team of assessment specialists (including N.L.B.Z., K.L.G., and S.A.S.) developed a process for categorizing MCQs used in preclerkship exams as “lower order” and “higher order,” according to a dichotomized Bloom’s taxonomy.22 We developed this process to encourage faculty to write questions beyond simple identification of factual information. The 1956 Bloom’s taxonomy was used to categorize the knowledge and comprehension levels as “lower order,” and the other levels—application, analysis, synthesis, and evaluation—as “higher order.”10,18 While this dichotomy truncates Bloom’s, these two dichotomized levels help differentiate between MCQs that rely on factual recall and those that probe higher-order thinking skills. Given the variability in how Bloom’s framework has been applied to MCQs,10 we developed a dichotomy that could be reliably applied among our non-content-expert assessment specialists and easily understood by our faculty, many of whom do not have backgrounds in educational theory.
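In practice, the dichotomy reduces to a fixed two-way lookup over the six 1956 levels. The sketch below is purely illustrative and is not the tooling our assessment specialists used; the function name and error handling are our own:

```python
# Dichotomized Bloom's taxonomy (1956 levels), as described above:
# knowledge and comprehension collapse to "lower order"; application,
# analysis, synthesis, and evaluation collapse to "higher order".
LOWER_ORDER = {"knowledge", "comprehension"}
HIGHER_ORDER = {"application", "analysis", "synthesis", "evaluation"}

def categorize(bloom_level: str) -> str:
    """Map a six-level Bloom's label onto the two-level scheme."""
    level = bloom_level.strip().lower()
    if level in LOWER_ORDER:
        return "lower order"
    if level in HIGHER_ORDER:
        return "higher order"
    raise ValueError(f"not a 1956 Bloom's level: {bloom_level!r}")
```

A lookup this simple is exactly what made the scheme reliable for non-content-expert raters: each MCQ requires only one judgment (which of the six levels it targets), after which the lower/higher assignment is mechanical.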

This process demonstrated good initial reliability and validity evidence when applied by assessment specialists22; in this Perspective, however, we discuss differences between the application of the dichotomized Bloom’s taxonomy to MCQs by question writers and examinees. Overall, we found that faculty content experts (generally the question writer) and student novice learners (the examinee) approach MCQs in different ways—largely because of differences in foundational knowledge, which influence the cognitive levels being used to address an MCQ.10 Faculty question writers may think they are assessing higher-order thinking skills with their MCQs; however, the examinees may only need lower-order thinking skills to successfully answer these questions. Likewise, ostensibly lower-order questions may actually tap into higher-order thinking skills. We describe some factors that may influence approaches to writing and answering MCQs. Overall, we believe four key factors affect an examinee’s interaction with test material and subsequently influence the cognitive levels needed to answer MCQs: pedagogy and approach to learning, confidence with test material, images and graphs, and question format.

Pedagogy and Approach to Learning

The manner in which students are taught, as well as their individual approaches to learning, may affect how they approach MCQs. This may also influence whether an MCQ is perceived to assess higher-order versus lower-order thinking skills. Some content is intentionally taught as lower order by focusing on knowledge and comprehension because the material represents core material that all students must know or foundational building blocks necessary to engage in higher-order thinking at a later stage. For example, a didactic session on the anatomy of the hand that teaches the names of the metacarpal bones may represent lower-order instruction. MCQs that test content taught verbatim during lecture are perceived as lower order by students because these items generally assess factual recall.

In contrast, other material may be taught in a manner designed to convey conceptual schemas and their application. Associated MCQs may require students to draw on those schemas using higher-order thinking at the four highest Bloom’s levels: application, analysis, synthesis, and evaluation. For example, a biochemistry MCQ might ask how many ATP molecules are generated per glucose molecule through glycolysis. Students who have been instructed to learn the individual reactions comprising glycolysis may view this as a higher-order question because they must integrate the individual reactions that make up the overall pathway. On the other hand, students who recall from rote memory that glycolysis yields a net gain of two ATP molecules per glucose molecule will view this as a lower-order question. As another example, some symptoms or disease processes are taught within a conceptual schema. Chronic diarrhea can be subdivided by pathophysiologic mechanism, and students may be taught to approach a patient with chronic diarrhea by first identifying the mechanism of diarrhea before making a specific disease diagnosis. When material is taught as part of a larger conceptual schema (e.g., different types of diarrhea) and students then place specific diseases within that schema, this fosters higher-order thinking beyond simple memorization. Accompanying MCQs will likely be regarded as higher order because faculty intend for students to use conceptual schemas to analyze distinctions rather than remember each detail.

Bloom’s taxonomy is also influenced by how students choose to learn. While material may be taught with higher-level frameworks or conceptual links between facts, students may choose instead to memorize facts rather than synthesize concepts to develop deeper understanding. Students may consider an MCQ as higher order if they apply a conceptual schema to analyze all possible options; conversely, the very same MCQ may be lower order if the examinee simply memorized a key fact within the MCQ that allows for rote recall. For example, some students may choose to memorize all of the diseases that cause diarrhea and the frameworks they are associated with, thus rendering MCQs that should be higher-order analysis to a recall level. On the other hand, some learners approach all testing as higher order, choosing to analyze and apply concepts between subjects when it is only intended for them to recall facts. Thus, the method of teaching as well as the students’ approach to learning and testing will affect the perception of MCQs as higher or lower order.

Confidence With Test Material

Students’ confidence with the tested material presented in the MCQ will also influence Bloom’s categorizations. When examinees are confident in their ability to correctly answer a question, they may be more likely to use a lower-order approach, even if the question writer intended to assess higher-order thinking skills. Conversely, a lack of confidence in their command of the content may require examinees to spend more time evaluating and synthesizing information to rank the possible answers in order of likelihood (e.g., process of elimination, which is akin to creating differential diagnoses), thereby employing a higher-order approach. Examinees’ educational and professional backgrounds also influence their confidence and approach to the tested material. Students with a deeper background in specific content areas are able to draw from a larger body of heuristic techniques and knowledge in the process of answering an MCQ, thus allowing one or more cognitive steps to be skipped. In contrast, students without similar backgrounds will likely consider the same MCQs to be testing higher-order thinking.

Images and Graphs

If an MCQ includes an image that has been previously shown in the lecture, it will test memorization rather than deductive reasoning skills. Therefore, it will likely be considered a lower-order question by examinees—even if the MCQ requires examinees to do more than simply recognize the image. For example, identifying a line on a pulmonary function graph previously shown in a lecture may be considered memorization and require only lower-order thinking (e.g., knowledge), even if the MCQ includes a clinical scenario to place the material within conceptual schema. In contrast, MCQs that use images similar to, but slightly different from, those shown in lecture are often considered higher order. These types of questions force examinees to analyze similarities and differences between the image being tested and the image presented in the lecture.

Question Format

The format and level of detail used in an MCQ also factor into Bloom’s categorizations. MCQs whose stems contain complex vignettes in which examinees must discern relevant from irrelevant information are generally considered higher order. Conversely, MCQs are lower order if the entire question stem is irrelevant to selecting the correct response (e.g., when the question lead-in and response options alone are sufficient to answer the question) because students need only comprehend the tested material, not analyze the information provided.

At times, when MCQs are intended to assess higher-order cognitive processes, examinees may rely on the core material that they memorized to answer the questions. For example, MCQs may include complex, clinically oriented vignettes and require students to synthesize multiple sources of information to arrive at the correct answer. This, in theory, should make these MCQs higher order. However, when the pivotal piece of information needed to answer these types of MCQs is specifically highlighted in a lecture, examinees can quickly recognize the correct answer through memorization alone. For instance, a clinical vignette might incorporate a single pathognomonic data point that, if recognized by the learner, indicates the correct answer and reduces the question to lower order (e.g., Kayser–Fleischer rings in Wilson disease). Therefore, when a vignette provides information that is too specific, making a particular diagnosis obvious based on simple pattern recognition, the MCQs are considered lower order by examinees. Omitting this data point forces the test taker to sort through the other information in the vignette to answer the question correctly.

Implications and Considerations

We believe that a variety of factors will affect an examinee’s interaction with test material and influence the cognitive processes involved in answering MCQs. The interaction among these factors will also influence examinees’ approach to testing. Therefore, identifying where questions fit along Bloom’s taxonomy cannot be absolute but, rather, is relative to student learning and perception.

The contextual nature of medical education requires that students engage in meaningful learning from a variety of sources.23 But, to accomplish this, learners must construct schemas that draw from prior knowledge to assist in the acquisition of new knowledge.23 As students build on previous knowledge, their developing schemas allow for more rapid resolution to “routine” tasks and influence their approaches to learning.23,24 Consequently, students’ individualized approaches to learning and subjective goals affect their approach to testing.25,26 The diverse experiences of medical students may therefore result in varied perceptions of the cognitive difficulty of and approach toward MCQs. This connection between experience and learning raises provocative questions regarding assessment of student knowledge and development. Exploration of other factors, such as cognitive ability and understanding of Bloom’s taxonomy, may provide additional insight. Therefore, further investigation is important because it will continue to guide assessment question-writing efforts and ensure that MCQs are written to align with their intended cognitive level.

Our original intent for implementing a Bloom’s taxonomical framework for categorizing MCQs was to encourage faculty to create more test items that probe students’ higher-order thinking skills. This goal was rooted in the assessment literature, which suggests that higher-order MCQs are important for stimulating the critical thinking skills that support clinical reasoning.5,6,12,17 Here, we propose that it may not be possible to objectively apply Bloom’s taxonomy for this purpose. This conclusion aligns with similar research that found faculty struggled to objectively and reliably apply Bloom’s framework27,28 and that Bloom’s categorizations may be applied differently based on background and overall expertise.10,18,28–30 While training, calibration exercises, and rubrics can support a consistent application of Bloom’s taxonomy,10,18 these measures alone may not ultimately improve the validity of the categorizations. Despite these limitations, perhaps training faculty in Bloom’s taxonomy will create a greater awareness of the different cognitive levels and foster greater intentionality in question writing. In doing so, however, it is important to recognize that the students may still approach questions from a different perspective.

In summary, as faculty work to encourage deep understanding and higher-order thinking, it is important to remember that medical students’ approach to learning and confidence with the material can influence the way that students approach and answer MCQs. Likewise, question writers’ choice of MCQ format and inclusion of graphs and images will also influence the examinees’ approach. Therefore, as faculty develop MCQs, they should consider the way students approach learning and test questions. Varying the types of questions used can help foster critical thinking that supports all levels of Bloom’s framework.

Acknowledgments: The authors wish to thank Paula Ross, PhD, for her many contributions to this article.


1. Eva KW. What every teacher needs to know about clinical reasoning. Med Educ. 2005;39:98–106.
2. Epstein RM. Assessment in medical education. N Engl J Med. 2007;356:387–396.
3. Cilliers FJ, Schuwirth LW, Herman N, Adendorff HJ, van der Vleuten CP. A model of the pre-assessment learning effects of summative assessment in medical education. Adv Health Sci Educ Theory Pract. 2012;17:39–53.
4. Buckwalter JA, Schumacher R, Albright JP, Cooper RR. Use of an educational taxonomy for evaluation of cognitive performance. J Med Educ. 1981;56:115–121.
5. Ali SH, Ruit KG. The impact of item flaws, testing at low cognitive level, and low distractor functioning on multiple-choice question quality. Perspect Med Educ. 2015;4:244–251.
6. Palmer EJ, Devitt PG. Assessment of higher order cognitive skills in undergraduate education: Modified essay or multiple choice questions? Research paper. BMC Med Educ. 2007;7:49.
7. Freiwald T, Salimi M, Khaljani E, Harendza S. Pattern recognition as a concept for multiple-choice questions in a national licensing exam. BMC Med Educ. 2014;14:232.
8. Kibble JD. Best practices in summative assessment. Adv Physiol Educ. 2017;41:110–119.
9. Wass V, McGibbon D, Van der Vleuten C. Composite undergraduate clinical examinations: How should the components be combined to maximize reliability? Med Educ. 2001;35:326–330.
10. Thompson AR, O’Loughlin VD. The Blooming Anatomy Tool (BAT): A discipline-specific rubric for utilizing Bloom’s taxonomy in the design and evaluation of assessments in the anatomical sciences. Anat Sci Educ. 2015;8:493–501.
11. Tarrant M, Ware J. Impact of item-writing flaws in multiple-choice questions on student achievement in high-stakes nursing assessments. Med Educ. 2008;42:198–206.
12. Kim MK, Patel RA, Uchizono JA, Beck L. Incorporation of Bloom’s taxonomy into multiple-choice examination questions for a pharmacotherapeutics course. Am J Pharm Educ. 2012;76:114.
13. Billings MS, DeRuchie K, Haist SA, et al. Constructing Written Test Questions for the Basic and Clinical Sciences. 4th ed. Philadelphia, PA: National Board of Medical Examiners; 2016.
14. Schultheis NM. Writing cognitive educational objectives and multiple-choice test questions. Am J Health Syst Pharm. 1998;55:2397–2401.
15. Burns ER. “Anatomizing” reversed: Use of examination questions that foster use of higher order learning skills by students. Anat Sci Educ. 2010;3:330–334.
16. Jensen JL, McDaniel MA, Woodard SM, Kummer TA. Teaching to the test…or testing to teach: Exams requiring higher order thinking skills encourage greater conceptual understanding. Educ Psychol Rev. 2014;26:307–329.
17. Choudhury B, Freemont A. Assessment of anatomical knowledge: Approaches taken by higher education institutions. Clin Anat. 2017;30:290–299.
18. Crowe A, Dirks C, Wenderoth MP. Biology in bloom: Implementing Bloom’s Taxonomy to enhance student learning in biology. CBE Life Sci Educ. 2008;7:368–381.
19. Thomson AR, Scopa Kelso R, Ward PJ, Wines K, Hanna JB. Assessment driven learning: The use of higher-order and discipline-integrated questions on gross anatomy practical examinations. Med Sci Educ. 2016;26:587–596.
20. Bloom B, Englehart M, Furst E, Hill W, Krathwohl D. Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook I: Cognitive Domain. New York, NY: Longmans, Green; 1956.
21. Krathwohl D. A revision of Bloom’s Taxonomy: An overview. Theory Pract. 2002;41:212–218.
22. Zaidi NB, Grob KL, Yang J, et al. Theory, process, and validation evidence for a staff-driven medical education exam quality improvement process. Med Sci Educ. 2016;23:331–336.
23. Ruiter DJ, van Kesteren MT, Fernandez G. How to achieve synergy between medical education and cognitive neuroscience? An exercise on prior knowledge in understanding. Adv Health Sci Educ Theory Pract. 2012;17:225–240.
24. Merriam S, Caffarella R, Baumgartner L. Learning in Adulthood: A Comprehensive Guide. 3rd ed. San Francisco, CA: Jossey-Bass; 2007.
25. Entwistle N, Entwistle A. Contrasting forms of understanding for degree examinations: The student experience and its implications. Higher Educ. 1991;22:205–227.
26. Cilliers FJ, Schuwirth LW, Adendorff HJ, Herman N, van der Vleuten CP. The mechanism of impact of summative assessment on medical students’ learning. Adv Health Sci Educ Theory Pract. 2010;15:695–715.
27. Kibble JD, Johnson T. Are faculty predictions or item taxonomies useful for estimating the outcome of multiple-choice examinations? Adv Physiol Educ. 2011;35:396–401.
28. Cunnington JP, Norman GR, Blake JM, Dauphinee WD, Blackmore DE. Applying learning taxonomies to test items: Is a fact an artifact? Acad Med. 1996;71(10 suppl):S31–S33.
29. Karpen SC, Welch AC. Assessing the inter-rater reliability and accuracy of pharmacy faculty’s Bloom’s Taxonomy classifications. Curr Pharm Teach Learn. 2016;8:885–888.
30. Zaidi NB, Hwang C, Scott S, Stallard S, Purkiss J, Hortsch M. Climbing Bloom’s Taxonomy pyramid: Lessons from a graduate histology course. Anat Sci Educ. 2017;10:456–464.
Copyright © 2017 by the Association of American Medical Colleges