The United States Medical Licensing Examination (USMLE) aims to protect the public by providing medical licensing authorities with information about a physician’s competence in the knowledge and skills important for the provision of safe and effective patient care. As the final examination in the USMLE sequence, Step 3 assesses the application of foundational and clinical medical knowledge required “for the unsupervised practice of medicine, with an emphasis on patient management in ambulatory settings.” 1 It includes both a multiple-choice question (MCQ) and computer-based case simulation (CCS) component. Overall, CCS is designed to evaluate an examinee’s approach to patient management, including elements of provisional diagnosis, treatment, and monitoring. 1 Through a combination of both content and format, CCS is designed to capture unique information about a physician that other aspects of the licensing examination sequence may not. 2 More specifically, CCS requires that examinees manage a series of virtual patients that present with diverse histories and symptoms in dynamic, interactive, simulated patient care settings. 3
As with any examination program, the ongoing collection and evaluation of validity evidence for USMLE score interpretations is a central component of ensuring that the USMLE fulfills its mission. One source of validity evidence focuses on an analysis of the relationships among examination outcomes and other external criterion measures thought to measure similar constructs 4 and speaks to the extrapolation element of a validity argument. 5 The extrapolation element of a validity argument focuses on the extent to which the inferences made from examination scores can be extended to an examinee’s ability outside of the test setting, an important aspect of any licensure examination aimed at protecting public safety.
For medical licensure examinations, disciplinary actions may be a particularly useful external criterion measure, as disciplinary actions can be received for behavior that threatens patient safety and can include sanctions up to and including license revocations. Responsible for disciplining physicians for professionally improper or clinically incompetent behavior, state medical boards receive complaints about physicians from patients, hospitals, health care professionals, and other concerned individuals. 6 Once a complaint is received, state medical boards conduct thorough investigations and determine whether a disciplinary action is warranted and, if it is, the severity of the action to be taken against the physician. 6 Past investigations of medical licensure examination scores and disciplinary actions generally reveal negative associations, providing some validity evidence for the use and interpretation of licensure examination scores. 7–9 In addition to medical licensure examination performance, other studies identify negative associations between board certification in a specialty area and disciplinary actions, providing some validity evidence for the use and interpretation of certification examination scores. 10–17
With respect to Step 3, previous validity research has highlighted associations between Step 3 scores and scores from other educational and professional assessments thought to cover similar content. For example, several specialty-specific studies note positive associations between Step 3 scores and performance on in-training examinations 18–20 and board certification examinations. 21–23 Less attention, however, has focused on the extent to which Step 3 scores may relate to future behavior in clinical practice. Moving beyond Step 3’s associations with educational and professional assessments, such work would help to expand the types of external criterion measures examined in Step 3 validity research and in doing so provide a more robust evaluation of various sources of validity evidence for Step 3 score uses and interpretations. This in turn would provide some assurance that Step 3 captures information relevant to the licensure process.
To begin to address this gap, this study examines the associations between Step 3 scores and subsequent receipt of disciplinary actions taken by state medical boards for problematic behavior in practice. It analyzes Step 3 total, CCS, and MCQ scores separately. This approach allows for an examination of the total test scores used to make licensure decisions of which CCS is a component, as well as an examination of the scores for the unique competencies that CCS is intended to measure—above and beyond what the MCQs measure—many of which are central to the successful independent practice of medicine.
Method
Data and sample
The dataset used in this study was obtained by merging physicians’ USMLE performance information from NBME and associated licensure and disciplinary action information from the Federation of State Medical Boards (FSMB). The licensure and disciplinary data for this study came from FSMB’s Physician Data Center, a national repository for state medical boards that provides information about all licensed physicians in the United States and its territorial jurisdictions. All state medical boards provide updated licensure data to the Physician Data Center, with almost all boards providing information monthly. Disciplinary data are provided continuously throughout the year by medical boards, typically when action information is finalized and made public.
The final sample included 275,392 board-certified physicians who graduated from MD-granting medical schools and who passed Step 3 between 2000 and 2017. Some physicians passed Step 3 the first time that they took the examination, while others passed it on a repeat attempt. The physicians in the sample represented 50 practice jurisdictions, including Washington, DC, reflecting a national sample. Pertinent data were unavailable for South Dakota, which has fewer than 5,600 physicians, representing 0.34% of the total licensed physician population in the United States; this state was removed from the sample. In addition, the physicians in the sample practiced in 17 major medical specialty areas representing a range of training and expertise. As part of a data sharing agreement with the American Board of Medical Specialties, the FSMB receives specialty certification data directly from the American Board of Medical Specialties. The specialty information used in this study stems from this arrangement and as such all physicians in the study sample were board certified.
This study was reviewed and approved by the American Institutes for Research Institutional Review Board.
Variables
In all analyses, the dependent variable was a binary measure indicating whether a physician had ever received a punitive disciplinary action from a state medical board (0 = no action, 1 = at least one action). A binary measure was used to understand potential differences in Step 3 scores for physicians with disciplinary actions compared with those without them. Here, behaviors that rise to the level of formal disciplinary actions are considered to be meaningfully different, in general, from behaviors that do not warrant actions.
The primary independent variables included Step 3 total score, Step 3 CCS score, and Step 3 MCQ score. Although separate scores are computed for the CCS and MCQ components of Step 3, it is the total score that is used to make pass/fail decisions and that is reported to examinees. Step 3 total scores are standardized scores that range from 1 to 300. CCS performance is evaluated based on case-specific scoring algorithms that represent codified expert physician-defined criteria. 1,24 CCS cases are scored between 1 and 9, with 9 being the optimum performance. For this study, Step 3 CCS scores were calculated as the average of the case-level 1-to-9 scores. A measure of examinee performance on just the MCQ portion of Step 3 was estimated using the Rasch model 25 to make scores comparable across years; MCQ scores mainly range from −3 to 3, where higher scores indicate better performance. Step 3 total, CCS, and MCQ scores were converted to standardized z scores for analysis purposes with a mean of 0 and a standard deviation (SD) of 1 within year. This was done to account for possible differences in scores across a 17-year time period and for ease of interpretation.
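The within-year standardization described above can be illustrated with a minimal sketch. The record layout, the function name `zscore_within_year`, the example scores, and the use of the population SD are all assumptions made for illustration; the source does not specify how the SD was computed or how the data were stored.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_within_year(records):
    """Standardize scores to mean 0 and SD 1 separately within each year.

    records: list of (year, score) tuples (hypothetical layout).
    Returns z scores in the same order as the input.
    """
    by_year = defaultdict(list)
    for year, score in records:
        by_year[year].append(score)

    # Mean and SD per year; using the population SD is an assumption here.
    stats = {year: (mean(vals), pstdev(vals)) for year, vals in by_year.items()}

    z_scores = []
    for year, score in records:
        m, sd = stats[year]
        # Guard against a year in which every score is identical.
        z_scores.append((score - m) / sd if sd > 0 else 0.0)
    return z_scores


# Made-up scores for two cohorts, for illustration only.
records = [
    (2000, 200), (2000, 210), (2000, 220),
    (2017, 215), (2017, 225), (2017, 235),
]
z = zscore_within_year(records)
```

Because each year is centered and scaled on its own distribution, a physician at the cohort mean receives a z score of 0 regardless of any drift in the raw score scale across years.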
To account for possible differences in prior ability and achievement, Step 1 and Step 2 Clinical Knowledge (CK) scores were included in the analyses as covariates. From a validity perspective, this allowed for comparisons of combined and relative effects of each of the steps in the USMLE sequence. Other independent variables treated as covariates included physician gender (0 = female, 1 = male), medical school location (United States = 0, outside of the United States = 1), Step 3 attempt (0 = first-time taker, 1 = repeater), and the number of years a physician had been practicing medicine. Time in practice was used as a proxy for exposure to receive a disciplinary action; in this way, time was controlled for in the analyses. Several studies have shown that male physicians 8–11,13–17,26,27 and physicians who have practiced in medicine longer 8,10,26,27 tend to have a higher risk of receiving a disciplinary action from a state medical board. With respect to medical school location, inconsistent findings have been reported in the literature. For example, several studies have found that graduates of medical schools outside of the United States are more likely to be disciplined by state medical boards. 11,13,27 Yet, other research has found that graduates of medical schools outside of the United States are less likely to be disciplined 15 or has demonstrated no significant difference between graduates of U.S. medical schools and graduates of medical schools outside of the United States in terms of the likelihood of receiving a disciplinary action. 10
Statistical analyses
Descriptive statistics were computed based on the full sample. In addition, descriptive statistics were computed for USMLE scores and years in practice by physician subgroups (e.g., gender, medical school location). These results provide basic information for understanding the observed patterns as well as for interpreting the results of the inferential statistics.
Physicians are jointly nested in both jurisdiction and specialty in that physicians in the same jurisdiction do not all practice in the same clinical area and physicians in the same specialty do not all practice in the same place. Moreover, previous research shows significant variations in disciplinary actions by medical specialty area 10,11,26,27 and practice jurisdiction. 28–30 Given the data structure and prior research findings, cross-classified multilevel models 31,32 were used to estimate the primary relationships of interest—i.e., between Step 3 scores and receipt of a disciplinary action—while also accounting for other physician-level factors and controlling for both jurisdiction and specialty. To facilitate interpretation of results, all Step 1 and Step 2 CK scores were converted to z scores before entry into the models.
Cross-classified multilevel logistic regression models were used to examine the effects of Step 3 total, CCS, and MCQ scores on the likelihood of receiving a disciplinary action. One set of models included Step 3 total score as an independent variable and did not include Step 3 CCS or MCQ scores. This was done because CCS and MCQ performance is already considered in calculations of the Step 3 total score and thus a model including these 3 performance indicators as independent variables would be conceptually redundant as well as pose possible issues of multicollinearity. A second set of models included Step 3 CCS and MCQ scores as independent variables and did not include Step 3 total score. Including Step 3 CCS and MCQ scores in the same model allows for the estimation of the unique and relative effects of the CCS and MCQ portions of Step 3. All analyses were conducted using R 3.5.1. 33
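One common way to write a cross-classified logistic model of this kind is sketched below. The notation is ours, not the authors', and it assumes random intercepts only for jurisdiction and specialty; the source does not state the exact random-effects specification. Here $y_{i(jk)}$ indicates whether physician $i$, practicing in jurisdiction $j$ and specialty $k$, received a disciplinary action, and $\mathbf{x}_i$ collects the covariates (Step 1 and Step 2 CK z scores, gender, medical school location, Step 3 attempt, and years in practice).

```latex
% First set of models: Step 3 total score as the predictor of interest
\operatorname{logit}\,\Pr\!\left(y_{i(jk)} = 1\right)
  = \beta_0 + \beta_1\,\mathrm{Step3Total}_i
  + \mathbf{x}_i^{\top}\boldsymbol{\gamma} + u_j + v_k,
\qquad u_j \sim N(0, \sigma_u^2), \quad v_k \sim N(0, \sigma_v^2)

% Second set of models: CCS and MCQ scores entered together
\operatorname{logit}\,\Pr\!\left(y_{i(jk)} = 1\right)
  = \beta_0 + \beta_1\,\mathrm{CCS}_i + \beta_2\,\mathrm{MCQ}_i
  + \mathbf{x}_i^{\top}\boldsymbol{\gamma} + u_j + v_k
```

Under this formulation, $u_j$ and $v_k$ are crossed rather than nested, which reflects the fact that physicians in one jurisdiction are spread across specialties and vice versa. The odds ratio for a 1-SD increase in a standardized score is $\exp(\beta)$ for the corresponding coefficient.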
Results
Table 1 summarizes the descriptive statistics for the full sample (n = 275,392). Fifty-two percent (n = 143,672) of the sample was male, 32% (n = 86,984) attended a medical school outside of the United States, and 9% (n = 23,485) passed Step 3 on a repeat attempt during the study period. On average, the physicians in the sample had been practicing medicine for 8 years (SD = 4). They had an average Step 1 score of 216 (SD = 24) and an average Step 2 CK score of 220 (SD = 27). Their average Step 3 total score was 212 (SD = 20), their average Step 3 CCS score was 5.76 (SD = 0.76), and their average Step 3 MCQ score was 1.21 (SD = 0.34). One percent (n = 2,391) of the sample had received at least one disciplinary action from a state medical board for improper or incompetent patient care.
Table 1: Descriptive Statistics for the Study Sample (n = 275,392 Board-Certified Physicians Who Graduated From MD-Granting Medical Schools and Passed USMLE Step 3 Between 2000 and 2017)
Table 2 provides the results from the final cross-classified multilevel logistic regression model investigating the association between Step 3 total score and the likelihood of receiving a disciplinary action. Results showed that physicians with higher Step 3 total scores tended to have lower chances of receiving a disciplinary action. Specifically, the odds ratio (OR) was 0.77 (P < .001), which indicates that, on average, there was a 23% decrease in the odds that a physician would receive a disciplinary action for each 1-SD increase (the equivalent of about 20 points) in Step 3 total score. The OR for Step 2 CK score was 0.91 (P = .004), suggesting a 9% decrease in the odds of receiving a disciplinary action for every 1-SD increase (the equivalent of about 27 points) in Step 2 CK score. With respect to the other covariates included in the model, Step 3 repeaters (OR = 1.23, P = .003), physicians who have been practicing for more years (OR = 1.25, P < .001), and male physicians (OR = 2.07, P < .001) were more likely to receive a disciplinary action. Step 1 score and medical school location were statistically unrelated to the likelihood of receiving a disciplinary action.
Table 2: Results of Final Cross-Classified Multilevel Logistic Regression Model Predicting Disciplinary Action Using USMLE Step 3 Total Score (n = 275,392 Board-Certified Physicians Who Graduated From MD-Granting Medical Schools and Passed USMLE Step 3 Between 2000 and 2017)
Table 3 presents the results from the final cross-classified multilevel logistic regression model examining the associations between Step 3 CCS and MCQ scores and the likelihood of receiving a disciplinary action. In general, the results mirrored those found for Step 3 total score in that physicians with higher Step 3 CCS and MCQ scores were less likely to receive a disciplinary action. More specifically, the odds that a physician would receive an action, on average, decreased by 11% for each 1-SD increase (the equivalent of about 0.76 points) in Step 3 CCS score (OR = 0.89, P < .001) and by 17% for each 1-SD increase (the equivalent of about 0.34 logits) in Step 3 MCQ score (OR = 0.83, P < .001). The OR for Step 2 CK score was 0.91 (P = .003), suggesting a 9% decrease in the odds of receiving a disciplinary action for every 1-SD increase (the equivalent of about 27 points) in Step 2 CK score. Again, Step 3 repeaters (OR = 1.42, P < .001), physicians who have been practicing for more years (OR = 1.24, P < .001), and male physicians (OR = 2.07, P < .001) were more likely to receive a disciplinary action. Consistent with the Step 3 total score model, Step 1 score and medical school location were statistically unrelated to the likelihood of receiving a disciplinary action.
Table 3: Results of Final Cross-Classified Multilevel Logistic Regression Model Predicting Disciplinary Action Using USMLE Step 3 CCS and MCQ Scores (n = 275,392 Board-Certified Physicians Who Graduated From MD-Granting Medical Schools and Passed USMLE Step 3 Between 2000 and 2017)
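The percentage interpretations reported in the Results follow directly from the odds ratios: the percent change in odds for a 1-SD increase is (OR − 1) × 100. The short sketch below reproduces that arithmetic for the reported values. The function name `percent_change_in_odds` is hypothetical, and the extension to increases of more than 1 SD assumes the effect is linear on the log-odds scale, as in a standard logistic model.

```python
def percent_change_in_odds(odds_ratio, sd_units=1.0):
    """Percent change in the odds for an increase of `sd_units` SDs.

    Assumes linearity on the log-odds scale, so the odds ratio for
    k SDs is the 1-SD odds ratio raised to the power k.
    """
    return (odds_ratio ** sd_units - 1.0) * 100.0


# Odds ratios as reported in the Results (per 1-SD increase in score).
reported = {
    "Step 3 total": 0.77,
    "Step 3 CCS": 0.89,
    "Step 3 MCQ": 0.83,
    "Step 2 CK": 0.91,
}

for name, odds_ratio in reported.items():
    change = percent_change_in_odds(odds_ratio)
    print(f"{name}: {change:+.0f}% change in odds per 1-SD increase")
```

Note that these are changes in odds, not in probability. With a base rate near 1%, as in this sample, the two are numerically close, but they diverge as the outcome becomes more common.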
Discussion
This study examines the associations between physicians’ performance on Step 3—the last step in the USMLE licensing examination sequence that physicians take before entering independent practice—and the likelihood of being disciplined by a state medical board for improper or incompetent behavior in practice. To our knowledge, this is the first study to focus on validity evidence for Step 3 scores as they relate to future practice behaviors. Using a large national sample of physicians representing a range of practice jurisdictions and specialty areas, our results show an average expected 23%, 11%, and 17% decrease in the chances of receiving a disciplinary action from a state medical board for each 1-SD increase in Step 3 total, Step 3 CCS, and Step 3 MCQ score, respectively.
Because Step 3 is intended to assess application of foundational and clinical medical knowledge essential for the unsupervised practice of medicine, disciplinary actions related to both incompetent and unprofessional behavior may be related to Step 3 performance. Moreover, a portion of Step 3 content covers the following competencies—communication and interpersonal skills, professionalism, legal and ethical issues, systems-based practice, and patient safety—suggesting that the content measured by Step 3 may overlap with the reasons for which physicians may receive disciplinary actions.
In general, the results of this study provide some validity evidence in support of Step 3 scores for use in determining medical licensure in the United States, given that licensure is intended to signal readiness for the safe and effective unsupervised practice of medicine. Our findings suggest that how well a physician does on Step 3 may, on average, provide useful information about the safety and effectiveness of their subsequent clinical practice as measured by disciplinary actions.
As noted, the CCS portion of Step 3 represents a unique element within the USMLE sequence that captures examinee behaviors related to patient management in a simulated environment. Overall, this study suggests that CCS scores may contribute important discrete information, above and beyond the information that can be gleaned from Step 3 MCQ scores, for understanding practice patterns as characterized by disciplinary actions received from a state medical board for unprofessional or incompetent behavior. It is possible that both the content and format of CCS provide examinees an opportunity to demonstrate application of knowledge and skills relevant to safe and effective practice in ways that the MCQ portion of Step 3 does not. While CCS may contribute unique information, our results indicate that the negative associations between Step 3 total and MCQ scores and receipt of disciplinary actions may be stronger than the comparable negative associations for CCS scores. Past research documents potential threats to the reliability of CCS scores, 24,34–36 which provides one possible explanation for the smaller association for CCS scores. Arguably, the associations might be greater if CCS scores were more reliable.
It is important to note that the negative associations between Step 3 scores and the likelihood of disciplinary action were found after accounting for physicians’ performance on Step 1 and Step 2 CK. Consistent with previous research, 8 the negative effect of Step 1 score became statistically indistinguishable from zero after Step 2 CK and Step 3 scores were included in the analysis. This may reflect Step 1’s focus on the scientific foundations of medicine, which may be less influential once competence in certain knowledge and skill domains is gleaned from Step 2 CK and Step 3 scores. In addition, the temporal order in which the Step examinations tend to be taken may play a role: Step 1 is typically taken first, so its completion tends to be the most distant from entry into independent practice.
Unlike Step 1, the effect for Step 2 CK scores continued to be statistically distinguishable from zero when Step 3 scores were included in the analysis. This suggests that both Step 2 CK and Step 3 provide unique information about the likelihood that a physician will receive a disciplinary action for problematic behavior in practice. Furthermore, our results indicate that Step 3 total scores (which, as noted, encompass CCS performance) are more strongly related to the chance of disciplinary action than Step 2 CK scores. This may be due to examination content and timing. Step 2 CK focuses on the knowledge and skills needed for supervised practice, whereas Step 3 focuses on the knowledge and skills needed for unsupervised practice. Therefore, the content included on the Step 3 examination may be more representative of what is required of physicians once they enter independent practice. Additionally, Step 3 is the last licensing examination physicians take before entering independent practice; thus, its stronger effect may be due to its proximity to when disciplinary actions are most likely to be taken.
In terms of the other characteristics examined as covariates in this study, findings support previous research showing that male physicians 8–11,13–17,26,27 and physicians who have been in practice longer 8,10,26,27 are more likely to receive disciplinary actions. In addition, the effect of medical school location was statistically indistinguishable from zero, which one other study has also found. 10 However, the broader findings in the literature regarding differences in disciplinary actions for graduates of U.S. medical schools compared with graduates of medical schools outside of the United States remain mixed. 11,13,15,27
This study does have limitations. First, determinations about whether individuals who pass Step 3 provide better patient care than individuals who fail the examination cannot be drawn, since failure prevents the possibility of legal practice. Focusing on individuals who have obtained a medical license, however, allows for an examination of the underlying score scale used to determine pass/fail standards and as such provides an approach for evaluating the relevance of competence in the knowledge and skills measured by Step 3 for performance in unsupervised medical practice.
Second, given that board-certified physicians tend to have higher USMLE scores, 19–23 it may be that our inclusion of only board-certified physicians underestimated the extent of the associations found among Step 3 performance measures and subsequent receipt of disciplinary actions. To examine this possibility, future research should study relationships similar to those examined in the current study for both board-certified and non-board-certified physicians when specialty area data are available for the noncertified group.
Finally, this study treated all disciplinary actions the same and considered neither the types of offenses for which physicians were sanctioned nor the severity of the punishments that they received. It is possible that for offenses linked more closely to the content assessed by Step 3 (e.g., mismanagement of a patient resulting in undue patient harm), the effect of Step 3 scores on the likelihood that a physician will receive a disciplinary action may be greater. As an initial investigation, this study demonstrates that overall, physicians with higher Step 3 scores are less likely to be disciplined for problematic behavior in practice. Future studies should explore the associations between Step 3 scores and the types of behaviors for which physicians receive disciplinary actions and the severity of the sanctions that they receive.
In summary, this study analyzed the associations between 2 medical regulatory practices in the United States—licensure and discipline—both of which have implications for patient safety. By examining the connections between licensure and discipline, it provides a lens through which to view the overarching role of medical regulation in ensuring that physicians are adequately prepared and able to provide safe and effective patient care across the continuum of their learning and career paths. The results of this study suggest that physicians who perform better on Step 3 (i.e., who have higher total, CCS, and MCQ scores) are less likely to receive a disciplinary action from a state medical board for problematic behavior in practice. The results also imply that CCS scores provide unique information above and beyond the information gleaned from the MCQ component of Step 3. As part of ongoing research efforts, this study provides some validity evidence for the use of USMLE Step 3 scores for making medical licensure decisions in the United States.
References
1. United States Medical Licensing Examination. Step 3 Overview. Accessed April 12, 2022. https://www.usmle.org/step-3.
2. Clauser BE, Margolis MJ, Swanson DB. An examination of the contribution of computer-based case simulations to the USMLE Step 3 examination. Acad Med. 2002;77:S80–S82.
3. Margolis MJ, Clauser BE. A regression-based procedure for automated scoring of a complex medical performance assessment. In: Williamson DM, Mislevy RJ, Bejar II, eds. Automated Scoring of Complex Tasks in Computer-Based Testing. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.; 2006:123–167.
4. American Educational Research Association; American Psychological Association; National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association; 2014.
5. Kane MT. Validating the interpretations and uses of test scores. J Educ Meas. 2013;50:1–73. doi:10.1111/jedm.12000.
6. Federation of State Medical Boards. U.S. Medical Regulatory Trends and Actions 2018. Published 2018. Accessed April 12, 2022. https://www.fsmb.org/siteassets/advocacy/publications/us-medical-regulatory-trends-actions.pdf.
7. Papadakis MA, Teherani A, Banach MA, et al. Disciplinary action by medical boards and prior behavior in medical school. N Engl J Med. 2005;353:2673–2682. doi:10.1056/NEJMsa052596.
8. Cuddy MM, Young A, Gelman A, et al. Exploring the relationships between USMLE performance and disciplinary action in practice: A validity study of score inferences from a licensure examination. Acad Med. 2017;92:1780–1785. doi:10.1097/ACM.0000000000001747.
9. Roberts WL, Gross GA, Gimpel JR, et al. An investigation of the relationship between COMLEX-USA licensure examination performance and state licensing board disciplinary actions. Acad Med. 2020;95:925–930. doi:10.1097/ACM.0000000000003046.
10. Morrison J, Wickersham P. Physicians disciplined by a state medical board. JAMA. 1998;279:1889–1893. doi:10.1001/jama.279.23.1889.
11. Kohatsu ND, Gould D, Ross LK, Fox PJ. Characteristics associated with physician discipline: A case-control study. Arch Intern Med. 2004;164:653–658. doi:10.1001/archinte.164.6.653.
12. Kocher MS, Dichtel L, Kasser JR, Gebhardt MC, Katz JN. Orthopedic board certification and physician performance: An analysis of medical malpractice, hospital disciplinary action, and state medical board disciplinary action rates. Am J Orthop. 2008;37:73–75.
13. Papadakis MA, Arnold GK, Blank LL, Holmboe ES, Lipner RS. Performance during internal medicine residency training and subsequent disciplinary action by state licensing boards. Ann Intern Med. 2008;148:869–876. doi:10.7326/0003-4819-148-11-200806030-00009.
14. Lipner RS, Young A, Chaudhry HJ, Duhigg LM, Papadakis MA. Specialty certification status, performance ratings, and disciplinary actions of internal medicine residents. Acad Med. 2016;91:376–381. doi:10.1097/ACM.0000000000001055.
15. Peabody MR, Young A, Peterson LE, et al. The relationship between board certification and disciplinary actions against board-eligible family physicians. Acad Med. 2019;94:847–852. doi:10.1097/ACM.0000000000002650.
16. Kopp JP, Ibanez B, Jones AT, et al. Association between American Board of Surgery initial certification and risk of receiving severe disciplinary actions against medical licenses. JAMA Surg. 2020;155:e200093. doi:10.1001/jamasurg.2020.0093.
17. Nelson LS, Duhigg LM, Arnold GK, Lipner RS, Harvey AL, Reisdorff EJ. The association between maintaining American Board of Emergency Medicine certification and state medical board disciplinary actions. J Emerg Med. 2019;57:772–779. doi:10.1016/j.jemermed.2019.08.028.
18. Perez JA Jr, Greer S. Correlation of United States Medical Licensing Examination and Internal Medicine In-Training Examination performance. Adv Health Sci Educ. 2009;14:753–758.
19. McDonald FS, Zeger SL, Kolars JC. Associations between United States Medical Licensing Examination (USMLE) and Internal Medicine In-Training Examination (IM-ITE) scores. J Gen Intern Med. 2008;23:1016–1019.
20. Peterson LE, Boulet JR, Clauser BE. Associations between medical education assessments and American Board of Family Medicine certification examination score and failure to obtain certification. Acad Med. 2020;95:1396–1403. doi:10.1097/ACM.0000000000003344.
21. Fish DE, Radfar-Baublitz L, Choi H, Felsenthal G. Correlation of standardized testing results with success on the 2001 American Board of Physical Medicine and Rehabilitation Part 1 Board Certification Examination. Am J Phys Med Rehab. 2003;82:686–691. doi:10.1097/01.PHM.0000083688.23340.EA.
22. Dillon GF, Swanson DB, McClintock JC, Gravlee GP. The relationship between the American Board of Anesthesiology Part 1 Certification Examination and the United States Medical Licensing Examination. J Grad Med Educ. 2013;5:276–283. doi:10.4300/JGME-D-12-00205.1.
23. Klein GR, Austin MS, Randolph S, Sharkey PF, Hilibrand AS. Passing the boards: Can USMLE and Orthopaedic In-Training Examination scores predict passage of the ABOS Part-I examination? J Bone and Joint Surg. 2004;86:1092–1095.
24. Harik P, Baldwin P, Clauser BE. Comparison of automated scoring methods for a computerized performance assessment of clinical judgement. Appl Psychol Meas. 2013;37:587–597. doi:10.1177/0146621613493829.
25. Rasch G. Probabilistic Models for Some Intelligence and Attainment Tests. Copenhagen, Denmark: Danish Institute for Educational Research; 1960.
26. Clay SW, Conatser RR. Characteristics of physicians disciplined by the state medical board of Ohio. J Am Osteopath Assoc. 2003;103:81–88. doi:10.7556/jaoa.2003.103.2.81.
27. Khaliq AA, Dimassi H, Huang C, Narine L, Smego RA. Disciplinary action against physicians: Who is likely to get disciplined? Am J Med. 2005;118:773–777. doi:10.1016/j.amjmed.2005.01.051.
28. Harris JA, Byhoff E. Variations by state in physician disciplinary actions by US medical licensure boards. BMJ Qual Saf. 2017;26:200–208.
29. Law MT, Hansen Z. Medical licensing board characteristics and physician discipline: An empirical analysis. J Health Polit Policy Law. 2010;356:63–93. doi:10.1215/03616878-2009-041.
30. Lillvis DF, McGrath RJ. Directing discipline: State medical board responsiveness to state legislatures. J Health Polit Policy Law. 2017;42:123–165.
31. Gelman AB, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. New York, NY: Cambridge University Press; 2007.
32. Raudenbush SW, Bryk AS. Hierarchical Linear Models: Applications and Data Analysis Methods. 2nd ed. Thousand Oaks, CA: Sage Publications; 2002.
33. R 3.5.1. R Foundation for Statistical Computing, Vienna, Austria. Accessed April 12, 2022. https://www.R-project.org.
34. Clauser BE, Swanson DB, Clyman SG. The generalizability of scores from a performance assessment of physicians’ patient management skills. Acad Med. 1996;71(10 suppl):S109–S111.
35. Clauser BE, Margolis MJ, Clyman SG, Ross LP. The generalizability of scores for a performance assessment scored with a computer-automated scoring system. J Educ Meas. 2000;37:245–261. doi:10.1111/j.1745-3984.2000.tb01085.x.
36. Margolis MJ, Clauser BE, Harik P. Scoring the computer-based case simulation component of USMLE Step 3: A comparison of preoperational and operational data. Acad Med. 2004;79(10 suppl):S62–S64.