The rater facet contributed virtually nothing to the variance (0%), implying that the raters agreed in their assessments of the various leaders and supporting high inter-rater reliability. The G-study for 4 raters and 4 subjects yielded an absolute generalizability (φ) coefficient of 0.80. The D-study, which shows the theoretical effect of changing the number of raters or scenarios on the generalizability coefficient, is shown in Figure 2.
Of note, the error variance contributed 31% of the overall variance in scores. This represents the third-order interaction (i.e., person, scenario, and rater together) as well as other unidentified factors, possibly including incomplete capture of scenarios on video or differences in camera angles, and bears further investigation.
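To make the relationship between the variance components and the reported coefficients concrete, the sketch below computes the absolute (φ) generalizability coefficient for a fully crossed person × scenario × rater design and projects it across D-study conditions. The variance-component values are hypothetical, chosen only so that the rater component is ~0 and the residual is large, loosely mirroring the pattern reported; `phi_coefficient` is an illustrative helper, not code from the study.

```python
def phi_coefficient(var_p, var_s, var_r, var_ps, var_pr, var_sr, var_psr_e,
                    n_s, n_r):
    """Absolute (phi) generalizability coefficient for a fully crossed
    person x scenario x rater design: person (true-score) variance over
    person variance plus absolute error, where each error component is
    divided by the number of conditions it is averaged over."""
    abs_error = (var_s / n_s + var_r / n_r
                 + var_ps / n_s + var_pr / n_r
                 + var_sr / (n_s * n_r)
                 + var_psr_e / (n_s * n_r))
    return var_p / (var_p + abs_error)

# Hypothetical variance components (NOT the study's estimates): rater
# variance set to zero and a large residual, as the text describes.
components = dict(var_p=40.0, var_s=10.0, var_r=0.0,
                  var_ps=15.0, var_pr=2.0, var_sr=2.0, var_psr_e=31.0)

# D-study projection: how phi changes with the number of scenarios/raters.
for n_s in (2, 4, 8):
    for n_r in (2, 4, 8):
        phi = phi_coefficient(**components, n_s=n_s, n_r=n_r)
        print(f"scenarios={n_s}, raters={n_r}: phi={phi:.2f}")
```

With these illustrative values, adding scenarios raises φ faster than adding raters, because the scenario and person × scenario components dominate the error while the rater components are negligible.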
In this prospective validation study, we present initial evidence of content and internal structure validity to support the use of the CALM instrument as a reliable tool for providing formative feedback to leaders of simulated pediatric resuscitations. The instrument was rigorously developed based on existing tools, professional experience, and expert consensus and was subjected to a modified Delphi process and pilot testing. The generalizability study yielded a generalizability coefficient of 0.80, which is above the acceptable range of 0.70 to 0.79 for formative assessments and is consistent with the performance assessment literature.43–45
The CALM instrument is a concise, easy-to-use instrument that requires minimal rater training and is designed to assess team leaders of simulated pediatric resuscitations for the provision of formative feedback. Several other tools for assessing resuscitation leaders exist, although none are as brief and as focused on the leader as ours. The Simulation Team Assessment Tool, while excellent for research, may be cumbersome in practice: it comprises 94 discrete tasks, evaluates multiple domains, and is not exclusive to the team leader.25 It was validated using raters who received 4 hours of training and practice along with very detailed definitions, and it was not intended for real-time evaluation. The Resuscitation Team Leader Evaluation was likewise designed to comprehensively assess resuscitation team leaders but may similarly be unwieldy for real-time use.27 Another instrument was developed to assess clinical performance during Pediatric Advanced Life Support simulated scenarios.21 That instrument is designed for specific scenarios and therefore may not be as generalizable as ours, which was applied across a variety of scenarios.
We validated our instrument in its intended context of use: in real time, “off the shelf,” and with minimal rater training. In its current iteration, the instrument is intended primarily as a means of providing formative feedback. Thus, although the long-term effects of the instrument's use on learner behavior were not assessed, the psychometrics presented previously are adequate to support this usage, implying appropriate consequence validity when the instrument is applied in formative situations. Applying it in higher-stakes settings, however, would require additional study focusing on the relationship between instrument scores and the long-term clinical performance of the residents assessed.
The major limitation of this study was the use of videos. Although videos were required for the feasibility of the study and were the closest to “real time” possible, some actions may have been hard to hear or see simply because of the way they were recorded. For example, the leader may have “announced role as leader” before the videotape began. This was likely a contributing factor to the large percentage of variance attributed to error in the generalizability study. In addition, the phrasing of the tool, although concise, may have allowed multiple interpretations of the same response options, also contributing to the error variance. For example, if a leader asked for input from other team members once during the simulation, a rater may have had difficulty determining whether the leader should receive credit for “always” “engaging team members in decision making” or whether that would be better classified as “mostly” or “sometimes.” It may be beneficial to add a few brief sentences to future iterations of the tool defining the anchored rating scale so that there is a more cohesive understanding of the meaning of each response. Another limitation is that the raters were all PEM fellowship directors and experts in leadership. This may affect the generalizability of our study, in that nonexperts in leadership may not rate leaders similarly when using the CALM instrument. The small sample size, with only 16 videos, is also a limitation. Although the generalizability study was fully crossed (4 leaders, 4 scenarios, and 4 raters), a larger sample might alter the generalizability and φ coefficients. This underscores the preliminary nature of this validation study. In addition, we did not gather learner feedback regarding the usefulness of the formative data provided by the instrument. This will be a key area of further research, because such data are needed to support the instrument's stated purpose.
These results provide initial evidence supporting the validity of the CALM instrument as a reliable assessment instrument that can guide the provision of formative feedback to leaders of simulated pediatric resuscitations. Although further validation data are needed, we recommend initial use of the instrument in this manner and offer it to the simulation community in the hope that it assists facilitators in shaping their learners' future crisis resource management practice.
1. Nadel FM, Lavelle JM, Fein JA, Giardino AP, Decker JM, Durbin DR. Assessing pediatric senior residents' training in resuscitation: fund of knowledge, technical skills, and perception of confidence. Pediatr Emerg Care.
2. Chen EH, Cho CS, Shofer FS, Mills AM, Baren JM. Resident exposure to critical patients in a pediatric emergency department. Pediatr Emerg Care.
3. Guilfoyle FJ, Milner R, Kissoon N. Resuscitation interventions in a tertiary level pediatric emergency department: implications for maintenance of skills. CJEM.
4. Chen EH, Shofer FS, Baren JM. Emergency medicine resident rotation in pediatric emergency medicine: what kind of experience are we providing? Acad Emerg Med.
5. Knudson JD, Neish SR, Cabrera AG, et al. Prevalence and outcomes of pediatric in-hospital cardiopulmonary resuscitation in the United States: an analysis of the Kids' Inpatient Database. Crit Care Med.
6. Topjian AA, Nadkarni VM, Berg RA. Cardiopulmonary resuscitation in children. Curr Opin Crit Care.
7. Bhanji F, Donoghue AJ, Wolff MS, et al. Part 14: Education: 2015 American Heart Association Guidelines Update for Cardiopulmonary Resuscitation and Emergency Cardiovascular Care. Circulation. 2015;132(18 Suppl 2):S561–S573.
8. Cooper S, Wakelam A. Leadership of resuscitation teams: “Lighthouse Leadership”. Resuscitation.
9. Gilfoyle E, Gottesman R, Razack S. Development of a leadership skills workshop in paediatric advanced resuscitation. Med Teach.
10. Hunziker S, Buhlmann C, Tschan F, et al. Brief leadership instructions improve cardiopulmonary resuscitation in a high-fidelity simulation: a randomized controlled trial. Crit Care Med.
11. Fernandez Castelao E, Boos M, Ringer C, Eich C, Russo SG. Effect of CRM team leader training on team performance and leadership behavior in simulated cardiac arrest scenarios: a prospective, randomized, controlled study. BMC Med Educ.
12. Marsch SC, Müller C, Marquardt K, Conrad G, Tschan F, Hunziker PR. Human factors affect the quality of cardiopulmonary resuscitation in simulated cardiac arrests. Resuscitation.
13. Yeung JH, Ong GJ, Davies RP, Gao F, Perkins GD. Factors affecting team leadership skills and their relationship with quality of cardiopulmonary resuscitation. Crit Care Med.
14. Weller J, Boyd M, Cumin D. Teams, tribes and patient safety: overcoming barriers to effective teamwork in healthcare. Postgrad Med J.
15. Nishisaki A, Nguyen J, Colborn S, et al. Evaluation of multidisciplinary simulation training on clinical performance and team behavior during tracheal intubation procedures in a pediatric intensive care unit. Pediatr Crit Care Med.
16. Issenberg SB, McGaghie WC, Petrusa ER, Lee Gordon D, Scalese RJ. Features and uses of high-fidelity medical simulations that lead to effective learning: a BEME systematic review. Med Teach.
17. Cheng A, Goldman RD, Aish MA, Kissoon N. A simulation-based acute care curriculum for pediatric emergency medicine fellowship training programs. Pediatr Emerg Care.
18. Doughty CB, Kessler DO, Zuckerbraun NS, et al. Simulation in pediatric emergency medicine fellowships. Pediatrics.
19. McGaghie WC, Issenberg SB, Petrusa ER, Scalese RJ. A critical review of simulation-based medical education research: 2003–2009. Med Educ.
20. Cooper S, Cant R, Porter J, et al. Rating medical emergency teamwork performance: development of the Team Emergency Assessment Measure (TEAM). Resuscitation.
21. Donoghue A, Nishisaki A, Sutton R, Hales R, Boulet J. Reliability and validity of a scoring instrument for clinical performance during Pediatric Advanced Life Support simulation.
22. Kim J, Neilipovitz D, Cardinal P, Chiu M. A comparison of global rating scale and checklist scores in the validation of an evaluation tool to assess performance in the resuscitation of critically ill patients during simulated emergencies (abbreviated as “CRM simulator study IB”). Simul Healthc.
23. Brett-Fleegler MB, Vinci RJ, Weiner DL, Harris SK, Shih MC, Kleinman ME. A simulator-based tool that assesses pediatric resident resuscitation.
24. Donoghue A, Ventre K, Boulet J, et al. Design, implementation, and psychometric analysis of a scoring instrument for simulated pediatric resuscitation: a report from the EXPRESS pediatric investigators. Simul Healthc.
25. Reid J, Stone K, Brown J, et al. The Simulation Team Assessment Tool (STAT): development, reliability and validation. Resuscitation.
26. Lockyer J, Singhal N, Fidler H, Weiner G, Aziz K, Curran V. The development and testing of a performance checklist to assess neonatal resuscitation megacode skill. Pediatrics.
27. Grant EC, Grant VJ, Bhanji F, Duff JP, Cheng A, Lockyer JM. The development and assessment of an evaluation tool for pediatric resident competence in leading simulated pediatric resuscitations. Resuscitation.
28. LeFlore JL, Anderson M, Michael JL, Engle WD, Anderson J. Comparison of self-directed learning versus instructor-modeled learning during a simulated clinical experience. Simul Healthc.
29. LeFlore JL, Anderson M. Alternative educational models for interdisciplinary student teams. Simul Healthc.
30. International Network for Simulation-based Pediatric Innovation, Research, & Education Website. Available at: http://inspiresim.com. Accessed February 29, 2016.
32. Calhoun AW, Boone M, Miller KH, Taulbee RL, Montgomery VL, Boland K. A multirater instrument for the assessment of simulated pediatric crises. J Grad Med Educ.
33. Zajano EA, Brown LL, Steele DW, Baird J, Overly FL, Duffy SJ. Development of a survey of teamwork and task load among medical providers: a measure of provider perceptions of teamwork when caring for critical pediatric patients. Pediatr Emerg Care.
34. Jelovsek JE, Kow N, Diwadkar GB. Tools for the direct observation and assessment of psychomotor skills in medical trainees: a systematic review. Med Educ.
35. Hunt EA, Walker AR, Shaffner DH, Miller MR, Pronovost PJ. Simulation of in-hospital pediatric medical emergencies and cardiopulmonary arrests: highlighting the importance of the first 5 minutes. Pediatrics.
36. Box Web site. Available at: http://box.com. Accessed November 2014.
38. Messick S. Validity. In: Linn RL, ed. Educational Measurement. 3rd ed. New York, NY: American Council on Education and Macmillan; 1989.
39. Cook DA, Brydges R, Ginsburg S, Hatala R. A contemporary approach to validity arguments: a practical guide to Kane's framework. Med Educ.
40. Brennan RL. Performance assessments from the perspective of generalizability theory. Appl Psychol Meas.
41. Brennan RL. Generalizability theory. Educ Meas Issues Pract.
42. Cronbach LJ. The Dependability of Behavioral Measurements: Theory of Generalizability for Scores and Profiles. New York, NY: Wiley; 1972.
43. Boulet JR. Summative assessment in medicine: the promise of simulation for high-stakes evaluation. Acad Emerg Med.
44. Cook DA, Hatala R. Validation of educational assessments: a primer for simulation and beyond. Adv Simul.
45. Downing SM. Reliability: on the reproducibility of assessment data. Med Educ.