An effective response to public health emergencies and their consequences is crucial to protecting the health of our communities and ensuring stability in access to care. In the last decade, great strides have been made toward the goals of an efficient and sustainable public health emergency preparedness and response system. These strides have been made through ongoing reassessments of the potential hazards and risks that health and public health systems may face,1 review of available response resources and refinement of strategies for resource use and allocation,2 enhancement of existing emergency operations plans and systems,3 integration of preparedness training programs into routine practice,4 sharing and incorporation of lessons learned from after-action reports (AARs),5 and development of community education programs.6 As public health and health system preparedness for emergencies has improved, drills and exercises have been frequently recommended both as a means to measure the magnitude of improvements to emergency response plans, procedures, and protocols and as a means to identify remaining gaps and challenges. For instance, the Centers for Disease Control and Prevention (CDC) requires Public Health Emergency Preparedness awardees to conduct at least one preparedness exercise per budget period.7 Also, The Joint Commission requires that all hospitals participate in at least 2 emergency preparedness exercises per year as part of their accreditation requirements.8 Critical elements in the design, conduct, and structure of exercises will impact their utility in testing and improving emergency preparedness.
It is thought that well-designed exercises can serve as a useful approximation of a real emergency or incident.9–12 Multiple studies have described how participation in these exercises is effective in familiarizing personnel with emergency plans, allowing different agencies to practice working together, and revealing flaws in established emergency plans and/or actions of responders.9,12–15 However, few data exist that describe whether participation in emergency response exercises is itself predictive of performance improvement over time.
Emergency response exercises are not intended to be stand-alone events, disconnected from other preparedness and response efforts. Instead, they are intended to be part of a larger continuous cycle of planning, training, exercising, analyzing shortcomings, identifying areas requiring improvement, and subsequently taking corrective actions.16 Given the costs of exercising, integrating exercises into this broader cycle of improving preparedness is crucial. A pilot study conducted by the CDC, for instance, showed that the cost of an exercise for Public Health Emergency Preparedness awardees ranged from $1172 to $22 792.17 A separate study conducted by the Harvard School of Public Health estimated that a regional tabletop exercise involving 5 local health departments with 46 participants would cost about $24 000 for planning, conduct, and evaluation.18 In this study, we investigated whether exercise participation itself is associated with improved preparedness. Specifically, our goal was to determine whether there was an association between participation in prior preparedness exercises and performance on objective measures of response.
In collaboration with the Massachusetts Department of Public Health, we developed and conducted a statewide tabletop exercise series between August 2011 and January 2012 in all 6 preparedness regions of Massachusetts.19 A tabletop exercise involves key personnel discussing simulated scenarios in an informal setting based on existing operational plans and identifying where those plans need to be refined.20 The exercises focused on a hazardous materials (HAZMAT) scenario. All of the acute care hospitals in the state, as well as their neighboring public health and emergency medical service providers, were invited to participate.
Exercise design and evaluation process
Participating hospitals in each region played at the same time in the same venue. Hospital-based participants were seated at individual tables with their local public health and public safety partners. Each table was assigned a facilitator to guide the discussion and ensure focus on the salient issues of the exercise. State and regional public safety, emergency management, and public health representatives were grouped at a central table during each regional exercise to allow local institutions and responders to communicate with their regional and state counterparts. The facilitators moderated the exercise discussions and posed a set of decision-oriented questions that were developed by the exercise planners. Evaluators compiled notes during real-time exercise play and documented performance using a standardized exercise evaluation tool (described below). The evaluators had an average of 17 years of experience in emergency preparedness (range, 6-32 years), and all had received formal training on the incident command system, information management, and HAZMAT. Prior to the exercise, they also received training on the use of the evaluation instrument and scoring guidance and were provided with information regarding the capabilities being tested during the exercise. In the exercise program, we defined hospitals as having sufficient representation if they had at least 4 participants present for the exercise whom they considered integral to responding to a HAZMAT event. Hospitals that were unable to bring sufficient representatives to the exercise therefore did not play independently; their representatives were combined with representatives of other hospitals in a similar situation during the exercise. Data from hospitals that were unable to participate independently in the exercises were excluded from this study.
In previous work, the Harvard School of Public Health developed an instrument for evaluation of performance in exercises (see Supplemental Digital Content for a copy of the instrument and further details, available at: http://links.lww.com/JPHMP/A24). Our evaluation instrument contains both subjective and objective performance measures aligned in a grid that allows documentation of performance. Each measure contains a checklist of actions and a score so that evaluators can objectively document the performance of specific actions as well as subjectively rate the quality of such performance. The measures are selected on the basis of the exercise objectives. Data are collected using both checklists of expected actions for each measure and completion of a 10-point Likert scale for each measure with responses ranging from 1 (very poor) to 10 (very good). The evaluation instrument has been pilot-tested for its reliability, usability, and validity by independent evaluators during 5 tabletop exercises. The interrater reliability ranged from moderate to substantial agreement (κ = 0.52-0.68; P < .001). Ranking of the entities that participated in these exercises using the evaluation score from the instrument was consistent with what was originally hypothesized by a group of experts. Recommendations from content experts and evaluators were used to modify the instrument. In addition to collecting specific data on the chosen measures, evaluators document unique observations in their evaluation forms and also offer responses to open-ended questions at the end of the instrument. For this exercise, the evaluation instrument consisted of 23 measures. The measures selected for this exercise program were identified through a comprehensive review of literature based on the exercise objectives, review of prior valid measures identified by our research program, and recommendation by content experts.
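To make the interrater reliability figure concrete, the following minimal Python sketch computes Cohen's κ for two evaluators who scored the same checklist items. The text does not state which κ variant was used and the pilot ratings are not published, so the choice of Cohen's κ, the function name, and the example ratings are all assumptions made purely for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: derived from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Made-up checklist ratings (1 = action observed, 0 = not observed); NOT study data.
evaluator_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
evaluator_2 = [1, 0, 0, 1, 0, 1, 1, 1, 1, 0]
kappa = cohens_kappa(evaluator_1, evaluator_2)
```

Under the commonly used Landis and Koch benchmarks, values of 0.41-0.60 are described as moderate agreement and values of 0.61-0.80 as substantial agreement, which matches the wording used for the reported range.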
The measures were then grouped into 3 subsections to correlate with exercise play: (1) ability to maintain situation awareness, receive appropriate notifications, and act on incident information (8 tasks); (2) ability to request, activate, and receive/transport CHEMPACK (6 tasks); and (3) ability to decontaminate, triage, and manage contaminated or potentially contaminated patients (9 tasks). CHEMPACK is part of the CDC's Strategic National Stockpile program and is designed to preposition antidotes to expedite the treatment of individuals exposed to chemical nerve agents.21
Data collection and analysis of performance
Individual evaluators transferred their data from their handwritten forms into an online data collection form designed by our program to facilitate electronic submission. Evaluation data were then downloaded from the online version of the evaluation tool into a Microsoft Excel database. One point was allocated for each checklist action that was accomplished (n = 87), and a quantitative score between 1 and 10 was allocated for every measure (maximum value = 230). The total performance score was calculated by summing the checklist action score and the quantitative score (maximum value = 317). The open-ended questions were used to validate the scores and to prepare the AAR on the exercise.
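The scoring rule described above can be sketched in a few lines of Python. The counts (87 checklist actions, 23 measures rated 1 to 10, maximum 317) come from the text; the `Evaluation` structure, the function names, and the conversion to a percentage by dividing by 317 are assumptions for illustration, since the text does not spell out how the reported percentage scores were derived.

```python
from dataclasses import dataclass

CHECKLIST_ACTIONS = 87  # from the text: one point per accomplished action
MEASURES = 23           # from the text: each measure rated from 1 to 10
MAX_TOTAL = CHECKLIST_ACTIONS + MEASURES * 10  # 87 + 230 = 317


@dataclass
class Evaluation:
    """One hospital's evaluation form (hypothetical structure)."""
    actions_done: list  # one boolean per checklist action
    ratings: list       # one integer rating (1-10) per measure


def total_score(ev):
    """Checklist points plus the sum of the per-measure ratings."""
    if len(ev.actions_done) != CHECKLIST_ACTIONS or len(ev.ratings) != MEASURES:
        raise ValueError("form must cover all 87 actions and all 23 measures")
    if any(not 1 <= r <= 10 for r in ev.ratings):
        raise ValueError("each rating must lie between 1 and 10")
    checklist_points = sum(1 for done in ev.actions_done if done)
    return checklist_points + sum(ev.ratings)


def percent_score(ev):
    """Total score as a percentage of the 317-point maximum (assumed conversion)."""
    return 100 * total_score(ev) / MAX_TOTAL


# Made-up form, NOT study data: 50 of 87 actions accomplished, every measure rated 7.
example = Evaluation(actions_done=[True] * 50 + [False] * 37, ratings=[7] * MEASURES)
example_total = total_score(example)
example_percent = percent_score(example)
```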
Variables of interest
To relate performance data to hospital characteristics that might also affect exercise performance, we analyzed the evaluation score in relation to the following selected variables of interest: (1) mean number of hospital HAZMAT exercises attended by the participants of each hospital in the past 3 years, which was used as a proxy for the HAZMAT exercise experience of each hospital; (2) the hospital's participation in a prior CHEMPACK-specific exercise; (3) teaching status of the hospital (teaching vs nonteaching hospitals); (4) hospital size (small hospitals, <180 beds; medium-sized hospitals, 181-339 beds; and large hospitals, ≥340 beds); (5) mean number of preparedness trainings attended by the participants of each hospital in the past year; and (6) mean years of emergency preparedness and response experience of the participants of each hospital.
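As a rough illustration of how two of these hospital-level variables could be derived from participant-level records, consider the Python sketch below. The record layout and function names are hypothetical. Note also that the size definitions in the text leave a hospital with exactly 180 beds unclassified; placing it in the medium category here is an assumption, not something the text specifies.

```python
from collections import defaultdict
from statistics import mean


def size_category(beds):
    """Map a bed count to the size categories defined in the text."""
    # Text: small <180 beds, medium 181-339 beds, large >=340 beds.
    # Exactly 180 beds is not covered by the text; it falls into "medium"
    # here purely as an assumption.
    if beds < 180:
        return "small"
    if beds < 340:
        return "medium"
    return "large"


def mean_exercises_by_hospital(records):
    """Average the participants' exercise counts within each hospital.

    records: iterable of (hospital_id, exercises_attended_in_past_3_years).
    """
    by_hospital = defaultdict(list)
    for hospital_id, n_exercises in records:
        by_hospital[hospital_id].append(n_exercises)
    return {hospital: mean(counts) for hospital, counts in by_hospital.items()}


# Made-up participant records for two hypothetical hospitals; NOT study data.
records = [("A", 2), ("A", 4), ("A", 3), ("A", 3),
           ("B", 1), ("B", 0), ("B", 2), ("B", 1)]
proxy = mean_exercises_by_hospital(records)
```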
Data were analyzed using STATA 12.0. Descriptive analysis was used to assess performance. The total performance scores were normally distributed as tested by a normal probability plot and a skewness/kurtosis test; therefore, parametric statistical methods were used for comparison between subgroups. A 2-sided t test was used to compare mean exercise performance between hospitals with respect to the number of prior HAZMAT exercises, participation in the CHEMPACK exercise, and teaching status of the hospital. One-way analysis of variance was used to compare mean exercise performance with respect to hospital size. The Spearman ρ coefficient was used to assess the correlation between exercise performance and the other variables of interest.
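For readers who want to see the mechanics of the two main test statistics, the sketch below computes a two-sample t statistic and Spearman's ρ using only the Python standard library. The analysis in this study was run in STATA, not Python; the pooled-variance (equal variances) form of the t test is an assumption, since the text does not say which form was used, and the example scores are made up. P values are not computed here; a statistical package (for example, `scipy.stats.ttest_ind` and `scipy.stats.spearmanr`) would normally supply them.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))


# Made-up percentage scores for two hypothetical groups; NOT study data.
frequent = [70, 65, 80, 60]
infrequent = [55, 50, 60, 45]
t_stat, dof = pooled_t_statistic(frequent, infrequent)
```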
Of 74 acute care hospitals in Massachusetts, 47 (65%) hospitals participated in the exercise. In all, 22 (47%) of the 47 participating hospitals were deemed to have too few participants to take part in the exercise independently. These hospitals were gathered into groups with other similar hospitals for exercise play, but they were excluded from our data analysis because they were evaluated as a group rather than individually. Twenty-five (53%) hospitals represented by 170 participants were therefore included in our analysis. Of the hospitals that were included in the analysis, 11 (44%) were small, 4 (16%) were medium-sized, and 10 (40%) were large. Twenty (80%) hospitals were nonteaching hospitals, whereas the remaining 5 (20%) were teaching hospitals. With respect to prior exercise participation, 11 (44%) hospitals had participated in 2 or fewer HAZMAT exercises in the past 3 years (mean number of exercises = 1.7). Fourteen (56%) hospitals had participated in 3 or more exercises in the past 3 years (mean number of exercises = 4.9). Four (16%) hospitals had participated in the prior CHEMPACK exercise, whereas 21 (84%) hospitals had not. On average, exercise participants had attended 3 emergency preparedness trainings in the past year, and their average duration of experience in emergency preparedness was 11.6 years.
The mean aggregate performance score of the 25 hospitals was 62% (SD = 11%). Performance scores ranged from 42% to 85%. The Figure shows the categories of the performance score. The mean performance scores for the 3 evaluation subsections were as follows: 66% (SD = 10%) for the situational awareness and notification section, 56% (SD = 18%) for the CHEMPACK section, and 62% (SD = 18%) for the decontamination section.
Impact of hospital and participants' characteristics on exercise performance
Number of prior HAZMAT exercises in the past 3 years
The overall performance score of hospitals that had participated in 3 or more exercises in the past 3 years was higher (mean = 67%; range, 50%-85%) than that of hospitals that had participated in fewer exercises (mean = 55.7%; range, 42%-62%). The performance difference between the 2 groups was significant (P = .004). This difference persisted even when each subsection of the exercise was analyzed separately (Table).
Participation in prior CHEMPACK exercises
By coincidence, some hospitals in our exercise program had participated in a different response exercise program focused solely on the CHEMPACK program within the past 6 months. We performed a separate analysis of the data contrasting those hospitals that had recently tested their CHEMPACK plans with those that had not. The performance score of hospitals that had participated in the CHEMPACK exercise that took place about 6 months prior to this exercise was 62%, whereas those that did not participate in the CHEMPACK exercise had a slightly higher score of 63%. The difference was not statistically significant (P > .05).
Size of hospitals
The mean performance score of medium and large hospitals was the same (66%), whereas that of the smaller hospitals was lower (mean = 57%). A one-way analysis of variance showed that the differences in scores among the hospital size categories were not statistically significant (P > .05).
Teaching status of hospitals
The mean performance score of teaching hospitals and nonteaching hospitals was 66% and 60%, respectively. No statistically significant differences were identified in the scores (P > .05).
Emergency preparedness training attended in the past year
The correlation between the mean number of emergency preparedness trainings attended by participants of a hospital and the performance score was low (r = 0.2; P > .05). Hospitals whose participants had attended a mean of 3 or more trainings scored an average of 58%, whereas hospitals whose participants had attended a mean of fewer than 3 trainings scored an average of 59%.
Average time involvement of participants in emergency preparedness
The correlation between the average time involvement of participants in emergency preparedness in years and the performance evaluation score of hospitals was very low (r = 0.17; P > .05).
Exercises are commonly used by health sector entities to test emergency preparedness and response capabilities. It is well documented that such exercises can uncover gaps in planning, conflicts with other plans, inadequate resources, and other problems.9,12–15 It is less known, however, whether the act of participating in exercises itself assists in improving emergency preparedness and response. In this study, we endeavored to assess the relationship between repeated exercising and the overall emergency response capabilities of hospitals, as measured by performance during a tabletop exercise.
Our study shows that hospitals that participate in emergency response exercises more frequently perform better on a standardized assessment tool than those that exercise less frequently. There was, however, no correlation between performance and hospital size or teaching status. Djalali et al22 obtained similar findings when they assessed the relationship between hospital size and exercise performance as measured by indicators of the hospital incident command system. Interestingly, performance was also not correlated with the years of emergency preparedness experience of the hospital staff participating in the exercise or with the number of recent trainings they had attended. These data suggest that participation in previous exercises and subsequent improvement of emergency plans and systems are more likely the drivers of better emergency response performance.
One might hypothesize that the improved performance observed in participants who exercised more frequently was the result of their having “taken the test” multiple times and therefore knowing the “right answers” more often. We believe this is not the case. First, none of the hospitals had participated in this exercise scenario previously, and therefore they did not know what to expect. Second, the improved performance was seen in only the 2 phases of exercise play that focused on emergency plans that are under local control. The first and third phases of our exercise tested notification/communication systems and on-site decontamination plans, respectively, which are under the control of the individual hospitals. The second phase of our exercise tested aspects of the CHEMPACK plan, which is not written by local entities or under their control. Therefore, it is not surprising that hospitals that exercised more frequently were unable to improve performance on a phase not under their control (CHEMPACK). This relation between lack of difference in observed performance and ability to control plans is underscored by the subset analysis we were able to perform. Although our sample size is small, those hospitals that recently exercised their CHEMPACK plans did not seem to demonstrate improved performance when compared with those that did not exercise those plans recently. We believe that this lack of improvement may be the result of a lack of ability to amend those plans and make local improvements. This observation suggests that the observed differences in performance are not the result of taking the test multiple times but may be the result of implementation of local improvements by hospitals after exercising.
In the 2008 National Profile, 76% of the local health departments that participated in exercises reported that they revised their written emergency response plans on the basis of recommendations from an exercise AAR.23 A review of the AAR of the response to the 2011 Joplin, Missouri, tornado also showed how participation in a National Level Exercise and other periodic exercises helped federal, state, regional, local, and private sector personnel respond effectively to the tornado.24 We believe that the improved performance seen in hospitals that exercise more frequently is likely the result of greater iterative identification and correction of gaps and problems in emergency response. Therefore, independent of the scenario, hospitals that exercise more frequently might be expected to generally perform better than those that do not on aspects of emergency response that are under their local control. On the basis of other studies, we also hypothesize that this observed improvement in exercise performance is likely to translate into improved performance in response to real-world emergencies12,25; however, our data do not address this issue.
Our study had several important limitations. First, we were only able to observe the performance of hospitals in one exercise scenario. Our findings would be stronger if they were found to persist over multiple exercises and multiple scenarios. Second, although we collected data from a statewide exercise series sponsored by the state Department of Public Health, not all hospitals in the state were able to participate. Furthermore, some hospitals were unable to send sufficient participants to the exercise to meaningfully represent their institution and were therefore excluded from this analysis. The resulting smaller sample size reduced statistical power. It is possible that our sample was simply too small to detect an association between performance and some variables. A larger sample could provide more definitive evidence and would allow for more extensive statistical analysis. The sample size also limits the generalizability of this study. Third, the mean number of HAZMAT exercises attended (over a 3-year period) by the participants at each hospital's table was used as a proxy for the HAZMAT exercise experience of each hospital. We believe that this gave us a good estimate of the experience of each hospital because the analysis was limited to the hospitals that had sufficient representation, that is, at least 4 participants present for the exercise whom they considered integral to responding to any HAZMAT event. Fourth, we did not take into account the types of exercises conducted or the types of trainings held. There are clearly differences in exercises and trainings that may have had an impact on the resulting improvement of the participants. Finally, the method we used to calculate the total evaluation score from the instrument has not yet been validated. Therefore, it may not give us an accurate estimate of individual hospital performance. However, we feel that the estimate gives an indication of how well the hospitals performed relative to each other.
There are several possibilities for future research. In addition to increasing the sample size and examining different exercise scenarios, future research should analyze the effect of participation in exercises on real-world response.
Preparedness exercises are frequently used to test capabilities and response plans in the face of various scenarios. Little is known about how participation in exercises affects preparedness and response in other scenarios. Our data show that more frequent participation in emergency preparedness exercises is correlated with improved performance measured in a tabletop exercise. Performance appears to be unrelated to hospital size, teaching status, the years of experience of the participants in preparedness, and the number of trainings participants recently attended. This suggests that more frequent participation in exercises may result in improved overall response and appears to support current recommendations for exercising frequently7,8 to ensure optimal readiness and response.
1. Adini B, Laor D, Cohen R, Lev B, Israeli A. [The five commandments for preparing the Israeli healthcare system for emergencies]. Harefuah. 2010;149(7):445–480.
2. Hick JL, Hanfling D, Cantrill SV. Allocating scarce resources in disasters: emergency department principles. Ann Emerg Med. 2012;59(3):177–187.
3. Ross KL, Bing CM. Emergency management: expanding the disaster plan. Home Healthc Nurse. 2007;25(6):370–377; quiz 386-387.
4. Rumoro DP, Bayram JD, Malik M, Purim-Shem-Tov YA; Emergency Response Training Group. A comprehensive disaster training program to improve emergency physicians' preparedness: a 1-year pilot study. Am J Disaster Med. 2010;5(6):325–331.
5. Savoia E, Agboola F, Biddinger PD. Use of after action reports (AARs) to promote organizational and systems learning in emergency preparedness. Int J Environ Res Public Health. 2012;9(8):2949–2963.
6. Centers for Disease Control and Prevention. Assessment of household preparedness through training exercises—two metropolitan counties, Tennessee, 2011. MMWR Morb Mortal Wkly Rep. 2012;61(36):720–722.
9. Biddinger PD, Savoia E, Massin-Short SB, Preston J, Stoto MA. Public health emergency preparedness exercises: lessons learned. Public Health Rep. 2010;125(suppl 5):100–106.
10. Sarpy SA, Warren CR, Kaplan S, Bradley J, Howe R. Simulating public health response to a severe acute respiratory syndrome (SARS) event: a comprehensive and systematic approach to designing, implementing, and evaluating a tabletop exercise. J Public Health Manag Pract. 2005;(suppl):S75–S82.
11. Lurie N, Wasserman J, Nelson CD. Public health preparedness: evolution or revolution? Health Aff (Millwood). 2006;25(4):935–945.
12. Dausey DJ, Buehler JW, Lurie N. Designing and conducting tabletop exercises to assess public health preparedness for manmade and naturally occurring biological threats. BMC Public Health. 2007;7:92.
13. Bartley BH, Stella JB, Walsh LD. What a disaster?! Assessing utility of simulated disaster exercise and educational process for improving hospital preparedness. Prehosp Disaster Med. 2006;21(4):249–255.
14. Fowkes V, Blossom HJ, Sandrock C, Mitchell B, Brandstein K. Exercises in emergency preparedness for health professionals in community clinics. J Community Health. 2010;35(5):512–518.
15. Henning KJ, Brennan PJ, Hoegg C, O'Rourke E, Dyer BD, Grace TL. Health system preparedness for bioterrorism: bringing the tabletop to the hospital. Infect Control Hosp Epidemiol. 2004;25(2):146–155.
16. US Department of Homeland Security. Homeland Security Exercise and Evaluation Program Volume III: Exercise Evaluation and Improvement Planning, Revised 2007. Washington, DC: US Department of Homeland Security. https://hseep.dhs.gov/support/VolumeIII.pdf. Accessed December 1, 2012.
18. McCarthy T, Agboola F, Biddinger PD. The cost of a tabletop exercise. Poster session presented at: Regroup, Refocus, Refresh: Sustaining Preparedness in an Economic Crisis. 2012 Public Health Preparedness Summit; February 21-24, 2012; Anaheim, CA.
20. US Department of Homeland Security. Homeland Security Exercise and Evaluation Program Volume I: HSEEP Overview and Exercise Program Management, Revised February 2007. Washington, DC: US Department of Homeland Security. https://hseep.dhs.gov/support/volumeI.pdf. Accessed February 14, 2013.
22. Djalali A, Castren M, Hosseinijenab V, Khatib M, Ohlen G, Kurland L. Hospital incident command system (HICS) performance in Iran; decision making during disasters. Scand J Trauma Resusc Emerg Med. 2012;20:14.
25. Biddinger PD, Cadigan RO, Auerbach BS, et al. On linkages: using exercises to identify systems-level preparedness challenges. Public Health Rep. 2008;123(1):96–101.
emergency preparedness; emergency preparedness exercise; exercise evaluation; hospital preparedness; performance