More than a decade after September 11, policy makers legitimately ask whether the more than $21 billion that has been spent on public health emergency preparedness (PHEP) at the federal, state, and local levels has been effective. And, in light of severe pressures on governmental budgets, policy makers must ask whether this investment can be maintained and what must be preserved. At the same time, the National Health Security Strategy calls for systematic quality improvement (QI) efforts to improve health security. Whether the goal is to ensure accountability to policy makers or to facilitate QI, valid and reliable measures of preparedness are needed.
Measuring and assessing the state of the nation's preparedness, however, are challenging. From a system dynamics perspective, one can think of a preparedness production function P = f(X, Y, Z), where P is a 1-dimensional measure of “preparedness,” X represents federally supported preparedness activities, Y denotes other activities conducted by state and local public health agencies, hospitals, etc., and Z represents other factors that affect preparedness. The problem is that we do not know how to represent P as a single dimension (in other words, which aspects of preparedness matter most), what functional form f(·) takes, which of the factors X, Y, and Z matter the most, or how to measure X, Y, and Z.
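One way to make the first of these difficulties concrete is sketched below. It is an illustration only, not a formulation taken from the work cited here: the capability-specific levels P_j, the component functions f_j, and the weights w_j are all hypothetical symbols introduced for this sketch.

```latex
\[
P_j = f_j(X, Y, Z), \quad j = 1, \dots, k,
\qquad
P = f(X, Y, Z) = \sum_{j=1}^{k} w_j \, f_j(X, Y, Z),
\qquad
w_j \ge 0, \quad \sum_{j=1}^{k} w_j = 1 .
\]
```

Under this assumed weighted-sum form, collapsing preparedness into a single dimension amounts to choosing the weights w_j, which is the same as deciding which aspects of preparedness matter most. Neither the weights nor the component functions f_j are known.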
Ideally, we would like to know about the health or social outcomes that are associated with greater degrees of preparedness, but these are impossible to measure. The challenge derives in part from the lack of opportunities to assess outcomes by direct observation, as public health emergencies are thankfully rare. Moreover, public health emergencies often require a multijurisdictional, multisectoral response, so it is difficult to know what the best approach would have been. The lack of “counterfactuals”—what would have happened under another response—complicates the matter further.
To address these challenges, my colleagues and I at the Harvard School of Public Health Preparedness and Emergency Response Research Center have been taking a systematic approach to the development of PHEP measures, modeled after the “science of measurement” methodology that is increasingly common in health care QI efforts.1 We begin by identifying the basic dimensions of preparedness, which we believe are the PHEP system's response capabilities, as opposed to preparedness capacities or health or social outcomes. This distinction is reflected in a logic model that specifies the goals and objectives of public health preparedness and consolidates evidence and credible professional knowledge about what works in certain circumstances (Figure). To the extent that the logic model is correct and supported by evidence, one can measure capacities or capabilities as proxies for outcomes, just as structure and process measures are used in health care quality assessments.1
In this model, capacities represent the resources (infrastructure, response mechanisms, and knowledgeable and trained personnel) that a public health system has to draw upon. The capacities are organized using the legal, economic, and operational domains of Potter and colleagues,2 with the addition of “social capital,” the intangible partnerships and informal relationships between individuals and organizations that research shows are critical to effective emergency operations and community resilience.3 The major problem with a focus on capacities is that we often have no credible evidence that the capacities, individually or in combination, in fact ensure the desired outcome.
Capabilities, in contrast, describe the actions a public health system is capable of taking to effectively identify, characterize, and respond to emergencies: surveillance, epidemiologic investigations, laboratory analysis, disease prevention and mitigation, surge capacity for health care services, risk communication to the public, and coordination of system responses through an effective incident management system. Capabilities, therefore, are latent characteristics of the PHEP system that are best measured and assessed when the PHEP system responds to an emergency. The problem with capabilities is that actual emergencies are not frequent enough or repeated consistently enough to allow for statistical performance measures.1
Current Approaches to Measuring and Assessing PHEP
In the last decade, there has been much progress in PHEP measurement. This progress includes the development of specific validated measurement systems such as the Strategic National Stockpile program Technical Assistance Reports as well as valid and reliable methods for measuring performance during exercises.4 In addition, the performance measures required by the Centers for Disease Control and Prevention's PHEP cooperative agreements have evolved from inventories and capacity assessments to a capability-based framework.5 Current measures of the PHEP capabilities, however, are largely capacity-based or not yet developed. And only 2 of the 10 measures in the most recent Trust for America's Health Ready or Not report reflect capabilities: the whooping cough vaccination coverage rate for children and the state's ability to rapidly notify and assemble public health staff.6
Experience with hurricanes Katrina, Ike, and Gustav, as well as the 2009 H1N1 pandemic, demonstrates both the challenges and the potential of learning from actual critical events. Public health agencies commonly use After Action Reports for this purpose, but analyses of After Action Reports from the 2009 H1N1 pandemic show that they have limitations. Savoia and colleagues7 and Stoto and colleagues8 have found that key aspects of the public health response are often not addressed in these reports and that the reports typically do not identify or address root causes or even identify why a specific response was a success or a failure. Many of these After Action Reports follow the Federal Emergency Management Agency's Homeland Security Exercise and Evaluation Program format, which focuses on evaluating the response and preparedness plans in terms of individual capabilities from the Department of Homeland Security's Target Capability List. The Target Capability List does not adequately represent actual public health system responses, and, moreover, the focus on individual capabilities makes it difficult to understand the overall functioning of the public health preparedness system and explore root causes.8 For instance, none of the target capabilities represents PHEP systems' ability to adapt in the face of uncertainty, which, as the case study in the following text shows, was an important factor in the 2009 H1N1 response. This approach can work well for evaluating exercises, where the focus is on testing plans and the system's ability to execute them, but it is less effective in assessing system performance during an actual event.
As an alternative to these structured approaches, expert assessments prepared by organizations such as the Institute of Medicine9 and the Government Accountability Office10 or individual experts11,12 can be very useful. Such assessments offer the potential for a thoughtful, in-depth, holistic analysis by informed experts along with concrete, practical, and relevant suggestions for improvement. Perhaps one of the best of this genre is the independent review of the United Kingdom's response to the 2009 influenza pandemic.13 Reviews of this type, however, typically lack a clear definition of PHEP or a comparative framework. They tend to be idiosyncratic and can depend more on the perceptions and experience of the “expert” than on actual performance data. As a result, such assessments are difficult to compare across jurisdictions for accountability purposes or to compare over time to monitor QI efforts.
Case Study: The Public Health Response to the 2009 H1N1 Pandemic
Our research on the 2009 H1N1 pandemic has demonstrated the potential of rigorous qualitative assessments of PHEP system capabilities. We have used the major PHEP response capabilities (assessment, policy development, and assurance) from the logic model in the Figure to structure this analysis, and the fourth major capability—coordination and communication—appears throughout.
With regard to public health assessment capabilities (surveillance, epidemiologic analysis, and laboratory analysis), Stoto14 demonstrated how public health surveillance data are potentially biased because they depend on a series of decisions made by patients, health care providers, and public health professionals about seeking and providing health care and about reporting cases or otherwise taking action that comes to the attention of health authorities. Outpatient, hospital-based, and emergency department surveillance systems, for instance, all rely on individuals deciding to present themselves to obtain health care, and these decisions are based in part on their interpretations of their symptoms. Similarly, virologic surveillance and systems based on laboratory confirmations depend on physicians deciding to send specimens for testing. Even the number of Google searches and self-reports of influenza-like illness in the Behavioral Risk Factor Surveillance System survey can be influenced by individuals' interpretation of the seriousness of their symptoms. Every element of this decision making is potentially influenced by what these people know and think, both of which change during the course of an outbreak.14
Zhang and colleagues15 have shown that enhanced laboratory capacity in the United States and Canada, as well as a trilateral agreement enabling collaboration among the United States, Canada, and Mexico, led to earlier detection and characterization of the 2009 H1N1 virus. In addition, improved global notification systems contributed by helping health care officials understand the relevance and importance of their own information. Yet, despite these improved capacities, it took the global public health system months to detect and characterize the newly emerged pH1N1 virus.15
Thus, with respect to both surveillance and outbreak detection, increased preparedness capacities did not ensure adequate response capabilities. Many emergency preparedness professionals think in terms of single cases triggering a response in hours or at most days, and this thinking is reflected in key public health preparedness documents. The Centers for Disease Control and Prevention's and the Trust for America's Health's state-by-state assessments of public health preparedness, for instance, focus on ensuring that state and local public health laboratories can respond rapidly in an emergency, identify or rule out particular known biological agents, and have the workforce and surge capacity to process large numbers of samples during an emergency. Although such capabilities are clearly necessary for some events, they are not sufficient, and none of these measures would have ensured that the public health system could have detected the emergence of pH1N1 and characterized it as well and as efficiently as was done in Mexico and the United States in April 2009. Rather, the surveillance system capabilities that were most essential were the availability of laboratory networks capable of identifying a novel pathogen, notification systems that made health care officials aware of the epidemiologic facts emerging from numerous locations in at least 2 countries, and the intelligence necessary to “connect the dots” and understand their implications.
The public health system's capabilities to develop and implement policies, especially population-based disease control measures, were also tested during the 2009 H1N1 pandemic. With regard to school closings, for instance, Klaiman and colleagues16 document significant variation in the stated goals of closure decisions, including limiting community spread of the virus, protecting particularly vulnerable students, and responding to staff shortages or student absenteeism. Because the goal of closure is relevant to its timing, nature, and duration, unclear rationales for closure can undermine its effectiveness. There was also significant variation in the decision-making authority to close schools in different jurisdictions, which, in some instances, was reflected in open disagreement between school and public health officials. Finally, decision makers did not appear to expect the level of scientific uncertainty encountered early in the pandemic, and they often expressed significant frustration over changing Centers for Disease Control and Prevention guidance. The challenge is not whether jurisdictions have emergency authorities but rather whether a public health system, operating under uncertainty, can effectively clarify the goals of school closure, tailor closure decisions to those goals, and implement these decisions seamlessly.16
The primary public health assurance capability tested during the 2009 H1N1 pandemic was the national vaccination campaign. In many respects, this was a success: in less than a year, a new 2009 H1N1 vaccine was developed, produced, and delivered to 81 million people. The response, however, had its limitations. Less than half of the population was vaccinated, and nearly half of the 162 million 2009 H1N1 vaccine doses produced went unused.17 Moreover, there were substantial disparities in vaccination uptake associated with sociodemographic factors, H1N1-related beliefs, and seasonal vaccination. Blacks were 22% less likely to be vaccinated, 35% less likely to believe vaccine is safe, and more likely to have tried and failed to find vaccine.18
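The dose figures above can be checked with a short calculation. The following is a minimal sketch, not an analysis from the cited sources: it assumes, as a simplification, that each vaccinated person received at least one dose, which gives an upper bound on the share of doses left unused. The function name is hypothetical.

```python
# Figures quoted in the text above (in millions).
PEOPLE_VACCINATED_M = 81   # people who received the 2009 H1N1 vaccine
DOSES_PRODUCED_M = 162     # 2009 H1N1 vaccine doses produced


def max_unused_share(people_vaccinated: float, doses_produced: float) -> float:
    """Upper bound on the share of doses that went unused.

    Assumption (hypothetical simplification): every vaccinated person
    received at least one dose, so at least `people_vaccinated` doses
    were used.
    """
    min_doses_used = people_vaccinated
    return (doses_produced - min_doses_used) / doses_produced


share = max_unused_share(PEOPLE_VACCINATED_M, DOSES_PRODUCED_M)
print(f"At most {share:.0%} of doses went unused")  # At most 50% of doses went unused
```

If some recipients received more than one dose, the number of doses used exceeds 81 million and the unused share falls below this bound, which is consistent with the statement that nearly half of the doses went unused.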
One of the challenges public health faced is that the distribution of pandemic vaccine to local health departments was slow and unpredictable. A case study of the public health response to 2009 H1N1 in Massachusetts illustrates the challenges of managing the distribution of vaccine under these circumstances and identifies several lessons about community resilience. The ad hoc approach health care officials adopted (pooling vaccine among the 6 towns on the island of Martha's Vineyard and sharing resources such as vaccination teams that would go from one school to another in a region) was easy to explain and well accepted by the public and thus an efficient, fair, and flexible response to uncertainty about when vaccine would arrive. The case also demonstrates the importance of strong community-wide partnerships to address public health problems and the need to balance precise policies with flexible implementation as well as the importance of local involvement in decision making and increasing the transparency of communications.19 None of these capabilities are well reflected by typical measures such as whether health departments have a mass dispensing plan and have experience with the Incident Command System.
The Massachusetts experience also demonstrates how the most effective way to implement a given capability (mass administration of a medical countermeasure, specifically the pH1N1 vaccine) depends on the local context. The Table displays the results in terms of the realist evaluation approach, which explores how causal mechanisms generate outcomes depending on the context in which they are deployed. For instance, because children made up a large proportion of the target population, most communities found school-based clinics to be an effective way to administer the vaccine. Boston, in contrast, found that its network of community health centers with established trusting links to the populations they serve worked well. And a large group practice effectively used its existing electronic medical record system to call and set up appointments for patients for whom the available vaccine was appropriate on any given weekend. On Martha's Vineyard and in other rural settings, small local health departments found it useful to collaborate and pool resources to conduct regional clinics. Flexibility and adaptivity were more important than local health departments having capacities that met a statewide or national standard for points of distribution.
Valid and reliable measures of preparedness are needed to both ensure accountability and facilitate QI, but measuring and assessing the state of the nation's public health preparedness are challenging. As the 2009 H1N1 case study shows, current capacity-based structured measurement approaches are not sufficient to predict how the PHEP system would actually perform in an emergency. What matters is not whether we have laboratory capacity and surveillance systems but how well they perform when called upon to detect and characterize a new pathogen and provide accurate situational awareness. It is not whether we have emergency authorities but whether a public health system, operating under uncertainty, can effectively use them to clarify the goals of and implement measures such as school closures. And it is not whether health departments have a mass dispensing plan but whether they can collaborate with all of the public and private organizations in their community to actually dispense vaccines.
To supplement current approaches to measuring PHEP, the nation needs a new approach that combines the structure of the Homeland Security Exercise and Evaluation Program, to ensure that critical capabilities are covered, with a system for realistically assessing PHEP system capabilities in actual situations. This is not to say that capacities are not important; indeed, without them PHEP systems would not have the needed capabilities. Rather, the point is that capability-based preparedness measures based on actual observation of a PHEP system in action are likely to be better measures of how that system will perform in the future.
One such approach would be a PHEP Critical Incident Registry that fosters in-depth analyses of individual incidents and provides incentives to share results with others working in similar contexts and for cross-incident analysis.20 For comparative purposes, Critical Incident Registry reports would address specific PHEP capabilities and could be a platform for a structured set of performance measures. When the focus is on QI and on complex PHEP systems, rather than their components or individuals, qualitative assessment of the system capabilities of PHEP systems can be more useful than quantitative metrics. Ensuring that such assessments are rigorous can be challenging, but a well-established body of social science methods provides a useful approach.1
To enable organizational learning, the capabilities must be defined at a high enough level so that lessons learned in one example can be transferred to similar situations and to future emergencies. Defining capabilities in a more generic way can also make measures of these capabilities more comparable. The Massachusetts case study shows, for instance, that what matters is how well a PHEP system can administer a countermeasure to the target population, not whether local health departments meet national capacity standards for, say, points of distribution. The 2009 experience also shows that the local public health system in Massachusetts had this capability, including the flexibility to adapt to local context. While future emergencies will likely require other countermeasures and target different populations, the positive experience in 2009 is probably a better measure of the state's ability to respond effectively to future emergencies than capacity-based measures focusing on points of distribution standards and similar capacities.
2. Potter MA, Houck OC, Miner K, Kimberley S. Data for preparedness metrics: legal, economic, and operational. J Public Health Manag Pract. 2013;19(5):S22–S27.
3. Koh H, Elqura L, Judge C, et al. Regionalization of local public health systems in the era of preparedness. Annu Rev Public Health. 2008;29:205–218.
4. Nelson C, Chan E, Fan C, et al. New Tools for Assessing State and Local Capabilities for Countermeasure Delivery, RAND TR-665. Santa Monica, CA: RAND Corporation; 2009. http://www.rand.org/pubs/technical_reports/TR665/. Accessed July 2, 2013.
5. Centers for Disease Control and Prevention. Public Health Preparedness Capabilities: National Standards for State and Local Planning, 2012. Atlanta, GA: Centers for Disease Control and Prevention; 2012. http://www.cdc.gov/phpr/capabilities. Accessed July 2, 2013.
7. Savoia E, Agboola A, Biddinger P. Use of After Action Reports (AARs) to promote organizational and systems learning in emergency preparedness. Int J Environ Res Public Health. 2012;9:2949–2963.
8. Stoto M, Nelson C, Higdon M, et al. Learning about after action reporting from the 2009 H1N1 pandemic: a workshop summary. J Public Health Manag Pract. 2013;19(5):420–427.
9. Institute of Medicine. The 2009 H1N1 Influenza Vaccination Campaign: Summary of a Workshop Series. Washington, DC: National Academies Press; 2010.
10. Government Accountability Office. Influenza Pandemic: Lessons From the H1N1 Pandemic Should Be Incorporated Into Future Planning. Washington, DC: United States Government Accountability Office; 2009. GAO-11-632. http://www.gao.gov/assets/330/320176.pdf. Accessed July 2, 2013.
11. Hanfling D, Hick J. Hospitals and the novel H1N1 outbreak: the mouse that roared? Disaster Med Public Health Prep. 2009;3(suppl 2):S100–S106.
12. Barry J. The next pandemic. World Policy J. 2010;27:10–12.
14. Stoto M. The effectiveness of U.S. public health surveillance systems for situational awareness during the 2009 H1N1 pandemic: a retrospective analysis. PLOS One. 2012;7(8):e40984.
15. Zhang Y, Lopez-Gatell H, Alpuche-Aranda C, Stoto M. Did advances in global surveillance and notification systems make a difference in the 2009 H1N1 pandemic? A retrospective analysis. PLOS One. 2013;8(4):e59893.
16. Klaiman T, Kraemer J, Stoto M. Variability in school closure decisions in response to 2009 H1N1. BMC Public Health. 2011;11:73.
17. Stoto M, Nelson C, Higdon M, et al. Lessons about the state and local public health system response to the 2009 H1N1 pandemic: a workshop summary. J Public Health Manag Pract. 2013;19(5):428–435.
18. Galarce E, Minsky S, Viswanath K. Socioeconomic status, demographics, beliefs and A(H1N1) vaccine uptake in the United States. Vaccine. 2011;29:5284–5289.
19. Higdon M, Stoto M. The Martha's Vineyard public health system responds to 2009 H1N1. Int Public Health J. In Press.
20. Piltch-Loeb R, Kraemer J, Stoto M. Synopsis of a public health emergency preparedness critical incident registry (CIR). J Public Health Manag Pract. 2013;19(5):S93–S94.