From the University of Washington School of Public Health, University of Washington, Seattle, WA.
Editors' note: This series addresses topics that affect epidemiologists across a range of specialties. Commentaries are first invited as talks at symposia organized by the Editors. This paper was originally presented at the 2010 Society for Epidemiologic Research Annual Meeting in Seattle, WA.
Editors' note: Related articles appear on pages 760 and 764.
Correspondence: Sherrilynne Fuller, Division of Biomedical and Health Informatics, School of Medicine, Center for Public Health Informatics, School of Public Health, 35–4943, University of Washington, Seattle, WA 98195. E-mail: firstname.lastname@example.org.
The term “Global Express” was coined by Ann Marie Kimball1 to describe the system that connects us across oceans, continents, national boundaries, cultures, languages, groups, ethnicity, and trade systems. Kimball describes the public health challenges presented by infectious diseases in an era of international trade, travel, and migration. She illustrates the challenge as presented by the SARS outbreak, which rapidly spread across continents. As global trade continues to increase in volume, in diversity of products, and in speed of movement, and as infectious disease agents (including mosquitoes) continue to expand their territories as a result of climate change, the need for rapid response information systems has never been greater.2
Traditional approaches to data collection are slow to pick up threats, and they tend to be retrospective in nature rather than prospective. A century and a half ago, it took a year to circumnavigate the globe; today it takes less than 36 hours. The incubation period of many infectious diseases is often longer than the time it takes to travel from one location to another. In the past, infectious disease outbreaks were detected on ships as they pulled into port, and the ships were quarantined until the danger passed. A century ago, there was extensive research to identify and track cholera epidemics based on trade routes.3 Although the methods used to collect and map the data were crude by today's standards, they did not have to deal with the rapid movement of humans, animals, and other trade goods. We need much more accurate and rapid approaches to identify health challenges and prevent their spread.
Prediction is at the heart of many public health systems. Despite huge investments in electronic surveillance systems and data warehouses, public health professionals continue to be the most likely “early warning system,” noticing and reporting unusual clusters of symptoms or unexpected disease virulence. The efficacy of surveillance systems is limited by the lack of coherent, comprehensive data collection, with limited or no interoperability even at the local or national level, let alone regionally and internationally. Vital data are trapped in “data silos” held by different organizations and different applications, in a variety of formats, on different devices, and within different networks. Further, we lack tools to support multidisciplinary, international collaboration and research in identifying, preventing, and responding to biorisks of all types. Chretien et al4 provide a summary of the potential of syndromic surveillance systems in developing countries. Brownstein et al5 discuss a variety of digital resources that yield useful disease-tracking information and complement traditional surveillance resources.
In the spirit of Web 2.0, and based on a variety of communications technologies, new tools and technologies are offering public health researchers and professionals the ability to more rapidly collect, analyze, and mine data prospectively rather than rely solely upon retrospective analysis, and to collaborate in creative new ways to effectively coordinate responses to threats. Many of these tools are “open source,” which means that the software can be used, redistributed, or modified at no charge to the user. Several of these tools and approaches to data collection are described below.
DATA COLLECTION AND COORDINATION ACROSS THE MEKONG BASIN COUNTRIES
The 6 countries in the Mekong Basin Disease Surveillance organization are Cambodia, China (Yunnan and Guangxi provinces), Lao, Myanmar, Thailand, and Vietnam. Disease surveillance across this region is greatly improving the ability to identify disease outbreaks and respond to them rapidly, through the use of standards for data collection and disease definition, mobile technologies for data collection and sharing, advanced geographic mapping systems, and significant coordination across country boundaries. The Biomedical and Public Health Informatics Center at Mahidol University (Thailand) has developed mobile technologies to permit transmission of geocoded photographs of blood samples to central laboratories. This allows rapid detection of drug-resistant malaria strains in remote areas of the region and rapid relay of treatment information to local health workers.6
MOBILE TECHNOLOGIES FOR DATA COLLECTION AND SHARED DECISION MAKING
In many parts of the world, the lack of reliable infrastructure (including computing and communications) and the lack of skilled workers have made data collection difficult. However, the exponential growth of cell-phone infrastructure is rapidly transforming data collection and management. Many types of data (text, photos, location, audio, barcode scans, and video) can be rapidly collected and transmitted for public health research and decision support.
Open Data Kit is a suite of open-source tools developed by computer scientists and engineers at the University of Washington and around the world. These tools are faster and more accurate than paper-based forms for data collection and analysis and less expensive than alternative computing technologies. By using existing cellular networks, Open Data Kit developers are freeing users from the constraints of traditional computer systems in developing countries and beyond. Features including global positioning systems (GPS), video, and photographs provide a contextually richer set of data than is possible with most paper forms, and the information can be compiled, shared, and analyzed much faster. For example, medical workers in Kenya conduct house-to-house visits for HIV counseling and testing, using Open Data Kit-equipped “smart phones” to track patients' medical histories (accessed by using the phone to scan a bar code on a patient's ID card) and to upload information directly to the medical record system.7
GeoChat, developed by the InStedd group, is a flexible, open-source, group-communications technology that allows team members in emergency situations to connect with one another and to visualize, report, receive, and coordinate data and information. GeoChat is used by rapid-response teams in Thailand and Cambodia, spanning provincial to subdistrict health officers and volunteers.8
DATA MINING AND VISUALIZATION
The classic problem is too much data and not enough information upon which to make decisions. New data-mining and visualization tools and technologies are being developed to provide enhanced views of large sets of aggregate data. A promising new approach to real-time data collection for research and response involves mining streams of data as they are generated on the Internet. Several tools use an approach termed “crowd-sourcing” to gather distributed data through the Web (or alternative data streams) and visualize it in real time. InStedd's Riff tool is an interactive decision-support environment that combines the power of virtual teams of human experts with advanced analytic, machine-learning, and visualization services, allowing its users to collaborate around streams of information and to detect, characterize, and respond quickly to emerging events. During the H1N1 pandemic, Riff was used to mine Google searches related to flu symptoms, identify where those searches were coming from in the United States, and map these locations, resulting in accurate prediction of actual outbreaks as they were developing.9
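The general idea of mining a stream of symptom-related text by location can be illustrated with a minimal sketch. It does not represent how Riff itself works; the keyword list, the (region, text) record layout, the baseline counts, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical keyword list; a real system would use a curated vocabulary.
FLU_TERMS = ("fever", "cough", "flu", "sore throat")


def mentions_flu(text):
    """Return True if the text contains any flu-related term.

    Plain substring matching is crude (it would also match "fluent"),
    but it keeps the sketch short.
    """
    lowered = text.lower()
    return any(term in lowered for term in FLU_TERMS)


def count_by_region(records):
    """Count flu-related records per region.

    Each record is an assumed (region, text) pair.
    """
    counts = Counter()
    for region, text in records:
        if mentions_flu(text):
            counts[region] += 1
    return counts


def flag_regions(counts, baseline, ratio=2.0):
    """Return regions whose count is at least `ratio` times the baseline.

    Regions without a (nonzero) baseline are skipped rather than guessed.
    """
    flagged = []
    for region, observed in counts.items():
        expected = baseline.get(region)
        if expected and observed >= ratio * expected:
            flagged.append(region)
    return sorted(flagged)


if __name__ == "__main__":
    # Illustrative records only; not real search data.
    sample = [
        ("WA", "high fever and cough"),
        ("WA", "flu symptoms in children"),
        ("WA", "best pizza near me"),
        ("OR", "sore throat remedy"),
    ]
    counts = count_by_region(sample)
    print(flag_regions(counts, baseline={"WA": 1, "OR": 1}))
```

A fixed ratio against a baseline is the simplest possible alerting rule; operational surveillance systems generally rely on more careful statistical methods for detecting aberrations.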
Ushahidi, similar in purpose to Riff, supports gathering of distributed data from the Web and other data streams. Developers throughout Africa and beyond are using Ushahidi to extract and map a variety of types of data, including crisis response and recovery (Chile, Haiti) and medical and pharmaceutical stockouts (Kenya, Uganda, Malawi, and Zambia).10
EpiVue, developed by the Center for Public Health Informatics, University of Washington, integrates open-source technologies to provide a geospatial-visualization framework for public health data. Users can upload data sets in a variety of formats and visualize the data using Google Maps.11
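As a rough illustration of preparing public health data for map display, the sketch below converts case records with coordinates into GeoJSON, an open format that many web mapping tools can read. It is not EpiVue's implementation; the record fields (`lat`, `lon`, `disease`, `cases`) and the function names are assumptions.

```python
import json


def to_feature(record):
    """Convert one case record into a GeoJSON Feature.

    GeoJSON stores point coordinates as [longitude, latitude].
    """
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [record["lon"], record["lat"]],
        },
        "properties": {
            "disease": record["disease"],
            "cases": record["cases"],
        },
    }


def to_feature_collection(records):
    """Wrap all records in a GeoJSON FeatureCollection."""
    return {
        "type": "FeatureCollection",
        "features": [to_feature(r) for r in records],
    }


if __name__ == "__main__":
    # Illustrative record only; not real surveillance data.
    sample = [{"lat": 47.6, "lon": -122.3, "disease": "influenza", "cases": 12}]
    print(json.dumps(to_feature_collection(sample), indent=2))
```

The resulting text can be saved to a file and loaded as a data layer by a mapping tool that accepts GeoJSON.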
Zook et al12 outline the ways in which a variety of information technologies, including crowd-sourcing for online mapping, were used in the Haiti relief effort. They demonstrate the potential of crowd-sourced online mapping and the potential for new avenues of interaction among physically distant places.
Cloud computing is Internet-based computing in which shared software, information, data, and tools are provided to computers and other devices across the globe. The physical infrastructure is typically provided by commercial entities such as Google, Microsoft, IBM, and Amazon, as well as universities and other research organizations. The power of cloud computing for research is just beginning to be tested as groups of researchers distributed across the globe collaborate in a shared-knowledge-resource environment.
The potential of these new tools and technologies is yet to be fully realized by researchers. Many questions remain to be investigated. How do we evaluate, scale, and harness the “hundred technology flowers” blooming across the globe? With the availability of instantaneous communications connecting people across the world, how do we best use crowd reactions to epidemic threats for prevention and response? And, perhaps most importantly, what are the implications for education and training of epidemiologists and public health professionals?
ABOUT THE AUTHOR
SHERRILYNNE FULLER is a Professor of Biomedical and Health Informatics, School of Medicine, and Co-Director of the Center for Public Health Informatics, School of Public Health, at the University of Washington, Seattle, Washington. Her research and teaching focus on the development and appropriate application of information systems and technologies for disease prevention and health improvement in resource-constrained settings across the world. She served on the President's Information Technology Advisory Committee and was co-chair of the Subcommittee on Health from 1997 to 2002.
1. Kimball A. Risky Trade: Infectious Disease in the Era of Global Trade. Aldershot, United Kingdom: Ashgate Publishing; 2006.
2. Brown C. Emerging diseases: the global express. Vet Pathol. 2010;47:9–14.
3. Proust A. La Defense de L'Europe Contre le Cholera. Paris: Masson; 1892.
4. Chretien JP, Burkom HS, Sedyaningsih ER, et al. Syndromic surveillance: adapting innovations to developing settings. PLoS Med. 2008;5:e72.
5. Brownstein JS, Freifield CC, Madoff LC. Digital disease detection—harnessing the web for public health surveillance. N Engl J Med. 2009;360:2153–2157.
9. RIFF: an open source tool that offers an interactive decision support environment in which users collaborate around streams of information. Available at: http://instedd.org/evolve. Accessed July 31, 2010.
10. Ushahidi: an open source tool for information collection, visualization and interactive mapping. Available at: http://www.ushahidi.com/. Accessed July 31, 2010.
11. Yi Q, Hoskins RE, Hillringhouse EA, et al. Integrating open-source technologies to build low-cost information systems for improved access to public health data. Int J Health Geogr. 2008;7:29.
12. Zook M, Graham M, Shelton T, et al. Volunteered geographic information and crowdsourcing disaster relief: a case study of the Haitian earthquake. World Med Health Policy. 2010;2:2.
© 2010 Lippincott Williams & Wilkins, Inc.