When the World Health Organization announced news of a rapidly evolving Ebola outbreak in Guinea, West Africa, in March, an online tool run by experts in Boston had already flagged the disease nine days earlier. The openly available tool, dubbed HealthMap, had picked up the first public reports by local French media stations of events that have since spiraled into the most severe Ebola outbreak in history.
HealthMap scours data from news and government sites, social media, doctors’ social media networks and other resources to paint a global picture of ongoing infectious disease threats. In being the first to break the news of the outbreak, HealthMap has emerged as the standard-bearer in a new generation of disease-surveillance “mashups” that tap Web-based tools to mine, categorize, filter and visualize online intelligence about epidemics in near real time.
The exploding field of so-called digital disease detection, whose coming of age is now marked by its very own annual conference, has been gathering steam for at least 15 years. During that time, clinicians, public health practitioners and lay people have been turning to the Internet for health information in growing numbers. Web-based biosurveillance systems such as HealthMap, Argus and ProMED in the U.S., BioCaster in Japan, GPHIN in Canada and MedISys in Europe have been credited with speeding up recognition of outbreaks, preventing governments from suppressing outbreak data and facilitating public health responses.
“Digital disease detection resources are already being used for public health intelligence gathering and situational awareness,” says HealthMap co-founder John Brownstein, Ph.D., an associate professor at Harvard Medical School. “HealthMap is always involved in public health events in terms of our surveillance of the massive amounts of online information—anything we can get our hands on to understand events unfolding in West Africa or globally. We put special attention on Ebola, with a special page and more personnel resources to mapping the event.”
The Mother of Innovation
With Ebola reportedly having claimed more than 4,500 lives at the time of writing, including one fatality in the U.S., efforts to contain the outbreak are stretching online biosurveillance systems to their limits. They’re also pushing technological innovation.
In the spring, the U.S. Centers for Disease Control and Prevention (CDC) began using a new software tool designed to find people exposed to the Ebola virus faster and so break the chain of disease transmission. The CDC began developing the open source application in 2012 after one of its disease detectives expressed frustration at the inability to efficiently collect, analyze and act on relevant data in the field even as she saw people dying of Ebola and Marburg virus in Uganda and Congo.
The so-called “Epi Info VHF” tool (that’s “epi” as in epidemiologic, and “VHF” as in viral hemorrhagic fever) features virus transmission diagrams that help field workers visualize how an outbreak spreads between people, as well as automated tools that speed data analysis and “contact tracing,” the process of finding people who have been exposed. It also has a tiny IT footprint and works easily in places with limited connectivity, among other benefits.
Insights into the Ebola outbreak have also come from computer models that draw on multiple forms of data. In July, a physicist from Northeastern University accurately predicted the disease’s spread into Senegal using flight records and population data. Researchers with Flowminder, a nonprofit based in Stockholm, Sweden, have turned anonymized cell phone location data into a map of West African transportation trends in an effort to control the outbreak.
“Cell phone data can be extraordinarily useful for public health purposes,” Brownstein says. “There are huge opportunities for understanding population movements, how people are migrating and adapting to social disruption.”
Big Data Challenges
The technological challenges to biosurveillance systems are substantial. Brownstein says the greatest of these is the relative lack of data out of West Africa, which makes it difficult to glean the near-real-time insights that are key both to understanding current events and to improving forecasting models. Then there is the sheer volume of information and its low signal-to-noise ratio, especially on social media.
“We spend a lot of time trying to make inferences of population health from the data, either through what people are saying online or querying on Google,” he says.
If there is a Holy Grail in the biosurveillance arena, Brownstein says it is to have better access to personal health records in an aggregate form.
“If clinical data is collected in a systematic and shareable way, that would definitely improve our surveillance efforts,” he says. “And if you marry that with the emergence of technologies that are quantifying our health behaviors, our activity and our various health metrics, then we can have a very powerful view on population health.”