It’s emerging as one of the more promising — and potentially controversial — ideas to slow the spread of the coronavirus: collecting smartphone data to track where people have gone and who they’ve crossed paths with.

The White House has discussed the notion, and several companies are reportedly in talks with the Trump administration to share aggregated user data. Researchers in the U.K. are working on one such app, and a team led by researchers at the Massachusetts Institute of Technology is building another, with an eye toward protecting user privacy. China and South Korea developed smartphone surveillance systems to try to clamp down on their own outbreaks, though their approaches likely wouldn’t be palatable in countries with greater expectations of privacy.

Then there’s Facebook, which collects data from its users around the world who opt in to sharing their location when using its smartphone app. Facebook does not share this information with governments. But in recent weeks, the social media giant has been sharing these data — in aggregated and anonymized form — with academic and nonprofit researchers analyzing the spread of the coronavirus.

Among the universities where Covid-19 researchers are harnessing Facebook’s data: the Harvard T.H. Chan School of Public Health, National Tsing Hua University in Taiwan, University of Pavia in Italy, and the London School of Hygiene and Tropical Medicine.

The idea is to study where people move and how often they encounter each other, in the hope of better understanding the virus’ spread — and which places are likely to soon see a spike in cases.

“We aggregate up all of the signals into a picture of flows of people — and then the likelihood that groups of people from a neighborhood or a town are going to come into contact with groups of people from a nearby neighborhood or town,” said Laura McGorman, policy lead for Facebook’s Data for Good team, which is sharing the data with Covid-19 researchers as part of its year-old Disease Prevention Maps program.

For example, a researcher might use Facebook’s tool to estimate the probability that residents of Prince George’s County, Md., and Washington, D.C. — a popular commuting route — will encounter each other. Researchers can also rank the communities with which, say, Prince George’s County residents are most likely to come into contact.
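To make the idea concrete, here is a minimal sketch of how such an estimate and ranking could work. It is not Facebook's method, which the company has not detailed here. It assumes a researcher has, for each pair of areas, an aggregated count of user pairs observed in the same place at the same time, plus the number of location-sharing users in each area. All names and numbers (`users`, `colocated_pairs`, `contact_probability`, `rank_contacts`) are hypothetical.

```python
# Hypothetical aggregated inputs for illustration only; not real Facebook data.
# Number of location-sharing users per area (assumed).
users = {
    "Prince George's County": 1000,
    "Washington, D.C.": 800,
    "Montgomery County": 1200,
}

# Count of cross-area user pairs seen in the same place at the same time (assumed).
colocated_pairs = {
    ("Prince George's County", "Washington, D.C."): 4000,
    ("Prince George's County", "Montgomery County"): 2400,
    ("Montgomery County", "Washington, D.C."): 4800,
}


def contact_probability(area_a: str, area_b: str) -> float:
    """Share of all possible cross-area user pairs that were co-located."""
    # The pair may be stored in either order, so check both.
    pairs = colocated_pairs.get(
        (area_a, area_b), colocated_pairs.get((area_b, area_a), 0)
    )
    possible_pairs = users[area_a] * users[area_b]
    return pairs / possible_pairs


def rank_contacts(area: str) -> list[tuple[str, float]]:
    """Other areas sorted from most to least likely contact with `area`."""
    scored = [
        (other, contact_probability(area, other))
        for other in users
        if other != area
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for other, prob in rank_contacts("Prince George's County"):
        print(f"{other}: {prob:.4f}")
```

With these made-up figures, Washington, D.C., ranks ahead of Montgomery County for Prince George's County residents, because a larger share of the possible resident pairs were seen together.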

Facebook’s data offers researchers a significant advantage compared to many traditional transportation datasets, the most readily accessible of which measure connectivity between states and countries. Datasets like Facebook’s that reflect movement between — or even within — counties are rarer.

In addition to user location data, the Disease Prevention Maps tool pulls in data from non-Facebook sources, including modeled census data and satellite imagery. Facebook is sharing the data with academics under a license and at no charge.

Facebook has not disclosed how many of its 2.5 billion users share their location with the company on their smartphones. But how well the data represents the population varies widely by country.

“Depending on where you are in the world, it can either be a very representative sample — probably rather representative for a place like California — but not so representative if you’re trying to look at the spread of something like Ebola in the DRC, where we probably have very few people with smartphones using Facebook with location history enabled,” McGorman said.

Facebook’s data is beginning to show up in projects and papers about the coronavirus.

In a working paper released on March 10 that has not yet been peer-reviewed, researchers in Seattle tried to project scenarios about how many cases Washington state’s King and Snohomish counties will have by April 7. They cited Facebook’s data, noting that the dataset showed a stable 50% reduction in incoming traffic to Seattle and a nearby suburban area over a span of several days as residents had increasingly stayed home.
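As a rough illustration of what a "stable 50% reduction" means in practice, the sketch below compares daily inbound trip counts against a pre-outbreak baseline. The figures are invented for the example and are not taken from the working paper or from Facebook's dataset; the names `baseline_inflow`, `observed_inflow`, and `percent_reduction` are likewise hypothetical.

```python
# Invented numbers for illustration; not from the working paper or Facebook.
baseline_inflow = 10_000  # assumed typical daily inbound trips before distancing
observed_inflow = [5_200, 4_900, 5_100, 5_000, 4_800]  # assumed daily counts


def percent_reduction(baseline: float, observed: float) -> float:
    """Drop in traffic relative to the baseline, as a percentage."""
    return 100 * (baseline - observed) / baseline


# Reduction for each day in the window.
reductions = [percent_reduction(baseline_inflow, day) for day in observed_inflow]

# Average reduction, and how far the daily values stray from one another.
mean_reduction = sum(reductions) / len(reductions)
spread = max(reductions) - min(reductions)

if __name__ == "__main__":
    print(f"daily reductions: {reductions}")
    print(f"mean: {mean_reduction:.1f}%, spread: {spread:.1f} points")
```

A small spread across days is what would justify calling the reduction stable rather than a one-day dip.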

Facebook’s data is also proving helpful to Direct Relief, a Southern California-based nonprofit focused on mobilizing medical resources to help people in crisis situations. The group has been working on Covid-19 response since the end of January, first in China and now increasingly at U.S. health centers.

Andrew Schroeder, Direct Relief’s vice president of research and analysis, said his nonprofit is using Facebook’s data to better understand population movement and, in turn, to inform decisions about resource allocation in free clinics, community health centers, and intensive care units.

One of Direct Relief’s goals, Schroeder said, is to “make sure that we can get resources in place for where the risks are likely to be most significant — and we’re pretty convinced that’s going to be a pretty localized and uneven phenomenon.”

“So it really, really helps to have these types of disaggregated spatial movement pictures,” he added.

Shenyue Jia was already familiar with Facebook’s data. The researcher at Chapman University in Southern California had previously used it to study the risk of wildfires in California; that was under a related Facebook program known as Disaster Maps started in 2017. Now, Jia is using Facebook’s data to analyze movement between communities affected by Covid-19 outbreaks — with the goal of providing useful insights for public health workers trying to determine which interventions can have the biggest impact.

Jia recently built an interactive map of Hong Kong visualizing the commercial center known as Causeway Bay, as well as the strength of its connections with other neighborhoods. That work makes clear that “distance matters — and also the strength of the link matters,” she said.

These days, Jia is focusing her efforts on using the data to study the escalating situation in the U.S.

In sharing such aggregated data with academics and nonprofits, Facebook has not been hit with widespread privacy concerns. That’s a welcome change for the company: Over the past few years, Facebook has struggled with a series of privacy controversies, most notably the Cambridge Analytica scandal, in which a political firm harvested raw data from millions of Facebook profiles.

If more tech companies start sharing their user data with authorities to aid the fight against Covid-19, they may face pushback from critics concerned that the government would be tracking the movements of individuals. The size of that backlash may hinge on whether individuals can be identified in the datasets, where that information gets stored, and whether users have a say about whether their data gets passed over to officials.

In the coming days, other companies might also start sharing their troves of location data in some form. In a statement last week, a Google spokesperson said the company is “exploring ways that aggregated anonymized location information could help in the fight against Covid-19.” For example, the spokesperson said, such data could be used in “helping health authorities determine the impact of social distancing, similar to the way we show popular restaurant times and traffic patterns in Google Maps.”

The Google spokesperson added that the work “would follow our stringent privacy protocols and would not involve sharing data about any individual’s location, movement, or contacts.”
