The Scarily Common Screw-Up That Exposed 198 Million Voter Records

Database security continues to be a major pain point. Just ask the nearly 200 million people whose personal info got left exposed on the internet.

A number of voter-data exposures have cropped up this year, in locations as disparate as Mexico, the Philippines, and the state of Georgia. But the one that dwarfs them all came to light on Monday: a publicly accessible database containing personal information for 198 million US voters—possibly every American voter going back more than 10 years.

A conservative data firm called Deep Root Analytics owns the database, and stores it on an Amazon S3 server. As Chris Vickery, cyber-risk analyst with security firm UpGuard, discovered earlier this month, all of that data was open to anyone who found it, not because of clever hacking or complicated internet forces, but because of a simple misconfiguration. Think of it as leaving your valuables in a high-end safe with the door propped open.

It happens all the time, despite repeated, and repeatedly damaging, exposures of personal information. Even though it's not a hack, server misconfiguration constitutes one of the biggest cybersecurity risks for institutions and individuals alike.

Aggregation Aggravation

The Deep Root Analytics server Vickery found contained information that was mostly publicly accessible anyway: think names, addresses, party affiliation, and so on. But a criminal coming across such a big trove of data would find plenty of value in having all of that information already aggregated in one place, particularly when the source is an analytics firm that specializes in compiling meaningful data.

"It’s definitely the biggest find I’ve ever had," says Vickery, who also discovered the exposed Mexican voter database and many others. "We’re starting to head in the right direction with securing this stuff, but it’s going to get worse before it gets better. This is not rock bottom."

Part of Vickery's research involves scanning the web for publicly accessible data that should be secured. He discovered the exposed Deep Root Analytics cloud repository during one such sweep, realizing that, as UpGuard puts it, the database "lacked any protection against access," and could be viewed by anyone with an internet connection who guessed the Deep Root Analytics Amazon subdomain “dra-dw” (which stood for Deep Root Analytics Data Warehouse). At six characters, the string wouldn't even be that difficult to encounter through a random generator. The server had a good amount of protected data on it, but because it was misconfigured, it exposed more than a terabyte of private information.
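For a rough sense of scale, the size of that guessing space can be worked out directly. The sketch below is illustrative only: it assumes candidate names are built from lowercase letters, digits, and hyphens, and must start and end with a letter or digit, which is a simplified reading of Amazon's bucket-naming rules. The function names are hypothetical and are not drawn from UpGuard's research.

```python
import string

# Assumption: names use lowercase letters, digits, and hyphens only,
# and must begin and end with a letter or digit (dots are ignored here).
EDGE_CHARS = string.ascii_lowercase + string.digits  # 36 symbols
INNER_CHARS = EDGE_CHARS + "-"                       # 37 symbols


def is_candidate(name: str) -> bool:
    """Return True if the name fits the simplified naming rules above."""
    if len(name) < 2:
        return False
    return (
        name[0] in EDGE_CHARS
        and name[-1] in EDGE_CHARS
        and all(ch in INNER_CHARS for ch in name[1:-1])
    )


def count_candidates(length: int) -> int:
    """Count the distinct names of exactly this length under the rules above."""
    if length < 2:
        raise ValueError("length must be at least 2")
    return len(EDGE_CHARS) ** 2 * len(INNER_CHARS) ** (length - 2)


if __name__ == "__main__":
    print(is_candidate("dra-dw"))        # the subdomain named in the article
    print(f"{count_candidates(6):,}")    # six-character names in total
```

Under these assumptions there are about 2.4 billion six-character names. That is a finite list an automated sweep could in principle work through, which is why an obscure name is no substitute for access controls.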

The incident joins other database misconfigurations, including those at Microsoft sites, dating services, and the Hollywood screener system, as examples of the threat that publicly accessible servers pose.

Making It Right

Analysts point to a few remedies that could help reduce the number of misconfigured servers exposing data online. First, simply raising awareness can go a long way. Dramatic incidents that impact millions of people can motivate organizations to devote resources to setting servers up and maintaining them properly. Making default settings for databases in the cloud more geared toward security would also help groups tighten up their controls. And some security companies have begun developing products that can scan system setups as another layer of defense, warning IT staff if it looks like something is exposed or configured in a dangerous way. (One reason UpGuard does exposure research is that it sells such a product.)
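To make the idea of a configuration scan concrete, here is a minimal sketch of one such check. It does not describe UpGuard's product or the specific setting that was wrong at Deep Root Analytics, neither of which the reporting details. It assumes the storage bucket's access-control list has already been fetched as a dictionary shaped like the one Amazon's API returns, and the function name and sample data are hypothetical.

```python
from typing import Dict, List, Tuple

# Amazon's predefined groups that extend access beyond named accounts.
PUBLIC_GROUPS: Dict[str, str] = {
    "http://acs.amazonaws.com/groups/global/AllUsers": "anyone on the internet",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers": "any AWS account holder",
}


def find_public_grants(acl: dict) -> List[Tuple[str, str]]:
    """Return (audience, permission) pairs for grants made to a public group."""
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        audience = PUBLIC_GROUPS.get(grantee.get("URI", ""))
        if audience is not None:
            findings.append((audience, grant.get("Permission", "UNKNOWN")))
    return findings


# Hypothetical ACL for illustration: one private grant, one public grant.
sample_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "READ"},
    ]
}

if __name__ == "__main__":
    for audience, permission in find_public_grants(sample_acl):
        print(f"WARNING: {permission} access granted to {audience}")
```

A real scanner would also need to inspect bucket policies and account-level settings, since an access-control list is only one of several places where public access can be switched on.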

"People say moving to the cloud is so smart and everything, but it adds another layer of risk," Vickery says. "And every time you add in another element that has to be checked, a certain percentage of the population is not going to check it. So it’s just statistically adding exposure."

That doesn't mean companies can't adequately secure their data in the cloud, but experts say attempts to warn about the danger of systems exposed on the public internet get met with surprisingly little interest. And even when databases come with secure defaults, some people still actively (if unintentionally) work to undercut them.

"Media coverage hasn't impacted the numbers, ransomware is rampant, firewalls are being circumvented and secure defaults are being made insecure," says John Matherly, the creator of Shodan, a search engine that indexes internet-connected devices and can be used to find misconfigured databases. "I'm not entirely sure what is going on here, but it's somewhat bewildering. And unfortunately, I don't have any easy solution to this problem."

"It’s not any particular party, it’s not any particular company, it’s just an epidemic that’s everywhere right now," says UpGuard CEO Michael Baukes. "You see these human errors playing out daily. If you don’t understand what you have, you can’t control the processes and you can’t protect the risk."

For its part, Deep Root Analytics hopes to make amends as swiftly as possible. "We accept full responsibility, will continue with our investigation, and based on the information we have gathered thus far, we do not believe that our systems have been hacked. To date, the only entity that we are aware of that had access to the data was Chris Vickery," the company says in a statement. Unfortunately, one random internet stranger with outside access is still one too many.