What happened?
On Monday, June 12, 2017, a security researcher discovered that sensitive personal data on nearly all of the 200 million registered voters in the USA had been left exposed on the Internet.
How did this happen?
The owner stored the data on a publicly accessible cloud server — Amazon Simple Storage Service (S3) object storage — without any protection against public access. Consequently, anyone with an Internet connection could have accessed the data, simply by navigating to a six-character subdomain of Amazon.
S3 is notorious for being left unlocked to the public, even by some of the largest organizations. An unlocked S3 bucket can lead to a massive data breach when it contains an extensive collection of sensitive information.
In this case, the owner improperly configured the access permission of the S3 bucket. Although the misconfiguration was small, its implications are potentially disastrous. The owner of the S3 bucket acknowledged the issue, secured the S3 bucket against public access some two days after the discovery, and implemented new protocols to prevent any further access.
What type of data was exposed?
The amount of data stored in the improperly configured S3 bucket was equivalent in size to about 10 billion pages of text. For the registered voters, the exposed data included the name, the date of birth, the address, the phone number, the registered party, the self-reported racial demographic, the voter registration status, and the modeled ethnicity and religion.
Exposure of data of this kind has damaging real-world consequences and may have a chilling effect on voter registration and jury pool participation. The effects range from nuisance commercial telephone calls to threats to the personal safety of individuals involved in contentious interpersonal, legal, or custody disputes.
Troublingly, the data also contained the modeled personal and political preferences of the potential voters. The model results describe the modeled likelihood of each potential voter to support policies, political candidates, and political beliefs. From various accounts, it seems that the modeling results are remarkably accurate.
It is worrisome that such a massive national database could be hosted online without even the simplest of protections against exposing the data to the public. The exposure raises significant questions about the privacy and security people can expect for their most privileged information.
How common is the problem?
Unfortunately, huge data exposure events are not uncommon, and the extent of such exposures is often mind-boggling. For example, the Security Watch blog of MacKeeper published 18 reports of major data leaks in 2017 alone. Some examples are:
In May 2017, the same security researcher discovered that a multi-billion-dollar USA defense contractor exposed more than 60,000 sensitive files tied to a US military project on S3. The data contained unencrypted passwords of government contractors with top secret facility clearance. The exposed credentials could potentially grant access to repositories with similarly sensitive government data.
In April 2017, the same security researcher discovered that someone exposed 93.4 million voter registration records from Mexico on an S3 bucket in the USA. The concern is that if the data ends up in the wrong hands, the potential harm includes blackmail, abduction, and cartel violence.
In May 2016, hackers exposed to the public the personal data of 50 million Turkish citizens. The data included names, addresses, first names of parents, cities of birth, dates of birth, and national identifier numbers.
In March 2016, hackers exposed to the public the voter registration information of 55 million electors in the Philippines. The data included addresses, places of birth, heights, weights, genders, marital status, names of parents, fingerprints, email addresses, and numbers and expiration dates of passports.
In June 2015, hackers exposed to the public the background investigation information of 21.5 million people, most of whom applied for classified clearances from the USA government. The data included social security numbers, fingerprints, and highly sensitive investigative findings (such as drug and criminal histories, sexual orientations, and relationships). The concern is that foreign governments could obtain lists naming their own citizens who had been secretly in touch with government workers from the USA, and could then blackmail or punish those citizens.
What can you do about it?
The underlying causes of such data exposures are common. You can help prevent the misuse of sensitive information by following a few simple precepts when collecting and storing data.
First, understand the existing state: review who can currently access the data, and pinpoint any misconfigurations that let the public reach it.
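As a minimal sketch of such a review, the Python below scans an access control list for grants to the two S3 groups that represent the public. It assumes the list is a dictionary in the shape that boto3's `get_bucket_acl` call returns (a `Grants` list whose entries carry a `Grantee` and a `Permission`). The function name `find_public_grants` and the sample data are hypothetical.

```python
# Group URIs that S3 uses for "everyone" and "any authenticated AWS account".
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def find_public_grants(acl):
    """Return (group_uri, permission) pairs for grants that expose data publicly."""
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUP_URIS:
            findings.append((grantee["URI"], grant.get("Permission")))
    return findings


# Hypothetical ACL: the owner has full control, but everyone can read.
sample_acl = {
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {
                "Type": "Group",
                "URI": "http://acs.amazonaws.com/groups/global/AllUsers",
            },
            "Permission": "READ",
        },
    ]
}

for uri, permission in find_public_grants(sample_acl):
    print(f"Public grant: {permission} to {uri}")
```

An empty result only means that the access control list grants nothing to the public groups. A bucket policy can still open the data up, so a full review would inspect the policies as well.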
Next, fix the issues you find and put proper policies in place. Maintain positive, logical permissions in identity and access management policies, and attach those policies to identity and access management entities. At the same time, block unauthorized access to objects at the resource level.
More specifically:
- Remove all ‘allow’ permissions that are based on an access control list.
- Create identity and access management policies that allow access, and assign them to users and roles.
- Explicitly deny unauthorized access to objects at the resource level, so that the objects stay protected whatever else is misconfigured.
- Monitor significant events related to changes in identity and access management policies, resource policies, new assets, and object access.
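One way to express the explicit resource-level deny is a bucket policy that refuses object access to every principal except a short list. The sketch below builds such a policy document in Python. It relies on the `ArnNotEquals` condition operator and the `aws:PrincipalArn` condition key from the AWS policy language; the bucket name, the role ARN, and the function name `build_deny_policy` are hypothetical.

```python
import json


def build_deny_policy(bucket_name, allowed_principal_arns):
    """Build a bucket policy that denies object access to any principal not listed."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyObjectAccessExceptAllowedPrincipals",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # The "/*" suffix scopes the statement to objects in the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # The deny applies only when the caller matches none of these ARNs.
                "Condition": {
                    "ArnNotEquals": {
                        "aws:PrincipalArn": list(allowed_principal_arns)
                    }
                },
            }
        ],
    }


# Hypothetical bucket and role, for illustration only.
policy = build_deny_policy(
    "example-bucket",
    ["arn:aws:iam::123456789012:role/ExampleDataReader"],
)
print(json.dumps(policy, indent=2))
```

An explicit deny overrides any allow, so a statement like this keeps objects closed even if an access control list or another policy is later loosened by mistake. For the same reason it can lock out legitimate users, so try it on a test bucket first.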
Also check critical indicators of potential security risks such as:
- At the server level, public or guest access is not disabled.
- At the server level, access is allowed from any IP address rather than from specific IP addresses.
- At the server level or a granular level, the data is not encrypted.
- The data does not have granular security.
- Highly sensitive data is not masked at a granular level.
- Server-level and granular firewall rules have not been configured.
- Secret keys are encrypted with weak algorithms.
- The password policy is weak.
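Several of the indicators above can be turned into an automated check. The Python sketch below runs a few of them against a plain dictionary of settings. Every key name, the 12-character password threshold, and the sample values are assumptions made for illustration, not the configuration format of any particular product. A missing setting is treated as a risk.

```python
def audit_settings(settings):
    """Return a list of risk findings for a hypothetical settings dictionary."""
    findings = []
    if settings.get("public_access_enabled", True):
        findings.append("Public or guest access is not disabled.")
    if "0.0.0.0/0" in settings.get("allowed_ip_ranges", ["0.0.0.0/0"]):
        findings.append("Access is allowed from any IP address.")
    if not settings.get("encrypted_at_rest", False):
        findings.append("The data is not encrypted.")
    if not settings.get("sensitive_data_masked", False):
        findings.append("Highly sensitive data is not masked.")
    if not settings.get("firewall_rules"):
        findings.append("Firewall rules have not been configured.")
    # Assumed threshold: treat passwords shorter than 12 characters as weak.
    if settings.get("min_password_length", 0) < 12:
        findings.append("The password policy is weak.")
    return findings


# Hypothetical server settings with several of the risks present.
sample_settings = {
    "public_access_enabled": True,
    "allowed_ip_ranges": ["0.0.0.0/0"],
    "encrypted_at_rest": False,
    "sensitive_data_masked": True,
    "firewall_rules": ["allow 203.0.113.0/24"],
    "min_password_length": 8,
}

for finding in audit_settings(sample_settings):
    print(finding)
```

Running a check like this on a schedule, rather than once, helps catch a setting that drifts back to an insecure value.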
What are your next steps?
You may want to read IDERA’s whitepaper “Nuts and Bolts of Permissions and Security” and view IDERA’s webcast “Building a Successful Database Security Model”.
IDERA SQL Doctor analyzes cloud and on-premises deployments of Microsoft SQL Server (including Microsoft Azure SQL Database and Amazon RDS SQL Server) to generate expert recommendations with executable scripts to improve security and performance.
IDERA SQL Secure identifies vulnerabilities for cloud and on-premises deployments of Microsoft SQL Server (including Microsoft Azure SQL Database and Amazon RDS SQL Server) to improve security and to comply with regulatory requirements.
IDERA DBArtisan protects data security by managing permissions and privileges for cloud and on-premises deployments of Microsoft SQL Server, Azure SQL Database, Oracle, PostgreSQL, and more.
IDERA ER/Studio Data Architect categorizes objects according to the level of security, enables permissions at logical and physical model levels, defines data security types and properties for compliance, and streamlines security administration for many database platforms such as Microsoft SQL Server in the cloud including Microsoft Azure SQL Database.
IDERA SQL Safe Backup uses advanced encryption with minimal performance degradation to back up SQL Server databases and supports Amazon Simple Storage Service (S3) and Microsoft Azure Blob Storage for backup, restore, instant restore, and object-level recovery.