Have I Been Pwned, the breach notification service that serves as a bellwether for the security of login credentials, has just gotten its hands on its biggest data haul ever—a list that includes almost 773 million unique email addresses and 21 million unique passwords that were used to log in to third-party sites.
According to Have I Been Pwned founder Troy Hunt in a post published Wednesday, the monster list is a compilation of many smaller lists taken from past breaches and has been in wide circulation over the past week. It was also posted to the MEGA file sharing site. At least one of the included breaches dated back to 2015. Dubbed "Collection #1," the aggregated data was likely scraped together to serve as a master list that hackers could use in credential stuffing attacks. These attacks use automated scripts to inject credentials from one breached website into a different website in hopes the holders reused the same passwords.
The 773 million email addresses and 21 million passwords easily beat Have I Been Pwned’s previous record breach notification that contained 711 million records. But there are other things that make this latest installment stand out. In all, it contains 1.16 billion email-password combinations. That means that the list covers the same people multiple times, but in many cases with different passwords. Also significant: the list—contained in 12,000 separate files that take up more than 87 gigabytes of disk space—has 2.69 billion rows, many of which contain duplicate entries that Hunt had to clean up.
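The gap between 2.69 billion raw rows and 1.16 billion unique combinations comes down to de-duplication. The following is a minimal sketch of that kind of counting, not Hunt's actual process. It assumes a hypothetical `email:password` line format and treats email addresses as case-insensitive; both are assumptions made for illustration.

```python
from typing import Dict, Iterable


def summarize_rows(rows: Iterable[str], sep: str = ":") -> Dict[str, int]:
    """Count raw rows and the unique emails, passwords, and pairs among them.

    Assumes each row looks like "email<sep>password" (a hypothetical format).
    Rows without a separator or without an email are counted but skipped.
    """
    emails, passwords, combos = set(), set(), set()
    total = 0
    for line in rows:
        total += 1
        # Split on the first separator only, since passwords may contain it.
        email, found, password = line.strip().partition(sep)
        if not found or not email:
            continue  # malformed row
        email = email.lower()  # assumption: addresses compare case-insensitively
        emails.add(email)
        passwords.add(password)
        combos.add((email, password))
    return {
        "rows": total,
        "unique_emails": len(emails),
        "unique_passwords": len(passwords),
        "unique_combos": len(combos),
    }


sample = [
    "a@example.com:pw1",
    "A@example.com:pw1",  # duplicate of the first row after lowercasing
    "a@example.com:pw2",  # same person, different password
    "b@example.com:pw1",
    "not a valid row",
]
summary = summarize_rows(sample)
```

On a real 87-gigabyte corpus the sets would not fit comfortably in memory, so a production pipeline would more likely sort on disk or load the rows into a database.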
About 633 million of the addresses have been listed in previous Have I Been Pwned notifications, meaning roughly 140 million of the addresses have never been seen by the service before. Hunt said that some of his own credentials were included in Wednesday’s notification, although none were currently in use. Have I Been Pwned has now begun the non-trivial task of emailing more than 768,000 individuals who signed up for notifications and nearly 40,000 people who monitor domains. Anyone who hasn’t signed up can still check the status of an email address on the Have I Been Pwned website.
“People will receive notifications or browse to the site and find themselves there and it will be one more little reminder about how our personal data is misused,” Hunt wrote. “If—like me—you're in that list, people who are intent on breaking into your online accounts are circulating it between themselves and looking to take advantage of any shortcuts you may be taking with your online security.”
Hunt said that one of the questions he gets asked the most is if he will divulge the password that accompanied the email address in a breach. He has steadfastly refused for a variety of good reasons. First, pairing user names and passwords would undoubtedly make his service a major target of hackers. It would also require him to store passwords in clear text, which is something no site should ever do. Have I Been Pwned does allow people to use this page to check if a specific text string has ever shown up in a breach notification, but for obvious reasons, it decouples the password from the email addresses that used it.
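One way to picture a password check that is decoupled from email addresses is an index keyed only by a one-way fingerprint of each breached password. The sketch below is an illustration under stated assumptions, not a description of how Have I Been Pwned is built: the class name, the method names, and the choice of SHA-1 are all hypothetical.

```python
import hashlib
from typing import Dict


def fingerprint(password: str) -> str:
    """Return an uppercase hex SHA-1 digest of the password (illustrative choice)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


class BreachedPasswordIndex:
    """Maps password fingerprints to how often they appeared in breach data.

    No email address and no clear-text password is ever stored here.
    """

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def add(self, password: str) -> None:
        key = fingerprint(password)
        self._counts[key] = self._counts.get(key, 0) + 1

    def times_seen(self, candidate: str) -> int:
        # Hash the candidate the same way and look it up; 0 means never seen.
        return self._counts.get(fingerprint(candidate), 0)


index = BreachedPasswordIndex()
for leaked in ["hunter2", "hunter2", "letmein"]:
    index.add(leaked)

seen = index.times_seen("hunter2")
unseen = index.times_seen("a long unique passphrase")
```

A fast unsalted hash is acceptable here only because the inputs are already public breach data. A site storing its own users' login passwords should use a slow, salted password-hashing function instead.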
There's no doubt Collection #1 is huge, but it can’t be precisely compared to other massive breaches. It's tempting to compare it to hacks of Yahoo in 2013 and again in 2014 that compromised 3 billion and 500 million accounts, respectively, a hack in 2016 that revealed account details for 412 million accounts on sex and swinger community site AdultFriendFinder, and the breach of Equifax that allowed hackers to steal data belonging to 147.9 million consumers. But that's in many respects an apples-to-oranges comparison, because Collection #1 was seeded by many smaller breaches, many of which were likely already disclosed.
That's not to say Collection #1 isn't significant. Despite its recycling of previously breached credentials, the widely available megalist no doubt makes it easier than ever for even unskilled miscreants to capitalize on the bevy of breaches that have occurred over the past decade.
The most effective thing people can do to secure their online accounts is to ensure that each one is protected by a long, randomly generated password that’s unique to each account. For most people, this means using a reputable password manager, although many security experts (including Hunt) say an old-fashioned notebook will work. The second most important thing people can do is to use multi-factor authentication on every site that allows it. Hunt has more advice about passwords here.
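For readers curious what "long, randomly generated" looks like in practice, here is a minimal sketch using Python's standard `secrets` module, which draws from the operating system's cryptographically secure random source. The 24-character default, the 16-character floor, and the character set are assumptions chosen for illustration, not recommendations from Hunt. A reputable password manager does the same job and also remembers the result.

```python
import secrets
import string

# Assumption: letters, digits, and ASCII punctuation. Some sites reject
# certain symbols, so the alphabet may need trimming in practice.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 24) -> str:
    """Return a random password of the given length drawn from ALPHABET."""
    if length < 16:
        # Illustrative floor; the point is to discourage short passwords.
        raise ValueError("length must be at least 16 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# One fresh password per account, never reused across sites.
email_password = generate_password()
bank_password = generate_password(32)
```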