WhatsApp's mass adoption stems in part from how easy it is to find a new contact on the messaging platform: Add someone's phone number, and WhatsApp instantly shows whether they're on the service, and often their profile picture and name, too.
Repeat that same trick a few billion times with every possible phone number, it turns out, and the same feature can also serve as a convenient way to obtain the cell number of virtually every WhatsApp user on earth—along with, in many cases, profile photos and text that identifies each of those users. The result is a sprawling exposure of personal information for a significant fraction of the world population.
One group of Austrian researchers has now shown that, by checking every possible number through WhatsApp's contact discovery, they were able to extract 3.5 billion users’ phone numbers from the messaging service. For about 57 percent of those users, they also found that they could access profile photos, and for 29 percent, the text on their profiles. Despite a previous warning about WhatsApp's exposure of this data from a different researcher in 2017, they say, the service's parent company, Meta, still failed to limit the speed or number of contact discovery requests the researchers could make by interacting with WhatsApp's browser-based app, allowing them to check roughly a hundred million numbers an hour.
The result would be “the largest data leak in history, had it not been collated as part of a responsibly conducted research study,” as the researchers describe it in a paper documenting their findings.
“To the best of our knowledge, this marks the most extensive exposure of phone numbers and related user data ever documented,” says Aljosha Judmayer, one of the researchers at the University of Vienna who worked on the study.
The researchers say they warned Meta about their findings in April and deleted their copy of the 3.5 billion phone numbers. By October, the company had fixed the enumeration problem by enacting a stricter “rate-limiting” measure that prevents the mass-scale contact discovery method the researchers used. But until then, the data exposure could also have been exploited by anyone else using the same scraping technique, adds Max Günther, another researcher from the university who cowrote the paper. “If this could be retrieved by us super easily, others could have also done the same,” he says.
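For readers unfamiliar with the term, a rate-limiting measure caps how many requests a single client can make in a given window. The Python sketch below shows one common approach, a token bucket. It is an illustration only, not a description of Meta's fix, whose details are not given here; the class name, capacity, and refill rate are all hypothetical.

```python
import time


class TokenBucket:
    """Hypothetical per-client limiter: permits short bursts, caps the sustained rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity                    # maximum burst size (assumed value)
        self.refill_per_second = refill_per_second  # sustained requests per second (assumed)
        self.clock = clock                          # injectable so the logic can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never beyond the bucket's capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A lookup service would keep one such bucket per account or per IP address and refuse contact discovery requests whenever `allow()` returns False, which makes checking billions of numbers from one client impractically slow.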
In a statement to WIRED, Meta thanked the researchers, who reported their discovery through Meta's “bug bounty” system, and described the exposed data as “basic publicly available information,” since profile photos and text weren't exposed for users who opted to make it private. “We had already been working on industry-leading anti-scraping systems, and this study was instrumental in stress-testing and confirming the immediate efficacy of these new defenses,” writes Nitin Gupta, vice president of engineering at WhatsApp. Gupta adds, “We have found no evidence of malicious actors abusing this vector. As a reminder, user messages remained private and secure thanks to WhatsApp’s default end-to-end encryption, and no non-public data was accessible to the researchers.”
Despite Meta's description, the researchers say they didn't circumvent or even encounter any “defenses” in collecting the phone numbers. Nor is their work the first time that WhatsApp has been warned about its exposure of phone numbers and associated profile data. Fully eight years ago, in 2017, Dutch researcher Loran Kloeze wrote a blog post pointing out that the phone number enumeration technique was possible and that it could be used to obtain phone numbers, profile photos, and also the times when a user was online.
Kloeze described a scenario in which the data exposure could be combined with face recognition to create a giant database of personally identifiable information. “Now that is quite scary, isn’t it?” he wrote. Meta, then Facebook, responded to his findings, arguing that WhatsApp's privacy settings were still working as designed—users can choose to make their profile information accessible only to their chosen contacts—and even told him he wasn’t eligible for a bug bounty reward for his work at the time.
When WIRED asked Meta what rate-limiting measures it instituted over the last eight years to prevent the technique Kloeze demonstrated, the company responded that it has, in fact, implemented evolving defenses against scrapers, including rate-limiting and machine-learning techniques to ban scrapers. Yet the University of Vienna researchers were able to not only replicate Kloeze's work, but take it further, actually enumerating all 3.5 billion registered WhatsApp phone numbers—far more than the service had in 2017. They also addressed WhatsApp's argument about privacy settings by measuring how many users publicly exposed personal information in their profiles, breaking down the results by country. They found that 44 percent of the 137 million phone numbers they collected for Americans displayed photos, and 33 percent showed public “about” text, for instance.
For countries where WhatsApp is even more widely used, a smaller fraction of the population turned on its privacy settings: In India, where the researchers counted nearly 750 million numbers, 62 percent of accounts publicly displayed a profile photo. For the 206 million Brazilian numbers they found, 61 percent had profile photos exposed.
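Those percentages translate into very large absolute numbers. The short Python calculation below multiplies the rounded figures reported in this article to get rough estimates of how many accounts had a publicly visible profile photo. The inputs are rounded (India's is given only as "nearly 750 million"), so the outputs are approximations rather than figures from the researchers' paper.

```python
# Figures as reported in the article (rounded), so results are rough estimates.
reported = {
    "Worldwide": (3_500_000_000, 0.57),
    "United States": (137_000_000, 0.44),
    "India": (750_000_000, 0.62),  # the article says "nearly 750 million"
    "Brazil": (206_000_000, 0.61),
}


def exposed_photo_estimate(numbers, share):
    """Estimate how many accounts show a public profile photo."""
    return round(numbers * share)


estimates = {
    place: exposed_photo_estimate(numbers, share)
    for place, (numbers, share) in reported.items()
}

for place, count in estimates.items():
    print(f"{place}: about {count / 1e6:.0f} million accounts with a public photo")
```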
The University of Vienna researchers stumbled on WhatsApp's phone number enumeration problem last year, when they were testing what they could learn from the service about users despite its end-to-end encryption for messages, such as the times when a user is connecting from the desktop app versus the mobile one. They found that the app didn't appear to have any obvious rate-limiting protection, so they tried simply enumerating all US numbers. “In a half hour, we had like 30 million US-based numbers,” says Gabriel Gegenhuber, one of the University of Vienna researchers. “So we were kind of surprised. And then we just kept going.”
One interested audience for the exposed phone number data, the researchers point out, would be scammers and spammers who are seeking a database of potential targets. But the researchers also found millions of phone numbers registered to WhatsApp in countries where it's officially banned, including 2.3 million in China and 1.6 million in Myanmar. Those countries' governments could have used WhatsApp's exposure to collect those numbers and hunt down illegal app users, the researchers point out. Muslims in China, according to some reports, have been detained merely for having WhatsApp installed on their phones.
The University of Vienna researchers also analyzed the cryptographic keys, the long strings of characters used to receive encrypted messages in WhatsApp's end-to-end encryption protocol, for the 3.5 billion accounts they found exposed via their enumeration method. They found that a surprising number of accounts used duplicate keys, a security issue given that anyone who has the same key as another user would also be able to decrypt messages sent to them.
Some keys were reused hundreds of times, they found, and 20 US numbers used a key of all zeroes, strangely. The researchers speculate, though, that the key duplication was likely the result of unauthorized WhatsApp clients, rather than a flaw in WhatsApp itself. On closer examination of some of the accounts with repeated cryptographic keys, they also noted that they looked like scammer accounts, suggesting that some scam operations that exploit WhatsApp may use a client with broken encryption features.
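A check of this kind can be sketched in a few lines of Python: group accounts by their public key, then flag any key shared by more than one account and any key made up entirely of zero bytes. This is a minimal illustration, not the researchers' actual tooling; the record format, function name, and sample data are hypothetical.

```python
from collections import defaultdict


def find_key_anomalies(records):
    """records: iterable of (account_id, public_key_bytes) pairs (hypothetical shape).

    Returns (reused, all_zero): a dict mapping each shared key to the accounts
    using it, and a list of accounts whose key consists only of zero bytes.
    """
    accounts_by_key = defaultdict(list)
    for account_id, key in records:
        accounts_by_key[key].append(account_id)

    reused = {key: ids for key, ids in accounts_by_key.items() if len(ids) > 1}
    all_zero = [
        account_id
        for key, ids in accounts_by_key.items()
        if key and not any(key)  # non-empty and every byte is 0
        for account_id in ids
    ]
    return reused, all_zero


# Made-up sample data for illustration only.
sample = [
    ("account-a", b"\x01\x02"),
    ("account-b", b"\x01\x02"),
    ("account-c", b"\x00\x00"),
    ("account-d", b"\x09\x09"),
]
reused, all_zero = find_key_anomalies(sample)
```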
Aside from the lack of rate limiting, the researchers argue that their findings point to a more fundamental issue with services like WhatsApp: Phone numbers, they say, don't actually have enough randomness to be used as a unique identifier for a service with billions of users. That leaves rate limiting as the only available measure to prevent user data from being scraped en masse, and one that will never be fully secure against privacy leaks if WhatsApp prioritizes convenient contact discovery for users. (WhatsApp has, in fact, started testing a username feature in beta, which may offer a better approach to privacy.)
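The randomness argument can be made concrete with some back-of-the-envelope arithmetic. The Python sketch below compares the space of all 10-digit numbers with a hypothetical 128-bit random identifier, using the rate of roughly a hundred million checks an hour reported above. Treating every 10-digit string as a candidate is a simplifying assumption, since real numbering plans are smaller and vary by country, and the 128-bit identifier is purely illustrative.

```python
import math

CHECKS_PER_HOUR = 100_000_000  # rate reported in the article


def bits_of_entropy(space_size):
    """Bits needed to pick one value uniformly from the space."""
    return math.log2(space_size)


def hours_to_enumerate(space_size, checks_per_hour=CHECKS_PER_HOUR):
    """Hours needed to try every value in the space at the given rate."""
    return space_size / checks_per_hour


ten_digit_space = 10 ** 10       # assumption: every 10-digit string is a candidate
random_128_bit_space = 2 ** 128  # assumption: a hypothetical random identifier

print(f"10-digit numbers: {bits_of_entropy(ten_digit_space):.1f} bits, "
      f"{hours_to_enumerate(ten_digit_space):.0f} hours to enumerate")
print(f"128-bit identifier: {bits_of_entropy(random_128_bit_space):.0f} bits, "
      f"{hours_to_enumerate(random_128_bit_space):.1e} hours to enumerate")
```

Under these assumptions, every 10-digit number could be tried in about 100 hours, while a random 128-bit identifier could not be exhausted in any realistic time frame.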
“Phone numbers were not designed to be used as secret identifiers for accounts, but that's how they're used in practice,” says Judmayer. “If you have a big service that's used by more than a third of the world population, and this is the discovery mechanism, that's a problem.”
