Internet Archive Suffers Major Data Breach, Impacting 31 Million Users
Created on 10 October, 2024 • News
The Internet Archive suffered a data breach, exposing 31 million accounts and leading to serious security concerns for its users.
The Internet Archive, renowned for its digital library and the "Wayback Machine," has suffered a severe data breach affecting over 31 million user accounts. The incident came to light on October 9, 2024, when a threat actor defaced the site with a JavaScript alert informing visitors of the compromise. The exposed data includes email addresses, screen names, and bcrypt-hashed passwords. Troy Hunt of Have I Been Pwned (HIBP), who received the stolen authentication database, confirmed the breach.
What Happened?
On Wednesday afternoon, users visiting the Internet Archive's website were met with an alarming pop-up message indicating a massive security breach. The message claimed that the site had been compromised, affecting millions of users. The breach was later confirmed by the founder of the Internet Archive, Brewster Kahle, who also disclosed that the website had suffered a Distributed Denial of Service (DDoS) attack earlier in the day.
The compromised data reportedly includes more than 31 million user records, consisting of email addresses, screen names, bcrypt-hashed passwords, and other internal data. The stolen database was a 6.4GB SQL file named "ia_users.sql." Its most recent timestamp is September 28, 2024, which suggests the database was copied on or shortly after that date.
HIBP Confirms the Breach
Troy Hunt, the creator of Have I Been Pwned (HIBP), confirmed that he received the compromised database from the attacker. Hunt revealed that 54% of the affected accounts were already listed in HIBP due to previous breaches. He verified the stolen data by contacting users listed in it, including cybersecurity researcher Scott Helme, who confirmed that the bcrypt-hashed password and other details matched his archive.org account information.
This verification process showed that the data was real and that the breach had a wide-reaching impact on Internet Archive users.
DDoS Attack and Website Defacement
In addition to the data breach, the Internet Archive suffered a DDoS attack on the same day, rendering its website temporarily inaccessible. The attack was claimed by the hacktivist group BlackMeta, which had also targeted the Archive with DDoS attacks earlier in 2024.
The pop-up notification that defaced the site added to the confusion and panic among visitors. It read, "Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!"
The Internet Archive’s team acted quickly to remove the defacement and restore the website. By late Wednesday evening, Brewster Kahle shared an update via the site’s account on X (formerly Twitter), stating that they had disabled the JavaScript library responsible for the defacement and were actively scrubbing their systems and upgrading security to prevent further breaches.
What Information Was Exposed?
The stolen database contained several pieces of sensitive information, including:
- Email addresses
- Screen names
- Password change timestamps
- Bcrypt-hashed passwords
Bcrypt is a deliberately slow, salted password-hashing algorithm designed to resist brute-force attacks, so the plaintext passwords were not directly exposed. Even so, weak or common passwords can still be guessed offline once the hashes are in an attacker's hands, and the exposure of such a large number of accounts is concerning. Users are advised to change their passwords, especially if they reused the same password across multiple websites.
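To make the brute-force point concrete, here is a minimal back-of-the-envelope sketch in Python. The cost factors, the attacker's hash rate, and the wordlist size are hypothetical assumptions chosen for illustration, not values reported for the Internet Archive database. The one fixed property it relies on is that bcrypt's cost parameter c means 2^c rounds of its key-expansion step, so each increment of c doubles the work per guess.

```python
# Rough estimate of how long an offline guessing attack takes against
# bcrypt hashes. Every number used below is an illustrative assumption.


def guesses_per_second(base_rate: float, base_cost: int, cost: int) -> float:
    """Scale a hash rate measured at one bcrypt cost factor to another.

    bcrypt performs 2**cost key-expansion rounds, so each +1 in cost
    halves the number of guesses an attacker can try per second.
    """
    return base_rate / (2 ** (cost - base_cost))


def seconds_to_try(candidates: int, rate: float) -> float:
    """Time needed to test `candidates` guesses against ONE salted hash."""
    return candidates / rate


if __name__ == "__main__":
    # Hypothetical attacker hardware: 20,000 guesses/s at cost factor 5.
    base_rate, base_cost = 20_000.0, 5
    wordlist = 10_000_000  # hypothetical 10-million-entry password list

    for cost in (5, 10, 12):
        rate = guesses_per_second(base_rate, base_cost, cost)
        hours = seconds_to_try(wordlist, rate) / 3600
        print(f"cost={cost:2d}: {rate:10.1f} guesses/s, {hours:8.1f} h per account")
```

Because bcrypt salts every hash individually, the estimate applies per account: work spent on one user's hash does not carry over to another. A password that appears near the top of a common wordlist still falls quickly, which is why changing reused or weak passwords matters even when the hashing is sound.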
According to Hunt, some of the data in the database dates back several years, though it includes recent information, with the last password change recorded in September 2024. This suggests that the breach affected both long-time users of the Internet Archive and newer accounts.
How Did the Breach Happen?
As of now, the exact method of entry used by the hackers to access the Internet Archive's database remains unknown. There is no clear link between the data breach and the DDoS attack, but both occurred within a short time frame, leading to speculation that the two incidents may have been coordinated.
Brewster Kahle’s initial statement confirmed that the team had been focused on fending off the DDoS attack for most of the day and only later realized that user data had been compromised. It’s possible that the DDoS attack served as a distraction, allowing hackers to breach the system and extract the database unnoticed.
What Steps Are Being Taken?
Following the breach, the Internet Archive disabled the affected JavaScript library, which had been compromised to display the defacement message. The team is currently working on scrubbing their systems of any malware or unauthorized scripts that may have been installed during the attack. Security upgrades are being implemented to protect the site from future breaches.
Kahle also reassured users that more information would be shared as the investigation unfolds, and that the team is working closely with cybersecurity experts to ensure the safety of its systems.
Implications for Internet Archive Users
The breach has serious implications for the millions of people who have used the Internet Archive's services. With email addresses and bcrypt-hashed passwords exposed, users are advised to:
- Change their passwords immediately, especially if the same password was used across multiple websites.
- Enable two-factor authentication (2FA) where available to add an additional layer of security to their accounts.
- Check Have I Been Pwned (HIBP) to determine if their information was part of the breach.
HIBP will soon allow users to check whether their data was exposed in this breach, once Hunt finishes loading the compromised database into the service.
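Readers who want to check whether a specific password they reused has appeared in known breaches can use HIBP's separate Pwned Passwords range search, which is a different service from the email lookup described above. A minimal sketch in Python follows. The helper names (`split_hash`, `count_in_response`, `pwned_count`) and the User-Agent string are made up for this example. The range endpoint takes the first five hex characters of the password's SHA-1 hash and returns matching hash suffixes with counts, so the full password and full hash never leave the machine.

```python
import hashlib
import urllib.request

# Pwned Passwords range-search endpoint (k-anonymity model).
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str) -> tuple[str, str]:
    """Return (first 5 hex chars, remaining 35) of the upper-case SHA-1."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix: str, body: str) -> int:
    """Find `suffix` in a range response made of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix and count:
            return int(count)
    return 0  # suffix not listed: no known breach contains this password


def pwned_count(password: str) -> int:
    """Query the range API; only the 5-character prefix is sent."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-check-sketch"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_response(suffix, body)
```

Calling `pwned_count("some password")` returns how many times that password appears in the Pwned Passwords corpus, or 0 if it is not listed. A nonzero result means the password should be retired everywhere it is used, regardless of this breach.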
BlackMeta and Future Attacks
The hacktivist group BlackMeta, which claimed responsibility for the DDoS attack, has hinted at future attacks on the Internet Archive. BlackMeta has a history of targeting organizations with disruptive cyberattacks and previously targeted the Internet Archive in May 2024.
While their motivations remain unclear, it appears that the group is intent on causing as much disruption as possible. They posted a cryptic message on X, suggesting that another attack is planned for the following day. This raises concerns about the Internet Archive’s ongoing vulnerability to such attacks.
A Major Breach with Long-Term Consequences
The Internet Archive's data breach is a sobering reminder of the risks associated with online platforms that store sensitive user data. With over 31 million user records exposed, the fallout from this breach could be significant, both for the affected individuals and for the organization itself. The Internet Archive is a vital resource for the digital preservation of knowledge, but this incident highlights the importance of robust cybersecurity measures for even the most trusted institutions.
As the investigation into the breach continues, the Internet Archive must act swiftly to rebuild trust with its users and ensure that similar incidents are prevented in the future.