Incident

Activist group scrapes 300TB from Spotify Music Library



Streaming giant Spotify confirmed that its music library was scraped by an activist group known as Anna's Archive, resulting in the extraction of approximately 300 terabytes of data from the platform. The activists scraped public metadata and circumvented the platform's digital rights management (DRM) systems to access audio files. 

Anna's Archive, a shadow library organization traditionally focused on books and academic papers, announced the scrape in a detailed blog post on December 20, 2025, positioning the effort as a "preservation archive" for music and claiming it represents the world's first fully open music preservation archive. The scraped data includes:

  • Metadata for an estimated 256 million tracks (approximately 99.9% of Spotify's entire catalog)
  • 86 million audio files representing around 99.6% of total listens on the platform
  • 186 million unique International Standard Recording Codes (ISRCs)
  • Public user-generated playlists
  • Album artwork and artist information
  • Audio analysis data totaling 4TB when compressed

According to Anna's Archive, the group prioritized tracks based on Spotify's internal popularity metric, which ranges from 0 to 100. For tracks with popularity scores above zero, the audio was preserved in its original OGG Vorbis format at 160kbps. For less popular tracks with a popularity score of zero, the audio was re-encoded to OGG Opus at 75kbps to conserve storage space. The cutoff date for the scrape was July 2025, meaning content released after that date may not be present in the archive. 
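The tiering rule described above can be sketched in a few lines of Python. This is only an illustration of the stated policy, not Anna's Archive's actual tooling: the names `Encoding`, `choose_encoding`, and `estimated_size_bytes` are hypothetical, and the size estimate assumes a constant nominal bitrate (1 kbps = 1,000 bits per second) with no container overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    codec: str
    bitrate_kbps: int


# Values taken from the article; the constant names are hypothetical.
ORIGINAL = Encoding("OGG Vorbis", 160)   # tracks with popularity above zero
REENCODED = Encoding("OGG Opus", 75)     # tracks with popularity of zero


def choose_encoding(popularity: int) -> Encoding:
    """Pick the storage encoding for a track from its 0-100 popularity score."""
    if not 0 <= popularity <= 100:
        raise ValueError("popularity must be between 0 and 100")
    return ORIGINAL if popularity > 0 else REENCODED


def estimated_size_bytes(duration_seconds: float, encoding: Encoding) -> float:
    """Rough file size, assuming a constant nominal bitrate and no overhead."""
    return duration_seconds * encoding.bitrate_kbps * 1000 / 8


if __name__ == "__main__":
    for score in (0, 57):
        enc = choose_encoding(score)
        size_mb = estimated_size_bytes(180, enc) / 1_000_000
        print(f"popularity={score}: {enc.codec} at {enc.bitrate_kbps}kbps, "
              f"about {size_mb:.2f} MB for a 3-minute track")
```

Under these assumptions, a three-minute track takes roughly 3.6 MB in the 160kbps tier and about 1.7 MB in the 75kbps tier, which shows why re-encoding the least popular tracks saves meaningful storage at this scale.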

The data is being distributed through peer-to-peer torrent networks using Anna's Archive's proprietary Anna's Archive Containers (AAC) format. The metadata has already been released, and the audio files are being released gradually in order of popularity.

Spotify confirmed the unauthorized access and has identified and disabled the user accounts that engaged in scraping. Spotify has implemented new safeguards designed to prevent similar anti-copyright attacks in the future. 

The streaming service says its investigation found no indication that non-public user information was compromised in the breach. Spotify characterized Anna's Archive as "anti-copyright extremists" who have previously pirated content from YouTube and other platforms. The company stated it is actively monitoring for suspicious behavior and working with industry partners to protect the rights of the creative community.

The incident has raised significant concerns among music industry stakeholders and cybersecurity experts about the potential ramifications of such a large-scale data heist. Yoav Zimmerman, CEO of Third Chair, a company that tracks unauthorized use of intellectual property, noted that the scrape theoretically enables anyone with sufficient technical knowledge and storage capacity to create their own version of Spotify using a personal media streaming server. Copyright law and fear of enforcement are the only real barriers. 

Industry experts are concerned about the potential for artificial intelligence companies to use the scraped data to train their models at scale without authorization or compensation to rights holders. The incident dwarfs the largest previously available open music archive, MusicBrainz, which contains approximately five million unique tracks compared to Anna's Archive's 256 million. 

Even if Anna's Archive positions its actions as preservation rather than piracy, copyright law typically does not include exceptions for archival intent, and legal action from Spotify and major record labels is likely. Because the archive is already distributed across torrent networks, however, lawsuits face significant practical challenges in removing it from circulation.
