ruaok – MetaBrainz Blog

We can’t have nice things… because of AI scrapers

In the past few months the MetaBrainz team has been fighting a battle against unscrupulous AI companies ignoring common courtesies (such as robots.txt) and scraping the Internet in order to build up their AI models. Rather than downloading our dataset in one complete download, they insist on loading all of MusicBrainz one page at a time. This of course would take hundreds of years to complete and is utterly pointless. In doing so, they are overloading our servers and preventing legitimate users from accessing our site.

Now the AI scrapers have found ListenBrainz and are hitting a number of our API endpoints for their nefarious data gathering purposes. In order to protect our services from becoming overloaded, we’ve made the following changes:

The /metadata/lookup API endpoints (GET and POST versions) now require the caller to send an Authorization token in order for this endpoint to work.
The ListenBrainz Labs API endpoints for mbid-mapping, mbid-mapping-release and mbid-mapping-explain have been removed. Those were always intended for debugging purposes and will soon be replaced with new endpoints for our upcoming improved mapper.
LB Radio will now require users to be logged in to use it (and API endpoint users will need to send the Authorization header). The error message for logged in users is a bit clunky at the moment; we’ll fix this once we’ve finished the work for this year’s Year in Music.
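
For API users, the Authorization requirement above might look roughly like the following. This is a minimal sketch, not official client code: the API root, the /1/ path prefix, the artist_name and recording_name parameters, and the "Token" header scheme are assumptions to check against the ListenBrainz API documentation, and build_lookup_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: API root and path prefix are not stated in this post.
API_ROOT = "https://api.listenbrainz.org"


def build_lookup_request(artist_name, recording_name, user_token):
    """Build (but do not send) a GET request for /metadata/lookup."""
    # Assumption: the endpoint takes these two query parameters.
    query = urlencode({
        "artist_name": artist_name,
        "recording_name": recording_name,
    })
    url = f"{API_ROOT}/1/metadata/lookup/?{query}"
    # Assumption: the token is sent as "Authorization: Token <user token>".
    return Request(url, headers={"Authorization": f"Token {user_token}"})


# "my-token" is a placeholder; use the token from your own user settings.
request = build_lookup_request("Portishead", "Glory Box", "my-token")
print(request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it; without the header, callers should now expect the endpoint to refuse the request.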

Sorry for these hassles and no-notice changes, but they were required in order to keep our services functioning at an acceptable level.

Welcome Summer of Code 2025 contributors!

We are thrilled to announce the selection of 6 contributors to work with us for this year’s Google Summer of Code program!

MetaBrainz received many great applications this year. Selecting the final contributors was tough and involved weighing various factors. What these contributors did right was getting in early, engaging with our community, presenting specific and detailed proposals, and demonstrating excellent communication skills and the ability to integrate our feedback into their proposals.

Thank you to all contributors who submitted a proposal with us!

The whole list of selected proposals can be found on the GSoC website but here is a TL;DR breakdown:

MetaBrainz proposals

Matrix Archiver (libretto) (Jade Ellis AKA JadedBlueEyes)

This project proposal replaces BrainzBot with a new archival service that archives messages directly from Matrix to HTML files on disk and a PostgreSQL database. It will support Matrix features like message editing, reactions and media, and provide full text search over all messages. Both historical and new messages as they come in will be archived.

Centralized Notification System for MetaBrainz (Junaid AKA fettuccinae)

MetaBrainz contains multiple sub-projects which send out standalone notifications. This project aims to centralize those by developing a shared notification system within metabrainz-org, enabling all sub-projects to deliver user notifications through this notification system. Expected Outcome: A functional notifications system with relevant API endpoints.

ListenBrainz proposals

Importing Listening History Files in Listenbrainz (Suvid Singhal)

This project aims to develop a feature that enables users to import their listening history from various services, including ListenBrainz exports, Spotify, Apple Music, and other CSV file formats. The proposed solution involves creating a backend API endpoint to handle file uploads, building normalizers to parse and validate data from different services, and converting the data to the JSONL format required by ListenBrainz. The solution also includes a frontend to handle file uploads and show the progress to the user.

Onboarding Revamp in Listenbrainz-Android (Hemang Mishra)

A smooth and intuitive onboarding experience is essential for any app, ensuring that users understand its features while maintaining trust and engagement. This project focuses on enhancing the ListenBrainz Android onboarding flow by making it more informative, user-friendly, and privacy-conscious. Key improvements include a dedicated Listen Submission screen to give users full control over which apps contribute listens, clear permission rationales, and fallback mechanisms for denied permissions. Additionally, a revamped sign-in screen will provide better navigation, including a bug report option for easy issue submission.

Development of Advanced User Statistics Visualizations (Granth Bagadia AKA holycow23 AKA granth23)

The project aims to design and implement advanced interactive visualizations for ListenBrainz using Nivo for data visualization and integrating with the existing Flask API. Apache Spark will handle efficient data processing and aggregation. These visualizations will offer granular insights into genre trends, artist diversity, and temporal listening patterns, enhancing user experience and engagement. The project will result in the development and integration of the following four interactive charts into ListenBrainz: Artist Listening Activity Statistics, Listens by Era Statistics, Genre-Based Listening Patterns and Top Listeners.

Integrate music streaming from Funkwhale & Navidrome (Mohammad Amanullah AKA mAmanullah7)

Allow users to play music from their Funkwhale and Navidrome servers directly in BrainzPlayer, as both are self-hosted music streaming platforms. Funkwhale uses OAuth2 for secure authentication, while Navidrome currently uses basic Subsonic authentication (username/password + salt); OAuth2 authentication should soon be available for Navidrome as well. Once these are available, we can support Subsonic streaming in the ListenBrainz Player.

What if you’re not in GSoC 2025?

Reading this and still feeling inspired to contribute code? Volunteer contributors are very welcome all year round, even though we might have slightly less time available to help you during the summer. Contributing also puts you in an ideal position for applying to next year’s GSoC. You can find some tips for applying to GSoC with us in one of our previous posts. When you are ready, join us on the MetaBrainz Matrix Channel and showcase your initiative and your skills!

Rest in Peace drsaunde!

Hello!

It is with a heavy heart that I share this sad news today; we’ve just found out that one of our most prolific editors, David Saunders (AKA drsaunde), has passed away.

We don’t know much about the circumstances of what happened, but a friend of David’s was kind enough to reach out to us to let us know of his passing. The friend said:

“Our dear friend, David Saunders aka drsaunde sadly passed away March 26, 2025 surrounded by friends listening to his favourite music. Dave was a brilliant man with a great sense of humour and obviously an avid fan of music. I knew he wrote for a music site but never learned which one. Quick search for a common username he used landed me on your site.”

To put into context who drsaunde really was, let’s look at his impressive statistics:

He made a total of 2,191,225 edits, with only 37 rejected edits, for an astonishing acceptance rate of 99.998%. He was a member of MusicBrainz since 2006-05-06 for a total of 18 years, 10 months and 20 days, which amounts to 317 edits for each day he was a member of MusicBrainz. 🤯🤯🤯

On top of that he was our Areas editor, the person in charge of maintaining our database of cities and regions in the world. Needless to say, drsaunde has left a giant hole in our community and our hearts.

In keeping with his spirit, his friends will hold a celebration of life for drsaunde on May 3rd. We’ve sent some funds to help ensure that celebration really honors his spirit.

Finally, if you would like to take a moment to remember drsaunde, you could head over to ListenBrainz radio and make a playlist from his most listened tracks.

Rest in peace, drsaunde, you will be missed.

Google Summer of Code 2025: MetaBrainz has been accepted!

We’re excited to announce that the MetaBrainz Foundation has been accepted into Google’s Summer of Code program for 2025! Summer of Code has been instrumental (pun intended) in the development of our projects and growth of our team over the years, so we’re pleased to be part of it for another round.

Ready to rock this summer coding with us? Start with carefully reading the terms for contributors. If you are eligible, go ahead and take a look at our Summer of Code landing page where you can find project ideas that we have listed for this year. Our landing page will also tell you what we require of our participants and how to pick up a project.

A very important note: We will not be considering any proposals from contributors who have not reached out to us before March 31.

Good luck to all who are interested in participating!

PS: If you’re feeling particularly adventurous, check out this entirely optional link for some extra motivation.

Pissed off by Spotify Enshittifying more API endpoints? We can help!

Today Spotify announced that a number of APIs will no longer be available for new users.

While Spotify won’t immediately take away these endpoints for existing users, it certainly does not inspire confidence for their longevity. Spotify cites “security reasons” as an explanation of why they are closing off these APIs, but we are unclear as to how that will improve security, so we need to assume that Spotify has some other motivations behind this move. More likely than not, they are hatching a strategy to protect their algorithmic assets from data crawlers used by third-party AI companies.

Needless to say, the Spotify services continue to get enshittified, taking away very useful features that developers have come to rely on. ListenBrainz has very different goals, being entirely open-source and part of a non-profit foundation, and we won’t pull the rug out from under our users for monetary or “security” reasons.

On the contrary, our very small team works in direct collaboration with users and developers interested in developing new discovery tools in the music space, and we embrace the variety of ways passionate music lovers want to interact with music collections and recommendations.

Our own frustration with Spotify’s ever-worsening recommendations was the spark that lit up our interest in recommendations, but again our approach is one of fairness (we don’t tip the scales) focused on the user’s experience rather than the deep pockets of multinational labels.

For developers frustrated that their app stopped working, the good news is that the ListenBrainz team has been working on building some new datasets and API endpoints that offer replacements for what Spotify is taking away. While not everything that Spotify is enshittifying has a direct replacement with ListenBrainz, we can at least offer a path forward for developers.

These features/datasets include:

Artist similarity*: You can use our ListenBrainz Labs API to explore similar artists. The similarity datasets are still somewhat limited, since we’ve not been able to run all available data through that algorithm, but we plan to do that in the very near future.
Recommendations for a user: Recommendations are available for ListenBrainz users who send us listen information.
Custom playlist generation: https://listenbrainz.org/explore/lb-radio/ allows you to customize generated playlists as in-depth as you want (see the documentation for more details)
Popularity data: Find out the popularity of artists on ListenBrainz, as well as popular tracks for an artist
Fresh releases: Keep up with new albums coming out, generally or specifically for your taste (as seen on https://listenbrainz.org/explore/fresh-releases/)

Future new datasets include:

Track similarity
Album similarity
Your dream feature here

All of this data is Creative Commons CC0 licensed (read Public Domain) and available on our API endpoints, for free, forever. MetaBrainz is a California 501(c)(3) non-profit organization dedicated to creating, maintaining and ensuring that these datasets are available for public use.

And on top of that, the person who coined the term “Enshittification”, Cory Doctorow, has been on our board of directors for 20 years, further ensuring that we’re enshittification proof.

Come play with our data – we’d love your feedback! We’re working hard to make this data better and if it doesn’t yet meet your needs, we hope to meet them soon!

* for the similar artist search, use this value for “algorithm”: session_based_days_7500_session_300_contribution_5_threshold_10_limit_100_filter_True_skip_30
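
As a rough sketch of how the similar-artist lookup mentioned above might be called with that algorithm value: the Labs API URL and the artist_mbids parameter name below are assumptions not given in this post, similar_artists_url is a hypothetical helper, and the MBID is a placeholder rather than a real artist. Only the algorithm string is taken from the footnote.

```python
from urllib.parse import urlencode

# Assumption: endpoint URL is not stated in this post; verify it on the
# ListenBrainz Labs API page before use.
LABS_SIMILAR_ARTISTS_URL = "https://labs.api.listenbrainz.org/similar-artists/json"

# Value for "algorithm", copied from the footnote above.
ALGORITHM = (
    "session_based_days_7500_session_300_contribution_5_"
    "threshold_10_limit_100_filter_True_skip_30"
)


def similar_artists_url(artist_mbid):
    """Return a query URL for artists similar to the given artist MBID."""
    # Assumption: the artist is passed in a parameter named "artist_mbids".
    query = urlencode({"artist_mbids": artist_mbid, "algorithm": ALGORITHM})
    return f"{LABS_SIMILAR_ARTISTS_URL}?{query}"


# Placeholder MBID for illustration only; substitute a real MusicBrainz ID.
print(similar_artists_url("00000000-0000-0000-0000-000000000000"))
```

Building the URL separately from sending it makes it easy to inspect the query in a browser first, which is handy while the similarity datasets are still limited.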

Welcome Julian45 (and atj)!

I’m pleased to announce that Julian Anderson (julian45) and Adam James (atj) have joined our team as volunteer System Administrators. Julian has just now joined the team, whereas Adam has been part of it for nearly 2 years and I failed to post the requisite blog post welcoming him. Mea culpa, Adam! Welcome to both of you!

We welcome volunteers to help us with our infrastructure, which continues to grow and become more complicated. The ListenBrainz project in particular has many moving parts in order to process the user’s data (stats, recommendations, fresh releases, etc.). On top of that, we’re working hard to make sure that our infrastructure is as automated as possible, so we welcome any help from people who know Ansible, like Julian and Adam.

Towards Fair Streaming: Introducing the FairMusE project

Hello ListenBrainz community!

As you know, we’ve been working hard on building recommendations and other music discovery tools as part of ListenBrainz. Online streaming providers and their questionable discovery features have long been a source of frustration for us, so we worked hard to build recommendations with as little bias as possible.

Fortunately, we’re not alone in our frustration with the streaming providers – researchers at Aalborg University, Denmark and Lille University, France are currently questioning the fairness of these music recommendations and have asked ListenBrainz and its community to help them with this task.

The researchers are looking for ListenBrainz users to give their permission for their public ListenBrainz data to be used as part of this research. If fair music discovery services are of importance to you, please read on and consider granting the researchers permission to use your data:

Share your listening data with the researchers and help us fight for a fair and transparent music streaming ecosystem!

Happy Birthday to us: 25 years of MusicBrainz and 20 years of Picard!

Hello!

Today is the 25th anniversary of me registering the musicbrainz.org domain. I told the story of how that happened two years ago, if you’re curious. 25 years is a long time for sure and I had no idea that this little project would end up so enormous, with users from literally all over the world.

Thanks to everyone who had/has a part in it. ❤️

Also, let’s talk about the 20th anniversary of the Picard Tagger! The beginning of Picard has a less clear starting point, or at least not one that is easily Googleable 20 years on. ChatGPT, at least, suggests that Picard came into this world as vaporware on February 9th, 2004 when the Real Networks-sponsored Helix Community made a number of grants to support several open source projects.

ListenBrainz Radio: New release now live!

Hello!

I’m pleased to announce that we’ve just released the latest version of LB Radio!

This release changes many things under the hood since the first release — given all the feedback you’ve given us, we were busy making improvements clear across the board. The improvements include:

Speed: LB Radio artist and tag elements are much faster now, since most of the work is being done by Postgres (and not AI!) and not in slower and wasteful Python code as part of our Troi recommendation engine.
More distinct modes: In our first version it was hard to tell the difference between the easy, medium and hard modes. This version fixes a few bugs and improves the overall algorithms used to make the artist and tag element playlists. The playlists generated for each mode should feel more distinct now.
Country element: We’ve added a new country element that creates a playlist of recordings from artists who are originally from the selected country. This doesn’t always ensure that the music that we serve is actually from that country or even representative of that country. Right now, we’re making a best effort at a playlist that is representative of the country, but we might be missing the mark, so you tell us. Also, tiny countries (e.g. Vatican City, Andorra) don’t usually generate enough recordings to make good playlists, since there isn’t a lot of data available. Also, good thing Antarctica is not a country. 😂
Streamlined syntax and improved error messages: The parsing library that LB Radio version 1 used was pretty cool, but its error messages were hated by everyone, even hardcore geeks. Thus, we wrote a new parser that could give us better error messages and, to make that process easier, the syntax of LB Radio has been made more consistent.

Let’s take a quick look at the improved syntax. All of the following examples are valid LB Radio prompts:

David Bowie

#punk

artist:(Tina Turner)

tag:(trip hop, dreampop)

The most important thing to know is that any elements that accept free form text (e.g. tags and artist names) should always be enclosed in ( ). Please refer to our official documentation for all the details.
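
If you generate prompts from a script, the parenthesis rule can be captured in a few lines. This is only a sketch based on the examples in this post: radio_element is a hypothetical helper, not part of LB Radio or Troi, and since escaping rules aren't covered here it simply refuses values that contain parentheses.

```python
def radio_element(kind, *values):
    """Format one LB Radio element, e.g. artist:(Tina Turner)."""
    # Drop surrounding whitespace and skip empty values.
    cleaned = [value.strip() for value in values if value.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty value is required")
    text = ", ".join(cleaned)
    if "(" in text or ")" in text:
        # Assumption: how to escape parentheses is not described in this
        # post, so this sketch rejects them instead of guessing.
        raise ValueError("parentheses in values are not handled by this sketch")
    # Free form text is always enclosed in ( ).
    return f"{kind}:({text})"


print(radio_element("artist", "Tina Turner"))        # artist:(Tina Turner)
print(radio_element("tag", "trip hop", "dreampop"))  # tag:(trip hop, dreampop)
```

The two printed prompts match the artist and tag examples shown above.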

Have fun making new playlists! And as always, if you find a bug, please create a ticket in our bug tracker.

Thanks!

SSL.com is evil and deceptive: Don’t do business with SSL.com

In the past we’ve purchased our SSL/TLS certificates from SSL.com and when we last renewed our main domain’s SSL certificate, we suddenly started getting charged $20/month for:

eSigner Cloud Signing for OV Code Signing Tier 1 Monthly

Whatever this service is, we didn’t sign up for it. And trying to get SSL.com to stop charging us and refund the money has been a nightmare over the past year. The UI on their site is so bad that I can’t find anything and I am constantly confused by all of the useless and cryptic information packed into every single unreadable page.

In the end, I resorted to contacting customer support, who on the surface seem nice and helpful, but really all they do is refer matters to “internal teams” who never actually resolve any issues. This is nothing more than stonewalling.

Eventually they acknowledged that we are due a refund for $200+. But they were unable to refund the money because the credit card used for the original transaction expired by the time SSL.com got its act together.

This entirely unforeseeable problem was, as you might guess, referred to an internal team. Where it has been sitting for the past year now. Any attempts to get this to move along have resulted in nothing more than:

“We’re sorry for the inconvenience, we’ve forwarded this to an internal team.”

Joy. I guess that $200 will never be recovered and I need to cut my losses dealing with evil corporations.

So, be warned: Do not do business with SSL.com and use the amazing Let’s Encrypt service instead! If by some miracle, we get our money back, I’ll donate it to Let’s Encrypt instead!