Releasing our new Solr search infrastructure

Hey folks, samj1912 here again o/

As you might know, we recently did a massive upgrade of our search infrastructure. If you have not been following our Solr updates, definitely check out our other blog detailing our search server journey and the improvements and changes that come with the new search.

We ran Solr in beta this past week and fixed most of the show-stopping bugs. We have also been stress testing our Solr search by replaying our production logs against it, live.

Solr search seems to solve almost all our qualms with search, so we have made the decision to use Solr for our production search servers.

The purpose of this blog post, as nicely worded by our BDFL Rob is –

Speak now or forever hold your pickle. In a week, the ole search servers gets it.

And it’s basically that: if you haven’t experimented with Solr search, please read our earlier blog post to know what’s what. If you find any bugs, please report them on our ticket tracker. Unless new show-stoppers are reported that absolutely must be fixed before we switch to Solr on the main website, we will be killing the old search servers and replacing them with our brand new Solr ones in a week.

Apart from that, we have created a Discourse thread where you can suggest minor improvements to the search results.

Another thing I’d like to remind everyone of is that, with our switch to this new Solr infrastructure, the version 1 web service (ws/1) will soon be discontinued. As announced earlier, we will keep it alive till 31st July 2018, but it will get the axe on 1st August 2018, 12 pm GMT.

MusicBrainz Search Overhaul

Hello people o/, samj1912 here.

I am extremely glad to announce that we are finally launching our Solr search on the MusicBrainz beta server!

Just a little history before I announce the new features and toys you get to play with:

Solr started as something that could replace our existing search infrastructure. If you have been a MusicBrainz user for a while, you might know that our search has quite an indexing latency: it can take as much as 3 hours for new edits to show up in the search results, in part because updating the search index involves a full re-index of the database. With the high latency and the resources it took, the current search server left much to be desired.

Another area where our current search fell short was search ranking and surfacing popular results. Searching for a famous artist or place returned a lot of noise and, more often than not, results that weren’t relevant to what the user had in mind.

These were the two major problems that motivated us to shift to a better infrastructure for our search needs.

Thus, MB-Solr was born.

It has been in development for quite some time now. The coding for the project started with Mineo back in 2014 and was carried forward by Jeff Weeksio in GSoC 2015. But due to a lack of development resources and other, more pressing needs, the project was put on hold for a while, until Roman started working on it. However, he left MetaBrainz before he could finish this work, so when I joined the MetaBrainz team, the first and foremost task assigned to me was getting Solr working and ready for production.

After struggling with multiple moving parts and services, tons of issues with maintaining compatibility with our existing web-service API, rowing up and down multi-threading/processing hell, learning just enough about information retrieval to get our search relevance on point and countless hours sifting through Solr documentation to get our Solr cluster fine-tuned and running fast enough to keep up with our web traffic… we are finally here.

I am pretty sure I would’ve rage-quit dozens of times during this last year if I was doing this all alone.

As such, we have our trusty sysadmin Zas to thank for taking care of all the deployment needs and making sure Solr was well-tested (believe me, we toyed with Solr like little kids in a sandbox) and wasn’t going to fail and wake him up at 3 AM with red alerts all over. Mineo, Bitmap and Yvanzo were there with much-needed code reviews and help with all things Solr and MusicBrainz. Our style leader Reosarevok and CatQuest helped us test our new search relevance configuration. And of course, we had our BDFL, Rob, overseeing things and whipping them into shape (with chocolate and mismatched socks, of course).

Anyway, here’s what you are here for:

New features/improvements

  • (Almost) Instantaneous search-index updates – Edit something and immediately see it in the search results. Say goodbye to that note you used to see below the search telling you that you have to wait. Who likes waiting anymore – seriously, it’s 2018.
  • Better search results – We wanted to make sure you were getting the right Queen and London as the top result. You can finally link your favorite artist to London, UK as opposed to London, Arkansas. Don’t believe me? Go try it out.
  • Less load on our servers – Meaning we can serve more of your requests, faster. Getting tired of waiting for tagging your bajillion songs in Picard? Well, you still gotta wait, but less so, now that we are better equipped to handle your requests.

What has stayed the same

  • WS/2 Search API – We know you devs hate doing that extra work to maintain your applications’ compatibility with that one site that changes its API on a whim. Well, we wouldn’t want you to spend those hours following that one int to float change that broke everything ever. As such we have worked hard to make sure that Solr doesn’t change any of our WS/2 search schema.
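For the devs who want to check that claim against their own applications, here is a minimal sketch of building a WS/2 search request aimed at the beta server. The `build_search_url` helper and the `WS2_ROOT` constant are made up for this example and are not part of any MusicBrainz library; the `query`, `fmt` and `limit` parameters are the usual WS/2 search parameters.

```python
from urllib.parse import urlencode

# Assumption for illustration: we point at the beta server mentioned in this post.
WS2_ROOT = "https://beta.musicbrainz.org/ws/2"


def build_search_url(entity: str, query: str, limit: int = 5) -> str:
    """Build a WS/2 search URL for an entity type (hypothetical helper)."""
    # urlencode escapes the query string, e.g. spaces become '+'.
    params = urlencode({"query": query, "fmt": "json", "limit": limit})
    return f"{WS2_ROOT}/{entity}?{params}"


if __name__ == "__main__":
    # Print the URLs only; fetch them with your own HTTP client.
    print(build_search_url("artist", "queen"))
    print(build_search_url("area", "london"))
```

Running the same URL against the main site and the beta site and comparing the two responses is one simple way to spot any difference in the search schema. If you do fetch these URLs, remember to send a meaningful User-Agent header.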

What’s gone

  • WS/1 Search API – We deprecated WS/1 back in 2011. With the new search servers in place, there are only 3 words for those still using it 7 years after it was deprecated – ‘poof, it’s gone’. The service still works on our main website, but its search functionality will be phased out soon, and the entire service will be discontinued in August 2018 as announced earlier.

Now, you must be thinking there is some catch, some slip. Well, so am I, which is why we are releasing this beta for you to test the heck out of our new search over at the MusicBrainz beta site. If you haven’t used it before, worry not – it has all your personalizations and all our cool music metadata from our main site. You should feel at home. (Note: The MusicBrainz beta site works on the live data. Any edits you make on the MusicBrainz beta site will also be reflected on the main site.)

So please! Go check it out!

If you feel you aren’t getting what we promised you or you want more of those shiny new features or that this blog was too long or like a TV commercial, feel free to complain at our Ticket Tracker for Solr. You get your promised features bug-free and our devs get to earn their living. It’s a win-win.

Happy testing!

Server update, 2018-05-30

The React migration resumes with this server release, which features a rewritten area search results page and fixes a few regressions in editing forms. Thanks to reosarevok, who added support for crediting labels in relationships. Beatport, Musopen (score) and six other databases are now handled as external links. Some more small issues have been addressed too, including a web service/collection bugfix, a release display improvement, and other external link updates. The git tag is v-2018-05-30.


  • [MBS-9719] – Convert the area search results page to React


  • [MBS-9675] – Lyrics language dropdown missing while creating works from the relationships editor
  • [MBS-9676] – Cannot select work attributes on non-English localisations
  • [MBS-9704] – 400 Bad Request error when requesting user-tags (or user-ratings) and user-collections
  • [MBS-9710] – Release editor: Add a new recording: You haven’t made any changes!
  • [MBS-9715] – Non-standard barcode entering broken

New Feature

  • [MBS-9630] – Extend relationship credits to labels


  • [MBS-9565] – Update the CD Baby logo used in the sidebar
  • [MBS-9609] – Update the Bandsintown logo used in the sidebar
  • [MBS-9646] – Normalize Bandcamp URLs to https
  • [MBS-9670] – Update the Facebook logo used in the sidebar
  • [MBS-9700] – Extend BnF URLs auto-select, cleanup and validation to instruments


  • [MBS-6130] – Clean and validate Beatport URLs
  • [MBS-8629] – Hide part works from release view
  • [MBS-9326] – Add Beatport links to the sidebar
  • [MBS-9614] – Match with the “purchase for download” release-URL relationship type
  • [MBS-9618] – Extend URL auto-select and validation to series/festivals
  • [MBS-9682] – Allow selection of “download for free” with Google Play
  • [MBS-9684] – Add DRAM to the other databases whitelist
  • [MBS-9685] – Auto-select, clean and validate Musopen URLs as score download for free for works
  • [MBS-9694] – Add TouhouDB to other database whitelist
  • [MBS-9697] – Add the Library of Congress Name Authority File to the other DBs whitelist
  • [MBS-9698] – Add SNAC to the other dbs whitelist
  • [MBS-9702] – Add Prog Archives to the other DBs whitelist
  • [MBS-9717] – Add NDL Authorities to the other DBs whitelist

Server update, 2018-05-09

This bugfix release mainly addresses UI regressions from the previous server release. Thanks to reosarevok, it now handles license links for works and SoundCloud links for places. Another change is that emails sent with a hidden address from the website by other editors now use the same From address as other emails from the website. The git tag is v-2018-05-09.


  • [MBS-9658] – /instruments page breaks if a new instrument type is added but not used
  • [MBS-9673] – Entity search options in the header are no longer translated
  • [MBS-9693] – Tags without vote are not immediately visible
  • [MBS-9705] – Overview tab link is now appended with /show
  • [MBS-9708] – Querying area containments is very slow


  • [MBS-9639] – Extend Soundcloud relationship to places
  • [MBS-9688] – Add autoselect and cleanup for work license rel
  • [MBS-9692] – Normalize VocaDB and UtaiteDB URLs to HTTPS
  • [MBS-9696] – Replace with in hidden email From field

Server update, 2018-04-23

After two months of rewriting parts of the website renderer to React/JSX, it was about time for an intermediate release. We tried hard to make as few changes to the rendered web pages as possible. Thanks to spellew for rewriting the ISRC and “not found” pages. MusicBrainz finally gets rid of Google Analytics, thanks to chirlu’s early contribution. Besides, this release contains a few small user interface improvements and bugfixes, as well as the usual additions to the lyrics whitelist. The git tag is v-2018-04-23.


  • [MBS-9606] – Rewrite ISRC index page to React/JSX
  • [MBS-9607] – Rewrite “not found” pages to React/JSX
  • [MBS-9626] – Rewrite entity headers to React/JSX
  • [MBS-9635] – Rewrite Wikipedia extract to React/JSX
  • [MBS-9689] – Rewrite the election pages to React/JSX
  • [MBS-9690] – Rewrite the aliases pages to React/JSX


  • [MBS-9374] – Langcode not displayed when searching works
  • [MBS-9548] – Same link showing twice on the sidebar
  • [MBS-9628] – Items from main menu do not expand down in IE11 on MusicBrainz
  • [MBS-9636] – Edit annotation tab in Work uses the wrong sub header even if work type is available
  • [MBS-9668] – Non-English breaks the instrument list


  • [MBS-3643] – Add Musixmatch to the lyrics whitelist
  • [MBS-6530] – Remove Google Analytics
  • [MBS-9645] – Extend Geonames autoselect to places
  • [MBS-9648] – Add a bunch of lyrics sites to the whitelist


  • [MBS-8417] – Wikipedia extract language fallback should be smarter
  • [MBS-9681] – Group core entity types in search options

No Spring 2018 schema change

We recently decided not to have a spring 2018 schema change release. As usual, we still have some bits left over to finish up from the last spring schema change. More importantly, we’re making a concerted effort to improve the user experience (UX) of the MusicBrainz site — more on that in a blog post later.

We may decide to do an autumn 2018 schema change, but this depends on how well our UX efforts progress over the course of winter and spring.

Server update, 2018-02-09

This server release mainly introduces a confirmation request when adding a new release (or a new medium to a release) without setting a format, because entering this information is often skipped, yet the editor usually knows it. It also contains URL cleanup updates and localization bugfixes, and the instrument list template has been rewritten in React. The git tag is v-2018-02-09. Thanks to naiveaiguy and spellew for their contributions!


  • [MBS-9590] – Rewrite the instrument list in React/JSX


  • [MBS-9599] – Translations are not applied on the 404 page
  • [MBS-9600] – Work attribute type and value names are not translated on the work edit form
  • [MBS-9603] – Series ordering type descriptions are not translated on the series edit form

New Feature

  • [MBS-9368] – Ask for confirmation when leaving format empty


  • [MBS-9587] – Add a few Japanese lyrics sites to the whitelist


  • [MBS-9562] – Improve Deezer URL cleanup
  • [MBS-9597] – Update VGMdb URL cleanup to use https
  • [MBS-9612] – Remove locale from URLs