Our servers are on the move again!

You might remember that a few years ago we moved all our servers from the US to Europe. Sadly, that has proven too costly* and prone to fires, so we have decided our servers will continue their travel east and move to India, where hosting them will be significantly cheaper and cooling can be provided by sun-covering swarms of locusts. Most of the non-slacking members of our dev team are based in India by now anyway, so it feels like the right place!

As part of our cost-cutting process, and since air travel is not environmentally friendly anyway, we sent our servers east by boat. Unluckily, that means they had to wait their turn for quite a while in the only traffic jam you can see from space. Luckily, that is starting to get resolved now, but it will still take quite a while for the servers to clear the traffic jam, reach their destination and then be located among the twenty thousand containers on the ship carrying them, especially with all the other jammed ships arriving at the same time! We’re hoping to be done with it all in a couple of months though, after which you can hopefully expect a return to our usual (not-particularly-)fast service.

In the meantime, our whole infrastructure is being hosted on some old servers that our friends at the Universitat Pompeu Fabra in Barcelona donated to us. They claimed the servers were getting too old for use in a rigorous academic environment, which, as we all know, in Spain requires systems to work continuously for a few hours every weekday with only a short break for a midday nap. These donated servers are now running from our executive director’s basement. Any issues you have noticed recently are solely caused by this (and have nothing to do at all with the skill and attention span of our development team).

* It could be argued that we should be able to cover the costs given we have had several years of good financial stability by now. While that is true, when we realised that the alternative is to cut costs and spend the savings on fancy ham and beers, the right choice became obvious to us.

A very important announcement

In these troubling times (New Roman) it has come to our Attention that our Esteemed program “Picard” (from now on known as “The Captain”) is written with Very Silly™ code. This will just not do!

The MetaBrainz Foundation is a very serious organisation! It will not stand for such Silly™ things as Moose, Tissues or Membranophones!

The program (from now on known as “Sir Patrick”) has in addition to pythonesque codings also several oblique references to: burlesque starships, electronic hair-colour, Monty (including “the full”), several references to technologically minded stick-figures, John Cage, and disused towels left by your mother!

For now the program’s (from now on known as St. Paddy) most recent new release (version 2.45!§🥷) vil quickly bee retracted (bzzt!) and replaced by Harder, Better, Faster, Stronger Picard (from now on known as (♫♪♬) (version 2.4√-1€DONTBLINK;)) – this to Prevent excesses such as: Rock climbing Monkies, Tobogganing Dates, Breakdancing Pastries and–

Right, that’s enough of that, we apologise for the previous part of this blog post, it is Very Silly™ and the MetaBrainz Foundation is a very serious organisation which does not abide with such silliness!

We will now return to our regular scheduled blog post! –

Now look here!

What? Stop this at once!

That’s it, I quit!

Automating the voting system

MetaMetaData

For the last several years, one of the things our community has struggled with is a lack of active voters. We’ve tried to implement various measures to decrease the need for voters and the load on the wonderful ones who actually do actively look through edits and help vote on them (e.g., making more edits auto-edits and decreasing the amount of time edits stay open). However, the edit queue is still quite unwieldy, and as such we’ve kept trying to come up with other ways to decrease the load on our contributors.

Over the past few months since our last summit, we’ve been working on training AIs, both for recommendation engines and data analytics, and for helping out with spam, but it soon became apparent that we had another valuable dataset: our history of 15,693,824 votes from 16,336 voters and 56,374,198 edits from 2,007,134 editors. It turns out that an unintended side effect of the editing and voting system is that it creates a paper trail of our habits as a community and our collective mind.

A paper trail that you could, say, train a neural network on. And that’s just what we did.

By feeding it half of the data from each of our top voters, we’ve been able to train our network to replicate that voter’s personality with 96.4% accuracy when using the other half as test data. That figure is the average for 300 bots, each based on one of our top 300 voters.
We were really impressed with the results but the story doesn’t stop there…

Meet BrainzVoter

The next logical step was to create our own Frankenstein’s monster. By training on 70% of our entire set of votes, we gave birth to a voting bot that represents the essence of our community. “BrainzVoter”, as we dubbed it, is precise: it scores a staggering 98.9% accuracy when tested against the other 30% of our dataset.

To quote the late Terry Pratchett:

Ankh-Morpork had dallied with many forms of government and had ended up with that form of democracy known as One Man, One Vote. The Patrician was the Man; he had the Vote.

Edit filters

In view of the recent developments on net neutrality taken by the European Union with articles 11 & 13/17, MusicBrainz is taking measures to protect against copyright infringement: we’re implementing automatic edit filters. BrainzVoter will use the latest in NLP technology to understand what you, the editors, write in your edit notes, and use this understanding to vote on your edit. It will also inspect any URLs included in the edit note to cross-reference the data. The aggregate data will not be available to the public.

Edits with better and clearer notes will become more likely to pass. Consider this a good opportunity to (re‐)read How to Write Edit Notes!

How will this affect me as an editor?

Not much will change, and you can continue doing what you were doing before! We recommend that you take the time to make clear statements in your edit notes.
You will also be able to use a system of tags to express intent, using for example #typo #correction in the content of your edit text. Syntax highlighting and shortcuts will be available in the text editor.
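As a rough illustration of how such intent tags could be pulled out of an edit note, here is a minimal Python sketch. The exact tag syntax is not specified above, so the pattern (a `#` at the start of a word, followed by letters, digits, hyphens or underscores) and the `extract_tags` helper are assumptions made for this example, not how BrainzVoter actually parses notes.

```python
import re

# Assumed tag syntax (not specified in the post): '#' at the start of a word,
# followed by a letter and then letters, digits, underscores or hyphens.
TAG_PATTERN = re.compile(r"(?<!\S)#([A-Za-z][\w-]*)")


def extract_tags(edit_note):
    """Return the lowercased tags found in an edit note, in order, without duplicates."""
    tags = []
    for match in TAG_PATTERN.finditer(edit_note):
        tag = match.group(1).lower()
        if tag not in tags:
            tags.append(tag)
    return tags


note = "Fixed the track title per the back cover scan. #typo #correction"
print(extract_tags(note))
```

Requiring the `#` to start a word means URL fragments such as `page#anchor` in an edit note are not mistaken for tags.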

In the end, by removing the need for humans to look over edits, the bot should give you, the editor, more time to add and edit and fix data in MusicBrainz, without having to spend time checking everyone else’s edits or worry about other editors disagreeing with yours!

After a brief trial period on MusicBrainz, this system will be adapted and also rolled out to BookBrainz.

We hope you will share our excitement for the benefits of automation and help us improve our training models over time. I, for one, welcome our AI overlords.

Deprecating MBIDs

This post is an April Fools joke. Rest assured, we have no intention of changing the MBID system that MusicBrainz currently uses.

But, like all good parody news items, there is an element of truth behind this post. The announcement of the Echo Nest API shutdown is real, and with this change you will no longer be able to use the Echo Nest IDs to look up information. This particularly hurts users of the Million Song Dataset, which maps each track to an Echo Nest ID. The new Spotify API isn’t even providing any compatibility API or ID mapping, leaving users to look up 1 million Spotify IDs in the remaining months that the old Echo Nest API will remain available.

At MusicBrainz, we understand the importance of a stable identifier system. That’s why, 16 years ago, we picked these unwieldy-looking UUID identifiers, which have since stood the test of time, with room to continue growing. Today, you can look up an MBID made 16 years ago, and it will still work another 16 years in the future.
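Since MBIDs are plain UUIDs, any standard UUID library can check whether a string is at least shaped like one. The sketch below uses Python's built-in `uuid` module; the `is_valid_mbid` helper is a hypothetical name for this example, the sample value is a freshly generated random UUID rather than a real MBID, and the check says nothing about whether the identifier exists in MusicBrainz.

```python
import uuid


def is_valid_mbid(candidate):
    """Return True if the string is a canonical hyphenated UUID (case-insensitive)."""
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return False
    # uuid.UUID also accepts braces, "urn:uuid:" prefixes and unhyphenated hex,
    # so compare against the canonical form to reject those variants.
    return str(parsed) == candidate.lower()


# A random UUID standing in for an MBID (illustrative only, not a real entity).
example = str(uuid.uuid4())
print(example, is_valid_mbid(example))
print(is_valid_mbid("not-an-mbid"))
```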

Hello all,

Following Echo Nest’s bold announcement that Echo Nest IDs are being replaced by Spotify IDs, we figured it was time to make our own ID change public as well – MBIDs were a fantastic idea 16 years ago, but let’s face it, they’re not the most beautiful thing around, so our MBIDs will now also be replaced by Spotify IDs to help with a proper mapping across tools. Anything without a Spotify mapping will simply get purged. This should greatly simplify the data we have and remove any doubt about whether some releases exist or not – if they’re on Spotify, they clearly exist!

We would like to commend Echo Nest on their brave leadership in this, giving us the courage to move on from our ancient heritage and try new things. With the speed technologies evolve in this digital age, it can be hard to keep up with things and keep things fresh, but Echo Nest is showing the way forward, and we’re delighted to be able to follow so quickly in their path.

I hope you all will welcome this bold move by our team. We hope to have it ready by the next schema change. We know we’re excited! 😀

PS. No, we will not provide a mapping between MBIDs and the new Spotify IDs. We trust our data users to be capable of setting things up on their own. Happy hacking! 🙂