Another AcousticBrainz update and a survey

Last year we started working on features to improve data produced from information about recordings that you submit to AcousticBrainz. First part of it was a way to create datasets that are used to train high-level models. The next were dataset creation challenges.

We already have a significant number of datasets created by AcousticBrainz community. The list of public datasets is available at https://beta.acousticbrainz.org/datasets/list. A couple of days ago our experimental challenge has concluded. It was related to classifying music with and without vocals. You can see final results at https://beta.acousticbrainz.org/challenges/14095b3b-4469-4e4d-984e-ef5f1a55962c.

Your feedback on high-level data

The latest addition to AcousticBrainz is a way to provide feedback about high-level output that you can see on summary pages for recordings. After a model is applied to all of the AcousticBrainz data we can understand how well it performs on a larger scale. This should help us make further improvements to models and underlying datasets. Keep in mind that you need to be logged in with your MusicBrainz account to see this.

Survey about new features

To help us understand how well new features work for you, we created a survey for you to participate in. If you have used AcousticBrainz, please fill out the survey here: https://goo.gl/forms/Oh3a9INBCCsW2I1i1. It shouldn’t take more than 5 minutes. We’ll keep it open for about a week.

Your feedback is very much appreciated. Especially considering that we don’t have a lot of ways to collect it from people. Some come to IRC and tell us about issues they are having, some comment on blog posts or create tickets in JIRA. But at this point we need a better overview of the current state of the project.

Thank you! 🎶

Dataset creation challenges in AcousticBrainz

Datasets are an important part of the AcousticBrainz project. All machine learning models, that are used to calculate high-level information about recordings (genre, mood, danceability, etc; see https://beta.acousticbrainz.org/485bbe7f-d0f7-4ffe-8adb-0f1093dd2dbf for example), first need to be trained on a dataset. Last year we released a platform which allows people to create and evaluate these datasets within AcousticBrainz. We’ve already seen a number of interesting datasets and now we want to take this process to the next step, make it more interesting.

Recently we started working on a new feature that allows us to organize dataset creation challenges. These challenges allow us to directly compare datasets created for the same classification tasks: genre, mood, instrumentation, etc. After a challenge ends, we can use the best models on all of the AcousticBrainz data.

Everyone can participate in a challenge, so we invite you to try the current version of the system at https://beta.acousticbrainz.org/! Right now there’s only one challenge related to classification of music with and without vocals, but we might add more later. To participate in a challenge:

  1. Create a dataset manually or by importing it from a CSV file created externally (this can be done from your profile page). Make sure it has the same structure (set of classes: “with vocals”, “without vocals”) as defined in the challenge requirements.
  2. Once you have built the dataset, select “Evaluate” link on its page to go to the evaluation page. There select a challenge that you would like to submit your dataset to (search for “Classifying vocals”).
  3. Wait for results! We’ll probably post an update once we have something interesting to show.

Please keep in mind that this is a very early prototype, so some issues are to be expected. This is why we ask you to try it and tell us what you think. We encourage you to report any problems or make suggestions in JIRA or in the #metabrainz IRC channel (https://wiki.musicbrainz.org/Communication/IRC). Feel free to use IRC or the comments section if you have any questions or thoughts. Thanks!

We have several more useful features coming up later. The big ones are improvements to the dataset editor, an extension of the API for datasets that was added recently, and a way to collect user feedback on high-level data. The dataset editor should become easier to work with, especially when working with large datasets. The API will be useful for people who want to build their own tools on top of core dataset functionality in AcousticBrainz. And finally, user feedback will allow us and other dataset creators to see how their models perform on a much larger scale.

Notifications and messaging in MetaBrainz projects

During the last MusicBrainz summit in Barcelona we decided to start working on finding possible ways to implement two features that have been requested for a long time:

  1. Messaging between users
  2. Notifications about various actions in MetaBrainz projects

Since MetaBrainz is more than just MusicBrainz these days, we also want to integrate these features into other projects. That, for example, means when a user is reading reviews on CritiqueBrainz they can see notifications about comments on their edits on MusicBrainz. Same applies to messaging. These features are intended to encourage our communities to communicate more easily with each other.

Messaging

http://tickets.musicbrainz.org/browse/MBS-8721

The only ways of communication we have right now are two IRC channels, forums that we plan to replace with Discourse, and comments on individual edits. Sometimes we end up sending private emails to editors for one reason or another. Perhaps it is better to have our own messaging system for this purpose? I imagine it being similar to messaging systems on forums, reddit, etc. We would like to know what you think potential uses are for this and how it might look like to be useful.

Notifications

http://tickets.musicbrainz.org/browse/MBS-1801

Site-based notifications are another thing that people have been asking for a long time. For example, these notifications can be related to edits on MusicBrainz, reviews on CritiqueBrainz, datasets in AcousticBrainz, etc. It can be an addition or replacement for email notifications that we currently have in MusicBrainz. Maybe something similar to the inbox feature that the Stack Exchange network has. People should be able to choose if they want to keep receiving email notifications or only use the new site-based notifications.

Progress so far

We looked at a couple of ways to implement this functionality.

First suggestion was to use the Layer toolkit. The problem with it is that we don’t want to be dependent on closed software and another company’s infrastructure, especially in case of such important features.

Second was to use the XMPP protocol to handle communication and notifications. We tried to implement a proof of concept using this protocol and encountered several issues at the start:

  • It’s unclear how to store messages and process them later;
  • It can be problematic to reuse the same connection in different browser;
  • There are plenty of things that we’ll need to implement on top of this protocol ourselves (like authentication, storage, notifications).

Repository with everything that was implemented so far is at https://github.com/metabrainz/xmpp-messaging-server. Given these problems we started considering implementing our own server(s) for this purpose.

You can take a look at the document where we collect most information about current progress.

Feedback

There’s plenty of feedback on the site-based notifications feature request, and we have a pretty good understanding of what’s needed. This is not the case with the messaging feature. We explored several options for implementing this kind of functionality and decided that it’s time to refresh the list of requirements to get an idea of what needs to be done.

The goal of this blog post is to encourage discussion and gather ideas. If you are interested in these features, please share your thoughts and suggestions.