Last weekend approximately 20 dedicated MusicBrainz fans and customers all got together at Universitat Pompeu Fabra, Barcelona to discuss all things metadata. Now that the weekend has passed, and everyone is back at their respective homes, I can fill you in on how this fantastic summit went. We talked about a lot of topics at the summit, and I encourage you to read the extensive notes that Ian took (a huge thanks!) if you’re interested in all the details.
Friday was a lazy day while people arrived, but we did spend a bit of time at the summit venue. We got an introduction from Music Kickup – a new Finnish startup which offers a cloud-based record label for artists. Spotify presented their ingestion process, which was helpful as at MusicBrainz we’re beginning to plan the new Ingestr project.
On Saturday we got under way with the summit proper. Brewster Kahle from the Internet Archive spoke on how the Archive works, and its plans going forward to try and archive even more about music. The general motivation is to provide listeners with more ways to explore and discover content, and the Archive are looking for ways that MusicBrainz can help by linking its metadata to the Archive’s content.
Next, we moved on to discuss the Artist Image Archive – or more generally, adding more images to MusicBrainz. The conclusion here was that this is certainly a wanted feature, and we would like to try and use Wikimedia Commons to store public domain/CC-licensed artwork and fall back to the Archive for other content. We also lightly talked about adding label images, and the Archive are again happy to host this content if that’s needed.
While waiting for the full CompMusic team to be present, we outlined Ingestr, a forthcoming MusicBrainz tool to work with dumps of metadata. CompMusic then presented what they are working on, and how it ties in with MusicBrainz. They are currently storing some data outside the MusicBrainz schema, but would love to store as much as possible inside our database (and we’d love that too!). They are interested in adding more information to works, such as ragas and talas for Indian classical music.
The final topic on Saturday was to try and get some ideas going on how we want to store events, locations and venues inside MusicBrainz. No conclusions were reached here, but there was a lot of excitement and plenty of consideration of what we want to store, with the suggestion that some of this work might make a fantastic Summer of Code project next year.
We picked up on Sunday, and dived straight in with a discussion on bringing multiple release countries/release dates back to single releases, with a consensus that this should be done. Work here will probably resume in February, with the next schema change.
Dynamic work attributes were next on the agenda, and we sketched out a plan on how we can add new attributes to works that don’t depend on schema changes. This should allow us to add ragas and talas for the CompMusic folks, but also many other interesting properties. Again, work on this one is likely to begin with the next schema change.
Much like Summit 11, the instrument tree came up again. People were in agreement that we want richer data about instruments, and making them entities that can be used in relationships (including between instruments) is probably the way to go.
nikki outlined her proposal on data quality, and we all spent a bit of time discussing what we’re trying to achieve with data quality, how to overcome the social problems with the word ‘quality’, and so on.
Warp brought up the problem of capturing series of releases in MusicBrainz, and we discussed how to solve this problem (for example, consider the ‘Dubstep Allstars’ series of releases). We agreed that the best way to move forward is to introduce the idea of ordered series, and deal with unordered series later.
The difficult topic of box sets came up, and while we didn’t make a clear decision on how to solve this problem, people had the chance to explain solutions as they see them, and everyone now has a good understanding of both the challenges of box sets and the information that we’re trying to capture.
As at Summit 11, we once more discussed the hard problem of genres, as these have become something that customers are increasingly requesting. We outlined the various solutions that other projects use (such as SoundCloud, Music Kickup, and SoundUnwound). This topic didn’t reach a conclusion, but it was good to have some cross-pollination of ideas.
We wrapped up the day with some social topics, and ocharles started a discussion on how we can better promote ourselves and communicate our new features. There was a lot of energy in this discussion, and some of the key ideas were more community interaction on blog posts and a clear ‘checklist’ of what to do when we release big features. kepstin also volunteered to look at rewriting the landing page with something more interesting than the static page we currently have.
The summit has proven once again to be a fantastic experience, not just for the quality of discussions, but also the level of interaction between participants. With the Saturday group meal, socialising at the apartment and continued discussions during breaks at the summit itself, it was great to see people chatting, laughing and generally having a great time.
Thanks to everyone who came for making the summit what it was. We hope to see you all again next year!
