MetaBrainz and the Music Technology Group at Universitat Pompeu Fabra are pleased to announce the first public release of the AcousticBrainz project.
What is AcousticBrainz?
The AcousticBrainz project aims to crowdsource acoustic information for all of the music in the world and make it available to the public. Its goal is to provide music technology researchers and open source hackers with a massive database of information about music.
AcousticBrainz uses a state-of-the-art research project called Essentia (http://essentia.upf.edu/), developed over the last 10 years at the Music Technology Group.
Data generated from processing audio files with Essentia is collected by the AcousticBrainz project and made available to the public under the CC0 license (public domain). In the six weeks since its inception, AcousticBrainz contributors have already submitted data for 650,000 audio tracks using pre-release software.
Today we are releasing client programs to submit data to the AcousticBrainz server and our first public release containing audio features for over 650,000 audio files.
What data does it have?
AcousticBrainz contains information called audio features. These features describe the acoustic characteristics of music and include low-level descriptors such as spectral information and tempo, plus high-level descriptors for genres, moods, keys, scales and much more. The features are explained in more detail at http://acousticbrainz.org/sample-data
How can I get it?
You can access AcousticBrainz data via our API. See details at http://acousticbrainz.org/api
We also provide downloadable dumps of the whole dataset. You can download it (all 13 gigabytes!) at http://acousticbrainz.org/download
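As a rough illustration of working with per-recording data, here is a minimal Python sketch. It makes several assumptions that are not confirmed by this post: the `<mbid>/<level>` URL layout, the `low-level` and `high-level` path names, and the `rhythm.bpm`, `tonal.key_key` and `tonal.key_scale` field names are all illustrative, and the sample document is hand-written rather than real output. Check http://acousticbrainz.org/api and http://acousticbrainz.org/sample-data for the actual endpoints and fields. The sketch makes no network requests.

```python
import json
import uuid

# Assumption: base URL and "<mbid>/<level>" path layout are illustrative only;
# see http://acousticbrainz.org/api for the real endpoints.
BASE_URL = "http://acousticbrainz.org"


def feature_url(mbid: str, level: str = "low-level") -> str:
    """Build a lookup URL for one recording, validating the MBID first."""
    if level not in ("low-level", "high-level"):
        raise ValueError(f"unknown level: {level!r}")
    canonical = str(uuid.UUID(mbid))  # raises ValueError on malformed IDs
    return f"{BASE_URL}/{canonical}/{level}"


def summarize(document: dict) -> dict:
    """Pull a few descriptors out of a document, tolerating missing keys."""
    rhythm = document.get("rhythm", {})
    tonal = document.get("tonal", {})
    return {
        "bpm": rhythm.get("bpm"),
        "key": tonal.get("key_key"),      # assumed field name
        "scale": tonal.get("key_scale"),  # assumed field name
    }


# Hand-written sample document, not real AcousticBrainz output.
SAMPLE = json.loads(
    '{"rhythm": {"bpm": 120.5}, "tonal": {"key_key": "A", "key_scale": "minor"}}'
)

if __name__ == "__main__":
    print(feature_url("c44f7e2d-10ed-4513-8306-5ef44119b196"))
    print(summarize(SAMPLE))
```

The same `summarize` function could be applied to each document in a downloaded dump, provided the field names match what the dump actually contains.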
What can I do with it?
We hope that this database will spur the development of new music technology research and allow music hackers to create new and interesting recommendation and music discovery engines. Here are some ideas of things we would like to see:
- Music discovery
- Playlist generation
- Improving the state of the art in genre recognition
- Analytics on the musical structure of popular music
- and more!
This is one of the largest datasets of its kind available for research, and the only one of this size that we know of that offers both freely available data and the reference source code used to compute it.
How can I contribute?
If you are a music researcher, you can help us by contributing to the Essentia project. Go to the Essentia homepage to see how you can do this. If you do something cool with the data, let us know. We’d like to start a “made with AcousticBrainz” page where we will showcase interesting projects.
If you have any audio files, we would love for you to contribute audio features to our project. You can do this by downloading our submission clients from http://acousticbrainz.org/download. We provide clients for Windows, Mac, and Linux.
If you find any bugs or errors in the AcousticBrainz stack, please let us know! Report issues at http://tickets.musicbrainz.org/browse/AB.
We can’t wait to see what kind of things you will make with our data.
The AcousticBrainz team.
12 thoughts on “Announcing the AcousticBrainz project”
What is the difference between AcousticBrainz and AcoustID?
We explain the difference between AcousticBrainz and AcoustID on our FAQ page: http://acousticbrainz.org/faq
An algorithm that could use some tweaking: About 12 or so out of 15 tracks on http://musicbrainz.org/release/53e51fbd-54c8-4c44-8cc4-498bfda9666b, a recording of Stravinsky’s “The Firebird”, are described as “undanceable”. It’s a ballet, so specifically written as music to be danced to.
This is a great question, and deserving of more than a comment to answer. I’ve started writing a blog post explaining how many of these values are calculated and will post it in the next few days. The short answer is that of course the Firebird Suite is danceable, just not with the same definition that our computer systems use.
Good, I’ll look forward to seeing the blog posts, as this sounds like a very exciting project, and I’d like to get a grasp on what the terminology for each field implies in terms of practical use. In the meantime I’m working on submitting chunks of my own music files.
Such a project could help build tools for finding suitable music to illustrate a given purpose. If it is fun and easy to use, this could help the diversity of music. I would like to have such a tool for my own needs (family videos, blog articles (to create a mood), etc.).
I had no idea chachacha was so popular… http://acousticbrainz.org/c44f7e2d-10ed-4513-8306-5ef44119b196 for instance.
Staffan: There are some relevant remarks in the comments on http://blog.musicbrainz.org/2014/11/21/what-do-650000-files-look-like-anyway/