We’re pleased to announce that we have just released acoustic similarity in AcousticBrainz. Acoustic similarity is a technique to automatically identify which recordings sound similar to other recordings, using only the recordings themselves, and not any additional metadata. This feature is available via the AcousticBrainz API and the AcousticBrainz website, from any recording page. General documentation on acoustic similarity is available at https://acousticbrainz.readthedocs.io/similarity.html.
This feature is based on work started by Philip Tovstogan at the Music Technology Group, the research group that provides the Essentia feature extractor that powers AcousticBrainz. The work was continued by Aidan Lawford-Wickham during Summer of Code 2019. Thanks, Philip and Aidan, for your work!
From the recording view on AcousticBrainz, you can choose to see similar recordings and choose which similarity metric you want to use. Then, a list of recordings similar to the initial recording will be shown.
These metrics are based on different musical features that the AcousticBrainz feature extractor identifies in the audio file. Some of these features are timbral (generally, what something sounds like), some are rhythmic (related to tempo or perceived pulses), and some are AcousticBrainz’s high-level features (hybrid features that use our machine learning system to identify characteristics such as genre, mood, or instrumentation).
One thing that we can immediately see in these results is that the same recording appears many times. This happens for two reasons: AcousticBrainz stores multiple submissions for the same MBID, and it sometimes receives submissions for the same recording under different MBIDs, when MusicBrainz itself lists that recording under more than one MBID. This is actually really interesting! It shows that we are successfully identifying two different submissions in AcousticBrainz as being the same, using only acoustic information and no metadata. Using the API, you can ask to remove these duplicated MBIDs from the results, and we plan to use MusicBrainz metadata to filter more of these results when needed.
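To make the idea of removing duplicated MBIDs concrete, here is a minimal sketch that does the filtering on the client side. The result shape (MBID, submission offset, distance) and the function name are assumptions made for illustration; they are not the actual AcousticBrainz API response format or its built-in duplicate removal.

```python
from typing import List, NamedTuple


class SimilarResult(NamedTuple):
    # Hypothetical result shape; the real API response may differ.
    mbid: str        # MusicBrainz recording ID
    offset: int      # which submission for this MBID
    distance: float  # smaller means more similar


def remove_duplicate_mbids(results: List[SimilarResult]) -> List[SimilarResult]:
    """Keep only the closest submission for each MBID, ordered by distance."""
    seen = set()
    unique = []
    for result in sorted(results, key=lambda r: r.distance):
        if result.mbid not in seen:
            seen.add(result.mbid)
            unique.append(result)
    return unique


# Made-up example data: three submissions for one MBID, one for another.
results = [
    SimilarResult("mbid-a", 0, 0.01),
    SimilarResult("mbid-a", 1, 0.02),
    SimilarResult("mbid-b", 0, 0.05),
    SimilarResult("mbid-a", 2, 0.07),
]
unique_results = remove_duplicate_mbids(results)
print([r.mbid for r in unique_results])
```

Note that this only collapses repeated MBIDs. The same recording submitted under two different MBIDs would still appear twice, which is where MusicBrainz metadata could help.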
We haven’t yet performed a thorough evaluation of the quality of these similarity results. We’d like people to use them and give us feedback on what they think. In the future we may look at performing some user studies in order to see if some specific features tend to give results that people consider “more” similar than others. AcousticBrainz has a number of additional features in our database, and we’d like to experiment with these to see if they can be used as similarity metrics as well.
The fact that we can identify the same recording as being similar even when the MusicBrainz ID is different is interesting. It could be useful to use this similarity to identify when two recordings could be merged in MusicBrainz.
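One way this could work, as a rough sketch only: take the similarity results for a recording and flag any result with a different MBID whose distance falls under a small threshold. The input format, the function name, and the threshold value are all assumptions for illustration. We have not evaluated what distance would reliably indicate the same recording, so anything flagged this way would still need review by an editor.

```python
from typing import Iterable, List, Tuple


def merge_candidates(
    query_mbid: str,
    neighbours: Iterable[Tuple[str, float]],
    max_distance: float = 0.01,  # assumed threshold, not an evaluated value
) -> List[Tuple[str, float]]:
    """Return (mbid, best_distance) pairs for other MBIDs that sound near-identical."""
    best = {}
    for mbid, distance in neighbours:
        if mbid == query_mbid:
            continue  # another submission of the same MBID, nothing to merge
        if distance <= max_distance:
            best[mbid] = min(distance, best.get(mbid, distance))
    return sorted(best.items(), key=lambda item: item[1])


# Made-up example data: (mbid, distance) pairs returned for "mbid-a".
neighbours = [
    ("mbid-a", 0.0),
    ("mbid-c", 0.004),
    ("mbid-c", 0.002),
    ("mbid-d", 0.3),
]
candidates = merge_candidates("mbid-a", neighbours)
print(candidates)
```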
The data files used for this similarity are stand-alone, and can be used without additional data from AcousticBrainz or MusicBrainz. We’re looking at ways that we can make these data files downloadable so that developers can use them without having to query the AcousticBrainz API. If you think that you might be interested in this, let us know!