Acoustic fingerprinting at MusicBrainz: The Future

My last post on the current state of the TRM fingerprinting solution got quite a bit of response; I was quite amazed by it, really. Personally, I think people still put too much emphasis on TRM and the role it plays within MusicBrainz, but until I provide a new tagging solution there aren’t any concrete points to discuss.

Given the feedback I’ve gotten, I’d like to state a reformulated vision with regard to acoustic fingerprinting and tagging here at MusicBrainz. The two points that have received the most feedback concern acoustic fingerprinting and downloading large index files in order to use the tagger.

Acoustic fingerprinting: Since so many people professed their love for TRM and acoustic fingerprinting in general, we will do the following things:

  1. Keep TRM alive.
  2. Work to create an open replacement for TRM. See the musicbrainz-devel mailing list for discussion on this topic, and join in there if you would like to help out. The founder of Tuneprint has recently volunteered to help build this new solution and I expect that his presence in this project should stir things up a bit.
  3. When #2 is operational, we will start a gradual migration to the new server. TRM is not going away tomorrow! Got it?

The obvious problem is if #2 does not come to fruition. If you care about TRM and acoustic fingerprinting here at MusicBrainz, you should go check out the discussion on the devel mailing list and lend a hand. If the replacement doesn’t come about and the TRM server stops being useful, then we’ll eventually turn the TRM server off.

Picard & large indexes: The Picard tagger with Lucene support will progress as planned — the only change so far will be that I will provide one machine for use as a centralized lookup server that will not require you to download the massive text index. However, I expect that Picard with Lucene will be a popular tagging tool, and that the server will get overloaded and slow in the space of a few months. Given that, we’ll have complete indexes available for people to download.

I predict loads of people will opt to download the text index, since a 250 MB download will be a lot faster than trying to tag their 10,000-file collection on an overloaded server that performs 10 lookups per minute for them.
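To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The collection size, lookup rate and index size come from the paragraph above (reading the index size as megabytes); the download speed is purely an assumption for illustration, and the function names are hypothetical. It also ignores the time the local lookups take once the index is downloaded.

```python
# Back-of-the-envelope comparison of the two tagging routes.
FILES = 10_000                 # collection size, from the post
LOOKUPS_PER_MINUTE = 10        # overloaded-server rate, from the post
INDEX_SIZE_MB = 250            # index size, from the post (read as megabytes)
ASSUMED_DOWNLOAD_MBIT_S = 1.0  # ASSUMPTION: link speed, not stated in the post


def server_tagging_minutes(files: int, lookups_per_minute: float) -> float:
    """Minutes needed to look up every file on the shared server."""
    return files / lookups_per_minute


def download_minutes(size_mb: float, mbit_per_s: float) -> float:
    """Minutes needed to download the index at the given link speed."""
    seconds = size_mb * 8 / mbit_per_s  # megabytes -> megabits -> seconds
    return seconds / 60


server = server_tagging_minutes(FILES, LOOKUPS_PER_MINUTE)
download = download_minutes(INDEX_SIZE_MB, ASSUMED_DOWNLOAD_MBIT_S)
print(f"Server lookups: {server:.0f} minutes ({server / 60:.1f} hours)")
print(f"Index download: {download:.0f} minutes")
```

With these figures the server route takes 1,000 minutes, or nearly 17 hours, while the download takes about 33 minutes on the assumed 1 Mbit/s link. A slower or faster connection changes the second number proportionally.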

Thanks for all the feedback!

UPDATE: PLEASE stop telling me how much the large index would cramp your style and how much the fingerprinting has saved you. I know!

3 thoughts on “Acoustic fingerprinting at MusicBrainz: The Future”

  1. will be able to run a fairly hefty public Lucene search server. I think we can squeeze a few thousand album lookups a minute out of a fairly low-end dual CPU box.
    Can’t wait for Picard to hit the streets!

  2. Random thoughts:

    1) Release Lucene downloads only once a month, so that BitTorrent is a viable option for distribution, and have Picard fall back to the public server if a disc isn’t found so it can find the newest-and-bestest discs.

    2) Have Picard count the number of times a user uses the public Lucene server, and every x times, pop up a “Hey, would you like to download the indexes? It’ll go much faster.” or “Hey, would you like to donate some cash?” style message box.

