Server updated finally!

The server has been updated – this release includes a redesigned navigation system, changes to the Autofix (a.k.a. Guess Case) tool and Album Editor. Track durations can now be edited and a number of smaller bugs have been fixed.

A big thanks goes out to g0llum, lukz, matt and djce for making this release a reality. Now that we have this pesky re-org release out of the way, we hope to bring you several more releases before the end of the year.

Acoustic fingerprinting at MusicBrainz: The Future

My last post on the current state of the TRM fingerprinting solution got quite a bit of response; I was quite amazed by it, really. Personally, I think people still put too much emphasis on TRM and what role it plays within MusicBrainz, but without me providing a new tagging solution there aren’t any concrete points to discuss.

Given the feedback I’ve gotten, I’d like to state a reformulated vision with regard to acoustic fingerprinting and tagging here at MusicBrainz. The two points that have received the most feedback concern acoustic fingerprinting and downloading large index files in order to use the tagger.

Acoustic fingerprinting: Since so many people professed their love for TRM and acoustic fingerprinting in general, we will do the following things:

  1. Keep TRM alive.
  2. Work to create an open replacement for TRM. See the musicbrainz-devel mailing list for discussion on this topic and if you would like to help out. The founder of Tuneprint has recently volunteered to help build this new solution and I expect that his presence in this project should stir things up a bit.
  3. When #2 is operational, we will start a gradual migration to the new server. TRM is not going away tomorrow! Got it?

The obvious problem is if #2 does not come to fruition. If you care about TRM and acoustic fingerprinting here at MusicBrainz, you should go check out the discussion on the devel mailing list and lend a hand. If it doesn’t come about and the TRM server stops being useful, then we’ll eventually turn the TRM server off.

Picard & large indexes: The Picard tagger with Lucene support will progress as planned — the only change so far will be that I will provide one machine for use as a centralized lookup server that will not require you to download the massive text index. However, I expect that Picard with Lucene will be a popular tagging tool, and that the server will get overloaded and slow in the space of a few months. Given that, we’ll have complete indexes available for people to download.

I predict loads of people will opt to download the text index, since a 250MB download will be a lot faster than trying to tag their 10,000-file collection on an overloaded server that performs 10 lookups per minute for them.
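
To put rough numbers on that prediction, here is a back-of-the-envelope sketch. The 10,000 files, 10 lookups per minute and 250MB come from the paragraph above; the 1 Mbit/s download speed is purely an assumed figure, so substitute your own.

```python
# Back-of-the-envelope comparison: tag through a slow shared server
# versus download the whole text index first.

def server_tagging_hours(num_files: int, lookups_per_minute: float) -> float:
    """Hours needed to look up every file at a fixed server rate."""
    return num_files / lookups_per_minute / 60


def download_minutes(size_megabytes: float, megabits_per_second: float) -> float:
    """Minutes needed to download a file of the given size (1 byte = 8 bits)."""
    return size_megabytes * 8 / megabits_per_second / 60


if __name__ == "__main__":
    hours = server_tagging_hours(10_000, 10)   # figures from the post
    minutes = download_minutes(250, 1.0)       # 1 Mbit/s is an assumed link speed
    print(f"Overloaded server: about {hours:.1f} hours")
    print(f"Index download:    about {minutes:.0f} minutes")
```

Under those assumptions the overloaded server needs roughly 16.7 hours, against about half an hour for the download.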

Thanks for all the feedback!

UPDATE: PLEASE stop telling me how much the large index would cramp your style and how much the fingerprinting has saved you. I know!

General update: What's up with TRM??

This general update is way overdue. A lot of things have been happening behind the scenes and it’s time to let everyone know where things in the MusicBrainz world are headed. I’ll start off with TRM, since that is a hot discussion topic on the musicbrainz-users mailing list right now.

The TRM server (TRMs are acoustic fingerprints that MusicBrainz uses to identify music tracks) is constantly overloaded and can only handle a database size of about 2.2GB before it crashes. To prevent crashes, we prune the database by throwing out the least used TRMs, which implicitly discards work that our users have done. Not good. For the TRM server to perform at a reasonable level, the entire database needs to be kept in RAM. Thus our server has 5GB of RAM and it still can’t keep up. The fact that this problem hasn’t reared its ugly head in public is a testament to Dave Evans’ skill in keeping the TRM server ticking.

Furthermore, TRMs have shown themselves not to be as unique as we would’ve liked. For example, take a look at the “TRMs with at least 5 tracks” report: 4400 pages (!) of TRMs that I would consider to be sub-optimal. One example TRM (not silence, on page 2) has 104 tracks associated with it. Given this, TRM is not some sort of magical solution that with great authority tells the tagger what metadata to apply to a track. Instead, it’s best to think of TRM as a system that lets you guess which few dozen tracks a file could be matched to. There is a lot of logic in the tagger that makes up for the shortcomings of TRM.
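
A report like the one mentioned above boils down to grouping tracks by fingerprint and keeping the crowded groups. The sketch below is only an illustration: the `(trm_id, track_id)` pair layout and the function name `ambiguous_trms` are hypothetical and do not reflect the actual MusicBrainz schema or report code.

```python
from collections import defaultdict


def ambiguous_trms(pairs, min_tracks=5):
    """Return {trm_id: sorted track ids} for every TRM that is linked
    to at least `min_tracks` distinct tracks."""
    tracks_by_trm = defaultdict(set)
    for trm_id, track_id in pairs:
        tracks_by_trm[trm_id].add(track_id)
    return {
        trm_id: sorted(tracks)
        for trm_id, tracks in tracks_by_trm.items()
        if len(tracks) >= min_tracks
    }


if __name__ == "__main__":
    # Made-up sample data: "trm-a" collides across six tracks, "trm-b" across two.
    sample = [("trm-a", n) for n in range(1, 7)] + [("trm-b", 1), ("trm-b", 2)]
    for trm_id, tracks in ambiguous_trms(sample).items():
        print(trm_id, "matches", len(tracks), "tracks")
```

Every TRM that shows up in such a report leaves the tagger with several candidate tracks instead of one answer, which is why the extra matching logic is needed.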

Thus, TRM has two major problems: it’s not accurate enough and it doesn’t scale well to the size that MusicBrainz has grown to. The system still functions, but I expect it to start breaking down and becoming less useful over time. We have the following options:

  1. Find a replacement for TRM: Relatable doesn’t seem to be in business anymore, or at least they are in deep hibernation. No other companies that I have approached were interested in sharing their technology with MusicBrainz. (For the record, I’ve tried with 3 companies, including a couple of on-site visits in Europe).
  2. Create our own TRM solution: This is a very large endeavour, at least a year if not two of hard work. I’d rather work to improve MusicBrainz itself than hack on acoustic fingerprint software.
  3. Throw more resources at TRM: We’re still lacking the funds for more resources, and the same argument in #2 still applies.
  4. Do something else: Find some technology that can replace TRM.

Given my babbling about Lucene, I think it’s a foregone conclusion that #4 is the way to go. Sometime this fall, I will release a Picard tagger with a Lucene text indexing engine to replace the current MusicBrainz Tagger. The benefits of this new tagger will be:

  1. It will distribute the load on the server, since currently a large chunk of the server load goes to supporting tagger users. And a large chunk of tagger users never really contribute data to MusicBrainz or make cash donations to support the project. So, moving that traffic off the main server will allow people who want to edit/vote on the data to focus on their work.

    Given that most files in the wild nowadays have some metadata, a text index will work well. Lucene is great at taking crappy data input and coming up with something useful. If TRM gets us into the ballpark and then additional heuristics do the final legwork, Lucene will give us a much better guess to start with than TRM ever did. Thus, overall tagging quality will improve greatly.
  2. A Lucene tagger will work much faster than the TRM-based tagger ever did. 2-5 seconds per track was not unusual with TRM; with Lucene we’ll see 2-5 tracks per second, if not many more.
  3. Since we will no longer have to decode files to identify them, it will be easier for us to support new formats. It’s less work overall.
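
The idea of turning messy metadata into a usable first guess can be sketched in a few lines. This is not Lucene and not Picard code: the standard-library `difflib` stands in for a real text index (Lucene uses an inverted index with relevance scoring, which works quite differently), and the three-entry catalogue and the names `normalize` and `best_match` are hypothetical.

```python
from difflib import SequenceMatcher

# Tiny stand-in catalogue of (artist, album, title); a real index holds far more.
CATALOGUE = [
    ("The Beatles", "Abbey Road", "Come Together"),
    ("The Beatles", "Abbey Road", "Something"),
    ("Pink Floyd", "The Dark Side of the Moon", "Money"),
]


def normalize(text):
    """Lower-case, turn underscores into spaces, collapse whitespace."""
    return " ".join(text.lower().replace("_", " ").split())


def best_match(artist, album, title, catalogue=CATALOGUE):
    """Return (entry, score) for the catalogue entry closest to the given tags."""
    query = normalize(f"{artist} {album} {title}")

    def score(entry):
        return SequenceMatcher(None, query, normalize(" ".join(entry))).ratio()

    best = max(catalogue, key=score)
    return best, score(best)


if __name__ == "__main__":
    # Sloppy tags of the kind found on files "in the wild".
    entry, similarity = best_match("beatles", "abbey rd", "come_together")
    print(entry, f"similarity={similarity:.2f}")
```

A file with no tags at all gives this kind of lookup nothing to work with, which is the trade-off against fingerprinting.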

This approach also has the following downsides:

  1. It will no longer support identifying completely anonymous files. Files that have no ID3 tags and are named test1.mp3, test2.mp3 will simply not stand a chance at identification. I realize that there is great romance associated with this concept, but in reality most people have files that have some metadata in them, and thus will stand a good chance of being identified.
  2. You will need to download a 250MB Lucene index to tag your collection. This is a pretty big hurdle, but if BitTorrent can routinely help people download 650MB movies off the net, it should help us distribute our search indexes. After the first release of a Lucene-enabled Picard, we will investigate P2P searching methods that will allow people who have no index to use other people’s indexes (if they allow that).

So, the roadmap for this looks like this:

  1. Release Picard 0.5.0 in the next few weeks and start putting it on the main page as an alternative to the MB tagger.
  2. Release Picard 0.6.0 with full Lucene support and offer that as the main tagging solution for MB.
  3. When the TRM usage drops because of adoption of Picard 0.6.0, we will start phasing out TRM.

There you have it: that’s the current state of TRM and how we hope to solve the problems that it presents us with.

Bad news: Picard on OS X

In the last few days I’ve been playing around with Picard on OS X. After fixing a few bugs in libtunepimp that prevented it from compiling on OS X, I managed to get Picard to come up. However, there are so many UI bugs that it is essentially unusable:

  1. Drag and drop does not work
  2. Some options dialog items won’t un/check
  3. Adding files from Add Files dialog doesn’t work
  4. The UI is butt-ugly

This is the same code that has undergone a fair amount of debugging on Windows and Linux. Given that the code works fairly well on those two systems, I have to suspect the wxWidgets toolkit on Mac OS X. I looked into a number of the UI bugs listed above, only to be stumped on multiple occasions. The code looks OK and works great on two platforms. No amount of tweaking the code allowed me to make any headway on any of the bugs.

My conclusion: wxWidgets on OS X, even the 2.6.x version, is still not ready for prime time. Thus, I’m sad to say, Picard won’t be coming to OS X soon. If someone has more experience with wxWidgets on OS X and would like to take a stab at looking at these bugs, please do. At this point I should spend my time on bugs that will make Picard better on the two platforms where there is hope.

I’m bummed. 🙁

German mirror online!

After months of tinkering, I’m pleased to announce that MusicBrainz now has a mirror in Germany. The de.musicbrainz.org mirror is graciously being sponsored by HousePool Media International Group. Many thanks to Carsten Marmulla for working hard over a number of months to find hardware and bandwidth to support this mirror.

Our two mirrors (.de and .nl) are currently underutilized, but the upcoming release of Picard will have support for tagging against mirror servers. We’ll have to encourage users to use the mirrors for tagging, so that the main server can stay available for people wanting to make changes to the database or vote on pending changes.

Summer is over!

Well, almost. I’m back from OSCON, Foo Camp, Burning Man and the Future of Music Conference. Traveling was fun, but I’m ready to wait for the not-so-nice weather and cuddle up with a computer and get some serious MusicBrainz work done.

The good news is that the data licensing revenue should start rolling in within a few weeks, which means that I get to keep working on MusicBrainz full time! Full time and paid — at first it won’t be much of a paycheck, but it should pay the bills. Maybe next year we can work towards a full paycheck — we’ll see.

Here is my todo list for the near future:

  1. Whip mirror servers into shape
  2. Sign more license deals
  3. Get the menu server release out the door
  4. Fix AR bugs, improve related artists, hammer out a few new server features.
  5. Release Picard 0.5.0, libtunepimp and libmusicbrainz — all of these desperately need new releases.

Of course there are lots more things on my todo list, but these are the top 5 items. Stay tuned for more info!

June finances posted

I’ve finally posted the MetaBrainz Finances for June. I had to write a script that would take PayPal’s tab-separated format and crank out a QuickBooks-compatible IIF file. I’d rather spend time hacking on MusicBrainz proper, but ya gotta do what ya gotta do.

(This is one of those times where my programming skills saved my bacon. What do people who can’t program do? Lots of tedious work?)
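
For the curious, a converter of this kind might look roughly like the sketch below. It is not the script that was actually used. The PayPal column names (`Date`, `Name`, `Gross`), the account names (`PayPal`, `Donations`) and the memo text are all assumptions, and the output follows the general `!TRNS` / `!SPL` / `!ENDTRNS` layout of IIF files, so check it against a real QuickBooks import before relying on it.

```python
import csv
import io
from decimal import Decimal

# IIF is itself tab-delimited: header rows start with "!", data rows follow.
IIF_HEADER = (
    "!TRNS\tTRNSTYPE\tDATE\tACCNT\tNAME\tAMOUNT\tMEMO\n"
    "!SPL\tTRNSTYPE\tDATE\tACCNT\tNAME\tAMOUNT\tMEMO\n"
    "!ENDTRNS\n"
)


def paypal_tsv_to_iif(tsv_text, bank_account="PayPal", income_account="Donations"):
    """Convert tab-separated rows with Date/Name/Gross columns (assumed names)
    into IIF deposit transactions. Account names are placeholders."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = [IIF_HEADER]
    for row in reader:
        amount = Decimal(row["Gross"].replace(",", ""))
        date, name = row["Date"].strip(), row["Name"].strip()
        memo = "PayPal donation"
        # The TRNS line and its SPL line must sum to zero.
        out.append(f"TRNS\tDEPOSIT\t{date}\t{bank_account}\t{name}\t{amount:.2f}\t{memo}\n")
        out.append(f"SPL\tDEPOSIT\t{date}\t{income_account}\t{name}\t{-amount:.2f}\t{memo}\n")
        out.append("ENDTRNS\n")
    return "".join(out)


if __name__ == "__main__":
    # Made-up sample row, not real donation data.
    sample = "Date\tName\tGross\n7/1/2005\tJane Doe\t25.00\n"
    print(paypal_tsv_to_iif(sample), end="")
```

Using `Decimal` rather than `float` keeps the amounts exact, which matters once the numbers land in accounting software.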

MetaBrainz signs up first customer

I’m proud to announce that the MetaBrainz Foundation has just signed up the first client for its live data-feed service! This represents a major milestone for the foundation, since it proves that there are companies interested in and willing to pay for this service.

The only downside is that I cannot mention who this company is. This will have to stay secret until they roll out their product with our service under the hood. So, until then sit tight and stay tuned!

What's up PayPal??

I’m now over two weeks behind in posting the finances for June — please bear with me, while I battle PayPal a bit. I’ve been using their export to QuickBooks feature to download the transactions for all the donations we’ve been receiving. But, for some unknown reason, that option is no longer available — only export to tab/comma delimited files, and QuickBooks can’t import those files.

I’d have to write a converter that takes CSV files and translates them into QuickBooks files, which is something I am trying to avoid. Thus I am waiting for a reply from PayPal about what’s going on.

If this is not resolved by next week, I’ll go write the script and get our finances back on track.

UPDATE: PayPal permanently removed this feature, with no explanation as to why — I guess we join the ranks of people who have gotten screwed by PayPal. I just wish they would’ve notified me. Grrr.