Lucene based tagging update – MetaBrainz Blog

I previously mentioned that Lucene rocks, but that is not giving it enough credit. I’m working on the guts of a Lucene-enabled Picard tagger, and in doing so I have created a simple script that chews through a given set of MP3 files and attempts to match them up with MusicBrainz.

My friend Vee once gave me a CD full of hip-hop music to give to my GF. I took one look at it and stared in shock! What a mess: not many ID3 tags, mostly no album names at all. Lots of friends vs friendz problems, with much slang used in inconsistent ways. Ick!
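To give a feel for the friends vs friendz problem, here is a minimal sketch of approximate string comparison. It uses Python's standard `difflib` as a stand-in and is not how Lucene or the tagger scores matches; the `similarity` helper is a hypothetical name made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity score, ignoring case.

    Hypothetical helper for illustration only; difflib stands in
    for whatever fuzzy scoring the real tagger uses.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Slang spellings still score close to the canonical title,
# while unrelated titles score much lower.
print(similarity("Friends", "friendz"))
print(similarity("Friends", "Enemies"))
```

A score like this lets a matcher treat near misses as candidates instead of requiring an exact tag match.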

I ran this through the old tagger a while back and it matched roughly 30% of the tracks. I’ve been using this set of files to tune the new tagging engine and once things got cached into memory, it chewed through over 100 files in under 7 seconds:

60% matched: 64 files matched, 41 files with suggestions, 1 files not matched.

60% !! Check the results for yourself!
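For anyone checking the arithmetic, the summary line can be reproduced from the three counts. This is only a sketch: the `summarize` function is hypothetical, and rounding the percentage down is an assumption (rounding to nearest gives the same figure here).

```python
def summarize(matched: int, suggested: int, unmatched: int) -> str:
    """Build a summary line from the three result counts.

    Hypothetical helper; the percentage is rounded down (an assumption).
    """
    total = matched + suggested + unmatched
    percent = 100 * matched // total  # integer percent of files matched
    return (
        f"{percent}% matched: {matched} files matched, "
        f"{suggested} files with suggestions, {unmatched} files not matched."
    )


# Counts taken from the run reported above: 64 of 106 files matched.
print(summarize(64, 41, 1))
```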

And of the 41 files that have suggestions, at least 80% have the correct match among the top 3 closest matches. I’m floored. It works so well, and there are a number of improvements still left to make. The downside? You need the 700MB Lucene index on your hard drive. That’s going to be more than 250MB to download. 🙁 I’ll have to work out the right combination of BitTorrent, caching, and P2P solutions to tackle that minor issue.

But this is really stunning!

5 thoughts on “Lucene based tagging update”

Awesome, can’t wait until the new tagger is in a fairly usable state. I’ve been putting off tagging my music (and therefore putting off installing NetJuke) for quite a while. Appreciate the work.

Well, I have an 80-gig or so hard drive and BitTorrent installed, and I am willing to host a 1-gig file to be downloadable over Kazaa and BitTorrent.

If and when you have a BitTorrent file available, tell me and I’ll get it. I have broadband and I’m pretty much online 24/7.

if that helps at all I mean 🙂

~mo
