Picard 0.5.0-test2 released!

With much help from Lukas Lalinsky and Dave Evans, we’ve finally managed to scrape together the next Picard release. The highlights for this release are:

  • Full UNICODE support on all supported platforms. Tags and filenames are now read and written with the proper encoding support.
  • ID3v2.3 and ID3v2.4 support, selectable in the options dialog.
  • Increased stability
  • A more rounded feature set
  • Tons of bug fixes
  • UI improvements
  • Built on top of a release version of wxWidgets, for greater stability
  • New Windows installer

Downloads:

Picard update

I know that a lot of people have been waiting for the next release of Picard. Fortunately, I’ve been working on it; unfortunately, it’s taking me a bit of time to get there. In order to get end-to-end UNICODE support in Picard, a number of serious changes were required to the underlying libtunepimp library.

The good news is that these massive changes are nearly done. The bad news is that the application will probably no longer run on Win 9x and Me — the UNICODE support in those two OSes is too weak. And the one thing I want to get right with this release is the UNICODE support. It is exciting to see Picard writing Cyrillic filenames — that’s for sure.

I’m trying to finish up the app in the next few days — stay tuned.

Lucene based tagging update

I previously mentioned that Lucene rocks. Well, that is not giving it enough credit. I’m working on the guts of a Lucene-enabled Picard tagger, and in doing so I have created a simple script that chews through a given set of MP3 files and attempts to match them up with MusicBrainz.

My friend Vee once gave me a CD full of hip-hop music to give to my GF. I took one look at it and stared in shock! What a mess: not many ID3 tags, mostly no album names at all. Lots of friends vs friendz problems, with much slang used in inconsistent ways. Ick!

I ran this through the old tagger a while back and it matched roughly 30% of the tracks. I’ve been using this set of files to tune the new tagging engine and once things got cached into memory, it chewed through over 100 files in under 7 seconds:

60% matched: 64 files matched, 41 files with suggestions, 1 files not matched.

60% !! Check the results for yourself!

And of the 41 files that have suggestions, at least 80% have the correct match among the top 3 closest matches. I’m floored: it works so well, and there are a number of improvements still left to make. The downside? You need the 700MB Lucene index on your hard drive. That’s going to be more than 250MB to download. 😦 I’ll have to work out the right combination of BitTorrent, caching, and P2P solutions to tackle that minor issue.

But this is really stunning!
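For the curious, here is a rough sketch of how a title could be ranked against candidates and sorted into the three buckets from the summary line above. It is not the actual tagger: it uses Python's difflib as a stand-in for Lucene scoring, and the thresholds and the names `similarity`, `classify`, and `summarize` are made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical cut-offs; the real tagger's scoring is not described in the post.
MATCH_THRESHOLD = 0.85
SUGGEST_THRESHOLD = 0.5


def similarity(a, b):
    """Return a 0..1 similarity ratio between two titles, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(title, candidates, top_n=3):
    """Rank candidates by similarity and bucket the outcome.

    Returns (status, suggestions) where suggestions holds at most
    top_n candidate titles, best first.
    """
    ranked = sorted(candidates, key=lambda c: similarity(title, c), reverse=True)
    top = ranked[:top_n]
    best = similarity(title, top[0]) if top else 0.0
    if best >= MATCH_THRESHOLD:
        return "matched", top
    if best >= SUGGEST_THRESHOLD:
        return "suggestions", top
    return "not matched", []


def summarize(outcomes):
    """Build a summary line from a list of status strings."""
    matched = outcomes.count("matched")
    suggested = outcomes.count("suggestions")
    unmatched = outcomes.count("not matched")
    total = len(outcomes)
    percent = 100 * matched // total if total else 0
    return "%d%% matched: %d files matched, %d files with suggestions, %d files not matched." % (
        percent, matched, suggested, unmatched)
```

Calling `classify("Friendz", ["Friends", "Enemies", "Fiends"])` puts "Friends" at the top of the suggestions, which is the kind of slang mismatch that tripped up the old tagger.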

Lucene enabled Picard

In the last couple of days I’ve stuffed Lucene into Picard and it has given me some quite amazing results. I’ve opened a collection of untagged files and watched it open the right albums and populate them with tags automatically. Mind you, none of the files were previously tagged with MB IDs. Plain amazing!

I have this hip-hop compilation that my friend put together and it’s utter crap: duplicates, many files without tags, crappy spelling, and mostly from greatest hits albums. Ick. The original tagger identified less than 15% of the tracks. The new tagger identifies 50% – 60% of the tracks, which is a really good rate for this crappy collection.

Trod trod trod

I have Picard starting up on Gentoo Linux! Finally sorted out some permission problems and got the last of the requirements installed, and it runs! Well, almost. I can’t actually do anything: dragging files has no effect, and the application complains a bit and won’t shut down properly. But still, progress! On the Mac, I’ve managed to work around some Python issues with respect to case-insensitive filesystems, but now wxPython won’t compile, so I’ve taken a break from that and turned to the Windows front: I’m installing Windows XP into a VirtualPC as I write.

As an aside, I’ve almost got my MusicBrainz test server up and running again, though I’m having some issues with the database import. I forgot how many Perl modules you need to install to get this thing running, but it only has to happen once.

They're not barriers, they're challenges

Mac OS X: I’ve decided Fink can suck er .. something rancid, and I’ve gone down the path of compiling everything from scratch, including gtk+2 and all of its dependencies, and Python. Now that I’ve got that all out of the way, it turns out that OS X has its own version of Python, and this version of Python doesn’t play ball with the ctypes package. Ah well, getting closer, slowly.

Lucene rocks!

I’ve been playing with the Lucene text indexing system (in particular, I’m playing with PyLucene, which is a GCJ-compiled version of Lucene with Python bindings). Lucene does text searching really well and it’s fast!

Eventually I’d like to use Lucene to power the MusicBrainz searches as well as building a copy of it into Picard. Picard? Yes! Lucene is so good that you can give it a track title and chances are it’s going to find the right track. My idea is this:

  1. Cluster new files and determine which artists these files cover.
  2. Download and cache the metadata for the artists locally, and build a Lucene index of it.
  3. Throw each of the tracks at Lucene to see what it can match.
  4. If nothing matches, maybe do a full DB search via the web service or do a TRM calculation.
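To make the four steps a bit more concrete, here is a minimal sketch of the flow in Python. It is only an illustration under a few assumptions: difflib stands in for Lucene, files are plain dicts with `path`, `artist`, and `title` keys, and `fetch_artist_tracks` and `fallback_lookup` are hypothetical callables for the metadata download and the web service / TRM lookup.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def cluster_by_artist(files):
    """Step 1: group files by normalized artist name."""
    clusters = defaultdict(list)
    for f in files:
        clusters[f["artist"].strip().lower()].append(f)
    return clusters


def build_local_index(artists, fetch_artist_tracks):
    """Step 2: fetch each artist's track titles once and cache them.

    fetch_artist_tracks is a hypothetical callable that returns a list
    of track titles for an artist; a plain dict stands in for the index.
    """
    return {artist: fetch_artist_tracks(artist) for artist in artists}


def best_match(title, candidates, threshold=0.8):
    """Step 3: return the closest candidate title, or None if too far off."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, title.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


def tag_files(files, fetch_artist_tracks, fallback_lookup):
    """Run all four steps and return a mapping of path -> matched title or None."""
    clusters = cluster_by_artist(files)
    index = build_local_index(clusters, fetch_artist_tracks)
    results = {}
    for artist, group in clusters.items():
        for f in group:
            match = best_match(f["title"], index[artist])
            if match is None:
                # Step 4: nothing matched locally, so try the slower lookup.
                match = fallback_lookup(f)
            results[f["path"]] = match
    return results
```

The point of the clustering step is that metadata gets fetched once per artist instead of once per file, which is where the local cache should pay off.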

I’m excited by this; the proof of concept looks fabulous. Executing it at full scale, where things are cached and locally indexed, is going to be a fair amount of work. Unfortunately.

But, this gives me hope that Picard will have some serious brainz under the hood. 🙂