New fingerprinting questions answered

As promised, here are some answers to the most common questions:

  • “Is there a contact at MusicDNS?” — Yes, use this link. A contact us page will be added soon.
  • “libofa doesn’t build on OS X” — The first revision, 0.9.1, with the first patch included, will be posted later today. Stay tuned: this should also build on OS X.
  • “Are there plans to implement the Creative Commons sampling license options?” — Yes. I’ll add that and the Public Domain ‘license’ later today.
  • “What does PUID stand for?” — They are Portable Unique Identifiers.
  • “How do I get a PUID into the fingerprinting system?” — Right now you need to use the MusicIP Mixer (free) to analyze the track to get a PUID generated for any tracks that do not have PUIDs yet. PUIDs should become visible to MusicBrainz within 24 hours. This is far from perfect, but all we could get done for now. We’ll improve this before too long.
  • “It’s nice that we have an alternative to TRM now, but I’m disappointed that this is a one-way relationship.” — That’s not quite accurate. Our partnership is focused on creating a balanced relationship, but there is a limit to what we were able to accomplish for this initial release. We’ll be working on improving this soon.
  • “When will a version for OS X be available?” — Hard to say. One of the underlying toolkits still has a number of bugs that prevent us from releasing it on the Mac. Hopefully I’ll have some time to look into this soon.
  • “Maybe an obvious question (or answer), but does this excellent news mean I’ll have to re-tag all my files?” — No. The fingerprints are only used to resolve the proper metadata. Once the tagger identifies the right track, it writes MusicBrainz ids to the files, which have not changed.
  • “Since the acoustic fingerprint is open source, is there a role that MusicIP NEEDS to play?” — There is a server component to this as well, but that is not open source. Without the server, the client portion is less than useful.
  • “How well does it compare to TRM?” — It should have a lot fewer duplicates and collisions than TRM. But really, time will tell. Let’s start using it and we’ll see how well it works. I do know that the PUIDs won’t have to be trimmed to keep the service alive, so this is already a drastic improvement.

Let me know if there are more questions!

(Update: The support link is now fixed.)

18 thoughts on “New fingerprinting questions answered”

  1. In case you get this (otherwise someone else will probably point it out): the “this link” link to the MusicDNS contact page is broken.

  2. Can you explain the role of the server-side function? It would seem that once an acoustic fingerprint is generated, all you need to do is match the fingerprint. So basically I don’t understand what else is done that is proprietary. Could MB proceed without the proprietary tech that Predixis / MusicIP holds?

  3. NoMe – the data generated by libofa is one kind of fingerprint, but just like human fingerprints, you can’t just “compare for equality” – the matching needs to be fuzzy in just the right ways – not too much, not too little.

    To implement this matching there may be a reduction of the data supplied by libofa to generate an internal fingerprint that only exists in the server – note that this would *not* be the same as the PUID, which is just an arbitrary identifier.

    In any case, while “all you need to do is match the fingerprint,” that’s much like saying all you need to do is recognize the face: easy to describe, but difficult to implement.


  4. “How do I get a PUID into the fingerprinting system? — Right now you need to use the MusicIP Mixer (free) to analyze the track to get a PUID generated for any tracks that do not have PUIDs yet. PUIDs should become visible to MusicBrainz within 24 hours. This is far from perfect, but all we could get done for now. We’ll improve this before too long.”

    Ok, but the MusicIP Mixer doesn’t work with Windows 98, which is what my media server runs. That’s the same reason I was never a fan of MusicMagic Mixer. I guess I’ll just have to wait to see what’s coming in the future. I think using the PUID system holds great promise, but there are still a lot of kinks that need to be worked out.

  5. richardsur: Have you tried the headless version? It might work on Windows 98 to enter music into the system. If we need to, we can create a stripped-down, analysis-only module.

  6. From my point of view, what would make this system really usable would be a command-line tool that takes a filename and the metadata for that file as arguments, and submits all that info to the MusicIP servers. I wouldn’t expect to get back a PUID immediately, and when I use libofa in a day or two against the same file, I wouldn’t even expect to get back the same metadata I just submitted. I *would* expect to get back the proper PUID, even if I wasn’t getting a PUID before.

    Making this open source would be ideal in my mind, because then such a tool could run on the maximum number of endpoints. Of course, support for utf-8 metadata should be a requirement.

    Basically, I want to avoid using the musicIP mixer. It’s a surprisingly nice iTunes impersonation…. but then, I already have iTunes. And I like to script things.

  7. It’s all fine and dandy, except that getting that mixer just to submit puids is a bit much. But as soon as there is a small “analysis-only” program or something similar, I’ll get rocking.

  8. That “7.6 weeks” isn’t how long it will take to analyze, that’s how much music you have (if you played all of it non-stop).

    Anyway, I don’t see any options for submitting anything in the mixer program. Am I to assume that it is doing that automatically as it crunches? There’s no prompt for a username or anything, so how does it know that valid tracks are being submitted?

  9. amaiman: Submissions happen automatically during the crunching process. Invalid tracks will generate invalid fingerprints, which don’t match valid tracks, so there’s no extra authentication required. In fact, the whole submission process is completely anonymous, so we don’t have to worry about anyone trying to subpoena our databases to find out who owns what.

    We’ll try to set up some standalone apps for analysis and for submitting PUIDs after SXSW. Thanks to all for the feedback!

  10. How does the program know which PUID goes with which track, though? Does it read the tags and base the submissions on those? If so, it sounds like an easy target for abuse.

    Either way, I think some documentation is required.

  11. I hate to comment on the smallest possible thing, but, from a design point of view, it would be preferable if – in MB – the clef symbol used to indicate that PUIDs are present could be recolored from the current black to the violet-periwinkle more “standard” for the MB site. I’m finding that big black spot on every track very distracting from reading the text on album pages.

  12. Thanks, amaiman, for clearing that up. But sadly the validation process takes as long as the play time or longer (on my Athlon 1.2GHz). There also seem to be several validation stages (ready and verified). And the PUIDs that should have been available after about a day are still not there several days later (there should be many of them). As of now, that process is really too slow. I’ll wait for the standalone tool to analyze my music.

  13. I’ve read the site and I must say I don’t understand the point of MusicBrainz. It seems like such a great technology that is totally misused. I mean, what it does is look up the artist, album and title of songs. These tags are almost always correct in all the files I’ve seen, or are easily available in the filename. What WOULD have been great is if MB could provide tags like year, BPM, composer etc., tags that take forever to look up by yourself.

  14. Tags are almost always incorrect on files from the internet. Year (date) is already there on a lot of releases, and a composer can also be added. BPM cannot be calculated correctly, and there is almost no information available to get it from.

  15. Sorry I’m really late, but the trimming of TRM IDs is really impressive. You collapsed the key space from 10^154 to 10^38 by simply using every fourth byte of the 64 returned (for 16 bytes) … I don’t see how TRM stood much of a chance. I also don’t see the point of promptly hex encoding and adding the dashes to use 36 bytes for a 16-byte value. I wish I’d been around to at least discuss it at the time …
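
The arithmetic in comment 15 can be checked with a short Python sketch. It takes the comment's figures at face value: a 64-byte raw value reduced to 16 bytes by keeping every fourth byte, then written as dashed hex. Which byte offset the real TRM code started from is not stated in the thread, so starting at byte 0 is an assumption. Using Python's uuid module to get the dashed layout is also an assumption, and the function names are hypothetical.

```python
import uuid

RAW_BYTES = 64                  # size of the raw value, per comment 15
KEPT_BYTES = RAW_BYTES // 4     # every fourth byte kept -> 16 bytes


def keyspace_exponent(n_bytes: int) -> int:
    """Return floor(log10(2 ** (8 * n_bytes))) using exact integer math."""
    return len(str(2 ** (8 * n_bytes))) - 1


def trim(raw: bytes, step: int = 4) -> bytes:
    """Keep every `step`-th byte, starting at offset 0 (assumed offset)."""
    return raw[::step]


def format_id(trimmed: bytes) -> str:
    """Hex-encode 16 bytes with dashes in the usual 8-4-4-4-12 layout."""
    return str(uuid.UUID(bytes=trimmed))


if __name__ == "__main__":
    raw = bytes(range(RAW_BYTES))   # stand-in data, not a real fingerprint
    trimmed = trim(raw)
    text = format_id(trimmed)
    print("raw key space   ~ 10^%d" % keyspace_exponent(RAW_BYTES))
    print("kept key space  ~ 10^%d" % keyspace_exponent(KEPT_BYTES))
    print("%d bytes stored as %d characters: %s" % (len(trimmed), len(text), text))
```

The 36 characters come from 32 hex digits plus 4 dashes, which is where the comment's "36 bytes for a 16-byte value" figure comes from.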
