Acoustic fingerprints: Is closed source OK?

Ever since my post about TRM hitting its limits, I’ve been in discussions with a reputable company that has offered to let the MusicBrainz community use their fingerprint server in exchange for a free license to the MusicBrainz live-data feed. While I am not ready to reveal who this company is, I do feel that I can trust these folks; this is not the first time we’ve chatted.

The straw-man deal that we’ve put together makes sense for MusicBrainz and this company. Unlike MusicBrainz’ relationship with Relatable, this relationship would be more balanced. Plus, we would not have to maintain the server ourselves. All around I feel good about this proposed deal. There is just one little snag.

They are uncomfortable with open sourcing their client.

While I am not an open source license Nazi, I have received tons of complaints about MusicBrainz using a technology that is not fully open. As a matter of fact, my most unpleasant dealings with the general public have been on this point (that the TRM server is closed source). And I’ve had unreasonable people shout unreasonable things at me over this point. Quite frankly I am not really interested in having to defend my position on this any further, but I fear that not having a working fingerprint solution may be more of a hassle than having to defend a closed source solution.

So, my question to you is this:

  1. Do you value having access to a fingerprint solution as part of MusicBrainz more than MusicBrainz being an end-to-end open solution?
  2. What arguments can we make for having this company open source their client, as Relatable did? I’ve argued the standard open source arguments and I think that there is still a small chance that we can persuade this company to open up. I need to construct a better argument and perhaps meet with them in person to hash this out further. What things should I argue?

Please keep your idealistic “everything needs to be open” arguments to yourself. I simply won’t bother reading them or responding to them. I really care to see if we can find a balance where we can maximize the value that MusicBrainz presents, even if it means compromising our values slightly. If you’re not ready to make a balanced argument, then please don’t.

NOTE: If we were to start using a closed source fingerprint solution, nothing else would change. None of the existing licenses for MusicBrainz would change. So, keep your pants on and stop frothing at the mouth.

35 thoughts on “Acoustic fingerprints: Is closed source OK?”

  1. To me, any ‘acoustic fingerprint solution’ seems better than none. Even if it’s not open source. Of course, it should have advantages over the current TRM system.

  2. I don’t mind it not being open source. As zout said, any fingerprint solution is better than none.

  3. Who will port the client library to every possible (and “impossible”) platform and operating system when it’s not open source? Also, reverse engineering is legal and can’t be forbidden in many countries, so people will write their own clients anyway (and some of these attempts might contain bugs that “harm” the server run by that company).

    Also, a closed source server will have the same problems as the current closed source server: what if this company disappears or stops supporting MusicBrainz? A possible way to secure MB against that, is that they give a proven working copy of their server sources into escrow to a third party, with a contract that MB gets these sources in case such troubles arise.

  4. So what would be closed source? A client library for getting fingerprints? I guess my concern would be the cross-platformness of such a closed source library. How many platforms and languages has the MusicBrainz client library been ported to? That’s an argument for open source, because they’d have an expanded base of potential clients.

    Ultimately I don’t like this idea. No matter what, it seems their server is going to be proprietary and hosted by them; if they disappear or end their agreement, poof, we’re in the same situation again, but a bit worse. I guess I’d say CDDB and Relatable are good lessons in why using a proprietary solution from a company, even while their current terms are fine, is always a bad idea.

  5. Would people who are against this still be against it if the acoustic fingerprinting were a plugin that users could, at their option, install and/or use? If yes, why?

    Also, is something like that even an option?

  6. First, I like the Fingerprinting option even if it’s closed source.

    Second, JanC’s idea seems a wonderful suggestion.

    Proposal for open source persuasion.

    Arguments for open source: you have many knowledgeable people here who can ultimately improve on their concept, and in theory those improvements would cost them nothing in development wages. (Perhaps that’s already part of the open source argument; if so, sorry to reiterate.)

    Potential arguments against open source: Information theft, to later be utilized in a product that will be sold, thus the company losing out on financial gain.

    Points to counter that argument: This is a valid point; however, large corporations (I won’t name any names but M-$hit comes to mind) steal code often, sell it integrated with their current software suite and then get into court battles in which they pay a large payoff anyway. Therefore the potential for code being reverse engineered and stolen already exists for a closed source platform. Alternatively, when dealing with open source code, you can prove it was yours to begin with, and have many others support this. Therefore financial gain is still possible, if not more plausible, due to the large number of knowledgeable people you can call on for such a proceeding.

    That’s the only current thought I have on the closed versus open source debate to present to a company. Perhaps if I knew more about the stance against open source…

  7. If there are no sufficient open-source fingerprinting packages (client-side software and server-side services), well, you’ve got to do what you’ve got to do.

    But, you might want to insist that any integration be (1) optional and (2) non-exclusive. Optional, so those religious about such things don’t need to accept bundled proprietary code/lookups to get other separable functionality. Non-exclusive, so that the adoption of one solution doesn’t crowd-out other potential solutions that could arise, from either open-source or competitive proprietary solutions.

    That is, whatever deal is made with a first provider should also be available as a standing offer to all other comers. There would not necessarily be any single official MB fingerprint, just a possibly-varying set of ‘compliant fingerprint partners’. Openness is great but often competition can provide much of the same benefit.

    Just a thought,

  8. If I remember correctly the problem with the current closed source fingerprinting software is that it’s flawed and can’t be improved because it’s closed source. So if this solution works well it would be great, open source or not. I would like to see it open sourced, but I wouldn’t really mind the proposed situation.

  9. Let me ask a question that should be obvious but for some reason no one seems to have thought of… How about developing a free (patent free, free software) acoustic fingerprinting technology?

    Don’t say that it’s “too hard”… The Xiph guys developed a whole psychoacoustic lossy compression technology (Ogg Vorbis) and avoided existing patents. Fingerprinting can’t be any harder than that.

  10. As you know, I think fingerprinting is a must. As a third-party developer, it would be important that a client-side API (preferably language independent, e.g. XML, Java or SOAP) is available to third-party applications with no strings attached. Can it be used with any audio format, or is it tied to MP3/MP4? If there are limitations in these areas, there may be scope to encourage them to open things up on the basis that these open source developments will improve their product with little effort on their part.

    (Obviously you need to be confident that this solution is scalable and does not suffer from TRM’s limitations, in which case I see no real problem with it. But if you go with it, it is still important to keep the TRM server going for some time, and keep the TRM IDs in the database for even longer, so that TRM ID3 tags are not invalidated in existing MP3s.)

  11. I assume that the deal you will arrange will have a back-out clause that secures MusicBrainz against the company losing interest. There are two things that might happen: The company might lose interest in the technology altogether, in which case MB should get the code. Or, the company might lose interest in supporting MB with their server technology. Leaving MB alone like this should cost them so much that MB could buy some hardware to replace their service.

    As for arguments: If MusicBrainz drops TRM, there will be no working open source fingerprinting library. Hey, this is an argument on its own. Do they want advertising? Do they want to publicly prove how powerful their technology is? How could this be easier than by letting their library spread through the open source world?

    If the contract is well done, then it will secure the fact that a fingerprinting server is _there_. It might not be open source, but the service will be available, either by contract to MetaBrainz or hosted by MetaBrainz. A combination of a “free beer” server and a “free speech” library should work pretty well to spread their technology.

    Finally, they might be concerned about bad ports of their library corrupting their database with bad fingerprints. But they will only be able to provide compiled and tested libraries for very few platforms.
    If their technology works really well, this might generate so much frustration, that some developers might be tempted to reverse engineer or to do ‘illegal’ ports.
    In this case, an open library that identifies itself as “superfingerprint 3.60.3 port 5.1 to ubuntu breezy by” might be _more_ secure than a closed source library that gets misused. An open library will encourage correct identification of the port and the client and they will be able to categorize submissions on the server side into trusted/tested and untrusted/untested. A closed library encourages obfuscated identification, which could harm them much more.

    Also, the burden of proving that a port is correct can lie entirely on the side of the developers. Maybe MusicBrainz could even provide a testing environment?
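The trusted/untested split suggested in this comment could be sketched roughly as follows. This is only an illustration: the identifier strings, the `TESTED_CLIENTS` allow-list and the `classify`/`bucket` helpers are all hypothetical, and nothing here reflects how the unnamed company's server actually works.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical allow-list of client builds that passed a test suite.
# The names are invented for illustration only.
TESTED_CLIENTS = frozenset({
    "superfingerprint 3.60.3 official win32",
    "superfingerprint 3.60.3 official osx",
})


def normalize(client_id: str) -> str:
    """Lower-case the identifier and collapse runs of whitespace."""
    return " ".join(client_id.lower().split())


def classify(client_id: str) -> str:
    """Return 'trusted' for known, tested builds and 'untrusted' otherwise."""
    return "trusted" if normalize(client_id) in TESTED_CLIENTS else "untrusted"


def bucket(submissions: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (client_id, fingerprint) pairs by the trust level of the client."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for client_id, fingerprint in submissions:
        buckets[classify(client_id)].append(fingerprint)
    return dict(buckets)
```

A port that identifies itself honestly simply lands in the untrusted bucket until someone tests it, which is the incentive the comment describes.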

  12. What happens when the client software breaks? How quickly will it be fixed?

    Is it available for all the platforms where Musicbrainz is used? Windows, OS X, GNU/Linux, FreeBSD, etc?

    What happens when the company goes out of business, or decides they’re not interested in working with Musicbrainz anymore?

    What happens when somebody develops an open-source fingerprinting solution that you’d rather use? Are you locked into a “contract” with the vendor to continue using their service? Will the Musicbrainz community balk at switching to yet another fingerprinting scheme?

    Using this software is a bad move. It’s quite possible that you’ll eventually be right back in the same boat you’re in with Relatable, stuck with a closed solution that’s no longer supported and no way out.

    I think there is already inertia in the open source developer community to support Musicbrainz because of TRM, even though you don’t have to use TRM to use Musicbrainz. Many developers prefer to work with FreeDB, despite the fact that its metadata is crap compared to Musicbrainz, because it’s a completely open solution and that’s what they want to spend their time supporting. Snuggling up to a new proprietary solution for acoustic fingerprints will only exacerbate this problem.

    Wait for something that’s open source (client *and* server). The most important part of Musicbrainz is the quality of the metadata, not some acoustic fingerprinting scheme. It has survived thus far with a barely-functional one; it can continue to survive without one at all until an open-source scheme is invented.

    (Apologies for the single-paragraph post… I can’t figure out how the hell to insert paragraph breaks in this comment mechanism!)

  13. While I don’t have a problem with a closed-source fingerprint server, a closed-source fingerprint client is, I think, not really something that MusicBrainz can use effectively. The main problem, as I see it, is that for MusicBrainz to be adopted widely, it needs to be a cross-platform solution that will run on not just Windows, but also MacOS (PPC & x86), Linux (x86 + others), *BSD (x86 + others), etcetera, etcetera.

    Is there any way that they can split up the fingerprinting algorithm in a way that moves some of the functionality to the MB server; i.e. the open-source client implements a subset of the functionality, collecting a set of metrics that require access to the actual PCM data, and there is (closed source) support in the MB server for putting these metrics together with their “special sauce” so that the exact details of their algorithm are not revealed to the general public?


  14. As a developer considering the possibility of integrating MusicBrainz into my GPL’d application, I’m most worried about stability-of-method and reliability-of-service. I worried about the TRM method before, and thus never incorporated MB in my app from the start… While I see “no solution” as a great threat to stability, I see dependence upon another closed-source app as an even greater long-term liability.

    Nothing would please me more than for MB to have a good relationship with a company developing its acoustic fingerprinting software, so long as as much of the code as possible is open source. This is not because I’m a zealot; I have my fair share of paid-commercial-closed-source apps (even on my Linux box), but I look at this as something analogous to a format “standards” issue. I have no interest in supporting the development of competition for this company with its own fingerprinting software; I simply think that there *must* be a stable (for software developers, etc.) way out if the company should go under, etc.

    sorry if I just repeated what others are saying…

  15. I think it’s possible for the company to benefit from open source financially if the source is under the same GPL as Ghostscript.
    That way anyone who wants to use it for financial gain would have to use commercial licensing and if they don’t then they can be sued for violating the GPL. That way you get all the benefits of open source and some benefits of commercial licensing.

  17. If Musicbrainz goes closed and licenses any of its source, then I wouldn’t be surprised if the company that owns the source took it and ran, leaving the software non-functional. Or maybe somebody would write a lock into the code so that nobody in the company could even edit it without entering a password, and then the software wouldn’t function unless the programmer continued to write software for the company or was paid to hand over the password.

  18. I *am* a Free Software zealot and I’m still mature enough to know that the perfect is the enemy of the good. I’d rather have “fully open” than “partially open”, but I would *much* rather have partially open than nothing at all. I say go for it.

    As for arguments to open their source, the most obvious one is that it’ll improve faster and find its way into a larger market. Lots of companies have shown now that with a decent amount of market penetration and a lot of expertise in a product there’s plenty of revenue to be made in ways other than per-seat licensing.

    – Chris

  19. 1) I’m in favor of audio fingerprinting support even if it means having to deal with closed-source pieces. In the long run, audio fingerprinting is not the primary reason for MB’s existence. Thus, a closed-source client for fingerprinting would not irreparably harm MB’s advancement.

    2) Advantages of making the fingerprinting client open source:
    – Code review by peers to accelerate development and fix bugs. “Free development!”
    – Increases chances of finding security flaws in client software. There’s been a rapid increase in overflow flaw discovery for many mainstream multimedia apps such as RealMedia player, Quicktime, iTunes, etc.
    – Ports of client software to other platforms, increasing potential audience size

    The risks of an open source client are small, since even a closed source client can be reverse engineered. And if the unnamed company in question is smart, they should already have a patent on their method of fingerprinting.

    3) An option to have the MusicBrainz client distributed with plugin technology to support fingerprinting is a FANTASTIC idea! Most definitely, that should be considered.

  20. All of the open source arguments above are good ones.

    But what options exist today for reliable audio fingerprinting?

    I’d be interested in spearheading an effort to

    1. canvass the existing literature to see what options exist,

    2. sponsor a patent search to assure what is developed is open and doesn’t infringe, and

    3. develop an unburdened open-source audio fingerprinting technology.

    Drop me a line if you think this effort has merit, if you can help, and how, and we’ll see where we stand.

  21. So we’re asked if we care about having a closed source client library. I say it’s not a disaster, so long as it has no worse performance than TRM (i.e. at worst no more collisions). The valid arguments are certainly along the lines of portability and improvability (presumably the former can be answered rather simply – what platforms does it currently support?). I disagree with the reverse-engineering argument – all those people who are great reverse-engineers, why not reverse engineer the relevant part of the TRM server and save us this argument in the first place? I can’t understand why doing a TRM lookup can be so difficult (and hence why an open source version can’t be written) unless it manages somehow to calculate the ‘distance’ between two TRMs (which I presume is how it calculates its % certainty of the result).

    So, as far as I see it, this would be the section of code that we’d either want open source, or available in bare form to be called at will in a database server that we ourselves design – if we don’t get this in any new solution, I’d say it’s no better than the old.

    One thing I’d suggest in the meantime is having two lookup servers (if I’ve understood the problem correctly). One does a straight lookup (which is cheap, and if there are so many collisions should be fruitful). The second fallback server does the distance magic. That ought to significantly reduce load on the crucial server. As regards the memory issue, can’t we just have obscene amounts of swap?
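The two-tier idea in this comment (a cheap exact lookup first, with the expensive distance search only as a fallback) might look something like the sketch below. It rests on loud assumptions: fingerprints are modelled as plain integers, the "distance" is a Hamming distance over their bits, and the `lookup` function and `max_distance` threshold are hypothetical. How the real TRM server measures similarity is not public.

```python
from typing import Dict, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def lookup(
    fingerprint: int,
    index: Dict[int, str],
    max_distance: int = 4,
) -> Optional[Tuple[str, str]]:
    """Return (track_id, tier) or None if nothing is close enough."""
    # Tier 1: exact match, a single dictionary probe.
    track_id = index.get(fingerprint)
    if track_id is not None:
        return track_id, "exact"

    # Tier 2: fallback linear scan for the nearest stored fingerprint.
    # This is the expensive part that the comment wants to keep off
    # the hot path.
    best = min(index.items(), key=lambda kv: hamming(fingerprint, kv[0]), default=None)
    if best is not None and hamming(fingerprint, best[0]) <= max_distance:
        return best[1], "nearest"
    return None
```

Whether this actually reduces load depends on how often real-world fingerprints match exactly, which the post does not say.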

  22. I have no problem with a closed source solution at all. What I am concerned about is the algorithm used. If the new software follows the same algorithm but uses a different implementation, hopefully results from implementation “A” and “B” would be the same, and so long as the results are correct then I’m happy.

    Oh, and as long as it’s Mac OS X compatible too 😉

  23. Like most people here, my only concern is that the fingerprint can be obtained from any platform without requiring MysteryCompany(tm) to port the library to ObscurePlatform v2.0.

    Beyond that, I’m indifferent to the license used.

  24. Wait… If the current licenses for client apps like the tagger (currently GPLed) wouldn’t change, do you mean that the only thing closed-source would be some part of the fingerprinting done on the server, like with the current TRM system?

  25. I agree totally with JanC:

    So what would be closed source? A client library for getting fingerprints? I guess my concern would be the cross-platformness of such a closed source library. How many platforms and languages has musicbrainz client library been ported, that’s an argument for open source, because they’d have an expanded base of potential clients.

    Ultimately i don’t like this idea, no matter what it seems their server is going to be proprietary, hosted by them, if they disappear or end their agreement, whatever, poof same situation again, but a bit worse. I guess I’d say CDDB and Relatable are good lessons to why using a proprietary solution from a company while their current terms are fine is always a bad idea.

    PS : your website does not seem to support accents. I wanted é displayed in my name .. but not working…

  26. Is not fingerprinting a special case of mapping music bits to metadata tags? Why not make a flexible plugin scheme for code which does that. The plugs could be either open or closed source with whatever license the creator requires.
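A plugin scheme of the kind this comment proposes could be sketched as a small registry of identifier functions. Everything here is hypothetical: the `register` decorator, the `identify` dispatcher and the toy `exact-hash` plugin are invented for illustration, and the toy plugin hashes raw bytes, which is not acoustic fingerprinting at all. A closed-source plugin would register under the same interface.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple

# A plugin maps raw audio bytes to a track identifier, or None if unknown.
Identifier = Callable[[bytes], Optional[str]]

PLUGINS: Dict[str, Identifier] = {}


def register(name: str) -> Callable[[Identifier], Identifier]:
    """Decorator that adds an identifier function to the registry."""
    def decorator(func: Identifier) -> Identifier:
        PLUGINS[name] = func
        return func
    return decorator


# Toy data for the example plugin; the track id is made up.
KNOWN_HASHES = {
    hashlib.sha1(b"example audio bytes").hexdigest(): "example-track-id",
}


@register("exact-hash")
def exact_hash(audio: bytes) -> Optional[str]:
    """Toy plugin: match on a SHA-1 of the bytes (not a real fingerprint)."""
    return KNOWN_HASHES.get(hashlib.sha1(audio).hexdigest())


def identify(audio: bytes) -> Optional[Tuple[str, str]]:
    """Try each plugin in registration order; return (plugin_name, track_id)."""
    for name, plugin in PLUGINS.items():
        track_id = plugin(audio)
        if track_id is not None:
            return name, track_id
    return None
```

Because the host only sees the `Identifier` signature, it stays indifferent to each plugin's license, which is the point the comment makes.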


  27. I have no problem at all with their server remaining proprietary. However, a proprietary client library is completely unworkable. Consider the wide variety of Free Software media applications which need to make use of musicbrainz; most prohibit usage with proprietary libraries/plugins, and few if any of the rest would be likely to make any serious use of a proprietary library. It is highly doubtful that any GNU/Linux distribution will compile Free applications against this proprietary library by default, for most of the same reasons.

  28. I am concerned, as many here have mentioned, that MB would be dependent on something they don’t have control over. A quick Google search turned up a few open source acoustic fingerprint libraries (songprint, …). Why not take one and develop it, or start one from scratch?

    I am not an OS zealot, but I think in this case if MB is going to survive, let alone grow, it needs an OS acoustic fingerprint.

  29. I cast my vote for an OS acoustic fingerprint. Just because you don’t have any mission-critical problems with a closed-source approach today doesn’t mean you won’t be badly encumbered by it tomorrow.

  30. This is a very old post. Indeed, after TRMs, we did go and use another proprietary fingerprinting solution, but that eventually came to an end too. These days we’re partnering with AcoustID, for which all the source is open and their data is also open. Your words are wise, and almost everyone involved in MusicBrainz and related projects is a strong proponent of open source. 🙂
