Questions about the MetaBrainz launch

Matthew Exon asked a few great questions about the MetaBrainz Foundation launch, and I wanted to share my answers with everyone here on this blog:

On Apr 19, 2005, at 3:58 AM, Matthew Exon wrote:

> Congratulations! This is certainly a big step for MusicBrainz, and I’m

> sure it gives all of us a greater feeling of confidence and

> responsibility about the whole project.

Thanks — I’m glad to hear that. I’m trying to project that on a much larger scale. To the point where corporations get that sense too.

> The teasers on the test MetaBrainz page over the last couple of weeks

> have raised some questions for me, so this is my chance to ask if you

> could clarify some of them for me. I don’t expect full dissertations to

> turn up on the MetaBrainz site overnight, and I guess it would be

> prudent for you not to respond to some things here, but you might be

> interested to know what questions occur to this punter in the street.

Do you mind if I post this response to the MB blog? Your questions are excellent, and until I have a fully articulated position on the licenses, I’d like to have something to refer to.

> First of all, there’s the big “licence” word in there, and I’d like to

> see a bit more detail about precisely what kind of agreements you have

> in mind. For example, will licensees be prohibited (or encouraged!) to

> further distribute the data, either on a commercial or non-commercial

> basis? Are you selling data, or are you in effect selling bandwidth to

> the MusicBrainz servers? How do you intend to balance bandwidth between

> licensees and the general public? These questions are far too complex

> to be answered straight away, especially when you don’t actually *have*

> any licensees yet, but it’d be interesting to see a guess.

I think I can answer all of your questions — they are great questions!

First off, license may be a big and scary word, but it’s been a part of

MusicBrainz from early on. For more information on this topic and how

we arrived at our current license scheme, please check out:

Currently MusicBrainz divides the dataset into two chunks: the core

data is in the Public Domain (read: no license, do with it as you

please) and the ancillary data (search index, moderation, etc) is

released under a Creative Commons license for non-commercial use. For

more details, see:

The live data-feed doesn’t change any of this. The live-data feed is a

stream of hourly chunks of data being generated by the project. This

data feed contains both PD and CC data, and thus the whole is licensed

under the most restrictive license, the CC share alike, non-commercial

license. This means that anyone who wants to have an MB server in a

non-commercial setting, can have a server that is no more than 70

minutes out of date with the main server.
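The "most restrictive license wins" rule described above can be illustrated with a minimal sketch. The license labels, their ranking, and the `effective_license` helper are hypothetical names chosen for this illustration; they are not part of any MusicBrainz code or of the actual license texts.

```python
# Hypothetical restrictiveness ranking (an assumption for illustration only):
# a higher number means more restrictions on reuse.
RESTRICTIVENESS = {
    "public-domain": 0,   # core data: no license at all
    "cc-by-nc-sa": 1,     # ancillary data: share alike, non-commercial
}


def effective_license(component_licenses):
    """Return the most restrictive license among the parts of a bundle."""
    licenses = list(component_licenses)
    if not licenses:
        raise ValueError("a bundle needs at least one component")
    return max(licenses, key=RESTRICTIVENESS.__getitem__)


# The live data feed mixes both kinds of data, so the whole feed
# falls under the more restrictive of the two licenses.
live_feed = {"core data": "public-domain", "ancillary data": "cc-by-nc-sa"}
print(effective_license(live_feed.values()))
```

A bundle that contains only core data would stay in the public domain under this rule, which matches the two-chunk split described earlier.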

So, any commercial entity that wants to use MB data, can now:

1. Download the twice weekly snapshot and use it without paying a


2. Arrange to license the live data feed and their data stays up to


So, no one is going to pay for the data itself. Commercial customers

are going to pay for the privilege of having bite-size chunks of data

applied to their own server. If a customer is running a 24/7 service,

option #1 can be a real pain — taking down your servers to import

fresh data sucks.

Commercial licensees will be able to use the live data stream in any

which way they want. Once the data resides on their own server, they

can use it as many times in their organization as they want. With as

many copies as they want — they will not be required to share their

data with anyone else. However, they will not be allowed to offer our

live data feed to others — we don’t want our commercial customers

competing with us. That makes no sense.

Balancing bandwidth between users and commercial customers may not

really be much of an issue. Downloading a live data feed amounts to the

same bandwidth in a day as one or two people using the tagger to tag

their music collections. Not a big deal.

Should the day come when it does become a big deal, I will set up

another server that will be paid for by the data licenses, and its only

purpose will be to farm out the data to commercial customers, leaving

the rest of the bandwidth to our contributors.

> BTW, licencing the data is certainly the way to go, and I don’t have any

> reason to worry about my right to access it: but this is the kind of

> thing that makes contributors nervous, so it’s worth thinking about this

> from the tinfoil hat point of view.

Understood. I’ve been listening to the community for over 5 years and

I’m very aware of people being very sensitive about me ‘pulling a

Gracenote’. I think my solution is a pretty good one: everyone has

access to the data, yet we can license the data for commercial use and

not sell out.

Another point that should put people at ease is that with the

non-profit in place, the data is officially owned (as much as you can

own PD data) by the MetaBrainz Foundation. Property of a non-profit

cannot be sold to a for profit entity — it must be destroyed or

donated to another non-profit.

I *can’t* pull a Gracenote.

> Second, I’d like to be clear on what the relationship with Amazon is.

> Somewhere I got the impression that you were waiting until this launch

> to become an Amazon associate.

We had to wait until we were a recognized non-profit. Late last year we

turned on the amazon ids and started collecting associate fees from

them. Over the last 6 or so months, we’ve taken in around $50. Not much

to write home about, but having the cover art is great — and having

some beers with friends on Amazon’s tab seems like a nice cherry on

top. 🙂 I’ll probably spend the referral fees on beer in London.

> This seems like the simplest and

> quickest way to start raking in cash. Can we expect an announcement

> about this too?

There was:


> Are you going to develop a full-on web-services based

> thing?

We have a full on web service — the tagging applications use it. We

had a web service since before the term was coined. 🙂

> You could potentially manipulate users’ shopping baskets and

> stuff on the MusicBrainz server side, and turn MusicBrainz into a

> shopping site as well as a metadata site. In fact, MusicBrainz seems to

> me to have the potential to be the most important Amazon web services

> partner they’ve ever had. I mean, MB is kinda the music equivalent of

> IMDB, and Amazon *bought* IMDB…

I agree. However the fundamental reality is that you tend to make very

little from associate’s fees. There are a number of companies that

tried to make business models out of this and failed in spectacular

ways. I believe that one customer paying one month of full license

fees will bring in more income than associate’s fees for the entire year.

Over time that is likely to change, but if our data licenses grow at

the same rate as our overall usage, then the associate’s fees will

never catch up to the data license fees.

> So OK, here’s my $64,000 question (possibly literally): are you hoping

> to licence the data to Amazon for use in their online catalog? I can’t

> imagine any potential licensee bigger than Amazon, but maybe I’m

> short-sighted.

I’d love that and I’ve asked Jeff Bezos that myself. I did that two

years ago, when our data was less mature, and the answer was no — not

surprisingly. Even today the answer would likely be the same, since we

do not have rights to cover art. The cover art is a big deal for

Amazon, and we simply cannot offer that.

> Lastly, is MusicBrainz now effectively a commercial competitor to


In a sense yes — allmusic supplies the data for Amazon, I believe.

While allmusic still has more data than we do, I think we’re going to

catch up pretty fast. If the answer is not yes today, in 6-12 months it

will be.

> If so, do you expect this to be a rather rough ride? I’m

> thinking in terms of FUD, patent war, allegations of copyright

> infringement, and so on. Again, maybe paranoid, but we’ve all seen

> nastier behaviour before…

Not paranoid at all — someone recently called MB’s competitors ‘enemy

combatants’. So there is some truth to that fear. FUD I think we can

deal with, much in the same way that Linux deals with M$’s FUD.

Patent wars are going to be an issue — for now we’re staying clear of

other’s patents. That just makes sense for us. Copyright violations? As

long as we’re vigilant and make sure that no one starts importing data

from another source, we should be fine. It never has been a problem for

us and now that our database is more mature, there will be even less of

a chance of this being a problem for us.

Now, my take on copyright violations is a conservative one — MB should

never accept data from any other commercial sources — only from

people’s brainz. However, the actual legal position is much more in our

favor. In the data-license white paper I talk about the Feist v. Rural

Telephone Service Supreme Court case. This case found that facts (like

the title of a CD or the name of an artist) are not copyrightable.

So, in theory, someone could copy data from allmusic directly into

MusicBrainz and have it be legal (as long as everything happened in the

US). Allmusic’s lawyers may think differently about this, and that is

why MB cannot accept data from any other database.

> Anyway, again congratulations, and try not to work yourself too hard! I

> suspect you’re going to be spending too much time in the next few months

> talking about MetaBrainz and associated stuff to even think about Picard

> 😦 It’s important work though!

Au contraire — I hope that all the hard non-profit work is behind me.

I certainly have some work to do on licensing our data, but Picard is

quickly moving to the top of my todo list. I hope to spend serious time

on Picard in May.

Let me know if I can post this on the blog!
