TRM Database Pruned

It’s TRM pruning time again.

At about 23:00 UTC on January 25th the TRM database
was “pruned” again.
As we did last time, we removed all TRMs apart from the ones attached to
MusicBrainz tracks, this time using the additional criterion that
the TRM had to have been looked up at least twice. 
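
For the curious, the rule described above can be sketched in a few lines of Python. This is only an illustration, not the actual prune script: the `Trm` record and its `track_id` and `lookup_count` fields are hypothetical names, and the sketch assumes a TRM survives only if it is both attached to a track and has been looked up at least twice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trm:
    trm_id: str
    track_id: Optional[int]  # hypothetical field: None means "not attached to any track"
    lookup_count: int        # hypothetical field: how many times this TRM was looked up


def survives_prune(trm: Trm, min_lookups: int = 2) -> bool:
    """Assumed rule: keep a TRM only if it is attached to a track
    AND has been looked up at least `min_lookups` times."""
    return trm.track_id is not None and trm.lookup_count >= min_lookups


def prune(trms: List[Trm]) -> List[Trm]:
    """Return the TRMs that would remain after the prune."""
    return [t for t in trms if survives_prune(t)]
```

With a handful of made-up records, an unattached TRM is dropped no matter how often it was looked up, and an attached TRM is dropped if it was looked up only once.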

Just before the prune we had about 3,633,572 TRMs; after the prune we had about 1,898,435, so a little under half of them were removed.
For pretty pictures illustrating this, check out our
MRTG pages.

See also prune 1,
prune 2
and prune 3.

AR todo list

If you’ve been lusting for the new Advanced Relationships feature in MusicBrainz, take a look at our AR Todo list.

We need to complete this list of tasks before we release AR on the world. We’re now officially done with the features, but we still need to do some performance tweaking and behind the scenes futzing.

If you’d like to play, please go to the test server!

But keep in mind that the relationships on the test server will change! (They should be more complete and suck less for the first release.)

More Wikipedia stuff

This is an excellent piece on cooperation and politeness at Wikipedia — I haven’t finished reading it yet, but for anyone who is thinking about improving MusicBrainz’s voting system, this should be considered required reading.

I, for one, am finally seeing the light on Jamie Munro’s Survival of the Fittest proposal and how it should let us avoid some of the problems/issues/discussions that Wikipedia is currently encountering. I think it may be time to tackle that after I get the new tagger on solid ground.

Internet Archive donates to MetaBrainz

Gordon Mohr, who is currently working at the Internet Archive, was given the opportunity to donate $300 to a non-profit whose mission is compatible with the Internet Archive’s. Gordon chose the MetaBrainz Foundation, and thus we just received a check for $300.

Thank you very much to both Gordon and Brewster Kahle, the driving force behind the Internet Archive!

Wikipedia thoughts

I think that Wikipedia and MusicBrainz have a lot in common. We differ in scope and in how we collect data (unstructured vs. structured, respectively), but the overall approach of collecting data from volunteers is pretty much the same. Both projects rely on a set of guidelines to steer their self-selected contributors in the right direction. One major difference is that MusicBrainz has a peer review system, embodied by our often discussed and much tweaked voting system. Larry Sanger, the co-founder of Wikipedia, singles out the lack of a peer review system in Wikipedia as its greatest flaw:

Second, my view is that Wikipedia badly needs a review process which the general public can regard as reliable. I personally made several different proposals of such review processes, and shortly before I left the project permanently, I was working on such a proposal. The Wikipedia community, like any large online community, is a pretty “political” place, however, and so I do not have the time or patience to try to organize a review process now.

. . .

A free, open encyclopedia that is reviewed has always been my vision for Wikipedia (and for its parent project, the currently-moribund Nupedia). From before the time that I left, I personally have regarded it as a serious failing of the project that it does not have a publicly credible review process.

In the past I’ve suggested a wiki style approach to moderation at MusicBrainz and the community has pushed back on that idea for many of the same reasons. In retrospect, I am glad that we have a peer review process in place. The voting system has taken a lot of our time (both using it and creating/maintaining it), but I think MusicBrainz is better off because of it. That’s not to say that we’re done tinkering with it; far from it. I wonder how difficult it will be for Wikipedia to bolt a review system on top of its wiki, should they decide to do it. That is a major challenge!

Then, Larry points out another problem with Wikipedia:

First, Wikipedia is, at present, of uneven reliability. Some articles have only recently been started; some have never received the attention of anything like an expert; some (fewer) have been degraded from superior earlier versions. This imperfect reliability is something that Wikipedia itself makes no secret of, particularly in its “General disclaimer.” I personally share the view of many that Wikipedia should not be used as a single source of information for anything. Defenders sometimes add that this is true of all sources of information, which is true, as far as it goes.

To a degree, this is true of MusicBrainz as well. However, I don’t consider this to be a problem; I consider it a fact of life. There is so much music, and more is created every day, which means that MusicBrainz will always be behind in cataloging it all. Also, MusicBrainz will never be completely correct; there will always be mistakes. We will certainly aim to be more complete as time progresses, but it is like a limit function: we can keep getting closer, yet MusicBrainz will never be complete or totally correct. But is that really a problem?

I don’t think so.

UPDATE: Clay Shirky has a great response to Larry’s article.