Reminder: MusicBrainz hack weekend starts this Friday!

I wanted to remind everyone that we’re having our first online MusicBrainz hack weekend starting this Friday. If you’re interested in participating, please be in the #musicbrainz-devel channel on Freenode on Friday, 17 August, at 19:00 UTC. We’re going to have a quick coordination meeting at that time and see who wants to work on which projects. Then we’re going to hack through the weekend and see how much fun stuff we can build for MusicBrainz.

For more information, please see our wiki page and the original blog post.

Replication packet 61163 is large

I’d like to apologize for replication packet 61163. Our release created a very large replication packet (4.9Mb) that is going to take a while for clients to apply. Users running our slave software (musicbrainz-server and mbslave) can expect a much longer loading time and much greater disk usage when this packet is applied.

Sorry for the inconvenience.

Trouble with edits pending notices

We’re debugging an insidious database deadlock issue and we think we have a clue where the problems are coming from. In an effort to get the site stable, we’re going to disable the function that keeps track of whether a piece of data has edits pending.

If you’re editing, please disregard these indicators as they are likely to be wrong. We’re going to work out a permanent solution for this problem, but it is unlikely that we will have it solved before the middle of next week.

Sorry for the troubles!

Introducing our latest acronym: NES — New Edit System

Long after we delivered our much-anticipated Next Generation Schema, we’re finally ready to dive in and get to work on our New Edit System. You may recall that during the NGS re-write we opted not to tackle the re-write of our edit system. Instead, we decided to do the edit system overhaul once NGS became stable.

We finally reached this point when Ollie, Warp and I met in London last month to discuss this new project. We had a productive meeting reviewing all of the research that Ollie had done over the past months. We improved the terminology here and there, but in general we gave Ollie our approval to continue his work. Part of his work is to write about it so that our users can get familiar with the concepts that will come about in NES.

Ollie has started blogging about it on his own blog:

If you’re interested in the New Edit System, please take a moment to read these posts and consider following Ollie’s blog.

Matthew Hawn from Last.fm joins the MetaBrainz board of directors

I’m pleased to announce that Matthew Hawn from Last.fm has joined our board of directors. Matthew is a veteran of the music industry who has worked for big music labels in the past and today fills the role of VP of Product at Last.fm. Matthew joining our board of directors reflects our shared history and the closer collaboration between MusicBrainz and Last.fm.

Welcome aboard, Matthew!

Matthew replaces outgoing director Derek Sivers. Derek, the founder of CD Baby, has been with MetaBrainz since the beginning of the organization and has looked out for us many times. We really appreciate everything you’ve done for us, Derek!

UPDATE: Fixed Matthew’s title.

We've just launched our new MetaBrainz site!

In an effort to retire some old power-hungry servers, we’ve moved MusicBrainz Classic to a new server and also created a new MetaBrainz Foundation web site. The new MetaBrainz site looks like the MusicBrainz NGS site and shares a bit of the same code.

Please open a bug if you find problems with either of these two sites.

Thanks to Ollie for hacking together the MetaBrainz site!

Summer of Code log analysis project: May we share our data with our GSoC student?

UPDATE: This is clearly going to be a major hassle, so we’ll spend the extra time coding a program that will sanitize the data before it goes into Splunk.

Last week Google’s Summer of Code program started, and my student Dániel Bali is ready to get busy combing through our massive logs to see what sorts of information he can mine from them.

We only have one minor problem: our logs contain the IP addresses of our users, and some requests contain the user name of the person making the request. Removing this private information from the logs before Dániel sees them is quite a pain to do well.

I would like to propose that we:

  1. Consider Dániel part of our core team for the summer and allow him to see IP addresses and all the requests in full.
  2. Have Dániel sign a short statement stating that he will not divulge any private information.
  3. Fail him in his GSoC project if he does divulge any private information.

If this is not acceptable to you, please speak up soon. I would like to make this happen early next week so Dániel can continue his GSoC work.

UPDATE: The final output of Dániel’s work will not contain any private information. If we end up using any private data as input, we will sanitize it and remove private information before we publish the output.
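
To give a feel for what sanitizing the logs could involve, here is a minimal Python sketch. It is not the program we will actually run. It assumes access-log style lines with IPv4 addresses, and it assumes user names show up in a `/user/NAME` path, which is a hypothetical URL shape chosen only for illustration. The names `SECRET_KEY`, `pseudonym` and `sanitize_line` are likewise made up for this example. Each private value is replaced with a keyed hash, so the same user still maps to the same token and usage patterns remain visible without revealing who the user is.

```python
import hashlib
import hmac
import re

# Assumption: this key stays private and is never stored alongside the logs.
SECRET_KEY = b"replace-with-a-private-random-key"

# Matches dotted-quad IPv4 addresses (IPv6 is not handled in this sketch).
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Hypothetical URL shape for requests that carry a user name.
USER_PATH_RE = re.compile(r"(/user/)([^/\s?]+)")


def pseudonym(value: str, prefix: str) -> str:
    """Return a stable, non-reversible token for a private value."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}-{digest[:12]}"


def sanitize_line(line: str) -> str:
    """Replace IP addresses and user names in one log line with tokens."""
    line = IPV4_RE.sub(lambda m: pseudonym(m.group(0), "ip"), line)
    line = USER_PATH_RE.sub(
        lambda m: m.group(1) + pseudonym(m.group(2), "user"), line
    )
    return line
```

Running every line through `sanitize_line` before it is indexed would keep the raw addresses and names out of the analysis tool. A real version would also need to cover IPv6 addresses and any other place a user name can appear.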