Jim DeLaHunt is our new style leader!

I’m pleased to announce that Jim DeLaHunt is our new style leader!

After chatting with tons of people, it appears that no one has any objections to Jim being our new style leader. Jim will be working to improve the style process and to get stuck proposals moving again. Expect to hear much from Jim in the coming weeks as he revamps our style process. If you’re interested in taking part in or observing the improved style process, please subscribe to the MusicBrainz style mailing list.

The BBC unleashes dynamic artist pages beta

The BBC has just released its next MusicBrainz enabled feature: Dynamic artist pages.

You can see how often your favorite artists have been played on the BBC networks since last year. Turns out Coldplay is quite popular. To see how often a specific artist has been played, find the MBID on MusicBrainz and then go to this URL:

http://www.bbc.co.uk/music/artists/<artist mbid>

For instance, Portishead has mbid 8f6bd1e4-fbe1-4f50-aa9b-94c450ec0f11, so to check on Portishead, you’d go to:

http://www.bbc.co.uk/music/artists/8f6bd1e4-fbe1-4f50-aa9b-94c450ec0f11
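If you want to build these links in a script, here is a minimal sketch in Python. The function name `bbc_artist_url` and the constant `BBC_ARTIST_BASE` are made up for illustration; the only facts taken from above are the URL pattern and the Portishead MBID. It assumes an MBID is a standard UUID and rejects anything that doesn't parse as one.

```python
import uuid

# URL prefix from the pattern described above; the constant name is hypothetical.
BBC_ARTIST_BASE = "http://www.bbc.co.uk/music/artists/"


def bbc_artist_url(mbid: str) -> str:
    """Return the BBC artist page URL for a MusicBrainz artist MBID."""
    # Assumption: MBIDs are UUIDs. uuid.UUID raises ValueError on malformed
    # input, and str() gives the lowercase, hyphenated form.
    canonical = str(uuid.UUID(mbid.strip()))
    return BBC_ARTIST_BASE + canonical


if __name__ == "__main__":
    # Portishead, using the MBID quoted above.
    print(bbc_artist_url("8f6bd1e4-fbe1-4f50-aa9b-94c450ec0f11"))
```

Note that this only builds the URL. As mentioned, a page should exist for any artist MBID, but it may not contain much data.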

The folks at the BBC assure me that all the artist MBIDs have a page there, but the page isn’t guaranteed to have much data. Even the MusicBrainz test artist is there. 🙂

Well done BBC! I’ve learned lots and lots about how the BBC operates and to see my friends at the BBC make highly visible progress tickles me pink!

Matthew Shorter of the BBC offers a little more insight:

Why are we doing this?
Currently our offerings around individual artists tend to be dispersed and hard to find. This leads to poor search performance for BBC music content, which means that users will typically only find content if directed by broadcast, or serendipitously by browsing brand sites.

Persistent unique URLs for artist pages which automatically aggregate what the BBC has to offer around individual artists will lead over time to much improved search performance and facilitate wider syndication of our content, building reach to brands. Automation and dynamic publishing means the pages can be created and maintained with a fraction of the manpower and server load of the current generation.

Building good interrelated metadata for artists and programmes will also help greatly to enrich the music offering of radio & TV sites, offering such things as a chart of artists most played by a network, further information behind tracklists, rich now-playing information and so on.

What’s the offering?
We have a page for every artist in the MusicBrainz database – c.350,000 and counting. They contain, where available, basic information about the artist, discographies, high-quality images and details of play count by BBC networks & programmes. (It’s worth pointing out that for most of these pages, most of the time, there won’t be much content, but that’s fine, because pages will only ever be linked where we have broadcast or otherwise featured an artist, which by definition makes them significant.) See instructions at the bottom of this mail for how to access a given artist page*

Help wanted: Add release AR links to Wikipedia

Wendell and Sergey from MusicIP have done another crawl of Wikipedia — this time the goal was to match up releases in MusicBrainz with Wikipedia pages that exist for those releases.

Just like last time the results have been broken into convenient chunks of 100 with the proper links to let people verify the matches and quickly enter them into MusicBrainz.

If you’re interested in helping, please follow the instructions on the wiki and jump in!

Thanks!

Testing PPC build of Picard

If you have a PPC Mac that runs 10.4/10.5 and have been waiting for a DMG of Picard, please try downloading and installing this version. Please let us know in the comments whether it works.

Jon Hermansen and I have been working on building Picard with only MacPorts prerequisites — that is how this DMG has been built. If this install works then we can proceed to work on a Universal Binary that should work on 10.4/10.5. If we can reach that, we should be able to release Mac binaries at the same time as we release binaries for other platforms.

Thanks for all your hard work Jon!

UPDATE: We’ve found a problem with PUID generation and have fixed it — we hope. The above link now points to the updated dmg. May not work on Tiger yet — if you have a Tiger PPC box, please try it and let us know.

Squashing the rise of the sock puppets

We’ve recently seen a rise in sock puppets here at MusicBrainz. We’ve observed editors creating separate sock puppet accounts that vote through their own edits in order to get changes into MusicBrainz faster. This practice obviously side-steps our peer-review system, and up until now we’ve had to have other editors follow the trails of naughty editors and clean up after them.

To stop this from happening again, we’ve updated the main server with a minor patch that requires people to have more than 10 approved edits in order to vote on other people’s edits. This makes creating a sock puppet account much harder: each sock puppet account will need to have a lot of work invested in it before it can be useful. We’re hoping that this simple tweak will discourage sock puppeteers.
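For the curious, the rule boils down to a single comparison. The sketch below is written in Python purely for illustration and is not the actual server patch; the names `MIN_APPROVED_EDITS` and `can_vote` are hypothetical. The only detail taken from this post is the threshold: more than 10 approved edits.

```python
# Threshold from the post; the constant and function names are hypothetical.
MIN_APPROVED_EDITS = 10


def can_vote(approved_edits: int) -> bool:
    """Return True if an editor may vote on other people's edits."""
    # "More than 10" means exactly 10 approved edits is not yet enough.
    return approved_edits > MIN_APPROVED_EDITS
```

So a brand new account with no approved edits cannot vote at all, and an account with exactly 10 approved edits still needs one more.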

Help wanted: Add Wikipedia ARs

MusicIP did some matching between the MusicBrainz data and Wikipedia in order to find artists inside MusicBrainz that didn’t yet have an AR link referencing their Wikipedia page. Brian Freud then went and created a set of pages that make adding these links a snap.

Now we need people to pick up a section of the list, go through each of the listed artists, and add the ARs. If you’re interested in helping out, please take a look at the wiki page that coordinates this effort.

Thanks!

Looking for a new style leader!

The style guideline process has been stuck in neutral for quite some time and I was hoping that Panda could take over this role from Don Redman (who had been swallowed up by real life quite some time ago), but real life is about to swallow Panda for the foreseeable future. Thus, I start the search for a new leader of the Style Council once again.

The new leader for the style council would need to:

  1. Review the process by which style guidelines get updated. Obviously this process is flawed.
  2. Devise a new method by which the style council works on style guidelines. The new leader can choose to use the Wiki, the bug tracker, mailing lists, forums, smoke signals or anything else they choose. This process has to fit with the structure of MusicBrainz and our bottom up method for working together. I will personally work with the new style leader to define this process.
  3. Document and implement the new process.
  4. Lead the style council.

Qualifications of the new style leader include:

  1. Excellent communication skills
  2. Excellent knowledge of music, especially classical.
  3. Must not be embattled in current proposals. Ideally the leader would be neutral on existing proposals and keep an objective stance on new proposals as they move through the process.
  4. Skillz in bringing about consensus. We need a strong communicator who can use their judgement to settle long-raging debates and remove contentious points from discussion. This leader needs to keep people focused when debates rage out of control.

I would very much like to see the new style process be a bottom up process where the community brings about proposals much like they do today. The major difference is that the new style leader would have the authority and presence to move these proposals along when they get stuck. My vision is that this person isn’t a nanny who micro-manages the community of style people, but rather one who provides lubrication for the process to run smoothly. If something gets stuck, forgotten or argued to death, the leader should jump in and rectify the situation.

If this position interests you, please speak up in the comments. If you think someone in the community would make an excellent leader, but suspect that this person isn’t going to speak up, please nominate them in the comments. I will take the proposed people, chat with them and chat with members in the community to see how they feel about this person. Then, I’ll make a benevolent dictator decision and install the new style council leader and set them off to their task.

Call for search server testing

After I gave some history in the last post, I’d like to put out a call for testing for the new search server. In moving from Lucene to Xapian I’ve fixed a number of bugs, some of which have been lingering for a while. Also see the list of bugs we still have open and plan to fix before the release.

If you have a pet-peeve bug that’s been annoying you, please check to see how our new Xapian test server is handling things now. (Please be patient with our dev server; the box needs an upgrade soon!)

If you are a fluent speaker of Chinese, Japanese, Korean or Thai, please take a moment to look up some artists! We had some problems with searching Chinese text. I think I fixed them, but I am not proficient in any of these languages, so please help sanity check me!

Unless I find more bugs, this new search server will go into production sometime next week. If you find a bug, please report it to the usual place.

Search: Why is it so important?

After many days of tinkering, the new search server has passed its tests and is nearly ready for deployment next week. After my last post on the search services, there were lots of questions, so I’ll give some more history on why I’m working on this now:

  1. The old Lucene based search services worked well, but installing them was a major pain. Installing compilers by hand, sacrificing chickens and hoping that things would work wasn’t my idea of fun.
  2. Lucene has a philosophy of working out of the box without significant tweaks. That’s great if you’re indexing a bunch of text, but indexing music metadata from an SQL database is a bit of a different beast. The usual Lucene tricks didn’t work so well for us, and we couldn’t tweak it to do better. Xapian requires a little more tuning out of the box, but our search results are much better now than they were before.
  3. Sending metadata lookup traffic to a service like Xapian is generally a good idea, as a single Xapian server can handle lookup traffic more elegantly than a Postgres database. And adding more search servers is easier than adding more database servers.
  4. Our traffic is growing: I expect us to handle twice as much traffic in July as we did the July before. A lot of this traffic growth is coming from people using our web-service to look up music. If the web-service slows down, the rest of the site slows down as well. So I’m trying to stay ahead of the curve, anticipate when we will reach capacity, and be able to add more machines as necessary.

As of next week, MusicBrainz will have twice as much rack space (20U of space!) and we can finally rack the two new servers that were donated a few months ago. Fortunately, due to dropping bandwidth costs, this new space doesn’t really come at a greater expense to us. I expect our hosting costs to stay nearly the same as they are now (about $1000/mo, btw).

This will allow us to have 3 times the search capacity we have now, which should keep the site working for a while longer. In the fall I hope to start moving our web-service to Amazon’s EC2 service, which should allow us to get as much capacity as we need.
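To put a rough number on "a while longer", here is a back-of-the-envelope sketch in Python. It rests on assumptions that are not stated in this post: that search traffic keeps doubling every year at a steady exponential rate, and that today's search servers are already running at full load. Neither is guaranteed, and the function name `years_until_saturated` is made up for illustration.

```python
import math


def years_until_saturated(capacity_multiple: float, yearly_growth: float = 2.0) -> float:
    """Years until steadily growing traffic fills `capacity_multiple` times today's load.

    Assumes traffic grows by a constant factor `yearly_growth` each year and
    that today's load equals today's capacity (both are assumptions).
    """
    if capacity_multiple <= 0 or yearly_growth <= 1:
        raise ValueError("need capacity_multiple > 0 and yearly_growth > 1")
    # Solve yearly_growth ** t == capacity_multiple for t.
    return math.log(capacity_multiple) / math.log(yearly_growth)


if __name__ == "__main__":
    # Tripled capacity with traffic doubling yearly.
    print(round(years_until_saturated(3.0), 2))  # 1.58
```

Under those assumptions, tripled capacity would last roughly a year and a half. If the current servers still have headroom, or growth slows down, it would last longer.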

As soon as I get the new search services deployed I’m putting my head down and coding the next server update. So, keep your fingers crossed that this process goes smoothly.

Bug tracker in read-only mode for a while

Dave was working on upgrading software on our catch-all server and ran into some problems with plugins for trac, our bug tracking system. Trac is currently up, but the login plugin hasn’t been installed yet, so no one can log into trac right now.

Dave will continue working on this in about 8-10 hours. Sorry for the inconvenience!

UPDATE: Everything is back to normal now. Thanks Dave!