Google Summer of Code Projects have been accepted

I’m pleased to announce that we have accepted three Google Summer of Code projects:

  1. Using Collaborative Filtering to generate Relationships between artists for MusicBrainz by Sharon Myrtle Paradesi
  2. Working on implementing simplified NGS by Erik Dalén
  3. Improved Statistics and Trivia by Guelson Fostine

Lukáš will mentor Erik and Guelson and I will be mentoring Sharon. By accepting three of our proposals Google is indirectly sponsoring MusicBrainz with another $15,000! (Each student gets $4,500 and each mentor gets $500 per student)
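
For anyone who wants to check the math, here is a quick sketch of how the $15,000 figure adds up. The variable and function names are made up for illustration; only the dollar amounts come from the announcement above.

```python
# Figures taken from the announcement above; names are illustrative only.
STUDENT_STIPEND = 4_500             # paid to each student
MENTOR_PAYMENT_PER_STUDENT = 500    # paid to the mentoring organization per student
ACCEPTED_PROJECTS = 3


def total_sponsorship(projects: int, stipend: int, mentor_payment: int) -> int:
    """Total money flowing to students and mentors for the accepted projects."""
    return projects * (stipend + mentor_payment)


total = total_sponsorship(ACCEPTED_PROJECTS, STUDENT_STIPEND, MENTOR_PAYMENT_PER_STUDENT)
print(f"${total:,}")
```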

Thank you very much Google and good luck to Sharon, Erik and Guelson!

Our servers are overloaded!

This weekend we’ve seen a rise in tagger traffic to the MusicBrainz site. This extra traffic is causing load spikes that give us the dreaded 502 proxy error messages. Now that the release is done I can focus more time on getting our various hardware issues solved and bring a new database server online before the load spikes return next weekend.

What can you do? There are three concrete things:

  1. Make a donation to help us cover our costs.
  2. Stop tagging for today and spend some time with your friends and family for Easter. 🙂
  3. If you must continue tagging, please use our UK mirror at http://www.uk.musicbrainz.org. In the options dialog of your favorite tagging application, look for the tab that lets you set the server you’re using to tag. Enter www.uk.musicbrainz.org in that field and you should be good to go.

Sorry for the inconvenience, we’ll work on this as soon as possible!

Server update has been completed!

After much work by Lukáš, Dave, Age (Prodoc) and myself, I’m pleased to announce that the main server has been updated! We now have support for DataQuality, Labels, improved cover art and track annotations, plus a whole host of bug fixes. For the detailed list of what has been included in this release, please see our release milestone.

This release is significant in a number of ways:

  1. The labels feature is our first major schema extension in quite some time.
  2. It’s our first schema change release in over a year!
  3. Age Bosma (Prodoc) has submitted a number of patches that were included in this release. It’s great to see another aspiring developer getting code included in the main server. Well done and many thanks, Age!

Huge thanks go to Lukáš, Dave and Age for working on this release. Thank you!

Next server release: 1 week from today

The next server update is scheduled to happen one week from today. Today Lukáš and I finished checking in changes and will now only do further bug fixes — that is if you help us find more bugs!

A few things to note:

  • Lukáš added support for track annotations and release formats!
  • Data quality changes are no longer automods for everyone. Their behavior is defined in the edit info page now.
  • Since we’ve implemented the DataQuality feature, expired edits will no longer stay open for a grace period. Thus the next time ModBot runs after the next release, a bunch of expired edits will be accepted, since all the data will be at the default level and the action for expired edits at the default level is to accept the edit! Is this really what we want? Please take a moment to review the edit info page now and make sure it all makes sense to you!

The staging server is now updated with the latest code — please come and help us test over the next week to make sure no new bugs slipped into the upcoming release.

Next server release: April Fools Day

April 1st may not be the best date to release a new server, but scheduling would have it that way: The next server update is officially scheduled for April 1st, 2007.

To prevent the next release from becoming a bad April Fools joke, we will need your help to test the new features on the server. Recently we’ve asked people to come check out the new Labels support and the Data Quality support. Now that we’re coming to a close on this new release (there are still bugs to be fixed, but major functionality changes are done) we’d like people to come check it out again and help us test on the staging server.

The following features will be included in the next release:

  1. Improved cover art support: A new release-url Advanced Relationship link type has been created. By linking a release to a cover art JPG file at CD Baby or at the Internet Archive, editors will now be able to deep link to cover art on sites other than Amazon.com. For more information on this feature, see CoverArtSites. See an example here and here.
  2. Data quality: Based on the first round of feedback, we’ve narrowed the data quality levels down to 3 from 4. The staging server has also been loaded with recent data and the ModBot is now running for a more complete test. See below for more comments on this.
  3. Label support: Label support has been around and a number of bugs have been fixed. For more info see Labels.
  4. Lookup nagging: We will nag tagger users who look up their files at MusicBrainz but who have not donated. If you go to the taglookup page, you will be constantly nagged if you’re not logged in. If you log in, you will only be nagged every 5th lookup (I suspect that most people will be logged in). If you’ve donated to MusicBrainz, you won’t be nagged at all. This is designed to be not terrible right off the bat, and I am curious to see what people think of this solution. Please point your tagger to http://test.musicbrainz.org and do some lookups to see if you think the current nagging approach will work OK.
  5. Bug fixes: Lots of them — see our milestone info for more details.
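
To make the lookup nagging rules from item 4 concrete, here is a minimal sketch of the decision in Python. This is not the server code: the function name, its parameters and the 1-based lookup counter are all made up for illustration. It also assumes that "constantly nagged" means on every lookup, and that we can only know someone has donated once they are logged in.

```python
NAG_INTERVAL = 5  # logged-in non-donors are nagged on every 5th lookup


def should_nag(logged_in: bool, has_donated: bool, lookup_count: int) -> bool:
    """Decide whether to show the donation nag for this lookup.

    lookup_count is the 1-based number of lookups this user has made
    (a hypothetical counter, not an actual server field).
    """
    if not logged_in:
        # Anonymous users: we cannot tell whether they donated, so always nag.
        return True
    if has_donated:
        # Donors are never nagged.
        return False
    # Logged in but has not donated: nag on every 5th lookup only.
    return lookup_count % NAG_INTERVAL == 0


# Which of the first ten lookups nag a logged-in user who has not donated?
nagged_lookups = [n for n in range(1, 11) if should_nag(True, False, n)]
print(nagged_lookups)
```

Keeping the interval in one constant would make it easy to tune if every 5th lookup turns out to be too annoying or too easy to ignore.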

I have some more comments regarding the DataQuality feature — based on blog feedback I’ve changed the data quality levels to:

  1. Low
  2. Unknown
  3. High

I’m not certain if these are the best levels, but I wanted to throw out some thoughts that go with choosing these names/levels. First, the existing data and all new data that editors have not vouched for needs to have a name attached to it that makes sense. Just applying low data quality to all data by default would be unfair to large swaths of our data. I think one level needs to indicate that no human has vouched for the data and the other levels need to indicate that someone has looked at the data and given it a thumbs up or thumbs down. Second, I like Low and High, but I am not a big fan of Unknown. What other word can we use that suggests that no human has vouched for this data?

Other suggestions I’ve tried for level names:

  • bad, unknown, good
  • unverified, unknown, verified

I tend to dislike these levels since labeling our data as bad seems like a poor idea. And verified is questionable as well: what do you verify the data against? So, please take the staging server for another spin and let us know what you think now. We still have nearly three weeks to try and figure out the best way of handling this.

Finally, the change artist/release quality edits are currently still auto edits for everyone — this will be changed before the release.

Thanks!

Data quality: We want your feedback!

The release locking concept has been around for quite some time and has been debated at great length. After a couple of long calls, Don and I reworked the concept into the data quality concept. The idea is relatively simple:

  • Each artist and each release in MusicBrainz now has a quality indicator that shows the quality of the data: Unknown, low, normal or high.
  • Data that is marked with low quality should be easy to change.
  • Data with normal quality should take about the same amount of effort to change as it takes now.
  • Data with high quality should be harder to change, to avoid incorrect changes to good data.
  • Each edit type will define the number of votes required to pass, how long voting stays open, what to do if an edit receives no votes, and whether the edit is an auto edit.
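
As a rough illustration of the list above, here is a sketch of how per-quality settings for a single edit type might be modeled. Everything in it is an assumption made for the example: the class and function names, the numbers, the idea that unknown data behaves like normal data, and the voting rules in `resolve`. The real values live on the edit information page.

```python
from dataclasses import dataclass
from enum import Enum


class Quality(Enum):
    UNKNOWN = "unknown"
    LOW = "low"
    NORMAL = "normal"
    HIGH = "high"


@dataclass(frozen=True)
class EditPolicy:
    votes_required: int       # unanimous votes needed to close the edit early
    voting_days: int          # how long the edit stays open for voting
    accept_if_no_votes: bool  # what happens when the edit expires without votes
    auto_edit: bool           # applied immediately, without any voting


# Hypothetical numbers for one imaginary edit type; not the real settings.
EXAMPLE_POLICY = {
    Quality.LOW: EditPolicy(1, 4, True, False),
    Quality.NORMAL: EditPolicy(3, 14, True, False),
    Quality.HIGH: EditPolicy(4, 14, False, False),
}
# Assumption: data nobody has rated behaves like normal-quality data.
EXAMPLE_POLICY[Quality.UNKNOWN] = EXAMPLE_POLICY[Quality.NORMAL]


def resolve(policy: EditPolicy, yes_votes: int, no_votes: int, expired: bool) -> str:
    """Return 'accept', 'reject' or 'open' for an edit under the given policy."""
    if policy.auto_edit:
        return "accept"
    if no_votes == 0 and yes_votes >= policy.votes_required:
        return "accept"
    if yes_votes == 0 and no_votes >= policy.votes_required:
        return "reject"
    if not expired:
        return "open"
    if yes_votes == 0 and no_votes == 0:
        return "accept" if policy.accept_if_no_votes else "reject"
    return "accept" if yes_votes > no_votes else "reject"


# One yes vote closes a low-quality edit but leaves a high-quality one open.
print(resolve(EXAMPLE_POLICY[Quality.LOW], 1, 0, expired=False))
print(resolve(EXAMPLE_POLICY[Quality.HIGH], 1, 0, expired=False))
```

The point of the table is that the same edit gets easier or harder to push through depending only on the quality level of the data it touches.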

This new feature will allow us to edit sloppy data faster, tune the editing system to fit better with how people use it, and prevent accidental edits to good data. Now we need your help in testing this system and giving us feedback about the various edit levels.

If you would like to help, please read the data quality wiki page, view the new edit information page and then test the new features on the staging server. Each artist and release page now has a Change Quality link that will allow you to change the quality of the artist/release. Once those are changed, the edit system will behave according to the values set forth in the edit information page. Please note that the change artist/release quality edits are currently autoedits, which will be changed once we’re done testing the bulk of this new system. For right now we’re making it easy to change the data quality for testing purposes.

To start testing, head over to the staging server. Add any bugs you find to the bug tracker. Or post your feedback in the comments.

MusicBrainz downtime redux

The attempt to switch the database to the new Sun server has failed and we’re still running with the old set of hardware. For some reason Postgres refuses to run with any kind of acceptable level of performance on this new machine. Thus, we’re setting aside this server for the time being and instead working to bring a replacement server online as soon as possible.

We’re also working with Dell to determine what the problem on the server is, and luckily the servers still have an extended warranty covering them with 4 hour on-site service. However, it appears that we need to diagnose the problem before Dell is willing to send someone out. I’ve run a memory test on the server and it passed all of its tests.

I’m out of ideas. I’m frustrated with Sun and Dell and will no longer waste time on their hardware trying to solve this problem. Instead I will work with whatever hardware I can scrounge up and get the service moved over to our spare server and then we’ll sort out the issues with our glorious servers.

It looks like it will at least be Friday before we can make the next attempt to fix this. Again, sorry for the inconvenience — we’re doing our best to fix this problem ASAP.

MusicBrainz downtime: January 17th, Noon PST

Tomorrow, January 17th, MusicBrainz will be unavailable starting at Noon PST (15:00 EST, 20:00 UK, 21:00 CET) for about 1 hour.

The recent problems with certain artists not being found are being caused by a failing motherboard (specific guess: failing DMA controller). Fortunately the database server is still working well and the database has not been affected by this hardware malfunction.

Tomorrow Dave Evans and I will bring moose, our newly donated Sun server, online and swap it in for the current database server. Once the new database server is up and running, we will want to monitor it for a while to make sure it’s doing its job properly. Once we’re convinced that it is doing its job, we will take the site down again, swap the drives in our then-to-be obsoleted database server with those of our ailing web server, and bring the site back up.

After that we can remove the failing server and purchase a replacement motherboard — if you’d like to make a donation to help us cover the cost of the new motherboard (est: $800), we would greatly appreciate your support!

Thanks to Sun for the spiffy server that will get us out of this bind and a big thanks to Dave Evans for his hard work getting our servers stable again.

I apologize for the inconvenience all this may have caused you.

Trying Google Analytics

I am currently in the middle of writing the year-end report for the MetaBrainz Foundation and it’s clear that our ancient web statistics system using webalizer is hopelessly outdated. Following a glowing recommendation from Wendell at MusicIP, I’ve turned on Google Analytics for the MusicBrainz site so we can get a better idea of what our traffic is and where it comes from.

Google Analytics works by having the client browser send a tiny ping request to Google — this tiny request is enough to track all sorts of things about the MusicBrainz traffic. If for some reason this should cause some issues, please post a comment here.

If it does work out and gives us good results, we’ll keep it. If it causes problems we’ll dump it. Let’s wait and see!

The old RDF based web service has been deprecated

The subject pretty much says it all: the new XML Web Service has been debugged and is now stable. This new web service was designed to replace the old web service since it’s better designed, more concise and a lot simpler to use.

If you are using the old web service directly or have been using libmusicbrainz (1.x/2.x) or libtunepimp (all versions) then you are using the old web service and you should make an effort to migrate your application to use the new XML web service. The old web service will stay in service for all of 2007, but we may get rid of it as early as 2008.
