MusicIP open sources a small server extension

I’m pleased to announce that MusicIP has just open sourced a small extension to the MusicBrainz server!

MusicIP contracted me to write a set of SQL scripts that would take their mirror of the MB database and create an extra table that stores the first release date for each track. As you may know, we already have this for albums, but we haven’t had (or needed) it at the track level.
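
To give a rough idea of what such an extra table involves, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (track_release, track_first_release, first_release_date) are made up for illustration; the real MusicBrainz schema and the MusicIP scripts are different, so check the README for the actual usage.

```python
import sqlite3

# Hypothetical, simplified schema: NOT the real MusicBrainz tables and
# NOT the MusicIP scripts, just an illustration of the idea.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE track_release (track_id INTEGER, release_date TEXT);
    INSERT INTO track_release VALUES
        (1, '1994-03-01'),
        (1, '1991-09-24'),
        (2, '2001-06-05');

    -- Extra table holding the earliest known release date per track.
    -- ISO dates (YYYY-MM-DD) sort correctly as plain text, so MIN works.
    CREATE TABLE track_first_release AS
        SELECT track_id, MIN(release_date) AS first_release_date
        FROM track_release
        GROUP BY track_id;
""")

first_dates = dict(conn.execute(
    "SELECT track_id, first_release_date FROM track_first_release"
))
print(first_dates)
```

Track 1 appears on two releases in this toy data, and the extra table keeps only the earlier date.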

If you’d like to check out this extension, you can find it here. Take a look at the README file to see how this should be used.

Please note that this code is checked into the RELEASE_20060712 branch — once we’re finished with dev work on this branch we will merge it back into the trunk.

Big thanks to Wendell Hicken and Matthew Dunn of MusicIP!

To search aliases or not?

In ticket #1731 we’re currently discussing the merits of having the artist aliases searched by default. Compare these two searches:

  1. Search without aliases for “Jennifer”
  2. Search with aliases for “Jennifer”

The perfect match for Jennifer is half-way down the page of results if aliases are included. Jennifer is all the way at the top (where it should be) when the aliases are not searched.

So, why does this happen?

Take, for instance, the top hit of the with-aliases search: Jennifer Paige. She has an alias for “Jennifer Page”, so when Lucene ranks the search results, the word Jennifer appears twice, which is a better match than when the word appears only once. This disturbed our users and it plain feels wrong to me.

Then I tried to play with Lucene’s term boosting functions. Take this query:

artist:jennifer^10 sortname:jennifer^10 alias:jennifer^0.0000001

In English, it says to search for Jennifer in artist names and sortnames and to make these hits 10 times more “important” than normal hits. It also says to search aliases and make hits from these vastly less important than normal hits. The result: Jennifer is at the top, as we want. But what happens when we search for Bjork (not Björk)?

We get this mess where Björk is the last search hit with a score of 0. (This is not the best example, since the next version of search will automatically find Björk when searching for Bjork, but it still illustrates the problem.)
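
The trade-off can be reproduced with a toy model. This is only a sketch: it is not Lucene's actual scoring formula. It assumes each field contributes boost × matches ÷ √(words in the field), which loosely mimics Lucene's preference for shorter fields. The artist data is a tiny hand-made sample, and "Bjork Tribute Band" is an invented entry.

```python
import math

# Toy data. "Bjork Tribute Band" is a made-up entry for illustration.
ARTISTS = [
    {"artist": "Jennifer", "sortname": "Jennifer", "alias": []},
    {"artist": "Jennifer Paige", "sortname": "Paige, Jennifer",
     "alias": ["Jennifer Page"]},
    {"artist": "Björk", "sortname": "Björk", "alias": ["Bjork"]},
    {"artist": "Bjork Tribute Band", "sortname": "Bjork Tribute Band",
     "alias": []},
]

def field_score(term, text):
    """Matches of `term` in `text`, damped by the field length (toy model)."""
    words = text.lower().replace(",", " ").split()
    if not words:
        return 0.0
    return words.count(term.lower()) / math.sqrt(len(words))

def score(doc, term, boosts):
    """Sum the boosted per-field scores over name, sortname and aliases."""
    total = boosts["artist"] * field_score(term, doc["artist"])
    total += boosts["sortname"] * field_score(term, doc["sortname"])
    total += boosts["alias"] * sum(field_score(term, a) for a in doc["alias"])
    return total

def search(term, boosts):
    """Return (score, artist) pairs with a non-zero score, best first."""
    hits = [(score(d, term, boosts), d["artist"]) for d in ARTISTS]
    return sorted([h for h in hits if h[0] > 0], reverse=True)

PLAIN = {"artist": 1, "sortname": 1, "alias": 1}
BOOSTED = {"artist": 10, "sortname": 10, "alias": 0.0000001}

print(search("jennifer", PLAIN))
print(search("jennifer", BOOSTED))
print(search("bjork", BOOSTED))
```

With equal weights, Jennifer Paige comes out on top because her alias adds a third match. With the boosted weights, Jennifer wins, but Björk, who only matches "bjork" through her alias, ends up last with a score that rounds to 0.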

As you can see, tweaking the searches to make things better one way will make other searches worse. Do you think it is more important to search aliases by default or to have better search results by default?

Call for bug-fix testing

We’ve fixed a number of bugs (321 of them as of right now!) and we should be ready to go for another bug-fix update of the main server this weekend. To avoid the mess we had last time, please take some time in the next few days to check the test server to see how things are looking. Please test now and not after the release!

I would also like to remind people: if you find a bug, please report it via our bug tracker. Please do not mail bug reports or mention them in IRC in hopes of having them fixed. Also, if you don’t report a bug before the release, please don’t scream your head off if the issue hits the main servers when we do the release.

Go test on the staging server and report bugs or look at the list of closed bugs.

Thanks to Stefan for fixing all these bugs!

libtunepimp releases

Lukas just released two new bug fix versions of the tunepimp library. The following changes were made:

libtunepimp 0.5.1:

  • Fixed broken symlink problem in plugins/tta/Makefile.am
  • Don’t write files/directories with leading dots. (#1427)
  • Added SetNotifyCallback to the Python bindings.

libtunepimp 0.4.3:

  • Fixed check for TagLib 1.4 in configure.in + few other build system fixes.
  • Fixed buffer overflow in lookuptools.cpp (patch by urs_fleisch at yahoo dot de).
  • Fixed memleaks in the WMA plugin.

Download either of these releases from the libtunepimp download page. Thanks Lukas!

Usability fixes live on server

A number of you had reported various usability issues with the new server update — we just updated the server with some more fixes. For the run-down of what we changed, please see the list of recently closed bugs. Thanks to all those who reported bugs and helped us sort out these issues!

Main server update updated

We’ve gotten a slew of bug reports overnight as to what was wrong with the latest release. Keschte and I worked hard to address a number of these issues and I’ve done a mini update on the server. Check out this list of recently closed bugs to see what we did. Thanks to all those who reported bugs!

The broken tagger issues should also be resolved now.

Main server updated

We just updated the main server with the long awaited new release:

Not really new, or not so obvious features

  • The MusicBrainz web pages have been polished and cleaned to be web standards compliant. We proudly serve the pages in XHTML 1.1, CSS 2.0 compliant mode.
  • The user interface has been given a face lift and a cleaner look.
  • We have made great strides on the terminology changes, which had hung in limbo for quite some time. From now on, the term “album” is reserved for release attributes. The albums are now called “releases”. The same is true for the term “moderation”: we replaced this with the more appropriate term “edit”.
  • Lots of static pages have been replaced with WikiDocs pages. This means that most of the documentation pages displayed on this very server are no longer static HTML pages, but are instead content transcluded from the MusicBrainz Wiki. For each of these pages, a given revision is labelled as “official content”, and is displayed inside the MusicBrainz site. (For example, compare http://musicbrainz.org/doc/ContactUs to http://wiki.musicbrainz.org/ContactUs.) If the content is not labelled as official, it can still be browsed in the same way, but will be labelled as unofficial content. WikiDocs is a huge project, and some parts will always be in the works. Please talk to the Wikizens if you’d like to contribute!

New features

  • The “Release Editor” (nicknamed “The edit page to rule them all”) replaces the FreeDB Import, the CD-Lookup as well as the “Edit all” workflows. It supports all the edit types usually applied to core entities (artist/release/tracks), including single artist conversions, various artist conversions, but also explicit changes to one track artist of a single artist release. This will allow us to more flexibly edit releases to conform to the new ReleaseArtistStyle, all in one go.
  • Timestamps on edit notes.
  • Diff display of changed titles on the voting (previously: moderation) pages.
  • Edits are linked to their respective documentation.
  • Edits show all the parent objects that are relevant for the current object (example: on a track title edit, the artist and the release are shown in the header).
  • Pending edits are shown and explained in more detail on the edit pages. Links are provided to review the pending edits.
  • The help blurbs on the editing pages were extended, and now include links to official WikiDocs documentation relevant to the current EditType.
  • GuessCase classical mode.
  • The Indexed Search was extended to include additional fields for the Artist Search. Users can choose to include artist names, sort names and aliases for more precise search results.
  • The Indexed Search results have an additional row that lets you add the entity to the Create Relationships list. This should make entering relationships much faster.

Improved features

  • Lots of fixes for the GuessCase function.
  • Numerous others: please see the closed tickets for the current Milestone and Version.

Many thanks to Keschte and everyone else who contributed and tested for your hard work on this release!

libtunepimp 0.5.0 released

libtunepimp 0.5.0 was just released — the changes for this version are:

  • Versioned header files. tunepimp/tunepimp.h -> tunepimp-0.5/tunepimp.h
  • Removed track lookup parts of libtunepimp
  • WavPack, Speex and The True Audio metadata plugins
  • Correct handling of UNC paths on Windows
  • Fixed non-album tracks renaming/moving (#1408)
  • Trivial change to id3tag to read tags with empty ID3 frames (#1568)

IMPORTANT: Please note that libtunepimp 0.5.0 is incompatible with previous versions of libtunepimp. We’re in the process of re-architecting libtunepimp and we’ve started removing the features that do file lookups, since Picard now does lookups with python-musicbrainz2. If you wish to use lookup features with libtunepimp, you will either need to use python-musicbrainz2 or call the XML Web service directly.

You can also view the complete diff of this change. You can download the new release in the usual place. Big thanks to Lukas for working on this release!

New Server Release

It has been rumoured for quite some time now, and I think that the new server release is ready for beta-testing. Please jump in and help find the remaining bugs. If you find any, file them under the XHTML 1.1 Milestone and set the owner to yours truly.

This is a significant update to the look and feel of MusicBrainz: many pages and workflows have changed and there are bound to be a number of bugs. We’ll need people to jump in and help test if we want to get this release out soon.

See what has changed: Release Notes
Test Server (as usual): test.musicbrainz.org
Bug Tracker (as usual): bugs.musicbrainz.org

For right now, we’re not specifying a release date — we need to get more eyes looking at this new release before we can nail down a date. So, please jump in and help test!!

What's up with those pesky 502 errors?

There are two web servers running on the main web server machine. The first web server is light and handles all the content that is simple, such as static pages, images and the like. Anything that requires more intelligence, such as talking to the DB, gets passed to the second web server, which is designated for these heavy requests.

The light server will wait for a specified time for the heavy server to finish its job, currently 120 seconds. If the heavy server hasn’t finished the job in that time, the light server gives up and returns the dreaded 502 error. Unfortunately, the DB server will continue to chug on the query and finish executing it as requested: cancelling an existing query is hard to do, and oftentimes it’s better to let the server just run its course.

The gut reaction might be to say: “Why not stick around longer and wait for the results, if the DB is going to crank them out anyway?” The problem is that if we do this, the light web server sits idle while waiting for the DB/heavy server to finish its job. The light server can give up and instead spend its time on things it can accomplish in a reasonable amount of time, like serving smaller requests for others. With this setup, the overall system favors the less intensive requests and thereby increases the overall number of queries that are successfully handled. If we stopped and waited for the DB/heavy server to finish its stuff, we would pretty quickly clog up the web server with requests that are sitting idle, doing nothing. That clog would then prevent any further connections to the web server, and the whole site would come to a halt.
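
The give-up-after-a-timeout behaviour can be sketched in a few lines of Python. This is only an illustration of the idea, not our actual web server configuration: the function names are made up, a thread pool stands in for the heavy server, and the timeout is a fraction of a second instead of 120 seconds so the example runs quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

PROXY_TIMEOUT = 0.2  # seconds here; the real setting quoted above is 120

def heavy_request(duration, finished):
    """Stand-in for the heavy server plus DB: works for `duration` seconds."""
    time.sleep(duration)
    finished.append(duration)  # the work completes even if nobody is waiting
    return 200

def light_server(pool, duration, finished):
    """Stand-in for the light server: wait up to PROXY_TIMEOUT, then give up."""
    future = pool.submit(heavy_request, duration, finished)
    try:
        return future.result(timeout=PROXY_TIMEOUT)
    except TimeoutError:
        return 502  # the client gets an error; the heavy job keeps running

finished = []
with ThreadPoolExecutor(max_workers=2) as pool:
    fast_status = light_server(pool, 0.01, finished)
    slow_status = light_server(pool, 0.5, finished)

# Leaving the `with` block waits for outstanding work, so by this point the
# slow job has finished as well, although its client already received a 502.
print(fast_status, slow_status, finished)
```

The fast request returns 200 and the slow one returns 502, yet both jobs end up in the finished list. That is the same situation as the DB server completing a query nobody is waiting for any more.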

If you want a visual representation of what is going on, check the load graphs for dexter, our DB server. With any load greater than 4.0, the DB server is no longer running optimally. We’re fine right this second, but in 10 minutes’ time?

So, what are we doing about this?

  1. Optimize the server code so that the user cannot make these intensive requests.
  2. Spread the DB load across multiple replicated slave servers.
  3. Partition the database so that we can have multiple master servers. For instance, we could have one DB server that handles all the edits and one that handles the data. Maybe one that handles TRMs and PUIDs. This way each machine does less work, but this is a lot of work to code for mb-server devels.
  4. Find someone to give us a beefy database server with 12GB – 16GB of RAM.
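
Option 2 in the list above could look roughly like the following. This is a hypothetical sketch, not mb-server code: the Router class and the server names are invented, and it simply assumes that anything starting with SELECT is a read that a replicated slave can answer.

```python
import itertools

class Router:
    """Send writes to the master and rotate reads across the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def pick(self, sql):
        # Naive assumption: every SELECT is a read-only query.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.master

# Hypothetical server names, for illustration only.
router = Router("master", ["replica1", "replica2"])
targets = [router.pick(query) for query in (
    "SELECT name FROM artist",
    "UPDATE artist SET name = 'x'",
    "SELECT name FROM album",
    "SELECT name FROM track",
)]
print(targets)
```

Reads alternate between the two replicas while the single UPDATE goes to the master, so the master only has to handle the edits.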

So, next time you’re aching for new mb-server features, please keep in mind that we’re spending a lot of time just keeping the service running smoothly. Our income isn’t yet large enough to hire people to maintain the site AND hire people to hack on new features. In the meantime, Dave Evans and I will focus on keeping things running, and hard-working folks like Keschte are working on new features for the server. Overall we’re still moving forward, just a lot slower than we’d like.

What can you do?

  1. Help us solve our DB issues if you’re a DB person.
  2. Help us write more mb-server code.
  3. Most important of all, make a donation!!
  4. Bug your rich friends to donate to MusicBrainz so we can buy a beefy database server. 🙂
