Search server release: 2012-05-15

In case you haven’t gotten enough of release announcements, we have another one for you. Yesterday during the main releases we also released a new search server to match the main server release. Thanks much to Paul Taylor for working on this release to be timed perfectly!

UPDATE: The search server and the MMD schema repositories have been tagged with this tag:

release-2012-05-15

Bug

  • [SEARCH-198] – The artist is getting a lowered score on MBS
  • [SEARCH-199] – Search includes empty annotations
  • [SEARCH-200] – Search on release giving to much boost to matches on CatalogNo
  • [SEARCH-201] – explain option doesnt work if search results contain non ISO-8859-1 characters
  • [SEARCH-216] – Null pointer exception when building freedb

Improvement

  • [SEARCH-157] – Be able to search for a track by its metadata OR its puid
  • [SEARCH-186] – Search Server has hard coded redirect URL
  • [SEARCH-187] – Update Junit Test from 3 to 4
  • [SEARCH-202] – Allow searching for RGs based on their releases’ status
  • [SEARCH-204] – Upgrade codebase to Lucene 3.6
  • [SEARCH-214] – Add release group ID to the web service indexed search results for recordings

New Feature

  • [SEARCH-205] – Search server should return multiple ISWCs for works
  • [SEARCH-207] – Changes due to introduction of ISO-3 language code
  • [SEARCH-208] – Chnages due to Split release group attributes into two types Schema Change
  • [SEARCH-209] – Support for Multiple IPI Artists
  • [SEARCH-211] – Support for new Track ‘Number’ field in a track
  • [SEARCH-212] – Add abiility to index, display and search works by lyrics language as part of schema change
  • [SEARCH-213] – Changes due to MBS-1385:Support unknown end dates

Task

Search server index updating paused

Tomorrow’s release requires us to update the main server and the search servers at the same time. This presents a bit of a chicken-and-the-egg problem: We need to build new indexes even before we’ve migrated the database to our new schema.

To accomplish this, I’ve stopped index updating and created a separate database that will allow me to build some indexes that will be a few hours out of date when we release tomorrow. They will be slightly old, but at least we will have indexes that work.

I expect the indexes to be up to date about 3 hours of the release is complete. If you find the indexed search out of date, please use the direct search in the meantime.

Sorry for any inconveniences this may cause.

User agent based throttling is now live

Yesterday we talked about rolling out our throttling based on User-Agent strings. A few minutes ago we pushed this feature live on our servers so now the updated rules are in effect. python-musicbrainz/0.7.3 users are now allowed 500 requests every 10 seconds and every single one of these requests is constantly being used. No surprise here. 🙂

For the exact details on what is throttled and how to get around your application being throttled, see our rate limiting documentation.

Current web service rate limiting documentation

We’ve just added a page that documents what we’re currently blocking on our Web Service. We hope to lift the block on python-musicbrainz/0.7.3 tomorrow and instead throttle the number of requests it can make in a given period of time.

I’ll post another entry once we’re done with making those changes.

Dear python-musicbrainz/0.7.3 application, we need to talk!

An application that uses our python-musicbrainz/0.7.3 client library has been putting undue load on our servers all at once. This application looks up something at MusicBrainz at 03:00UTC causing our servers to be overloaded at that time each day.

To protect our servers from being overloaded we’re going to block this application from 3:00 UTC – 4:00 UTC. We’re hoping that this will alllow us to identify the application and start a dialog with the application authors. Once we have established communication with the authors and worked up a plan to fix this, we’re going to release the block.

We really dislike blocking applications, but if applications are being inconsiderate of our resources, we’re left with few options. We hope to hear from the application authors soon so we can resolve this issue. Also, we’re moving forward with our plans to require User-Agent strings that properly identify applications using our service to fix this problem going forward.

If you are the author of said application, please leave a comment with information on how we can get in touch with you.

Release editor service interruption

In an effort to mitigate/fix MBS-3379 we need to restart the service that keeps the session information for our release editors. We’re going to do that tomorrow Saturday October 28 at Noon PDT, 3PM EDT, 8PM London, 9PM Amsterdam. If at this time you have a release editor open, submitting your edits will fail and you will need to start your edits over again.

Sorry for the inconvenience.

Web service user-agent string blocking reminder

I would like to remind Web Service users that on 16 November we’re going to block generic User-Agent strings from accessing our web service. Earlier we said:

The User-Agent string needs to identify the application and the version of the application that is making the request; having a generic User-Agent string like “Java/1.6.0_24″ or “PHP/5.3.4″ does not allow us to properly identify the application making the requests.

IMPORTANT: 6 Months after we release NGS (Nov 16th) we’re going to start blocking common generic User-Agents strings, so please make sure that you send us a proper User-Agent header as part of your request.

You have been warned. 🙂

The FreeDB gateway has been updated to NGS!

I’m pleased to announce that the FreeDB gateway, which lets you fetch MusicBrainz data via FreeDB enabled applications, has been updated to use the NGS database. As of now, its updating with new data from the main MusicBrainz server and you should be able to look up new CDs.

To use the FreeDB gateway, set the FreeDB server in your application to freedb.musicbrainz.org on port 80. For more information, please take a look at our wiki page for the gateway.

Thanks to Lukas for porting this code to NGS!

Record traffic, new server in NGS rotation

We’re currently seeing record levels of traffic right now. Since the beginning of August our traffic has gone from 9M hits per day to 14M hits per day today, which is a significant increase in traffic. Fortunately our servers were able to handle the extra load, but we started getting near capacity.

Yesterday I started working on taking one of the servers from the classic site and moving it into the NGS cluster. We finished this a couple of hours ago and we now have a maximum limit of 227 queries per second (qps).

If you find any spurious troubles with the site in the next day or so, please open a ticket.