Dear python-musicbrainz/0.7.3 application, we need to talk!

An application that uses our python-musicbrainz/0.7.3 client library has been putting undue load on our servers all at once. This application looks something up at MusicBrainz at 03:00 UTC, causing our servers to be overloaded at that time each day.

To protect our servers from being overloaded, we’re going to block this application from 03:00 to 04:00 UTC. We’re hoping that this will allow us to identify the application and start a dialog with the application authors. Once we have established communication with the authors and worked out a plan to fix this, we’re going to lift the block.
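For readers wondering what such a block amounts to, here is a minimal sketch, assuming the block is keyed on the User-Agent string and on the UTC hour. The names `BLOCKED_AGENT_PREFIX` and `is_blocked` are made up for illustration; this is not our actual server configuration.

```python
from datetime import datetime, time, timezone

# Hypothetical values for illustration; the real rule may differ.
BLOCKED_AGENT_PREFIX = "python-musicbrainz/0.7.3"
BLOCK_START = time(3, 0)  # 03:00 UTC, inclusive
BLOCK_END = time(4, 0)    # 04:00 UTC, exclusive


def is_blocked(user_agent: str, now: datetime) -> bool:
    """Return True if a request should be refused.

    `now` must be a timezone-aware datetime; it is converted to UTC
    before being compared with the block window.
    """
    now_utc = now.astimezone(timezone.utc)
    in_window = BLOCK_START <= now_utc.time() < BLOCK_END
    return in_window and user_agent.startswith(BLOCKED_AGENT_PREFIX)
```

A rule like this refuses every application that sends the library's default User-Agent during that hour, which is why identifying the one responsible application matters.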

We really dislike blocking applications, but if applications are being inconsiderate of our resources, we’re left with few options. We hope to hear from the application authors soon so we can resolve this issue. Also, we’re moving forward with our plans to require User-Agent strings that properly identify applications using our service to fix this problem going forward.
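As a rough illustration of what a well-behaved client could look like, the sketch below sends a User-Agent that names the application and a contact address, and spreads a daily job over a random delay instead of firing exactly on the hour. It uses only the Python standard library rather than python-musicbrainz itself. The application name, contact address, URL and the `jittered_delay` helper are all hypothetical, and the random delay is a common technique, not something we have agreed with anyone.

```python
import random
import urllib.request

# Hypothetical application name, version and contact address.
APP_USER_AGENT = "ExampleTagger/1.2 ( maintainer@example.org )"


def build_request(url: str) -> urllib.request.Request:
    """Build a request that identifies the calling application."""
    return urllib.request.Request(url, headers={"User-Agent": APP_USER_AGENT})


def jittered_delay(max_seconds: float = 3600.0) -> float:
    """Pick a random delay so scheduled clients do not all start at once."""
    return random.uniform(0, max_seconds)


if __name__ == "__main__":
    # Placeholder URL; substitute the web service endpoint you actually use.
    request = build_request("https://example.org/ws/lookup")
    print("User-Agent:", request.get_header("User-agent"))
    print("Would wait %.0f seconds before the daily lookup" % jittered_delay())
```

A scheduled job would sleep for the returned delay before making its first request, so that many installations started by the same cron entry do not reach the server in the same second.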

If you are the author of said application, please leave a comment with information on how we can get in touch with you.

19 thoughts on “Dear python-musicbrainz/0.7.3 application, we need to talk!”

  1. How do you want to block it if the application is unknown? Is the IP address always the same?

  2. hrglgrmpf: We’re blocking that User-Agent string, which is all we know of the application. Well, we know that they use our library, but that is it.

  3. That could make sense. I suspected Ubuntu 11.10, since the pattern of our traffic growth makes sense in that context.

  4. Ugh, python-musicbrainz2 doesn’t seem to allow setting a custom user agent string (would probably work with monkey-patching though). So essentially, you’d be blocking everything that uses the current version of python-musicbrainz2. However a repeating time pattern hints at something cronnable and it’s using Python, in which case albumidentify is most certainly one of very few candidates (apart from some custom software), so I guess you’ll find the evil one soon enough.

  5. Why are you even talking about blocking it? Just throttle its speed such that the load isn’t a problem anymore. For example you could add a sleep(3) to each of its requests. As long as it’s not making an infinite number of parallel requests, that should slow it down enough.

  6. intgr, as far as I know, they want to block it so the person will actually notice, come and say “hey, it’s me, what should I do instead”. Will it work? No idea.

  7. intgr: If we throttle them, then we’re quietly providing bad service. We’d rather identify the problem, have a chat and go back to providing good service asap.

  8. Just hit this as I happened to try to check a release during the special time period. Can this really not be stopped by blocking this bad user’s IP?

  9. At some point you might have to move to a Developer ID like mechanism, like Google, MS and others are doing. Although I do not really like that, it still is the only way to properly identify which application (or application author) is abusing / overloading your service and effectively block it…

  10. Luke: or just block the default user agents of the various MusicBrainz client libraries.

    Unless there are also problems with people spoofing user agent strings, then requiring client registrations sounds a bit over the top.

  11. I just wasted 30 mins to find out that the block on python-musicbrainz is now applicable all day long.

    Do you still think it’s albumidentify at fault?
