Comment spam, take II

I know, I know, I know. Captchas discriminate against vision-impaired people.

Regardless, I have installed comment captchas (little images that contain a number you need to type in) as a comment spam blocking solution. It’s worked well on my other blog, and I hope it will work well here. I’ve not gone back through all the entries to make sure that comments are turned on, but they should be for all new posts.

Let’s hope this works.

Vacation, pub night and getting back to work

I had a really great time in Europe (Spain and Britain) over the last couple of weeks. Now I’ve returned to the States, cleaned up my travel gear, and am trying to convince my body to get over the jet lag and prepare for getting back to work tomorrow.

The MusicBrainz Pub Night was a success, if I say so myself. We ended up settling on the third pub in the list: Old Thameside. At its peak we had 14 people present, which is a record for getting Brainzerz together in one room. I really enjoyed meeting people face-to-face, espousing the virtues of MusicBrainz and getting people fired up about contributing.

We had a great number of conversations throughout the evening, but we mainly avoided plotting grand new schemes that we wouldn’t have time to implement. We did talk quite a bit about the new non-profit, AdvancedRelationships and our perennial problem of reducing the number of open moderations. We all had a good laugh at the last one: no matter what we do, the number of open moderations seems to hover around 4000-5000.

I really hope that in the coming year we can tackle a number of the things we’re hoping to work on, so that we can have another full-fledged summit towards the end of next year.

Thanks to everyone who showed up to the pub night — it was a really fun evening!

Server Troubles

Recently the server has been hit by patches of instability – large load spikes, running out of memory, and processes getting killed here and there. When the most recent out-of-memory condition occurred (last night) the SSH server was one of the processes which got killed, which is why the server had to be rebooted a little while ago.

I’m fairly sure I more or less know what’s been causing the problems, and have made a few changes to try to reduce the chance of it happening again.

One of the worst causes of the problem is looking up a TRM with a large number of tracks. The worst TRM by far for this is the “silence” TRM, with (currently) over 900 tracks. As a result I’ve had to, for now at least, disallow lookups on this TRM – doing so will now simply return an error. Sorry 😦 Maybe it can be made to do something more helpful in future.

The other change is that if you do a lookup on any TRM which has more than 100 tracks, then only 100 of those tracks will be returned. So far there are no TRMs (except “silence”) with over 100 tracks, so this won’t affect anyone yet. As the data grows, though, it will.
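For the curious, the two rules above (refuse the “silence” TRM, cap everything else at 100 tracks) can be sketched roughly as follows. This is only an illustration in Python, not the server’s actual code: the function name, the exception, the in-memory dict standing in for the database, and the placeholder TRM identifier are all made up for the example.

```python
MAX_TRACKS_PER_LOOKUP = 100  # the cap described above

# Hypothetical placeholder: the real "silence" TRM identifier is not given here.
BLOCKED_TRMS = {"silence-trm-placeholder"}


class LookupRefused(Exception):
    """Raised when someone tries to look up a blocked TRM."""


def lookup_trm(trm_id, tracks_by_trm):
    """Return at most MAX_TRACKS_PER_LOOKUP tracks for the given TRM.

    tracks_by_trm is a plain dict standing in for the database
    (an assumption made to keep the sketch self-contained).
    """
    if trm_id in BLOCKED_TRMS:
        # Blocked TRMs return an error instead of a huge result set.
        raise LookupRefused(f"lookups on TRM {trm_id} are disabled")
    tracks = tracks_by_trm.get(trm_id, [])
    # Slicing is a no-op for TRMs with 100 tracks or fewer.
    return tracks[:MAX_TRACKS_PER_LOOKUP]
```

Cutting the list off after the fetch is the simplest way to show the idea; a real server would more likely put the limit into the database query so the extra rows are never loaded at all.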

Sorry for any inconvenience caused (hey, I’m apologising again. This is getting to be a habit). But I’m sure you’d rather have a server which doesn’t keep crashing and locking us all out. Hey ho.

Blog comments disabled

Sorry, but I’ve got better things to do with my time than continually delete spam comment runs from this blog (and the administrative UI doesn’t make it very easy to delete large numbers of comments). So, pending some effective method of protection against the spammers being found, I’ve disabled comments on this blog.

For now, if you want to discuss any of the entries here, you can use the mb-users mailing list. Sorry for any inconvenience.

Mopping Up the Pink Stuff

Over the last few days we’ve had something like 200 spam comments posted to this blog, so I’ve been forced to invest a few hours of my time deleting those comments, and trying to find a way to solve the problem in a more permanent, automated way in the future.

We’ve got some comment filtering in place now, which should mean that the spam comments at least only get seen by the blog admins (thus they don’t get spidered by Google et al, which is presumably the reason for spamming in the first place; and it also means that you, dear reader, aren’t bothered by them either).

The blog software does have provision for blocking by IP address, but like e-mail spam, comment spam tends to arrive not from one or two sources but from a multitude of machines, so a simple list of banned IPs is never likely to be practical. However, I took the list of IP addresses seen in the last couple of runs of comment spamming, and cross-checked them against some well-known DNS blacklisting services, traditionally used to protect against e-mail spam. About half of the blog spamming IPs were listed in those blacklists, so if anyone happens to know of a Movable Type plugin which can do DNSBL lookups, please let me know!
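In case it helps anyone thinking about such a plugin, a DNSBL check conventionally works by reversing the octets of the IPv4 address, appending the blacklist’s zone name, and doing an ordinary DNS lookup on the result: if the name resolves, the address is listed. Here is a minimal sketch in Python. It is not the cross-check I actually ran, the function names are made up, and `dnsbl.example.org` is a placeholder zone rather than a real service.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL query name: reversed IPv4 octets, then the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the zone has an A record for this IP.

    The resolver is injectable so the function can be exercised without
    network access. Any resolution failure (including a DNS outage) is
    treated as "not listed", which is a simplification.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Placeholder zone and a documentation-range address, for illustration only.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A plugin along these lines would probably want to cache results and set a short DNS timeout, so that a slow blacklist doesn’t hold up every comment submission.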

Without going into any detail, I’ve taken a few other measures to protect against this problem; it shouldn’t have broken anything, but if it has, please let me know about that too 🙂

Server Updates

“Add Disc ID” moderations, and Annotations.

Changes mainly of interest to MusicBrainz Users

“Add Disc ID” Moderations

Whenever a disc ID is added to an existing album, it is now tracked
via an “Add Disc ID” moderation.  This applies both to disc IDs added
via the “CD lookup” interface (in which case the moderation is credited to
whoever performed the lookup), and also to those added as a result of a
FreeDB lookup (which fall under the “FreeDB” moderator). 
“Add Disc ID” moderations are not used in the case where
an album and a disc ID are added at the same time.

Annotations

Annotations allow you to add notes to artists and albums. 
See How Annotations Work
and the Annotations FAQ.
Thanks to Matthias Friedrich for building the foundations of this feature.

Bugs and RFEs Closed

Dave Evans