Spam Makes Children Cry

Posted September 1, 2007

I’ve been getting spam comments. This kinda sucks. They’re unsightly, and a pain to clean up.

I want to add some functionality to mitigate the issue, or ideally stop it entirely. The problem is that I’m not sure what I should use.

Bayesian filtering is probably the most accurate, but pure-Ruby filters are slow and it’s a pain to install C extensions on my server. On top of that I don’t really want to bother training the filter.
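
For anyone wondering what "training" actually involves, here's a toy naive Bayes sketch in plain Ruby. It's only an illustration: the `TinyBayes` class and its method names are made up for this post, the tokenizer is deliberately crude, and a real filter would need far more training data than the few comments shown in the usage lines.

```ruby
# Toy naive Bayes classifier for comments (illustrative only; TinyBayes is
# a hypothetical name, not an existing library).
class TinyBayes
  def initialize
    @word_counts = { spam: Hash.new(0), ham: Hash.new(0) }
    @doc_counts  = { spam: 0, ham: 0 }
  end

  # "Training" is just counting words per label (:spam or :ham).
  def train(label, text)
    @doc_counts[label] += 1
    tokenize(text).each { |word| @word_counts[label][word] += 1 }
  end

  # Flag the comment when the spam score beats the ham score.
  def spam?(text)
    score(:spam, text) > score(:ham, text)
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z0-9']+/)
  end

  # Log prior plus summed log likelihoods, with add-one smoothing so that
  # unseen words never produce log(0).
  def score(label, text)
    total_docs  = @doc_counts.values.sum
    vocab_size  = (@word_counts[:spam].keys | @word_counts[:ham].keys).size
    total_words = @word_counts[label].values.sum
    prior = Math.log((@doc_counts[label] + 1.0) / (total_docs + 2.0))
    tokenize(text).sum(prior) do |word|
      Math.log((@word_counts[label][word] + 1.0) / (total_words + vocab_size + 1.0))
    end
  end
end

filter = TinyBayes.new
filter.train(:spam, "cheap pills buy now")
filter.train(:spam, "buy cheap watches now")
filter.train(:ham,  "great post thanks")
filter.train(:ham,  "I use akismet on my blog thanks")
```

Every misclassified comment has to be fed back through `train`, which is the chore I'd rather avoid.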

I could also go with captchas. The image-based ones require C bindings, but there are other options (math captchas are intriguing). I don’t really want to force you folks to do more stuff before commenting, though.
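
A math captcha doesn't need much code. Here's a rough sketch, assuming the app has some per-visitor session store; a plain Hash stands in for it here, and the method names and the `:captcha_answer` key are invented for illustration.

```ruby
# Pick two small numbers, remember the expected answer in the session, and
# return the question text to show next to the comment form.
# `session` is assumed to be any Hash-like per-visitor store.
def issue_math_captcha(session, rng = Random.new)
  a = rng.rand(1..9)
  b = rng.rand(1..9)
  session[:captcha_answer] = a + b
  "What is #{a} + #{b}?"
end

# Check the submitted answer. The stored answer is deleted either way, so
# each question can only be answered once.
def math_captcha_passed?(session, submitted)
  expected = session.delete(:captcha_answer)
  return false if expected.nil?
  submitted.to_s.strip == expected.to_s
end
```

A bot that has never loaded the form has no stored answer at all, so its submission fails before any comparison happens.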

There’s always something like Akismet. That has some of the same flagging issues as Bayesian, but it’s already been taught enough that I could probably ignore that. I don’t know how it would be performance-wise, though… an extra network call for each comment is pretty heavy.

The easiest way out would probably just be to validate that they have Javascript running. Most spam-bots don’t, so I could just have a dynamically-populated, hidden input somewhere. The problem is that this also blocks out any users without Javascript enabled.
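
Roughly, the server side of that check could look like the sketch below. It assumes a Hash-like session and a params Hash keyed by field name; the `js_token` field and both method names are made up for illustration. It also only stops bots that skip JavaScript entirely, since the token is sitting right there in the page source.

```ruby
require "securerandom"

# Render an empty hidden input plus the script that fills it in.
# The token is random hex, so it is safe to interpolate into the script.
def js_check_markup(session)
  token = SecureRandom.hex(16)
  session[:js_token] = token
  <<~HTML
    <input type="hidden" name="js_token" id="js_token" value="">
    <script>
      document.getElementById("js_token").value = "#{token}";
    </script>
  HTML
end

# A client that never ran the script submits an empty field and is rejected.
# The stored token is deleted on every check, so it cannot be replayed.
def js_check_passed?(session, params)
  expected = session.delete(:js_token)
  return false if expected.nil?
  params["js_token"].to_s == expected
end
```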

What do you folks think? Do any of you not have JS enabled? What do you use for your own blogs? Let me know!

Will Farrington said September 02, 2007:

Require users to really register and use a math captcha or something.

Nathan said September 02, 2007:

I don’t really want to make users jump through hoops to comment, though…

Will Farrington said September 02, 2007:

The alternative, Nathan, is to force yourself to jump through ridiculously large hoops.

I’d rather a tiny hassle for me, than a large one for you.

Pete Forde said September 03, 2007:

Javascript! That’s a great idea. Is it yours, or did I miss a larger trend?

Here is what I propose:

Instead of a captcha or a hidden, JS-calculated field, why not a combination of both? That is, a sort of auto-fill captcha. A bordered div that says “if you have js running, this box will contain F35Y, and if you don’t have js running, please type F35Y.” and then the aforementioned text input below. Much like the comment, if they have js, the box will be auto-completed.

You could take it a step further by actually using js to hide() the div if js is running, completely removing visual ugliness.

I think this could seriously pwn comment spam. Mind if I article-ize the idea?

Nathan said September 03, 2007:

Haha, no, it’s not my idea. It’s been floated around for a while. I don’t recall offhand that I’ve heard of it being used with a Captcha-style fallback, although I wouldn’t by any means trust my memory. You may certainly write about it, though.

For the record, I decided to go with Akismet for a minimum of user interaction (read: none). If it ends up being inaccurate, I may give a gracefully-degrading Javascript solution a try.
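
If I do go that route, a gracefully-degrading version along the lines Pete describes might look something like this. It's a sketch only: the session and params are assumed to be plain Hash-like objects, and the `comment_code` field, the element ids, and the method names are all invented for illustration. With JS the box fills itself in and hides; without JS the visitor types the code by hand, and the server can't tell the difference.

```ruby
require "securerandom"

# Characters chosen to avoid look-alikes such as 0/O and 1/I/L.
CODE_ALPHABET = %w[A B C D E F G H J K M N P Q R S T U V W X Y Z
                   2 3 4 5 6 7 8 9].freeze

# Render a visible prompt and text input, plus a script that fills in the
# code and hides the whole box when JavaScript is running.
def autofill_captcha_markup(session)
  code = Array.new(4) { CODE_ALPHABET[SecureRandom.random_number(CODE_ALPHABET.size)] }.join
  session[:comment_code] = code
  <<~HTML
    <div id="comment_code_box">
      Please type #{code} to post your comment:
      <input type="text" name="comment_code" id="comment_code">
    </div>
    <script>
      document.getElementById("comment_code").value = "#{code}";
      document.getElementById("comment_code_box").style.display = "none";
    </script>
  HTML
end

# Same check whether the field was typed or auto-filled. Typed answers are
# trimmed and upcased so "f35y " still counts.
def autofill_captcha_passed?(session, params)
  expected = session.delete(:comment_code)
  return false if expected.nil?
  params["comment_code"].to_s.strip.upcase == expected
end
```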
