Wednesday, April 25, 2007

Myths About CAPTCHAs

For those who don't know what a CAPTCHA is, it's a test that a human can typically pass but automated software can't. An example is on the comments page of this blog: the box of squiggly letters you have to type in before you can submit a comment.

Some people are declaring the end of the CAPTCHA era, pointing either to human-powered sites that trick visitors into providing the answer to a CAPTCHA, or to automated image recognition software that just needs time and a little computing horsepower to decode the text in the image.

Myth #1 - CAPTCHAs aren't accessible to the visually impaired.

Accessibility is a legitimate complaint about sites that don't implement a robust, accessible CAPTCHA solution, but it is a solvable problem. For instance, the visually impaired can use the alternative audio CAPTCHA offered on this very blog. Other types of CAPTCHAs, such as math or word problems, are easier to read and are also accessible.

Myth #2 - All CAPTCHAs are those squiggly text things seen on blogs.

Most of the comments about CAPTCHAs are based on one type, called Gimpy, that uses extremely bent and distorted text. However, Gimpy just scratches the surface, as CAPTCHAs come in many forms.

Some of the other CAPTCHA variants include identifying what's contained in a picture, simple math questions like "1 + 4 = ?", a text question like "What color is the sky?", or typing in the letters or numbers played via audio.

If you don't think people can spell "BLUE" or answer the math question properly, you can always give them a nice drop-down list of possible answers and allow only one chance to answer per question, which stops bots from hacking away at the answer.
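As a rough illustration of the "drop-down list, one chance per question" idea, here is a minimal server-side sketch in Python. Everything in it is an assumption made for the example: the names `new_challenge` and `check_answer`, the in-memory `_pending` store, and the choice of five options are hypothetical and not taken from any particular blog package.

```python
import random
import secrets

# In-memory store of pending challenges: token -> correct answer.
# Assumption: a real site would keep this in the session or a database.
_pending = {}

def new_challenge(rng=random):
    """Create a simple addition question plus a shuffled list of choices."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    answer = a + b
    # Pick four distinct wrong answers, add the right one, then shuffle.
    wrong = rng.sample([n for n in range(2, 19) if n != answer], 4)
    choices = wrong + [answer]
    rng.shuffle(choices)
    token = secrets.token_hex(8)
    _pending[token] = answer
    return {"token": token, "question": f"{a} + {b} = ?", "choices": choices}

def check_answer(token, submitted):
    """Accept only a correct first attempt; the token is consumed either way."""
    expected = _pending.pop(token, None)
    return expected is not None and submitted == expected
```

Because the token is removed on the first check, a bot that guesses wrong has to request a brand new question instead of cycling through the remaining options.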

Myth #3 - Bots can easily "BLOW THROUGH" CAPTCHAs.

That can be the case when humans are being used to provide the CAPTCHA answers, but only if your CAPTCHA code is sloppy in the first place. You can layer a series of security measures to make sure there's a human sitting at the keyboard and the answer isn't being passed through by a bot.

  1. Require JavaScript to validate the CAPTCHA, since the majority of bots don't run JavaScript at all.
  2. Obfuscate your CAPTCHA in randomized, encoded JavaScript so that it's difficult, if not impossible, for a bot to even detect the presence of a CAPTCHA on the page.
  3. Use JavaScript input-sensing techniques such as MT Keystrokes to detect whether a human has actually typed into the field on the web page.
  4. Randomize the type of CAPTCHA being used so that there isn't a single specific type of CAPTCHA to target with an automated tool.
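
The last item in the list above, randomizing the CAPTCHA type, could be sketched on the server side roughly as follows. This is only an illustration under stated assumptions: the generator functions, the `GENERATORS` list, and the second question in the text bank are all hypothetical, and a real site would mix in image and audio types as well.

```python
import random

def math_captcha(rng):
    """Simple arithmetic question, like the "1 + 4 = ?" example."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return {"kind": "math", "prompt": f"{a} + {b} = ?", "answer": str(a + b)}

def text_captcha(rng):
    """Plain text question drawn from a small, hypothetical question bank."""
    bank = [
        ("What color is the sky?", "blue"),
        ("How many days are in a week?", "7"),  # made-up extra entry
    ]
    prompt, answer = rng.choice(bank)
    return {"kind": "text", "prompt": prompt, "answer": answer}

# Add more generator functions here to widen the mix of CAPTCHA types.
GENERATORS = [math_captcha, text_captcha]

def random_captcha(rng=None):
    """Pick a CAPTCHA type at random so no single type can be targeted."""
    rng = rng or random.Random()
    return rng.choice(GENERATORS)(rng)

def is_correct(captcha, submitted):
    """Compare answers, ignoring case and surrounding whitespace."""
    return submitted.strip().lower() == captcha["answer"]
```

A bot author then has to handle every generator in the list, not just one, before the tool is of any use.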

Summary

The real vulnerability most forums, blogs and wikis face isn't even the risk of CAPTCHA failure; it's the identical footprint of all the open source software they run, which makes locating the comments pages so easy.

Changing the anchor text and page name on a blog from "comments" and "comments.php" to "Post an Opinion" and "youropinion.php" is another form of CAPTCHA, because the human will immediately know where to click but the bot might get confused.

Better yet, since most bots don't read JavaScript, simply obfuscate the actual HTML of your "Leave a Comment" section in JavaScript. When bots can't even find the "leave a comment" link or the form fields where you enter a comment in the HTML, it may eliminate the need for the more complex text-bending CAPTCHAs in the first place. Sure, the spammers could code the bot to decode a single instance of obfuscated JavaScript for a single blog, but the code itself could be randomly obfuscated so that decoding it would be quite a difficult task.
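
One way to picture the randomized obfuscation is a small server-side helper that turns the comment form's HTML into a script that rebuilds it in the browser. The Python sketch below is only one possible approach under stated assumptions: `obfuscate_html` is a hypothetical name, the XOR-with-a-random-key encoding is a simple stand-in for whatever scheme a site actually uses, and the emitted script relies on `document.write` and `String.fromCodePoint` being available in the visitor's browser.

```python
import random

def obfuscate_html(html, rng=None):
    """Return a <script> block that rebuilds `html` when run in a browser."""
    rng = rng or random.Random()
    key = rng.randint(1, 255)  # fresh XOR key on every call
    letters = "abcdefghijklmnopqrstuvwxyz"
    # Random variable names so the script text differs on every page load.
    data_var = "d" + "".join(rng.choice(letters) for _ in range(8))
    out_var = "o" + "".join(rng.choice(letters) for _ in range(8))
    # Encode each character as its code point XORed with the key.
    codes = ",".join(str(ord(ch) ^ key) for ch in html)
    return (
        "<script>"
        f"var {data_var}=[{codes}],{out_var}='';"
        f"for(var i=0;i<{data_var}.length;i++)"
        f"{{{out_var}+=String.fromCodePoint({data_var}[i]^{key});}}"
        f"document.write({out_var});"
        "</script>"
    )
```

Calling `obfuscate_html('<form action="youropinion.php" method="post">...</form>')` when the page is rendered yields a script with no literal `<form` tag in it, so a bot that only scans the raw HTML never sees the form. A bot that actually executes JavaScript is not stopped by this, which is why the post pairs it with the other measures.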

Don't let the naysayers dissuade you from increasing the strength of your spam blocking, as stronger CAPTCHAs combined with JavaScript tricks appear to be bulletproof until the bots get a lot smarter and more complex.

P.S. Note that the guy claiming CAPTCHAs are dead doesn't have one on his blog, and if you scroll down past the actual comments you'll see he has a shitload of porn spam at the bottom. Obviously someone knee-deep in spam is NOT the person you should be listening to about whether or not to use a CAPTCHA.

2 comments:

WebGeek said...

Great post, Bill. As usual, you're dead-on. I completely agree...there are a ton of techniques to couple with CAPTCHA that most developers are not even using. On that note, one of the dumbest CAPTCHA implementation blunders I've seen is when they name the image file the same thing as the letters in the image. If I were programming a bot I'd set it to check that first. Brilliant. I've actually had some really good success with some other methods that don't even require CAPTCHA on forms.

Catmoves said...

I've been wondering about putting a Captcha on my blog? I'm not getting spam in my comments section, but I'm getting tons in my gmail address. Any advice?