September 9, 2011

Captcha Is Hard

I am a judgment matchmaking specialist (Judgment Broker) who writes a lot. This article is my opinion about how Captcha and other challenge systems may have gone too far, far further than they should.

Captcha stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”. The term was first used around 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper, and John Langford of Carnegie Mellon University.

The Turing test is a test of a computer’s ability to behave with human-like intelligence. If a computer can fool a person into believing they are communicating with another human, the machine passes the test. The test was introduced by Alan Turing in his 1950 paper, “Computing Machinery and Intelligence”.

A Captcha, or challenge system, is a test to see if you are human. It is usually text, pictures, or sounds that automated computers and robots cannot understand. In the beginning, such tests screened out people with vision problems, hearing problems, small children, and those with mental challenges.

Now, most Captcha or challenge systems only admit people with perfect vision, perfect hearing, and the ability to solve a specific computer puzzle. In the old (pre-2007) days, the tests were usually simple.

Back in 2007, estimates were that 160,000 human hours per day were spent solving these puzzles, at 10 seconds per try. I would guess that number is closer to a million human hours per day by now, when you factor in the number of times some must retry to get the puzzles solved.
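To put those figures in perspective, here is a rough back-of-the-envelope check in Python. The 160,000 hours and 10 seconds come from the estimate above; the function names and the idea of holding the daily volume fixed while varying a retry factor are my own assumptions for illustration, not part of the original estimate.

```python
SECONDS_PER_HOUR = 3600


def solves_per_day(human_hours_per_day, seconds_per_try):
    """Number of puzzle attempts implied by a daily total of human hours."""
    return human_hours_per_day * SECONDS_PER_HOUR / seconds_per_try


def hours_per_day(puzzles, seconds_per_try, tries_per_puzzle=1.0):
    """Daily human hours for a given puzzle volume and average retry count."""
    return puzzles * seconds_per_try * tries_per_puzzle / SECONDS_PER_HOUR


# 2007 estimate from the text: 160,000 hours per day at 10 seconds per try.
implied_puzzles = solves_per_day(160_000, 10)
print(f"Implied puzzles per day: {implied_puzzles:,.0f}")

# Hypothetical: same volume, but each puzzle takes 6.25 tries on average.
print(f"Hours per day with retries: {hours_per_day(implied_puzzles, 10, 6.25):,.0f}")
```

At a fixed volume, it would take an average of about six tries per puzzle to reach a million hours, so my guess really assumes some mix of more puzzles and more retries.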

Simple is what is needed. When one visits, one sees a simple example that almost anyone can solve. The problem is that the actual tests are now designed to defeat optical character recognition (OCR) software. The words or numbers are garbled so much that the average person has to retry a few times.

When I say garbled, I mean not even the average teenager can figure out what it says most of the time. It is time to move back to simple Turing tests: for instance, adding two numbers, a simple distortion of some text, or asking someone to pick all the names from a list of words.
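Two of the simple tests suggested above, adding two numbers and picking the names out of a list of words, could look something like the following Python sketch. Every function name here is hypothetical, and the list of known names is supplied by the caller. It only illustrates the shape of such a test and says nothing about how well it would hold up against automated solvers.

```python
import random


def make_addition_challenge(rng=None):
    """Return a question string and the expected numeric answer."""
    if rng is None:
        rng = random.Random()
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b


def check_addition(answer_text, expected):
    """Accept the typed answer if it parses to the expected sum."""
    try:
        return int(answer_text.strip()) == expected
    except ValueError:
        return False


def check_name_selection(selected, words, known_names):
    """True if the selected words are exactly the names in the word list.

    known_names is a set of lowercase names (an assumption of this sketch).
    """
    expected = {w.lower() for w in words if w.lower() in known_names}
    chosen = {s.strip().lower() for s in selected}
    return chosen == expected


question, expected = make_addition_challenge()
print(question)
```

Both checks ignore stray spaces, and the name check ignores capitalization, so a person is not failed over a typing slip.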

To make matters worse, great effort went into making an alternative for people with vision problems. There is usually an “audio” button where you can listen instead of read. Try it sometime, and test how “understandable” the audio is to you. And of course, you cannot be on the phone, on Skype, watching TV, or listening to anyone or anything else while taking audio tests on web sites.

One benefit of so many web sites using Captcha or other challenge systems is that at least one company, reCaptcha, used test results to help improve OCR on computer systems that scan and read old books and text.

Google purchased reCaptcha in 2009. Good for them; let’s try to make it good for the rest of us.

When Captcha or challenge systems are too difficult, most people must try several times, and people with vision problems – forget about it. That is not right.

