Tuesday, October 02, 2007

3000 Hours Painlessly Donated

This article is amazing. It has to do with the effort to scan books so that they can be accessed electronically, and the issues that optical character recognition (OCR) software has deciphering distorted or poor-quality texts. You know those words that you have to type in to comment at websites or to search for tickets at Ticketmaster? This program feeds words that the OCR fails to recognize into those sites, so that humans logging into websites effectively correct the OCR's mangled version of the text. So people are painlessly donating 3000 hours a day to the effort of digitizing books. How awesome is that?

My only issue with the article is this. How on earth do you get CAPTCHA out of Completely Automated Turing Test To Tell Computers and Humans Apart? I get CATTTTCHA. Does the P somehow mean that there are four T's?

2 comments:

roborob said...

Don't forget: those CAPTCHAs also prevent spam from showing up in your comments section. Not nearly as cool as digitization of distorted and poor-quality texts, of course - but oh so effective.

Katy said...

Yeah, I wasn't discounting the original purpose at all (they also prevent bots from signing up for e-mail accounts with which to spam us, and I assume the Ticketmaster sign-in prevents bots from buying all the tickets so that we have to buy them at even more ludicrously marked-up prices). But I think it's pure genius to get people to work for you without realizing it. Now if only I could master this art at my job...