"I'm not a robot": the history and future of CAPTCHAs

Basically anyone who has been on the Internet knows what a CAPTCHA is. These crooked letters, numbers and Chinese characters are refreshed on websites every day, testing the eyesight of Internet users. The CAPTCHA's history is much shorter than you might think, yet in the little more than a decade since its birth it has gone through a number of twists and unexpected developments. And now, riding the wave of modern information technology, the CAPTCHA has embarked on a whole new path.

CAPTCHA is an acronym for a rather cool phrase: "Completely Automated Public Turing test to tell Computers and Humans Apart". The Turing test, which most of us are familiar with, is the famous test proposed by Alan Turing, in which an experimenter puts a series of questions to a machine and a human; if the experimenter cannot tell the two apart, the machine passes the Turing test.

A CAPTCHA is a reversed, simplified version of the Turing test: instead of a human judging whether the other party is a machine, a machine judges whether the other party is a human. In the early years of the Internet, websites had no CAPTCHAs; at most, some hackers deliberately typed words in very strange ways to evade sensitive-word detection, for example typing a word as "氵工氵尺 民". In the early 2000s, as spam and fraudulent software became widespread, companies began to use CAPTCHA-style ideas to protect their websites. In 2001, for example, PayPal asked users to type in distorted letters much like a modern CAPTCHA.

The term CAPTCHA itself was only coined in 2003, much later than many other concepts; neural networks, for instance, were already being widely studied in the 1970s. Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and others at Carnegie Mellon University coined the term, studied the CAPTCHA system in depth, and implemented it in software. Since then, CAPTCHAs have been deployed on a huge number of websites, effectively keeping scalping software from running rampant. Today, more than a billion CAPTCHAs are entered every day.

Luis von Ahn invented the CAPTCHA in 2003, and two years later he completed his PhD thesis, in which he proposed combining the capabilities of humans and computers to solve problems. He argued that computers are strong at processing large amounts of data, while humans are strong in areas computers still struggle with, such as perceiving images, and that it would be a shame not to put the enormous number of CAPTCHAs entered every day to some use.

Luis's slogan at the time was "stop spam, read books." He meant it: in 2007 he founded reCAPTCHA Studio, which provided a new kind of CAPTCHA service built around optical character recognition (OCR). Paper books are scanned, and software cuts the scans into individual words, which are shown to the user bundled together with a real CAPTCHA. When more than one person labels the same picture with the same word, the software records that word and puts it together with the others to form a complete e-book. If you have ever seen one of those red CAPTCHA boxes whose text comes in two parts, you will know exactly what I'm talking about.
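The pairing of a real CAPTCHA with an unknown scanned word can be sketched as a simple voting scheme. This is only an illustration of the idea described above, not reCAPTCHA's actual code: the class name, the image ids, and the agreement threshold of two matching answers are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching answers are needed before an
# unknown word is accepted. The article only says "more than one person".
AGREEMENT_THRESHOLD = 2


class WordConsensus:
    """Toy model: a known control word is paired with an unknown scanned word."""

    def __init__(self, threshold=AGREEMENT_THRESHOLD):
        self.threshold = threshold
        self.votes = defaultdict(Counter)  # image id -> tally of typed answers
        self.accepted = {}                 # image id -> agreed transcription

    def submit(self, control_answer, control_truth, image_id, unknown_answer):
        """Return True if the user passes, i.e. typed the control word correctly."""
        if control_answer.strip().lower() != control_truth.strip().lower():
            return False  # failed the real CAPTCHA, so the other answer is ignored
        if image_id not in self.accepted:
            guess = unknown_answer.strip().lower()
            self.votes[image_id][guess] += 1
            # Accept the transcription once enough users agree on it.
            if self.votes[image_id][guess] >= self.threshold:
                self.accepted[image_id] = guess
        return True


pool = WordConsensus()
pool.submit("morning", "morning", "scan-0017", "overlooks")
pool.submit("Morning", "morning", "scan-0017", "Overlooks")
print(pool.accepted)  # {'scan-0017': 'overlooks'}
```

Only users who get the control word right contribute a vote, which is what lets the same form serve both as a bot filter and as a transcription tool.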

The idea is simply brilliant. Tens of thousands of CAPTCHA entries were put to work converting physical books, without paying for expensive manual typing. Google found the system so brilliant that it decided to acquire it: reCAPTCHA became part of Google in 2009, and with it Google digitized a large number of books and published them on Google Books. By 2012, in addition to books, reCAPTCHA was also transcribing large numbers of house numbers for Google Street View. That is quite an honor. The next time you come across one of these CAPTCHAs, you'll know what a cool thing you're doing.

CAPTCHAs are everywhere, but almost everyone finds them annoying; after all, they add one more tedious step. And as information technology has evolved, CAPTCHAs have faced more and more challenges.

In the early days, the way to crack CAPTCHAs was simple and brute-force: have people solve them. Some operators ran sweatshop-style companies in third-world countries that took in CAPTCHAs in bulk and had human workers type in the answers; such companies became known as "CAPTCHA farms". In theory there is no defense against this method, since there really is a person on the other end, but it is also the most expensive way to crack a CAPTCHA.

Meanwhile, machine learning, image processing, and related technologies have boomed in recent years, and computers are getting better and better at recognizing letters and digits in pictures. In response, CAPTCHAs grew ever more bizarre: distorting the text, adding distracting lines, blending the text color into the background, and so on. Even the respectable reCAPTCHA added a couple of lines across its text. This had some effect, but it was far more disruptive to users. Failed attempts became more and more common, and people were ready to smash their computers.

To make matters worse, at WOOT 2014 Bursztein presented a generic CAPTCHA-cracking program, a reinforcement-learning-based algorithm that works well against most of the CAPTCHAs on the market. This shows that making text harder to read by distorting it no longer helps, and the only ones who get hurt are the users.

But users were already at breaking point, to say nothing of people with visual impairments, cognitive impairments, and other disabilities, for whom entering a CAPTCHA is harder still. People didn't want a more complex CAPTCHA; they wanted a simpler one. So a variety of schemes were developed to keep computers out while simplifying the process, including slider CAPTCHAs, click-based CAPTCHAs, and so on. As for how well they work... let's just say 12306 is definitely a prime counterexample.

Remember the reCAPTCHA we just mentioned? This time, they changed their logo!

No, that's not the point. The point is that they completely reinvented the CAPTCHA. reCAPTCHA developed a whole new form of verification: all you have to do is click a checkbox, and the verification is done. It is very simple, with no fuss, and even the web page looks cleaner. And its success rate at blocking bots is several times higher than that of traditional text CAPTCHAs!

So how does this new reCAPTCHA work? Simply put, as soon as you enter the page, the software begins collecting all kinds of information about you, including your IP address, cookies, timing, screen resolution, mouse movements, keyboard activity, and other miscellaneous data. When you click the box, this information is processed by the server, which ultimately decides whether you are a real person. Because the new reCAPTCHA collects more information than a traditional text CAPTCHA, it is harder to crack.
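Google has not published how this analysis works, so the following is only a toy illustration of the general idea of turning collected signals into a human-or-bot decision. Every field name, weight, and threshold here is invented for the example; a real system would presumably learn them from data rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-visit signals; the field names are made up for this sketch."""
    has_prior_cookie: bool              # the browser has been seen before
    mouse_moves: int                    # pointer events recorded before the click
    seconds_on_page: float              # time between page load and the click
    requests_from_ip_last_minute: int   # recent traffic from the same IP address


def human_score(s: Signals) -> int:
    """Return a score from 0 to 10; higher means "more likely human"."""
    score = 0
    if s.has_prior_cookie:
        score += 3
    if s.mouse_moves >= 5:
        score += 3
    if s.seconds_on_page >= 1.0:
        score += 2
    if s.requests_from_ip_last_minute <= 10:
        score += 2
    return score


def is_probably_human(s: Signals, threshold: int = 7) -> bool:
    """Apply an arbitrary cut-off to the toy score."""
    return human_score(s) >= threshold


browser_user = Signals(True, 40, 6.5, 2)
script = Signals(False, 0, 0.05, 300)
print(is_probably_human(browser_user), is_probably_human(script))  # True False
```

The point of the sketch is only that many weak signals are combined into one decision, which is why such a check is harder to fool than a single distorted image.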

Normally, such a model would need a huge amount of data to train successfully, but fortunately reCAPTCHA has a powerful backer in Google, whose strong support for the service means there is no shortage of user data. You can already see it on major foreign websites, including YouTube, Facebook, and so on.

"Tough on bots, easy on humans" is reCAPTCHA's slogan, and it's true. The service is completely free: just sign up on their website to get access to the API. You can get one to play with on your own site if you want. It's a lot more fun than using all kinds of ugly CAPTCHAs!

The CAPTCHA has been around for about 15 years now, and it bears the mark of technological progress in the information age. Google's work on a new type of CAPTCHA lets us glimpse a more convenient future. More importantly, the goal of technological development should be to make people's lives more convenient rather than to add to their troubles. By comparison, shouldn't 12306 be ashamed of itself?