Here’s why robots can or can’t tick the “I’m not a robot” box on CAPTCHA:
Robots actually can tick the “I’m not a robot” box on a CAPTCHA.
The way the test identifies the bots is by analyzing how the box is checked.
Humans and computers control a mouse very differently, and that’s the tell.
CAPTCHAs can actually test behavior in a lot of ways, and they can even combine tests to spot robots.
So if you want to learn all about CAPTCHAs and how they trick bots, then this article is for you.
Let’s jump right into it!
What Is CAPTCHA?
Let’s start with the basics, and as we add concepts, this will all start to come together.
For starters, CAPTCHA is an acronym.
It stands for “Completely Automated Public Turing test to tell Computers and Humans Apart.”
With such a long name, there’s a lot given away right there, but I’ll explain a few concepts in the name for clarity.
As you’ve already guessed, CAPTCHA is a test that distinguishes between humans and computers.
That actually makes it a Turing test (which is in the name too), and this involves a little bit of history.
Alan Turing was a super famous computer scientist in the first half of the 20th century.
He came up with the idea of using tests to see how well computers could imitate human behavior.
So any test that tries to compare the two is ultimately a type of Turing test.
CAPTCHAs are probably the most widespread Turing tests around.
You run into them everywhere, and the simple goal is to discern that you are in fact a human user and not a computer-controlled bot.
CAPTCHAs help reduce spam account generation and a lot of other annoying problems tied to bots and botting.
What Types of CAPTCHA Are There? (6 Types)
If CAPTCHAs can tell humans and bots apart, how do they work?
That takes a bit of explaining, so an easier place to start is with the different types of CAPTCHAs.
Ultimately, each CAPTCHA is trying to present a task that is very easy for a person while presenting a challenge for a bot, and there are a number of ways to try to do that.
None of these types of CAPTCHAs will stop every bot, but if they can prevent large numbers of bots from getting through, they mitigate the problems attached to botting.
#1 Text Recognition CAPTCHA
Text recognition was one of the first CAPTCHAs, and it’s probably still the most common.
The original patents for this CAPTCHA were filed in 1997, so this is not a new concept.
The idea is pretty easy.
Show a picture of letters (and/or numbers).
A human can read what is written very easily.
A bot has to have the ability to process images into text, and that’s not an easy bot to make, especially back in 1997.
So, even the simplest CAPTCHA out there can prevent tons and tons of bots from getting through.
But because this type of CAPTCHA is so common, a lot of programmers have worked hard to make bots that can get through them.
Text CAPTCHAs then evolved in response, and the bots were adapted further.
Today, text recognition CAPTCHA is in a bit of an arms race, and most text CAPTCHAs actually implement other aspects of Turing tests at the same time (which I’ll cover later).
#2 Image CAPTCHA
Image CAPTCHAs present pictures to the user, and the user is supposed to identify what is in the picture.
This implements the concept of contextualizing.
It’s very easy for you to look at a picture of a zebra and identify it as a zebra.
This is normal stuff for people.
But, how do you get a computer to realize that the ones and zeros making up an image are actually representing a zebra?
It’s challenging, and image recognition has been at the forefront of developing artificial intelligence for years.
The very ideas of contextualization and training were built around image recognition.
All of this is to say that there are bots that can do image recognition, but it’s a specialized skill, so these CAPTCHAs defeat a lot of bots.
#3 Audio CAPTCHA
Audio CAPTCHAs challenge an entirely different branch of artificial intelligence.
With these, the CAPTCHA plays a sound, and you have to type in what you heard.
Once again, this is everyday stuff for a human.
For a computer, it’s a different matter.
The way a computer processes and understands sound is fundamentally different from how it deals with images.
So, bots that can defeat text or image CAPTCHAs are very likely to be stumped by an audio CAPTCHA.
It’s just hard to make a single bot that can do it all.
#4 Math Problem CAPTCHA
These are less common, but they can work pretty well.
You might think that computers should be better at math problems than humans, and in some respects, you’re right.
When it comes to just calculating math, computers are way better than people.
But when it comes to reading a math problem, you can do things that are particularly difficult for a computer.
For instance, you can do a word problem.
A person can figure out what is in the word problem and make the calculation.
A computer has to discern information in the word problem before it can get to the math, and that’s very difficult.
You’re combining language contextualization with automated math skills.
They are very unrelated skills for computers, so even simple word problems can stump a lot of bots.
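To make the word-problem idea concrete, here is a minimal sketch in Python. The function names (`make_word_problem`, `check_answer`) and the apple wording are made up for illustration and do not come from any real CAPTCHA product. The point is only that the numbers are spelled out as words, so a bot has to parse the sentence before it can do the arithmetic.

```python
import random

# Numbers are written as words so the problem must be read, not just computed.
NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]

def make_word_problem(rng):
    """Return a (question, expected_answer) pair for a simple addition problem."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    question = (f"If you have {NUMBER_WORDS[a]} apples and pick "
                f"{NUMBER_WORDS[b]} more, how many apples do you have?")
    return question, a + b

def check_answer(expected, submitted):
    """Accept the submission only if it is a whole number equal to the answer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False

question, answer = make_word_problem(random.Random())
print(question)
```

A real site would keep the expected answer on the server and only send the question text to the browser.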
#5 Social Media
Social media isn’t really a test per se, but it’s a simple solution to the bot problem.
Simply ask people to sign in with their social media accounts.
If the bot doesn’t have a social media account, it can’t sign in.
Ultimately, this offloads bot mitigation to the major social media companies, and they are much better equipped to tackle this challenge than a lot of smaller companies that still want users to be able to log in.
#6 ReCAPTCHA
ReCAPTCHA is the Google CAPTCHA system.
At its simplest, it just asks you to check a box.
How does this stump a computer?
Well, ReCAPTCHA actually implements a lot of different tests all at the same time.
I’m going to go over all of them in the next section when I explain how CAPTCHAs confuse bots in general.
How Does CAPTCHA Trick Robots? (5 Ways)
A lot of the ideas below were pioneered by ReCAPTCHA, but they are not exclusive to ReCAPTCHA.
Any CAPTCHA test can conceivably use any methods it wants.
So what you’ll see below are the leading ways that modern CAPTCHAs are testing humans and robots.
#1 Intuition and Recognition Tests
For the most part, CAPTCHA tests are designed around tasks that are very easy for humans and a lot harder for bots.
Image recognition is a great example.
It’s very easy for you to tell which images have a motorcycle or traffic light or whatever, but a computer program that can do the same is actually quite sophisticated.
By building tests that require advanced image recognition (even though it’s easy for us humans), the simplest and laziest bots are easily defeated.
Intuition is often combined with image recognition.
You may have seen CAPTCHAs where you simply type in the text, except the text is stretched or made to look weird in some way.
This isn’t just image recognition.
It’s also intuition.
You understand what the distorted images are supposed to represent through your intuitive grasp of the written language.
A computer program has no intuition.
Either it was trained to recognize the distorted image or it wasn’t, so the intuitive layer of the CAPTCHA test is very difficult for a lot of bots.
#2 Keystrokes and Mouse Movement
This is a big one.
When you go through the CAPTCHA process, the test can track your mouse movement and/or keystrokes.
Humans and computers control these things very differently.
A human is using small muscles to make constant corrections to a mouse movement, and when you watch closely enough, all of those inconsistent motions are visible.
When a computer moves a mouse, it’s just inputting coordinates as they relate to the screen.
It’s a math equation, so there is no need for correction along the path.
The mouse can go directly to its destination with no correction.
So, when you just have to click the box, the movement of the mouse is watched closely.
If the motion is too perfect, the CAPTCHA will assume you are a bot.
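One simple way to picture the "too perfect" check is to compare how far the pointer actually travelled with the straight-line distance between where it started and where it ended. The sketch below is an illustration only: the function names, the sample paths, and the 0.999 threshold are all assumptions, and real systems such as ReCAPTCHA do not publish how they score movement.

```python
import math

def path_straightness(points):
    """Straight-line distance divided by distance travelled (1.0 = perfectly straight)."""
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled

def looks_scripted(points, threshold=0.999):
    """Flag paths that are (almost) perfectly straight. The threshold is an assumed value."""
    return path_straightness(points) >= threshold

# A scripted pointer jumps along exact coordinates on a line.
bot_path = [(0, 0), (50, 50), (100, 100)]
# A hand-moved pointer wobbles and corrects on the way to the same target.
human_path = [(0, 0), (22, 31), (48, 44), (71, 79), (100, 100)]

print(looks_scripted(bot_path), looks_scripted(human_path))
```

A single ratio like this is easy for a bot author to fake by adding jitter, which is one reason movement is usually combined with other signals rather than used alone.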
#3 Time to Complete
Timing is another easy test.
You can’t move a mouse as quickly as a computer can.
It’s not even a contest.
So, if the mouse moves too fast with no error, the test assumes it’s a bot.
Timing can be added to other types of CAPTCHAs too.
If you’re doing image recognition, bots are still usually a lot faster than humans.
You’re going to analyze the pictures to make sure you get it right.
A bot is going to run its algorithm, and it’s going to spit out results very quickly.
Adding the timing test to the image test really helps figure out who is a real person and who is actually a bot.
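As a rough sketch of how a timing test can be layered on top of another test, the Python function below passes a visitor only if the answer is correct and it did not arrive implausibly fast. The function name and the 1.5-second minimum are assumptions chosen for illustration, not values taken from any real CAPTCHA.

```python
def passes_timed_challenge(answer_correct, seconds_taken, min_seconds=1.5):
    """Pass only if the answer is right AND it took at least min_seconds to submit."""
    return answer_correct and seconds_taken >= min_seconds

# A person who studies the pictures for a few seconds passes.
print(passes_timed_challenge(True, 6.2))
# A correct answer returned in a fraction of a second is treated as a bot.
print(passes_timed_challenge(True, 0.05))
```

In practice the elapsed time would need to be measured on the server, since a bot controls anything reported by its own browser.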
#4 Cookies and Other Data
CAPTCHAs can also look at your cookies and temporary data.
How does this find bots?
Well, bots are usually made for specific purposes.
They aren’t going to randomly surf the internet.
Your browsing history is probably a combination of checking email, browsing social media, and looking up random things that pop in your head throughout the day.
A bot’s browsing history won’t look like that at all.
So, looking at cookies can reveal patterns in browsing data that ultimately identify bots.
#5 Encryption
Lastly, encryption is an important part of CAPTCHAs.
Encryption isn’t part of the test at all.
Instead, CAPTCHAs are encrypted so that people can’t see exactly how they work.
If the CAPTCHA’s inner workings were revealed, programmers would quickly find ways to beat each individual CAPTCHA.
So, encryption protects the raw data to make it harder to design bots that get past the tests.