How do CAPTCHAs work

25.02.2026

Ruben Herrera

Tech builder focused on infrastructure, automation, backend systems, and scalable SaaS development

The internet landscape has changed dramatically. Today, bots generate more than half of all global traffic, and roughly 37% of all traffic comes from malicious bots. As a result, protection systems have moved to a completely new level. Forget the old-school scripts built with the requests library and straightforward Selenium automation. Modern anti-bot systems such as Cloudflare Turnstile, reCAPTCHA v3, and AWS WAF have shifted to a layered defense model. They no longer force users to frantically click traffic lights or crosswalks. Instead, they evaluate your behavior and device in the background before the page even fully loads.

If you need a service that can handle most captcha-related tasks in one place, you can use 2Captcha. It supports multiple captcha types, provides an API for integration, and suits automation workflows where fast results matter and you want to avoid handling each verification step manually.

In this article, we'll explain how captchas work and which bypass methods are used in scenarios such as testing, QA, automation, and anti-bot protection analysis.

The Anatomy of Modern Anti-Bot Systems

The web scraping and browser automation landscape has completely flipped over the last couple of years. As we roll into 2026, automated traffic makes up over 51% of all internet traffic, with bad bots alone accounting for a massive 37% of the total.

The old days of firing off simple HTTP requests and basic DOM parsing are dead. Traditional visual CAPTCHAs, which used to be the ultimate roadblock, are rapidly being replaced by invisible, multi-layered behavioral analysis engines like Cloudflare Turnstile, Datadome, Kasada, PerimeterX, and reCAPTCHA v3.

Modern Web Application Firewalls (WAFs) and bot management platforms don't just block IP addresses or check the User-Agent anymore. They use a "Defense-in-Depth" approach, cross-referencing your digital footprint across every possible layer—from the initial network handshake down to how the user physically moves their mouse. If you want to build effective bypass tools, you have to understand this stack.

Network Layer: The Shift from JA3 to JA4 TLS Fingerprinting

The first and hardest gatekeeper kicks in before the server even sends a single byte of HTML. During the TLS handshake, the client sends a ClientHello packet in plaintext. This reveals your entire cryptographic stack: supported TLS versions, cipher suites, extensions, and elliptic curves.

For years, the industry relied on the JA3 standard to hash these parameters. But JA3 broke when modern browsers (Chrome 110+, Firefox 114+) introduced TLS Extension Permutation—randomizing extension orders to prevent tracking.

The security industry responded with the JA4 standard. It normalizes fields, sorts cipher suites and extensions before hashing so that randomized ordering no longer changes the fingerprint, and records whether the handshake arrived over TCP or QUIC (HTTP/3).
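To see why sorting matters, here is a minimal sketch comparing an order-sensitive hash (the JA3 idea) with a sort-then-hash approach (the JA4 idea). It is a simplification, not the real JA3 or JA4 format: the function names and the extension ID lists are made up for illustration.

```python
import hashlib


def order_sensitive_fingerprint(extension_ids):
    """JA3-style idea: hash the extension IDs in the order they were sent."""
    joined = "-".join(str(e) for e in extension_ids)
    return hashlib.md5(joined.encode()).hexdigest()


def sorted_fingerprint(extension_ids):
    """JA4-style idea: sort first, so a shuffled order hashes identically."""
    joined = ",".join(f"{e:04x}" for e in sorted(extension_ids))
    return hashlib.sha256(joined.encode()).hexdigest()[:12]


# Two handshakes from the same hypothetical client, with extensions shuffled.
first_hello = [0, 23, 65281, 10, 11, 35, 16, 5, 13]
second_hello = [13, 5, 16, 35, 11, 10, 65281, 23, 0]

same_unsorted = order_sensitive_fingerprint(first_hello) == order_sensitive_fingerprint(second_hello)
same_sorted = sorted_fingerprint(first_hello) == sorted_fingerprint(second_hello)
print(same_unsorted)  # False
print(same_sorted)    # True
```

The order-sensitive hash changes on every shuffle, which is exactly what made it useless once browsers started permuting extensions. The sorted variant stays stable.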

Standard libraries like Python's requests (via urllib3), httpx, or Go's http.Client have glaringly obvious TLS fingerprints. Their TLS stacks (OpenSSL for the Python clients, Go's own crypto/tls for http.Client) look nothing like the BoringSSL setup used by Chrome. Smart WAFs rely on inconsistency detection: if your headers claim you're Chrome 120, but your JA4 fingerprint screams "Python script," you get an instant 403 Forbidden or an unsolvable CAPTCHA. Rotating proxies won't save you here; the anomaly is baked into the cryptography.
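The inconsistency check described above can be sketched from the defender's side in a few lines. This is only an illustration under loud assumptions: the fingerprint strings in KNOWN_FINGERPRINTS are placeholders, not real JA4 values, and the helper names are hypothetical.

```python
# Placeholder fingerprint labels; a real system would hold observed JA4 values.
KNOWN_FINGERPRINTS = {
    "fp_chrome_example": "chrome",
    "fp_python_example": "python-requests",
}


def claimed_family(user_agent: str) -> str:
    """Guess which client family the User-Agent header claims to be."""
    ua = user_agent.lower()
    if "python-requests" in ua:
        return "python-requests"
    if "firefox/" in ua:
        return "firefox"
    if "chrome/" in ua:
        return "chrome"
    return "unknown"


def is_inconsistent(user_agent: str, tls_fingerprint: str) -> bool:
    """True when the TLS layer contradicts what the headers claim."""
    observed = KNOWN_FINGERPRINTS.get(tls_fingerprint)
    if observed is None:
        return False  # unknown fingerprint: this sketch gives no verdict
    return observed != claimed_family(user_agent)


chrome_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
print(is_inconsistent(chrome_ua, "fp_python_example"))  # True
print(is_inconsistent(chrome_ua, "fp_chrome_example"))  # False
```

Note that changing the IP address touches neither input, which is why proxy rotation does nothing against this class of check.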

Hardware Layer and Deep Browser Fingerprinting

If you survive the network layer (or use a real headless browser like Playwright with valid TLS), the WAF moves to environment analysis. It injects obfuscated JavaScript challenges to query hundreds of APIs and validate your physical hardware.

Graphics are a massive target. Systems request the browser to render hidden 2D/3D canvases and hash the output. This rendering is unique to your specific combo of GPU, drivers, OS, and fonts. For server-side automation, the UNMASKED_RENDERER_WEBGL parameter is a death sentence. On cloud instances without dedicated GPUs, this returns Google SwiftShader or llvmpipe—software renderers that emulate GPUs via the CPU. Seeing these strings instantly drops your session trust score to zero.

The AudioContext API is another trap. By analyzing how a browser filters audio waves, WAFs can detect emulated hardware stacks. Throw in mismatched CPU core counts (navigator.hardwareConcurrency), RAM (navigator.deviceMemory), and OS-specific fonts, and any inconsistency (like having Windows-only fonts while claiming to be on macOS) immediately flags you as a bot.
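As a rough illustration of the environment checks in this section, the sketch below flags a software renderer string and fonts that do not belong to the claimed OS. The report layout, the function names, and the tiny font list are assumptions for the example; production systems weigh far more signals.

```python
SOFTWARE_RENDERER_MARKERS = ("swiftshader", "llvmpipe")

# Tiny illustrative sample, not a complete list of Windows-only fonts.
WINDOWS_ONLY_FONTS = {"Segoe UI", "Calibri", "Consolas"}


def uses_software_renderer(renderer: str) -> bool:
    """True if the WebGL renderer string names a CPU-based renderer."""
    lowered = renderer.lower()
    return any(marker in lowered for marker in SOFTWARE_RENDERER_MARKERS)


def environment_flags(report: dict) -> list:
    """Collect human-readable reasons why a fingerprint report looks off."""
    flags = []
    if uses_software_renderer(report.get("webgl_renderer", "")):
        flags.append("software renderer")
    foreign_fonts = WINDOWS_ONLY_FONTS & set(report.get("fonts", []))
    if report.get("claimed_os") == "macos" and foreign_fonts:
        flags.append("windows-only fonts on macos")
    return flags


cloud_report = {
    "claimed_os": "macos",
    "webgl_renderer": "Google SwiftShader",
    "fonts": ["Helvetica", "Segoe UI"],
}
print(environment_flags(cloud_report))
# ['software renderer', 'windows-only fonts on macos']
```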

Behavioral Biometrics and Risk Scoring

Protection systems have evolved from asking users to click fire hydrants to running invisible, continuous background scoring. Cloudflare Turnstile and reCAPTCHA v3 watch how you interact with the page.

This behavioral analytics engine leans heavily on Fitts's Law. Algorithms track mouse physics, scrolling rhythm, micro-tremors, and click pauses. Human movement is messy—we accelerate, decelerate, overshoot targets, and micro-correct. Basic automation scripts, however, draw perfect straight lines or instantly teleport the cursor to the exact center of a button. Machine learning models catch this mechanical perfection instantly.
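One of the simplest signals of this kind is path straightness: how far the cursor travelled compared with the direct distance between start and end. The sketch below is a toy heuristic, not what any vendor actually ships; the function names, the sample paths, and the 0.999 threshold are illustrative assumptions.

```python
import math


def path_straightness(points):
    """Direct distance divided by travelled distance (1.0 = perfectly straight)."""
    if len(points) < 2:
        return 1.0  # a single sample means the cursor "teleported"
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def looks_mechanical(points, threshold=0.999):
    """Flag paths that are (almost) perfectly straight."""
    return path_straightness(points) >= threshold


scripted = [(0, 0), (50, 50), (100, 100)]
human_like = [(0, 0), (30, 42), (71, 64), (104, 97), (100, 100)]

print(looks_mechanical(scripted))    # True
print(looks_mechanical(human_like))  # False
```

The human-like path overshoots the target and corrects back, so it travels farther than the straight line and its ratio drops well below the threshold.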

reCAPTCHA v3 returns a score from 0.0 (bot) to 1.0 (human). To get above a 0.7 today, you need perfectly consistent fingerprints, a realistic session history, valid Google cookies, and human-like page interaction. Turnstile takes a different route, blending background telemetry with invisible Proof-of-Work (PoW) cryptographic challenges. It forces your browser to burn CPU cycles while analyzing your JS execution environment, filtering out primitive scrapers while stalling more advanced setups.
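On the site's side, the reCAPTCHA v3 score arrives in the JSON returned by the verification endpoint, and the site decides what to do with it. Here is a minimal sketch of that decision, assuming the response has already been fetched and parsed into a dict. The function name and the sample response values are hypothetical; the 0.7 cutoff simply mirrors the figure mentioned above.

```python
def evaluate_recaptcha_v3(verification: dict, expected_action: str, min_score: float = 0.7) -> bool:
    """Decide whether a parsed reCAPTCHA v3 verification response passes."""
    if not verification.get("success"):
        return False  # token was invalid or expired
    if verification.get("action") != expected_action:
        return False  # token was issued for a different page action
    return float(verification.get("score", 0.0)) >= min_score


# Made-up example of a parsed verification response.
sample_response = {
    "success": True,
    "score": 0.3,
    "action": "login",
    "hostname": "example.com",
}

print(evaluate_recaptcha_v3(sample_response, expected_action="login"))  # False
```

Because the threshold lives on the server, a low-scoring session never sees a puzzle to solve; the request is just rejected or routed to extra verification.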

Strategic Flaws: The Frankenstein Effect

Looking at GitHub issue trackers and engineering forums, the biggest reason scrapers get blocked isn't the WAF itself—it's bad stealth architecture.

The most common trap is the "Frankenstein Effect." Developers try to hide by patching random browser properties. For instance, they'll spoof the User-Agent to say "Safari on macOS," but forget to override navigator.platform (which still reads Linux x86_64), or they leave Chrome-specific V8 APIs exposed that don't exist in Safari's WebKit. WAFs work on a "trust but verify" basis. Catch a bot lying about one minor attribute, and the whole session is burned.
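A cross-check of this kind is easy to picture. The sketch below compares what the User-Agent claims with two other signals a page script can read. Everything here is an illustrative assumption: the function names, the exposes_chrome_object flag (standing in for any Chrome-only API), and the handful of rules.

```python
EXPECTED_PLATFORM_PREFIX = {"macos": "Mac", "windows": "Win", "linux": "Linux"}


def claimed_os(user_agent: str) -> str:
    """Read the operating system the User-Agent string claims."""
    if "Macintosh" in user_agent:
        return "macos"
    if "Windows NT" in user_agent:
        return "windows"
    if "Linux" in user_agent:
        return "linux"
    return "unknown"


def find_contradictions(user_agent: str, platform: str, exposes_chrome_object: bool) -> list:
    """List the ways the reported environment contradicts the User-Agent."""
    problems = []
    prefix = EXPECTED_PLATFORM_PREFIX.get(claimed_os(user_agent))
    if prefix and not platform.startswith(prefix):
        problems.append(f"platform '{platform}' does not match claimed OS")
    claims_safari = "Safari/" in user_agent and "Chrome/" not in user_agent
    if claims_safari and exposes_chrome_object:
        problems.append("Chrome-only API present in a claimed Safari")
    return problems


safari_ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15"
print(len(find_contradictions(safari_ua, "Linux x86_64", True)))  # 2
print(find_contradictions(safari_ua, "MacIntel", False))          # []
```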

Another massive flaw is relying on old-school JS injection. Classic plugins like puppeteer-extra-plugin-stealth use JS monkey-patching to hide flags. Modern anti-bots easily bypass this by running checks inside isolated Web Workers. Injectable stealth scripts usually only run in the main thread. The Web Worker bypasses the spoofed environment, reads the real system values, and reports back to the server.

Finally, there's the WebRTC leak issue. Engineers spend thousands on elite residential proxies, hook them into Playwright, but forget to disable WebRTC. When an obfuscated script triggers a WebRTC connection, the browser obediently leaks your actual AWS or Hetzner server IP, completely de-anonymizing your setup.

Network Tooling: TLS and HTTP Spoofing

To beat the JA3/JA4 checks, the open-source community built tools that make Python or Go scripts look exactly like legitimate browsers at the packet level.

Library / Client	Ecosystem	Key Features	Evasion Capability (Network)
requests (urllib3)	Python	No emulation, static JA3.	0% (Instant Cloudflare block)
curl_cffi	Python / C	Perfect Chrome/Safari ClientHello copy, HTTP/2 multiplexing.	Excellent for static content, bypasses basic WAFs.
httpcloak	Golang	JA4, HTTP/3 (QUIC), ECH, HTTP/2 SETTINGS.	Top-tier stealth at the transport layer.
tls-client	Golang	JA3/JA4 spoofing, custom header ordering.	Great for high-load API scraping.

curl_cffi (Python): A wrapper for the curl-impersonate project. It swaps out OpenSSL for Chrome's BoringSSL, allowing your requests to match real browser fingerprints down to the byte. It handles HTTP/2 multiplexing natively, pushing bypass success rates from 2% to over 85% on static pages.
httpcloak & tls-client (Golang): These go even deeper, handling Encrypted Client Hello (ECH) and HTTP/3 (QUIC). Advanced WAFs heavily scrutinize HTTP/2 stream weights and frame ordering; these tools handle that natively, letting you scrape APIs without spinning up heavy headless browsers.

The Evolution of Browser Frameworks

When you have to render dynamic SPAs or solve interactive CAPTCHAs, you need a real browser. But vanilla Selenium or Playwright is dead on arrival.

Tool	Tech Paradigm	Stealth Level	Key Takeaway
Vanilla Playwright	WebDriver / CDP	Low	Highly stable API, but instantly flagged by any WAF.
Puppeteer Stealth	JS Monkey-patching	Moderate	Vulnerable to Web Worker leaks. Rapidly losing effectiveness.
Patchright / Rebrowser	CDP Domain Isolation	High	Lets you use Playwright while hiding the CDP connection from the page.
Nodriver / Zendriver	Native Async CDP	Very High	No WebDriver binary. Zendriver is optimized for Docker scale.
Camoufox	C++ Engine Modification	Maximum	Spoofing at the source-code level. Scores 0% bot probability on extreme benchmarks.

The Driverless CDP Revolution

Dropping the intermediary binary (ChromeDriver) entirely changed the game. Nodriver (Python) talks directly to the browser via Chrome DevTools Protocol (CDP) websockets, avoiding flagged domains. Zendriver, a fork of Nodriver, added strict typing and native Docker support, solving the massive headache of deploying hardware-accelerated browsers in Linux containers.

Engine Patching: Patchright & Camoufox

Playwright is fast, but WAFs detect the active CDP connection through environment leaks. Patchright drops deep patches into Playwright to isolate injected code, keeping it invisible to the site's anti-bot scripts. Taking it further, Camoufox is a custom Firefox build where all fingerprint spoofing happens in C++ at the engine level. Since it's baked into the core, no JavaScript on the page can detect the spoofing.

Cursor Simulation

To beat behavioral heuristics, tools like Human-cursor use Fitts's Law and cubic Bezier curves to calculate dynamic speeds, add random control point scatter, and simulate realistic "overshoots" and micro-corrections, rather than clicking dead-center every time.
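For reference, the two pieces of math named above are both short. Fitts's Law (in its common Shannon formulation) predicts movement time as MT = a + b * log2(D / W + 1), and a cubic Bezier curve blends four control points. The sketch below only evaluates those formulas; the constants a and b are illustrative rather than measured, and this is not how Human-cursor or any specific tool is implemented.

```python
import math


def fitts_movement_time(distance: float, target_width: float, a: float = 0.1, b: float = 0.15) -> float:
    """Predicted movement time in seconds: MT = a + b * log2(D / W + 1)."""
    return a + b * math.log2(distance / target_width + 1)


def cubic_bezier(p0, p1, p2, p3, t: float):
    """Point on a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


# A far, small target takes longer to reach than a near, large one.
far_small = fitts_movement_time(distance=800, target_width=20)
near_large = fitts_movement_time(distance=100, target_width=80)
print(far_small > near_large)  # True

# The curve starts at p0 and ends at p3; p1 and p2 bend it in between.
start, end = (0.0, 0.0), (300.0, 120.0)
midpoint = cubic_bezier(start, (90.0, 140.0), (220.0, -30.0), end, 0.5)
```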

Solving CAPTCHAs: The Open-Source ML Era

The industry has shifted to autonomous, local open-source pipelines.

Tech / Engine	Best Use Case	Speed	Noise/Distortion Resistance
Tesseract v5.x	Clean document scans	Fast (CPU)	Low (Fails on lines and heavy noise)
ddddocr (-rs)	Classic alphanumeric CAPTCHAs	Extremely Fast	High (Trained specifically on synthetics)
PaddleOCR	Complex text, multiple languages	Moderate (GPU)	Very High
VLM (Qwen-VL, DeepSeek)	Logic puzzles (FunCaptcha)	Slow (Needs VRAM)	Maximum (Semantic context understanding)

Vision-Language Models (VLMs)

Classic Tesseract is obsolete for modern noisy CAPTCHAs. While ddddocr is great for simple alphanumeric strings, the real game-changers are open-source VLMs like Qwen2.5-VL and DeepSeek-OCR. They understand semantic context. Ask them to "Select the image where the dice add up to 14," and they return exact bounding box coordinates for your script to click. They achieve 90%+ accuracy on logic puzzles that completely break traditional OCR.

Audio CAPTCHAs & OpenAI Whisper

Accessibility laws require sites to offer audio CAPTCHAs, which happens to be a massive vulnerability. Scrapers simply click the audio button, download the file, and feed it into a lightweight, local instance of Whisper (like whisper-tiny). Whisper easily cuts through the synthetic background noise intended to confuse bots, returns the transcript in milliseconds, and solves the challenge for free.

Architectural Shifts

Proof-of-Work over Puzzles: Strict privacy laws (GDPR) are killing behavioral tracking. Projects like ALTCHA drop visual puzzles entirely, instead sending a cryptographic challenge (like computing a SHA-256 hash) to the client. This proves CPU time was spent, stopping massive DDoS attacks while natively supporting rate-limited API scraping.
The Struggle of Autonomous AI Agents: LLMs can reason through a checkout flow flawlessly but fail at CAPTCHAs. Why? Because the LLM thinks in high-latency steps and translates actions into perfect, mechanical clicks. WAFs block them instantly for lacking human micro-motor skills. AI agents now have to hand off CAPTCHA checkpoints to specialized stealth tooling.
Docker Orchestration: Running one stealth browser is easy. Scaling 500 concurrent sessions without cache collision, /dev/shm memory crashes, or WebRTC proxy leaks is the real engineering challenge. The industry is moving heavily toward cloud-native orchestrators and ephemeral containers, masking infrastructure complexity behind simple APIs.
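The proof-of-work idea from the first item above fits in a few lines. In this simplified sketch the server hashes a salt plus a secret number, and the client has to brute-force that number. It follows the general shape of such schemes but leaves out the signature and expiry handling a real deployment (ALTCHA included) would need; the function names and parameters are hypothetical.

```python
import hashlib


def make_challenge(salt: str, secret_number: int) -> str:
    """Server side: publish SHA-256(salt + number) as the challenge."""
    return hashlib.sha256(f"{salt}{secret_number}".encode()).hexdigest()


def solve_challenge(salt: str, challenge: str, max_number: int):
    """Client side: try every number until the hash matches (this is the 'work')."""
    for candidate in range(max_number + 1):
        digest = hashlib.sha256(f"{salt}{candidate}".encode()).hexdigest()
        if digest == challenge:
            return candidate
    return None


def verify_solution(salt: str, challenge: str, number: int) -> bool:
    """Server side: a single hash is enough to check the client's answer."""
    return make_challenge(salt, number) == challenge


challenge = make_challenge("demo-salt", 4242)
solution = solve_challenge("demo-salt", challenge, max_number=10_000)
print(solution)                                         # 4242
print(verify_solution("demo-salt", challenge, solution))  # True
```

The cost is asymmetric by design: the client performs thousands of hashes, the server performs one, and raising max_number makes each request proportionally more expensive for whoever is sending them in bulk.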

The Bottom Line

Bypassing anti-bot systems is no longer just about parsing HTML. It's a hardcore engineering discipline sitting at the intersection of cryptographic reverse-engineering, C++ engine patching, and deploying heavy ML models.

Vanilla Playwright or Selenium is dead for enterprise scraping. Success requires a coherent digital footprint: matching JA4 fingerprints at the network layer (curl_cffi), driverless CDP interactions (Nodriver) or patched engines (Camoufox) at the execution layer, and Fitts's Law cursor simulations for behavior. Coupled with local Whisper/VLM pipelines for CAPTCHAs, this hybrid microservice approach is the only way to survive the escalating arms race of web automation.
