Bypassing Cloudflare Challenge with Python and Selenium
2Captcha is a Cloudflare captcha solver.
The 2Captcha service can automatically bypass Turnstile Challenge captchas. This article describes how to interact with its API.
It may be useful to developers of Python projects that automate tasks on websites protected by the Turnstile Challenge captcha.
What is Turnstile Challenge captcha?
Turnstile is a Cloudflare captcha designed to block bots and automated systems. It is easy for a regular website visitor to pass but challenging for bots and automated systems.
An example of how Turnstile Challenge looks:
Preparation for Bypassing Turnstile CAPTCHA
Before bypassing the Turnstile CAPTCHA, you need to prepare your working environment and tools. You'll need:
- Python and the SeleniumBase library for automating the web browser.
- A 2captcha API key for solving the CAPTCHA.
- Proxies if you need to bypass the CAPTCHA from different IP addresses.
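One way to keep the API key and proxy out of the script itself is to read them from environment variables. The sketch below is only an illustration: the variable names TWOCAPTCHA_API_KEY and TWOCAPTCHA_PROXY and the load_settings helper are hypothetical and are not part of 2Captcha or SeleniumBase.

```python
import os


def load_settings(env=None):
    """Return the API key and optional proxy from a mapping of environment variables."""
    # Hypothetical variable names; rename them to match your own setup.
    env = os.environ if env is None else env
    api_key = env.get("TWOCAPTCHA_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set TWOCAPTCHA_API_KEY before running the script")
    # The proxy is optional, so an empty or missing value becomes None.
    proxy = env.get("TWOCAPTCHA_PROXY", "").strip() or None
    return {"api_key": api_key, "proxy": proxy}
```

Calling load_settings() with no arguments reads the real environment; passing a plain dictionary makes the helper easy to try out without exporting anything.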
Steps to Bypass Turnstile CAPTCHA
1. Import necessary libraries and dependencies:
Before bypassing the Turnstile CAPTCHA, import the necessary libraries: json, re, requests, seleniumbase and time. They handle JSON data, regular-expression matching, HTTP requests, browser automation and delays in the code, respectively.
import json
import re
import requests
from seleniumbase import Driver
from selenium.webdriver.common.by import By
import time
2. Starting the script:
This step defines the URL of the website protected by the Turnstile CAPTCHA, sets the proxy (if necessary) and the 2captcha API key, sets an up-to-date User-Agent, and configures the web driver to run in headless mode.
proxy = "xxxxxx:xxxxxx@xx.xxx.xxx.xx:xxxx" # YOUR_2CAPTCHA_PROXY
my_key = "your_2captcha_api_key"
# Setting up an updated UserAgent
agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
# Configuring the web driver to work in headless mode
driver = Driver(uc=True, log_cdp=True, headless=True, no_sandbox=True, agent=agent, proxy=False)
# URL of the website protected by the Turnstile CAPTCHA
url = "URL of the website protected by the Turnstile"
driver.get(url)
driver.refresh()
time.sleep(5)
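The proxy string above follows the login:password@ip:port layout. A minimal sketch for checking that layout before the browser starts is shown here; parse_proxy is a hypothetical helper written for this article, not a SeleniumBase or 2Captcha function.

```python
def parse_proxy(proxy):
    """Split 'login:password@host:port' into parts, raising ValueError if malformed."""
    # Split on the last '@' so a password containing '@' is still handled.
    credentials, sep, address = proxy.rpartition("@")
    if not sep:
        raise ValueError("expected login:password@host:port")
    login, sep, password = credentials.partition(":")
    if not sep or not login:
        raise ValueError("missing login or password")
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("missing host or numeric port")
    return {"login": login, "password": password, "host": host, "port": int(port)}
```

Failing early on a malformed proxy string is cheaper than discovering it after a CAPTCHA task has already been submitted.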
3. Intercepting Turnstile CAPTCHA parameters:
Next, we override the turnstile.render and console.clear methods to output the necessary CAPTCHA parameters in the browser console. We also store the page's callback function, which is needed later to submit the token.
# Function to intercept CAPTCHA parameters using JavaScript
def intercept(driver):
    driver.execute_script("""
    // Keep the page from wiping the console output we rely on
    console.clear = () => console.log('Console was cleared')
    // Poll until the Turnstile script has loaded, then wrap render()
    const i = setInterval(() => {
        if (window.turnstile) {
            console.log('success!!')
            clearInterval(i)
            window.turnstile.render = (a, b) => {
                let params = {
                    sitekey: b.sitekey,
                    pageurl: window.location.href,
                    data: b.cData,
                    pagedata: b.chlPageData,
                    action: b.action,
                    userAgent: navigator.userAgent,
                    json: 1
                }
                console.log('intercepted-params:' + JSON.stringify(params))
                // Keep the page's callback so the token can be passed to it later
                window.cfCallback = b.callback
                return
            }
        }
    }, 50)
    """)
    time.sleep(1)
    # Retrieving browser logs containing intercepted parameters
    logs = driver.get_log("browser")
    for log in logs:
        if log['level'] == 'INFO' and "intercepted-params:" in log["message"]:
            log_entry = log["message"].encode('utf-8').decode('unicode_escape')
            match = re.search(r'"intercepted-params:({.*?})"', log_entry)
            if match:
                return json.loads(match.group(1))
    # No parameters were logged (for example, render() has not been called yet)
    return None
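The log-parsing part of intercept() can be tried without a browser. The sketch below isolates it in a hypothetical helper, extract_params. The sample message is an assumption about how Chrome formats a console entry (escaped quotes inside a quoted string), so the format should be checked against real driver logs.

```python
import json
import re


def extract_params(message):
    """Return the intercepted parameters from one browser log message, or None."""
    # Browser log messages carry escaped quotes (\"), so unescape them first.
    decoded = message.encode("utf-8").decode("unicode_escape")
    match = re.search(r'"intercepted-params:({.*?})"', decoded)
    if match is None:
        return None
    return json.loads(match.group(1))


# Assumed shape of a console log entry, used here for illustration only.
sample = 'console-api 2:32 "intercepted-params:{\\"sitekey\\":\\"0x4AAAA\\",\\"action\\":\\"managed\\"}"'
params = extract_params(sample)
```

Returning None instead of raising lets the caller decide whether to retry the interception or give up.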
4. Sending parameters to the 2captcha API server:
After obtaining the necessary CAPTCHA parameters, we send them to the 2captcha server using the following API request.
proxy = "login:password@ip_address:port"  # Specify the address of your proxy
# Parameters captured by intercept() from step 3
params = intercept(driver)
data0 = {
    "key": my_key,
    "method": "turnstile",
    "sitekey": params["sitekey"],
    "action": params["action"],
    "data": params["data"],
    "pagedata": params["pagedata"],
    "useragent": params["userAgent"],
    "json": 1,
    "pageurl": params["pageurl"],
    "proxy": proxy,
    "proxytype": "http",
}
response = requests.post("https://2captcha.com/in.php", data=data0)
print("Request sent", response.text)
# ID of the submitted task, used to request the result
s = response.json()["request"]
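The request above reads the task ID straight from the reply, so an error code would be treated as an ID. A minimal sketch of a check is shown below. It assumes that, with json=1, the reply carries a status field equal to 1 on success and that request then holds the task ID (otherwise an error code). The helper name task_id_from_response is hypothetical, and the reply shape should be confirmed against the current API documentation.

```python
def task_id_from_response(payload):
    """Return the task ID from a parsed in.php reply, or raise on an API error."""
    # Assumed reply shape: {"status": 1, "request": "<task id>"} on success,
    # {"status": 0, "request": "<error code>"} on failure.
    if payload.get("status") != 1:
        raise RuntimeError(f"2captcha rejected the request: {payload.get('request')}")
    return payload["request"]
```

In the script this would replace the direct lookup, for example s = task_id_from_response(response.json()).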
5. Using the CAPTCHA solving result
After sending a request to the 2captcha server, wait for the CAPTCHA solving result. Make requests to the server every 5-10 seconds to check if the CAPTCHA has been solved. When the CAPTCHA is successfully solved, the API will return a token, which needs to be passed to the script's callback function to continue execution.
while True:
    solu = requests.get(
        f"https://2captcha.com/res.php?key={my_key}&action=get&json=1&id={s}",
        proxies={"http": "http://" + proxy}
    ).json()
    if solu["request"] == "CAPCHA_NOT_READY":
        print(solu["request"])
        time.sleep(8)
    elif "ERROR" in solu["request"]:
        print(solu["request"])
        driver.close()
        driver.quit()
        exit(0)
    else:
        break
for key, value in solu.items():
    print(key, ": ", value)
solu = solu['request']
driver.execute_script(f" cfCallback('{solu}');")
time.sleep(5)
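The polling loop above runs until the service answers, with no upper bound. The sketch below shows one way to cap the number of attempts. The names wait_for_token, fetch, max_attempts and sleep are hypothetical; fetch stands for any function that returns the parsed res.php reply, which keeps the loop testable without network access.

```python
import time


def wait_for_token(fetch, interval=8, max_attempts=15, sleep=time.sleep):
    """Call fetch() until it yields a token, an error, or the attempts run out."""
    for _ in range(max_attempts):
        result = fetch()["request"]
        if result == "CAPCHA_NOT_READY":
            # The task is still being solved; wait before asking again.
            sleep(interval)
            continue
        if "ERROR" in result:
            raise RuntimeError(result)
        return result
    raise TimeoutError("CAPTCHA was not solved in time")


# Possible use with the variables from the script above (not executed here):
# token = wait_for_token(lambda: requests.get(
#     f"https://2captcha.com/res.php?key={my_key}&action=get&json=1&id={s}").json())
```

Raising an exception instead of calling exit() leaves it to the caller to close the driver, for instance in a try/finally block.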
6. Done! You have successfully passed the Turnstile captcha challenge and can continue working with the website.
Useful links
- Cloudflare Challenge test page
- Turnstile CAPTCHA demo code
- Python demo code on GitHub
- JavaScript demo code on GitHub