DataDome bypass: Python guide

17.05.2024

If you have encountered DataDome captcha and are looking for a way to bypass it, this tutorial is for you.

The 2Captcha API solver helps you bypass the DataDome captcha.

The article describes the process of interacting with the API.

The following information may be useful for developers of Python projects that automate tasks on websites protected by the DataDome captcha.

How does DataDome captcha work?

DataDome is a security company that specializes in bot protection for websites, mobile apps, and APIs. It uses artificial intelligence (AI) and machine learning algorithms to analyze traffic patterns and identify bot-like behavior.

DataDome's technology is very sophisticated. However, with the right tool you can bypass it and reliably scrape the data you need.

A DataDome captcha challenge could look something like:

[Image: example of a DataDome captcha challenge]

Preparation for bypassing DataDome captcha

Before bypassing DataDome, you need to set up your working environment and tools. You'll need:

Python - the language in which we will write the solution.
API key - available in your 2Captcha Personal Cabinet.
Proxies - to hide your real IP address and avoid being blocked. You may use the 2Captcha proxy service for that.

Steps to bypass DataDome captcha

1. Import necessary libraries and dependencies:

First, you will need the necessary libraries, such as requests for executing HTTP requests and json for working with data in JSON format.

import json
import time
import requests

Next, register an account on the 2captcha.com website to get your API key, and prepare your proxy parameters.

2. Defining parameters and cookies

With the preparations done, let's set the request parameters: the site URL, the proxy, the 2Captcha API key, and the user agent.
In addition, we will use cookies to bypass the DataDome protection. The cookie is stored in a separate file, cookies_datadome.json, which is created and updated automatically while the script runs.

proxy = "login:password@ip_address:port"  # indicate your proxy parameters
my_key = "your_2captcha_api_key"
url = "https://www.example.com/"
user_agent = "your_user_agent_here"

# Reading the saved cookie; on the first run the file does not exist yet
try:
    with open("cookies_datadome.json", 'r') as json_file:
        cookie_value = json.load(json_file)
except FileNotFoundError:
    cookie_value = ""
cookies = {'datadome': cookie_value}

# Forming the request headers
headers = {
    'accept': '*/*',
    'accept-language': 'en-US',
    'content-type': 'text/plain;charset=UTF-8',
    'origin': url,
    'referer': url,
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-origin',
    'user-agent': user_agent,
}

Note! The content of the headers depends on the site you are working with. You have to inspect it with the Dev Tools.
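The cookie file handling described in this step can be wrapped in two small helpers. This is a minimal sketch, not part of the original script: the function names and the empty-string fallback for a missing or unreadable file are assumptions made for illustration.

```python
import json
from pathlib import Path

COOKIE_FILE = Path("cookies_datadome.json")  # same file name as in this guide


def load_datadome_cookie(path=COOKIE_FILE):
    """Return the saved cookie value, or an empty string if nothing usable is stored."""
    try:
        with open(path, "r", encoding="utf-8") as json_file:
            return json.load(json_file)
    except (FileNotFoundError, json.JSONDecodeError):
        # First run or damaged file: start without a cookie (assumed behavior)
        return ""


def save_datadome_cookie(value, path=COOKIE_FILE):
    """Write the cookie value to disk as a JSON string."""
    with open(path, "w", encoding="utf-8") as json_file:
        json.dump(value, json_file)
```

With these helpers, the request cookies could be built as {'datadome': load_datadome_cookie()}.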

3. Sending a Request to the Website

Once we have prepared the necessary parameters and headers, the next step is to send a GET request to the target website. In this step, we use the requests library, which provides a simple interface for making HTTP requests.

res = requests.get(
    url,
    # requests picks the proxy by URL scheme, so an https:// URL needs the 'https' key
    proxies={'http': 'http://' + proxy, 'https': 'http://' + proxy},
    cookies=cookies,
    headers=headers
)
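The requests library chooses a proxy by the scheme of the target URL, so a proxies dictionary that contains only the "http" key is not applied to an https:// address. The sketch below shows one way to build the dictionary for both schemes. The helper name build_proxies is hypothetical, and the proxy string is assumed to be in the login:password@ip_address:port form used in step 2.

```python
def build_proxies(proxy):
    """Map both URL schemes to the same HTTP proxy for use with requests."""
    proxy_url = "http://" + proxy
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("login:password@127.0.0.1:8080")
```

The result can be passed to requests.get() as the proxies argument.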

4. Handling the blocking case

If we receive the status code 403 (Forbidden), it means that DataDome is blocking access. In this case, we need to send a request to the 2Captcha API to bypass the captcha.
To do so, we first extract the parameters for that request from the blocked response:

if res.status_code == 403:
    # Fetching DataDome block information
    dd = res.text.split('dd=')[1]
    dd = dd.split('</script')[0]
    dd = json.loads(dd.replace("'", '"'))

    # Fetching CID from Cookie
    cid = res.headers.get('Set-Cookie').split('datadome=')[1]
    cid = cid.split(';')[0]
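The string splitting above raises an IndexError when the page contains no dd= fragment. A slightly more defensive variant is sketched below. It assumes, as the code above does, that the block page embeds a flat JavaScript object assigned to dd with single-quoted keys and values. The function name and the sample HTML are made up for illustration.

```python
import json
import re


def extract_dd(html):
    """Return the DataDome dd object from a block page as a dict, or None if absent."""
    # Assumption: dd is a flat object, so the first closing brace ends it
    match = re.search(r"dd\s*=\s*(\{.*?\})", html, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1).replace("'", '"'))


# Made-up block page fragment, used only to demonstrate the parsing
sample = "<script>var dd={'cid':'abc','hsh':'123','t':'fe','s':42,'e':'xyz'}</script>"
dd = extract_dd(sample)
```

A None result can then be treated as "this 403 did not come from a DataDome captcha page".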

5. Generating the URL for the captcha request

After we have extracted the block information from the DataDome page and the CID from the cookies, we need to build the captcha URL that will be passed to the 2Captcha API service.

captcha_url = (
    f"https://geo.captcha-delivery.com/captcha/?"
    f"initialCid={dd['cid']}&hash={dd['hsh']}&"
    f"cid={cid}&t={dd['t']}&referer=https%3A%2F%2Fwww.example.com%2Fapi%2Fgraphql&"
    f"s={dd['s']}&e={dd['e']}"
)
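The same URL can be assembled with urllib.parse.urlencode, which percent-encodes every value, so the referer does not have to be encoded by hand. This is a sketch under the same assumptions as the snippet above: dd holds the keys cid, hsh, t, s and e. The function name build_captcha_url is hypothetical.

```python
from urllib.parse import urlencode


def build_captcha_url(dd, cid, referer):
    """Build the geo.captcha-delivery.com URL from the parsed dd object and the CID."""
    params = {
        "initialCid": dd["cid"],
        "hash": dd["hsh"],
        "cid": cid,
        "t": dd["t"],
        "referer": referer,  # plain URL; urlencode escapes it
        "s": dd["s"],
        "e": dd["e"],
    }
    return "https://geo.captcha-delivery.com/captcha/?" + urlencode(params)
```

For example, build_captcha_url(dd, cid, "https://www.example.com/api/graphql") produces the same referer encoding as the hand-written string above.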

6. Send the request to 2Captcha API

We send a POST request to the 2Captcha API, passing the captcha URL and the other parameters.

    data = {
        "key": my_key,
        "method": "datadome",
        "captcha_url": captcha_url,  # the URL built in step 5
        "pageurl": url,
        "json": 1,
        "userAgent": user_agent,
        "proxy": proxy,
        "proxytype": "http",
    }
    response = requests.post("https://2captcha.com/in.php", data=data)
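With json set to 1, the API answers with a JSON object that has a status field and a request field: status 1 means the task was accepted and request holds the task ID, otherwise request holds an error code. The helper below is a small sketch for checking that reply before polling. Its name is hypothetical.

```python
def task_id_from_submit(payload):
    """Return the task ID from an in.php JSON reply, or raise if the task was rejected."""
    if payload.get("status") != 1:
        raise RuntimeError(f"2Captcha rejected the task: {payload.get('request')}")
    return payload["request"]
```

It would be called as task_id_from_submit(response.json()) right after the POST request.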

7. Waiting for the response

After sending the request, we need to wait for the captcha to be solved. A loop polls the 2Captcha API every five seconds until the solution is ready or an error is returned.

s = response.json()["request"]  # ID of the task submitted in step 6

while True:
    solu = requests.get(f"https://2captcha.com/res.php?key={my_key}&action=get&json=1&id={s}").json()
    if solu["request"] == "CAPCHA_NOT_READY":
        time.sleep(5)  # not solved yet, check again shortly
    elif "ERROR" in solu["request"]:
        # Stop with the error text and a non-zero exit status
        raise SystemExit(solu["request"])
    else:
        break
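The loop above runs forever if the task never finishes. A variant with an upper time limit is sketched below. The function name, the 180-second default and the injectable fetch_result and sleep parameters are assumptions for illustration. fetch_result is any callable that returns the parsed res.php reply.

```python
import time


def wait_for_solution(fetch_result, poll_interval=5, timeout=180, sleep=time.sleep):
    """Poll until a solution arrives, an error is reported, or the timeout passes."""
    waited = 0
    while waited <= timeout:
        answer = fetch_result()
        request = answer["request"]
        if request == "CAPCHA_NOT_READY":
            sleep(poll_interval)  # not solved yet, wait before the next check
            waited += poll_interval
        elif answer.get("status") != 1:
            raise RuntimeError(f"2Captcha error: {request}")
        else:
            return request
    raise TimeoutError(f"No solution after {timeout} seconds")
```

In the script it could be called with a lambda that performs the same res.php GET request as above and returns its .json() result.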

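8. Updating cookies and writing to a file

The code below assumes that the solution string starts with the new cookie in the form datadome=value, followed by its attributes. We take the value and save it to the file for the next run.

cookie_value = solu["request"].split(";")[0].split("=", 1)[1]
with open("cookies_datadome.json", 'w') as json_file:
    json.dump(cookie_value, json_file)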
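The cookie parsing can also be written as a small function that checks the cookie name and keeps any "=" characters inside the value. This is a sketch that assumes the solution has the Set-Cookie style shown in the example string, which is made up. The function name is hypothetical.

```python
def cookie_value_from_solution(solution):
    """Extract the datadome cookie value from a Set-Cookie style solution string."""
    first_pair = solution.split(";", 1)[0]
    name, _, value = first_pair.partition("=")
    if name.strip() != "datadome" or not value:
        raise ValueError("solution does not start with a datadome cookie")
    return value


# Made-up solution string, used only to demonstrate the parsing
example = "datadome=ab=cd_ef; Max-Age=31536000; Domain=.example.com; Path=/"
value = cookie_value_from_solution(example)
```

Raising an error on an unexpected string avoids writing a broken value to cookies_datadome.json.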

9. Done! You have successfully passed the DataDome captcha, and you can continue working with the website.

Restart the script with a new proxy. The 403 error should no longer appear, and you will get access to the data on the website.

Conclusion

In this article, we described how to bypass the DataDome captcha challenge using the 2Captcha API service and how to update the cookies for later use when bypassing the DataDome protection. This process gives you efficient and reliable access to the required data on websites protected by DataDome.

Bypass any captcha using the Python captcha solver.

References

You can also download the full code here or view it in Gist.

Detailed information on bypassing captcha is published on the API page.

Additional information on working with the API service for customers is available on the FAQ page.

Code examples for working with the 2Captcha API service can be found on the official GitHub page.