How to Automatically Extract Captcha Sitekeys for Submission

To solve captchas via the 2Captcha API, you need a valid sitekey. You can find it manually using DevTools, but for automation, it's more reliable to extract parameters programmatically.

In this article, I'll show you working ways to automatically extract sitekeys for different captcha types, cover common mistakes, and provide ready-to-use Python examples tailored to 2Captcha requirements.


What is a sitekey and why you need it

A sitekey is a public identifier for a captcha widget. The protection service (Google reCAPTCHA, Cloudflare Turnstile, and others) issues it when a site registers.

Within the 2Captcha ecosystem, you need the sitekey to:

  • identify the target site when creating a task,
  • build a correct API request,
  • generate a valid solution token.

Without the right sitekey, 2Captcha returns an ERROR_NO_SITEKEY error, or the task is solved but the resulting token is rejected on the target page.

Note: in the 2Captcha API, the parameter name varies. For reCAPTCHA v2/v3 it's googlekey, for Turnstile it's sitekey, for GeeTest it's gt. Automatic extraction gives you a universal value that you then map to the correct parameter.
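
The mapping described in the note can live in one small lookup table. This is a minimal sketch: the `PARAM_BY_TYPE` table, the `sitekey_field` helper, and the type labels used as dictionary keys are hypothetical names chosen for illustration; only the parameter names come from the note above.

```python
# Hypothetical lookup: captcha type label -> name of the 2Captcha parameter
# that carries the extracted key (parameter names taken from the note above).
PARAM_BY_TYPE = {
    "recaptcha_v2": "googlekey",
    "recaptcha_v3": "googlekey",
    "turnstile": "sitekey",
    "geetest": "gt",
}


def sitekey_field(captcha_type: str, sitekey: str) -> dict:
    """Return {parameter_name: sitekey} for the given captcha type."""
    if captcha_type not in PARAM_BY_TYPE:
        raise ValueError(f"Unsupported captcha type: {captcha_type!r}")
    return {PARAM_BY_TYPE[captcha_type]: sitekey}


print(sitekey_field("turnstile", "0x4-example-key"))
```

Keeping the table in one place means a new captcha type only needs one new entry, and an unknown type fails loudly instead of producing a request with the wrong field.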


Where to usually find the sitekey

Depending on the implementation, the sitekey can appear in different places:

  • data-sitekey attribute: <div class="g-recaptcha" data-sitekey="6Ld..."> (reCAPTCHA v2/v3)
  • iframe parameter: src=".../anchor?k=6Ld..." (reCAPTCHA, Turnstile)
  • JavaScript object: window.grecaptcha_config = {sitekey: "6Ld..."} (reCAPTCHA enterprise)
  • Inline script: var captchaSitekey = "6Ld..."; (custom implementations)
  • Widget container: <div class="cf-turnstile" data-sitekey="0x4..."> (Cloudflare Turnstile)

Knowing these sources helps you build a robust extraction algorithm before sending to 2Captcha.
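
The inline-script and JavaScript-object sources from the list above can often be read straight from the raw HTML, without a browser. The sketch below is one possible approach, not a definitive one: the `SITEKEY_RE` pattern and the `extract_sitekey_inline` name are assumptions, and real pages may name their variables differently.

```python
import re
from typing import Optional

# Assumed pattern: a name ending in "sitekey" / "site_key" / "site-key",
# followed by ":" or "=", followed by a quoted value of 20+ key characters.
# It also matches data-sitekey="..." attributes as a side effect.
SITEKEY_RE = re.compile(
    r"""["']?site[_-]?key["']?\s*[:=]\s*["']([A-Za-z0-9_-]{20,})["']""",
    re.IGNORECASE,
)


def extract_sitekey_inline(html: str) -> Optional[str]:
    """Return the first sitekey-like value found in raw HTML or script text."""
    match = SITEKEY_RE.search(html)
    return match.group(1) if match else None


sample = '<script>var captchaSitekey = "6Ld' + "A" * 37 + '";</script>'
print(extract_sitekey_inline(sample))
```

A regex like this is a fallback for when no DOM element carries the key; it can produce false positives, so the result is worth validating before use.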


Method 1: Extracting from HTML attributes

The most common case is a sitekey in the data-sitekey attribute.

Python + BeautifulSoup:

from bs4 import BeautifulSoup


def extract_sitekey_bs4(html: str) -> str | None:
    soup = BeautifulSoup(html, 'html.parser')

    # reCAPTCHA (.g-recaptcha) and Turnstile (.cf-turnstile) widgets both
    # expose the key through data-sitekey, so a single lookup covers both.
    widget = soup.find(attrs={'data-sitekey': True})
    if widget:
        return widget['data-sitekey'].strip() or None

    return None
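
If BeautifulSoup is not installed, the same attribute lookup can be sketched with the standard library's `html.parser`. The `_SitekeyParser` class and `extract_sitekey_stdlib` function are hypothetical names for this illustration.

```python
from html.parser import HTMLParser
from typing import Optional


class _SitekeyParser(HTMLParser):
    """Records the first non-empty data-sitekey attribute it encounters."""

    def __init__(self) -> None:
        super().__init__()
        self.sitekey: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.sitekey is not None:
            return  # keep the first key only
        for name, value in attrs:
            # HTMLParser lowercases attribute names; value may be None
            if name == "data-sitekey" and value:
                self.sitekey = value.strip() or None
                break


def extract_sitekey_stdlib(html: str) -> Optional[str]:
    parser = _SitekeyParser()
    parser.feed(html)
    return parser.sitekey


print(extract_sitekey_stdlib('<div class="g-recaptcha" data-sitekey="6Ld-example"></div>'))
```

Like the BeautifulSoup version, this only sees markup that is present in the HTML you downloaded; widgets injected by JavaScript need a browser-based method.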

Method 2: Searching with Selenium or Playwright

When the captcha loads dynamically, browser automation works better.

Selenium example:

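from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def extract_sitekey_selenium(driver, timeout=10) -> str | None:
    try:
        # Wait until an element with data-sitekey is present in the DOM
        widget = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, '[data-sitekey]'))
        )
        sitekey = widget.get_attribute('data-sitekey')
        if sitekey:
            return sitekey
    except TimeoutException:
        pass  # no widget attribute appeared in time; try the iframe URL

    # extract_sitekey_from_iframe is defined in Method 3
    return extract_sitekey_from_iframe(driver)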

Playwright example:

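from playwright.sync_api import TimeoutError as PlaywrightTimeoutError


def extract_sitekey_playwright(page) -> str | None:
    try:
        # state='attached' also matches widgets that are present but hidden
        page.wait_for_selector('[data-sitekey]', state='attached', timeout=10000)
    except PlaywrightTimeoutError:
        return None  # no widget appeared within 10 seconds

    return page.evaluate('''() => {
        const widget = document.querySelector('[data-sitekey]');
        return widget ? widget.dataset.sitekey : null;
    }''')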

Method 3: Extracting from iframe URL

Some implementations pass the sitekey in URL parameters inside an iframe.

Parsing the k parameter:

from urllib.parse import urlparse, parse_qs

from selenium.webdriver.common.by import By


def extract_sitekey_from_iframe(driver) -> str | None:
    for iframe in driver.find_elements(By.TAG_NAME, 'iframe'):
        src = iframe.get_attribute('src') or ''
        if 'recaptcha' in src or 'turnstile' in src:
            params = parse_qs(urlparse(src).query)
            # reCAPTCHA passes the key as "k"; also accept "sitekey"
            for name in ('k', 'sitekey'):
                if name in params:
                    return params[name][0]

    return None
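
The URL parsing itself does not need a browser, so it can be factored into a pure function and checked against sample URLs. This is a sketch under that assumption; `sitekey_from_src` is a hypothetical helper name, and the sample URL only imitates the shape of a reCAPTCHA anchor URL.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def sitekey_from_src(src: str) -> Optional[str]:
    """Return the `k` or `sitekey` query parameter of an iframe URL, if any."""
    params = parse_qs(urlparse(src).query)
    for name in ("k", "sitekey"):
        values = params.get(name)
        if values:
            return values[0]
    return None


print(sitekey_from_src(
    "https://www.google.com/recaptcha/api2/anchor?ar=1&k=6Ld-example&co=abc"
))
```

Separating the parsing from the Selenium call makes the logic easy to unit-test and reuse with Playwright or plain HTTP responses.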

Method 4: Extracting via JavaScript objects

On complex sites, captcha configuration lives in global objects.

Extraction using execute_script:

from selenium.common.exceptions import WebDriverException


def extract_sitekey_js(driver) -> str | None:
    # Candidate expressions, tried in order
    scripts = [
        "return window.grecaptcha_config?.sitekey",
        "return window.captchaConfig?.sitekey",
        "return window.captchaConfig?.key",
        "return document.querySelector('[data-sitekey]')?.dataset.sitekey"
    ]

    for script in scripts:
        try:
            result = driver.execute_script(script)
        except WebDriverException:
            continue  # script failed on this page; try the next one
        if isinstance(result, str) and len(result) > 10:
            return result
    return None

Working with dynamically loaded captchas

On SPA sites (React, Vue, Angular), the captcha widget appears after a click, form submission, or async request.

Waiting strategy:

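from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def wait_for_captcha(driver, timeout=30) -> bool:
    selectors = [
        '[data-sitekey]',
        '.g-recaptcha',
        '.cf-turnstile',
        'iframe[src*="recaptcha"]',
        'iframe[src*="challenges.cloudflare.com"]'
    ]

    # Wait on one combined selector. Waiting on each selector in turn could
    # block for up to len(selectors) * timeout seconds.
    combined = ', '.join(selectors)
    try:
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, combined))
        )
        return True
    except TimeoutException:
        return False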

Full extraction example for 2Captcha:

def get_captcha_sitekey_for_2captcha(driver, page_url: str) -> dict:
    result = {
        'sitekey': None,
        'type': None,
        'api_param': None,
        'pageurl': page_url,
        'error': None
    }

    if not wait_for_captcha(driver):
        result['error'] = 'CAPTCHA not found'
        return result

    # Try each extraction method in turn until one returns a value
    sitekey = (
        extract_sitekey_selenium(driver) or
        extract_sitekey_from_iframe(driver) or
        extract_sitekey_js(driver)
    )

    if not sitekey:
        result['error'] = 'Sitekey not extracted'
        return result

    result['sitekey'] = sitekey

    # Turnstile is checked first because its markers are more specific
    # than the broad "recaptcha" substring.
    page_source = driver.page_source.lower()
    if 'cf-turnstile' in page_source or 'challenges.cloudflare.com' in page_source:
        result['type'] = 'turnstile'
        result['api_param'] = 'sitekey'
    elif 'recaptcha' in page_source:
        result['type'] = 'userrecaptcha'
        result['api_param'] = 'googlekey'
    elif 'geetest' in page_source:
        # The gt parameter belongs to the geetest method, not geetest_v4
        result['type'] = 'geetest'
        result['api_param'] = 'gt'
    else:
        result['type'] = 'unknown'

    return result

Common issues and fixes

1. Captcha hasn't loaded yet

Symptom: element not found, search returns None.
Fix: use explicit waits (WebDriverWait) and check multiple selectors.

2. Captcha inside Shadow DOM

Symptom: widget visible in DevTools but not found via standard search.
Fix: use JavaScript to cross shadow boundaries:

def find_in_shadow(driver, selector: str):
    # The selector is passed as a script argument instead of being
    # interpolated, so quotes inside it cannot break the JavaScript.
    # Only open shadow roots are reachable this way.
    return driver.execute_script('''
        function searchShadow(root, sel) {
            const el = root.querySelector(sel);
            if (el) return el;
            for (const node of root.querySelectorAll('*')) {
                if (node.shadowRoot) {
                    const found = searchShadow(node.shadowRoot, sel);
                    if (found) return found;
                }
            }
            return null;
        }
        return searchShadow(document, arguments[0]);
    ''', selector)

3. Multiple iframes on the page

Symptom: you extract a sitekey from the wrong captcha.
Fix: filter iframes by domain and parameters:

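from urllib.parse import urlparse


def is_captcha_iframe(src: str) -> bool:
    parsed = urlparse(src)
    host = parsed.hostname or ''
    # Match on the parsed host so a captcha domain that merely appears in
    # another iframe's query string is not mistaken for a widget.
    if host == 'challenges.cloudflare.com':
        return True
    if host in ('www.google.com', 'google.com', 'www.recaptcha.net', 'recaptcha.net'):
        return parsed.path.startswith('/recaptcha/')
    return False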

4. Sitekey with extra characters

Symptom: 2Captcha returns ERROR_INVALID_SITEKEY.
Fix: clean the value before use:

import re


def clean_sitekey(key: str) -> str | None:
   if not key:
       return None
   cleaned = re.sub(r'[^A-Za-z0-9\-_]', '', key)
   return cleaned if len(cleaned) >= 20 else None

Checklist before sending to 2Captcha

Before creating a task, verify:

  • The sitekey is extracted and 20-50 characters long
  • No spaces, line breaks, or HTML entities in the sitekey
  • Captcha type is correctly identified (reCAPTCHA v2/v3, Turnstile, etc.)
  • Page URL is complete with protocol (https://)
  • For reCAPTCHA use the googlekey parameter, for Turnstile use sitekey
  • Captcha fully loaded before extracting parameters
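
The value checks in this checklist can be automated. The sketch below covers only the sitekey and URL items; `preflight_errors` is a hypothetical helper name, and the 20-50 character range is taken from the checklist rather than from any official specification.

```python
import re
from urllib.parse import urlparse


def preflight_errors(sitekey: str, page_url: str) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if not 20 <= len(sitekey) <= 50:
        errors.append("sitekey length is outside 20-50 characters")
    if re.search(r"\s", sitekey):
        errors.append("sitekey contains spaces or line breaks")
    if re.search(r"&#?\w+;", sitekey):
        errors.append("sitekey contains an HTML entity")
    parsed = urlparse(page_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("page URL must include the protocol and host")
    return errors


print(preflight_errors("6Ld" + "A" * 37, "https://example.com/login"))
```

Returning every problem at once, rather than stopping at the first, makes extraction failures easier to diagnose from a single log line.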

Summary

Automatic sitekey extraction is a key step when integrating with 2Captcha. A solid implementation speeds up automation, reduces errors from manual input, and helps you adapt to page structure changes.

Main sitekey sources: the data-sitekey attribute in the HTML widget, the k parameter in iframe URLs, global JavaScript objects, and embedded JSON configs.

Recommendations for 2Captcha: use explicit waits for dynamic content, implement multiple extraction methods with fallbacks, validate the sitekey before sending to the API, and log extraction steps for debugging.
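
The fallback and logging recommendations can be combined in one small driver-agnostic helper. This is a sketch, not a prescribed design: `first_sitekey` and the `"sitekey"` logger name are assumptions made for the example.

```python
import logging
from typing import Callable, Iterable, Optional

logger = logging.getLogger("sitekey")


def first_sitekey(extractors: Iterable[Callable[[], Optional[str]]]) -> Optional[str]:
    """Run zero-argument extractors in order; return the first non-empty result."""
    for extractor in extractors:
        name = getattr(extractor, "__name__", repr(extractor))
        try:
            value = extractor()
        except Exception as exc:  # one failing method must not stop the chain
            logger.warning("%s failed: %s", name, exc)
            continue
        if value:
            logger.info("%s returned a sitekey", name)
            return value
        logger.debug("%s found nothing", name)
    return None


print(first_sitekey([lambda: None, lambda: "6Ld-example"]))
```

With Selenium, each method from this article can be wrapped in a lambda, for example `first_sitekey([lambda: extract_sitekey_selenium(driver), lambda: extract_sitekey_js(driver)])`.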

Once you've successfully extracted the sitekey and identified the captcha type, build your task for the 2Captcha API with the correct parameters (method, googlekey or sitekey, pageurl, proxy). This ensures stable, automated captcha solving.
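
As a rough illustration of that last step, the dictionary returned by get_captcha_sitekey_for_2captcha can be turned into request fields. This sketch only builds the payload and sends nothing. The `build_task_payload` name is hypothetical, and the `key`, `json`, and `proxytype` field names are assumptions about the 2Captcha request format that should be checked against the official API documentation.

```python
from typing import Optional


def build_task_payload(api_key: str, extraction: dict, page_url: str,
                       proxy: Optional[str] = None,
                       proxy_type: str = "HTTP") -> dict:
    """Assemble request fields from an extraction result (no network call)."""
    if not extraction.get("sitekey") or not extraction.get("api_param"):
        raise ValueError("extraction result has no usable sitekey")
    payload = {
        "key": api_key,                                   # assumed field name
        "method": extraction["type"],
        extraction["api_param"]: extraction["sitekey"],   # googlekey / sitekey / gt
        "pageurl": page_url,
        "json": 1,                                        # assumed field name
    }
    if proxy:
        payload["proxy"] = proxy
        payload["proxytype"] = proxy_type                 # assumed field name
    return payload


example = {"sitekey": "0x4-example", "type": "turnstile", "api_param": "sitekey"}
print(build_task_payload("YOUR_API_KEY", example, "https://example.com/login"))
```

An extraction result with type 'unknown' has no api_param, so the helper raises instead of producing a request the API would reject.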


Helpful resources

2Captcha documentation and API

Official repositories


2Captcha support

If you run into a problem you can't solve:

  1. Check the FAQ and API status: https://2Captcha/support/faq
  2. Make sure your account balance is positive
  3. Open a support ticket and include the captcha page URL, full API request and response (without your API key), and a DevTools screenshot of the captcha element