How to bypass Audio captcha
Technical engineer
Introduction
Audio captchas are often used on websites as an alternative to visual protection or to ensure accessibility. From a parsing perspective, this is one of the simplest types of captchas — recognition is fully automated and performed by a neural network.
Let's look at how to properly prepare an audio file, configure JSON requests via API v2, and avoid validation errors so your balance isn't wasted.
Step 1. Preparing Files and Parameters
How to Find and Download the Captcha Audio File
Before sending a recognition task, you need to get the audio file from the site. There are three main ways:
Method 1: Via the Network Tab (Recommended)
- Open DevTools by pressing F12 (or Cmd+Option+I on Mac).
- Go to the Network tab.
- Filter requests by the Media type, or type mp3 or audio into the filter box.
- Click the play button on the captcha.
- A request to the audio file will appear in the list — click on it.
- Copy the URL from the Request URL field or right-click and select Open in new tab to download.
Method 2: Via Elements (Inspect Element)
- Right-click the captcha audio play button.
- Select Inspect.
- In DevTools, find the audio tag with the src attribute.
- Copy the value of the src attribute — this is the audio file URL.
- For reCAPTCHA:
  - Find the captcha iframe.
  - Inside the iframe, find the element with id audio-source.
  - Copy the value of the src attribute.
Method 3: Programmatically via Selenium
If you are automating the process, you can extract the audio file URL programmatically:
python
import urllib.request

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://example.com")
wait = WebDriverWait(driver, 10)

# Switch into the reCAPTCHA checkbox iframe and click the checkbox
wait.until(
    EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[title='reCAPTCHA']"))
)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "span#recaptcha-anchor"))).click()

# Switch back to the main document, then into the challenge iframe.
# The checkbox iframe also has "recaptcha" in its src, so match the challenge
# frame by "bframe" in its src (an assumption about the current reCAPTCHA markup).
driver.switch_to.default_content()
wait.until(
    EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[src*='bframe']"))
)

# Click the audio button
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#recaptcha-audio-button"))).click()

# Wait for the audio element and read its URL
audio_element = wait.until(EC.presence_of_element_located((By.ID, "audio-source")))
audio_src = audio_element.get_attribute("src")
print(f"Audio URL: {audio_src}")

# Download the file
urllib.request.urlretrieve(audio_src, "captcha.mp3")
After downloading the file, proceed to the task parameters. To solve an audio captcha, only two key parameters are required:
- body: the audio file itself, Base64-encoded. It must be strictly in MP3 format and no larger than 1 MB.
- lang: the language of the audio recording. Supported codes: en (English), fr (French), de (German), el (Greek), pt (Portuguese), ru (Russian). If not specified, en is used by default.
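As a minimal sketch of how these two parameters could be assembled and checked before sending, the helper below builds the task object and rejects unsupported language codes early. The function name build_audio_task is hypothetical; the task type AudioTask is taken from the request example in Step 3.

```python
SUPPORTED_LANGS = {"en", "fr", "de", "el", "pt", "ru"}  # codes listed above


def build_audio_task(audio_base64, lang="en"):
    """Return the task object for an audio captcha (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(
            f"Unsupported lang {lang!r}; expected one of {sorted(SUPPORTED_LANGS)}"
        )
    if not audio_base64:
        raise ValueError("body must be a non-empty Base64 string")
    return {"type": "AudioTask", "body": audio_base64, "lang": lang}
```

Failing locally on a bad lang value is cheaper than waiting for the API to return a validation error.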
Step 2. Audio File Requirements
The most common cause of errors when working with audio captchas is an incorrect file format or size. The neural network simply won't be able to process it.
Preparation rules:
- Format must be strictly MP3. If you have WAV, OGG, or other formats, be sure to convert them (e.g., using ffmpeg or online converters).
- Size must be no more than 1 MB. If the file is heavier, trim it or lower the bitrate.
- The file needs to be encoded in Base64. The string must be clean, without BOM and extra line breaks.
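The format and size rules above can be checked locally before spending a request. The sketch below is only a heuristic: it treats "1 MB" conservatively as 1,000,000 bytes (an assumption), and it recognizes MP3 by an ID3 tag or an MPEG frame sync at the start of the file. All function names are hypothetical. The ffmpeg helper only builds a command line and assumes an ffmpeg build that includes the libmp3lame encoder.

```python
import os

MAX_AUDIO_BYTES = 1_000_000  # assumption: "1 MB" read conservatively as 10^6 bytes


def looks_like_mp3(header: bytes) -> bool:
    """Heuristic: an ID3v2 tag or an MPEG audio frame sync at the start."""
    if header[:3] == b"ID3":
        return True
    # Frame sync is 11 set bits: 0xFF, then a byte whose top 3 bits are set
    return len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0


def check_audio_file(path: str) -> list:
    """Return a list of problems found; an empty list means the file passed."""
    problems = []
    size = os.path.getsize(path)
    if size == 0:
        problems.append("file is empty")
    if size > MAX_AUDIO_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_AUDIO_BYTES}")
    with open(path, "rb") as f:
        if not looks_like_mp3(f.read(4)):
            problems.append("file does not start with an MP3 header")
    return problems


def ffmpeg_to_mp3_command(src: str, dst: str, bitrate: str = "64k") -> list:
    """Build (not run) an ffmpeg command that re-encodes src to MP3."""
    return ["ffmpeg", "-y", "-i", src, "-codec:a", "libmp3lame", "-b:a", bitrate, dst]
```

The returned command list can be passed to subprocess.run when a WAV or OGG file needs converting, or when an oversized MP3 needs a lower bitrate.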
Python encoding example:
python
import base64

with open("audio.mp3", "rb") as file:
    audio_base64 = base64.b64encode(file.read()).decode("utf-8")
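If the Base64 string comes from somewhere other than the snippet above (a file, another tool, a copy-paste), it may carry a BOM or line breaks. A small check along the following lines could catch that before the request is sent; the function name is_clean_base64 is hypothetical.

```python
import base64
import binascii


def is_clean_base64(value: str) -> bool:
    """True if value is non-empty Base64 with no BOM and no whitespace."""
    if not value or value.startswith("\ufeff"):
        return False
    if any(ch.isspace() for ch in value):
        return False
    try:
        # validate=True rejects characters outside the Base64 alphabet
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

Note that base64.b64encode never inserts line breaks, whereas base64.encodebytes does, so the former is the safer choice for building the body field.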
Step 3. Sending the Task (API v2)
The examples below use the JSON-based API v2, which returns structured errors and supports all the necessary parameters.
Creating a Task
Endpoint: POST https://api.2captcha.com/createTask
json
{
  "clientKey": "YOUR_API_KEY",
  "task": {
    "type": "AudioTask",
    "body": "BASE64_ENCODED_MP3_AUDIO",
    "lang": "en"
  }
}
The response comes immediately. Save the taskId for subsequent polling.
json
{
  "errorId": 0,
  "taskId": "123456789"
}
Getting the Result
Poll the server every 3–5 seconds. The neural network usually solves the audio captcha in 3–15 seconds, but during high load, the time may increase.
Endpoint: POST https://api.2captcha.com/getTaskResult
json
{
  "clientKey": "YOUR_API_KEY",
  "taskId": "123456789"
}
Successful response:
json
{
  "errorId": 0,
  "status": "ready",
  "solution": {
    "text": "hello world"
  },
  "cost": "0.0005",
  "ip": "1.2.3.4",
  "createTime": 1692808229,
  "endTime": 1692808326,
  "solveCount": 0
}
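A polling loop has to distinguish three cases: a finished solution, an error, and a task that is still running. The sketch below separates that decision from the network call so it can be tested without the API. It assumes that errors carry an errorCode field and that an unfinished task reports a status other than "ready" (for example "processing"); neither is shown in the responses above. The names interpret_result and poll are hypothetical.

```python
import time


def interpret_result(response: dict):
    """Return the recognized text, or None while the task is still running."""
    if response.get("errorId", 0) != 0:
        # The errorCode field name is an assumption, not shown in the text above
        code = response.get("errorCode", "UNKNOWN_ERROR")
        raise RuntimeError(f"getTaskResult failed: {code}")
    if response.get("status") == "ready":
        return response["solution"]["text"]
    return None  # any other status is treated as "still processing"


def poll(fetch, interval=5.0, timeout=120.0, sleep=time.sleep):
    """Call fetch() every `interval` seconds until a solution or the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        sleep(interval)
        text = interpret_result(fetch())
        if text is not None:
            return text
    raise TimeoutError(f"no solution within {timeout} seconds")
```

Here fetch would be a function that posts to getTaskResult and returns the decoded JSON; passing it in keeps the timeout and error logic independent of the HTTP client.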
Step 4. Sending the Result to the Site
An audio captcha is usually implemented as a simple form: an audio play button, a text input field, and a submit button. Unlike complex captchas (reCAPTCHA, FunCaptcha), you don't need to pass tokens here: entering the recognized text into the field and submitting the form is enough.
Finding the Input Field
Open DevTools (F12) and find the input where the user manually enters the text. Usually, it's a standard text field named something like captcha, code, audio_code, or captcha_response.
Typical markup example:
html
<form action="/check-captcha" method="POST">
  <audio src="/captcha/audio/abc123.mp3"></audio>
  <input type="text" name="captcha_code" placeholder="Enter text">
  <button type="submit">Verify</button>
</form>
Option 1: Substitution via JavaScript
Find the input, substitute the recognized text, then submit the form.
javascript
const audioInput = document.querySelector('input[name="captcha_code"]');
if (audioInput) {
  audioInput.value = "API_SOLUTION";
  audioInput.closest('form').submit();
}
If the field name is unknown, you can search by type and context:
javascript
const audioInput = document.querySelector('form input[type="text"]');
if (audioInput) {
  audioInput.value = "API_SOLUTION";
  audioInput.form.submit();
}
Option 2: Direct Form Submission via Python
If you parse the site via requests, you can submit the form directly, bypassing the browser.
python
import requests

session = requests.Session()

# First load the page to get cookies and CSRF token (if any)
session.get("https://target-site.com/page-with-captcha")

# Send the solution
payload = {
    "captcha_code": "API_SOLUTION"
}
response = session.post("https://target-site.com/check-captcha", data=payload)
print(response.status_code)
How to Find Exact Parameters
- Open the site with the audio captcha in a browser.
- Press F12, go to the Network tab.
- Manually listen to the audio, enter any text, and click "Verify".
- Find the POST request that went to the server (usually /check-captcha, /verify, or similar).
- In the Payload (or Form Data) tab, see what fields are sent. Copy the captcha field name and the endpoint URL.
Use this data in your script.
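The same discovery can be sketched programmatically with Python's standard html.parser module: collect the form's action URL and every named input, including hidden ones such as a CSRF token. The class name FormFieldCollector is hypothetical, and the sketch only looks at the first form on the page, which may not be the captcha form on a real site.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect the action URL and named inputs of the first form on a page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.fields = {}  # input name -> default value ("" if none)
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action")
            self.method = (attrs.get("method") or "GET").upper()
        elif tag == "input" and self._in_form and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True  # ignore any later forms


def extract_form(html: str) -> FormFieldCollector:
    """Parse html and return the populated collector."""
    collector = FormFieldCollector()
    collector.feed(html)
    return collector
```

Passing the page source to extract_form gives the endpoint in .action and a ready-made payload in .fields; filling in the captcha field and posting the whole dictionary keeps any hidden values intact.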
Step 5. Possible Errors and Solutions
- ERROR_WRONG_USER_KEY or ERROR_KEY_DOES_NOT_EXIST
  Cause: Invalid API key.
  Solution: Check the key in your 2captcha dashboard. Make sure you are using the key from your 2captcha.com account.
- ERROR_ZERO_BALANCE
  Cause: Insufficient funds in the account.
  Solution: Top up the balance. Audio captcha is very cheap, but with mass requests, the consumption grows quickly.
- ERROR_CAPTCHA_UNSOLVABLE
  Cause: The neural network could not recognize the text. Usually due to heavy noise, overlapping voices, distortions, or a non-standard accent.
  Solution: Refresh the captcha on the site to get a new, cleaner audio file. If the error occurs frequently, check that the lang parameter is correct.
- ERROR_MALFORMED_REQUEST or other validation errors
  Cause: Invalid request format or an invalid file.
  Solution: Make sure the file is strictly in MP3 format, its size does not exceed 1 MB, and the body field is a correct Base64 string without extra spaces. The lang parameter must contain only supported codes: en, fr, de, el, pt, ru.
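In code, these errors fall into three groups: retry with a new audio file, fix the request, or fix the account. The sketch below maps a response to one of those actions. It assumes the error name arrives in an errorCode field of the JSON response, which is not shown in the text above; the function name and the action labels are hypothetical.

```python
RETRY_WITH_NEW_AUDIO = {"ERROR_CAPTCHA_UNSOLVABLE"}
FIX_REQUEST = {"ERROR_MALFORMED_REQUEST"}
FIX_ACCOUNT = {
    "ERROR_WRONG_USER_KEY",
    "ERROR_KEY_DOES_NOT_EXIST",
    "ERROR_ZERO_BALANCE",
}


def classify_error(response: dict) -> str:
    """Map an API response to 'ok', 'retry', 'fix_request', 'fix_account' or 'unknown'."""
    if response.get("errorId", 0) == 0:
        return "ok"
    code = response.get("errorCode", "")  # field name is an assumption
    if code in RETRY_WITH_NEW_AUDIO:
        return "retry"
    if code in FIX_REQUEST:
        return "fix_request"
    if code in FIX_ACCOUNT:
        return "fix_account"
    return "unknown"
```

Only the "retry" group is worth repeating automatically; the other groups will fail the same way until the file, request, or account is corrected.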
Checklist
- File is strictly in MP3 format
- File size does not exceed 1 MB
- File is correctly encoded in Base64
- lang parameter is specified correctly and matches the site language
- API key is valid, balance is topped up
- Validation and UNSOLVABLE errors are handled
- getTaskResult polling interval is at least 3 seconds
Code
Ready-made implementation examples for popular programming languages are available in the official repository:
https://github.com/2captcha
API Documentation:
https://2captcha.com/api-docs
Basic Python Example:
python
import base64
import time

import requests

API_KEY = "YOUR_API_KEY"
CREATE_TASK_URL = "https://api.2captcha.com/createTask"
GET_RESULT_URL = "https://api.2captcha.com/getTaskResult"


def solve_audio_captcha(audio_file_path, lang="en", max_wait=120):
    # Read the MP3 file and encode it as a Base64 string
    with open(audio_file_path, "rb") as file:
        audio_base64 = base64.b64encode(file.read()).decode("utf-8")

    # Create the task
    create_payload = {
        "clientKey": API_KEY,
        "task": {
            "type": "AudioTask",
            "body": audio_base64,
            "lang": lang
        }
    }
    response = requests.post(CREATE_TASK_URL, json=create_payload, timeout=30)
    result = response.json()
    if result.get("errorId") != 0:
        raise Exception(f"Task creation error: {result}")
    task_id = result["taskId"]

    # Poll every 5 seconds until the result is ready or max_wait seconds pass
    # (max_wait is an arbitrary safety limit so the loop cannot run forever)
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        time.sleep(5)
        result_payload = {
            "clientKey": API_KEY,
            "taskId": task_id
        }
        response = requests.post(GET_RESULT_URL, json=result_payload, timeout=30)
        result = response.json()
        if result.get("errorId") != 0:
            raise Exception(f"Solution error: {result}")
        if result.get("status") == "ready":
            return result["solution"]["text"]
    raise TimeoutError(f"Task {task_id} was not solved within {max_wait} seconds")


# Usage
if __name__ == "__main__":
    text = solve_audio_captcha("audio.mp3", lang="en")
    print(f"Recognized text: {text}")
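The basic example raises an exception on any error. For ERROR_CAPTCHA_UNSOLVABLE, a retry with a freshly downloaded audio file is usually the better response. The wrapper below is one possible sketch: get_fresh_audio and solve are caller-supplied functions (hypothetical here), and solve is expected to raise UnsolvableCaptcha, a hypothetical exception type, when the API reports that error.

```python
class UnsolvableCaptcha(Exception):
    """Raised by the caller's solve function on ERROR_CAPTCHA_UNSOLVABLE."""


def solve_with_retries(get_fresh_audio, solve, max_attempts=3):
    """Fetch a new audio file and try to solve it, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        audio_path = get_fresh_audio()  # e.g. refresh the captcha and download again
        try:
            return solve(audio_path)
        except UnsolvableCaptcha as exc:
            last_error = exc  # remember the failure and try a new recording
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Other errors, such as an invalid key or a malformed request, are deliberately not caught, since repeating the request would not change the outcome.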
Conclusion
Bypassing audio captcha via the 2captcha API is a process where a neural network automatically recognizes the audio recording and returns the text. This makes the method fast and reliable for most recordings, although heavy noise or unusual accents can still produce ERROR_CAPTCHA_UNSOLVABLE.
Key principles of stable operation:
- Strict format compliance: the audio file must be exclusively in MP3 format and not exceed 1 MB. Any deviation will lead to a validation error and balance deduction without a result.
- Correct language specification: set the lang parameter explicitly rather than relying on the en default. If the site captcha is in Russian and you pass en, the neural network may misrecognize the pronunciation, and accuracy will drop.
- Error handling: add retry logic to your code for ERROR_CAPTCHA_UNSOLVABLE. Just refresh the captcha on the site and send a new file.
- Polling interval optimization: the neural network solves the task in 3–15 seconds, but you shouldn't poll the server more than once every 3 seconds to avoid hitting API request limits.
Audio captcha remains one of the simplest types of captchas for programmatic bypass. With proper file preparation and request configuration, you will get a reliable tool for bypassing protection.
Recommended implementation order:
- Set up audio conversion to Base64 and file size checking.
- Implement basic task sending and result polling.
- Add text substitution into the required field on the target site.
- Implement handling of specific errors (invalid format, UNSOLVABLE).
This approach will ensure maximum reliability and minimal costs when working with audio captchas.