How long does cf_clearance last?

It depends on the Cloudflare configuration of the specific site. Typical lifetimes range from a few minutes to several hours. For long scraping jobs the cookie may expire mid-session, requiring you to re-solve the challenge to continue.

Can I reuse a cf_clearance cookie across multiple sessions?

No. The cookie is tied to a specific IP address and User Agent. If either changes, or if the cookie expires, Cloudflare will reject it. You need to obtain a fresh cookie for each new session or when the IP changes.

Why does CF-Clearance-Scraper sometimes fail?

Several reasons. Cloudflare runs JavaScript-based challenges that vary in complexity. The tool may not successfully execute all challenge types. Cloudflare also updates its detection techniques regularly, and open-source tools that lag behind those updates see lower success rates. Free proxies and shared IP ranges with poor reputation also reduce the chance of passing the challenge.

Does Spidra work on all Cloudflare configurations?

Spidra handles Cloudflare, DataDome, PerimeterX, and other common anti-bot systems automatically. Because it is a managed service it stays current with anti-bot updates without you doing anything. Proxy usage for bypass is billed against bandwidth rather than credits, so there is no multiplier cost when protection is encountered.


June 10, 2026 · 9 min read

How to scrape cf_clearance cookies from Cloudflare-protected websites

Joel Olawanle


Cloudflare does not just check whether you have a valid cookie. It checks your browser fingerprint, JavaScript execution ability, behavioral patterns, IP reputation, and a range of other signals before deciding whether to grant access. If your request passes all of those checks, Cloudflare issues a cf_clearance cookie that acts as a session pass for subsequent requests to that site.

The challenge for scrapers is that standard HTTP clients like Python's requests library cannot solve Cloudflare's initial challenges. No challenge solved means no cf_clearance cookie, which means no access.

One approach is to solve the challenge once using a tool that can handle it, extract the cf_clearance cookie, and then use that cookie in your regular scraping requests within the same session.

In this tutorial you will learn how cf_clearance works, how to extract it using CF-Clearance-Scraper, and how to use it in a requests session to bypass Cloudflare.

Understanding cf_clearance and how Cloudflare issues it

When a request reaches a Cloudflare-protected site, Cloudflare runs a series of checks before deciding whether to let it through. These include JavaScript challenge solving, browser fingerprint analysis, IP reputation checks, behavioral signals, and network traffic patterns.

A request that passes all of these checks receives the cf_clearance cookie. This cookie is then required on all subsequent requests to that site within the same session.

Two things make cf_clearance strict to work with:

It is bound to an IP address. The cookie is tied to the IP that solved the original challenge. If the IP changes mid-session, Cloudflare invalidates the cookie immediately.
It is bound to a User Agent. The same User Agent string used during the challenge must be sent with every subsequent request. A mismatch triggers a new challenge or a block.

This means you cannot just extract a cookie once and reuse it freely. You need to maintain the exact same IP and User Agent throughout the entire session that cookie covers.
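One way to keep the cookie, IP, and User Agent from drifting apart is to store them in a single immutable object and derive every request setting from it. The following is a minimal sketch under that assumption; `ClearanceContext` and its method names are hypothetical and are not part of Cloudflare or of any tool used in this tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClearanceContext:
    """Hypothetical container: the cookie plus the proxy (exit IP) and User Agent it is bound to."""

    cf_clearance: str
    proxy: str
    user_agent: str

    def headers(self) -> dict:
        # Send the same User Agent that solved the challenge.
        return {"User-Agent": self.user_agent}

    def cookies(self) -> dict:
        return {"cf_clearance": self.cf_clearance}

    def proxies(self) -> dict:
        # Route both schemes through the proxy that solved the challenge.
        return {"http": self.proxy, "https": self.proxy}


# Placeholder values; substitute the ones from your own challenge run.
ctx = ClearanceContext(
    cf_clearance="<COOKIE_VALUE>",
    proxy="http://<PROXY_HOST>:<PORT>",
    user_agent="<USER_AGENT>",
)
print(ctx.headers())
```

Because the dataclass is frozen, code that tries to swap the proxy or User Agent mid-session raises an error instead of silently invalidating the cookie.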

How to scrape and use cf_clearance cookies

You will use CF-Clearance-Scraper, a command-line tool that runs a headless Chrome instance to solve Cloudflare challenges and extract the resulting cf_clearance cookie. Then you will use that cookie in a requests session to access the protected content.

Step 1: Requirements and installation

CF-Clearance-Scraper requires Python 3.10 or later and Chrome installed on your machine. Clone the repository and install its dependencies:

git clone https://github.com/Xewdy444/CF-Clearance-Scraper
cd CF-Clearance-Scraper
pip3 install -r requirements.txt

Step 2: Understanding the parameters

CF-Clearance-Scraper runs from the command line by executing main.py with the target URL and optional configuration parameters:

Parameter	Description
`URL`	The Cloudflare-protected target URL (required)
`-f`	Output JSON file to write the scraped cookies
`-t`	Request timeout in seconds
`-p`	Proxy URL to use when solving the challenge
`-ua`	User Agent string for the request
`--disable-http2`	Disables HTTP/2 protocol
`--disable-http3`	Disables HTTP/3 protocol
`-ac`	Save all cookies in addition to cf_clearance

The tool works best when you provide a User Agent and a proxy. The User Agent you pass here is the one you must use in every subsequent request that uses this cookie.

The basic command structure:

python main.py -p <PROXY_URL> -t <TIMEOUT> -ua "<USER_AGENT>" -f cookies.json <TARGET_URL>
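If you plan to call the tool from Python, the same command can be assembled as an argument list. This is a sketch only: `build_command` is a hypothetical helper, it uses just the flags listed in the table above, and it assumes you run it from inside the cloned repository so that `main.py` resolves.

```python
import sys


def build_command(url, output_file="cookies.json", proxy=None, timeout=None, user_agent=None):
    """Assemble the argument list for main.py from the flags documented above."""
    # sys.executable reuses the interpreter that is running this script.
    command = [sys.executable, "main.py"]
    if proxy:
        command += ["-p", proxy]
    if timeout is not None:
        command += ["-t", str(timeout)]
    if user_agent:
        command += ["-ua", user_agent]
    # The target URL is the required positional argument and goes last.
    command += ["-f", output_file, url]
    return command


print(build_command("<TARGET_URL>", proxy="<PROXY_URL>", timeout=60, user_agent="<USER_AGENT>"))
```

Passing a list rather than a single shell string avoids quoting problems with the long User Agent value.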

Step 3: Scraping the cf_clearance cookie

Run the command against a Cloudflare-protected page. This example uses a 60-second timeout and writes the cookies to cookies.json:

python main.py \
  -p http://190.58.248.86:80 \
  -t 60 \
  -ua "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36" \
  -f cookies.json \
  https://www.scrapingcourse.com/cloudflare-challenge

# Output
[12:40:42] [INFO] Cookie: cf_clearance=KkssR4xQ9xEJwlNtUXQEKkoQl...lgI5

The cookie is logged to the terminal and written to cookies.json. To use it in a scraper, you need to capture it programmatically. Here is a Python function that runs the command via subprocess and extracts the cookie value from the output with a regular expression:

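import re
import subprocess
import sys

def get_cf_clearance(url, proxy, user_agent):
    # Run this from inside the cloned CF-Clearance-Scraper directory so main.py resolves
    command = [
        sys.executable, "main.py",
        "-p", proxy,
        "-t", "60",
        "-ua", user_agent,
        "-f", "cookies.json",
        url,
    ]

    try:
        # Merge stderr into stdout so the logged cookie line is captured either way
        process = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=120,  # give up if the headless browser hangs
        )
    except (OSError, subprocess.TimeoutExpired) as e:
        print(f"Error: {e}")
        return None

    # The tool logs a line of the form "Cookie: cf_clearance=<value>"
    match = re.search(r"cf_clearance=(\S+)", process.stdout)
    return match.group(1) if match else None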

target_url = "https://www.scrapingcourse.com/cloudflare-challenge"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36"
proxy = "http://190.58.248.86:80"

cf_clearance = get_cf_clearance(target_url, proxy, user_agent)
print(cf_clearance)

# Output
MKybX880PCu.GfWLhonkBnG64WBs4ASAXeZ...Tux0eDI

Step 4: Using the cf_clearance cookie in your scraper

Now use the cookie in a requests session. The session must use the exact same User Agent and proxy that were used to obtain the cookie. Any deviation invalidates it:

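import re
import subprocess
import sys

import requests

def get_cf_clearance(url, proxy, user_agent):
    # Run this from inside the cloned CF-Clearance-Scraper directory so main.py resolves
    command = [
        sys.executable, "main.py",
        "-p", proxy,
        "-t", "60",
        "-ua", user_agent,
        "-f", "cookies.json",
        url,
    ]

    try:
        # Merge stderr into stdout so the logged cookie line is captured either way
        process = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=120,  # give up if the headless browser hangs
        )
    except (OSError, subprocess.TimeoutExpired) as e:
        print(f"Error: {e}")
        return None

    # The tool logs a line of the form "Cookie: cf_clearance=<value>"
    match = re.search(r"cf_clearance=(\S+)", process.stdout)
    return match.group(1) if match else None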

def scrape_with_clearance(url, cf_clearance, proxy, user_agent):
    session = requests.Session()

    # cookie, User Agent, and proxy must all match what was used to obtain the cookie
    session.cookies.set("cf_clearance", cf_clearance)
    session.headers.update({"User-Agent": user_agent})
    session.proxies.update({"http": proxy, "https": proxy})

    try:
        # a timeout keeps a dead proxy from hanging the scraper indefinitely
        response = session.get(url, timeout=30)
        response.raise_for_status()
        return response.text
    except requests.exceptions.RequestException as e:
        # return None rather than an error string so callers never mistake it for HTML
        print(f"Request failed: {e}")
        return None

target_url = "https://www.scrapingcourse.com/cloudflare-challenge"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/136.0.0.0 Safari/537.36"
proxy = "http://190.58.248.86:80"

cf_clearance = get_cf_clearance(target_url, proxy, user_agent)

if cf_clearance:
    html = scrape_with_clearance(target_url, cf_clearance, proxy, user_agent)
    if html is not None:
        print(html)
else:
    print("Failed to retrieve cf_clearance. Exiting.")

<!-- Output -->
<h2>You bypassed the Cloudflare challenge! :D</h2>

A successful run returns the protected page HTML. Note the caveat in the code comment: the cookie, User Agent, and proxy must all match exactly what was used during the challenge. One mismatch and Cloudflare rejects the session.
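To notice a rejection in code rather than by reading the HTML, you can inspect the response headers. Cloudflare documents a `cf-mitigated` response header set to `challenge` on challenge pages; the sketch below relies on that signal alone, and `looks_like_challenge` is a hypothetical helper name. Verify the header against the current Cloudflare documentation for your target before depending on it.

```python
def looks_like_challenge(headers):
    """Return True if the response headers carry Cloudflare's challenge marker."""
    for name, value in headers.items():
        # Header names are case-insensitive, so normalize before comparing.
        if name.lower() == "cf-mitigated" and value.strip().lower() == "challenge":
            return True
    return False


print(looks_like_challenge({"CF-Mitigated": "challenge"}))
print(looks_like_challenge({"Content-Type": "text/html"}))
```

With the session from the code above, the check would be `looks_like_challenge(response.headers)`. A positive result means the cookie was not accepted and the challenge needs to be solved again.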

Sticky sessions for rotating proxies

If you use a rotating proxy service, standard rotation will break your session because the IP changes between requests. Look for a service that supports sticky sessions, which pins you to the same exit IP for a configurable time window.

With a sticky session you set the IP lifetime long enough to cover your full scraping session. If it is a short crawl, 1 to 5 minutes is usually enough. For longer jobs, extend it accordingly.
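The arithmetic behind that choice can be sketched as follows. The helper name, the 60-second allowance for solving the challenge, and the 1.5 safety factor are all assumptions chosen for illustration; tune them to your own crawl.

```python
import math


def sticky_window_minutes(num_pages, seconds_per_page, solve_seconds=60, safety_factor=1.5):
    """Estimate how long the exit IP must stay pinned, in whole minutes."""
    # Time to solve the challenge plus time to fetch every page, padded by a margin.
    total_seconds = (solve_seconds + num_pages * seconds_per_page) * safety_factor
    return math.ceil(total_seconds / 60)


# 100 pages at 2 seconds each: (60 + 200) * 1.5 = 390 seconds
print(sticky_window_minutes(100, 2))  # prints 7
```

If the estimate exceeds the longest sticky window your provider offers, split the crawl into batches and solve a fresh challenge for each one.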

The limitations of the cf_clearance approach

The manual cf_clearance approach works, but it is fragile in practice for several reasons.

Low and inconsistent success rate. The ZenRows docs on CF-Clearance-Scraper openly acknowledge that the tool may need multiple runs before successfully extracting the cookie. On some Cloudflare configurations it may not succeed at all. You often need to retry, and there is no reliable signal for how many retries a given target will take.
Cookies expire mid-session. cf_clearance cookies have a finite lifetime. A long scraping job can run past the cookie's expiry, which breaks the session mid-run and leaves you with incomplete data. You need to detect this, re-solve the challenge, and restart the affected portion of the crawl.
IP binding is strict. If your proxy rotates between the challenge-solving step and the scraping step, the cookie is immediately invalid. Even a brief IP change is enough to trigger a block. This makes the approach incompatible with most standard rotating proxy setups unless sticky sessions are available and configured correctly.
Cloudflare updates break it. CF-Clearance-Scraper is open source. Cloudflare can study its approach and update its challenge mechanism to defeat it. A tool that worked reliably last month may start failing consistently after a Cloudflare update. There is no automatic recovery.
Chrome overhead. The tool runs a full headless Chrome instance to solve each challenge. That is significant memory and startup time for what is essentially a cookie retrieval step, before any actual scraping has happened.
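Since the tool may need several runs, retry logic is something you end up writing yourself. Below is a minimal sketch of a retry wrapper with exponential backoff. `solve_with_retries` is a hypothetical helper, and the attempt count and delays are arbitrary assumptions rather than recommended values.

```python
import time


def solve_with_retries(solve, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call solve() until it returns a truthy cookie value or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        cookie = solve()
        if cookie:
            return cookie, attempt
        if attempt < max_attempts:
            # Exponential backoff: base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
    return None, max_attempts


# Demo with a stand-in solver that fails twice and then succeeds.
_results = iter([None, None, "<COOKIE_VALUE>"])
cookie, attempts = solve_with_retries(lambda: next(_results), sleep=lambda seconds: None)
print(cookie, attempts)
```

In the tutorial code, the solver would be `lambda: get_cf_clearance(target_url, proxy, user_agent)`. Each failed attempt still pays the full cost of launching Chrome.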

A more reliable alternative: Spidra

The core problem with the cf_clearance approach is that you solve Cloudflare's challenge in a fragile, manually maintained way and then try to carry that solved state into a different HTTP client. Every handoff point in that chain is a failure mode.

Spidra eliminates the handoff entirely. Every request runs through a real browser with residential proxy rotation, CAPTCHA solving, and fingerprint management built in.

Cloudflare's challenge-solving happens inside the same request context that fetches the page. There is no cookie to extract, transfer, or expire. You just send the URL.

pip install spidra

from spidra import SpidraClient, ScrapeParams, ScrapeUrl
import os

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://www.scrapingcourse.com/cloudflare-challenge/")],
    prompt="Extract the main heading and body text",
    use_proxy=True,
    proxy_country="us",
))

print(job.result.content)
# { "heading": "You bypassed the Cloudflare challenge! :D" }

No Chrome to launch. No cookie to manage. No sticky session to configure. No retry logic to build. The same request works on the first call.

And unlike the cf_clearance approach, which returns raw HTML you still need to parse, Spidra extracts exactly what you describe and returns clean structured JSON. For the Cloudflare page above, the output is already structured and ready to use without any parsing step.

For scraping the actual content of a protected page:

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://www.scrapingcourse.com/cloudflare-challenge/")],
    prompt="Extract all product names and prices",
    output="json",
    use_proxy=True,
    proxy_country="us",
))

print(job.result.content)

[
    {"name": "Abominable Hoodie", "price": "$69.00"},
    {"name": "Adrienne Trek Jacket", "price": "$57.00"}
]

Proxy usage is billed against your bandwidth quota separately so there is no credit multiplier when anti-bot bypass is needed.

cf_clearance approach vs. Spidra

Feature	cf_clearance + CF-Clearance-Scraper	Spidra
Cloudflare bypass	Inconsistent, may need retries	Built in, automatic
Cookie management	Manual, must maintain IP and UA	Not needed
Session expiry handling	Manual, you detect and re-solve	Not applicable
Proxy requirement	Sticky session required	Built in, 50 countries
Chrome overhead	Yes, full instance per challenge	Managed infrastructure
Structured output	Raw HTML, you parse it	AI extraction, optional schema
Maintenance as Cloudflare updates	Manual, tool can break	Handled by Spidra
Best for	Understanding how cf_clearance works	Production scraping of protected sites

Conclusion

The cf_clearance approach is a real technique, and understanding how it works is useful. The cf_clearance cookie is Cloudflare's session pass, and extracting it manually is one way to get through the protection.

The practical problem is reliability. The success rate is inconsistent, cookies expire, IP binding is strict, and Cloudflare updates can break the entire approach without warning. For a production scraping pipeline that needs to run reliably, the maintenance overhead of keeping the cf_clearance approach working is significant.

Spidra handles Cloudflare bypass automatically inside every request, with no cookie management, no sticky session configuration, and no fragile handoffs between tools. The same code works today and after the next Cloudflare update.

Get started free at spidra.io. No credit card required.

Frequently asked questions

What is the cf_clearance cookie?

It is a session cookie that Cloudflare issues to clients that have passed its bot detection checks. The cookie acts as a clearance token for that session, allowing subsequent requests to proceed without repeating the full challenge. It is bound to the IP address and User Agent used during the original challenge.

How is Spidra different from the cf_clearance approach?

The cf_clearance approach solves the challenge once externally, extracts the cookie, and carries it into a separate HTTP client. Every step in that handoff is a potential failure point. Spidra solves the challenge inside the same request context that fetches the page content, so there is no cookie to extract or transfer. Cloudflare bypass is part of every request automatically without any separate setup.
