May 27, 2026 · 7 min read

How to scrape with Patchright and avoid detection

Joel Olawanle

Browser automation tools like Playwright are detectable out of the box. Sites check the browser fingerprint for automation flags and block requests that look like they came from a script rather than a real user.

Patchright is a patched version of Playwright that tries to fix this by removing or masking the signals that give automated browsers away.

In this tutorial you will learn how Patchright works, how to use it to extract data from a real e-commerce page, and how it holds up against a site with actual anti-bot protection.

What is Patchright?


Patchright is an open-source Python and Node.js library that wraps Playwright and patches its automation fingerprints to make scraping harder to detect. It sets navigator.webdriver to false, replaces the HeadlessChrome token in the user agent with Chrome, deactivates pop-up blocking, and applies other browser-level patches that standard Playwright leaves exposed.

Because it inherits Playwright's full API, Patchright is a drop-in replacement. If you already have a working Playwright scraper, you can switch to Patchright by changing the import without touching anything else.
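To make the drop-in claim concrete, here is a minimal sketch of a loader that prefers Patchright and falls back to stock Playwright. The helper name load_sync_playwright is hypothetical and not part of either library; the sketch only assumes that both packages expose sync_playwright from a sync_api module, as the imports in this tutorial do.

```python
import importlib
import importlib.util


def load_sync_playwright(prefer=("patchright", "playwright")):
    """Return (package_name, sync_playwright) for the first installed package.

    Hypothetical helper: it relies only on both packages exposing
    `sync_playwright` from `<package>.sync_api`.
    """
    for name in prefer:
        # find_spec returns None when the top-level package is not installed.
        if importlib.util.find_spec(name) is not None:
            module = importlib.import_module(f"{name}.sync_api")
            return name, module.sync_playwright
    raise ImportError(f"none of {prefer!r} is installed")
```

Calling load_sync_playwright() returns the package name and the entry point, so the rest of a scraper can stay identical whichever backend is picked.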

Web scraping with Patchright

You will extract product names and prices from an e-commerce test page. Here is what you need to get started.

Prerequisites

  • Python 3.10 or later
  • Patchright
pip3 install patchright

Patchright's developer recommends using Chrome rather than the built-in Chromium instance for better stealth. Install the Chrome binary:

patchright install chrome

Step 1: Scrape product data

Patchright supports both synchronous (sync_api) and asynchronous (async_api) methods, matching Playwright's API exactly. This tutorial uses sync_api.
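For comparison, an asynchronous variant might look roughly like the sketch below. It assumes the same target page and the same .product, .product-name, and .price selectors used in the synchronous scraper in this step; the function name scrape_products is made up for illustration. The import sits inside the function only so the module can be loaded where Patchright is not installed.

```python
async def scrape_products(url="https://www.scrapingcourse.com/ecommerce/"):
    """Async sketch of the product scrape; returns a list of dicts."""
    # Imported lazily so this module loads even without Patchright installed.
    from patchright.async_api import async_playwright

    async with async_playwright() as p:
        # channel="chrome" assumes Chrome was installed with `patchright install chrome`.
        browser = await p.chromium.launch(channel="chrome")
        page = await browser.new_page()
        await page.goto(url)

        rows = []
        # In the async API, Locator.all() must be awaited.
        for product in await page.locator(".product").all():
            rows.append({
                "title": await product.locator(".product-name").inner_text(),
                "price": await product.locator(".price").inner_text(),
            })

        await browser.close()
        return rows
```

You would run it with asyncio.run(scrape_products()). The rest of this tutorial stays with sync_api.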

Launch a browser, open the target page, locate all product elements, and extract the name and price from each one:

# pip3 install patchright
from patchright.sync_api import sync_playwright

with sync_playwright() as p:
    # Use the Chrome binary installed with `patchright install chrome`
    browser = p.chromium.launch(channel="chrome")
    page = browser.new_page()

    page.goto("https://www.scrapingcourse.com/ecommerce/")

    # Each product card on the page matches the .product selector
    products = page.locator(".product")
    product_data = []

    for product in products.all():
        product_data.append({
            "title": product.locator(".product-name").inner_text(),
            "price": product.locator(".price").inner_text(),
        })

    print(product_data)

    # Close the browser, not just the page, so Chrome shuts down cleanly
    browser.close()

# Output
[
    {"title": "Abominable Hoodie", "price": "$69.00"},
    {"title": "Artemis Running Short", "price": "$45.00"},
    # ... rest of results
]
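The prices come back as display strings. If you need numbers for sorting or filtering, a small post-processing step could look like the sketch below. The parse_price helper is hypothetical, and it assumes US-style formatting (comma as thousands separator, dot as decimal point).

```python
import re
from decimal import Decimal


def parse_price(text):
    """Convert a price string such as "$69.00" to a Decimal.

    Assumes US-style formatting: "," groups thousands, "." marks decimals.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no numeric price in {text!r}")
    return Decimal(match.group().replace(",", ""))


# Sample rows copied from the scraper output above.
product_data = [
    {"title": "Abominable Hoodie", "price": "$69.00"},
    {"title": "Artemis Running Short", "price": "$45.00"},
]

# Keep every field as-is, but replace the price string with a Decimal.
cleaned = [{**row, "price": parse_price(row["price"])} for row in product_data]
print(cleaned)
```

Decimal is used instead of float so that currency values keep their exact two-digit precision.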

The scraper runs cleanly: it is the Playwright API you already know, with the fingerprint patches applied underneath. Now test the part that actually matters: whether those patches hold up against a protected site.

Step 2: Testing Patchright against anti-bot protection

The real question with any stealth tool is not whether it scrapes open pages but whether it gets through bot detection on sites that are actively looking for automation signals.

Launch the browser as a persistent context with headless=False so it runs in GUI mode. This is the recommended setup for Patchright when dealing with protected pages, since it gives the browser a more realistic profile:

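# pip3 install patchright
from patchright.sync_api import sync_playwright
import time

with sync_playwright() as p:
    # launch_persistent_context returns a browser context, not a browser
    context = p.chromium.launch_persistent_context(
        user_data_dir="patchright-data",  # profile directory reused between runs
        channel="chrome",                 # real Chrome instead of bundled Chromium
        no_viewport=True,                 # keep the window's native size
        headless=False,                   # run with a visible window
    )
    page = context.new_page()

    page.goto("https://www.scrapingcourse.com/antibot-challenge")

    # Give the challenge up to 20 seconds to resolve
    time.sleep(20)
    context.close()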

Running this against the anti-bot challenge page shows the problem. The browser opens, navigates to the page, and gets stuck on the challenge. Even in non-headless mode with persistent context, Patchright cannot get through the JavaScript-based challenge. The session sits on the block page for the full 20 seconds and never reaches the content.
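Rather than sleeping for a fixed 20 seconds and watching the window, you could poll the page for the text it shows once the challenge is passed. The sketch below is one way to do that, not part of Patchright. The helper name wait_until_unblocked is hypothetical, and the success marker is assumed from the heading the challenge page displays after a successful bypass.

```python
import time

# Assumed success text, taken from the heading shown after a bypass.
SUCCESS_MARKER = "You bypassed the Antibot challenge"


def wait_until_unblocked(page, marker=SUCCESS_MARKER, timeout_s=20.0, poll_ms=500):
    """Poll page HTML until `marker` appears; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if marker in page.content():
                return True
        except Exception:
            # content() can fail while the page is mid-navigation; retry.
            pass
        page.wait_for_timeout(poll_ms)
    return False
```

In the script above, replacing time.sleep(20) with passed = wait_until_unblocked(page) gives you a boolean you can log or assert on instead of a visual check.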

This happens because modern anti-bot systems go well beyond checking navigator.webdriver. They look at WebGL fingerprints, font rendering, browser extension signatures, timing patterns, and dozens of other signals. Patching a few obvious Playwright flags is not enough to pass all of those checks consistently.

Running Patchright through a fingerprinting test tool like bot.sannysoft.com shows another issue. In non-headless mode it passes most tests, but headless mode reveals a HeadlessChrome flag that remains in the user agent. That single signal is enough for sophisticated anti-bot systems to identify the session as automated.
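You can check the two most obvious signals yourself without a third-party test page. The following is a minimal sketch; automation_signals is a hypothetical helper, and it covers only the user agent token and navigator.webdriver, not the broader set of checks that real anti-bot systems run.

```python
def automation_signals(user_agent, webdriver):
    """Return a list of obvious automation giveaways found in the inputs."""
    signals = []
    # Headless Chrome identifies itself with this token unless it is patched out.
    if "HeadlessChrome" in user_agent:
        signals.append("HeadlessChrome in user agent")
    # navigator.webdriver is true in a stock automated browser.
    if webdriver:
        signals.append("navigator.webdriver is true")
    return signals
```

With a live page you would pass in page.evaluate("navigator.userAgent") and page.evaluate("navigator.webdriver"). An empty list means only that these two flags are clean, not that the session is undetectable.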

The limitations of Patchright

  • Cannot bypass JavaScript challenges. The anti-bot test above shows this directly. Patchright cannot solve the JavaScript-based challenges that Cloudflare, DataDome, and similar systems use. Getting stuck on the challenge page is not a configuration problem. It is a fundamental limitation of the approach.
  • Headless mode leaks fingerprints. The HeadlessChrome flag in headless mode is a detectable signal that Patchright does not fully hide. Anti-bot systems run many detection checks in parallel, and a single leaked signal is often enough.
  • Open source is a liability for stealth. Anti-bot vendors reverse-engineer public libraries. Any open-source stealth tool that gets popular will eventually have its patches studied, and detection will be updated to catch it. The gap between a new Patchright release and an anti-bot update that catches it tends to shrink over time.
  • No proxy infrastructure. Patchright has no built-in proxy rotation or geo-targeting. All of that needs to come from you.
  • Resource-heavy at scale. Like Playwright, Patchright requires a full browser instance per session. Memory and CPU overhead limit how much you can parallelize.
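Since proxy rotation has to come from you, one simple approach is a round-robin over your own proxy list. This is a sketch only: the proxy URLs are placeholders, proxy_settings is a hypothetical helper, and it assumes the launch call accepts Playwright's proxy dictionary with server, username, and password keys.

```python
from itertools import cycle
from urllib.parse import urlsplit


def proxy_settings(proxy_url):
    """Turn a proxy URL into a Playwright-style proxy dictionary."""
    parts = urlsplit(proxy_url)
    settings = {"server": f"{parts.scheme}://{parts.hostname}:{parts.port}"}
    # Credentials are optional; include them only when the URL carries them.
    if parts.username:
        settings["username"] = parts.username
    if parts.password:
        settings["password"] = parts.password
    return settings


# Placeholder proxies; replace with endpoints from your own provider.
PROXIES = [
    "http://user:secret@proxy-a.example.com:8000",
    "http://proxy-b.example.com:8000",
]

# cycle() yields the proxies in order and starts over when the list ends.
rotation = cycle(PROXIES)
```

Each new session could then be started with p.chromium.launch(channel="chrome", proxy=proxy_settings(next(rotation))). Round-robin is the simplest policy; it does not detect dead or banned proxies, which you would need to handle separately.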

These limitations do not make Patchright useless. For light scraping on sites without serious bot protection, it is a simple and effective upgrade over standard Playwright. The gaps show up when the target is actually trying to stop you.

Getting past what Patchright cannot handle

When a site's anti-bot system is sophisticated enough to block Patchright, patching the browser further is not a reliable path forward. Anti-bot vendors update their detection faster than open-source tools update their patches.

The approach that works consistently is to move the anti-bot handling out of your code entirely and into a managed service that maintains its own infrastructure against those updates.

Spidra handles this at the API level. Every request runs through a real browser with residential proxy rotation across 50 countries, CAPTCHA solving, and browser fingerprinting that stays current as anti-bot systems evolve. You do not configure any of that. You just send the URL.

Here is the same e-commerce page you scraped with Patchright, using Spidra instead:

pip install spidra

from spidra import SpidraClient, ScrapeParams, ScrapeUrl
import os

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://www.scrapingcourse.com/ecommerce/")],
    prompt="Extract all product names and prices",
    output="json",
))

print(job.result.content)

# Output
[
    {"title": "Abominable Hoodie", "price": "$69.00"},
    {"title": "Artemis Running Short", "price": "$45.00"}
]

Same data. No CSS selectors. No browser to launch. No fingerprints to manage.

Now the same request on the anti-bot challenge page that Patchright could not get through:

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://www.scrapingcourse.com/antibot-challenge/")],
    prompt="Extract the main heading",
    use_proxy=True,
    proxy_country="us",
))

print(job.result.content)
# { "heading": "You bypassed the Antibot challenge! :D" }

No configuration change between the open page and the protected one. The same request works on both. Spidra handles Cloudflare and other systems automatically, and because it is a managed service, it stays current with anti-bot updates without you doing anything.

Proxy usage is billed against your bandwidth quota separately, so there is no credit multiplier when bypass is needed.

Patchright vs. Spidra

| Feature | Patchright | Spidra |
| --- | --- | --- |
| JavaScript rendering | Yes, via patched Playwright | Yes, real browser built in |
| Cloudflare bypass | Basic only, fails on JS challenges | Built in, automatic |
| DataDome / PerimeterX | Not reliable | Built in, automatic |
| Proxy rotation | Not built in | Built in, 50 countries |
| Stays current with anti-bot updates | Manual, you update the library | Managed by Spidra |
| Structured output | Raw HTML, you write the parser | AI extraction, optional JSON schema |
| Language support | Python, Node.js | Python, Node.js, Go, PHP, Ruby, and more |
| Best for | Light scraping, basic fingerprint bypass | Protected sites, production pipelines |

Conclusion

Patchright is a genuine improvement over standard Playwright for sites that check basic automation flags. The drop-in replacement approach means almost no setup cost if you already use Playwright, and it works well for lighter scraping jobs where the target is not running sophisticated detection.

The gap shows up when the target runs JavaScript challenges or more advanced fingerprint analysis. At that point, patching browser flags is not enough and the anti-bot system wins. Keeping up with detection updates as they evolve is also ongoing work that falls entirely on you.

If you need to reliably scrape protected sites without maintaining the anti-bot layer yourself, Spidra handles that full stack automatically. The same code runs on open pages and protected ones without any changes.

Get started free at spidra.io. No credit card required.

Frequently asked questions

It depends on the protection. Patchright handles basic fingerprint checks well. It struggles with JavaScript-based challenges like those used by Cloudflare, DataDome, and PerimeterX, where patching browser flags alone is not enough to pass all the detection checks those systems run.
