Browser automation tools like Playwright are detectable out of the box. Sites check for automation flags in the browser fingerprint and block requests that look like they came from a script rather than a real user.
Patchright is a patched version of Playwright that tries to fix this by removing or masking the signals that give automated browsers away.
In this tutorial you will learn how Patchright works, how to use it to extract data from a real e-commerce page, and how it holds up against a site with actual anti-bot protection.
What is Patchright?
Patchright is an open-source Python and Node.js library that wraps Playwright and patches its automation fingerprints to make scraping harder to detect. It sets navigator.webdriver to false, changes the HeadlessChrome token in the user agent to Chrome, disables pop-up blocking, and applies other browser-level patches that standard Playwright leaves exposed.
Because it inherits Playwright's full API, Patchright is a drop-in replacement. If you already have a working Playwright scraper, you can switch to Patchright by changing the import without touching anything else.
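Because only the import path differs, a scraper can choose its backend at runtime. The sketch below is a hypothetical helper (the name `pick_backend` and the fallback order are illustrative assumptions, not part of either library). It returns the first importable module name, preferring Patchright and falling back to Playwright:

```python
import importlib.util


def pick_backend(candidates=("patchright.sync_api", "playwright.sync_api")):
    """Return the first importable module name from candidates, or None."""
    for name in candidates:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except ModuleNotFoundError:
            # Raised when the parent package (e.g. "patchright") is missing
            continue
    return None


backend = pick_backend()
if backend is None:
    print("Neither Patchright nor Playwright is installed")
else:
    # Both modules are expected to expose sync_playwright
    sync_playwright = importlib.import_module(backend).sync_playwright
    print(f"Using {backend}")
```

The rest of the scraper can then call `sync_playwright()` without knowing which library supplied it.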
Web scraping with Patchright
You will extract product names and prices from an e-commerce test page. Here is what you need to get started.
Prerequisites
- Python 3.10 or later
- Patchright
```bash
pip3 install patchright
```

Patchright's developer recommends using Chrome rather than the built-in Chromium instance for better stealth. Install the Chrome binary:

```bash
patchright install chrome
```

Step 1: Scrape product data
Patchright supports both synchronous (sync_api) and asynchronous (async_api) methods, matching Playwright's API exactly. This tutorial uses sync_api.
Launch a browser, open the target page, locate all product elements, and extract the name and price from each one:
```python
# pip3 install patchright
from patchright.sync_api import sync_playwright

with sync_playwright() as p:
    # Launch the default headless Chromium build
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://www.scrapingcourse.com/ecommerce/")

    # Each product card on the page matches the .product selector
    products = page.locator(".product")

    product_data = []
    for product in products.all():
        product_data.append({
            "title": product.locator(".product-name").inner_text(),
            "price": product.locator(".price").inner_text(),
        })

    print(product_data)

    # Closing the browser also closes its pages
    browser.close()
```

```python
# Output
[
    {"title": "Abominable Hoodie", "price": "$69.00"},
    {"title": "Artemis Running Short", "price": "$45.00"},
    # ... rest of results
]
```

That works cleanly: it is the Playwright API you already know, with the fingerprint patches applied underneath. Now test the part that actually matters: whether those patches hold up against a protected site.
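Before moving on, note that the scraped prices are display strings such as "$69.00". If you want to sort or sum them, they need converting to numbers first. The following is a minimal sketch, assuming US-style formatting (a currency symbol, optional thousands commas, and a dot as the decimal separator). The helper name `parse_price` and the `price_value` key are illustrative, and the sample data repeats the two rows shown in the output above:

```python
from decimal import Decimal
import re


def parse_price(text):
    """Convert a display price such as "$1,069.00" to a Decimal."""
    # Keep digits and the decimal point; drop currency symbols and commas
    cleaned = re.sub(r"[^\d.]", "", text)
    return Decimal(cleaned)


product_data = [
    {"title": "Abominable Hoodie", "price": "$69.00"},
    {"title": "Artemis Running Short", "price": "$45.00"},
]

for item in product_data:
    item["price_value"] = parse_price(item["price"])

total = sum(item["price_value"] for item in product_data)
print(total)
```

`Decimal` is used instead of `float` so that currency values are not affected by binary rounding.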
Step 2: Testing Patchright against anti-bot protection
The real question with any stealth tool is not whether it scrapes open pages but whether it gets through bot detection on sites that are actively looking for automation signals.
Launch the browser as a persistent context with headless=False so it runs in GUI mode. This is the recommended setup for Patchright when dealing with protected pages, since it gives the browser a more realistic profile:
```python
# pip3 install patchright
from patchright.sync_api import sync_playwright
import time

with sync_playwright() as p:
    # A persistent context stores profile data in user_data_dir between runs
    context = p.chromium.launch_persistent_context(
        user_data_dir="patchright-data",
        channel="chrome",
        no_viewport=True,
        headless=False,
    )
    page = context.new_page()
    page.goto("https://www.scrapingcourse.com/antibot-challenge")

    # Give the challenge time to resolve before closing
    time.sleep(20)
    context.close()
```

Running this against the anti-bot challenge page shows the problem. The browser opens, navigates to the page, and gets stuck on the challenge. Even in non-headless mode with a persistent context, Patchright cannot get through the JavaScript-based challenge. The session sits on the block page for the full 20 seconds and never reaches the content.
This happens because modern anti-bot systems go well beyond checking navigator.webdriver. They look at WebGL fingerprints, font rendering, browser extension signatures, timing patterns, and dozens of other signals. Patching a few obvious Playwright flags is not enough to pass all of those checks consistently.
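When experimenting with the test above, a fixed 20-second sleep always waits the full duration, even if the page clears sooner. A polling helper returns as soon as a condition holds and gives up at a deadline. The sketch below is generic: `wait_until` is a hypothetical helper, and the `ready` function is a stand-in predicate so the example runs without a browser:

```python
import time


def wait_until(predicate, timeout=20.0, interval=0.5):
    """Poll predicate() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in predicate that succeeds on its third call
calls = {"n": 0}


def ready():
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until(ready, timeout=2.0, interval=0.01))
```

In a real run, the predicate would inspect the page, for example by checking `page.title()` or looking for an element that only exists on the unblocked page. Which text or selector marks the challenge page is an assumption you would need to confirm against the target site.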
Running Patchright through a fingerprinting test tool like bot.sannysoft.com shows another issue. In non-headless mode it passes most tests, but headless mode reveals a HeadlessChrome flag that remains in the user agent. That single signal is enough for sophisticated anti-bot systems to identify the session as automated.
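The user agent leak is easy to check for yourself. The sketch below shows the kind of substring test a detector could apply. The function name `looks_headless` is hypothetical, and the two user agent strings are illustrative samples rather than captured output:

```python
def looks_headless(user_agent):
    """Return True if the user agent contains the HeadlessChrome token."""
    return "headlesschrome" in user_agent.lower()


# Illustrative sample strings; real values vary by platform and version
headless_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36"
)
headed_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)

print(looks_headless(headless_ua))
print(looks_headless(headed_ua))
```

To test a live session, read the string with `page.evaluate("navigator.userAgent")` and pass it to the same function.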
The limitations of Patchright
- Cannot bypass JavaScript challenges. The anti-bot test above shows this directly. Patchright cannot solve the JavaScript-based challenges that Cloudflare, DataDome, and similar systems use. Getting stuck on the challenge page is not a configuration problem. It is a fundamental limitation of the approach.
- Headless mode leaks fingerprints. The HeadlessChrome flag in headless mode is a detectable signal that Patchright does not fully hide. Anti-bot systems run many detection checks in parallel, and a single leaked signal is often enough.
- Open source is a liability for stealth. Anti-bot vendors reverse-engineer public libraries. Any open-source stealth tool that gets popular will eventually have its patches studied, and detection will be updated to catch it. The gap between a new Patchright release and an anti-bot update that catches it tends to shrink over time.
- No proxy infrastructure. Patchright has no built-in proxy rotation or geo-targeting. All of that needs to come from you.
- Resource-heavy at scale. Like Playwright, Patchright requires a full browser instance per session. Memory and CPU overhead limit how much you can parallelize.
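Since proxy rotation has to come from your own code, the simplest form is a round-robin over a list of endpoints. The sketch below assumes you already have proxies to rotate through. The hostnames are placeholders, and `proxy_rotation` is a hypothetical helper that yields settings in the `{"server": ...}` shape that Playwright's `launch()` accepts for its `proxy` argument:

```python
from itertools import cycle

# Placeholder endpoints; replace with proxies you actually control
PROXY_SERVERS = [
    "http://proxy-1.example.com:8000",
    "http://proxy-2.example.com:8000",
    "http://proxy-3.example.com:8000",
]


def proxy_rotation(servers):
    """Yield Playwright-style proxy settings in round-robin order, forever."""
    for server in cycle(servers):
        yield {"server": server}


rotation = proxy_rotation(PROXY_SERVERS)
first_four = [next(rotation)["server"] for _ in range(4)]
print(first_four)
```

Each new session would then take `next(rotation)` and pass it as `proxy=` when launching the browser. Geo-targeting, health checks, and retry on failure are not covered by this sketch.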
These limitations do not make Patchright useless. For light scraping on sites without serious bot protection, it is a simple and effective upgrade over standard Playwright. The gaps show up when the target is actually trying to stop you.
Getting past what Patchright cannot handle
When a site's anti-bot system is sophisticated enough to block Patchright, patching the browser further is not a reliable path forward. Anti-bot vendors update their detection faster than open-source tools update their patches.
The approach that works consistently is to move the anti-bot handling out of your code entirely and into a managed service that maintains its own infrastructure against those updates.
Spidra handles this at the API level. Every request runs through a real browser with residential proxy rotation across 50 countries, CAPTCHA solving, and browser fingerprinting that stays current as anti-bot systems evolve. You do not configure any of that. You just send the URL.
Here is the same e-commerce page you scraped with Patchright, using Spidra instead:
```bash
pip install spidra
```

```python
import os

from spidra import SpidraClient, ScrapeParams, ScrapeUrl

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://www.scrapingcourse.com/ecommerce/")],
    prompt="Extract all product names and prices",
    output="json",
))
print(job.result.content)
```

```json
[
  {"title": "Abominable Hoodie", "price": "$69.00"},
  {"title": "Artemis Running Short", "price": "$45.00"}
]
```

Same data. No CSS selectors. No browser to launch. No fingerprints to manage.
Now the same request on the anti-bot challenge page that Patchright could not get through:
```python
# Reuses the spidra client created in the previous snippet
job = spidra.scrape.run_sync(ScrapeParams(
    urls=[ScrapeUrl(url="https://www.scrapingcourse.com/antibot-challenge/")],
    prompt="Extract the main heading",
    use_proxy=True,
    proxy_country="us",
))
print(job.result.content)
# { "heading": "You bypassed the Antibot challenge! :D" }
```

Apart from enabling the proxy options, nothing changes between the open page and the protected one. Spidra handles Cloudflare and other systems automatically, and because it is a managed service, it stays current with anti-bot updates without you doing anything.
Proxy usage is billed against your bandwidth quota separately, so there is no credit multiplier when bypass is needed.
Patchright vs. Spidra
| Feature | Patchright | Spidra |
|---|---|---|
| JavaScript rendering | Yes, via patched Playwright | Yes, real browser built in |
| Cloudflare bypass | Basic only, fails on JS challenges | Built in, automatic |
| DataDome / PerimeterX | Not reliable | Built in, automatic |
| Proxy rotation | Not built in | Built in, 50 countries |
| Stays current with anti-bot updates | Manual, you update the library | Managed by Spidra |
| Structured output | Raw HTML, you write the parser | AI extraction, optional JSON schema |
| Language support | Python, Node.js | Python, Node.js, Go, PHP, Ruby, and more |
| Best for | Light scraping, basic fingerprint bypass | Protected sites, production pipelines |
Conclusion
Patchright is a genuine improvement over standard Playwright for sites that check basic automation flags. The drop-in replacement approach means almost no setup cost if you already use Playwright, and it works well for lighter scraping jobs where the target is not running sophisticated detection.
The gap shows up when the target runs JavaScript challenges or more advanced fingerprint analysis. At that point, patching browser flags is not enough and the anti-bot system wins. Keeping up with detection updates as they evolve is also ongoing work that falls entirely on you.
If you need to reliably scrape protected sites without maintaining the anti-bot layer yourself, Spidra handles that full stack automatically. The same request works on open pages and protected ones, with the proxy options as the only addition.
Get started free at spidra.io. No credit card required.
