Puppeteer is one of the most popular browser automation libraries for Node.js. It lets you control a real Chrome browser using JavaScript, making it useful for tasks like automated testing, web scraping, form filling, and interacting with JavaScript-heavy websites.
Out of the box, however, Puppeteer is designed for automation (not stealth). Browsers launched through Puppeteer expose detectable automation signals such as the WebDriver flag, missing browser APIs, and bot-like execution traces. Those signals are acceptable in testing environments, but during web scraping, they become exactly what anti-bot systems look for to identify and block automated traffic.
Puppeteer Real Browser is a library built to reduce those detection signals and make Puppeteer behave more like a regular human-operated browser. It patches common fingerprinting leaks, improves browser authenticity, and adds stealth features intended to help scrapers avoid basic bot detection systems.
In this tutorial, you’ll learn how Puppeteer Real Browser works, how it extends standard Puppeteer with stealth capabilities, how to use it to scrape data from a live website, and how well it performs when tested against real anti-bot protection systems.
What is Puppeteer Real Browser?
Puppeteer Real Browser is a Node.js library that wraps Puppeteer and applies a set of stealth patches to reduce its detectability. It relies on Rebrowser patches under the hood, which hide the navigator.webdriver flag, fill in missing browser APIs, patch bot-like stack traces, and mimic a real browser runtime environment more closely than standard Puppeteer.
Because it uses Puppeteer's API directly, it is a drop-in replacement. If you already have a working Puppeteer scraper, switching to Puppeteer Real Browser means changing the import and the browser launch call, not rewriting your scraping logic.
The library also includes Ghost Cursor, a cursor emulator that generates realistic mouse movement patterns rather than the robotic straight-line movements Puppeteer produces by default. This is particularly relevant for CAPTCHA interactions that check mouse behavior before deciding whether the visitor is human.
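The difference between a straight-line move and a curved one can be sketched in a few lines. The following is not Ghost Cursor's actual implementation; it is a simplified illustration that uses a quadratic Bézier curve with a caller-supplied control point. The function name `curvedPath` and its parameters are hypothetical.

```javascript
// Simplified illustration of a curved mouse path (quadratic Bezier curve).
// This is NOT Ghost Cursor's algorithm, only a sketch of the general idea.
function curvedPath(start, end, control, steps) {
  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    const u = 1 - t;
    // B(t) = (1 - t)^2 * start + 2 * (1 - t) * t * control + t^2 * end
    points.push({
      x: u * u * start.x + 2 * u * t * control.x + t * t * end.x,
      y: u * u * start.y + 2 * u * t * control.y + t * t * end.y,
    });
  }
  return points;
}

const start = { x: 0, y: 0 };
const end = { x: 100, y: 0 };
// Pull the control point off the straight line to bend the path.
const control = { x: 50, y: 40 };

const path = curvedPath(start, end, control, 10);
console.log(path[0], path[5], path[10]);
```

A straight-line move would keep `y` at 0 for every step; here the midpoint is pulled toward the control point. In a Puppeteer script, each point could be passed to `page.mouse.move(x, y)` with a small delay between calls.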
One important note before going further: the project has been discontinued and is no longer actively maintained. That context matters and we will come back to it.
How Puppeteer Real Browser enhances stealth
Standard Puppeteer fails basic fingerprinting checks on sites that test for automation signals. It reveals WebDriver usage, missing browser extensions, and other identifiers that flag the session as automated before your scraper ever touches the data you need.
Puppeteer Real Browser patches the most obvious of these. In headless mode with a custom User-Agent, it passes fingerprinting tests that standard Puppeteer fails. The Rebrowser patches, combined with Ghost Cursor's realistic mouse movement, give it noticeably better stealth than base Puppeteer on sites running lighter bot detection.
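To make the signals discussed above more concrete, here is a minimal sketch of the kind of check a fingerprinting script might run. It is not taken from any specific anti-bot product: the function name `collectAutomationSignals` and the particular checks are illustrative assumptions. It takes a navigator-like object as an argument so that it can run in plain Node.js with mock data.

```javascript
// Illustrative only: a simplified version of checks a fingerprinting script
// might run. Real anti-bot systems check far more than this.
function collectAutomationSignals(nav) {
  const signals = [];

  // Browsers set navigator.webdriver to true when driven by automation.
  if (nav.webdriver === true) {
    signals.push('webdriver');
  }

  // Headless Chrome has historically put "HeadlessChrome" in the User-Agent.
  if (/HeadlessChrome/.test(nav.userAgent || '')) {
    signals.push('headless-user-agent');
  }

  // An empty plugin list is unusual for a desktop browser.
  if (!nav.plugins || nav.plugins.length === 0) {
    signals.push('no-plugins');
  }

  // An empty languages list can hint at a bare automation profile.
  if (!nav.languages || nav.languages.length === 0) {
    signals.push('no-languages');
  }

  return signals;
}

// Mock navigator objects (hypothetical values) standing in for real browsers.
const automated = {
  webdriver: true,
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64) HeadlessChrome/120.0.0.0',
  plugins: [],
  languages: [],
};
const regular = {
  webdriver: false,
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0.0.0',
  plugins: [{ name: 'PDF Viewer' }],
  languages: ['en-US', 'en'],
};

console.log(collectAutomationSignals(automated));
console.log(collectAutomationSignals(regular));
```

The first mock trips every check and the second trips none. Stealth patches aim to make an automated session look like the second object.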
How to scrape with Puppeteer Real Browser
You will extract product names and prices from an e-commerce test page.
Prerequisites
- Node.js (latest LTS)
- puppeteer-real-browser
npm install puppeteer-real-browser
Step 1: Scrape product data
Import the library, launch the browser, navigate to the target page, and extract product data using standard Puppeteer selectors:
// npm install puppeteer-real-browser
const { connect } = require('puppeteer-real-browser');

const scraper = async () => {
  // Launch the patched browser and get a ready-to-use page
  const { browser, page } = await connect({
    headless: true,
  });
  await page.goto('https://www.scrapingcourse.com/ecommerce/');
  // Give the page a few seconds to finish rendering
  await new Promise((resolve) => setTimeout(resolve, 3000));
  // Extract the name and price from every product card
  const products = await page.$$eval('.product', (items) => {
    return items.map((item) => ({
      name: item.querySelector('.product-name')?.innerText || '',
      price: item.querySelector('.price')?.innerText || '',
    }));
  });
  console.log(products);
  await browser.close();
};

scraper();

// Output
[
{ name: 'Abominable Hoodie', price: '$69.00' },
{ name: 'Artemis Running Short', price: '$45.00' },
// ... rest of results
]
That works. The API feels exactly like standard Puppeteer, which is the point. Now test what actually matters: whether the stealth patches hold up against real protection.
Step 2: Testing against anti-bot protection
Puppeteer Real Browser's stealth works best in non-headless mode. Configure it with turnstile: true to activate the automatic Turnstile CAPTCHA clicker, defaultViewport: null for a full browser window, and --start-maximized to mimic a real user's screen:
// npm install puppeteer-real-browser
const { connect } = require('puppeteer-real-browser');

const scraper = async () => {
  const { browser, page } = await connect({
    headless: false,
    // Enable the automatic Turnstile CAPTCHA clicker
    turnstile: true,
    connectOption: {
      defaultViewport: null,
    },
    args: ['--start-maximized'],
  });
  await page.goto('https://www.scrapingcourse.com/antibot-challenge/');
  // Wait 20 seconds to give the challenge time to resolve
  await new Promise((resolve) => setTimeout(resolve, 20000));
  const content = await page.content();
  console.log(content);
  await browser.close();
};

scraper();
Running this against the anti-bot challenge page shows the problem. The browser opens in GUI mode, navigates to the page, and gets stuck on the challenge. The Turnstile clicker does not fire. The session sits on the block page for the full 20 seconds and never reaches the content.
Despite passing fingerprinting tests in controlled conditions, Puppeteer Real Browser still leaks enough signals in live conditions to fail the JavaScript challenge that Cloudflare and similar systems run in the background.
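Rather than reading the dumped HTML by eye, a scraper can check whether it is still sitting on a challenge page. The sketch below is a heuristic, not an official detection method: the two markers are assumptions based on what Cloudflare interstitial pages commonly contain, and they may change or differ per site. The function name `looksLikeChallengePage` and both sample HTML strings are hypothetical.

```javascript
// Heuristic markers often seen on Cloudflare interstitial pages. These are
// assumptions for illustration and may change or differ per site.
const CHALLENGE_MARKERS = [
  /<title>\s*Just a moment/i,
  /cdn-cgi\/challenge-platform/i,
];

// Returns true if any known challenge marker appears in the HTML.
function looksLikeChallengePage(html) {
  return CHALLENGE_MARKERS.some((marker) => marker.test(html));
}

// Hypothetical page sources: one stuck on the challenge, one past it.
const blockedHtml =
  '<html><head><title>Just a moment...</title></head><body></body></html>';
const passedHtml =
  '<html><head><title>Antibot Challenge</title></head>' +
  '<body><h2>You bypassed the Antibot challenge! :D</h2></body></html>';

console.log(looksLikeChallengePage(blockedHtml)); // true
console.log(looksLikeChallengePage(passedHtml)); // false
```

In the script above, the string returned by `page.content()` could be passed to a check like this before deciding whether to retry or give up.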
The limitations of Puppeteer Real Browser
- JavaScript challenges remain a blocker. The test above shows this directly. Fingerprinting test sites check for surface-level signals. Real anti-bot systems run deeper JavaScript-based challenges that check browser behavior, timing, and execution environment in ways that patching flags alone does not satisfy.
- Open source means diminishing returns over time. Anti-bot vendors study public libraries. Any stealth tool that gets popular will eventually have its specific patches identified and added to the detection checklist. The window between a new patch version and the corresponding detection update keeps shrinking.
- The project is discontinued. No active maintenance means no updates when anti-bot systems change their detection logic. Any evasion technique that works today has no guarantee of working in three months, and there is nobody updating the library when it stops.
- No proxy infrastructure. Puppeteer Real Browser has no built-in proxy rotation or geo-targeting. All of that is your responsibility.
- Resource-heavy at scale. Full browser instances are memory-intensive. Running many concurrent sessions pushes hardware limits quickly and makes parallel scraping expensive.
For sites without serious anti-bot protection, Puppeteer Real Browser is a solid upgrade over standard Puppeteer. For anything running modern bot detection, these limitations surface fast.
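Since proxy rotation is left to you, the simplest starting point is a round-robin picker that hands out the next proxy for each new session. This is a minimal sketch under stated assumptions: `createProxyRotator` is a hypothetical helper, and the proxy hosts are placeholders rather than real endpoints.

```javascript
// Minimal round-robin proxy rotator. The proxy entries are placeholders.
function createProxyRotator(proxies) {
  if (proxies.length === 0) {
    throw new Error('At least one proxy is required');
  }
  let index = 0;
  // Each call returns the next proxy and wraps around at the end of the list.
  return function nextProxy() {
    const proxy = proxies[index];
    index = (index + 1) % proxies.length;
    return proxy;
  };
}

const nextProxy = createProxyRotator([
  { host: 'proxy-a.example.com', port: 8080 },
  { host: 'proxy-b.example.com', port: 8080 },
]);

console.log(nextProxy().host); // proxy-a.example.com
console.log(nextProxy().host); // proxy-b.example.com
console.log(nextProxy().host); // proxy-a.example.com
```

The selected entry would then be supplied when launching each browser session; check the library's README for the exact shape of its proxy settings. Round-robin does nothing about banned or slow proxies, so a production setup would also need health checks and geo-targeting.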
Getting past what Puppeteer Real Browser cannot handle
The test above points to something worth understanding. Puppeteer Real Browser got stuck on the challenge page, not because its patches are poorly written, but because the problem it is trying to solve keeps changing. Anti-bot systems update. Open-source patches catch up. Then the anti-bot systems update again.
Keeping up with that cycle yourself is what burns engineering time. The alternative is to move the anti-bot handling out of your code and into a service that maintains it for you.
Spidra handles the full stack at the API level. Every request runs through a real browser with residential proxy rotation across 50 countries, CAPTCHA solving, and fingerprinting that stays current with anti-bot updates automatically. You do not configure any of that. You send a URL and get back clean data.
Here is the same e-commerce page from the Puppeteer Real Browser tutorial, using Spidra's Node.js SDK:
npm install spidra-js

import { SpidraClient } from 'spidra-js';

const spidra = new SpidraClient({ apiKey: process.env.SPIDRA_API_KEY });
const job = await spidra.scrape.run({
  urls: [{ url: 'https://www.scrapingcourse.com/ecommerce/' }],
  prompt: 'Extract all product names and prices',
  output: 'json',
});
console.log(job.result.content);

[
  { "name": "Abominable Hoodie", "price": "$69.00" },
  { "name": "Artemis Running Short", "price": "$45.00" }
]
Same data. No CSS selectors. No browser to launch. No fingerprints to manage.
Now the same request on the anti-bot challenge page that Puppeteer Real Browser could not get through:
const job = await spidra.scrape.run({
  urls: [{ url: 'https://www.scrapingcourse.com/antibot-challenge/' }],
  prompt: 'Extract the main heading',
  useProxy: true,
  proxyCountry: 'us',
});
console.log(job.result.content);
// { "heading": "You bypassed the Antibot challenge! :D" }
The only difference between the open page and the protected one is the pair of proxy options, useProxy and proxyCountry. There is no browser, fingerprint, or CAPTCHA configuration to change. Because Spidra is a managed service, it stays current with anti-bot changes without you tracking library updates or applying patches.
Proxy usage is billed against your bandwidth quota separately, so there is no credit multiplier when bypass is needed.
Puppeteer Real Browser vs. Spidra
| | Puppeteer Real Browser | Spidra |
|---|---|---|
| JavaScript rendering | Yes, via patched Puppeteer | Yes, real browser built in |
| Cloudflare bypass | Fails on JS challenges | Built in, automatic |
| DataDome / PerimeterX | Not reliable | Built in, automatic |
| Ghost Cursor / human-like mouse | Yes | Handled internally |
| Proxy rotation | Not built in | Built in, 50 countries |
| Actively maintained | No, discontinued | Yes |
| Stays current with anti-bot updates | Manual patches only | Managed by Spidra |
| Structured output | Raw HTML, you parse it | AI extraction, optional JSON schema |
| Language | Node.js | Node.js, Python, Go, PHP, Ruby, and more |
| Best for | Light scraping, basic fingerprint bypass | Protected sites, production pipelines |
Conclusion
Puppeteer Real Browser is a meaningful improvement over standard Puppeteer for sites that check surface-level automation signals. The drop-in replacement API means almost zero migration cost if you already use Puppeteer, and it does what it says on lighter targets.
The limits show up against real anti-bot protection, and the fact that the project is discontinued means those limits will only grow over time as detection systems evolve and the library does not keep up.
If you need reliable scraping on protected sites without maintaining the anti-bot layer yourself, Spidra handles that full stack automatically. The same request that works on open pages works on protected ones, with at most a proxy option added.
Get started free at spidra.io. No credit card required.
