A large portion of valuable web data lives in dashboards, internal tools, admin panels, analytics views, private documentation, customer portals, and CRM systems, and is not publicly accessible to traditional scrapers.
Until now, Spidra has been limited to publicly available pages: if the content required signing in, there was no clean way to access it.
Today, we are excited to introduce authenticated scraping and crawling, which removes that limitation by letting Spidra operate inside a logged-in session, just like a real user's browser.
How authenticated scraping and crawling work
Spidra uses cookie-based authentication, which is how browsers maintain login sessions across requests.
Instead of asking you to reimplement authentication flows or handle credentials directly, Spidra relies on cookies that already exist after you log in normally.
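For reference, the mechanism Spidra reuses is the standard Cookie request header that a browser attaches to every request once you are signed in; the host and cookie values below are placeholders:

GET /dashboard HTTP/1.1
Host: app.example.com
Cookie: session_id=abc123; auth_token=xyz789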
To set this up, log in to the target website in your browser, then open the browser DevTools and copy the relevant cookies.
You provide those cookies to Spidra as name-value pairs, and Spidra injects them into its browser session before navigating.
From that point on, all requests behave as if they are coming from your authenticated browser session.
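Before handing the cookies to Spidra, it can be worth sanity-checking them with a plain curl request; with valid cookies, the response should be the logged-in page rather than a redirect to the login screen. The URL and cookie values here are placeholders:

curl --header 'Cookie: session_id=abc123; auth_token=xyz789' \
  https://app.example.com/dashboard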
Tip: In some cases, including cookies related to prior verification steps (such as clearance or challenge cookies) can significantly reduce or eliminate CAPTCHA challenges during scraping or crawling. This can lead to faster runs and fewer interruptions, especially on heavily protected sites.
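For example, if a site sits behind Cloudflare, its cf_clearance cookie records an already-solved challenge; appending it to the cookie string you pass to Spidra can carry that clearance into the session (the value below is a placeholder):

"cookies": "session_id=abc123; auth_token=xyz789; cf_clearance=PLACEHOLDER_VALUE"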
Authenticated scraping and crawling in the Spidra API
Authenticated access is available not only in the Spidra Playground but also through the API, making it easy to integrate into automated workflows.
You can pass cookies directly when submitting a scrape job:
curl --request POST \
  --url https://api.spidra.io/api/scrape \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data '{
    "urls": [
      { "url": "https://app.example.com/dashboard" }
    ],
    "cookies": "session_id=abc123; auth_token=xyz789"
  }'

Authenticated crawling works the same way, with cookies applied at the start of the crawl:
curl --request POST \
  --url https://api.spidra.io/api/crawl \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data '{
    "baseUrl": "https://app.example.com",
    "crawlInstruction": "Find all report pages",
    "transformInstruction": "Extract report data",
    "maxPages": 10,
    "cookies": "session_id=abc123; auth_token=xyz789"
  }'

One key difference between scraping and crawling is session persistence.
For scraping, the provided cookies apply to the specific request being executed. For crawling, the same authenticated session is preserved across all discovered pages. This ensures consistent access as Spidra navigates links, pagination, and internal sections without re-authentication issues.
Common use cases
Authenticated scraping and crawling unlock a wide range of workflows, including:
- Extracting data from SaaS admin panels
- Lead generation and account research from gated platforms
- Crawling private documentation or internal knowledge bases
- Accessing member-only or subscription content
- Scraping e-commerce backends for orders or inventory
- Collecting analytics or reports from logged-in tools
- Reducing friction from CAPTCHAs on protected sites
Privacy and security considerations
Authentication cookies are never stored by Spidra.
They are used transiently for the duration of a scrape or crawl and discarded immediately afterward. Spidra does not persist session data or reuse cookies across jobs.
Users are responsible for ensuring they have legal authorization to access and extract any content. Clear guidance and disclaimers are provided in both the UI and documentation.
