A large portion of valuable web data lives in dashboards, internal tools, admin panels, analytics views, private documentation, customer portals, and CRM systems, and is not publicly accessible to traditional scrapers.
Until now, Spidra has been limited to publicly available pages: if the content required signing in, there was no clean way to access it.
Today, we are excited to introduce authenticated scraping and crawling, which removes that limitation by letting Spidra operate inside a logged-in session, just like a real user's browser.
How authenticated scraping and crawling work
Spidra uses cookie-based authentication, which is how browsers maintain login sessions across requests.
Instead of asking you to reimplement authentication flows or handle credentials directly, Spidra relies on cookies that already exist after you log in normally.
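For reference, the mechanism Spidra reuses is the standard Cookie request header that a browser attaches to every request once you are signed in; the host and cookie values below are placeholders:

GET /dashboard HTTP/1.1
Host: app.example.com
Cookie: session_id=abc123; auth_token=xyz789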
To set this up, log in to the target website in your browser, then open the browser DevTools and copy the relevant cookies.
You provide those cookies to Spidra as name-value pairs, and Spidra injects them into its browser session before navigating.
From that point on, all requests behave as if they are coming from your authenticated browser session.
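Before handing the cookies to Spidra, it can be worth sanity-checking them with a plain curl request; with valid cookies, the response should be the logged-in page rather than a redirect to the login screen. The URL and cookie values here are placeholders:

curl --header 'Cookie: session_id=abc123; auth_token=xyz789' \
  https://app.example.com/dashboard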
Tip: In some cases, including cookies related to prior verification steps (such as clearance or challenge cookies) can significantly reduce or eliminate CAPTCHA challenges during scraping or crawling. This can lead to faster runs and fewer interruptions, especially on heavily protected sites.
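For example, if a site sits behind Cloudflare, its cf_clearance cookie records an already-solved challenge; appending it to the cookie string you pass to Spidra can carry that clearance into the session (the value below is a placeholder):

"cookies": "session_id=abc123; auth_token=xyz789; cf_clearance=PLACEHOLDER_VALUE"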
Authenticated scraping and crawling in the Spidra API
Authenticated access is available not only in the Spidra Playground but also through the API, making it easy to integrate into automated workflows.
You can pass cookies directly when submitting a scrape job:
curl --request POST \
  --url https://api.spidra.io/api/scrape \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data '{
    "urls": [
      { "url": "https://app.example.com/dashboard" }
    ],
    "cookies": "session_id=abc123; auth_token=xyz789"
  }'

Authenticated crawling works the same way, with cookies applied at the start of the crawl:
curl --request POST \
  --url https://api.spidra.io/api/crawl \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data '{
    "baseUrl": "https://app.example.com",
    "crawlInstruction": "Find all report pages",
    "transformInstruction": "Extract report data",
    "maxPages": 10,
    "cookies": "session_id=abc123; auth_token=xyz789"
  }'

One key difference between scraping and crawling is session persistence.
For scraping, the provided cookies apply to the specific request being executed. For crawling, the same authenticated session is preserved across all discovered pages. This ensures consistent access as Spidra navigates links, pagination, and internal sections without re-authentication issues.
Common use cases
Authenticated scraping and crawling unlock a wide range of workflows, including:
- Extracting data from SaaS admin panels
- Lead generation and account research from gated platforms
- Crawling private documentation or internal knowledge bases
- Accessing member-only or subscription content
- Scraping e-commerce backends for orders or inventory
- Collecting analytics or reports from logged-in tools
- Reducing friction from CAPTCHAs on protected sites
Privacy and security considerations
Authentication cookies are never stored by Spidra.
They are used transiently for the duration of a scrape or crawl and discarded immediately afterward. Spidra does not persist session data or reuse cookies across jobs.
Users are responsible for ensuring they have legal authorization to access and extract any content. Clear guidance and disclaimers are provided in both the UI and documentation.
