Changelog

Stay up to date with the latest features, improvements, and bug fixes in Spidra.

Introducing Authenticated Scraping and Crawling

A large portion of valuable web data sits behind login walls — dashboards, internal tools, admin panels, analytics views, private documentation, customer portals, and CRM systems. Until now, Spidra was limited to publicly available pages.

Today, we're introducing authenticated scraping and crawling, which allows Spidra to operate inside a logged-in session, just like a real user's browser.

Spidra uses cookie-based authentication, the same mechanism browsers use to maintain login sessions. Simply log in to the target website using your browser, copy the relevant cookies from DevTools, and pass them to Spidra. From that point on, all requests behave as if they're coming from your authenticated session.
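If you are scripting the request yourself, the `cookies` field is just a standard `Cookie`-header string of `name=value` pairs joined by `; `. A minimal Python sketch of assembling it — the helper name and the cookie values are illustrative, not part of the Spidra API:

```python
def build_cookie_string(cookies: dict) -> str:
    """Join name/value pairs copied from DevTools into the
    'name=value; name=value' form the API's "cookies" field expects."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())

# Hypothetical cookie names, matching the example request below.
payload = {
    "urls": [{"url": "https://app.example.com/dashboard"}],
    "cookies": build_cookie_string({
        "session_id": "abc123",
        "auth_token": "xyz789",
    }),
}
print(payload["cookies"])  # session_id=abc123; auth_token=xyz789
```

The resulting payload can be POSTed to the scrape endpoint exactly as in the curl example below.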

This works in both the Playground and the API. Here's how it looks via the API:

curl --request POST \
  --url https://api.spidra.io/api/scrape \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data '{
    "urls": [
      { "url": "https://app.example.com/dashboard" }
    ],
    "cookies": "session_id=abc123; auth_token=xyz789"
  }'

Authenticated crawling works the same way — cookies are applied at the start of the crawl and the session is preserved across all discovered pages, so Spidra can navigate links, pagination, and internal sections without re-authentication issues.

The API supports both standard cookie strings and raw DevTools pastes. Spidra automatically detects the format and parses it accordingly. Authentication cookies are never stored — they are used transiently for the duration of a job and discarded immediately afterward.
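Spidra performs this format detection server-side, so you can paste either form as-is. If you want to normalize a DevTools table paste yourself before sending it, the idea looks roughly like this sketch — the exact formats Spidra accepts and how it parses them are not specified here, so this is purely illustrative:

```python
def normalize_cookies(raw: str) -> str:
    """Rough sketch: accept either a standard 'name=value; ...' string
    or a tab-separated DevTools table paste (one cookie per line, name
    and value in the first two columns) and return the standard
    semicolon-joined form."""
    if "\t" not in raw:
        # Already a standard cookie string; just tidy it up.
        return raw.strip().rstrip(";")
    pairs = []
    for line in raw.strip().splitlines():
        cols = line.split("\t")
        if len(cols) >= 2:
            pairs.append(f"{cols[0]}={cols[1]}")
    return "; ".join(pairs)
```

Either way, the normalized string goes into the same `cookies` field shown above.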

Read the full blog here.

Introducing Crawling via the Spidra API

The Spidra API now includes a comprehensive set of crawling endpoints. You can submit crawl jobs, track progress, and retrieve results — all from your own code.

Submit a crawl by providing a starting URL, a natural-language instruction describing which pages to discover, and a second instruction describing how content should be extracted:

curl --request POST \
  --url https://api.spidra.io/api/crawl \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data '{
    "baseUrl": "https://example.com/blog",
    "crawlInstruction": "Crawl all blog post pages",
    "transformInstruction": "Extract title, author, date, and content",
    "maxPages": 5
  }'

Spidra handles page discovery, navigation, captcha solving, and transformation automatically. The crawl runs asynchronously — you get back a job ID and can poll for progress or results:

{
  "status": "completed",
  "progress": {
    "pagesCrawled": 5,
    "maxPages": 5
  },
  "result": [
    {
      "url": "https://example.com/blog/post-1",
      "status": "success",
      "data": {
        "title": "First Post",
        "author": "John",
        "date": "2025-01-01"
      }
    }
  ]
}
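A simple client-side polling loop might look like the following Python sketch. The terminal status values (`"completed"`, `"failed"`) mirror the response shape above; the fetch callable, interval, and timeout are illustrative assumptions rather than part of the API:

```python
import time

def poll_until_done(fetch_status, interval=2.0, timeout=120.0):
    """Poll a status-returning callable until the job reaches a
    terminal state. `fetch_status` would wrap a GET to the crawl
    status endpoint for your job ID and return the parsed JSON."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status()
        if job.get("status") in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError("crawl job did not finish in time")
```

Keeping the HTTP call behind a callable makes the loop easy to test and lets you swap in whatever HTTP client you already use.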

The API also includes endpoints for fetching crawled pages independently, viewing job configuration details, and listing crawl history with pagination. It uses the same crawling engine as the Playground, including automatic captcha handling, optional stealth mode with proxy rotation, and retry logic for failed pages.

Read the full blog here.

Captcha Solving and Stealth Mode Are Now Live for Crawling

Crawling protected websites just got easier.

Spidra's crawler can now solve captchas automatically and run in stealth mode using residential proxy rotation. This reduces blocks, improves reliability, and makes it possible to crawl sites that were previously difficult to access.

When the crawler encounters a captcha, it detects the type and solves it automatically in the background. No manual steps are required.

Some websites also block or limit repeated requests from the same IP address. When stealth mode is enabled, traffic is routed through rotating residential proxies to reduce detection and avoid rate limits during larger crawls. These updates are especially useful for:

  • Websites with aggressive bot protection
  • Large or deep crawls that previously hit blocking limits
  • Competitor research where anonymity matters

Read the full blog here.

Dark Mode Is Now Available in Spidra

Dark mode is now live in Spidra. You can switch between light and dark mode anytime from the dashboard. The experience stays exactly the same while scraping, crawling, and reviewing results — it's simply easier on the eyes during long sessions.

This update is part of our ongoing beta improvements as we continue refining Spidra based on real usage and feedback.

Read the full blog here.

Introducing AI-Powered Crawling

Until now, Spidra focused on no-code scraping — extracting clean content from specific URLs. Many users kept asking: "Can I crawl an entire website, not just individual pages?" Now you can.

In real workflows, you rarely know every URL you need upfront. Whether you're auditing a website, analyzing competitor content, or building datasets for research, manually listing pages doesn't scale. Crawling solves this by automatically discovering pages and extracting content at scale.

You provide a website, how many pages you want, and a natural-language instruction describing what content matters. Spidra then discovers valid pages based on your instructions, crawls each page using a real browser, extracts only meaningful content, and transforms everything into clean, structured output. All of this happens inside the UI — no code required.

AI-powered crawling includes:

  • Full-site page discovery
  • Clean content extraction without navigation noise
  • Bulk results in the dashboard
  • ZIP export with proper filenames
  • Crawl logs with per-page retries
  • Prompt-based refinement, so you can adjust instructions and rerun transformations without starting over

Read the full blog here.

Spidra Beta Is Now Live

After months of building, refining, and testing behind the scenes, Spidra is officially in beta. We're welcoming our first batch of early users to help shape the future of AI-powered web scraping.

Spidra is an AI-first web scraping platform that lets you extract data from any website using real browser automation and natural language prompts. Instead of writing selectors, scripts, or dealing with brittle crawlers, you tell Spidra what you want and it handles the clicking, scrolling, typing, and waiting for you.

Under the hood, Spidra uses large language models and real browser automation to navigate pages like a human, extract clean data, solve captchas, and stay undetected by anti-bot systems.

Web scraping hasn't evolved much in the last decade. Most tools still rely on selectors, brittle automations, and endless debugging. We wanted to rethink scraping from the ground up — what if you could simply describe what you need and the scraper actually understood the page, located the right data, and delivered it in clean, structured formats?

You can join the beta by signing up at app.spidra.io. It's free to get started.

Read the full blog here.

Start scraping for free.

Get 300 free credits to explore Spidra. Build your first scraper in minutes, not hours. Upgrade anytime as you scale.

We build features around real workflows, usually shipping them within days.