When should I use forEach instead of a plain prompt?

Use a plain prompt when all the data you need is on a single page and extracting it does not require interacting with individual elements. Use forEach when data is spread across many cards or rows, when you need to click into detail pages, or when content is hidden behind per-item interactions like accordions or modals.

Can forEach follow multiple pages of results automatically?

Yes. Add a pagination object with nextSelector pointing to the CSS selector of the Next button, and maxPages for the maximum number of pages to follow. forEach will collect items from each page and combine all results into one array.

What does captureSelector do in forEach?

It narrows the scope of what content gets captured per item. Without it, forEach captures the full page content for each item. With captureSelector: "article.product_page", it captures only the content inside that CSS selector on each destination page. Useful for reducing noise and keeping per-item extraction focused.

How many elements can forEach process in one request?

Set maxItems to control this. There is no hard limit, but very large values will take longer. Use pagination to spread collection across pages rather than processing hundreds of elements in one forEach call. For large-scale collection across many pages, the batch scraping API is more efficient than forEach with high maxItems.

Can I nest actions inside a forEach?

Yes. The actions array inside a forEach runs on each item after navigation or click. For example, scroll down the detail page before extracting, or click a specific tab before reading content.

What is the difference between itemPrompt and the top-level prompt?

itemPrompt runs on each individual element as it is processed. The top-level prompt (on the overall scrape request) runs on the combined output after all items have been collected. You can use both together: itemPrompt to extract raw fields per item, and prompt to reshape or summarize the full combined result.
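A request that uses both might look like the sketch below. It only builds the JSON body and does not call the API. The forEach fields mirror the quotes.toscrape.com examples in this guide, while the wording of the two prompts and the `build_request` helper name are illustrative assumptions.

```python
import json


def build_request(url: str, item_prompt: str, prompt: str, max_items: int = 20) -> dict:
    """Build a scrape request body that uses itemPrompt and prompt together."""
    return {
        "urls": [
            {
                "url": url,
                "actions": [
                    {
                        "type": "forEach",
                        "value": "Find all quote blocks on the page",
                        "mode": "inline",
                        "maxItems": max_items,
                        # Runs once per matched element
                        "itemPrompt": item_prompt,
                    }
                ],
            }
        ],
        # Runs once on the combined output of all items
        "prompt": prompt,
        "output": "json",
    }


body = build_request(
    "http://quotes.toscrape.com",
    item_prompt="Extract the quote text, author name, and all tags",
    prompt="Group the quotes by author and count how many each author has",
)
print(json.dumps(body, indent=2))
```

Keeping itemPrompt limited to raw fields and leaving grouping or summarizing to the top-level prompt keeps each per-item extraction small.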

June 23, 2026 · 14 min read

Spidra browser actions: complete guide to clicking, scrolling, and interacting before scraping

Joel Olawanle

Most web data does not sit waiting for you on a static page. It hides behind cookie banners. It appears only after you scroll far enough. It lives on detail pages you have to click into. It loads when a dropdown filter is selected or a search form is submitted.

Traditional scrapers fail here because they make one HTTP request and read whatever HTML comes back. If the content requires user interaction first, they get nothing.

Spidra's browser actions let you describe what to do on a page before extraction runs: click a button, type into a field, scroll down, wait for content to appear. The actions execute in order inside a real browser. When they are done, extraction runs on the result. You get the data that only appears after interaction.

This guide covers every action available, how to write them using CSS selectors or plain English, and how to combine them for real-world scraping tasks on sites like books.toscrape.com, quotes.toscrape.com, and scrapingcourse.com.

How browser actions work

Actions live inside the actions array on each URL object in your scrape request. They run in order, inside a real browser, before extraction begins.

import requests, time, os

API_KEY = os.environ["SPIDRA_API_KEY"]
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "https://example.com/products",
                "actions": [
                    {"type": "click", "value": "Accept cookies"},
                    {"type": "scroll", "to": "80%"},
                ]
            }
        ],
        "prompt": "Extract all product names and prices",
        "output": "json",
    }
)
job_id = response.json()["jobId"]

The value field on click is a plain English description. Spidra uses AI to locate the element on the page. The alternative is selector, which takes a CSS selector or XPath expression:

# Plain English — Spidra AI finds the element
{"type": "click", "value": "Accept cookies button"}

# CSS selector — goes straight to the browser engine, faster
{"type": "click", "selector": "#accept-cookies"}

# XPath
{"type": "click", "selector": "//button[contains(text(), 'Accept')]"}

Both work and you can mix them in the same actions array. CSS and XPath are faster because they go straight to the browser engine without an AI call. Plain English is more resilient because it adapts when the page changes.

Action reference

Action	What it does	Key fields
`click`	Clicks any element: button, link, tab, icon	`selector` or `value`
`type`	Types text into an input field	`selector`, `value`
`check`	Checks a checkbox	`selector` or `value`
`uncheck`	Unchecks a checkbox	`selector` or `value`
`wait`	Pauses for a number of milliseconds	`duration`
`scroll`	Scrolls to a percentage of page height	`to` (e.g. `"80%"`)
`forEach`	Finds matching elements and processes each one	`value`, `mode`
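The key fields in this table can be checked on the client before a request is sent. The sketch below is one way to do that. It assumes the fields listed in the table are required, which this guide does not state outright, and it treats `mode` on forEach as optional because click is described as the default. The `validate_action` helper is hypothetical and not part of any Spidra SDK.

```python
# Required keys per action type, taken from the "Key fields" column.
# Each inner tuple is a group of alternatives: at least one must be present.
REQUIRED_FIELDS = {
    "click": [("selector", "value")],
    "type": [("selector",), ("value",)],
    "check": [("selector", "value")],
    "uncheck": [("selector", "value")],
    "wait": [("duration",)],
    "scroll": [("to",)],
    "forEach": [("value", "observe")],
}


def validate_action(action: dict) -> list[str]:
    """Return a list of problems found in one action (empty if none)."""
    kind = action.get("type")
    if kind not in REQUIRED_FIELDS:
        return [f"unknown action type: {kind!r}"]
    problems = []
    for alternatives in REQUIRED_FIELDS[kind]:
        if not any(key in action for key in alternatives):
            problems.append(f"{kind} needs " + " or ".join(alternatives))
    return problems


print(validate_action({"type": "click", "selector": "#accept-cookies"}))
print(validate_action({"type": "type", "selector": "input[type='search']"}))
```

Catching a missing field locally is cheaper than discovering it after a browser session has already started.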

click

The most common action. Use it to dismiss cookie banners, open tabs, expand sections, click navigation links, or interact with any element before extraction.

Dismissing a cookie banner on quotes.toscrape.com

quotes.toscrape.com is a scraping practice site built by the toscrape.com team. It uses a login gate on some endpoints and has a clean paginated structure that makes it ideal for testing browser actions.

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "http://quotes.toscrape.com",
                "actions": [
                    {"type": "click", "value": "Accept cookies"},
                ]
            }
        ],
        "prompt": "Extract all quotes with author name and tags",
        "output": "json",
    }
)

Clicking a tab to reveal content

Some pages hide data behind tabs. Without clicking the right tab first, extraction only sees the default visible content.

"actions": [
    {"type": "click", "selector": "button[data-tab='specifications']"},
    {"type": "wait", "duration": 500},
]

Navigating to a filtered page

"actions": [
    {"type": "click", "selector": "a[href='/catalogue/mystery_3/']"},
    {"type": "wait", "duration": 1000},
]

type

Types text into an input field. Use it to fill search boxes, filter inputs, or any form field that changes what content appears on the page.

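Searching scrapingcourse.com by keyword

This request targets scrapingcourse.com: it types a keyword into the search input, submits the form, and waits for the results to render before extraction runs. The selectors are generic, so confirm them against the live page's markup before relying on them.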

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "https://www.scrapingcourse.com/",
                "actions": [
                    {"type": "type", "selector": "input[type='search']", "value": "python"},
                    {"type": "click", "selector": "button[type='submit']"},
                    {"type": "wait", "duration": 1500},
                ]
            }
        ],
        "prompt": "Extract all search result titles and URLs",
        "output": "json",
    }
)

Typing into multiple fields

Fill out a form with several fields before submitting:

"actions": [
    {"type": "type", "selector": "input[name='username']", "value": "you@example.com"},
    {"type": "type", "selector": "input[name='password']", "value": "yourpassword"},
    {"type": "click", "selector": "button[type='submit']"},
    {"type": "wait", "duration": 2000},
]

check and uncheck

Toggle checkboxes to apply or remove filters before extraction runs. Useful for e-commerce category pages, job boards with filter panels, and directory listings with tag filters.

"actions": [
    # Check "In stock only" filter
    {"type": "check", "selector": "input[name='filter-in-stock']"},
    {"type": "wait", "duration": 1000},

    # Uncheck a category that should not be included
    {"type": "uncheck", "value": "Used items checkbox"},
    {"type": "wait", "duration": 500},
]

wait

Pauses execution for a specified number of milliseconds. Use it after clicking something that triggers a network request, after typing into a search field, or after any action where content takes a moment to appear.

"actions": [
    {"type": "click", "selector": "#load-more"},
    {"type": "wait", "duration": 2000},   # wait 2 seconds for content to load
    {"type": "scroll", "to": "100%"},
    {"type": "wait", "duration": 1500},   # wait for lazy-loaded content after scroll
]

The duration value is in milliseconds. A few practical reference points: 500ms is enough for a simple show/hide toggle; 1000-2000ms covers most AJAX requests; 3000ms+ is appropriate for pages with heavy JavaScript rendering.

scroll

Scrolls the page to a percentage of its total height. Use it to trigger lazy loading, bring content into the viewport, or reveal elements that only become visible on scroll.

Loading all books on books.toscrape.com

books.toscrape.com is a fictional bookstore built specifically for scraping practice. Each page lists 20 books. The page is static, so scrolling is not needed to load its content, but it is a clean target for showing how scroll combines with extraction:

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "https://books.toscrape.com",
                "actions": [
                    {"type": "scroll", "to": "50%"},
                    {"type": "wait", "duration": 500},
                    {"type": "scroll", "to": "100%"},
                    {"type": "wait", "duration": 500},
                ]
            }
        ],
        "prompt": "Extract all book titles, prices, and star ratings",
        "output": "json",
    }
)

Triggering infinite scroll on scrapingcourse.com

scrapingcourse.com/infinite-scrolling is a practice site that loads product cards as you scroll. Without scrolling, only the first batch of products is visible.

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "https://www.scrapingcourse.com/infinite-scrolling",
                "actions": [
                    {"type": "scroll", "to": "30%"},
                    {"type": "wait", "duration": 1000},
                    {"type": "scroll", "to": "60%"},
                    {"type": "wait", "duration": 1000},
                    {"type": "scroll", "to": "100%"},
                    {"type": "wait", "duration": 1500},
                ]
            }
        ],
        "prompt": "Extract all product names and prices",
        "output": "json",
    }
)

Each scroll triggers another load. By the time you reach 100%, all products that the page is willing to show are visible. Extraction then runs on the fully loaded page.
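The hand-written 30%, 60%, 100% sequence can also be generated. The sketch below builds evenly spaced scroll and wait pairs. The `scroll_steps` helper is hypothetical, and how many steps a given page needs is something to find by testing.

```python
def scroll_steps(steps: int, wait_ms: int = 1000) -> list[dict]:
    """Return scroll/wait pairs that reach 100% in `steps` evenly spaced jumps."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    actions = []
    for i in range(1, steps + 1):
        percent = round(100 * i / steps)
        actions.append({"type": "scroll", "to": f"{percent}%"})
        # Give the page time to load the next batch before scrolling again
        actions.append({"type": "wait", "duration": wait_ms})
    return actions


for action in scroll_steps(4, wait_ms=800):
    print(action)
```

The returned list can be passed directly as the `actions` value of a URL object, or placed ahead of other actions in a longer list.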

forEach: the most powerful action

forEach is in a different category from the other actions. It does not just interact with one element. It finds a set of matching elements, interacts with each one independently, runs extraction on whatever content each interaction reveals, and returns all the results combined into a single array.

It is the right tool when:

Data is spread across many cards, rows, or list items on the page
You need to navigate into each item's detail page to get full information
Content is hidden behind a click (modal, accordion, drawer) and you need it from every item

The three modes

inline reads the content of each element directly without clicking. Use this for product grids, table rows, list items, and any container where all the content you need is already visible inside the element.

navigate follows each element as a link, loads the destination page, and extracts content there. Use this when the data you want lives on detail pages you have to click into.

click (the default) clicks each element, waits for content to appear, and extracts from whatever opens. Use this for accordions, modals, expandable panels, and drawers.

forEach inline: scraping all books on books.toscrape.com

books.toscrape.com lists books in article.product_pod elements. Each card has the title, price, and star rating directly visible without needing to click anywhere. inline mode is the right choice here.

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "https://books.toscrape.com",
                "actions": [
                    {
                        "type": "forEach",
                        "value": "Find all book cards on the page",
                        "mode": "inline",
                        "itemPrompt": "Extract the book title, price, and star rating",
                        "maxItems": 20,
                    }
                ]
            }
        ],
        "output": "json",
    }
)

Result:

[
  {"title": "A Light in the Attic", "price": "£51.77", "rating": "Three"},
  {"title": "Tipping the Velvet", "price": "£53.74", "rating": "One"},
  {"title": "Soumission", "price": "£50.10", "rating": "One"}
]
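The price and rating in this result are strings. If numbers are needed downstream, a small post-processing step can convert them. The sketch below assumes the field names and value formats shown in the sample result. Real output depends on the itemPrompt, so treat the `normalize` helper as illustrative.

```python
import re

# books.toscrape.com expresses star ratings as words
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def normalize(item: dict) -> dict:
    """Convert a price string to a float and a rating word to an int."""
    price_digits = re.sub(r"[^0-9.]", "", item["price"])  # drop the currency symbol
    return {
        "title": item["title"],
        "price": float(price_digits),
        "rating": RATING_WORDS.get(item["rating"]),  # None if the word is unexpected
    }


books = [
    {"title": "A Light in the Attic", "price": "£51.77", "rating": "Three"},
    {"title": "Tipping the Velvet", "price": "£53.74", "rating": "One"},
    {"title": "Soumission", "price": "£50.10", "rating": "One"},
]
cleaned = [normalize(book) for book in books]
print(cleaned[0])
```

An alternative is to ask for numeric types up front with a schema, as the navigate example in this guide does.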

forEach navigate: scraping detail pages on books.toscrape.com

Each book card on books.toscrape.com links to a detail page with the full description, product information (UPC, number of reviews, availability count), and price including and excluding tax. To get that data, you need to navigate into each book's page. navigate mode handles this automatically.

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "https://books.toscrape.com",
                "actions": [
                    {
                        "type": "forEach",
                        "value": "Find all book title links",
                        "mode": "navigate",
                        "maxItems": 10,
                        "itemPrompt": "Extract the book title, price excluding tax, price including tax, availability, star rating, UPC, and full description",
                    }
                ]
            }
        ],
        "output": "json",
        "schema": {
            "type": "object",
            "required": ["title", "price_excl_tax"],
            "properties": {
                "title":          {"type": "string"},
                "price_excl_tax": {"type": ["number", "null"]},
                "price_incl_tax": {"type": ["number", "null"]},
                "availability":   {"type": ["number", "null"]},
                "rating":         {"type": ["string", "null"]},
                "upc":            {"type": ["string", "null"]},
                "description":    {"type": ["string", "null"]},
            }
        }
    }
)

For each of the 10 book cards, Spidra follows the link to the detail page, extracts the data using the itemPrompt, and returns an array of 10 objects. All in one API call.

forEach with pagination: all quotes across multiple pages

quotes.toscrape.com paginates its quotes across 10 pages with a "Next →" button at the bottom of each page. The pagination config tells forEach to keep following the next page button until it hits the limit.

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "http://quotes.toscrape.com",
                "actions": [
                    {
                        "type": "forEach",
                        "value": "Find all quote blocks on the page",
                        "mode": "inline",
                        "maxItems": 100,
                        "itemPrompt": "Extract the quote text, author name, and all tags",
                        "pagination": {
                            "nextSelector": "li.next a",
                            "maxPages": 5,
                        }
                    }
                ]
            }
        ],
        "output": "json",
    }
)

li.next a is the CSS selector for the Next button on quotes.toscrape.com. forEach collects quotes from page 1, then follows the Next link to page 2, collects again, and continues until it has processed 5 pages or run out of pages.

The result is a single flat array of all quotes collected across all pages — up to 50 quotes in this case (10 per page × 5 pages).
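The arithmetic behind that ceiling can be written out as below. This sketch assumes maxItems caps the total number of items across all pages rather than per page. The guide does not say which, so treat that as an assumption, and `max_results` as a hypothetical helper.

```python
def max_results(per_page: int, max_pages: int, max_items: int) -> int:
    """Upper bound on collected items, assuming maxItems caps the overall total."""
    return min(per_page * max_pages, max_items)


# 10 quotes per page, 5 pages, maxItems of 100
print(max_results(per_page=10, max_pages=5, max_items=100))  # 50
```

Under the same assumption, lowering maxItems to 30 would stop collection at 30 items even though five pages are allowed.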

forEach with nested actions per item

Sometimes each item needs its own interaction after being navigated into. forEach supports nested actions that run per item — inside each detail page or after clicking each element.

This example scrapes books.toscrape.com, navigates into each detail page, scrolls down to make sure the full description is visible, and then extracts:

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "https://books.toscrape.com",
                "actions": [
                    {
                        "type": "forEach",
                        "value": "Find all book title links",
                        "mode": "navigate",
                        "maxItems": 5,
                        "waitAfterClick": 1000,
                        "captureSelector": "article.product_page",
                        "actions": [
                            {"type": "scroll", "to": "50%"},
                        ],
                        "itemPrompt": "Extract title, price, rating, and the full product description",
                    }
                ]
            }
        ],
        "output": "json",
    }
)

captureSelector narrows what content is captured after navigation — in this case only the article.product_page element rather than the whole page. waitAfterClick adds a delay after each navigation before the nested actions run. Useful for pages that take a moment to fully render after loading.

forEach click mode: accordion and modal content

Some pages hide content behind accordions or "View details" buttons. click mode opens each element and captures whatever appears.

A product page with a "Show all specifications" accordion:

"actions": [
    {
        "type": "forEach",
        "value": "Find all specification accordion rows",
        "mode": "click",
        "maxItems": 50,
        "itemPrompt": "Extract the specification name and value",
    }
]

An FAQ page where each question expands into an answer:

"actions": [
    {
        "type": "forEach",
        "value": "Find all FAQ question rows",
        "mode": "click",
        "itemPrompt": "Extract the question and the full answer text",
        "waitAfterClick": 500,
    }
]

Combining actions: a full real-world example

Most real scraping tasks combine several actions. This example scrapes the scrapingcourse.com e-commerce practice site: it dismisses a cookie notice, scrolls to load all products, and then uses forEach to navigate into each product page, following the Next link for up to three pages of results. After submitting the job, it polls the job endpoint until the status is completed.

import requests, time, os

API_KEY = os.environ["SPIDRA_API_KEY"]
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}

response = requests.post(
    "https://api.spidra.io/api/scrape",
    headers=HEADERS,
    json={
        "urls": [
            {
                "url": "https://www.scrapingcourse.com/ecommerce/",
                "actions": [
                    # Dismiss cookie banner if present
                    {"type": "click", "value": "Accept cookies"},

                    # Scroll to trigger any lazy-loaded content
                    {"type": "scroll", "to": "50%"},
                    {"type": "wait", "duration": 800},
                    {"type": "scroll", "to": "100%"},
                    {"type": "wait", "duration": 1000},

                    # Navigate into each product card for full details
                    {
                        "type": "forEach",
                        "value": "Find all product cards",
                        "mode": "navigate",
                        "maxItems": 20,
                        "itemPrompt": "Extract the product name, price, description, and available sizes or variants",
                        "pagination": {
                            "nextSelector": "a.next",
                            "maxPages": 3,
                        }
                    }
                ]
            }
        ],
        "output": "json",
        "schema": {
            "type": "object",
            "required": ["name", "price"],
            "properties": {
                "name":        {"type": "string"},
                "price":       {"type": ["number", "null"]},
                "description": {"type": ["string", "null"]},
                "sizes":       {"type": "array", "items": {"type": "string"}},
            }
        },
        "use_proxy": True,
    }
)

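job_id = response.json()["jobId"]

# Poll every 3 seconds and give up after 100 attempts (about 5 minutes)
# so a job that never completes cannot hang the script.
for _ in range(100):
    status = requests.get(
        f"https://api.spidra.io/api/scrape/{job_id}",
        headers=HEADERS
    ).json()

    if status["status"] == "completed":
        products = status["result"]["content"]
        print(f"Extracted {len(products)} products")
        for p in products[:3]:
            print(p)
        break
    time.sleep(3)
else:
    raise TimeoutError(f"Job {job_id} did not complete in time")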

Using the Python SDK

The same actions work identically in the Python SDK. Import BrowserAction and pass it as a list:

from spidra import SpidraClient, ScrapeParams, ScrapeUrl, BrowserAction
import os

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

job = spidra.scrape.run_sync(ScrapeParams(
    urls=[
        ScrapeUrl(
            url="https://books.toscrape.com",
            actions=[
                BrowserAction(
                    type="forEach",
                    value="Find all book title links",
                    mode="navigate",
                    max_items=10,
                    item_prompt="Extract title, price, and star rating",
                    pagination={
                        "nextSelector": "li.next a",
                        "maxPages": 3,
                    }
                )
            ]
        )
    ],
    output="json",
))

print(job.result.content)

Note: Python SDK uses snake_case (max_items, item_prompt) while the REST API uses camelCase (maxItems, itemPrompt). Both work the same way.
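If you already have actions written as REST-style dictionaries, the key names can be converted mechanically. The sketch below renames only top-level keys and leaves nested values such as the pagination object untouched, matching the SDK example above. Only max_items and item_prompt are confirmed by this guide. That other options follow the same pattern (for example wait_after_click) is an assumption, and both helper names are hypothetical.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase key such as maxItems to snake_case (max_items)."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def to_sdk_kwargs(action: dict) -> dict:
    """Rename the top-level keys of a REST-style action; values are left as-is."""
    return {camel_to_snake(key): value for key, value in action.items()}


rest_action = {
    "type": "forEach",
    "value": "Find all book title links",
    "mode": "navigate",
    "maxItems": 10,
    "itemPrompt": "Extract title, price, and star rating",
    "pagination": {"nextSelector": "li.next a", "maxPages": 3},
}
print(to_sdk_kwargs(rest_action))
```

The resulting dictionary could then be unpacked into the SDK's action constructor, provided the SDK accepts those keyword names.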

Node.js SDK

import { SpidraClient } from 'spidra-js'

const spidra = new SpidraClient({ apiKey: process.env.SPIDRA_API_KEY! })

const job = await spidra.scrape.run({
  urls: [
    {
      url: 'https://books.toscrape.com',
      actions: [
        {
          type: 'forEach',
          value: 'Find all book title links',
          mode: 'navigate',
          maxItems: 10,
          itemPrompt: 'Extract title, price, and star rating',
          pagination: {
            nextSelector: 'li.next a',
            maxPages: 3,
          },
        },
      ],
    },
  ],
  output: 'json',
})

console.log(job.result.content)

Writing good actions

Use CSS selectors when you can identify a stable element

CSS and XPath selectors go straight to the browser engine without using AI. They are faster and do not consume extra tokens. If the element has a clean, stable ID or class that is unlikely to change, use selector.

# Good: stable selector
{"type": "click", "selector": "button#accept-cookies"}

# Also good: attribute selector
{"type": "click", "selector": "button[data-testid='cookie-accept']"}

Use plain English when the layout might change

Natural language descriptions tend to survive page redesigns. If a site updates its HTML every few weeks, "value": "Accept all cookies button" should still work after the class names change.

# Resilient to layout changes
{"type": "click", "value": "Accept all cookies button"}

Add wait after anything that triggers a network request

Clicks that open modals, submit forms, or navigate to new pages all need time for content to appear. A missing wait is the most common reason actions do not produce the expected result.

{"type": "click", "selector": "button#filter-apply"},
{"type": "wait", "duration": 1500},  # wait for filtered results to load
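Since a missing wait is easy to overlook, one option is to add waits programmatically. The sketch below inserts a wait after every click that is not already followed by one. The `add_waits` helper and its 1500 ms default are illustrative choices, not part of the Spidra API.

```python
def add_waits(actions: list[dict], wait_ms: int = 1500) -> list[dict]:
    """Return a copy of `actions` with a wait inserted after each unguarded click."""
    result = []
    for index, action in enumerate(actions):
        result.append(action)
        next_action = actions[index + 1] if index + 1 < len(actions) else None
        followed_by_wait = next_action is not None and next_action.get("type") == "wait"
        if action.get("type") == "click" and not followed_by_wait:
            result.append({"type": "wait", "duration": wait_ms})
    return result


actions = [
    {"type": "click", "selector": "button#filter-apply"},
    {"type": "click", "selector": "#load-more"},
    {"type": "wait", "duration": 500},
]
for action in add_waits(actions):
    print(action)
```

Existing waits are kept as written, so a deliberately short delay after a simple toggle is not overridden.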

Use forEach inline for visible content, navigate for detail pages

If you can see the data you want inside the element cards on the listing page, inline is simpler and faster. If the full data only exists on a detail page after clicking, navigate is the right mode.

# inline: all the data is in the cards
{
    "type": "forEach",
    "value": "Find all job listing cards",
    "mode": "inline",
    "itemPrompt": "Extract job title, company, and location"
}

# navigate: need to click into each listing for full details
{
    "type": "forEach",
    "value": "Find all job listing links",
    "mode": "navigate",
    "itemPrompt": "Extract job title, company, full description, and salary range"
}

Use maxItems deliberately

Without maxItems, forEach will attempt to process every matching element it finds. On a large directory page that could be hundreds of items. Set maxItems to what you actually need and use pagination to go wider rather than making a single huge request.

forEach options reference

Option	Type	Description
`value` or `observe`	string	Plain English description of elements to find
`mode`	string	`"inline"`, `"navigate"`, or `"click"` (default)
`maxItems`	number	Maximum number of elements to process
`itemPrompt`	string	AI extraction prompt applied to each item individually
`pagination`	object	`{ nextSelector: "...", maxPages: N }` — follow pagination
`waitAfterClick`	number	Milliseconds to wait after clicking each element
`captureSelector`	string	CSS selector to narrow what content is captured per item
`actions`	array	Nested actions to run per item after navigation or click
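The options in this table can be wrapped in a small builder that rejects typos before a request is sent. This is a sketch under the assumption that the table lists every supported option. The `for_each` function is hypothetical and not an SDK call.

```python
VALID_MODES = {"inline", "navigate", "click"}
OPTIONAL_KEYS = {"pagination", "waitAfterClick", "captureSelector", "actions"}


def for_each(value: str, item_prompt: str, mode: str = "click",
             max_items=None, **options) -> dict:
    """Build a forEach action dict, validating mode, maxItems, and option names."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    unknown = set(options) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown forEach options: {sorted(unknown)}")
    action = {"type": "forEach", "value": value, "mode": mode, "itemPrompt": item_prompt}
    if max_items is not None:
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        action["maxItems"] = max_items
    action.update(options)
    return action


action = for_each(
    "Find all book title links",
    "Extract title, price, rating, and the full product description",
    mode="navigate",
    max_items=5,
    captureSelector="article.product_page",
    waitAfterClick=1000,
)
print(action)
```

A misspelled option such as `captureSelecter` raises a ValueError here instead of being sent along in the request.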

Frequently asked questions

What is the difference between selector and value?

selector is a CSS selector or XPath expression like "#accept-cookies" or ".submit-btn". It goes directly to the browser engine. value is a plain English description like "Accept all cookies button" and Spidra uses AI to locate the element. Both are valid. selector is faster. value is more resilient to layout changes.
