Data Enrichment

Feed in a list.
Get back a database.

Use the Spidra API to loop through any list of URLs, perform AI extraction, handle proxies, and follow multi-hop links. You own the orchestration. We handle the hard parts.

300 free credits included. No credit card required.

Input: your URL list
booking.com/hotel/br/grand-hyatt
eventbrite.com/o/houston-arts-42417
zillow.com/homedetails/123-main-st
amazon.com/dp/B08N5WRWNW

Spidra enriches

Output: structured fields
name: "Grand Suite Ocean View"
sizeM2: 82
price: $1,249,000
rating: 4.7
inStock: true

Perfect for

Travel and hospitality

Tour operators, OTAs, and hotel chains that maintain large property catalogs. Extract room specs, amenities, dining, wellness, and location data from Booking.com, IHG, Marriott, or any hotel website.

Sales and marketing teams

Enrich existing contact lists with emails, phone numbers, social links, and company details from organizer pages, business directories, and external websites.

E-commerce and product teams

Populate product databases with specs, prices, descriptions, and images from supplier sites or competitor pages. Keep catalog data fresh without manual data entry or expensive third-party feeds.

Real estate and finance

Aggregate property listings, valuation data, planning permits, and neighborhood stats from multiple sources. Normalize every record to the same schema so your models and dashboards always have complete, consistent data.

How it works

Four steps from a raw list of URLs to a fully populated dataset.

01. Start with your seed list

Start with any URL list. Read a CSV in your script, hardcode an array, or pull from a database. Pass each URL to the Spidra API and it works from whatever you have.
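
As a rough sketch of this first step, the seed list could be built from CSV text as shown below. The `url` column name and the `parseSeedList` helper are assumptions made for illustration and are not part of the Spidra API; the sketch also assumes the CSV has a header row and no quoted commas.

```javascript
// Hypothetical helper: turns CSV text into a de-duplicated list of absolute URLs.
// Assumes a header row with a "url" column and no quoted commas in the data.
function parseSeedList(csvText, column = "url") {
  const lines = csvText.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split(",").map((h) => h.trim());
  const index = header.indexOf(column);
  if (index === -1) throw new Error(`Column "${column}" not found`);

  const seen = new Set();
  for (const line of lines.slice(1)) {
    const raw = (line.split(",")[index] || "").trim();
    if (raw === "") continue;
    // The example URLs on this page omit the scheme, so default to https.
    const url = /^https?:\/\//i.test(raw) ? raw : `https://${raw}`;
    seen.add(url);
  }
  return [...seen];
}

const seeds = parseSeedList(
  "id,url\n1,booking.com/hotel/br/grand-hyatt\n2,amazon.com/dp/B08N5WRWNW\n3,amazon.com/dp/B08N5WRWNW"
);
console.log(seeds);
```

The duplicate Amazon row collapses to one entry, so each URL is sent for enrichment only once.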

02. Define your schema

Describe the fields you want as a text prompt, or pass a JSON Schema. Spidra locks the output shape so every record comes back with the same fields, every time.
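
To illustrate the JSON Schema option, a schema for a hotel room record might look like the sketch below. The field names follow the sample output further down this page. The names `roomSchema` and `emptyRecord` are hypothetical, and since this page does not show how a schema is attached to a request, no request parameter is assumed here.

```javascript
// Illustrative JSON Schema for one hotel room. Field names follow the
// sample output on this page; nullable types allow "not found" values.
const roomSchema = {
  type: "object",
  properties: {
    name: { type: ["string", "null"] },
    sizeM2: { type: ["number", "null"] },
    view: { type: ["string", "null"] },
    balcony: { type: ["boolean", "null"] },
  },
  required: ["name", "sizeM2", "view", "balcony"],
  additionalProperties: false,
};

// Hypothetical helper: the all-null record this schema implies, i.e. the
// shape every result should have even when nothing was found.
function emptyRecord(schema) {
  return Object.fromEntries(Object.keys(schema.properties).map((key) => [key, null]));
}

console.log(emptyRecord(roomSchema));
```

Marking every field as required but nullable is one way to express "same fields, every time" while still allowing missing values.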

03. Spidra follows the chain

Most data lives across multiple pages. Spidra clicks into modals, follows links, visits external websites, and resolves redirects. All automatic, no extra code from you.

04. Get normalized JSON

Every field is extracted, normalized, and returned as clean JSON. Null means not found. The shape never changes. Plug it straight into your database, CRM, or pipeline.
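
Because null is reserved for "not found", a consumer can separate missing values from legitimate falsy ones such as `false` or `0`. A minimal sketch, with `missingFields` as a hypothetical helper name:

```javascript
// Hypothetical helper: lists the fields that came back as null (not found).
// Uses a strict null check so that false, 0, and "" count as real values.
function missingFields(record) {
  return Object.keys(record).filter((key) => record[key] === null);
}

const room = { name: "Grand Suite Ocean View", sizeM2: 82, balcony: false, view: null };
console.log(missingFields(room)); // only "view" is missing; balcony: false is a real value
```

A list like this can drive a retry queue or a data-quality report before records are written to a database or CRM.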

Multi-hop extraction

Real-world data rarely lives on a single page. Spidra follows every link in the chain until it has everything you asked for.

Hotel content pipeline

Hotel page

https://booking.com/hotel/br/grand-hyatt-rio

Opens page, scrolls to availability table

Room modals (forEach)

Clicks each room category link

Extracts name, size, view, amenities per room

Parallel crawls

8 simultaneous category extractions

Dining, wellness, sport, facilities, services, kids, location, basic

Structured output

Full hotel profile, normalized to schema

{ rooms: [...], dining: {...}, wellness: {...}, location: {...} }

Contact enrichment pipeline

Event page

https://eventbrite.com/e/event-123

Extracts event name, date, organizer name and profile link

Organizer profile

https://eventbrite.com/o/organizer-456

Extracts website URL, Facebook page, follower count, total events

Organizer website

Tries homepage, /contact, /about

Extracts email, phone, address — falls back to Facebook if missing

Structured output

CRM-ready record, all fields filled

{ email: "...", phone: "...", address: "...", followers: 2400 }
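
The fallback logic in this pipeline can be sketched in plain JavaScript. This is an illustration of the idea only, not Spidra's internal implementation; the helper names are hypothetical and the email address is a placeholder.

```javascript
// Candidate pages to check on an organizer's site, in the order listed above.
function candidateContactUrls(website) {
  const base = /^https?:\/\//i.test(website) ? website : `https://${website}`;
  return ["/", "/contact", "/about"].map((path) => new URL(path, base).href);
}

// Fill only the fields still missing (null or undefined) from a fallback source.
function mergeWithFallback(primary, fallback) {
  const merged = { ...primary };
  for (const [key, value] of Object.entries(fallback)) {
    if (merged[key] == null) merged[key] = value;
  }
  return merged;
}

// Placeholder data: the website had a phone number but no email.
const fromWebsite = { email: null, phone: "(713) 555-0182", address: null };
const fromFacebook = { email: "info@example.org", phone: "(713) 555-0000", address: null };

console.log(candidateContactUrls("houstonarts.org"));
console.log(mergeWithFallback(fromWebsite, fromFacebook));
```

Values found on the organizer's own website win; the fallback source only fills gaps, and a field missing from both sources stays null.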

Developer API

Build your enrichment pipeline with a few API calls.

No scraper maintenance. No fragile selectors. Just describe what you need and Spidra handles the browser, the AI, the proxies, and the extraction.

Batch any number of URLs in parallel
forEach opens modals and collapsed sections automatically
Proxy rotation built in for geo-restricted sources
Returns consistent JSON schema every single run
// Hotel content enrichment with forEach + schema
// Assumption: the key is read from an environment variable named SPIDRA_API_KEY.
const API_KEY = process.env.SPIDRA_API_KEY;

const res = await fetch("https://api.spidra.io/api/scrape", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": API_KEY,
  },
  body: JSON.stringify({
    urls: [{
      url: "https://www.booking.com/hotel/br/grand-hyatt-rio.de.html",
      // forEach: find each room link, click it, and extract per-room details
      actions: [{
        type: "forEach",
        observe: "Find all clickable room category links in the availability table",
        mode: "click",
        itemPrompt: "Extract room name, size in m2, view, bathroom type, and amenities"
      }]
    }],
    prompt: "Extract all room details. Normalize sizes to m2.",
    output: "json",
    // Route the request through a proxy located in Germany
    useProxy: true,
    proxyCountry: "de",
  }),
});

// The response carries a job ID for the submitted scrape
const { jobId } = await res.json();
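
The request above returns a `jobId`, but this page does not show how results are retrieved. One common pattern is to poll until the job finishes. The sketch below is a generic helper under stated assumptions: the status check is passed in as a function because the status endpoint is not documented here, and the `"completed"` and `"failed"` status values are guesses rather than confirmed Spidra behavior.

```javascript
// Generic polling helper. `checkStatus` is supplied by the caller because the
// status endpoint is not shown on this page; the "completed" / "failed"
// status values are assumptions, not documented Spidra behavior.
async function pollUntilDone(checkStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await checkStatus();
    if (job.status === "completed") return job;
    if (job.status === "failed") throw new Error("Job failed");
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job not finished after ${maxAttempts} attempts`);
}

// Usage with a stand-in status function that completes on the third check.
let checks = 0;
const fakeStatus = async () => ({ status: ++checks < 3 ? "pending" : "completed" });
pollUntilDone(fakeStatus, { intervalMs: 10 }).then((job) => console.log(job.status));
```

In a real script, `checkStatus` would wrap whatever call the Spidra documentation specifies for looking up a job by its ID.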

Built for real pipelines

Two examples of what teams build with the Spidra API. Same API, different pipelines.

Enterprise

Hotel content pipeline for a large catalog

A tour operator managing a large hotel catalog needs structured facts across every property: rooms, amenities, dining, wellness, and location. Extracted from Booking.com and direct hotel sites, normalized to an internal content schema.

Source: Booking.com, IHG, Marriott, direct hotel sites
Extraction: forEach for room modals + 8 parallel crawls per hotel
Categories: Rooms, dining, wellness, sport, facilities, services, kids, location
Output: Structured JSON, normalized to internal schema
Scale: Handles 100k+ hotels, quarterly refresh

Sample output — one room
{
  "name": "Grand Suite Ocean View",
  "sizeM2": 82,
  "view": "sea",
  "accommodationType": "suite",
  "bathroom": "both",
  "airConditioning": true,
  "minibar": true,
  "balcony": true,
  "safe": true,
  "coffeeTea": true
}
Sales automation

Contact enrichment for an event organizer database

A sales automation team has a list of Eventbrite organizer URLs with partial data. They need email, phone, address, and social links filled in across thousands of records and exported as a CRM-ready dataset in a single automated run.

Source: Eventbrite organizer pages + external websites + Facebook
Extraction: 4-hop chain: event → organizer → website → social fallback
Fields: Email, phone, address, social links, follower count, event count
Output: CRM-ready JSON exported to CSV
Scale: 4,500 records per run, skips already-enriched rows

Sample output — one organizer
{
  "organizer_name": "Houston Arts Collective",
  "email": "[email protected]",
  "phone": "(713) 555-0182",
  "website": "houstonarts.org",
  "facebook": "fb.com/houstonarts",
  "follower_count": 3200,
  "total_events": 47
}
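
The "JSON exported to CSV" step happens in your own script. A minimal sketch of that export, assuming flat records like the one above (`toCsv` is a hypothetical helper, not a Spidra function):

```javascript
// Hypothetical helper: serializes enriched records to CSV text.
// Values containing commas, quotes, or newlines are quoted (RFC 4180 style);
// null or undefined becomes an empty cell.
function toCsv(records, columns) {
  const cell = (value) => {
    if (value === null || value === undefined) return "";
    const text = String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const rows = records.map((record) => columns.map((col) => cell(record[col])).join(","));
  return [columns.join(","), ...rows].join("\n");
}

const csv = toCsv(
  [{ organizer_name: "Houston Arts Collective", phone: "(713) 555-0182", follower_count: 3200, email: null }],
  ["organizer_name", "email", "phone", "follower_count"]
);
console.log(csv);
```

Passing the column list explicitly keeps the CSV header stable even if a record is missing a key.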

How Spidra compares

See how Spidra stacks up for large-scale data enrichment.

Compared across: Spidra, manual entry, data vendors, and build your own.

Features compared:
Multi-hop extraction (follows links)
Real-time data from source
Custom schema per record type
JavaScript rendering + modal clicks
Proxy rotation built in
Scales to 100k+ records
No infrastructure to maintain
Works on any website

FAQ

Common questions about data enrichment with Spidra.

Yes. In your script, read the records you already have and pass each URL to the Spidra API. Skip rows that are already complete by checking before you call. Spidra fetches and fills only what you send it. This is how the contact enrichment example works: read a CSV, skip rows that already have an email, enrich the rest.
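
The skip-then-enrich approach described above might be sketched as follows. The `urls`, `prompt`, and `output` fields mirror the API example on this page; the `buildEnrichmentBody` helper, the row shape, and the placeholder email are assumptions for illustration.

```javascript
// Hypothetical helper: keeps only rows that still lack an email and maps them
// to the `urls: [{ url }]` request shape used in the API example on this page.
function buildEnrichmentBody(rows, prompt) {
  const pending = rows.filter((row) => !row.email || row.email.trim() === "");
  return {
    pending,
    body: { urls: pending.map((row) => ({ url: row.url })), prompt, output: "json" },
  };
}

const rows = [
  { url: "https://eventbrite.com/o/organizer-456", email: "" },
  { url: "https://eventbrite.com/o/houston-arts-42417", email: "already@example.org" },
];
const { pending, body } = buildEnrichmentBody(rows, "Extract email, phone, and address.");
console.log(pending.length, JSON.stringify(body));
```

Only the first row is sent, so no request is made for a record that is already complete.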

Stop filling in data by hand.

Use the API to loop through your URLs. Get back a complete, structured dataset. 300 free credits to start.

We build features around real workflows. Usually within days.