eBay is a different kind of scraping target from most e-commerce sites. An Amazon product page centres on a single featured offer. On eBay, dozens of sellers can list the same product, each with their own price, condition, shipping terms, and seller reputation.
That seller layer is where a lot of the value sits — understanding who is selling what, at what price, with what feedback history, tells you things about a market that a simple price comparison does not.
This guide covers the Spidra API from the ground up for eBay. Not just how to make a request and get data back, but how the async job pattern works, how to build schemas that give you consistent structured output, how to handle the quirks specific to eBay's page structure, and how to build a production-grade pipeline that collects item URLs from search results and enriches them with full item page data.
All the code in this guide was tested against real eBay URLs. The output examples are from actual requests, not invented data.
What you can extract from eBay
Before writing any code it is worth mapping out what data actually exists on eBay's public pages, because it shapes how you design your requests and schemas.
Item pages — ebay.com/itm/{item_number} — are the richest source. A single item page gives you:
- Title
- Current price and original price if on sale
- Currency
- Condition (Brand New, Pre-Owned, Parts Only, etc.)
- Availability (how many units the seller has listed)
- eBay item number
- Seller username
- Seller feedback score as a percentage
- Seller feedback count
- Shipping cost and type
- Whether returns are accepted
- Seller location
- Product images (full-size webp)
- Specification table (label/value pairs: Brand, Type, DPI, Connectivity, etc.)
- Product description (available on listings that show it outside an iframe)
Search results pages (ebay.com/sch/i.html?_nkw={keyword}) give you shallower data per listing but across 40 items per page:
- Title
- Item URL
- Price
- Condition
- Shipping cost
- Sold count (on high-volume listings only)
- Thumbnail
Seller pages give you a seller's full feedback history and listing catalogue. Not covered in depth here but the same API approach applies.
How the Spidra API works
Spidra uses an async job pattern. You do not send a request and wait for the data to come back in the same response. Instead, you submit a job and receive a job ID immediately. The scraping runs in the background and you poll a status endpoint until the job is done.
This pattern is deliberate. Amazon and eBay pages take time to fully render, especially when residential proxies are routing the request through a specific geography. A synchronous request would time out. The async pattern lets you submit a batch, do other work, and collect results when they are ready.
The four endpoints you will use for eBay work:
POST https://api.spidra.io/api/scrape (submit a single-URL job)
GET https://api.spidra.io/api/scrape/{jobId} (poll for the result)
POST https://api.spidra.io/api/batch/scrape (submit up to 50 URLs at once)
GET https://api.spidra.io/api/batch/scrape/{batchId} (poll for the batch result)
Authentication uses the x-api-key header. Every request needs it.
Getting your API key
Sign up at app.spidra.io. The free plan gives you 300 credits with no card required. After signing in, go to Settings → API Keys and create a new key.
Store the key as an environment variable. Never put it in source code.
export SPIDRA_API_KEY="spd_YOUR_KEY_HERE"
For the Python examples below, the key is read from the environment:
import os
API_KEY = os.environ["SPIDRA_API_KEY"]
Your first eBay request
The target throughout this guide is a real wireless mouse listing that was tested and confirmed working:
https://www.ebay.com/itm/125575167955
Note: the URL you click from search results often looks like https://www.ebay.com/itm/125575167955?_skw=mouse+wireless&hash=item.... Strip the ? and everything after it. The clean /itm/{number} format is more reliable and is what you should store.
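The clean-up described in the note can be done with the standard library. This is a minimal sketch: the function name clean_item_url is illustrative, and the optional title segment in the pattern is an assumption about URL variants you may meet, not something this guide tested.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Numeric item number in an /itm/ path. The optional middle segment allows for
# URLs that carry a title slug before the number (an assumption about variants).
ITEM_PATH = re.compile(r"/itm/(?:[^/]+/)?(\d+)")


def clean_item_url(raw_url: str) -> Optional[str]:
    """Return the bare /itm/{number} URL, or None when no item number is present."""
    parts = urlsplit(raw_url)
    match = ITEM_PATH.search(parts.path)
    if match is None:
        return None
    host = parts.netloc or "www.ebay.com"  # fall back for relative links
    return f"https://{host}/itm/{match.group(1)}"
```

Keeping the host from the original URL means the same helper works for regional domains such as ebay.co.uk.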
Submitting the job
Send a POST request to the scrape endpoint with the item URL, your prompt, and the proxy settings. The request returns within a second or two — you get a job ID, not the data itself.
curl -X POST https://api.spidra.io/api/scrape \
-H "x-api-key: $SPIDRA_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"urls": [{"url": "https://www.ebay.com/itm/125575167955"}],
"prompt": "Extract the full product listing details",
"output": "json",
"useProxy": true,
"proxyCountry": "us"
}'
The response comes back immediately:
{
"jobId": "550e8400-e29b-41d4-a716-446655440000",
"status": "queued",
"message": "Job queued. Poll /api/scrape/550e8400... for results."
}
Polling for the result
Use the job ID from the submit response to check the status. Keep polling every few seconds until status is "completed" — at that point result.content holds your extracted data.
curl https://api.spidra.io/api/scrape/550e8400-e29b-41d4-a716-446655440000 \
-H "x-api-key: $SPIDRA_API_KEY"
The Python examples in this guide poll every 3 seconds.
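A bare polling loop never gives up if a job stalls. The sketch below adds a deadline and treats "failed" as an error. It takes the status lookup as a callable so it is independent of any HTTP client. The name wait_for_job and the 300-second default are assumptions for illustration, not part of the Spidra API.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval: float = 3.0,
    timeout: float = 300.0,  # assumed ceiling; tune to your own jobs
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job completes, fails, or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"Job failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {status!r} after {timeout} seconds")
        sleep(interval)
```

With the requests library, fetch_status would be a small lambda that issues the GET request shown above and returns the parsed JSON body.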
Understanding prompts and schemas
Spidra uses an AI extraction model to read the rendered page and pull out the data you want. The prompt tells it what to extract in plain English. The schema tells it what shape to return the data in.
- Without a schema, the AI decides the shape. You get something back, but the field names and structure vary between requests. That is fine for exploration but breaks any code that expects consistent field names.
- With a schema, every required field always appears in the output, as null if the page does not have that value. The structure is guaranteed regardless of how different two item pages might be. This is what you want for any pipeline that writes to a database or downstream service.
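Before writing results to a database it can help to check that guarantee rather than assume it. The following is a minimal sketch using only the standard library. The helper names are illustrative, and it checks key presence only; a full validator such as the jsonschema package would also check types.

```python
def missing_required(result: dict, schema: dict) -> list:
    """Required keys that are absent from the result (a null value counts as present)."""
    return [key for key in schema.get("required", []) if key not in result]


def null_fields(result: dict, schema: dict) -> list:
    """Schema properties that are present in the result but came back as null."""
    return [
        key
        for key in schema.get("properties", {})
        if key in result and result[key] is None
    ]
```

A non-empty list from missing_required suggests the response should be retried or logged, while null_fields shows which values the page simply did not contain.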
Building the item schema
Here is the full schema for an eBay item page:
ITEM_SCHEMA = {
"type": "object",
"required": [
"title", "price", "condition", "availability"
],
"properties": {
"title": {"type": "string"},
"price": {"type": ["number", "null"]},
"original_price": {"type": ["number", "null"]},
"currency": {"type": ["string", "null"]},
"condition": {"type": ["string", "null"]},
"availability": {"type": "string"},
"item_number": {"type": ["string", "null"]},
"seller_name": {"type": ["string", "null"]},
"seller_feedback": {"type": ["number", "null"]},
"seller_feedback_count": {"type": ["integer", "null"]},
"shipping_cost": {"type": ["string", "null"]},
"returns_accepted": {"type": ["boolean", "null"]},
"location": {"type": ["string", "null"]},
"images": {
"type": "array",
"items": {"type": "string"}
},
"features": {
"type": "array",
"items": {
"type": "object",
"properties": {
"label": {"type": "string"},
"value": {"type": "string"}
}
}
},
"description": {"type": ["string", "null"]}
}
}
Note: If writing schemas by hand feels tedious, Spidra has a free JSON Schema Generator.
Scraping a single eBay item page
This version uses Python's requests library with no SDK dependency. It handles the full submit-poll cycle manually, which gives you explicit control over timing and error handling.
import requests, time, json, os
API_KEY = os.environ["SPIDRA_API_KEY"]
BASE = "https://api.spidra.io/api"
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}
ITEM_SCHEMA = {
"type": "object",
"required": ["title", "price", "condition", "availability"],
"properties": {
"title": {"type": "string"},
"price": {"type": ["number", "null"]},
"original_price": {"type": ["number", "null"]},
"currency": {"type": ["string", "null"]},
"condition": {"type": ["string", "null"]},
"availability": {"type": "string"},
"item_number": {"type": ["string", "null"]},
"seller_name": {"type": ["string", "null"]},
"seller_feedback": {"type": ["number", "null"]},
"seller_feedback_count": {"type": ["integer", "null"]},
"shipping_cost": {"type": ["string", "null"]},
"returns_accepted": {"type": ["boolean", "null"]},
"location": {"type": ["string", "null"]},
"images": {"type": "array", "items": {"type": "string"}},
"features": {
"type": "array",
"items": {
"type": "object",
"properties": {
"label": {"type": "string"},
"value": {"type": "string"}
}
}
},
"description": {"type": ["string", "null"]}
}
}
# Submit the job
resp = requests.post(f"{BASE}/scrape", headers=HEADERS, json={
"urls": [{"url": "https://www.ebay.com/itm/125575167955"}],
"prompt": "Extract the full product listing details. For returns_accepted check for Returns accepted or No returns text near the shipping section. You can get the description from the Item description from the seller section. Extract all visible customer reviews as an array of strings.",
"output": "json",
"useProxy": True,
"proxyCountry": "us",
"schema": ITEM_SCHEMA,
})
resp.raise_for_status()
job_id = resp.json()["jobId"]
print(f"Job submitted: {job_id}")
# Poll until the job completes or fails
while True:
    poll = requests.get(f"{BASE}/scrape/{job_id}", headers=HEADERS)
    poll.raise_for_status()
    result = poll.json()
    print(f"Status: {result['status']}")
    if result["status"] == "completed":
        break
    if result["status"] == "failed":
        # Stop here: a failed job has no result.content to read
        raise SystemExit(f"Job failed: {result.get('error')}")
    time.sleep(3)

# Access the result
item = result["result"]["content"]
print(json.dumps(item, indent=2))
Using the Python SDK
The Python SDK wraps the submit-and-poll cycle for you. run_sync() blocks until the job is complete and returns the result directly:
import os
from spidra import SpidraClient, ScrapeParams, ScrapeUrl
spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])
job = spidra.scrape.run_sync(ScrapeParams(
urls=[ScrapeUrl(url="https://www.ebay.com/itm/125575167955")],
prompt="Extract the full product listing details. For returns_accepted check for Returns accepted or No returns text near the shipping section. You can get the description from the Item description from the seller section. Extract all visible customer reviews as an array of strings.",
output="json",
schema=ITEM_SCHEMA,
use_proxy=True,
proxy_country="us",
))
item = job.result.content
print(item["title"])
print(item["price"], item["currency"])
Install the SDK with pip install spidra. The full SDK reference is at docs.spidra.io/sdks/python.md.
The actual output
This is real output from scraping https://www.ebay.com/itm/125575167955 with proxyCountry: "us":
{
"title": "Portable Wireless Mouse, 2.4GHz Silent with USB Receiver, Optical USB Mouse",
"price": 7.95,
"original_price": null,
"currency": "USD",
"condition": "New",
"availability": "3 available",
"item_number": "125575167955",
"seller_name": "butrad-0",
"seller_feedback": 99.3,
"seller_feedback_count": 15446,
"shipping_cost": "US $15.54",
"returns_accepted": null,
"location": "Flushing, NY, United States",
"images": [
"https://i.ebayimg.com/images/g/jpoAAOSwrf9jVE4k/s-l500.webp",
"https://i.ebayimg.com/images/g/1i8AAOSwSyhjVE5R/s-l500.webp",
"https://i.ebayimg.com/images/g/SqUAAOSwD45jVE5S/s-l500.webp",
"https://i.ebayimg.com/images/g/HSYAAOSwKoBjVE5T/s-l500.webp"
],
"features": [
{"label": "Compatible Brand", "value": "Universal, Microsoft Windows 10/8/7/XP"},
{"label": "Brand", "value": "Unbranded"},
{"label": "Type", "value": "Mini Mouse"},
{"label": "Maximum DPI", "value": "1600"},
{"label": "Connectivity", "value": "Wireless"},
{"label": "Interface", "value": "Bluetooth"},
{"label": "Number of Buttons", "value": "4"},
{"label": "Tracking Method", "value": "Optical"},
{"label": "Country of Origin", "value": "China"}
],
"description": "...",
"reviews": [
"So far, the mouse has worked well. It feels good in my hand, is a nice size, and feels like it's made of quality materials.",
"color is lovely, shipping was fast and it seems to be a working mouse good deal for the $$",
"A simple bluetooth mouse that works fine.",
"Excellent communication, shipping, packaging and was exactly as described. I am a happy camper~!!!!!",
"The QUALITY of this inexpensive usb mouse is OUTSTANDING! It's extremely quiet and extraordinarily silent when I use it."
]
}
Why proxyCountry: "us" matters
eBay serves localised content based on the connecting IP's geographic location. Without a country specified, Spidra picks a residential proxy based on available capacity, so the exit country is not predictable. Depending on which country that proxy is in, you might get:
- Prices in SEK, EUR, GBP, or another currency
- Shipping costs shown as "approx 151.31 SEK" alongside the USD amount
- Different listings surfacing in search results
Setting proxyCountry: "us" locks the request to a US residential IP. For ebay.com this consistently returns USD pricing, US shipping costs, and US-relevant listings.
The same principle applies to other eBay regional domains:
| Marketplace | Domain | proxyCountry |
|---|---|---|
| United States | ebay.com | "us" |
| United Kingdom | ebay.co.uk | "gb" |
| Germany | ebay.de | "de" |
| Australia | ebay.com.au | "au" |
| Canada | ebay.ca | "ca" |
For the proxy scraping docs including the full list of supported countries, see the Spidra product page.
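If you scrape more than one marketplace, the table can live in code so the proxy country is chosen from the URL rather than typed by hand. This is a sketch: the country codes come from the table, while the currency codes are each country's national currency and are an added assumption about what each site displays. The helper names are illustrative.

```python
from typing import Optional
from urllib.parse import urlsplit

# Domain -> (proxyCountry, expected currency). Currency codes are an assumption.
MARKETPLACES = {
    "ebay.com": ("us", "USD"),
    "ebay.co.uk": ("gb", "GBP"),
    "ebay.de": ("de", "EUR"),
    "ebay.com.au": ("au", "AUD"),
    "ebay.ca": ("ca", "CAD"),
}


def _domain(url: str) -> str:
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def proxy_country_for(url: str) -> Optional[str]:
    """Proxy country code for a marketplace URL, or None for an unknown domain."""
    entry = MARKETPLACES.get(_domain(url))
    return entry[0] if entry else None


def currency_mismatch(item: dict, url: str) -> bool:
    """True when the scraped currency differs from the one expected for the domain."""
    entry = MARKETPLACES.get(_domain(url))
    currency = item.get("currency")
    if entry is None or currency is None:
        return False  # nothing to compare against
    return currency.upper() != entry[1]
```

A mismatch on an ebay.com result is a cheap signal that the request went out through a non-US proxy and should be re-run.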
Scraping eBay search results
A single search results page returns around 40 listings. Three pages of results give you about 120 item URLs to enrich with full item page data. The search results schema wraps the listings array in an object because the Spidra API always requires the root type to be "object":
SEARCH_SCHEMA = {
"type": "object",
"required": ["listings"],
"properties": {
"listings": {
"type": "array",
"items": {
"type": "object",
"required": ["title", "url"],
"properties": {
"title": {"type": "string"},
"url": {"type": "string"},
"price": {"type": ["number", "null"]},
"condition": {"type": ["string", "null"]},
"shipping_cost": {"type": ["string", "null"]},
"sold_count": {"type": ["integer", "null"]},
"sponsored": {"type": ["boolean", "null"]},
"thumbnail": {"type": ["string", "null"]}
}
}
}
}
}
REST API
Submit the search page URL the same way you would an item page. The request differs only in the URL, prompt, and schema; the proxy settings and the submit-and-poll flow are identical.
import requests, time, os, json
API_KEY = os.environ["SPIDRA_API_KEY"]
BASE = "https://api.spidra.io/api"
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}
resp = requests.post(f"{BASE}/scrape", headers=HEADERS, json={
"urls": [{"url": "https://www.ebay.com/sch/i.html?_nkw=wireless+mouse"}],
"prompt": "Extract all product listings on this page",
"output": "json",
"useProxy": True,
"proxyCountry": "us",
"schema": SEARCH_SCHEMA,
})
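resp.raise_for_status()
job_id = resp.json()["jobId"]
while True:
    result = requests.get(f"{BASE}/scrape/{job_id}", headers=HEADERS).json()
    if result["status"] == "completed":
        break
    if result["status"] == "failed":
        # Without this check a failed job would be polled forever
        raise SystemExit(f"Job failed: {result.get('error')}")
    time.sleep(3)
listings = result["result"]["content"]["listings"]
print(f"Got {len(listings)} listings")
Python SDK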
If you are using the SDK, run_sync() handles the polling and returns job.result.content once the job is complete. Access the listings array by key.
import os
from spidra import SpidraClient, ScrapeParams, ScrapeUrl
spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])
job = spidra.scrape.run_sync(ScrapeParams(
urls=[ScrapeUrl(url="https://www.ebay.com/sch/i.html?_nkw=wireless+mouse")],
prompt="Extract all product listings on this page",
output="json",
schema=SEARCH_SCHEMA,
use_proxy=True,
proxy_country="us",
))
listings = job.result.content["listings"]
Real output from testing
From an actual test against https://www.ebay.com/sch/i.html?_nkw=wireless+mouse:
{
"listings": [
{
"title": "Wireless Mouse 2.4G Optical Cordless Mice for PC Laptop Computer",
"url": "https://www.ebay.com/itm/166383500910",
"price": 8.88,
"condition": "Brand New",
"shipping_cost": "Free International Shipping",
"sold_count": 2575,
"sponsored": null,
"thumbnail": "https://i.ebayimg.com/images/g/3BkAAOSwL4Jm9RSR/s-l500.webp"
},
{
"title": "Inland ic210 Wireless Keyboard & Mouse Combo",
"url": "https://www.ebay.com/itm/267100499182",
"price": 38.85,
"condition": "Pre-Owned",
"shipping_cost": "+$29.60 shipping",
"sold_count": 52,
"sponsored": null,
"thumbnail": "https://i.ebayimg.com/images/g/6V4AAOSwr8lnYuqN/s-l500.webp"
}
]
}
eBay's pagination structure
eBay uses _pgn= for pagination. Page 2 of the same search is:
https://www.ebay.com/sch/i.html?_nkw=wireless+mouse&_pgn=2
Increment _pgn to walk through pages. There is no reliable "last page" signal from the search results content itself, so control it with a page count parameter in your code.
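Building these URLs with urllib.parse.urlencode keeps keywords that contain spaces or characters such as & correctly escaped. A minimal sketch, with search_url as an illustrative name:

```python
from urllib.parse import urlencode


def search_url(keyword: str, page: int = 1) -> str:
    """Build an ebay.com search URL for a keyword and a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # urlencode turns spaces into + and escapes reserved characters
    return "https://www.ebay.com/sch/i.html?" + urlencode({"_nkw": keyword, "_pgn": page})
```

Calling search_url("wireless mouse", 2) reproduces the page 2 URL shown above.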
Scraping multiple pages of search results
This function iterates through search pages, collects unique item URLs, and deduplicates across pages using a set. Call it with a keyword and a page count and it returns a clean list ready for the batch stage.
import os
from urllib.parse import quote_plus
from spidra import SpidraClient, ScrapeParams, ScrapeUrl

spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

def collect_item_urls(keyword: str, pages: int = 3) -> list[str]:
    urls = []
    seen = set()
    for page in range(1, pages + 1):
        # quote_plus encodes spaces as + and escapes characters such as &
        page_url = (
            "https://www.ebay.com/sch/i.html"
            f"?_nkw={quote_plus(keyword)}&_pgn={page}"
        )
        job = spidra.scrape.run_sync(ScrapeParams(
            urls=[ScrapeUrl(url=page_url)],
            prompt="Extract all product listings on this page",
            output="json",
            schema=SEARCH_SCHEMA,
            use_proxy=True,
            proxy_country="us",
        ))
        if job.result.ai_extraction_failed:
            print(f"Page {page}: AI extraction failed, skipping")
            continue
        for listing in job.result.content.get("listings", []):
            url = listing.get("url")
            if url and url not in seen:
                urls.append(url)
                seen.add(url)
        print(f"Page {page}: {len(urls)} unique item URLs collected so far")
    return urls

item_urls = collect_item_urls("wireless mouse", pages=3)
print(f"\nTotal: {len(item_urls)} unique item URLs")
The ai_extraction_failed check is important. If Spidra's AI could not extract data from the page, for example because eBay served a block page or an unexpected layout, this flag is set to True and job.result.content may be empty. Always check it in production code before processing the result.
The seen set deduplicates across pages. eBay sometimes shows the same listing on multiple search pages, particularly sponsored listings.
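Comparing raw URL strings misses duplicates when the same listing appears with different query strings. One option is to deduplicate on the item number instead. This is a sketch, assuming item URLs follow the /itm/{number} form described earlier; the function name is illustrative.

```python
import re

# Item number in an /itm/ URL; the optional segment covers an assumed title slug
ITEM_NUMBER = re.compile(r"/itm/(?:[^/]+/)?(\d+)")


def dedupe_by_item_number(urls: list) -> list:
    """Keep the first URL seen for each item number, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        match = ITEM_NUMBER.search(url)
        key = match.group(1) if match else url  # fall back to the raw URL
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```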
Batch scraping: 50 item pages in parallel
Once you have a list of item URLs from search, the batch endpoint scrapes up to 50 at a time in parallel. Each URL runs in its own independent worker. You submit once, poll once, and get all results together.
import os, json
from spidra import SpidraClient, BatchScrapeParams
spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])

def scrape_items(item_urls: list[str]) -> list[dict]:
    results = []
    chunk_size = 50  # the batch endpoint accepts at most 50 URLs
    total = -(-len(item_urls) // chunk_size)  # ceiling division
    for i in range(0, len(item_urls), chunk_size):
        chunk = item_urls[i:i + chunk_size]
        batch_num = i // chunk_size + 1
        print(f"Batch {batch_num}/{total}: submitting {len(chunk)} items")
        batch = spidra.batch.run_sync(BatchScrapeParams(
            urls=chunk,
            prompt="Extract the full product listing details. For returns_accepted check for Returns accepted or No returns text near the shipping section. You can get the description from the Item description from the seller section. Extract all visible customer reviews as an array of strings.",
            output="json",
            schema=ITEM_SCHEMA,
            use_proxy=True,
            proxy_country="us",
        ))
        for item in batch.items:
            if item.status == "completed" and item.result:
                results.append(item.result)
            else:
                print(f"  Failed: {item.url}: {item.error}")
        print(f"  {batch.completed_count}/{batch.total_urls} succeeded")
    return results
A few things about this code worth understanding:
batch.items is a list of BatchItem objects. Each one has status, url, result, and error. Check item.status == "completed" before accessing item.result — a failed item has item.result as None.
item.result is the extracted JSON directly. It is a dict matching your schema, not a wrapper object. Accessing item.result["title"] works immediately without any unwrapping.
The chunk_size = 50 limit is the batch endpoint's maximum. For a list of 200 URLs, the loop runs four times, submitting 50 at a time.
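The chunking and the completed/failed check can be pulled into small helpers, which also makes it easy to collect failed URLs for a second pass. This is a sketch: StandInBatchItem is a hypothetical stand-in that carries only the four attributes named above, so the logic can be exercised without the SDK.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class StandInBatchItem:
    """Hypothetical stand-in for the SDK's BatchItem (status, url, result, error)."""
    url: str
    status: str
    result: Optional[dict] = None
    error: Optional[str] = None


def chunked(urls: list, size: int = 50) -> Iterator[list]:
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def split_items(items) -> tuple:
    """Separate extracted results from the URLs that need another attempt."""
    succeeded, failed_urls = [], []
    for item in items:
        if item.status == "completed" and item.result:
            succeeded.append(item.result)
        else:
            failed_urls.append(item.url)
    return succeeded, failed_urls
```

The failed URLs can be passed through chunked again and resubmitted, ideally with a cap on the number of retry rounds.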
The full pipeline: search to structured data
Putting it all together:
import os, json
from spidra import SpidraClient, ScrapeParams, ScrapeUrl, BatchScrapeParams
spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])
# Step 1: collect item URLs from search
item_urls = collect_item_urls("wireless mouse", pages=2)
print(f"Collected {len(item_urls)} item URLs")
# Step 2: scrape each item page in batches
items = scrape_items(item_urls)
print(f"Scraped {len(items)} items")
# Step 3: save as JSONL (one JSON object per line)
with open("ebay_items.jsonl", "w", encoding="utf-8") as f:
    for item in items:
        f.write(json.dumps(item) + "\n")
print("Saved to ebay_items.jsonl")
JSONL is the right format here. Each line is a valid JSON object. You can stream it, append to it, and load subsets without reading the entire file into memory. It also imports cleanly into most databases and data pipelines.
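The append and streaming properties are easy to see in code. A minimal sketch using only the standard library; the helper names are illustrative:

```python
import json
from typing import Iterable, Iterator


def append_jsonl(path: str, records: Iterable[dict]) -> int:
    """Append records to a JSONL file, one object per line. Returns the count written."""
    count = 0
    with open(path, "a", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


def iter_jsonl(path: str) -> Iterator[dict]:
    """Yield one record at a time so the whole file never sits in memory."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

Opening in append mode lets each daily run add to the same file, and the generator lets you filter records, for example by price, without loading everything first.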
Exporting to CSV
For spreadsheet analysis or sharing with non-technical teammates:
import csv, json
with open("ebay_items.jsonl", encoding="utf-8") as f:
    items = [json.loads(line) for line in f if line.strip()]

if not items:
    print("No data to export")
else:
    fieldnames = [
        "title", "price", "currency", "condition", "availability",
        "item_number", "seller_name", "seller_feedback",
        "seller_feedback_count", "shipping_cost", "returns_accepted",
        "location"
    ]
    with open("ebay_items.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(items)
    print(f"Exported {len(items)} rows to ebay_items.csv")
extrasaction="ignore" tells DictWriter to skip fields in the data that are not in fieldnames. This handles the features, images, and description fields cleanly: they stay in your JSONL file for detailed work but do not clutter the CSV.
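If some specifications are wanted in the spreadsheet after all, the features array can be flattened into one column per label before export. This is a sketch of one possible approach; the spec_ prefix, the image_count column, and the function name are illustrative choices, not part of the pipeline above.

```python
def flatten_features(item: dict, prefix: str = "spec_") -> dict:
    """Copy an item without its nested fields, adding one column per feature label."""
    flat = {k: v for k, v in item.items() if k not in ("features", "images")}
    for feature in item.get("features") or []:
        label = feature.get("label")
        if label:
            # "Maximum DPI" becomes the column name "spec_maximum_dpi"
            column = prefix + label.strip().lower().replace(" ", "_")
            flat[column] = feature.get("value")
    flat["image_count"] = len(item.get("images") or [])
    return flat
```

Because different listings carry different labels, the CSV fieldnames would need to be the union of keys across all flattened items.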
Price monitoring across eBay sellers
eBay's seller competition makes it particularly useful for price monitoring. The same product exists across dozens of listings at different prices, conditions, and shipping costs. A daily batch run surfaces price drops, condition changes (Pre-Owned to Brand New at the same price), and new sellers entering a category.
import os, json
from datetime import datetime, timezone
from pathlib import Path
from spidra import SpidraClient, BatchScrapeParams
spidra = SpidraClient(api_key=os.environ["SPIDRA_API_KEY"])
PRICE_SCHEMA = {
"type": "object",
"required": ["item_number", "title", "price", "condition", "availability"],
"properties": {
"item_number": {"type": "string"},
"title": {"type": "string"},
"price": {"type": ["number", "null"]},
"currency": {"type": ["string", "null"]},
"condition": {"type": "string"},
"availability": {"type": "string"},
"seller_name": {"type": ["string", "null"]},
}
}
# Real item numbers confirmed working
WATCHED_ITEMS = [
"125575167955", # Portable Wireless Mouse — $7.95
"125575135033", # Related listing from same seller
"166383500910", # High-volume seller — 2,575 sold
]

def load_previous(path="data/ebay_prices.json") -> dict:
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}

def save_snapshot(data: dict, path="data/ebay_prices.json"):
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    Path(path).write_text(json.dumps(data, indent=2))

def fetch_current_prices(item_numbers: list[str]) -> dict:
    batch = spidra.batch.run_sync(BatchScrapeParams(
        urls=[f"https://www.ebay.com/itm/{n}" for n in item_numbers],
        prompt="Extract the item number, title, current price, condition, and availability",
        output="json",
        schema=PRICE_SCHEMA,
        use_proxy=True,
        proxy_country="us",
    ))
    results = {}
    for item in batch.items:
        if item.status == "completed" and item.result:
            number = item.result.get("item_number")
            if number:
                results[number] = {
                    **item.result,
                    "checked_at": datetime.now(timezone.utc).isoformat(),
                }
    return results

def find_changes(previous: dict, current: dict, threshold_pct: float = 3.0) -> list[dict]:
    changes = []
    for number, data in current.items():
        curr_price = data.get("price")
        prev_data = previous.get(number, {})
        prev_price = prev_data.get("price")
        # Skip items with no baseline price or no current price
        if curr_price is None or prev_price is None or prev_price == 0:
            continue
        change_pct = ((curr_price - prev_price) / prev_price) * 100
        if abs(change_pct) >= threshold_pct:
            changes.append({
                "item_number": number,
                "title": (data.get("title") or "")[:70],
                "seller": data.get("seller_name"),
                "condition": data.get("condition"),
                "prev_price": prev_price,
                "curr_price": curr_price,
                "change_pct": round(change_pct, 1),
                "direction": "up" if change_pct > 0 else "down",
            })
    return sorted(changes, key=lambda x: abs(x["change_pct"]), reverse=True)

# Run the monitor
print(f"Checking {len(WATCHED_ITEMS)} items...")
previous = load_previous()
current = fetch_current_prices(WATCHED_ITEMS)
# Merge into the old snapshot so an item whose scrape failed today keeps its last known price
save_snapshot({**previous, **current})
changes = find_changes(previous, current)
if changes:
    print(f"\n{len(changes)} price changes detected:")
    for c in changes:
        arrow = "+" if c["direction"] == "up" else ""
        print(
            f"  {c['title']}\n"
            f"    Seller: {c['seller']} | Condition: {c['condition']}\n"
            f"    ${c['prev_price']} to ${c['curr_price']} ({arrow}{c['change_pct']}%)\n"
        )
else:
    print("\nNo significant price changes")
print(f"\nDone. {len(current)} items checked.")
Run this on a schedule (daily with cron or a job queue) and you have a live price intelligence feed across any eBay category.
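The monitor above compares prices only. The condition changes and new sellers mentioned at the start of this section can be read from the same two snapshots. This is a minimal sketch, assuming the snapshot layout produced by fetch_current_prices (item number mapped to a dict with condition and seller_name keys); both function names are illustrative.

```python
def find_condition_changes(previous: dict, current: dict) -> list:
    """Items whose condition text differs between the two snapshots."""
    changes = []
    for number, data in current.items():
        prev = previous.get(number)
        if prev is None:
            continue  # no baseline to compare against
        old, new = prev.get("condition"), data.get("condition")
        if old and new and old != new:
            changes.append({
                "item_number": number,
                "from": old,
                "to": new,
                "price": data.get("price"),
            })
    return changes


def find_new_sellers(previous: dict, current: dict) -> list:
    """Seller names present in the current snapshot but not in the previous one."""
    before = {d.get("seller_name") for d in previous.values()}
    now = {d.get("seller_name") for d in current.values()}
    return sorted(name for name in now - before if name)
```

New sellers only show up if the watch list itself grows, for example when it is refreshed from search results rather than fixed by hand.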
