eBay's listing data is one of the more useful public datasets in e-commerce. Prices, seller reputation, item condition, shipping terms, and product specifications are all publicly visible on every listing without a login.
For price monitoring, market research, or building a product catalogue, eBay gives you more seller-level signal than most other platforms because dozens of sellers compete on the same product simultaneously.
This guide covers the Spidra API for eBay scraping in JavaScript and Node.js. It follows the same structure as our Python guide and covers everything from a single item page through to a full batch pipeline.
What you can extract from eBay
Before writing any code it is worth understanding what data actually lives on eBay's public pages, because it shapes how you design your requests and schemas.
- Item pages at ebay.com/itm/{item_number} are the richest source. A single item page gives you the title, current price, original price if on sale, currency, condition, availability, eBay item number, seller username, seller feedback percentage and count, shipping cost, returns policy, location, product images, the item specifics table, and the seller's description.
- Search results pages at ebay.com/sch/i.html?_nkw={keyword} return around 40 listing cards per page with shallower data: title, URL, price, condition, shipping cost, sold count on high-volume listings, and thumbnails.
The production pattern is to collect item URLs from search results and then enrich each one with a full item page scrape. We cover both and then combine them.
Prerequisites
You need Node.js 18 or higher for native fetch. Install the Spidra SDK:
npm install spidra
Get your API key at app.spidra.io under Settings → API Keys. The free plan gives you 300 credits with no card required. Store the key as an environment variable and never hardcode it:
export SPIDRA_API_KEY="YOUR_API_KEY"
How the Spidra API works
Spidra uses an async job pattern. You do not send a request and get data back in the same call. Instead, you submit a job and get a jobId back immediately.
The scraping runs in the background while a real browser renders the page, residential proxies route the request, and the AI extraction model reads the fully rendered content. You then poll a status endpoint until the job completes and collect the result.
The endpoints you need for eBay:
POST https://api.spidra.io/api/scrape - submit a single-URL job
GET https://api.spidra.io/api/scrape/{jobId} - poll for result
POST https://api.spidra.io/api/batch/scrape - submit up to 50 URLs at once
GET https://api.spidra.io/api/batch/scrape/{batchId} - poll batch result
Every request needs an x-api-key header with your API key.
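The submit-then-poll flow described above can be sketched as a small helper that gives up after a deadline instead of polling forever. This is a minimal sketch, not part of the Spidra SDK: `pollJob`, `intervalMs`, `timeoutMs`, and `fetchImpl` are hypothetical names introduced here, and the status values and response fields simply mirror the ones used elsewhere in this guide.

```javascript
// Sketch: poll a single-URL job until it finishes or a deadline passes.
// `fetchImpl` is injectable so the helper can be exercised without the network.
async function pollJob(jobId, {
  apiKey,
  base = 'https://api.spidra.io/api',
  intervalMs = 3000,
  timeoutMs = 120000,
  fetchImpl = fetch,
} = {}) {
  const deadline = Date.now() + timeoutMs
  while (Date.now() < deadline) {
    const res = await fetchImpl(`${base}/scrape/${jobId}`, {
      headers: { 'x-api-key': apiKey },
    })
    if (!res.ok) throw new Error(`Poll failed: HTTP ${res.status}`)
    const data = await res.json()
    if (data.status === 'completed') return data.result
    if (data.status === 'failed') throw new Error(`Job failed: ${data.error}`)
    // Any other status is treated as "still running" (an assumption)
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
  throw new Error(`Job ${jobId} did not finish within ${timeoutMs} ms`)
}
```

The deadline matters in scheduled jobs, where a stuck poll would otherwise hold the process open indefinitely.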
Your first eBay request
The item we use throughout this guide is a real listing confirmed working during testing:
https://www.ebay.com/itm/125575167955
When you click through from eBay search results the URL picks up tracking parameters like ?_skw=mouse+wireless&hash=item.... Strip the ? and everything after it. The clean /itm/{number} format is more reliable to store and construct programmatically.
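Stripping the tracking parameters is easy to automate. The helper below is a sketch using the standard URL class; `cleanItemUrl` is a hypothetical name, and the optional title segment before the item number is an assumption about URL variants you may encounter rather than something this guide has verified.

```javascript
// Sketch: reduce any eBay item URL to the clean /itm/{number} form.
// Returns null when the path contains no item number.
// Note: new URL() throws on strings that are not valid absolute URLs.
function cleanItemUrl(rawUrl) {
  const { hostname, pathname } = new URL(rawUrl)
  // Accept /itm/123 and, as an assumption, /itm/some-title/123
  const match = pathname.match(/\/itm\/(?:[^/]+\/)?(\d+)/)
  if (!match) return null
  return `https://${hostname}/itm/${match[1]}`
}

console.log(cleanItemUrl('https://www.ebay.com/itm/125575167955?_skw=mouse+wireless'))
// → https://www.ebay.com/itm/125575167955
```

Because the query string is ignored entirely, the same listing reached through different searches collapses to one URL.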
Understanding prompts and schemas
Spidra uses an AI extraction model to read the rendered page and pull out the data you want. The prompt tells it what to extract. The schema tells it what shape to return the data in.
Without a schema the AI decides the structure itself. Field names and nesting can vary between requests, which breaks any downstream code that expects consistent keys. With a schema every required field always appears in the output, as null if the page does not have that value. The output shape is guaranteed regardless of how different two item pages might be.
The prompt also helps the AI locate data in specific parts of the page. eBay's seller description lives at the bottom of the listing in a section separate from the main product details. Without guidance the AI can miss it.
The schema
Here is the full schema for an eBay item page:
const ITEM_SCHEMA = {
  type: 'object',
  required: ['title', 'price', 'condition', 'availability'],
  properties: {
    title: { type: 'string' },
    price: { type: ['number', 'null'] },
    original_price: { type: ['number', 'null'] },
    currency: { type: ['string', 'null'] },
    condition: { type: ['string', 'null'] },
    availability: { type: 'string' },
    item_number: { type: ['string', 'null'] },
    seller_name: { type: ['string', 'null'] },
    seller_feedback: { type: ['number', 'null'] },
    seller_feedback_count: { type: ['integer', 'null'] },
    shipping_cost: { type: ['string', 'null'] },
    returns_accepted: { type: ['boolean', 'null'] },
    location: { type: ['string', 'null'] },
    images: {
      type: 'array',
      items: { type: 'string' }
    },
    features: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          label: { type: 'string' },
          value: { type: 'string' }
        }
      }
    },
    description: { type: ['string', 'null'] }
  }
}
features is an array of {label, value} objects rather than a flat array of strings. eBay's item specifics table pairs attributes with values: Brand/Unbranded, Type/Mini Mouse, Maximum DPI/1600. Flattening those into strings loses which label belongs to which value. The object structure keeps that relationship intact.
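Once the pairs are intact, turning them into a lookup is a one-liner. The sketch below assumes the {label, value} shape from the schema; `featuresToMap` is a hypothetical helper name.

```javascript
// Sketch: convert the features array into a plain label -> value lookup.
// If a label appears twice, the last value wins.
function featuresToMap(features) {
  return Object.fromEntries(
    (features ?? [])
      .filter(f => f && f.label)
      .map(f => [f.label, f.value])
  )
}

const specs = featuresToMap([
  { label: 'Brand', value: 'Unbranded' },
  { label: 'Maximum DPI', value: '1600' },
])
console.log(specs['Maximum DPI']) // → 1600
```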
Only four fields are in required: title, price, condition, and availability. Every other scalar field is nullable. Not every listing has an original price, not every listing shows a returns policy in readable text, and not every listing has a seller description outside the iframe. Making those fields nullable means the schema works across all listing types without failing when a field is absent.
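If downstream code depends on the required keys, a cheap defensive check before using a result can catch surprises early. This is a sketch only: `checkRequired` is a hypothetical helper, it looks at top-level keys and nothing else, and it is not a substitute for a full JSON Schema validator.

```javascript
// Sketch: report which required keys are absent and which are present but null.
function checkRequired(schema, data) {
  const required = schema.required ?? []
  return {
    missing: required.filter(key => data?.[key] === undefined),
    nulls: required.filter(key => data?.[key] === null),
  }
}

const report = checkRequired(
  { required: ['title', 'price', 'condition', 'availability'] },
  { title: 'Wireless Mouse', price: null, condition: 'New' }
)
console.log(report) // missing: ['availability'], nulls: ['price']
```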
Using the REST API
Node.js 18 has native fetch so no additional HTTP library is needed. This function handles the full submit-and-poll cycle:
const API_KEY = process.env.SPIDRA_API_KEY
const BASE = 'https://api.spidra.io/api'
const HEADERS = {
  'x-api-key': API_KEY,
  'Content-Type': 'application/json',
}
const PROMPT = 'Extract the full product listing details. Get the description from the Item description from the seller section at the bottom of the page. For returns_accepted check for Returns accepted or No returns text near the shipping section.'
async function scrapeItem(url) {
  // Submit the job; the response carries only a jobId
  const submitRes = await fetch(`${BASE}/scrape`, {
    method: 'POST',
    headers: HEADERS,
    body: JSON.stringify({
      urls: [{ url }],
      prompt: PROMPT,
      output: 'json',
      useProxy: true,
      proxyCountry: 'us',
      schema: ITEM_SCHEMA,
    }),
  })
  // Surface HTTP errors (bad key, bad payload) instead of polling an undefined jobId
  if (!submitRes.ok) throw new Error(`Submit failed: HTTP ${submitRes.status}`)
  const { jobId } = await submitRes.json()
  console.log(`Job submitted: ${jobId}`)
  // Poll every 3 seconds until the job completes or fails
  while (true) {
    const res = await fetch(`${BASE}/scrape/${jobId}`, { headers: HEADERS })
    const data = await res.json()
    if (data.status === 'completed') return data.result.content
    if (data.status === 'failed') throw new Error(`Job failed: ${data.error}`)
    await new Promise(r => setTimeout(r, 3000))
  }
}
// Top-level await needs an ES module: a .mjs file or "type": "module" in package.json
const item = await scrapeItem('https://www.ebay.com/itm/125575167955')
console.log(item)
Using the Node.js SDK
The Node.js SDK wraps the submit-and-poll cycle into a single await.
import { SpidraClient } from 'spidra'
const spidra = new SpidraClient({ apiKey: process.env.SPIDRA_API_KEY })
const job = await spidra.scrape.run({
  urls: [{ url: 'https://www.ebay.com/itm/125575167955' }],
  prompt: PROMPT,
  output: 'json',
  schema: ITEM_SCHEMA,
  useProxy: true,
  proxyCountry: 'us',
})
const item = job.result.content
console.log(item.title)
console.log(`$${item.price} ${item.currency}`)
Real output
This is from an actual request against https://www.ebay.com/itm/125575167955 with proxyCountry: 'us':
{
  "title": "Portable Wireless Mouse, 2.4GHz Silent with USB Receiver, Optical USB Mouse",
  "price": 7.95,
  "original_price": null,
  "currency": "USD",
  "condition": "New",
  "availability": "3 available",
  "item_number": "125575167955",
  "seller_name": "butrad-0",
  "seller_feedback": 99.3,
  "seller_feedback_count": 15446,
  "shipping_cost": "US $15.54",
  "returns_accepted": null,
  "location": "Flushing, NY, United States",
  "images": [
    "https://i.ebayimg.com/images/g/jpoAAOSwrf9jVE4k/s-l500.webp",
    "https://i.ebayimg.com/images/g/1i8AAOSwSyhjVE5R/s-l500.webp",
    "https://i.ebayimg.com/images/g/SqUAAOSwD45jVE5S/s-l500.webp",
    "https://i.ebayimg.com/images/g/HSYAAOSwKoBjVE5T/s-l500.webp"
  ],
  "features": [
    { "label": "Brand", "value": "Unbranded" },
    { "label": "Type", "value": "Mini Mouse" },
    { "label": "Maximum DPI", "value": "1600" },
    { "label": "Connectivity", "value": "Wireless" },
    { "label": "Number of Buttons", "value": "4" },
    { "label": "Tracking Method", "value": "Optical" },
    { "label": "Country of Origin", "value": "China" }
  ],
  "description": "..."
}
currency occasionally returns "US" rather than "USD" depending on the page version. Normalise it in code if you need ISO codes:
const CURRENCY_MAP = { 'US': 'USD', '$': 'USD', '£': 'GBP', '€': 'EUR' }
item.currency = CURRENCY_MAP[item.currency] ?? item.currency
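shipping_cost needs similar care, because the schema returns it as free text such as "US $15.54" or "Free International Shipping". If you want to add it to the item price, a small parser helps. The sketch below is an assumption-laden helper (`parseShippingCost` is a hypothetical name) that only handles US-style decimals and treats text starting with "Free" as zero.

```javascript
// Sketch: turn a shipping_cost string into a number, or null if no amount is found.
// Assumes US-style formatting (comma as thousands separator, dot as decimal point).
function parseShippingCost(text) {
  if (text == null) return null
  if (/^\s*free\b/i.test(text)) return 0
  const match = text.replace(/,/g, '').match(/(\d+(?:\.\d+)?)/)
  return match ? Number(match[1]) : null
}

console.log(parseShippingCost('US $15.54'))                   // → 15.54
console.log(parseShippingCost('Free International Shipping')) // → 0
```

Keeping the original string alongside the parsed number is a reasonable choice, since wording such as "Calculated" carries information a number cannot.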
Why proxyCountry: 'us' matters
eBay serves localised content based on the connecting IP's country. Without specifying a country, the proxy routing your request may be in Sweden, Germany, or anywhere else, and you get prices in SEK or EUR with shipping costs shown in local approximations.
Setting proxyCountry: 'us' locks the request to a US residential IP and returns clean USD pricing on ebay.com.
For other eBay regional domains, match the country:
| Domain | proxyCountry |
|---|---|
| ebay.com | 'us' |
| ebay.co.uk | 'gb' |
| ebay.de | 'de' |
| ebay.com.au | 'au' |
| ebay.ca | 'ca' |
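If you scrape several regional sites, the table can be encoded as a lookup so the country is derived from the URL rather than passed by hand. This is a sketch covering only the five domains listed above; `proxyCountryFor` is a hypothetical helper name.

```javascript
// Sketch: pick a proxyCountry value from the eBay domain in a URL.
const PROXY_COUNTRY_BY_DOMAIN = {
  'ebay.com': 'us',
  'ebay.co.uk': 'gb',
  'ebay.de': 'de',
  'ebay.com.au': 'au',
  'ebay.ca': 'ca',
}

function proxyCountryFor(url) {
  const host = new URL(url).hostname.replace(/^www\./, '')
  // Unknown domains return null so the caller decides what to do
  return PROXY_COUNTRY_BY_DOMAIN[host] ?? null
}

console.log(proxyCountryFor('https://www.ebay.co.uk/itm/125575167955')) // → gb
```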
Scraping eBay search results
A single search results page returns around 40 listings. Three pages gives you roughly 120 item URLs to enrich with full item page data.
The search results schema wraps the listings array in an object because the Spidra API requires the root schema type to always be 'object'. If you use the JSON Schema Generator on an array of results, wrap the output before passing it to the API.
const SEARCH_SCHEMA = {
  type: 'object',
  required: ['listings'],
  properties: {
    listings: {
      type: 'array',
      items: {
        type: 'object',
        required: ['title', 'url'],
        properties: {
          title: { type: 'string' },
          url: { type: 'string' },
          price: { type: ['number', 'null'] },
          condition: { type: ['string', 'null'] },
          shipping_cost: { type: ['string', 'null'] },
          sold_count: { type: ['integer', 'null'] },
          sponsored: { type: ['boolean', 'null'] },
          thumbnail: { type: ['string', 'null'] }
        }
      }
    }
  }
}
REST API
Scraping a search page follows exactly the same pattern as an item page. You swap in the search URL and search schema.
const submitRes = await fetch(`${BASE}/scrape`, {
  method: 'POST',
  headers: HEADERS,
  body: JSON.stringify({
    urls: [{ url: 'https://www.ebay.com/sch/i.html?_nkw=wireless+mouse' }],
    prompt: 'Extract all product listing cards on this search results page',
    output: 'json',
    useProxy: true,
    proxyCountry: 'us',
    schema: SEARCH_SCHEMA,
  }),
})
const { jobId } = await submitRes.json()
// poll as normal...
Node.js SDK
With the SDK, pass the same URL and schema you would use in the REST version.
const job = await spidra.scrape.run({
  urls: [{ url: 'https://www.ebay.com/sch/i.html?_nkw=wireless+mouse' }],
  prompt: 'Extract all product listing cards on this search results page',
  output: 'json',
  schema: SEARCH_SCHEMA,
  useProxy: true,
  proxyCountry: 'us',
})
const { listings } = job.result.content
console.log(`Got ${listings.length} listings`)
What comes back
From a real test against https://www.ebay.com/sch/i.html?_nkw=wireless+mouse, 40 listings came back. A sample:
{
  "listings": [
    {
      "title": "Wireless Mouse 2.4G Optical Cordless for PC Laptop Computer",
      "url": "https://www.ebay.com/itm/166383500910",
      "price": 8.88,
      "condition": "Brand New",
      "shipping_cost": "Free International Shipping",
      "sold_count": 2575,
      "sponsored": null,
      "thumbnail": "https://i.ebayimg.com/images/g/3BkAAOSwL4Jm9RSR/s-l500.webp"
    },
    {
      "title": "Inland ic210 Wireless Keyboard and Mouse Combo",
      "url": "https://www.ebay.com/itm/267100499182",
      "price": 38.85,
      "condition": "Pre-Owned",
      "shipping_cost": "+$29.60 shipping",
      "sold_count": 52,
      "sponsored": null,
      "thumbnail": "https://i.ebayimg.com/images/g/6V4AAOSwr8lnYuqN/s-l500.webp"
    }
  ]
}
Collecting item URLs across multiple pages
eBay search pagination uses _pgn= in the query string. Page 2 of any search is the same URL with &_pgn=2 appended. This function collects item URLs across as many pages as you specify and deduplicates across pages so the same listing appearing on multiple pages is only stored once.
async function collectItemUrls(keyword, pages = 3) {
  const urls = []
  const seen = new Set()
  for (let page = 1; page <= pages; page++) {
    const pageUrl = `https://www.ebay.com/sch/i.html?_nkw=${encodeURIComponent(keyword)}&_pgn=${page}`
    console.log(`Collecting page ${page}...`)
    const job = await spidra.scrape.run({
      urls: [{ url: pageUrl }],
      prompt: 'Extract all product listing cards on this search results page',
      output: 'json',
      schema: SEARCH_SCHEMA,
      useProxy: true,
      proxyCountry: 'us',
    })
    if (job.result.aiExtractionFailed) {
      console.warn(`Page ${page}: extraction failed, skipping`)
      continue
    }
    // Keep only the first occurrence of each listing URL
    for (const listing of job.result.content?.listings ?? []) {
      if (listing.url && !seen.has(listing.url)) {
        urls.push(listing.url)
        seen.add(listing.url)
      }
    }
    console.log(`  ${urls.length} unique URLs so far`)
  }
  return urls
}
Batch scraping: 50 item pages in parallel
Once you have a list of item URLs, the batch endpoint processes up to 50 at a time in parallel. Each URL runs in its own independent worker.
async function scrapeItems(itemUrls) {
  const results = []
  const chunkSize = 50
  for (let i = 0; i < itemUrls.length; i += chunkSize) {
    const chunk = itemUrls.slice(i, i + chunkSize)
    const batchNum = Math.floor(i / chunkSize) + 1
    const total = Math.ceil(itemUrls.length / chunkSize)
    console.log(`Batch ${batchNum}/${total}: submitting ${chunk.length} items`)
    const batch = await spidra.batch.run({
      urls: chunk,
      prompt: PROMPT,
      output: 'json',
      schema: ITEM_SCHEMA,
      useProxy: true,
      proxyCountry: 'us',
    })
    for (const item of batch.items) {
      if (item.status === 'completed' && item.result) {
        results.push(item.result)
      } else {
        console.warn(`Failed: ${item.url}`)
        console.warn(`Reason: ${item.error}`)
      }
    }
    console.log(`  ${batch.completedCount}/${batch.totalUrls} succeeded`)
  }
  return results
}
item.result is the extracted data directly, an object matching your schema. Check item.status === 'completed' before accessing it because a failed item has item.result as null. Failed items do not stop the rest of the batch; they run independently.
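Because failures are isolated per item, it is natural to collect the failed URLs and resubmit only those. The sketch below assumes the item shape used in the loop above (status, result, url); `partitionBatchItems` is a hypothetical helper, and whether a retry succeeds depends on why the item failed.

```javascript
// Sketch: split batch items into extracted results and URLs worth retrying.
function partitionBatchItems(items) {
  const succeeded = []
  const failedUrls = []
  for (const item of items) {
    if (item.status === 'completed' && item.result) {
      succeeded.push(item.result)
    } else {
      failedUrls.push(item.url)
    }
  }
  return { succeeded, failedUrls }
}

const { succeeded, failedUrls } = partitionBatchItems([
  { status: 'completed', url: 'https://www.ebay.com/itm/125575167955', result: { title: 'Mouse' } },
  { status: 'failed', url: 'https://www.ebay.com/itm/166383500910', result: null, error: 'timeout' },
])
console.log(succeeded.length, failedUrls) // 1 result, 1 URL to retry
```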
The full pipeline
Search results feed into batch item scraping. Putting both stages together:
import { SpidraClient } from 'spidra'
import { writeFileSync } from 'fs'
const spidra = new SpidraClient({ apiKey: process.env.SPIDRA_API_KEY })
// collectItemUrls and scrapeItems are the functions defined above
const itemUrls = await collectItemUrls('wireless mouse', 2)
console.log(`\nCollected ${itemUrls.length} item URLs`)
const items = await scrapeItems(itemUrls)
console.log(`Scraped ${items.length} items`)
// Use '\n' rather than os.EOL so the file is identical on every platform,
// and end with a newline so later appends start on a fresh line
const jsonl = items.map(item => JSON.stringify(item)).join('\n') + '\n'
writeFileSync('ebay_items.jsonl', jsonl)
console.log('Saved to ebay_items.jsonl')
JSONL stores one JSON object per line. You can stream it, append to it, and load subsets without reading the whole file into memory. It imports cleanly into most databases and data pipelines.
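The format is simple enough that reading and writing it takes two small functions. This is a sketch with hypothetical helper names (`toJsonl`, `parseJsonl`); it works on strings so it can be paired with whatever file or stream API you prefer.

```javascript
// Sketch: serialise an array to JSONL text and parse it back.
function toJsonl(items) {
  // JSON.stringify escapes newlines inside strings, so each object stays on one line
  return items.map(item => JSON.stringify(item)).join('\n') + (items.length ? '\n' : '')
}

function parseJsonl(text) {
  return text
    .split(/\r?\n/)                      // tolerate Windows line endings
    .filter(line => line.trim() !== '')  // skip blank lines and the trailing newline
    .map(line => JSON.parse(line))
}

const text = toJsonl([{ title: 'Mouse', price: 7.95 }, { title: 'Line\nbreak', price: null }])
console.log(parseJsonl(text).length) // → 2
```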
Price monitoring pipeline
eBay's multi-seller model makes it useful for price monitoring. The same product appears across dozens of listings at different prices and conditions. Running a daily batch against a fixed list of item numbers surfaces price drops, sellers adjusting their listings, and condition changes.
import { readFileSync, writeFileSync, existsSync, mkdirSync } from 'fs'
import { dirname } from 'path'
// Assumes the spidra client created earlier in this guide
const PRICE_SCHEMA = {
  type: 'object',
  required: ['item_number', 'title', 'price', 'condition'],
  properties: {
    item_number: { type: 'string' },
    title: { type: 'string' },
    price: { type: ['number', 'null'] },
    currency: { type: ['string', 'null'] },
    condition: { type: 'string' },
    availability: { type: ['string', 'null'] },
    seller_name: { type: ['string', 'null'] },
  }
}
// Confirmed working item numbers from testing
const WATCHED_ITEMS = [
  '125575167955', // Portable Wireless Mouse, $7.95
  '125575135033', // Related listing, same seller
  '166383500910', // High-volume listing, 2,575+ sold
]
function loadSnapshot(file = 'data/ebay_prices.json') {
  if (!existsSync(file)) return {}
  return JSON.parse(readFileSync(file, 'utf8'))
}
function saveSnapshot(data, file = 'data/ebay_prices.json') {
  // dirname() also copes with a bare filename, where it returns '.'
  mkdirSync(dirname(file), { recursive: true })
  writeFileSync(file, JSON.stringify(data, null, 2))
}
async function fetchPrices(itemNumbers) {
  const batch = await spidra.batch.run({
    urls: itemNumbers.map(n => `https://www.ebay.com/itm/${n}`),
    prompt: 'Extract the item number, title, price, condition and availability',
    output: 'json',
    schema: PRICE_SCHEMA,
    useProxy: true,
    proxyCountry: 'us',
  })
  // Key results by item number so they line up with the previous snapshot
  const results = {}
  for (const item of batch.items) {
    if (item.status === 'completed' && item.result) {
      const n = item.result.item_number
      if (n) results[n] = { ...item.result, checkedAt: new Date().toISOString() }
    }
  }
  return results
}
function findChanges(previous, current, thresholdPct = 3) {
  return Object.entries(current)
    .map(([n, data]) => {
      const curr = data.price
      const prev = previous[n]?.price
      // Skip items with a missing or zero price on either side
      if (!curr || !prev) return null
      const pct = ((curr - prev) / prev) * 100
      if (Math.abs(pct) < thresholdPct) return null
      return {
        itemNumber: n,
        title: (data.title ?? '').slice(0, 70),
        seller: data.seller_name,
        condition: data.condition,
        prevPrice: prev,
        currPrice: curr,
        changePct: Math.round(pct * 10) / 10,
        direction: pct > 0 ? 'up' : 'down',
      }
    })
    .filter(Boolean)
    .sort((a, b) => Math.abs(b.changePct) - Math.abs(a.changePct))
}
const previous = loadSnapshot()
const current = await fetchPrices(WATCHED_ITEMS)
// Merge so an item that failed this run keeps its last known price
saveSnapshot({ ...previous, ...current })
const changes = findChanges(previous, current)
if (changes.length > 0) {
  console.log(`\n${changes.length} price changes:`)
  for (const c of changes) {
    const sign = c.direction === 'up' ? '+' : ''
    console.log(`  ${c.title}`)
    console.log(`  Seller: ${c.seller} | Condition: ${c.condition}`)
    console.log(`  $${c.prevPrice} to $${c.currPrice} (${sign}${c.changePct}%)`)
  }
} else {
  console.log('No significant price changes')
}
Schedule this with cron or any job queue and you have a recurring price feed for whichever eBay items you choose to watch.
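To make the 3% threshold concrete, here is the same percentage arithmetic on two invented price pairs. The figures are illustrative only and were not observed on eBay; `percentChange` is a hypothetical helper that mirrors the calculation in findChanges.

```javascript
// Sketch: percentage change rounded to one decimal place, as in findChanges.
function percentChange(prev, curr) {
  if (!prev || !curr) return null
  const pct = ((curr - prev) / prev) * 100
  return Math.round(pct * 10) / 10
}

// A drop from 7.95 to 6.99 is about 12%, well past a 3% threshold
console.log(percentChange(7.95, 6.99)) // → -12.1
// A rise from 7.95 to 8.10 is under 2%, so it would be ignored
console.log(percentChange(7.95, 8.10)) // → 1.9
```

A percentage threshold scales with the item price, so a fixed 3% filters out cent-level noise on cheap items while still catching meaningful moves on expensive ones.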
Exporting to CSV
For flat data analysis or sharing with teammates who prefer a spreadsheet:
import { readFileSync, writeFileSync } from 'fs'
const items = readFileSync('ebay_items.jsonl', 'utf8')
  .split(/\r?\n/).filter(Boolean)
  .map(line => JSON.parse(line))
const fields = [
  'title', 'price', 'currency', 'condition', 'availability',
  'item_number', 'seller_name', 'seller_feedback',
  'seller_feedback_count', 'shipping_cost', 'location'
]
// Quote any value containing a comma, double quote, or line break (RFC 4180),
// otherwise a title such as 24" Monitor would corrupt the row
const escapeCsv = val => {
  const s = String(val ?? '')
  return /[",\r\n]/.test(s)
    ? `"${s.replace(/"/g, '""')}"`
    : s
}
const csv = [
  fields.join(','),
  ...items.map(item => fields.map(f => escapeCsv(item[f])).join(','))
].join('\n')
writeFileSync('ebay_items.csv', csv)
console.log(`Exported ${items.length} rows to ebay_items.csv`)
The features, images, and description fields are left out of the CSV intentionally. They stay in the JSONL for detailed work. The CSV stays clean enough to open directly in Excel or Google Sheets without formatting issues.
