How to Scrape GOAT and Flight Club Sneaker Marketplace Data (2026)

GOAT and Flight Club run two of the tightest anti-bot stacks in sneaker resale, and scraping either platform in 2026 means dealing with Cloudflare, dynamic pricing APIs, and fingerprint-heavy JavaScript rendering. whether you’re building a price tracker, arb bot, or market analysis tool, here’s what actually works.

Why GOAT and Flight Club Are Hard to Scrape

Both platforms sit under the same parent company, GOAT Group, and share infrastructure. that means similar bot mitigation: Cloudflare with JS challenge, device fingerprinting via PerimeterX (now HUMAN Security), and rate limiting tied to IP reputation scores rather than raw request volume.

GOAT’s product pages are server-side rendered but pricing data loads via XHR calls to api.goat.com. Flight Club leans heavier on client-side rendering, so a raw HTTP request to a listing URL returns a shell with no data. if you’re also working on StockX, note that StockX exposes cleaner API endpoints — GOAT is messier.

key signals that your requests are being flagged:

403 with cf-mitigated: challenge header
429 with x-ratelimit-reset set to 60+ seconds
200 responses returning empty JSON { "products": [] }
sudden redirect to /blocked with a Cloudflare ray ID
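
the signals above can be folded into one small check that runs on every response. this is a minimal sketch, not a known GOAT contract: the function name, the returned labels, and the assumption that the final URL after redirects is available are all illustrative.

```python
from typing import Optional
from urllib.parse import urlsplit


def classify_block_signal(
    status: int, headers: dict, body: Optional[dict], final_url: str
) -> Optional[str]:
    """Map a response to one of the flag signals listed above, or None."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}

    if status == 403 and lowered.get("cf-mitigated") == "challenge":
        return "cloudflare_challenge"
    if status == 429:
        return "rate_limited"
    if urlsplit(final_url).path.startswith("/blocked"):
        return "blocked_redirect"
    # An empty product list on a 200 can also be a legitimate empty result,
    # so treat this as a hint to re-check rather than proof of a block.
    if status == 200 and isinstance(body, dict) and body.get("products") == []:
        return "empty_payload"
    return None
```

logging the label per request makes it easier to see which layer is pushing back before you change proxies or headers.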

The API-First Approach (Recommended)

GOAT’s internal API is well enough understood from traffic captures that you don’t need a browser for most data. the base endpoints are https://ac.cnstrc.com/ for search (Constructor.io powers their search layer) and https://www.goat.com/api/v1/ for product details and pricing.

a minimal fetch for a product’s ask/bid data looks like this:

import httpx

headers = {
    "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_4 like Mac OS X) AppleWebKit/605.1.15",
    "x-px-authorization": "",  # leave blank initially, rotate if blocked
    "Accept": "application/json",
    "Referer": "https://www.goat.com/",
}

r = httpx.get(
    "https://www.goat.com/api/v1/product_templates/air-jordan-1-retro-high-og-chicago-2015",
    headers=headers,
    timeout=10,
)
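r.raise_for_status()  # fail fast on 403/429 instead of parsing an error page
data = r.json()
# sizeRange lists the available sizes, not a price; per-size asks are
# pulled out by extract_prices() further down
size_range = data["productTemplate"]["sizeRange"]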

mobile user-agents clear Cloudflare more easily than desktop strings for GOAT specifically. rotate through 6-8 realistic iOS/Android UA strings and you’ll reduce your 403 rate significantly before you even add proxy rotation.
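
a rough sketch of that rotation, assuming a small hand-maintained pool. the UA strings below are illustrative examples of the iOS/Android format and will age, so refresh them from real devices; `ua_cycle` and `headers_for` are hypothetical helper names.

```python
import itertools
import random

# Example mobile UA strings (assumed values; keep these current yourself).
MOBILE_USER_AGENTS = [
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_4 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Mobile/15E148 Safari/604.1",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Mobile/15E148 Safari/604.1",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 16_7 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1",
    "Mozilla/5.0 (Linux; Android 14; Pixel 8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Mobile Safari/537.36",
    "Mozilla/5.0 (Linux; Android 13; SM-S918B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Mobile Safari/537.36",
    "Mozilla/5.0 (Linux; Android 14; SM-G991B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Mobile Safari/537.36",
]


def ua_cycle(pool, seed=None):
    """Shuffle the pool once, then hand out UAs round-robin."""
    shuffled = list(pool)
    random.Random(seed).shuffle(shuffled)
    return itertools.cycle(shuffled)


def headers_for(user_agent: str) -> dict:
    """Build the request headers used in the fetch above for one UA."""
    return {
        "User-Agent": user_agent,
        "Accept": "application/json",
        "Referer": "https://www.goat.com/",
    }
```

round-robin over a shuffled pool keeps usage even across strings. it is probably safer to draw one UA per session rather than per request, so the UA does not change underneath an existing set of cookies.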

for Flight Club, the product slug maps 1:1 with GOAT’s, but the pricing endpoint differs — Flight Club shows consignment ask prices, GOAT shows market asks. if you want both, you need two separate request chains.

Proxy and Session Management

residential proxies are non-negotiable here. datacenter IPs get flagged within 10-20 requests. Singapore and US IPs perform best for GOAT (their primary markets). you’ll need a pool of at least 50-100 rotating residential IPs to sustain meaningful throughput.

proxy type	GOAT success rate	cost/GB	recommended for
datacenter	~12%	$0.50-1	not viable
residential rotating	~74%	$3-8	standard scraping
mobile residential	~89%	$10-20	high-value targets
ISP (static residential)	~81%	$5-12	session-based flows

mobile proxies outperform because HUMAN Security’s fingerprinting weights carrier signals. the tradeoff is cost: going by the table above, mobile bandwidth runs roughly 2-3x more per GB than rotating residential. for a price tracker hitting 1000 SKUs/day, mobile makes sense. for bulk catalog scraping, start residential and switch to mobile only for retries.
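
to put rough numbers on that tradeoff, here is a back-of-envelope sketch using the midpoints of the table rows. the 50 KB response size is an assumption, as is the simplification that failed attempts burn the same bandwidth and that retries on a tier succeed independently.

```python
BYTES_PER_GB = 1_000_000_000

# Midpoints of the table rows above: (success rate, USD per GB).
TIERS = {
    "residential_rotating": (0.74, 5.50),
    "mobile_residential": (0.89, 15.00),
}


def cost_per_1000_successes(tier: str, response_bytes: int = 50_000) -> float:
    """Expected bandwidth cost in USD to land 1000 successful responses."""
    success_rate, usd_per_gb = TIERS[tier]
    cost_per_attempt = usd_per_gb * response_bytes / BYTES_PER_GB
    # On average 1 / success_rate attempts are needed per success.
    return 1000 * cost_per_attempt / success_rate


for tier in TIERS:
    print(tier, round(cost_per_1000_successes(tier), 2))
```

under these assumptions both tiers come in under a dollar per 1000 successful fetches, so for a small tracker the bandwidth premium for mobile is minor in absolute terms. it only starts to matter at bulk-catalog volume or with much larger responses.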

session stickiness matters too. GOAT associates cookies with a session fingerprint. use the same IP for the full cookie lifetime (typically 30 min) rather than rotating per request. this is the same principle you’d apply when scraping Mercari, where session continuity also cuts your block rate substantially.
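
one way to sketch that stickiness is a small pool that pins each worker to a proxy for the assumed 30-minute cookie lifetime and only rotates once it expires. `StickySessionPool` and its fields are hypothetical names, and the proxy URLs are placeholders.

```python
import itertools
import time
from dataclasses import dataclass

SESSION_TTL_SECONDS = 30 * 60  # assumed cookie lifetime from the text above


@dataclass
class StickySession:
    proxy_url: str
    created_at: float


class StickySessionPool:
    """Pin each worker to one proxy until the session TTL runs out."""

    def __init__(self, proxy_urls, ttl=SESSION_TTL_SECONDS, clock=time.monotonic):
        self._proxies = itertools.cycle(list(proxy_urls))
        self._ttl = ttl
        self._clock = clock
        self._sessions = {}

    def get(self, worker_id: str) -> StickySession:
        now = self._clock()
        session = self._sessions.get(worker_id)
        if session is None or now - session.created_at >= self._ttl:
            # Expired or new: take the next proxy and start a fresh session.
            session = StickySession(next(self._proxies), now)
            self._sessions[worker_id] = session
        return session

    def invalidate(self, worker_id: str) -> None:
        """Drop a session early, for example after a block signal."""
        self._sessions.pop(worker_id, None)
```

in practice each `StickySession` would also own its cookie jar and HTTP client, so cookies and IP are always retired together.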

Handling Cloudflare and PerimeterX

two separate challenges, two separate solutions.

Cloudflare JS challenge can be solved with:

Playwright or Puppeteer with stealth plugins (playwright-stealth, puppeteer-extra-plugin-stealth)
Cloudflare solvers via third-party APIs (Capsolver, NopeCHA, 2captcha) — cost is ~$1-3 per 1000 solves
Scraping Browser services (Brightdata Scraping Browser, Oxylabs Web Unblocker) — these handle CF internally, you just get clean HTML

PerimeterX / HUMAN is harder. it fingerprints canvas, WebGL, audio context, and timing patterns. solving it without a real browser is unreliable. if you’re seeing empty responses even after CF passes, HUMAN is intercepting the API call.

for Flight Club specifically, a headless Chromium with stealth + residential proxy gets you through both layers. the cost is speed — a browser-rendered scrape takes 3-6 seconds per page vs 0.3-0.8s for a direct API call.

the same anti-bot stack appears on secondary market platforms like Grailed and Stadium Goods, so any solver infrastructure you build here transfers directly.

Data Points You Can Extract

once you’re past the bot wall, GOAT’s API is actually generous with data. a single product template response includes:

all size variants with individual ask/bid prices
historical sale prices (last 72h, 30d, 90d buckets)
condition breakdown (new, used, 9.5/10, etc.)
availability by geography
product metadata (colorway, SKU, release date, retail price)

Flight Club responses are leaner — ask prices only, no bid side, no volume. for bid-side data you have to use GOAT.

structured extraction for price tracking (GOAT):

def extract_prices(product_template: dict) -> list[dict]:
    """Flatten a product template into one row per size option."""
    rows = []
    for size_option in product_template.get("sizeOptions", []):
        rows.append({
            "sku": product_template["sku"],
            "size": size_option["presentation"],
            "lowest_ask": size_option.get("lowestAsk"),
            "highest_bid": size_option.get("highestBid"),
            "last_sale": size_option.get("lastSale"),
        })
    return rows

for a deeper dive into how price tracking architectures across marketplaces fit together, the pillar guide on scraping StockX and GOAT for sneaker price tracking covers the cross-platform comparison, normalization layer, and storage schema in detail.

Rate Limits and Crawl Budgeting

GOAT rate limits by IP + session fingerprint combo. empirical limits from 2026 testing:

authenticated sessions: ~120 req/min sustained, burst to 200
anonymous sessions: ~30 req/min before soft throttle, 45 before hard 429
search API (Constructor.io endpoint): ~60 req/min with residential IPs
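
those per-minute figures translate into a minimum gap between requests on each session. a minimal throttle sketch, with `MinIntervalThrottle` as a hypothetical name and the clock and sleep functions injectable so it can be tested without waiting:

```python
import time


class MinIntervalThrottle:
    """Space requests so one session stays under a requests-per-minute cap."""

    def __init__(self, max_per_minute: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next request slot; return the seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the following slot one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay


# Anonymous sessions: stay at or below the ~30 req/min soft throttle.
anonymous_throttle = MinIntervalThrottle(max_per_minute=30)
```

call `wait()` before each request on a session. use one throttle per session rather than a global one, since the limits are described as per IP + session.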

recommended crawl config for a 10k SKU catalog:

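4-6 concurrent workers, each on its own sticky session and IP, since limits apply per IP + session
1.5-2s delay between requests per worker (with request latency added, roughly 20-35 req/min per session, around the anonymous soft-throttle line)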
exponential backoff on 429: start at 5s, cap at 120s
cache product metadata (colorway, SKU, retail) for 7 days — it doesn’t change
refresh pricing every 15-30 min for active SKUs, daily for long-tail
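
the backoff and refresh rules in that config can be sketched as two small helpers. the function names are illustrative, and the 15-minute active interval is just the lower end of the 15-30 min range above.

```python
def backoff_delay(attempt: int, base: float = 5.0, cap: float = 120.0) -> float:
    """Exponential backoff on 429: 5s, 10s, 20s, ... capped at 120s."""
    return min(cap, base * (2 ** attempt))


ACTIVE_REFRESH_SECONDS = 15 * 60          # active SKUs
LONG_TAIL_REFRESH_SECONDS = 24 * 60 * 60  # long-tail SKUs, daily


def price_refresh_due(last_fetched: float, now: float, active: bool) -> bool:
    """True when a SKU's pricing is older than its refresh interval."""
    interval = ACTIVE_REFRESH_SECONDS if active else LONG_TAIL_REFRESH_SECONDS
    return now - last_fetched >= interval
```

`attempt` starts at 0 for the first retry, so the sixth consecutive 429 is where the 120s cap kicks in. adding a little random jitter on top is a common refinement when several workers back off at once.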

this is more conservative than what you’d use for something like Reverb, which has lighter bot mitigation and tolerates higher throughput, but for GOAT the slower pace pays off in uptime.

Bottom Line

for GOAT, start with the mobile-UA direct API approach and residential proxies — you’ll cover 70-80% of use cases without a browser. add Playwright with stealth only for Flight Club or when HUMAN blocks your API calls. mobile proxies are worth the cost for production price trackers. DRT will keep coverage updated as GOAT’s anti-bot stack evolves through 2026.