OpenTable is one of the most data-rich targets in the restaurant tech space, and if you need to scrape OpenTable at scale — whether for reservation intelligence, market research, or competitive analysis — the approach matters more than the tooling. The platform serves 60,000+ restaurants across the US, Canada, UK, Australia, and Japan, and it exposes two distinct data layers: restaurant profile data (listings, ratings, cuisine tags, price tier) and live availability slots (time blocks by date and party size). Each layer requires a different strategy.
What Data You Can Actually Extract
Before writing a single line of code, map out what you need. OpenTable’s restaurant profiles contain:
- restaurant name, cuisine type, neighborhood, and price range ($/$$/$$$/$$$$)
- aggregate rating and review count
- hours of operation and contact details
- amenities tags (outdoor seating, private dining, valet)
- OpenTable restaurant ID (the `rid` parameter — keep this, it anchors everything else)
Availability data is separate. It returns time slots, party size options, and occasionally experience/menu upsells. If you’re building a reservation intelligence product or tracking booking patterns across a city, this is the layer you want.
For comparison, delivery platforms like Uber Eats or DoorDash expose menu-level pricing that OpenTable doesn’t — OpenTable is reservation-first, so pricing data here means price tier, not item costs.
Two Endpoints Worth Targeting
OpenTable’s frontend is a Next.js SPA, but the underlying APIs are accessible directly via XHR. Two endpoints do most of the heavy lifting:
Search/listings: GET https://www.opentable.com/s/ with query params (`metroId`, `term`, `covers`, `dateTime`) returns paginated restaurant cards. Each card includes JSON-LD Restaurant schema markup embedded in the HTML — this is the easiest parsing target for bulk profile collection.
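Pulling the Restaurant-typed JSON-LD out of a fetched search page needs nothing beyond the standard library. A minimal sketch, assuming the JSON-LD sits in ordinary `<script type="application/ld+json">` tags (fetch the page first with the headers shown further down):

```python
import json
import re

def extract_jsonld_restaurants(html: str) -> list[dict]:
    """Pull Restaurant-typed JSON-LD blocks out of a search results page."""
    restaurants = []
    # JSON-LD payloads live inside <script type="application/ld+json"> tags
    for match in re.finditer(
        r'<script type="application/ld\+json">(.*?)</script>', html, re.DOTALL
    ):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the page
        items = data if isinstance(data, list) else [data]
        restaurants += [d for d in items if d.get("@type") == "Restaurant"]
    return restaurants
```

Because this parses static markup rather than rendered DOM, it works on the raw HTML response without JavaScript execution.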
Availability: GET https://www.opentable.com/dapi/fe/restref/client?rid={restaurant_id}&... returns a JSON payload with time slots. You need the `rid` from the listing step first.
```python
import httpx

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
    "Accept": "application/json, text/plain, */*",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://www.opentable.com/",
    "OT-Agent-Id": "",  # leave blank, mimics unauthenticated browser
}

def get_availability(rid: str, date: str, covers: int = 2) -> dict:
    url = "https://www.opentable.com/dapi/fe/restref/client"
    params = {
        "rid": rid,
        "restref": rid,
        "covers": covers,
        "dateTime": f"{date}T19:00",
        "lang": "en-US",
    }
    r = httpx.get(url, params=params, headers=HEADERS, timeout=15)
    r.raise_for_status()
    return r.json()
```

For availability at scale, raw httpx works fine at low concurrency. For the search/listings layer, Playwright with stealth patches handles fingerprinting better when you’re hitting 500+ pages per run.
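When availability lookups need to fan out across many `rid`s while staying at low concurrency, a bounded scheduler keeps the in-flight count capped. A sketch, where `fetch_one` is a hypothetical async callable (in production, an `httpx.AsyncClient` wrapper around a request like `get_availability`):

```python
import asyncio

async def fetch_all_availability(rids, fetch_one, max_concurrency=5):
    """Run many availability lookups with a hard concurrency cap.

    fetch_one is any async callable taking a rid; injecting it keeps
    the pipeline testable without touching OpenTable.
    """
    sem = asyncio.Semaphore(max_concurrency)

    async def worker(rid):
        async with sem:  # never more than max_concurrency requests in flight
            return rid, await fetch_one(rid)

    pairs = await asyncio.gather(*(worker(r) for r in rids))
    return dict(pairs)
```

Capping at 5 concurrent requests keeps a single IP comfortably under the practical rate limits discussed below.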
Anti-Bot Measures and How to Handle Them
OpenTable runs Cloudflare in front of its main pages and applies behavioral fingerprinting on the availability API. The practical rate limits are around 30-50 requests per minute per IP before you start seeing soft 429s or empty-slot responses that aren’t actually empty.
The mitigation stack, in order of importance:
- rotate residential or mobile proxies — datacenter IPs get flagged within minutes on the availability endpoint
- randomize request timing with jitter (0.8-2.5s between calls, not fixed intervals)
- set realistic browser headers including `sec-ch-ua`, `sec-fetch-dest`, and `Referer`
- for Playwright-based scraping, use `playwright-stealth` or `rebrowser-patches` to mask navigator properties
- avoid scraping the same `rid` more than once every 4-6 hours per IP
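The proxy rotation and jitter points above can be folded into one small request helper. A sketch under stated assumptions: the proxy URLs are placeholders, and `make_client` stands in for however you build your HTTP client (for example, `lambda proxy: httpx.Client(proxy=proxy, timeout=15)` on httpx >= 0.26, which spells the argument `proxy=`; older releases use `proxies=`):

```python
import itertools
import random
import time

# Hypothetical pool: real residential proxy endpoints come from your provider.
PROXY_POOL = [
    "http://user:pass@res-proxy-1.example.com:8000",
    "http://user:pass@res-proxy-2.example.com:8000",
]
proxy_cycle = itertools.cycle(PROXY_POOL)

def jitter(low: float = 0.8, high: float = 2.5) -> float:
    """Randomized inter-request delay so timing never looks machine-regular."""
    return random.uniform(low, high)

def polite_get(make_client, url: str, params: dict, headers: dict):
    """Sleep with jitter, then send one request through the next proxy."""
    time.sleep(jitter())
    with make_client(next(proxy_cycle)) as client:
        return client.get(url, params=params, headers=headers)
```

Pair this with the `HEADERS` dict from the availability snippet so the header fingerprint stays consistent across rotated IPs.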
Residential proxies sourced from SG or US ISPs perform best here. The full proxy selection breakdown for reservation platforms is covered in How to Use Proxies for Restaurant Reservation Platforms (OpenTable, Resy), including proxy type comparisons and geo-targeting considerations.
Choosing Your Scraping Approach
| approach | best for | speed | anti-bot resistance | cost |
|---|---|---|---|---|
| httpx + direct API | availability JSON at scale | fast | medium | low |
| Playwright + stealth | listings page + JS rendering | slow | high | medium |
| Scrapy + middleware | bulk profile collection | fast | medium | low |
| Commercial API (e.g. ScraperAPI, Oxylabs) | quick prototypes | medium | high | high |
For most use cases, a hybrid works: Scrapy for crawling restaurant profiles via the search endpoint (JSON-LD parsing is clean and stable), then httpx with rotating proxies for the availability layer on a selected subset of `rid`s.
If you’re also collecting data from Asian markets, Foodpanda and Just Eat / Takeaway expose similar structured data — the anti-bot posture is lighter than OpenTable’s, which makes them useful benchmarks when you’re building a multi-platform restaurant intelligence pipeline.
Storing and Structuring the Output
Once you have raw JSON from both endpoints, normalization matters. A clean schema looks like:
- `restaurant_id` (string, OpenTable’s `rid`)
- `name`, `cuisine_list` (array), `neighborhood`, `city`, `price_tier` (1-4)
- `rating` (float), `review_count` (int)
- `scraped_at` (ISO 8601 timestamp)
- `availability_slots` (array of `{time, covers_available, experience_id}`)
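Expressed as a Python dataclass, the schema could look like this (an illustrative sketch; the field names mirror the schema above, the concrete types are assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class RestaurantSnapshot:
    restaurant_id: str              # OpenTable's rid, the canonical key
    name: str
    cuisine_list: list[str]
    neighborhood: str
    city: str
    price_tier: int                 # 1-4, normalized at ingest
    rating: float
    review_count: int
    scraped_at: str                 # ISO 8601 timestamp of the read
    availability_slots: list[dict] = field(default_factory=list)
```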
Store availability snapshots with a timestamp, not as live state — availability changes by the minute, so historical snapshots have more analytical value than a single read. A Postgres table with a composite index on `(restaurant_id, scraped_at)` handles time-series queries well.
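An illustrative DDL for that snapshot table (column names and types are assumptions, not OpenTable's schema; the composite primary key doubles as the time-series index):

```sql
-- one row per availability read: history, not live state
CREATE TABLE availability_snapshots (
    restaurant_id TEXT        NOT NULL,   -- OpenTable rid, canonical key
    scraped_at    TIMESTAMPTZ NOT NULL,   -- when the snapshot was taken
    slots         JSONB       NOT NULL,   -- raw slot array from the API
    PRIMARY KEY (restaurant_id, scraped_at)
);
```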
Key fields to watch for data quality issues:
- `rid` vs `restref`: OpenTable sometimes uses both interchangeably; pin to `rid` as your canonical key
- price tier: returned as a string like `"PRICE_POINT_3"` in some API versions — normalize to integer at ingest
- availability response: empty `[]` slots can mean “no availability” OR a rate-limited response — distinguish by checking the `available` boolean at the top level of the response
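The last two quality checks fold into small normalization helpers. A sketch, assuming the payload shapes described above (`PRICE_POINT_n` strings and a top-level `available` flag alongside a `slots` array — the exact key names are assumptions):

```python
def normalize_price_tier(raw) -> int:
    """Accept either an int (1-4) or a string like "PRICE_POINT_3"."""
    if isinstance(raw, int):
        return raw
    # take the trailing number from the PRICE_POINT_n form
    return int(str(raw).rsplit("_", 1)[-1])

def classify_slots(payload: dict):
    """Disambiguate empty slot lists using the top-level available flag.

    Returns one of "ok", "no_availability", "rate_limited" plus the slots.
    """
    slots = payload.get("slots", [])
    if slots:
        return "ok", slots
    if payload.get("available") is False:
        return "no_availability", []
    # empty slots while available is not False: treat the read as suspect
    return "rate_limited", []
```

Routing "rate_limited" reads into a retry queue instead of storing them keeps false "no availability" rows out of the snapshot history.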
Bottom Line
For restaurant profile data, parse the JSON-LD from OpenTable’s search results — it’s stable and requires no JavaScript rendering. For availability data, call the `/dapi/fe/restref/client` endpoint directly with rotating residential proxies and keep your request rate under 40/min per IP. If you’re building a multi-platform restaurant dataset, DRT covers the full stack across delivery and reservation platforms — the same patterns here transfer directly to peer platforms with minor endpoint adjustments.