Grubhub sits on top of one of the richest food-delivery datasets in the US. If you need to scrape Grubhub for menu pricing, item availability, or restaurant density across cities, the good news is that the data is structured and accessible. The bad news: Grubhub’s anti-bot stack has gotten meaningfully tighter since 2024.
How Grubhub structures its data
Grubhub organizes data in three layers: city/market, restaurant listing, and menu.
At the city level, each market has a slug (e.g., chicago-il, new-york-ny) that scopes search results geographically. Restaurant listings include name, cuisine tags, address, rating, review count, delivery fee, and estimated delivery time. Menu data goes deeper: categories, item names, descriptions, prices, modifiers (size, extras), and availability windows.
The key fields you’ll want:
- `restaurant_id` (internal numeric ID, stable across requests)
- `menu_category_id` and `menu_item_id`
- `price` (in cents, divide by 100)
- `availability` (lunch/dinner windows, sometimes day-of-week flags)
- `delivery_zone` (polygon or radius, relevant for multi-city work)
Grubhub also exposes `is_orderable`, which flags restaurants that are live vs. listed-but-closed. That’s useful for filtering before scraping menus.
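For example, a quick pre-filter on a page of search results might look like the sketch below. The `search_result`/`results` nesting matches the response shape used later in this article, but treat the exact structure as something to verify against a live payload.

```python
# Sketch: keep only restaurants that are live before queuing menu scrapes.
# The response shape (search_result -> results, is_orderable) follows the
# fields described above; verify the nesting against a real response.
def orderable_restaurant_ids(search_payload: dict) -> list[str]:
    results = search_payload.get("search_result", {}).get("results", [])
    return [
        r["restaurant_id"]
        for r in results
        if r.get("is_orderable")  # skip listed-but-closed restaurants
    ]
```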
The API approach vs HTML scraping
Don’t scrape the HTML. Grubhub’s frontend is React-rendered, so raw HTML gives you almost nothing without a headless browser. Instead, target the internal JSON API that the webapp calls.
The main endpoints are:
- `https://api-gtm.grubhub.com/restaurants/search`: takes lat/lon + radius, returns the restaurant list
- `https://api-gtm.grubhub.com/restaurants/{restaurant_id}/menu`: full menu JSON
- `https://api-gtm.grubhub.com/restaurants/{restaurant_id}`: restaurant metadata
These endpoints return clean JSON with no HTML parsing needed. Compared to scraping DoorDash restaurant menus and pricing, Grubhub’s API is slightly more permissive in terms of payload structure, but stricter on request headers.
The search endpoint accepts `pageSize` up to 100 and supports pagination via `offset`. You can also filter by cuisine and maximum delivery fee, and sort by rating or distance.
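Since each call caps at 100 results, a small pagination loop is usually the first helper you write. A minimal sketch, assuming a fetch function like the `fetch_restaurants` example later in this post; stopping on an empty page is an assumption, not documented behavior:

```python
# Sketch: walk the search endpoint with offset-based pagination.
def fetch_all_pages(fetch_page, page_size: int = 100, max_pages: int = 20) -> list[dict]:
    results: list[dict] = []
    for page in range(max_pages):
        payload = fetch_page(page_size=page_size, offset=page * page_size)
        page_results = payload.get("search_result", {}).get("results", [])
        if not page_results:
            break  # assume an empty page means no more listings in this radius
        results.extend(page_results)
    return results
```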
| approach | complexity | data quality | speed |
|---|---|---|---|
| HTML scraping | high | poor (client-rendered React) | slow |
| internal API (JSON) | medium | excellent | fast |
| official API (none) | n/a | n/a | n/a |
| headless browser | high | good | very slow |
Grubhub has no public API for third-party access, so the internal API route is your only real option for structured data at scale.
Anti-bot measures and how to handle them
Grubhub uses a combination of rate limiting, TLS fingerprinting, and behavior analysis. Hitting the API without proper headers returns 403s within a few dozen requests. Rotating IPs alone won’t fix it if your TLS fingerprint screams Python `requests`.
What actually works:
- use `httpx` with HTTP/2 support, which more closely matches browser TLS fingerprints than `requests`
- set realistic headers: `User-Agent`, `Accept-Language`, `Referer` (set to `https://www.grubhub.com/`), and `x-csrf-token` (pull from an initial page load)
- rotate residential proxies, not datacenter IPs; Grubhub blocks datacenter CIDR blocks aggressively
- add 2-5 seconds of jitter between requests per session
- keep sessions alive with cookies from an initial homepage hit (see the sketch after this list)
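Here’s a minimal session-bootstrap sketch along those lines. It assumes the CSRF token shows up as a cookie after the homepage load; in practice you may need to pull it from an inline script or an auth response instead, so the cookie name below is a placeholder.

```python
import httpx

# Sketch: seed a session with cookies from the homepage before calling the API.
# The csrf cookie name is an assumption; inspect a real browser session to
# confirm where the token actually lives.
def bootstrap_session(headers: dict, proxy: str | None = None) -> httpx.Client:
    proxies = {"https://": proxy} if proxy else None
    client = httpx.Client(http2=True, headers=headers, proxies=proxies, timeout=15)
    client.get("https://www.grubhub.com/")  # seeds session cookies
    csrf = client.cookies.get("csrf_token")  # placeholder cookie name
    if csrf:
        client.headers["x-csrf-token"] = csrf
    return client
```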
Similar challenges come up when you scrape Uber Eats restaurant listings at scale, though Uber Eats has a different fingerprint profile. And if you’re expanding coverage to European platforms, the same residential-proxy approach applies when you scrape Deliveroo restaurant menus in the UK and EU.
Python code: fetching Grubhub restaurant listings
Here’s a realistic example hitting the search endpoint with proper headers and session reuse:
```python
import httpx
import time
import random

BASE_URL = "https://api-gtm.grubhub.com"

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    "Accept": "application/json",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://www.grubhub.com/",
    "Origin": "https://www.grubhub.com",
}

def fetch_restaurants(lat: float, lon: float, radius_meters: int = 5000,
                      page_size: int = 100, offset: int = 0, proxy: str | None = None):
    """Hit the search endpoint for one lat/lon tile and return the parsed JSON."""
    params = {
        "orderMethod": "delivery",
        "locationMode": "DELIVERY",
        "pageSize": page_size,
        "hideHateos": True,
        "latitude": lat,
        "longitude": lon,
        "radius": radius_meters,
        "offset": offset,
    }
    proxies = {"https://": proxy} if proxy else None
    with httpx.Client(http2=True, headers=HEADERS, proxies=proxies, timeout=15) as client:
        resp = client.get(f"{BASE_URL}/restaurants/search", params=params)
        resp.raise_for_status()
        time.sleep(random.uniform(2, 5))  # jitter between requests
        return resp.json()

# example: downtown Chicago
data = fetch_restaurants(lat=41.8781, lon=-87.6298, proxy="http://user:pass@residential-proxy:8080")
restaurants = data.get("search_result", {}).get("results", [])
print(f"fetched {len(restaurants)} restaurants")
```

For menu data, swap the endpoint to `/restaurants/{restaurant_id}/menu` and parse the `menu_category_list` key. Each category has an `item_list` array with price, name, and modifier groups.
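Building on that, here’s a hedged sketch of fetching and flattening one menu. It reuses `BASE_URL` and a client configured like the one above; `menu_category_list` and `item_list` come from the structure just described, but the per-item field names (`name`, `price`) are assumptions to verify against a live response.

```python
def fetch_menu(client: httpx.Client, restaurant_id: str) -> list[dict]:
    # Sketch: flatten a menu into (category, item, price) rows.
    # Nested field names are assumptions; check them against a real payload.
    resp = client.get(f"{BASE_URL}/restaurants/{restaurant_id}/menu")
    resp.raise_for_status()
    menu = resp.json()

    rows = []
    for category in menu.get("menu_category_list", []):
        for item in category.get("item_list", []):
            rows.append({
                "restaurant_id": restaurant_id,
                "category": category.get("name"),
                "name": item.get("name"),
                "price_cents": item.get("price"),  # prices come back in cents
            })
    return rows
```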
Scaling across cities and storing the data
The cleanest multi-city approach is a lat/lon grid. Pick a city center, then tile outward with overlapping radius circles (5 km radius, 4 km step) to avoid coverage gaps. For a city like LA you’ll need 15-20 tiles to cover the metro; for NYC, closer to 30.
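Here’s a rough tile-generation sketch using a flat-earth approximation (1 degree of latitude ≈ 111 km, longitude scaled by the cosine of latitude), which is fine at city scale; the `rings` parameter, a placeholder, controls how far the grid extends from the center.

```python
import math

# Sketch: overlapping grid tiles around a city center, 4 km step per the
# approach above. Flat-earth math is acceptable at city scale.
def city_tiles(center_lat: float, center_lon: float, step_km: float = 4.0, rings: int = 2):
    km_per_deg_lat = 111.0
    km_per_deg_lon = 111.0 * math.cos(math.radians(center_lat))
    tiles = []
    for i in range(-rings, rings + 1):
        for j in range(-rings, rings + 1):
            tiles.append((
                round(center_lat + (i * step_km) / km_per_deg_lat, 6),
                round(center_lon + (j * step_km) / km_per_deg_lon, 6),
            ))
    return tiles

# example: a 5x5 grid (25 tiles) centered on downtown Chicago
tiles = city_tiles(41.8781, -87.6298)
```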
Store results in Postgres with a schema like:
```
restaurants(id, grubhub_id, city, name, lat, lon, rating, scraped_at)
menu_items(id, restaurant_id, category, name, price_cents, scraped_at)
```

Index `grubhub_id` as unique to deduplicate on re-runs. Run incremental scrapes daily to capture pricing volatility; menu prices on Grubhub shift frequently, especially for items with demand-based pricing.
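As a sketch of that dedup step, the upsert might look like this with psycopg2; the DSN, column list, and updated columns are placeholders to adapt to your schema.

```python
import psycopg2
from psycopg2.extras import execute_values

# Sketch: upsert restaurant rows, deduplicating on the unique grubhub_id
# index described above. DSN and column list are placeholders.
def upsert_restaurants(rows: list[tuple], dsn: str = "postgresql://localhost/grubhub"):
    sql = """
        INSERT INTO restaurants (grubhub_id, city, name, lat, lon, rating, scraped_at)
        VALUES %s
        ON CONFLICT (grubhub_id) DO UPDATE
        SET rating = EXCLUDED.rating,
            scraped_at = EXCLUDED.scraped_at
    """
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        execute_values(cur, sql, rows)
```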
For the job queue, use Celery with Redis or a simple Postgres-backed queue. 10-15 concurrent workers with a shared rotating proxy pool can cover ~50 cities overnight without triggering hard bans.
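A minimal Celery sketch along those lines, with one task per grid tile; the broker URL, rate limit, and retry policy are placeholders, and `fetch_restaurants` is the function from the earlier example.

```python
from celery import Celery

# Sketch: one task per (lat, lon) grid tile, throttled per worker.
# Broker URL and rate limit are placeholders to tune for your proxy pool.
app = Celery("grubhub_scraper", broker="redis://localhost:6379/0")

@app.task(rate_limit="10/m", autoretry_for=(Exception,), retry_backoff=True, max_retries=3)
def scrape_tile(lat: float, lon: float, proxy: str | None = None):
    data = fetch_restaurants(lat=lat, lon=lon, proxy=proxy)  # from the earlier example
    # persist results here (e.g., an upsert), then queue menu fetches per restaurant
    return len(data.get("search_result", {}).get("results", []))
```

Start workers with something like `celery -A tasks worker --concurrency=15` (module name assumed) and fan the grid tiles out as individual task calls.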
If you’re also pulling data from Asian markets, the same grid-based architecture translates well when you scrape Foodpanda menu data across Asia and the EU or scrape GrabFood restaurant and menu data. The proxy and rate-limit logic is nearly identical across these platforms.
Expect roughly 800 restaurants per major US city on average, with 20-60 menu items each. A full 50-city dataset runs 40,000-80,000 restaurants and 2-4 million menu rows, which is manageable in Postgres with proper partitioning by city or scrape date.
Bottom line
Scraping Grubhub at scale is doable with httpx, residential proxies, and the internal JSON API; skip the HTML layer entirely. The biggest failure mode is fingerprint mismatch, not IP bans, so invest in realistic headers and HTTP/2 before you scale. dataresearchtools.com covers Grubhub alongside the full food-delivery ecosystem if you need benchmarks or want to track how the anti-bot landscape evolves through 2026.