How to Scrape DoorDash Restaurants and Menus (2026)

DoorDash serves over 700,000 restaurants across the US, Canada, and Australia — and if you’re doing competitive pricing research, building a food delivery aggregator, or tracking menu trends across cities, scraping DoorDash restaurant menus is a very real engineering task. this article covers how DoorDash’s frontend and API actually work in 2026, what tools cut through their bot defenses, and how to extract structured menu and pricing data at scale without burning your IP pool.

how DoorDash serves its menu data

DoorDash is a Next.js app. the menu page you see in a browser is server-side rendered, but the actual item data comes from a GraphQL endpoint: consumer-mobile-bff.doordash.com/consumer/graphql, the same endpoint their mobile apps use.

the good news: that API returns clean JSON with item names, prices, descriptions, calories, and modifiers (size, add-ons, etc.). the bad news: it requires authenticated session cookies and a valid x-channel-id header that rotates. unauthenticated requests get a 401 or a silent empty response, which is honestly worse than a 401.

the server-rendered HTML pages also include a __NEXT_DATA__ JSON blob embedded in a <script> tag. for light scraping this is often easier than reverse-engineering the GraphQL schema: it has most of the same data and doesn't require auth headers.

import httpx
from bs4 import BeautifulSoup
import json

def extract_next_data(url: str, headers: dict) -> dict:
    """Fetch a store page and parse the embedded __NEXT_DATA__ blob."""
    r = httpx.get(url, headers=headers, follow_redirects=True, timeout=30)
    r.raise_for_status()
    soup = BeautifulSoup(r.text, "html.parser")
    tag = soup.find("script", id="__NEXT_DATA__")
    if not tag or not tag.string:
        return {}
    return json.loads(tag.string)

headers = {
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
    "accept-language": "en-US,en;q=0.9",
}
data = extract_next_data("https://www.doordash.com/store/mcdonalds-new-york-12345/", headers)

the __NEXT_DATA__ path to menu items is props.pageProps.storeMenuProps.menuBook.categories[]. items sit inside each category with price (in cents), name, description, and imageUrl.
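a minimal traversal sketch for that key path. the field names follow the structure described above, but DoorDash can shift them between frontend deploys, so treat misses as soft failures rather than crashing:

```python
def iter_menu_items(next_data: dict):
    """Yield (category_name, item) pairs from a parsed __NEXT_DATA__ blob.

    Walks props.pageProps.storeMenuProps.menuBook.categories[], using .get()
    at every level so a changed key path returns nothing instead of raising.
    """
    categories = (
        next_data.get("props", {})
        .get("pageProps", {})
        .get("storeMenuProps", {})
        .get("menuBook", {})
        .get("categories", [])
    )
    for category in categories:
        for item in category.get("items", []):
            yield category.get("name"), item
```

pairing this with extract_next_data from the earlier snippet gives you `for cat, item in iter_menu_items(data): ...` with `item["price"]` in cents.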

anti-bot defenses you’ll actually hit

DoorDash runs Cloudflare plus an internal bot management layer that fingerprints TLS JA3/JA4, HTTP/2 settings, and browser behavior. at moderate scale (200+ requests/hour from one IP) you’ll see:

  • 403 Forbidden with a Cloudflare challenge page
  • 429 Too Many Requests on the GraphQL endpoint
  • silent 200 responses returning empty menu data (the sneaky one)
  • captcha interstitials on restaurant listing pages
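the silent-empty case is worth guarding against explicitly, since it poisons your dataset without tripping any error handling. a small heuristic sketch, assuming the menuBook key path described earlier (adjust if your payload shape differs):

```python
def looks_blocked(status_code: int, next_data: dict) -> bool:
    """Heuristic block detector: hard errors, or a 200 whose menu is empty."""
    if status_code != 200:
        # 403/429/anything else: treat as blocked and retry elsewhere
        return True
    categories = (
        next_data.get("props", {})
        .get("pageProps", {})
        .get("storeMenuProps", {})
        .get("menuBook", {})
        .get("categories", [])
    )
    # a real store page always has at least one category; empty means
    # the anti-bot layer served us a hollow response
    return len(categories) == 0
```

wire this into your retry logic so a "successful" empty response burns the proxy and requeues the URL instead of writing an empty menu row.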

rotating proxies help, but the fingerprinting layer means bare requests from httpx or requests get flagged even with valid cookies. your client's TLS Client Hello needs to match a real browser's. curl-cffi is the standard fix in 2026 — it mimics the TLS handshake of Chrome or Safari.

from curl_cffi import requests as cffi_requests

r = cffi_requests.get(
    "https://www.doordash.com/store/some-restaurant-99999/",
    impersonate="chrome120",
    headers=headers,
)

for serious scale, headless browsers (Playwright with stealth) or commercial scraping APIs handle the fingerprinting for you. similar defenses show up across the food delivery space: if you’re also pulling from Grubhub, the approach in How to Scrape Grubhub Menu Data Across Cities (2026) covers that platform’s specific quirks in more detail.

scraping at scale: proxies, rate limits, and infrastructure

a single residential IP running 50 requests/hour stays under the radar for light single-restaurant monitoring. anything broader needs IP rotation.

| approach | cost | scale | detection risk |
| --- | --- | --- | --- |
| datacenter proxies | ~$1-3/GB | high throughput | high (easy fingerprint) |
| residential proxies | ~$5-15/GB | medium-high | medium |
| mobile proxies | ~$15-40/GB | medium | low (best for auth'd flows) |
| scraping API (Apify, ScraperAPI, Brightdata) | ~$1-5/1000 req | elastic | low (managed for you) |

for menu data without login, residential proxies at 1 request per 3-5 seconds per IP are stable. if you need to scrape authenticated cart and pricing flows (DoorDash shows different prices with DashPass vs without), mobile proxies are more reliable because they share the same IP type the DoorDash mobile app actually uses.
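the pacing above can be sketched as round-robin rotation with a per-IP gap. `fetch` is injected here to keep the sketch self-contained; in production it would be a curl_cffi request routed through the given proxy:

```python
import itertools
import random
import time

def paced_fetch(urls, proxies, min_gap=3.0, max_gap=5.0, fetch=None):
    """Round-robin over proxies, keeping each IP to one request per 3-5 s.

    last_used tracks the monotonic timestamp of each proxy's last request;
    we sleep only when a proxy comes back around too soon.
    """
    last_used = {p: 0.0 for p in proxies}
    rotation = itertools.cycle(proxies)
    results = []
    for url in urls:
        proxy = next(rotation)
        gap = random.uniform(min_gap, max_gap)  # jitter looks less robotic
        wait = last_used[proxy] + gap - time.monotonic()
        if wait > 0:
            time.sleep(wait)
        last_used[proxy] = time.monotonic()
        results.append(fetch(url, proxy))
    return results
```

with a pool of N residential IPs this sustains roughly N requests every 3-5 seconds while each individual IP stays at the safe rate.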

session management matters too. DoorDash sets dd_access_token and dd_refresh_token cookies on login. if you’re scraping at scale with real accounts, rotate accounts and cookies together, not just IPs. a valid account with a stale cookie from a different IP triggers a re-auth challenge.
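one way to enforce that coupling is to treat account, cookies, and proxy as a single unit and rotate whole sessions, never the parts independently. a sketch: the cookie names come from the article, everything else is illustrative:

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class ScrapeSession:
    """An account, its auth cookies, and its pinned proxy, kept together."""
    account: str
    proxy: str
    dd_access_token: str
    dd_refresh_token: str

    @property
    def cookies(self) -> dict:
        # cookie names DoorDash sets on login; values come from your auth flow
        return {
            "dd_access_token": self.dd_access_token,
            "dd_refresh_token": self.dd_refresh_token,
        }

def session_rotation(sessions):
    """Cycle whole sessions so cookies never show up from a foreign IP."""
    return itertools.cycle(sessions)
```

each request then takes `s = next(rotation)` and sends `s.cookies` through `s.proxy`, which avoids the stale-cookie-from-new-IP re-auth challenge described above.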

Uber Eats has a similar account-cookie coupling but is generally more tolerant of IP switches. there’s a full breakdown in How to Scrape Uber Eats Restaurant Listings at Scale (2026) if you’re running a multi-platform pipeline.

structuring the output: menus, modifiers, and pricing

raw DoorDash menu data is nested. a basic burger listing can have 4 levels of modifier groups (size, toppings, sauces, add-ons), and each modifier has its own price delta. flattening this into a usable schema takes some thought upfront or you’ll regret it later.

a clean schema for analysis looks like this:

  • restaurant_id (DoorDash internal store ID)
  • restaurant_name
  • scraped_at (unix timestamp)
  • item_id, item_name, category
  • base_price_cents
  • modifier_group_name, modifier_name, modifier_price_delta_cents
  • item_calories, item_image_url
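a flattening sketch against that schema. the `modifierGroups` / `modifiers` / `priceDelta` keys inside `item` are illustrative; the real keys depend on whether you parsed __NEXT_DATA__ or the GraphQL response:

```python
import time

def flatten_item(restaurant_id, restaurant_name, category, item):
    """Flatten one nested menu item into schema rows: a base row for the
    item itself, plus one row per item-modifier combination."""
    base = {
        "restaurant_id": restaurant_id,
        "restaurant_name": restaurant_name,
        "scraped_at": int(time.time()),
        "item_id": item["id"],
        "item_name": item["name"],
        "category": category,
        "base_price_cents": item["price"],
        "modifier_group_name": None,
        "modifier_name": None,
        "modifier_price_delta_cents": 0,
    }
    rows = [base]
    for group in item.get("modifierGroups", []):
        for mod in group.get("modifiers", []):
            row = dict(base)  # copy the base row, then fill modifier columns
            row["modifier_group_name"] = group["name"]
            row["modifier_name"] = mod["name"]
            row["modifier_price_delta_cents"] = mod.get("priceDelta", 0)
            rows.append(row)
    return rows
```

one row per combination is deliberately denormalized: it makes time-series price queries a flat filter instead of a four-level join.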

steps to build a clean pipeline:

  1. fetch restaurant listing pages or search results to collect store IDs and slugs
  2. for each store ID, fetch the menu page or GraphQL endpoint
  3. extract __NEXT_DATA__ or parse the GraphQL response
  4. flatten the nested modifier structure into rows (one row per item-modifier combo)
  5. write to postgres or parquet with scraped_at for time-series tracking
  6. schedule via Airflow or a simple cron every 6-24 hours depending on refresh needs
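the middle steps (2-5) can be wired together as a simple dependency-injected loop, which keeps fetching, parsing, flattening, and storage swappable. all five callables here are placeholders for your own implementations:

```python
def run_pipeline(store_ids, fetch_menu, extract_items, flatten, sink):
    """Run steps 2-5 for each store: fetch, extract, flatten, persist.

    store_ids comes from step 1 (listing/search scrape); step 6 is whatever
    scheduler (cron, Airflow) invokes this function.
    """
    for store_id in store_ids:
        payload = fetch_menu(store_id)           # step 2: page or API call
        for category, item in extract_items(payload):  # step 3: parse
            for row in flatten(store_id, category, item):  # step 4: rows
                sink(row)                         # step 5: postgres/parquet
```

structuring it this way also makes the pipeline testable without network access: stub `fetch_menu` with recorded payloads and assert on what reaches `sink`.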

DoorDash does update prices dynamically — surge pricing on busy nights is documented — so if you’re doing price intelligence the timestamp matters more than most fields.

if your scope extends beyond the US market, How to Scrape Deliveroo Restaurant Menus UK + EU (2026) and How to Scrape Foodpanda Menu Data Asia + EU (2026) cover the regional equivalents with their own anti-bot quirks.

legal and rate-limit considerations

DoorDash’s terms of service prohibit scraping. enforcement has been inconsistent, but the hiQ v. LinkedIn precedent and its follow-on cases suggest that scraping publicly visible data is defensible for research under the CFAA in the US. that said, this isn’t legal advice, and commercial use cases should run the ToS language by a lawyer first.

practically: DoorDash doesn’t gate its menus behind a login (you can see prices without an account). scraping publicly visible pages for research, price monitoring, or competitive analysis sits in a different risk bucket than scraping behind auth — similar to how How to Scrape Temu Product Data and Pricing in 2026 (Anti-Bot Guide) frames public product pricing.

and slow down. there’s no reason to hit 100 requests per second when 2-5 is fine for most use cases. fast and aggressive gets your proxy pool flagged and drives your costs up. slow and steady keeps you off their radar entirely.

Bottom line

for most menu scraping needs, the __NEXT_DATA__ approach with curl-cffi and residential proxy rotation is the fastest path to production-ready data. if you need DashPass pricing or modifier-level accuracy at high volume, the GraphQL endpoint with proper session management is worth the extra setup. DRT covers the full stack of food delivery scraping across platforms — check the sibling guides above if your pipeline spans more than one market.
