How to Scrape Hotels.com Pricing Across Markets (2026)

Hotels.com pricing is a moving target: the same room can show three different rates depending on your IP location, currency, and whether you’re logged in. If you’re building a travel price intelligence feed or competitor monitoring tool, scraping Hotels.com across markets is one of the harder OTA targets in 2026 — not because the data is unavailable, but because their bot detection stack has matured significantly since Expedia Group unified its tech under one platform.

How Hotels.com Structures Its Pricing API

Hotels.com was fully migrated onto the Expedia Group platform by late 2023. That matters for scraping because the underlying data contract is now shared with Expedia’s own properties. Pricing calls go through a GraphQL layer that sits behind Cloudflare, with Turnstile challenges on most search entry points.

The two endpoints worth targeting:

  • /api/graphql — the primary search and availability endpoint
  • /api/v4/properties/listing — returns price summaries per property for list views

The GraphQL endpoint requires an x-pp-experiments header and a Client-Info header with a generated session signature. These rotate, but the rotation period is long enough (typically 20-30 minutes) that a rotating session approach works well. If you’ve already worked through How to Scrape Expedia Hotel Inventory in 2026, the header shapes are almost identical — that guide is worth reading first since it covers the shared auth model in depth.
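One way to act on that rotation window is to timestamp each captured header bundle and treat it as expired once it reaches the conservative end of the range. The sketch below is only an illustration: the `SessionBundle` class and the 20-minute TTL constant are assumptions made for this example, not anything Hotels.com exposes.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

# Assumption: use the short end of the 20-30 minute rotation window as the TTL.
SESSION_TTL_SECONDS = 20 * 60


@dataclass
class SessionBundle:
    """Hypothetical container for headers and cookies captured from one browser session."""
    headers: dict
    cookies: dict = field(default_factory=dict)
    captured_at: float = field(default_factory=time.monotonic)

    def is_stale(self, now: Optional[float] = None) -> bool:
        """Return True once the bundle is at least SESSION_TTL_SECONDS old."""
        current = time.monotonic() if now is None else now
        return (current - self.captured_at) >= SESSION_TTL_SECONDS
```

A collection loop would check `is_stale()` before each batch and capture a fresh bundle when it returns True, rather than waiting for the first 403.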

Geo-Localized Pricing: Why IP Origin Matters More Than Currency Param

Hotels.com serves different base prices by IP geolocation, independent of the currency query parameter. A search from a Singapore IP for a Bangkok hotel will return a different rack rate than the same search from a US IP — often 8-18% lower on the Singapore side due to regional pricing agreements and tax display rules.

The practical implication: to build a true multi-market price comparison, you need residential or mobile IPs in each target market, not just a currency switcher. Datacenter IPs get flagged immediately on Hotels.com searches. The platform runs fingerprint checks that include IP reputation, TLS fingerprint, and header consistency — failing any one of these returns a 403 or a silent price distortion (the page loads but prices are inflated or missing).

Market      IP Type Required     Avg Price Delta vs US   CAPTCHA Rate (DC IP)
Singapore   Residential/Mobile   -12%                    ~90%
Germany     Residential          -8%                     ~85%
US          Any residential      baseline                ~60%
Japan       Mobile preferred     -15%                    ~95%
Brazil      Residential          +4%                     ~80%
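The delta column is a plain percentage difference against the US price for the same property and dates. A minimal sketch of that calculation follows, assuming every price has already been converted to one common currency; the function names are hypothetical and the numbers in the usage note are made up for illustration.

```python
def price_delta_pct(market_price: float, baseline_price: float) -> float:
    """Percent difference of a market price relative to the baseline price."""
    if baseline_price <= 0:
        raise ValueError("baseline_price must be positive")
    return round((market_price - baseline_price) / baseline_price * 100, 1)


def compare_markets(prices: dict, baseline: str = "US") -> dict:
    """Map each non-baseline market to its percent delta against the baseline market."""
    base = prices[baseline]
    return {
        market: price_delta_pct(price, base)
        for market, price in prices.items()
        if market != baseline
    }
```

For example, `compare_markets({"US": 200.0, "SG": 176.0})` reports Singapore at 12% below the US price. Comparing prices that are still in different currencies, or that mix tax-inclusive and tax-exclusive figures, would make the deltas meaningless.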

For Asia-Pacific pricing specifically, it’s also worth cross-referencing with How to Scrape Agoda Hotel Pricing for Asia (2026), since Agoda often has tighter inventory data for SEA markets and the price deltas tell a more complete story.

Session Management and Anti-Bot Evasion

Hotels.com’s bot detection relies heavily on behavioral signals in the first 3 seconds of a session: mouse movement patterns, scroll velocity, and Time-to-First-Byte on page resources. A raw HTTP client hitting the GraphQL endpoint directly without establishing a browser session first will get rate-limited within 5-10 requests.

The working pattern in 2026:

  1. Spin up a Playwright or Puppeteer instance with a residential proxy attached
  2. Load the Hotels.com homepage and wait 2-4 seconds (simulate landing)
  3. Trigger a search via UI interaction, not direct URL navigation
  4. Intercept the outgoing GraphQL request to capture the full headers and cookies
  5. Reuse that header bundle in a lightweight httpx client for the actual data collection loop

import httpx

# Headers captured from the browser session intercept (steps 1-4)
HOTELS_HEADERS = {
    "x-pp-experiments": "your-captured-value",
    "Client-Info": "your-session-signature",
    "Content-Type": "application/json",
    "Accept": "application/json",
}

GRAPHQL_URL = "https://www.hotels.com/api/graphql"

def fetch_hotel_prices(destination: str, checkin: str, checkout: str, proxy: str) -> dict:
    """POST one PropertySearch query through the given proxy and return the parsed JSON."""
    payload = {
        "operationName": "PropertySearch",
        "variables": {
            "destination": {"regionName": destination},
            "dateRange": {"checkInDate": checkin, "checkOutDate": checkout},
            "rooms": [{"adults": 2}],
        },
        # Placeholder: paste the full query text captured from the browser here
        "query": "query PropertySearch($destination: ...) { ... }",
    }
    # httpx 0.26+ takes `proxy=`; the older `proxies=` argument was removed in 0.28
    with httpx.Client(proxy=proxy, timeout=15) as client:
        r = client.post(GRAPHQL_URL, json=payload, headers=HOTELS_HEADERS)
        r.raise_for_status()  # surface 403/429 instead of parsing an error body
        return r.json()

The full GraphQL query body can be captured from browser DevTools during a live search — it’s not obfuscated. Refresh it every 48 hours since the schema evolves with A/B tests.
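Steps 1 to 4 of the pattern above can be sketched with Playwright's sync API. Treat this as an outline under assumptions, not a working recipe: the three CSS selectors are hypothetical arguments you would have to find by inspecting the live page, the header whitelist is a guess at what is worth keeping, and the request filter assumes the search call carries the `PropertySearch` operation name in its POST body.

```python
# Assumption: these are the headers worth carrying over to the HTTP client.
KEEP_HEADERS = ("x-pp-experiments", "client-info", "user-agent", "accept-language")


def pick_headers(raw: dict) -> dict:
    """Keep only whitelisted headers, matching names case-insensitively."""
    return {k: v for k, v in raw.items() if k.lower() in KEEP_HEADERS}


def capture_session(proxy_server: str, destination: str,
                    search_box: str, first_suggestion: str, submit: str) -> dict:
    """Drive one browser session and return headers, cookies, and the GraphQL POST body.

    search_box, first_suggestion and submit are hypothetical CSS selectors
    supplied by the caller; they are not taken from the Hotels.com markup.
    """
    from playwright.sync_api import sync_playwright  # lazy import: only needed here

    with sync_playwright() as p:
        browser = p.chromium.launch(proxy={"server": proxy_server})
        context = browser.new_context()
        page = context.new_page()
        page.goto("https://www.hotels.com/")
        page.wait_for_timeout(3000)  # step 2: pause on the homepage before acting

        # Steps 3-4: run the search through the UI and wait for the matching request.
        with page.expect_request(
            lambda r: "/api/graphql" in r.url
            and r.method == "POST"
            and "PropertySearch" in (r.post_data or "")
        ) as req_info:
            page.click(search_box)
            page.keyboard.type(destination, delay=120)
            page.click(first_suggestion)
            page.click(submit)

        request = req_info.value
        bundle = {
            "headers": pick_headers(request.headers),
            "cookies": {c["name"]: c["value"] for c in context.cookies()},
            "post_data": request.post_data,  # includes the full query text
        }
        browser.close()
        return bundle
```

Capturing `post_data` alongside the headers means the query body can be refreshed by the same job that refreshes the session, instead of being copied out of DevTools by hand.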

Parsing the Response: What Actually Contains the Price

The GraphQL response nests price data several levels deep: data.propertySearch.properties.items[].price.options[].formattedDisplayPrice. The options array often contains multiple price variants (member rates, mobile-only rates, pay-now vs. pay-later). Don’t take index 0 blindly; filter for options[].type == "STANDALONE" to get the baseline rack rate.

Taxes are a separate node (taxesAndFees) and are not included in formattedDisplayPrice by default for most EU and APAC markets. Your comparison logic needs to normalize for this. For Asian inventory comparisons, How to Scrape Trip.com (Asia Inventory) at Scale (2026) covers how Trip.com handles tax-inclusive vs. exclusive display, which is a useful reference when building a unified price schema.
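A rough sketch of that normalization step follows. It rests on several assumptions: that you have reduced the taxesAndFees node to a single formatted amount, that amounts use "." for decimals and "," for thousands, and that you track per market whether the displayed price already includes tax. The function names are hypothetical.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_amount(formatted: str) -> Decimal:
    """Parse a string such as "$1,234.50" or "S$189" into a Decimal.

    Assumes "." is the decimal separator; formats like "1.234,50" are not handled.
    """
    digits = re.sub(r"[^\d.]", "", formatted)
    if not digits:
        raise ValueError(f"no numeric amount in {formatted!r}")
    return Decimal(digits)


def tax_inclusive_total(display_price: str,
                        taxes_and_fees: Optional[str],
                        display_includes_tax: bool) -> Decimal:
    """Return a tax-inclusive total so prices from different markets are comparable."""
    base = parse_amount(display_price)
    if display_includes_tax or not taxes_and_fees:
        return base
    return base + parse_amount(taxes_and_fees)
```

Using `Decimal` rather than `float` keeps currency arithmetic exact, which matters once the totals feed into percentage comparisons across markets.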

The fields worth extracting per property:

  • propertyId — stable across markets, usable as a join key
  • name, starRating, guestReviews.rating
  • price.options[].formattedDisplayPrice
  • price.options[].strikeOut (original price, useful for discount tracking)
  • availability.minRoomsLeft — useful for urgency signal modeling
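The extraction described above can be sketched as a pair of small functions. The field paths follow the ones named in this section; the helper names are hypothetical, and the code assumes nothing about the shape of strikeOut beyond passing it through unchanged.

```python
from typing import Optional


def dig(obj, *keys):
    """Walk nested dicts, returning None as soon as a level is missing."""
    for key in keys:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def baseline_option(options) -> Optional[dict]:
    """Return the first price option whose type is STANDALONE, if any."""
    for opt in options or []:
        if isinstance(opt, dict) and opt.get("type") == "STANDALONE":
            return opt
    return None


def extract_properties(response: dict) -> list:
    """Flatten a PropertySearch response into one row per property."""
    items = dig(response, "data", "propertySearch", "properties", "items") or []
    rows = []
    for item in items:
        opt = baseline_option(dig(item, "price", "options"))
        rows.append({
            "propertyId": item.get("propertyId"),
            "name": item.get("name"),
            "starRating": item.get("starRating"),
            "guestRating": dig(item, "guestReviews", "rating"),
            "price": opt.get("formattedDisplayPrice") if opt else None,
            "strikeOut": opt.get("strikeOut") if opt else None,
            "minRoomsLeft": dig(item, "availability", "minRoomsLeft"),
        })
    return rows
```

Rows with `price` set to None are worth logging rather than dropping: a property with no STANDALONE option may indicate a schema change rather than missing inventory.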

Scaling Across Markets Without Getting Blocked

For multi-market coverage at scale, a per-market IP pool is non-negotiable. A single proxy rotating through global IPs will contaminate your price data with mixed geolocations. Structure your collection jobs by market, with each market running on a dedicated pool of residential IPs from that country.

Rate limits on Hotels.com’s GraphQL endpoint sit around 120-150 requests per IP per hour before soft throttling kicks in. At the lower bound of that range, a 10-IP pool per market handles roughly 1,200 property checks per hour per market (10 IPs × 120 requests, counting one request per check), which is sufficient for most competitive monitoring use cases.
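The capacity arithmetic and the one-pool-per-market rule can be expressed in a few lines. This is a sketch under assumptions: it budgets at the 120 requests per hour lower bound, rotates proxies round-robin, and the `MarketPool` class is a hypothetical name introduced for this example.

```python
import itertools

# Assumption: budget against the lower bound of the 120-150 requests/IP/hour range.
REQUESTS_PER_IP_PER_HOUR = 120


def hourly_capacity(pool_size: int, per_ip: int = REQUESTS_PER_IP_PER_HOUR) -> int:
    """Total requests per hour a pool can make while staying within the per-IP budget."""
    return pool_size * per_ip


class MarketPool:
    """Round-robin pool of proxies that all sit in a single market."""

    def __init__(self, market: str, proxies: list):
        if not proxies:
            raise ValueError("a market pool needs at least one proxy")
        self.market = market
        self.size = len(proxies)
        self._cycle = itertools.cycle(proxies)

    def next_proxy(self) -> str:
        """Return the next proxy in rotation."""
        return next(self._cycle)

    def delay_between_requests(self) -> float:
        """Seconds to wait between consecutive requests across the whole pool."""
        per_ip_interval = 3600 / REQUESTS_PER_IP_PER_HOUR
        return per_ip_interval / self.size
```

With 10 proxies, `delay_between_requests()` works out to 3 seconds, so each individual IP is used once every 30 seconds. Keeping one `MarketPool` per country prevents a job from mixing geolocations in a single price series.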

If you’re also pulling flight data to pair with hotel pricing, How to Scrape Kayak Flight + Hotel Data (2026) covers the flight-side collection and how to join it with OTA hotel records into a unified travel package dataset.

For error handling, the response codes you’ll actually see in production:

  • 200 + empty items array — geo mismatch or low inventory (not a bug)
  • 403 — session invalid or IP flagged, rotate both
  • 429 — rate limit hit, back off 90 seconds minimum
  • 200 + distorted prices — silent block, verify against known properties
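Those four cases map naturally onto a small classifier that the collection loop can switch on. The sketch below uses hypothetical names throughout; `canary_ok` stands for whatever check you run against known reference properties, and the `RETRY` fallback for other status codes is an assumption added for completeness.

```python
from enum import Enum

BACKOFF_SECONDS = 90  # minimum wait after a 429


class Action(Enum):
    ACCEPT = "accept"                # 200 with plausible prices
    ACCEPT_EMPTY = "accept_empty"    # 200 with no items: geo mismatch or low inventory
    ROTATE = "rotate"                # 403: replace both the session and the IP
    BACK_OFF = "back_off"            # 429: sleep at least BACKOFF_SECONDS
    QUARANTINE = "quarantine"        # 200 but reference prices look distorted
    RETRY = "retry"                  # any other status (assumed fallback)


def classify_response(status_code: int, items: list, canary_ok: bool = True) -> Action:
    """Decide what the collection loop should do with one response."""
    if status_code == 403:
        return Action.ROTATE
    if status_code == 429:
        return Action.BACK_OFF
    if status_code == 200:
        if not items:
            return Action.ACCEPT_EMPTY
        if not canary_ok:
            return Action.QUARANTINE
        return Action.ACCEPT
    return Action.RETRY
```

Quarantined responses are best stored separately rather than discarded, since the size of the distortion is itself a useful signal about which IPs or sessions have been flagged.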

If you’re building out broader data collection pipelines and want a comparison of scraping frameworks and review data sources, How to Scrape G2.com and Capterra SaaS Reviews Programmatically is a useful reference for seeing how the same session management and pagination patterns apply across completely different site types.

Bottom Line

Hotels.com is a viable scraping target in 2026 if you treat geo-accurate residential IPs as infrastructure, not an afterthought, and build your session layer around browser-intercepted headers rather than hand-crafting auth. The Expedia Group platform convergence actually simplifies things slightly — patterns that work on Hotels.com translate directly to other Expedia properties. DRT will keep this guide updated as the GraphQL schema evolves through 2026.
