how to scrape Booking.com hotel prices (2026 anti-bot guide)
Booking.com is one of the harder travel sites to scrape in 2026 because it sits behind Akamai Bot Manager plus its own dynamic pricing layer. you need residential proxies, a real headless browser like Playwright, careful rate limits, and the right strategy for handling per-session price tokens. this guide walks through working Python code, the Akamai-specific gotchas, and what data you can actually extract reliably.
what you can scrape from Booking.com
Booking.com pages are deeply dynamic, but the high-value fields are stable enough for production scrapers.
| field | location | difficulty |
|---|---|---|
| hotel name, location | search results, hotel page | easy |
| star rating, review score | search results | easy |
| nightly price (with dates) | search results | medium |
| total price + taxes | hotel page | medium |
| room types and inclusions | hotel page | medium |
| availability calendar | hotel page (dynamic) | hard |
| review text + reviewer location | reviews tab | hard |
| photos | hotel page | easy |
most price-monitoring use cases need only the first three. for anything more complex, expect more anti-bot friction. for the broader use case, our price monitoring proxy guide covers infrastructure decisions.
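once those fields are extracted, the raw price strings still need normalizing before you can compare or chart them. here's a minimal sketch of one way to do it (`parse_price` is a hypothetical helper, not part of any library; Booking.com's exact price formats vary by locale, so treat the regex and the decimal-comma heuristic as a starting point):

```python
import re

def parse_price(raw: str):
    """Split a scraped price string like 'S$ 1,024' or 'US$250.50'
    into (currency_marker, amount). Returns None if no number is found."""
    raw = raw.replace('\xa0', ' ').strip()  # non-breaking spaces are common in scraped HTML
    m = re.search(r'([\d.,]*\d)', raw)
    if not m:
        return None
    number = m.group(1)
    # whatever surrounds the number is treated as the currency marker
    currency = raw[:m.start()].strip() or raw[m.end():].strip()
    # heuristic: a lone comma followed by exactly two digits is a decimal comma,
    # otherwise commas are thousands separators
    if ',' in number and '.' not in number and len(number.split(',')[-1]) == 2:
        number = number.replace(',', '.')
    else:
        number = number.replace(',', '')
    return currency, float(number)
```

this keeps the currency as an opaque marker rather than mapping it to ISO codes; pinning `selected_currency` at scrape time (covered below under dynamic prices) is more reliable than guessing from symbols.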
the Akamai problem
Booking.com runs Akamai Bot Manager which inspects three things on every request: TLS fingerprint (JA3/JA4), HTTP/2 fingerprint, and a per-session token called _abck that gets validated against a sensor payload generated by client-side JavaScript.
plain requests or httpx will fail because the TLS fingerprint reveals Python instantly. even curl gets blocked. you need either a real browser (Playwright, Puppeteer) or a TLS-impersonating client like curl_cffi.
if you want a deeper look at Akamai itself, our Akamai bypass guide covers the mechanism in detail. the same techniques apply directly to Booking.com.
install the stack
```shell
pip install playwright curl_cffi parsel
playwright install chromium
```
Playwright launches a real Chromium browser. curl_cffi impersonates Chrome’s TLS fingerprint for the lighter price-checks where you don’t need full JavaScript rendering. parsel parses the HTML.
scrape search results with Playwright
start with a search page that returns hotels for a city and date range.
```python
import asyncio

from parsel import Selector
from playwright.async_api import async_playwright

PROXY = {'server': 'http://gateway.example.com:8000', 'username': 'u', 'password': 'p'}
URL = 'https://www.booking.com/searchresults.html?ss=Singapore&checkin=2026-06-01&checkout=2026-06-03&group_adults=2'

async def scrape_search():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True, proxy=PROXY)
        ctx = await browser.new_context(
            user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
            locale='en-US',
            viewport={'width': 1366, 'height': 900},
        )
        page = await ctx.new_page()
        await page.goto(URL, wait_until='networkidle')
        await page.wait_for_selector('div[data-testid="property-card"]', timeout=15000)
        html = await page.content()
        await browser.close()

    # parse outside the browser context -- the HTML snapshot is all we need
    sel = Selector(text=html)
    hotels = []
    for card in sel.css('div[data-testid="property-card"]'):
        hotels.append({
            'name': card.css('div[data-testid="title"]::text').get(''),
            'location': card.css('span[data-testid="address"]::text').get(''),
            'score': card.css('div[data-testid="review-score"] div::text').get(''),
            'price': card.css('span[data-testid="price-and-discounted-price"]::text').get(''),
            'url': card.css('a[data-testid="title-link"]::attr(href)').get(''),
        })
    return hotels

hotels = asyncio.run(scrape_search())
```
data-testid selectors are the most stable. CSS class names on Booking.com change frequently because their build pipeline auto-generates them. the testid attributes survive UI tweaks because they’re part of the QA harness.
handle the cookie consent banner
first-time visitors see a GDPR banner that blocks page interactions. dismiss it before doing anything.
```python
try:
    await page.click('button#onetrust-accept-btn-handler', timeout=3000)
except Exception:
    pass  # banner not shown for this session -- nothing to dismiss
```
wrap it in a try/except because the banner only shows for fresh sessions. if your proxy gives you a sticky session, the cookie persists and the banner doesn’t appear on the next request.
rotate proxies and sessions
Booking.com’s per-IP rate limit is roughly 30-60 requests per hour before Akamai starts challenging you. with a residential pool, you want sticky sessions of 10-15 minutes per IP, then rotate.
```python
async def scrape_many_cities(cities):
    async with async_playwright() as p:
        for city in cities:
            session_id = f'session-{city}'
            proxy = {
                'server': 'http://gateway.example.com:8000',
                'username': f'user-{session_id}',  # yields user-session-<city>: sticky IP per city
                'password': 'pass',
            }
            browser = await p.chromium.launch(headless=True, proxy=proxy)
            ctx = await browser.new_context()
            page = await ctx.new_page()
            # scrape this city
            ...
            await browser.close()
            await asyncio.sleep(5)  # brief pause between cities to stay under rate limits
```
most residential proxy providers let you specify a session ID in the username (user-session-XXX). same session ID = same IP. change the ID to rotate. our Akamai bypass guide covers the fingerprinting layer in more depth.
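one way to generate those session IDs deterministically is to hash a task key together with a rotation bucket, so the same scrape re-uses an IP until the bucket ticks over (e.g. pass `minutes_elapsed // 15` to rotate every 15 minutes). a sketch, assuming the gateway host and the `user-session-XXX` username format described above; check your provider's docs for the exact syntax:

```python
import hashlib

def sticky_proxy(task_key: str, rotation_bucket: int) -> dict:
    """Build a Playwright-style proxy dict whose username encodes a session ID.
    Same (task_key, rotation_bucket) -> same session ID -> same exit IP.
    Changing rotation_bucket forces a new IP for the same task."""
    sid = hashlib.sha1(f'{task_key}:{rotation_bucket}'.encode()).hexdigest()[:12]
    return {
        'server': 'http://gateway.example.com:8000',
        'username': f'user-session-{sid}',
        'password': 'pass',
    }
```

hashing keeps the ID opaque and length-bounded even when the task key is a long URL or city + date combination.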
handle pagination
Booking.com paginates with offset query params. add &offset=25 (or 50, 75, etc.) to the search URL.
```python
for offset in range(0, 250, 25):
    url = f'{base_url}&offset={offset}'
    await page.goto(url, wait_until='networkidle')
    # extract cards from this page
```
each page returns roughly 25 results. don’t paginate past 1,000 results from the same search; Akamai flags deep pagination as automation. for big crawls, split by city + date pair instead.
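a small helper can pre-build the paginated URL list with that depth cap baked in. the `search_pages` function and its defaults are illustrative, not part of any official API; the 25-per-page size matches what Booking.com currently returns but is subject to change:

```python
from urllib.parse import urlencode

def search_pages(base_params: dict, max_results: int = 250, page_size: int = 25) -> list:
    """Generate paginated search-result URLs for one query.
    Caps depth well short of the ~1,000-result mark that tends to
    attract anti-bot attention on deep pagination."""
    base = 'https://www.booking.com/searchresults.html'
    urls = []
    for offset in range(0, max_results, page_size):
        params = {**base_params, 'offset': offset}
        urls.append(f'{base}?{urlencode(params)}')
    return urls

pages = search_pages({'ss': 'Singapore', 'checkin': '2026-06-01', 'checkout': '2026-06-03'})
```

driving the crawl from a pre-built list also makes it easy to shard work across sessions: one sticky session per city, each consuming its own URL list.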
scrape an individual hotel page
hotel pages contain pricing per room type, availability, and amenity details.
```python
async def scrape_hotel(url):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True, proxy=PROXY)
        ctx = await browser.new_context(locale='en-US')
        page = await ctx.new_page()
        await page.goto(url, wait_until='networkidle')
        await page.wait_for_selector('h2[data-testid="property-header-name"]', timeout=15000)
        html = await page.content()
        await browser.close()

    sel = Selector(text=html)
    return {
        'name': sel.css('h2[data-testid="property-header-name"]::text').get(''),
        'address': sel.css('span[data-testid="address"]::text').get(''),
        'rating': sel.css('div[data-testid="review-score-component"] div::text').get(''),
        'rooms': sel.css('table#hprt-table tr.hprt-table-row').getall(),
    }
```
the room table is the trickiest part because the markup is legacy (table-based, lots of nested rows for room options). parse it row by row and join with the room-name column.
handle dynamic prices
prices on Booking.com depend on cookies, locale, and currency settings. the same hotel can show different prices to different users in the same city. for accurate price monitoring you need to:
- set a consistent locale and currency at session start
- use the same IP geolocation for repeat scrapes (US IP = USD by default)
- include an explicit `&selected_currency=USD` in the URL
- compare like-for-like by date range and occupancy
if your IP rotates between countries mid-session, prices will jump because the currency conversion changes. residential pools with country-targeting solve this.
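a small helper can enforce the currency pin on every URL before you fetch it. `pin_currency` is a hypothetical utility built on stdlib `urllib.parse`, shown here as a sketch:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def pin_currency(url: str, currency: str = 'USD') -> str:
    """Force selected_currency on a Booking.com URL so repeat scrapes
    of the same hotel are comparable. Overwrites any existing value."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    params['selected_currency'] = currency
    return urlunsplit(parts._replace(query=urlencode(params)))
```

`dict(parse_qsl(...))` collapses duplicate query keys, which is fine here since the currency is overwritten anyway; run every search and hotel URL through this before queueing it.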
faq
can I scrape Booking.com without Playwright?
yes for static-looking pages, no for anything that involves dynamic price tokens. curl_cffi with Chrome's TLS fingerprint can fetch some search-result HTML, but room-level pricing and availability require a real JavaScript runtime. start with Playwright for reliability, and optimize down to lighter clients only after you understand which pages are safely fetchable.
what proxies do I need?
residential is the minimum. mobile is overkill unless you’re scraping at very high volume. datacenter IPs get blocked instantly because Akamai recognizes the AS numbers. for provider picks, see our provider comparison.
how do I avoid the Akamai _abck challenge?
use a real browser (Playwright with stealth), don’t disable JavaScript, keep cookies across requests in the same session, and respect rate limits. headers alone won’t pass; the sensor payload requires real DOM execution.
is scraping Booking.com legal?
public price data is generally legal to collect, but Booking.com’s terms of service prohibit automated access. for personal research, low risk. for commercial use, consult a lawyer and consider their official Booking.com Affiliate Partner Program for hotel data instead. for related legal context, our web scraping legal guide covers the broader rules.
why are my scraped prices different from what I see in my browser?
prices vary by IP geolocation, currency, device type, and even browsing history. always pin currency, locale, viewport size, and user agent. if your scraper IP is in Singapore but you want US-resident pricing, get a US residential IP.
how often does Booking.com change selectors?
data-testid attributes are stable across most updates. CSS class names rotate with each deploy (often weekly). build parsers around testids, not classes, and you’ll cut maintenance to once per quarter instead of weekly.
conclusion
scraping Booking.com works in 2026 if you bring real browser automation, residential proxies, and respect for the per-IP rate limits. Akamai is the main obstacle and Playwright (or any real Chromium) handles it transparently as long as you don’t disable JavaScript or strip cookies.
focus on the data-testid selectors, pin your locale and currency, and rotate sticky sessions every 10-15 minutes. that combination keeps you under the radar while collecting clean price data at meaningful volume.
if you’re doing this commercially, consider Booking.com’s official affiliate API or the managed scraping APIs that handle the anti-bot for you. for personal research and price-comparison side projects, the Playwright approach in this guide is plenty.