How to Scrape Booking.com Hotel Data with Proxies
Booking.com is the world’s largest online travel agency for accommodations, with over 28 million accommodation listings across more than 220 countries and territories. For travel businesses, hospitality analysts, and price comparison platforms, this data is invaluable, but extracting it reliably requires the right proxy infrastructure.
This guide walks through the technical approach to scraping Booking.com for hotel prices, availability, reviews, and property details using DataResearchTools mobile proxies.
What Data Can You Extract from Booking.com
Available Data Points
Booking.com’s property listings contain rich, structured data:
Property Information:
- Hotel name and star rating
- Address and coordinates
- Property type (hotel, hostel, apartment, villa)
- Amenities and facilities list
- Photos (URLs)
- Check-in/check-out policies
Pricing Data:
- Room rates by room type
- Pricing for different date ranges
- Taxes and fees breakdown
- Discounts (Genius member, mobile deals, early booker)
- Cancellation policy details
- Breakfast inclusion status
Review Data:
- Overall score and category scores (cleanliness, location, staff, etc.)
- Individual review text and ratings
- Reviewer nationality and travel type
- Review date
Availability Data:
- Room availability by date
- Last booking indicators (“Only 2 rooms left!”)
- Sold-out status
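The data points above can be captured in a small record schema before any scraping code is written. The following is a minimal sketch using Python dataclasses; the class and field names (`PropertyRecord`, `RoomOffer`, `rooms_left`, and so on) are illustrative assumptions, not a Booking.com schema, and the sample values are placeholders.

```python
from dataclasses import dataclass, field, asdict
from datetime import date, datetime, timezone
from decimal import Decimal
from typing import List, Optional


@dataclass
class RoomOffer:
    """One priced room option for a specific stay (hypothetical field names)."""
    room_type: str
    price: Decimal
    currency: str
    checkin: date
    checkout: date
    breakfast_included: bool = False
    free_cancellation: bool = False
    rooms_left: Optional[int] = None  # taken from "Only 2 rooms left!" style badges


@dataclass
class PropertyRecord:
    """Property-level data plus the offers seen at scrape time."""
    name: str
    property_type: str
    address: str
    star_rating: Optional[int] = None
    review_score: Optional[float] = None
    amenities: List[str] = field(default_factory=list)
    offers: List[RoomOffer] = field(default_factory=list)
    # Every record carries its own timestamp because prices change over time.
    scraped_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


# Placeholder values, for illustration only.
record = PropertyRecord(
    name="Example Hotel",
    property_type="hotel",
    address="1 Example Road, Singapore",
    star_rating=4,
    offers=[
        RoomOffer(
            room_type="Deluxe Double",
            price=Decimal("245.00"),
            currency="SGD",
            checkin=date(2025, 1, 10),
            checkout=date(2025, 1, 12),
            rooms_left=2,
        )
    ],
)
row = asdict(record)  # nested dict, ready to serialize or insert into a database
```

Keeping prices as `Decimal` rather than `float` avoids rounding drift when rates are later compared or aggregated.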
Why This Data Matters
| Use Case | Key Data Needed | Business Value |
|---|---|---|
| Price comparison platform | Rates, availability, room types | Consumer transparency |
| Hotel revenue management | Competitor rates, occupancy signals | Pricing optimization |
| Market research | Pricing trends, review sentiment | Investment decisions |
| Travel agency | Rates across OTAs, direct rates | Best-price sourcing |
| Hospitality consulting | Benchmarking data, market positioning | Client advisory |
Booking.com’s Anti-Scraping Measures
Booking.com employs one of the most sophisticated anti-scraping systems in the travel industry.
Detection Layers
Layer 1: IP Reputation
- Datacenter IP ranges are blocked or heavily restricted.
- IPs with high request volumes are flagged and throttled.
- Known proxy/VPN IP ranges receive degraded service or blocks.
Layer 2: Browser Fingerprinting
- JavaScript-based fingerprinting checks for headless browser signatures.
- Canvas fingerprint, WebGL renderer, and font enumeration are analyzed.
- Inconsistencies between reported user-agent and actual browser capabilities trigger flags.
Layer 3: Behavioral Analysis
- Request timing patterns are analyzed for automation signatures.
- Navigation flow is evaluated — bots that skip directly to search results are flagged.
- Mouse movement and scroll behavior may be monitored on some pages.
Layer 4: CAPTCHA and Verification
- Suspected bots receive CAPTCHA challenges.
- Persistently flagged IPs may receive phone verification prompts.
- Some blocked requests return HTTP 429 (Too Many Requests) with extended cooldown periods.
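A scraper needs to recognize these outcomes before it tries to parse a page. Below is a rough sketch of a response classifier. Only the HTTP 429 case comes from the list above; treating 401/403 as a block and the marker strings used to spot a CAPTCHA page are assumptions that should be tuned against real responses.

```python
from typing import Tuple

# Hypothetical marker strings; inspect real challenge pages and adjust.
CAPTCHA_MARKERS: Tuple[str, ...] = ("captcha", "verify you are human")


def classify_response(status: int, body: str) -> str:
    """Return 'rate_limited', 'blocked', 'captcha', or 'ok' for a fetched page."""
    if status == 429:
        return "rate_limited"  # back off for an extended cooldown
    if status in (401, 403):
        return "blocked"  # assumption: treated as an IP-level block
    lowered = body.lower()
    if any(marker in lowered for marker in CAPTCHA_MARKERS):
        return "captcha"
    return "ok"
```

A caller might pause the session on `rate_limited`, switch IPs on `blocked` or `captcha`, and only run extraction on `ok`.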
Why Mobile Proxies Bypass These Defenses
DataResearchTools mobile proxies are effective against Booking.com’s defenses because:
- IP reputation: Mobile carrier IPs generally carry high trust scores because millions of real Booking.com users connect through them every day.
- IP classification: Anti-bot systems classify mobile IPs as legitimate user traffic, not proxy traffic.
- Shared IP pools: Mobile IPs are naturally shared among many users via carrier-grade NAT, so multiple requests from the same IP are expected behavior.
- Geo-accuracy: Mobile IPs from DataResearchTools resolve to the correct country, ensuring localized pricing without geo-detection flags.
Step-by-Step Scraping Setup
Prerequisites
- Python 3.8+ with Playwright or Selenium
- DataResearchTools mobile proxy credentials
- A database for storing results (PostgreSQL recommended)
Step 1: Configure Proxy Connection
Set up your DataResearchTools mobile proxy for Booking.com scraping:

    PROXY_CONFIG = {
        "server": "http://sg.dataresearchtools.com:10001",
        "username": "your_username",
        "password": "your_password",
    }

For geo-targeted pricing, configure proxies for each target country:
| Country | Use Case | Endpoint |
|---|---|---|
| Singapore | SGD pricing, SG promotions | sg.dataresearchtools.com |
| Thailand | THB pricing, TH promotions | th.dataresearchtools.com |
| Malaysia | MYR pricing, MY promotions | my.dataresearchtools.com |
| Indonesia | IDR pricing, ID promotions | id.dataresearchtools.com |
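The endpoint table can be turned into a small lookup so each country's proxy settings are built the same way. This is a sketch only: the hostnames come from the table, but port 10001 is known from the Singapore example alone, and reusing it for the other countries is an assumption to verify against your account details. The helper name `proxy_for` is hypothetical.

```python
from typing import Dict

PROXY_HOSTS: Dict[str, str] = {
    "sg": "sg.dataresearchtools.com",
    "th": "th.dataresearchtools.com",
    "my": "my.dataresearchtools.com",
    "id": "id.dataresearchtools.com",
}

# Assumption: the port shown for Singapore also applies to the other endpoints.
DEFAULT_PORT = 10001


def proxy_for(country: str, username: str, password: str,
              port: int = DEFAULT_PORT) -> Dict[str, str]:
    """Build a Playwright-style proxy dict for a two-letter country code."""
    try:
        host = PROXY_HOSTS[country.lower()]
    except KeyError:
        raise ValueError(f"No proxy endpoint configured for {country!r}") from None
    return {
        "server": f"http://{host}:{port}",
        "username": username,
        "password": password,
    }
```

For example, `proxy_for("th", "your_username", "your_password")` returns a dict in the same shape as `PROXY_CONFIG`, pointing at the Thailand endpoint.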
Step 2: Set Up Browser Automation
Booking.com requires full JavaScript rendering. A simple HTTP request will not return complete data.
Playwright setup (recommended):

    from playwright.async_api import async_playwright


    async def create_browser(proxy_config):
        # Start Playwright manually; the caller should close the browser when done.
        pw = await async_playwright().start()
        browser = await pw.chromium.launch(
            headless=True,
            proxy={
                "server": proxy_config["server"],
                "username": proxy_config["username"],
                "password": proxy_config["password"],
            },
        )
        # Mobile user agent, viewport, and locale are set together on the context.
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (Linux; Android 14; SM-S928B) "
                       "AppleWebKit/537.36 (KHTML, like Gecko) "
                       "Chrome/122.0.0.0 Mobile Safari/537.36",
            viewport={"width": 412, "height": 915},
            locale="en-SG",
        )
        return browser, context

Key configuration points:
- Use a mobile user-agent string matching a popular device.
- Set viewport to a mobile device resolution.
- Set locale to match your proxy country.
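One way to keep these three settings consistent with the proxy country is to generate the context options from a single table. The sketch below returns keyword arguments suitable for `browser.new_context(**options)`. Only the `en-SG` locale appears in this guide; the other locales, the timezone values, and the added `timezone_id`, `is_mobile`, and `has_touch` options are assumptions that go beyond the setup shown above.

```python
from typing import Any, Dict, Tuple

MOBILE_UA = (
    "Mozilla/5.0 (Linux; Android 14; SM-S928B) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/122.0.0.0 Mobile Safari/537.36"
)

# (locale, timezone) per proxy country. Everything except "en-SG" is an
# assumption; Indonesia spans several timezones, so adjust as needed.
COUNTRY_PROFILES: Dict[str, Tuple[str, str]] = {
    "sg": ("en-SG", "Asia/Singapore"),
    "th": ("th-TH", "Asia/Bangkok"),
    "my": ("ms-MY", "Asia/Kuala_Lumpur"),
    "id": ("id-ID", "Asia/Jakarta"),
}


def context_options(country: str) -> Dict[str, Any]:
    """Return new_context() keyword arguments matched to the proxy country."""
    locale, timezone_id = COUNTRY_PROFILES[country.lower()]
    return {
        "user_agent": MOBILE_UA,
        "viewport": {"width": 412, "height": 915},
        "locale": locale,
        "timezone_id": timezone_id,
        "is_mobile": True,
        "has_touch": True,
    }
```

Deriving locale and timezone from the same key as the proxy endpoint reduces the chance of a mismatch such as a Thai IP paired with a Singapore locale.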
Step 3: Navigate and Search
Mimic natural user behavior when navigating Booking.com:
- Visit the homepage first — do not jump directly to a search URL.
- Wait 2-3 seconds after page load before interacting.
- Enter search parameters through the search form (destination, dates, guests).
- Submit the search and wait for results to load.

    from playwright.async_api import TimeoutError as PlaywrightTimeoutError


    async def search_hotels(page, destination, checkin, checkout, guests=2):
        # Navigate to homepage
        await page.goto("https://www.booking.com")
        await page.wait_for_timeout(2000)
        # Close the sign-in popup if it appears
        try:
            await page.click('[aria-label="Dismiss sign-in info."]', timeout=3000)
        except PlaywrightTimeoutError:
            pass  # No popup was shown within 3 seconds
        # Enter destination
        await page.click('[data-testid="destination-container"]')
        await page.fill('input[name="ss"]', destination)
        await page.wait_for_timeout(1500)
        # Select from autocomplete
        await page.click('[data-testid="autocomplete-result"]')
        # Set dates and search
        # (Date picker interaction varies by Booking.com's current UI)
        # ...
        await page.click('[data-testid="submit-button"]')
        await page.wait_for_load_state("networkidle")

Step 4: Extract Listing Data
Once search results load, extract key data points:
From search results page:
- Property name and link
- Star rating
- Review score and count
- Price per night
- Distance from center
- Key amenities
From individual property pages (for detailed data):
- Full room type list with prices
- Detailed amenity breakdown
- All reviews
- Cancellation policies
- Photos
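Scraped price text usually arrives as a display string rather than a number. The following is a minimal sketch of a price parser, assuming the currency marker precedes the amount and that amounts use a comma as the thousands separator and a dot as the decimal mark. Formats that break those assumptions return `None` instead of a wrong value. The function name `parse_price` is hypothetical.

```python
import re
from decimal import Decimal
from typing import Optional, Tuple

# Currency marker (symbol or code), optional space, then the amount.
_PRICE_RE = re.compile(r"^\s*(?P<cur>[^\d\s]+)\s*(?P<amt>\d[\d,]*(?:\.\d+)?)\s*$")


def parse_price(text: str) -> Optional[Tuple[str, Decimal]]:
    """Split a display string such as 'S$ 245' into (currency, amount)."""
    # Rendered prices often contain non-breaking spaces.
    match = _PRICE_RE.match(text.replace("\u00a0", " "))
    if not match:
        return None
    amount = Decimal(match.group("amt").replace(",", ""))
    return match.group("cur"), amount
```

Markets that write thousands with dots would need a locale-aware variant; returning `None` makes such cases easy to spot during validation.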
Step 5: Handle Pagination
Booking.com search results are paginated, typically showing 25 properties per page.
- Scroll to the bottom of each page to trigger lazy-loaded content.
- Click the “Next page” button or load the next offset parameter.
- Maintain the same proxy session (sticky IP) throughout pagination to avoid detection.
- Add randomized delays of 5-8 seconds between page loads.
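If you paginate by URL rather than by clicking the next-page button, the next page's address can be derived from the current one. This sketch assumes the query parameter is literally named `offset` and counts properties, not pages, with 25 results per page as noted above. The `searchresults.html` path in the usage example is likewise an assumption; check both against the URLs your browser actually produces.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_offset(search_url: str, page_index: int, page_size: int = 25) -> str:
    """Return search_url with its offset set for a zero-based page index."""
    if page_index < 0:
        raise ValueError("page_index must be >= 0")
    parts = urlsplit(search_url)
    # Drop any existing offset, keep every other parameter in order.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "offset"]
    query.append(("offset", str(page_index * page_size)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For instance, `with_offset("https://www.booking.com/searchresults.html?ss=Singapore", 2)` yields the same URL with `offset=50` appended.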
Optimizing Your Scraping Strategy
Session Management
Sticky sessions are essential for Booking.com scraping:
- Maintain the same DataResearchTools mobile IP throughout a complete search session (search + pagination + property detail pages).
- Session duration of 15-20 minutes works well.
- Rotate to a new IP between different search sessions (different destinations or date ranges).
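These rules can be tracked with a small helper that decides when a new IP is due. The sketch below is an assumption-laden illustration: `StickySession` is a hypothetical class, the 15-minute default reflects the lower end of the range above, and how a rotation is actually requested depends on your proxy plan, so that step is left to the caller.

```python
import time
from typing import Callable, Optional


class StickySession:
    """Tracks the age and search context of the current proxy session."""

    def __init__(self, max_age_seconds: float = 15 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_age = max_age_seconds
        self._clock = clock
        self._started_at = clock()
        self._search_key: Optional[str] = None

    def start(self, search_key: str) -> None:
        """Call right after switching to a fresh IP for a new search."""
        self._started_at = self._clock()
        self._search_key = search_key

    def should_rotate(self, search_key: str) -> bool:
        """True if the session is too old or the search context changed."""
        expired = self._clock() - self._started_at >= self._max_age
        changed = self._search_key is not None and search_key != self._search_key
        return expired or changed
```

A search key such as `"singapore|2025-01-10|2025-01-12"` keeps pagination and property detail visits on one IP while forcing a rotation when the destination or dates change.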
Request Pacing
Booking.com is sensitive to high request rates. Recommended pacing:
| Action | Delay |
|---|---|
| Between page loads | 5-8 seconds |
| Between form interactions | 1-3 seconds |
| Between search sessions | 15-30 seconds |
| Between property detail visits | 4-7 seconds |
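The pacing table translates directly into a helper that draws a random delay from the relevant range, which also avoids fixed intervals. This is a minimal sketch; the action keys and the function name `pick_delay` are illustrative choices.

```python
import random
from typing import Dict, Optional, Tuple

# Delay ranges in seconds, taken from the pacing table.
DELAY_RANGES: Dict[str, Tuple[float, float]] = {
    "page_load": (5.0, 8.0),
    "form_interaction": (1.0, 3.0),
    "search_session": (15.0, 30.0),
    "property_detail": (4.0, 7.0),
}


def pick_delay(action: str, rng: Optional[random.Random] = None) -> float:
    """Return a random delay in seconds for the given action."""
    low, high = DELAY_RANGES[action]
    source = rng if rng is not None else random
    return source.uniform(low, high)
```

With Playwright, the result can be passed to `page.wait_for_timeout(pick_delay("page_load") * 1000)`, since that method expects milliseconds.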
Handling Booking.com’s Currency and Language
Booking.com automatically sets currency and language based on your IP location. With DataResearchTools mobile proxies:
- A Singapore proxy shows prices in SGD by default.
- A Thai proxy shows prices in THB by default.
- Language is set based on IP but can be overridden via URL parameters or cookie settings.
To compare prices across markets, run the same search from different country proxies and record the currency-specific prices.
Managing Genius and Member Pricing
Booking.com offers discounted “Genius” pricing to frequent users. To capture both public and Genius prices:
- Public pricing: Scrape without logging in. This shows the base rate available to all users.
- Genius pricing: Log into a Genius-qualified account through the proxy to see member discounts.
Note: Maintaining Booking.com accounts for scraping purposes requires careful session management. Each account should be associated with a consistent proxy IP from a single country.
Data Extraction Patterns
Extracting Price Data
Booking.com’s price display includes multiple components:
| Component | Where to Find | Notes |
|---|---|---|
| Original price | Strikethrough text near the final price | Only present when discounted |
| Final price | Prominent price display | Per night or total, varies by UI |
| Taxes and fees | “Includes taxes and fees” or separate line | May be included or additional |
| Genius discount | Tagged with Genius icon | Only visible to logged-in Genius users |
| Mobile discount | “Mobile-only price” tag | Visible with mobile user-agent |
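Once the original and final prices are both captured, the discount depth follows from simple arithmetic: (original − final) / original × 100. A short sketch of that calculation is below; the function name is hypothetical, and a missing strikethrough price is treated as no discount.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def discount_percent(original: Optional[Decimal], final: Decimal) -> Decimal:
    """Percentage discount, rounded to one decimal place."""
    # No strikethrough price, or a final price that is not lower: no discount.
    if original is None or original <= 0 or final >= original:
        return Decimal("0.0")
    percent = (original - final) / original * 100
    return percent.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
```

For an original price of 200 and a final price of 170, the difference of 30 divided by 200 gives a 15.0% discount.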
Extracting Review Data
Reviews are loaded dynamically and may require scrolling or clicking “Show more” buttons:
- Overall score is typically in the property header area.
- Category scores (cleanliness, comfort, location, etc.) are in the review summary section.
- Individual reviews are paginated, usually 10-25 per page.
- Each review includes: score, text, reviewer country, travel type, room type, and date.
Extracting Availability Signals
Booking.com shows several availability indicators that provide valuable market intelligence:
- “Only X rooms left at this price” — indicates high demand.
- “Booked X times in the last 24 hours” — demand signal.
- Sold-out dates in the calendar — occupancy indicator.
- “Limited supply in your area” banner — area-wide demand signal.
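The first two indicators carry numbers that are worth storing as integers rather than raw text. Here is a rough sketch that pulls them out with regular expressions. It assumes the English wording quoted above, so other languages or reworded badges would need their own patterns.

```python
import re
from typing import Dict, Optional

_ROOMS_LEFT = re.compile(r"only\s+(\d+)\s+rooms?\s+left", re.IGNORECASE)
_BOOKED = re.compile(
    r"booked\s+(\d+)\s+times?\s+in\s+the\s+last\s+(\d+)\s+hours?",
    re.IGNORECASE,
)


def parse_signals(text: str) -> Dict[str, Optional[int]]:
    """Extract scarcity and demand counts from badge text, if present."""
    rooms = _ROOMS_LEFT.search(text)
    booked = _BOOKED.search(text)
    return {
        "rooms_left": int(rooms.group(1)) if rooms else None,
        "recent_bookings": int(booked.group(1)) if booked else None,
        "window_hours": int(booked.group(2)) if booked else None,
    }
```

Fields that are not found come back as `None`, which keeps "no badge shown" distinct from a count of zero.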
Scaling Booking.com Scraping
Parallel Scraping by Country
Run separate scraping processes for each target country simultaneously:

    Process 1: SG proxy → Booking.com search → Singapore hotels
    Process 2: TH proxy → Booking.com search → Bangkok hotels
    Process 3: MY proxy → Booking.com search → KL hotels
    Process 4: ID proxy → Booking.com search → Bali hotels

Each process uses its own DataResearchTools proxy endpoint and operates independently.
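A lightweight way to prototype this layout is to run one asyncio task per country inside a single process; note this substitutes concurrent tasks for the separate OS processes described above. In the sketch, `fake_scrape` is a stand-in for a real per-country scraping coroutine, and `run_all` is a hypothetical helper name.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Country code -> destination, mirroring the four jobs listed above.
JOBS: Dict[str, str] = {
    "sg": "Singapore",
    "th": "Bangkok",
    "my": "Kuala Lumpur",
    "id": "Bali",
}


async def run_all(scrape: Callable[[str, str], Awaitable[Any]],
                  jobs: Dict[str, str] = JOBS) -> Dict[str, Any]:
    """Run one scrape per country concurrently and collect results by country."""
    countries = list(jobs)
    # return_exceptions=True keeps one failing country from cancelling the rest.
    results = await asyncio.gather(
        *(scrape(country, jobs[country]) for country in countries),
        return_exceptions=True,
    )
    return dict(zip(countries, results))


async def fake_scrape(country: str, destination: str) -> List[str]:
    """Stand-in for a real scraper; returns a label instead of hotel data."""
    await asyncio.sleep(0)
    return [f"{destination} via {country} proxy"]


results = asyncio.run(run_all(fake_scrape))
```

A failed job shows up in `results` as an exception object under its country key, so the other countries' data is still usable.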
Incremental vs. Full Crawls
- Full crawl: Scrape all properties in a destination, including all room types and details. Run weekly or monthly.
- Price check: Quick scrape of pricing for known properties. Run daily or multiple times per day.
- Availability check: Check date-specific availability for monitored properties. Run daily for upcoming dates.
Data Freshness Strategy
| Data Type | Refresh Frequency | Rationale |
|---|---|---|
| Prices | Every 6-12 hours | Prices change frequently |
| Availability | Daily | Rooms sell out daily |
| Reviews | Weekly | New reviews trickle in |
| Property details | Monthly | Amenities change rarely |
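The refresh table can drive a simple staleness check that decides which records to re-scrape. The sketch below picks one concrete maximum age per data type: 12 hours for prices (the upper end of the range) and 30 days for "monthly". Both are assumptions you may want to tighten.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

MAX_AGE: Dict[str, timedelta] = {
    "prices": timedelta(hours=12),          # table says every 6-12 hours
    "availability": timedelta(days=1),
    "reviews": timedelta(weeks=1),
    "property_details": timedelta(days=30),  # "monthly" approximated as 30 days
}


def is_stale(data_type: str, last_fetched: datetime,
             now: Optional[datetime] = None) -> bool:
    """True if data of this type is due for a refresh."""
    current = now if now is not None else datetime.now(timezone.utc)
    return current - last_fetched >= MAX_AGE[data_type]
```

A scheduler can loop over monitored properties, call `is_stale` for each data type, and queue only the scrapes that are actually due.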
Common Issues and Solutions
Issue: Empty or Partial Results
Cause: Page did not fully render before data extraction attempted.
Solution: Increase wait times. Use wait_for_selector to confirm key elements are present before extracting.
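As a sketch of that solution, the helper below waits for a selector before reading any text. The `[data-testid="property-card"]` selector is an assumption about Booking.com's current markup, and `StubPage` is only a stand-in so the helper can be dry-run without a browser; with a real Playwright page, a missing element raises Playwright's timeout error instead of returning empty data.

```python
import asyncio
from typing import List


async def extract_card_texts(page,
                             selector: str = '[data-testid="property-card"]',
                             timeout_ms: int = 15000) -> List[str]:
    """Wait until at least one matching element exists, then read all texts."""
    await page.wait_for_selector(selector, timeout=timeout_ms)
    return await page.locator(selector).all_inner_texts()


class _StubLocator:
    async def all_inner_texts(self) -> List[str]:
        return ["Example Hotel", "Sample Hostel"]


class StubPage:
    """Stand-in for a Playwright page, used only for an offline dry run."""

    def __init__(self) -> None:
        self.waited_for = None

    async def wait_for_selector(self, selector, timeout=None):
        self.waited_for = selector  # a real page would block here until ready

    def locator(self, selector):
        return _StubLocator()


stub = StubPage()
texts = asyncio.run(extract_card_texts(stub))
```

Waiting on a specific element is generally more reliable than a longer fixed sleep, because it adapts to slow page loads over mobile connections.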
Issue: Different Prices on Repeat Checks
Cause: Booking.com’s dynamic pricing is working as intended — prices genuinely change.
Solution: Record timestamps with every price data point. Multiple checks per day establish the price range, not a single “correct” price.
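A minimal sketch of this approach stores each observation with its timestamp and summarizes the range afterwards. The `PricePoint` type and `price_range` function are illustrative names, and the figures in the usage note are made-up examples.

```python
from datetime import datetime, timezone
from decimal import Decimal
from typing import Dict, List, NamedTuple


class PricePoint(NamedTuple):
    observed_at: datetime
    amount: Decimal


def price_range(points: List[PricePoint]) -> Dict[str, Decimal]:
    """Summarize repeated checks as low, high, and most recent price."""
    if not points:
        raise ValueError("at least one price observation is required")
    amounts = [point.amount for point in points]
    latest = max(points, key=lambda point: point.observed_at)
    return {"low": min(amounts), "high": max(amounts), "latest": latest.amount}
```

Given three checks in one day at 240, 255, and 248, the summary reports a low of 240, a high of 255, and whichever value was observed last as the latest price.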
Issue: CAPTCHA After Many Requests
Cause: Request volume exceeded Booking.com’s per-IP threshold.
Solution: Reduce request rate. Rotate to a new DataResearchTools mobile IP. Ensure delays are randomized, not fixed intervals.
Issue: Redirect to Country-Specific Domain
Cause: Booking.com may redirect based on IP to a country-specific version (e.g., booking.com/sg).
Solution: This is expected behavior with country-specific proxies. It confirms your geo-targeting is working correctly.
Legal and Ethical Considerations
Booking.com’s terms of service prohibit automated data collection. Users should:
- Consult with legal counsel about applicable laws in their jurisdiction.
- Implement respectful scraping rates that do not impact platform performance.
- Avoid collecting personal data from reviews beyond what is publicly displayed.
- Consider Booking.com’s affiliate API as a complementary data source for permitted use cases.
Conclusion
Scraping Booking.com for hotel data is technically demanding due to the platform’s sophisticated anti-bot systems, dynamic JavaScript rendering, and geo-targeted pricing. DataResearchTools mobile proxies address the core challenge by providing trusted mobile carrier IPs that Booking.com’s systems treat as legitimate user traffic.
The combination of mobile proxies with proper browser automation, realistic request pacing, and country-specific geo-targeting enables reliable data collection from the world’s largest accommodation platform. Whether you need price intelligence for a handful of competitor hotels or market-wide data across multiple SEA destinations, the approach outlined in this guide provides a solid foundation.
Start with a small set of properties in a single destination, validate your data extraction against manual checks, and scale your operation as you confirm reliability.
- Mobile Proxies for Travel Data: Airfare, Hotels, and Price Intelligence
- Airfare Price Monitoring with Mobile Proxies: Track Flight Prices in Real Time
- Scraping Expedia for Travel Price Comparison with Proxies
- How to Scrape Airbnb Listings and Prices with Mobile Proxies
- Hotel Price Comparison Automation: Proxy Setup for Travel Aggregators
- How Travel Sites Show Different Prices by Location (and How to Check)
- Amazon Buy Box Monitoring: Proxy Setup for Continuous Tracking
Related Reading
- Airfare Price Monitoring with Mobile Proxies: Track Flight Prices in Real Time
- Airline Ticket Price Tracking: Build a Fare Alert System with Proxies
- How to Access Region-Locked Ticket Sales with Mobile Proxies
- How to Avoid IP Bans on Ticketing Platforms: Proxy Rotation Strategies
- aiohttp + BeautifulSoup: Async Python Scraping
- How to Scrape AliExpress Product Data Without Getting Blocked