Real estate data scraping isn’t just an American pursuit. Property markets in the UK, Australia, Canada, Germany, and dozens of other countries operate through their own listing portals, each with unique data structures, anti-bot defenses, and legal frameworks. If you’re an international investor, a proptech startup expanding globally, or a data provider serving clients across borders, you need a scraping strategy that accounts for the distinct challenges of non-US real estate platforms.
This guide covers how to scrape the world’s major property portals, from Rightmove and Zoopla in the UK to Domain.com.au and realestate.com.au in Australia, including the geo-specific proxy configurations, data normalization techniques, and legal considerations you’ll need for each region.
Major International Real Estate Portals by Region
Before diving into technical setup, it helps to understand the landscape. Unlike the US market, which is dominated by Zillow, Realtor.com, and Redfin, most countries have their own dominant platforms with distinct data models.
United Kingdom
The UK property market revolves around two main portals: Rightmove and Zoopla. Rightmove is the clear market leader with over 80% of UK estate agent listings. Both platforms list properties for sale and rent, with detailed property descriptions, floorplans, Energy Performance Certificate (EPC) ratings, and council tax bands — data fields that don’t exist in US listings.
Australia
Australia’s market is split between Domain.com.au and realestate.com.au (owned by REA Group). A distinctive feature of Australian property data is the prevalence of auction results: auctions are a far more common sale method in Sydney and Melbourne than in US cities, which makes auction clearance rates a closely watched market indicator with no real US equivalent.
Canada
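Canadian real estate data is notoriously fragmented across regional real estate boards, such as TRREB (Toronto, formerly TREB) and GVR (Greater Vancouver), each running its own MLS system with aggressive data access restrictions. Realtor.ca, operated by the national association CREA, is the primary public-facing portal, but data depth varies significantly by province.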
Europe
European markets are served by country-specific portals: ImmobilienScout24 (Germany), SeLoger and Leboncoin (France), Idealista (Spain), Immobiliare.it (Italy), and Funda (Netherlands). Pan-European aggregators like Green-Acres exist but typically offer less data depth than local platforms.
| Country | Primary Portal(s) | Key Data Differences from US | Anti-Bot Difficulty |
|---|---|---|---|
| United Kingdom | Rightmove, Zoopla | EPC ratings, council tax bands, leasehold vs freehold | Moderate-High |
| Australia | Domain.com.au, realestate.com.au | Auction results, land size in sqm, strata fees | High |
| Canada | Realtor.ca | Provincial variations, bilingual listings (Quebec) | High |
| Germany | ImmobilienScout24 | Energy certificate, warm/cold rent, Wohnfläche | Very High |
| France | SeLoger, Leboncoin | DPE energy ratings, Carrez law measurements | Moderate |
| Spain | Idealista | Cadastral reference, community fees, built vs usable area | Moderate |
| Netherlands | Funda | WOZ value (gov valuation), energy labels | High |
| Japan | Suumo, Homes.co.jp | Building age critical, tsubo measurement, floor number | Low-Moderate |
Why Geo-Specific Proxies Are Essential for International Scraping
International real estate portals are designed to serve local users. Unlike e-commerce sites that welcome global traffic, property portals often restrict or modify content based on the visitor’s geographic location. This makes geo-targeted proxies not just helpful but mandatory for effective scraping.
Content Gating by Geography
Several major portals actively block or limit access from non-local IP addresses. Rightmove, for example, may serve CAPTCHAs more aggressively to non-UK traffic. Domain.com.au may limit search result depth for international visitors. ImmobilienScout24 in Germany is particularly aggressive, often blocking non-German IPs entirely during high-traffic periods.
Currency and Localization Issues
Some portals automatically convert prices or change language based on the requester’s IP location. If you’re scraping Australian property prices with a US-based proxy, some sites might display USD-converted prices instead of the native AUD values — introducing conversion rate noise into your dataset. Using Australian proxies ensures you collect native-currency pricing consistently.
For a comprehensive overview of how proxy location affects data access, see our guide on the best proxy server countries and geo-location strategies.
Recommended Proxy Configuration by Region
| Target Region | Proxy Type | Pool Size (per portal) | Rotation Strategy |
|---|---|---|---|
| UK (Rightmove, Zoopla) | UK residential rotating | 50-100 IPs | Rotate per request, 5-10s delay |
| Australia (Domain, REA) | Australian residential or ISP | 30-50 IPs | Rotate per session, 8-15s delay |
| Canada (Realtor.ca) | Canadian residential | 50-100 IPs | Rotate per request, 10-20s delay |
| Germany (ImmobilienScout24) | German residential or mobile | 100+ IPs | Rotate per request, 10-15s delay |
| France (SeLoger) | French residential | 30-50 IPs | Rotate per session, 5-10s delay |
| Spain (Idealista) | Spanish residential | 30-50 IPs | Rotate per request, 8-12s delay |
| Japan (Suumo) | Japanese ISP or residential | 20-30 IPs | Rotate per session, 3-8s delay |
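As a minimal sketch, the table above can be encoded as per-region profiles that a scraper consults before each request. The delay windows and rotation modes come from the table; the region keys, the `proxy.example.com` gateway, and the `-country-xx` username suffix are hypothetical placeholders, since every proxy provider has its own targeting syntax.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionProfile:
    country_code: str   # country the proxy should exit from
    rotate_per: str     # "request" or "session"
    min_delay_s: float  # lower bound of the pause between requests
    max_delay_s: float  # upper bound of the pause between requests


# Delay windows and rotation modes mirror the table; keys are illustrative.
REGION_PROFILES = {
    "uk": RegionProfile("GB", "request", 5, 10),
    "au": RegionProfile("AU", "session", 8, 15),
    "ca": RegionProfile("CA", "request", 10, 20),
    "de": RegionProfile("DE", "request", 10, 15),
    "fr": RegionProfile("FR", "session", 5, 10),
    "es": RegionProfile("ES", "request", 8, 12),
    "jp": RegionProfile("JP", "session", 3, 8),
}


def next_delay(profile, rng=random):
    """Pick a randomized pause inside the region's delay window."""
    return rng.uniform(profile.min_delay_s, profile.max_delay_s)


def proxy_url(profile, username, password,
              host="proxy.example.com", port=8000):
    """Build a proxy URL. The username suffix is an assumed convention,
    not the syntax of any particular provider."""
    country = profile.country_code.lower()
    return f"http://{username}-country-{country}:{password}@{host}:{port}"
```

The resulting URL can be passed to an HTTP client's proxy setting, with a `time.sleep(next_delay(profile))` between requests. Randomizing within the window avoids the fixed cadence that a constant delay would produce.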
Data Normalization Across International Markets
Collecting data from global portals is only half the challenge. Making that data comparable requires careful normalization across several dimensions.
Area Measurements
The US uses square feet; virtually every other country uses square meters. But the complexity goes deeper:
- Japan uses “tsubo” (approximately 3.3 sqm) alongside square meters for land, and “jo” (tatami mat size) for room dimensions
- France uses “Carrez law” area (which excludes spaces under 1.8m ceiling height) — a legally mandated measurement that differs from gross floor area
- Germany distinguishes between Wohnfläche (living area) and Nutzfläche (usable area), calculated according to specific regulations
- Australia commonly reports internal area, external area, and total area separately
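One way to keep these distinctions intact is to store the native value, its unit, and the measurement standard together, and derive square meters on demand. The sketch below assumes hypothetical standard labels such as `"carrez"` or `"wohnflaeche"`; jo is left out because tatami mat sizes vary by region and have no single conversion factor.

```python
from dataclasses import dataclass

# Conversion factors to square meters.
SQM_PER_UNIT = {
    "sqm": 1.0,
    "sqft": 0.09290304,  # 0.3048 m squared
    "tsubo": 400 / 121,  # about 3.306 sqm
}


@dataclass(frozen=True)
class Area:
    value: float   # figure exactly as shown on the portal
    unit: str      # key into SQM_PER_UNIT
    standard: str  # illustrative label, e.g. "carrez", "wohnflaeche", "gross"

    def sqm(self):
        """Convert the native value to square meters (unit conversion only)."""
        return self.value * SQM_PER_UNIT[self.unit]


us_home = Area(1000, "sqft", "gross")
paris_flat = Area(62.5, "sqm", "carrez")
print(round(us_home.sqm(), 2), paris_flat.sqm(), paris_flat.standard)
```

The conversion handles units only. A Carrez figure and a gross figure in the same unit are still not directly comparable, which is why the standard travels with the value.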
Currency Normalization
Simply converting all prices to USD introduces exchange rate volatility. Better approaches include:
- Storing prices in native currency alongside the exchange rate at the time of collection
- Using purchasing power parity (PPP) adjustments for cross-country comparisons
- Normalizing to price-per-square-meter in local currency for within-country analysis
- Applying rolling 30-day average exchange rates rather than spot rates to smooth volatility
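The first and last points can be combined in a small sketch: each observation keeps its native price and the spot rate at collection time, while a separate helper computes a 30-day average for smoothed comparisons. The `PriceObservation` fields and the `daily_rates` dictionary (date mapped to USD per unit of native currency) are assumptions for illustration; the rates themselves would come from whatever currency service you use.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from statistics import mean


@dataclass(frozen=True)
class PriceObservation:
    amount: float         # asking price in native currency
    currency: str         # ISO 4217 code, e.g. "AUD"
    collected_on: date    # day the listing was scraped
    usd_rate_spot: float  # USD per 1 unit of native currency that day


def rolling_rate(daily_rates, as_of, window_days=30):
    """Average the rates in the window_days ending on as_of (inclusive)."""
    start = as_of - timedelta(days=window_days - 1)
    window = [rate for day, rate in daily_rates.items() if start <= day <= as_of]
    if not window:
        raise ValueError("no rates available in the window")
    return mean(window)


def to_usd_smoothed(obs, daily_rates):
    """Convert using the rolling average instead of the spot rate."""
    return obs.amount * rolling_rate(daily_rates, obs.collected_on)
```

Because the native amount and spot rate are never overwritten, any other normalization (PPP, a different window length) can be recomputed later from the same stored rows.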
Property Type Classification
Property type taxonomies vary dramatically across cultures. A UK “terraced house” is roughly equivalent to a US “townhouse” but not exactly. Australian “units” encompass what Americans would split into condos and apartments. A German “Eigentumswohnung” (an individually owned flat) resembles a US condo but sits under a different legal structure, so the match is only approximate. Build a mapping table that translates local property types into a standardized taxonomy for your database.
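A mapping table of this kind can start as a plain dictionary keyed by country and local label. The canonical categories and the handful of entries below are illustrative assumptions, not a complete or authoritative taxonomy.

```python
# (country, local label) -> canonical type. Entries are examples only.
PROPERTY_TYPE_MAP = {
    ("GB", "terraced house"): "attached_house",
    ("GB", "flat"): "apartment",
    ("US", "townhouse"): "attached_house",
    ("US", "condo"): "apartment",
    ("AU", "unit"): "apartment",
    ("DE", "eigentumswohnung"): "apartment",
}


def normalize_property_type(country, local_type):
    """Return the canonical type, or "other" when no mapping exists."""
    key = (country.upper(), local_type.strip().lower())
    return PROPERTY_TYPE_MAP.get(key, "other")


print(normalize_property_type("gb", "Terraced House"))
print(normalize_property_type("JP", "mansion"))
```

Keeping the original local label in the stored record alongside the canonical value lets you refine the mapping later without re-scraping.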
Tenure and Ownership Models
The US primarily operates on a freehold ownership model. But many countries have ownership structures that fundamentally affect property valuation:
- UK leasehold: A large percentage of UK flats are leasehold, meaning the buyer owns the property for a fixed term (often 99-999 years) but not the land. Remaining lease length dramatically affects value.
- Australian strata: Strata-titled properties come with quarterly levies that can run into thousands of dollars — effectively a permanent carrying cost that reduces property value.
- German Erbbaurecht (colloquially Erbpacht): A heritable building right under which you own the building but lease the land. It is common in some cities and significantly impacts pricing.
Scraping Strategies for Major International Portals
Rightmove (UK)
Rightmove structures its listings with well-organized HTML and predictable URL patterns. Search results can be filtered by area, price range, and property type, and are then paginated. Key fields to extract include asking price, property type, number of bedrooms, EPC rating, council tax band, and the listing agent. Rightmove’s anti-bot protections are moderate to high; residential UK proxies with reasonable request delays (5-10 seconds) typically maintain reliable access.
Domain.com.au (Australia)
Domain uses a more complex frontend with significant JavaScript rendering. Headless browser scraping (Puppeteer or Playwright) is typically required. Australian auction data — including guide prices, auction dates, and results — is particularly valuable and unique to this market. Domain’s anti-bot measures are relatively aggressive; Australian residential proxies are strongly recommended.
ImmobilienScout24 (Germany)
ImmobilienScout24 (IS24) is one of the most challenging real estate portals to scrape globally. It employs advanced bot detection including browser fingerprinting, behavioral analysis, and aggressive IP-based blocking. German residential or mobile proxies are essential, and you should expect to use headless browsers with realistic fingerprinting. Request delays of 10-15 seconds between pages are the minimum to avoid detection.
Understanding how platforms use geographic data to modify their responses is covered in detail in our article on geo-based price discrimination and proxies.
Regional Legal Considerations
Data scraping legality varies significantly across jurisdictions. While this is not legal advice, here are the key frameworks to be aware of:
European Union — GDPR
The General Data Protection Regulation can apply to personal data about individuals in the EU even if your company is based outside the EU. Property listings that include agent names, seller names, or identifiable photos may constitute personal data. The legitimate interest basis may apply for market analysis, but you should document your data processing rationale and implement data minimization practices.
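In practice, data minimization can be as simple as dropping personal fields before a record is stored. The sketch below is one possible approach; the field names are hypothetical, and which fields count as personal data in your pipeline is a question for your own legal review.

```python
# Hypothetical field names that would identify a person.
PERSONAL_FIELDS = frozenset(
    {"agent_name", "agent_phone", "agent_email", "seller_name"}
)


def minimize(record, personal_fields=PERSONAL_FIELDS):
    """Return a copy of the record without the personal fields."""
    return {k: v for k, v in record.items() if k not in personal_fields}


raw = {"price": 350000, "currency": "EUR", "agent_name": "A. Example"}
print(minimize(raw))
```

Filtering at ingestion, rather than at query time, means the personal fields never reach storage in the first place.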
United Kingdom — UK GDPR and Database Rights
Post-Brexit, the UK operates under its own version of GDPR. Databases are protected separately: the Copyright, Designs and Patents Act 1988 covers copyright in databases, and the Copyright and Rights in Databases Regulations 1997 provide the sui generis database right. That right protects substantial investment in compiling a database, so it is potentially relevant if you extract all or a substantial part of a portal’s listings. Focus on extracting factual property data rather than reproducing the portal’s full database structure.
Australia — Competition and Consumer Act
Australia has no specific data scraping law, but the Copyright Act 1968, the Competition and Consumer Act, and each portal’s terms of use still apply. Australian copyright law offers narrow fair dealing exceptions rather than a broad US-style fair use defense, so wholesale reproduction of listing content carries more risk than selective extraction of facts. Australian courts are generally regarded as more conservative than US courts on data scraping issues.
| Jurisdiction | Key Legal Framework | Risk Level for Property Data Scraping | Key Consideration |
|---|---|---|---|
| United States | CFAA, state laws | Lower (post hiQ v. LinkedIn) | Respect Terms of Service, no authentication bypass |
| European Union | GDPR, Database Directive | Moderate | Personal data handling, database right protections |
| United Kingdom | UK GDPR, CDPA, Database Regulations 1997 | Moderate | Database right applies to substantial data extraction |
| Australia | Copyright Act, CCA | Moderate-High | Conservative courts, fair dealing rather than broad fair use |
| Canada | PIPEDA, Copyright Act | Moderate | Evolving case law on scraping scope |
| Germany | GDPR, UrhG, UWG | Higher | Strong database protection and unfair competition laws |
Building a Multi-Country Scraping Pipeline
Architecture Recommendations
For scraping across multiple international portals, structure your pipeline with these components:
- Per-country scraper modules: Each portal needs its own parser tuned to its specific HTML structure, data fields, and pagination patterns
- Geo-specific proxy pools: Maintain separate proxy pools for each target country, managed through a centralized proxy rotation service
- Unified data schema: Define a canonical property data model that accommodates all country-specific fields while maintaining a common core (location, price, area, property type)
- Currency service: Integrate a currency conversion service that records exchange rates at collection time and supports multiple normalization methods
- Monitoring per region: Track success rates, block rates, and data quality metrics separately for each country and portal
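Two of these components, the unified schema and per-region monitoring, can be sketched briefly. The common core fields follow the list above; the `extras` dictionary for country-specific data, the outcome labels, and the class names are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CanonicalListing:
    portal: str
    country: str
    location: str
    price: float             # native currency amount
    currency: str
    area_sqm: Optional[float]
    property_type: str       # value from your standardized taxonomy
    extras: dict = field(default_factory=dict)  # e.g. EPC rating, strata fees


class RegionMonitor:
    """Counts request outcomes separately for each country and portal."""

    def __init__(self):
        self._counts = Counter()

    def record(self, country, portal, outcome):
        # outcome is one of "ok", "blocked", "error" (labels are illustrative)
        self._counts[(country, portal, outcome)] += 1

    def block_rate(self, country, portal):
        total = sum(
            n for (c, p, _), n in self._counts.items()
            if (c, p) == (country, portal)
        )
        if total == 0:
            return 0.0
        return self._counts[(country, portal, "blocked")] / total
```

Putting country-specific fields in `extras` keeps the core schema stable as new portals are added, at the cost of weaker validation on those fields.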
Scheduling Across Time Zones
Real estate portals are busiest during local business hours and evenings. Schedule your scraping to run during each portal’s off-peak hours — typically 2:00 AM to 6:00 AM local time. This means your scraping jobs for UK, Australian, and US portals will run at different UTC times, which actually works in your favor by distributing your infrastructure load throughout the day.
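The conversion from a local off-peak window to UTC can be sketched with the standard library's `zoneinfo`. The portal-to-timezone mapping is an assumption: a single zone is a simplification for countries spanning several, and `America/New_York` stands in for US portals purely for illustration.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

# Hypothetical mapping; pick the zone where most of a portal's users are.
PORTAL_TIMEZONES = {
    "rightmove": "Europe/London",
    "domain": "Australia/Sydney",
    "us_portals": "America/New_York",
}


def off_peak_window_utc(tz_name, local_day, start_hour=2, end_hour=6):
    """Return (start, end) of the local off-peak window as UTC datetimes."""
    tz = ZoneInfo(tz_name)
    start = datetime.combine(local_day, time(start_hour), tzinfo=tz)
    end = datetime.combine(local_day, time(end_hour), tzinfo=tz)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


for portal, tz_name in PORTAL_TIMEZONES.items():
    start, end = off_peak_window_utc(tz_name, date(2024, 1, 15))
    print(portal, start.isoformat(), end.isoformat())
```

Computing the window per day rather than hard-coding a UTC offset keeps the schedule correct across daylight saving changes, although the 2:00 AM boundary itself can be skipped or repeated on the transition days.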
Frequently Asked Questions
Do I need proxies from the same country as the real estate portal I’m scraping?
In most cases, yes. Major international portals like Rightmove (UK), Domain.com.au (Australia), and ImmobilienScout24 (Germany) all exhibit some form of geographic content filtering. Using proxies from the portal’s home country ensures you see the same content as local users, receive prices in the correct currency, and encounter less aggressive anti-bot challenges. Some portals will outright block or heavily CAPTCHA non-local IPs.
How do I handle non-English property listings?
For non-English portals, you have two options: scrape in the native language and translate afterward, or use the portal’s English language version if available. Scraping in the native language is generally preferred because English versions often contain fewer listings and less detail. For translation, machine translation APIs work well for structured property data (addresses, features) though listing descriptions may require more sophisticated NLP processing. Store the original language text alongside any translations for data integrity.
Which international markets are easiest to scrape?
Japanese portals (Suumo, Homes.co.jp) tend to have the lightest anti-bot protections, followed by Spanish (Idealista) and French (SeLoger) platforms. UK portals (Rightmove, Zoopla) fall in the moderate-to-high range. Australian (Domain, realestate.com.au), Canadian (Realtor.ca), Dutch (Funda), and German (ImmobilienScout24) portals are among the most challenging, with sophisticated bot detection and aggressive blocking of non-residential IPs.
How do I normalize property sizes across countries that use different measurement systems?
Store all area measurements in their native format alongside a converted value. Use square meters as your standard unit (1 square foot equals approximately 0.0929 square meters). However, be aware that different countries measure property area differently: French Carrez measurements exclude spaces under 1.8m ceiling height, German Wohnfläche follows specific calculation standards, and the Japanese tsubo (about 3.3 sqm) derives from traditional tatami mat sizing. Document which measurement standard each source uses so downstream analysis can account for these differences.
Can I scrape real estate portals protected by GDPR?
GDPR does not outright prohibit web scraping. It regulates how personal data is processed. Property listings that contain only property characteristics and pricing are generally not personal data. However, listings that include agent or seller names, photos of identifiable individuals, or other personal identifiers require you to have a lawful basis for processing. The “legitimate interest” basis may apply for market analysis, but you must conduct a legitimate interest assessment, implement data minimization, and be prepared to honor data subject access requests. Consulting with a privacy attorney familiar with your target jurisdictions is strongly recommended.