Government Land Registry and Property Records Scraping Guide

Land registry and property records contain some of the most commercially valuable government data available. Ownership information, transaction histories, property valuations, zoning designations, and encumbrance records drive decisions in real estate investment, due diligence, financial lending, urban planning, and market research.

Across Southeast Asia, governments are progressively digitizing land records, creating opportunities for automated data collection that was impossible a decade ago. This guide explains how to scrape land registry and property records using proxy infrastructure.

Commercial Value of Property Records Data

Real Estate Investment

Property records enable investors to:

  • Identify ownership patterns and large portfolio holders
  • Track transaction volumes and price trends by area
  • Discover off-market opportunities through ownership research
  • Assess property histories including liens, mortgages, and disputes
  • Monitor land banking activities and development speculation

Due Diligence

Property records are essential for:

  • Verifying property ownership before transactions
  • Checking for encumbrances, liens, and mortgages
  • Confirming property boundaries and dimensions
  • Reviewing zoning and permitted use designations
  • Identifying potential environmental or legal issues

Financial Services

Banks and financial institutions use property records for:

  • Mortgage underwriting and property valuation
  • Collateral verification for secured lending
  • Portfolio risk assessment based on property market data
  • Anti-money laundering checks on property transactions

Market Research

Property data supports:

  • Real estate market analysis and trend reporting
  • Urban development pattern research
  • Property tax assessment analysis
  • Housing affordability studies

Land Registry Systems in ASEAN

Singapore

Singapore Land Authority (SLA), URA, and HDB

  • INLIS (Integrated Land Information Service): Online title search service
  • SLA OneMap: Geographic property data, land parcels, planning zones
  • URA Space: Development charge data, planning parameters
  • HDB Resale Price Index: Public housing transaction data

Singapore has one of the most advanced land registry systems in the world. INLIS provides comprehensive title information, though detailed searches require payment. Public datasets on HDB transactions and URA planning data are freely available.

Indonesia

BPN (Badan Pertanahan Nasional)

  • ATR/BPN (Kementerian Agraria dan Tata Ruang/Badan Pertanahan Nasional): Ministry of Agrarian Affairs and Spatial Planning and National Land Agency
  • KiosK BPN: Land certificate information service
  • Sentuh Tanahku App: Digital land information access

Indonesia’s land registry system is complex due to multiple title types (Hak Milik, Hak Guna Bangunan, Hak Pakai) and ongoing digitization efforts. Many records are still paper-based, though digital access is expanding.

Philippines

Land Registration Authority (LRA)

  • eLRA: Electronic land registration system
  • Register of Deeds: Title registration offices
  • BIR eCAR: Electronic Certificate Authorizing Registration

The Philippines registers land under the Torrens title system and is moving those records from paper to digital registration. Some records are available online, but many still require in-person searches.

Thailand

Department of Lands

  • Land title deed (Chanote): Highest form of land ownership document
  • Online verification services for some land data
  • Provincial land offices maintain records

Malaysia

Department of Land and Mines (JKPTG) / State Land Offices

  • eTanah: Electronic land administration system
  • MyDaftarHarta: Property registration portal
  • JPPH (Valuation and Property Services): Transaction data and indices

Vietnam

MONRE / Provincial Land Registration Offices

  • Land Use Rights Certificates (LURC): The primary ownership document
  • National Land Database: Under development for comprehensive digital access

Technical Challenges

Authentication Requirements

Most land registry systems require user authentication:

  • Paid subscription services (Singapore INLIS)
  • Government-issued digital IDs
  • Registered user accounts with identity verification
  • Per-query fees for detailed information

Anti-Scraping Measures

Land registries implement strong protections:

  • CAPTCHAs on search interfaces
  • Rate limiting per account and IP
  • Session management with short timeouts
  • Query volume monitoring and restriction
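
When a registry signals rate limiting (for example with an HTTP 429 response), a common response is to wait progressively longer between retries rather than to push harder. The following is a minimal sketch of exponential backoff with jitter; the function name `backoff_delays` and all default values are assumptions chosen for illustration, not settings recommended by any registry.

```python
import random


def backoff_delays(base=2.0, factor=2.0, max_delay=120.0, attempts=5,
                   jitter=random.random):
    """Return a list of wait times (seconds) for successive retries.

    Each delay grows by `factor`, is capped at `max_delay`, and is scaled
    to between 50% and 100% of its ceiling so that parallel workers do not
    retry in lockstep. `jitter` must return a float in [0, 1].
    """
    delays = []
    for attempt in range(attempts):
        ceiling = min(max_delay, base * factor ** attempt)
        delays.append(ceiling * (0.5 + 0.5 * jitter()))
    return delays
```

A caller would sleep for each returned delay in turn before retrying, and give up once the list is exhausted.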

Data Sensitivity

Property records contain personal information (owner names, addresses). Handle this data with appropriate care regarding privacy regulations.
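
One way to apply that care is to replace owner names with a keyed hash before records are stored, so that analysis can still group by owner without keeping the name itself. The sketch below is illustrative only and is not a compliance guarantee; the helper names and the `owner_name`, `owner_address`, and `owner_key` field names are assumptions.

```python
import hashlib
import hmac


def pseudonymize_owner(name: str, secret_key: bytes) -> str:
    """Return a stable keyed SHA-256 hash of an owner name."""
    # Collapse whitespace and ignore case so spelling variants hash the same
    normalized = " ".join(name.split()).casefold()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def strip_personal_fields(record: dict, secret_key: bytes) -> dict:
    """Copy a record, dropping direct identifiers and adding a pseudonym."""
    personal = ("owner_name", "owner_address")  # assumed field names
    cleaned = {k: v for k, v in record.items() if k not in personal}
    if record.get("owner_name"):
        cleaned["owner_key"] = pseudonymize_owner(record["owner_name"], secret_key)
    return cleaned
```

The secret key should be stored separately from the data; anyone holding both can test whether a given name appears in the dataset.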

Complex Search Interfaces

Property searches often require specific identifiers:

  • Title numbers or certificate references
  • Lot and plan numbers
  • Cadastral references
  • Postal codes or geographic identifiers

Proxy Strategy for Property Records

Why Proxies Are Needed

Even with authenticated access, proxy infrastructure helps by:

  • Distributing search volumes across multiple IPs to avoid rate limits
  • Providing geographic authenticity for local land registry access
  • Enabling concurrent searches across multiple registries
  • Maintaining session stability for complex multi-step searches

DataResearchTools provides mobile proxies across all ASEAN countries with native carrier IPs that land registry websites recognize as legitimate local traffic.

Country-Specific Proxy Configuration

class PropertyProxyManager:
    """Manage proxies for property record scraping."""

    def __init__(self, proxy_manager):
        self.proxy_manager = proxy_manager

    REGISTRY_CONFIGS = {
        'SG': {
            'session_type': 'sticky',
            'session_duration': 600,
            'delay_range': (3, 7),
            'max_concurrent': 3
        },
        'ID': {
            'session_type': 'sticky',
            'session_duration': 300,
            'delay_range': (5, 10),
            'max_concurrent': 2
        },
        'PH': {
            'session_type': 'sticky',
            'session_duration': 300,
            'delay_range': (4, 8),
            'max_concurrent': 2
        },
        'TH': {
            'session_type': 'rotating',
            'delay_range': (5, 10),
            'max_concurrent': 2
        },
        'MY': {
            'session_type': 'sticky',
            'session_duration': 600,
            'delay_range': (3, 6),
            'max_concurrent': 3
        },
        'VN': {
            'session_type': 'rotating',
            'delay_range': (5, 12),
            'max_concurrent': 1
        }
    }

    def get_config(self, country):
        return self.REGISTRY_CONFIGS.get(country, self.REGISTRY_CONFIGS['SG'])

    def get_proxy(self, country, session_id=None):
        config = self.get_config(country)
        if config['session_type'] == 'sticky' and session_id:
            return self.proxy_manager.get_sticky_proxy(
                country=country,
                session_id=session_id,
                duration=config['session_duration']
            )
        return self.proxy_manager.get_proxy_for_country(country)
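
The `delay_range` values in the configuration above are only useful if something enforces them between requests. A minimal sketch of such a helper follows; `polite_delay` is a hypothetical name, and the injectable `sleep` and `rng` parameters exist only to make the function easy to test.

```python
import random
import time


def polite_delay(delay_range, sleep=time.sleep, rng=random.uniform):
    """Sleep for a random interval within delay_range.

    delay_range is a (min_seconds, max_seconds) tuple such as the
    'delay_range' entries in REGISTRY_CONFIGS. Returns the seconds waited.
    """
    low, high = delay_range
    seconds = rng(low, high)
    sleep(seconds)
    return seconds
```

In a scraping loop this might be called as `polite_delay(manager.get_config('SG')['delay_range'])` after each request, where `manager` is a `PropertyProxyManager` instance.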

Building Property Record Scrapers

Singapore Public Property Data

Singapore provides several publicly accessible property datasets:

import requests


class SingaporePropertyScraper:
    """Scrape publicly available Singapore property data."""

    def __init__(self, proxy_manager):
        self.proxy_manager = proxy_manager

    def fetch_hdb_resale_transactions(self):
        """Fetch HDB resale transaction data from data.gov.sg."""
        proxy = self.proxy_manager.get_proxy_for_country('SG')

        response = requests.get(
            "https://data.gov.sg/api/action/datastore_search",
            params={
                'resource_id': 'f1765b54-a209-4718-8d38-a39237f502b3',
                'limit': 1000,
                'sort': 'month desc'
            },
            proxies=proxy,
            timeout=30
        )

        if response.status_code == 200:
            return response.json().get('result', {}).get('records', [])
        return []

    def fetch_private_property_transactions(self):
        """Fetch private property transaction data from URA."""
        proxy = self.proxy_manager.get_proxy_for_country('SG')
        session = requests.Session()
        session.proxies = proxy

        response = session.get(
            "https://www.ura.gov.sg/realEstateIIWeb/transaction/search.action",
            headers={
                'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)',
                'Accept-Language': 'en-SG,en;q=0.9'
            },
            timeout=30
        )

        # _parse_ura_transactions is a placeholder for an HTML parsing helper not shown here
        return self._parse_ura_transactions(response.text)

    def fetch_planning_data(self, area=None):
        """Fetch planning and zoning data from URA."""
        proxy = self.proxy_manager.get_proxy_for_country('SG')

        params = {}
        if area:
            params['planningArea'] = area

        response = requests.get(
            "https://www.onemap.gov.sg/api/public/themesvc/retrieveTheme",
            params={**params, 'queryName': 'Master_Plan_Land_Use'},
            proxies=proxy,
            timeout=30
        )

        return response.json() if response.status_code == 200 else None
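
A single `datastore_search` call returns at most `limit` records, so a full dataset needs several requests. CKAN-style DataStore APIs accept an `offset` parameter alongside `limit`; the sketch below assumes the endpoint used above follows that convention. The `fetch_page` callable is a hypothetical wrapper that the caller supplies around the actual HTTP request.

```python
def fetch_all_records(fetch_page, page_size=1000, max_pages=100):
    """Collect records across pages using limit/offset paging.

    fetch_page(limit, offset) must return the parsed JSON body of one
    datastore_search response. Paging stops at the first short page or
    after max_pages requests, whichever comes first.
    """
    records = []
    for page in range(max_pages):
        body = fetch_page(page_size, page * page_size)
        batch = body.get('result', {}).get('records', [])
        records.extend(batch)
        if len(batch) < page_size:
            break
    return records
```

Passing the request in as a function keeps the paging logic separate from proxy selection and delays, which can then live inside `fetch_page`.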

Malaysia Property Transaction Data

class MalaysiaPropertyScraper:
    """Scrape Malaysian property transaction data."""

    def __init__(self, proxy_manager):
        self.proxy_manager = proxy_manager

    def fetch_jpph_data(self, state=None, year=None):
        """Fetch property transaction data from JPPH (Valuation Department)."""
        proxy = self.proxy_manager.get_proxy_for_country('MY')
        session = requests.Session()
        session.proxies = proxy

        response = session.get(
            "https://napic.jpph.gov.my/portal",
            params={
                'state': state,
                'year': year
            },
            headers={
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
                'Accept-Language': 'ms-MY,ms;q=0.9,en;q=0.8'
            },
            timeout=30
        )

        # _parse_jpph_data is a placeholder for an HTML parsing helper not shown here
        return self._parse_jpph_data(response.text)

    def fetch_property_index(self):
        """Fetch Malaysian House Price Index data."""
        proxy = self.proxy_manager.get_proxy_for_country('MY')

        response = requests.get(
            "https://napic.jpph.gov.my/portal/web/guest/main",
            proxies=proxy,
            timeout=30
        )

        return self._parse_index_data(response.text)

Indonesia Land Data

class IndonesiaLandScraper:
    """Scrape Indonesian land and property data."""

    def __init__(self, proxy_manager):
        self.proxy_manager = proxy_manager

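    def search_land_certificates(self, province, district):
        """Search for land certificate information.

        Note: this sketch only loads the portal home page; province and
        district are not yet passed to any search endpoint.
        """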
        proxy = self.proxy_manager.get_proxy_for_country('ID')
        session = requests.Session()
        session.proxies = proxy

        response = session.get(
            "https://www.atrbpn.go.id/",
            headers={
                'User-Agent': 'Mozilla/5.0 (Linux; Android 13)',
                'Accept-Language': 'id-ID,id;q=0.9'
            },
            timeout=30
        )

        return self._parse_land_data(response.text)

    def fetch_land_value_zones(self, city):
        """Fetch NJOP (land value assessment) zone data."""
        proxy = self.proxy_manager.get_proxy_for_country('ID')
        session = requests.Session()
        session.proxies = proxy

        # NJOP data is typically published by local tax offices.
        # The URL pattern below is illustrative; actual hostnames vary by city.
        response = session.get(
            f"https://bapenda.{city}.go.id/njop",
            timeout=30
        )

        return self._parse_njop_data(response.text)
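
The `_parse_*` helpers called by the scrapers above are not defined in this guide, and their real form depends on each portal's markup. As a rough starting point, the sketch below extracts rows from HTML tables using only the standard library; `TableRowParser` and `parse_table_rows` are hypothetical names, and a real portal may need a more tolerant parser.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_table_rows(html):
    """Return a list of rows, each a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

The first returned row is often the header, which can be zipped with later rows to build dictionaries for normalization.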

Data Normalization

Unified Property Record Schema

from datetime import datetime


class PropertyRecordNormalizer:
    """Normalize property records across ASEAN countries."""

    def normalize(self, raw_record, country):
        """Normalize a property record into unified format."""
        return {
            'record_id': self._generate_id(raw_record, country),
            'country': country,
            'property': {
                'type': self._normalize_type(raw_record.get('type', ''), country),
                'address': raw_record.get('address', ''),
                'area_sqm': self._convert_area(
                    raw_record.get('area'), raw_record.get('area_unit', 'sqm')
                ),
                'land_area_sqm': self._convert_area(
                    raw_record.get('land_area'), raw_record.get('area_unit', 'sqm')
                ),
                'zoning': raw_record.get('zoning', ''),
                'title_type': self._normalize_title_type(
                    raw_record.get('title_type', ''), country
                )
            },
            'transaction': {
                'type': raw_record.get('transaction_type', ''),
                'date': raw_record.get('transaction_date'),
                'price_local': raw_record.get('price'),
                'currency': self._get_currency(country),
                'price_usd': self._to_usd(
                    raw_record.get('price', 0), country
                ),
                'price_per_sqm': self._calc_psm(raw_record)
            },
            'ownership': {
                'owner_type': raw_record.get('owner_type', ''),
                'tenure': raw_record.get('tenure', '')
            },
            'location': {
                'latitude': raw_record.get('lat'),
                'longitude': raw_record.get('lng'),
                'district': raw_record.get('district', ''),
                'city': raw_record.get('city', ''),
                'state_province': raw_record.get('state', '')
            },
            'source': raw_record.get('source', ''),
            'scraped_at': datetime.utcnow().isoformat()
        }

    def _normalize_title_type(self, title_type, country):
        """Normalize title types across countries."""
        title_map = {
            'ID': {
                'hak milik': 'freehold',
                'hak guna bangunan': 'building_right',
                'hak guna usaha': 'cultivation_right',
                'hak pakai': 'use_right'
            },
            'SG': {
                'freehold': 'freehold',
                '999-year': 'long_lease',
                '99-year': 'leasehold'
            },
            'MY': {
                'freehold': 'freehold',
                'leasehold': 'leasehold'
            }
        }

        country_map = title_map.get(country, {})
        return country_map.get(title_type.lower(), title_type)
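
The normalizer above relies on helpers such as `_convert_area`, `_get_currency`, and `_calc_psm` that are not shown. The sketch below gives standalone versions under stated assumptions: the unit keys (`'sqft'`, `'rai'`, `'sq_wah'`, and so on) are hypothetical labels a scraper would have to assign, and `calc_psm` takes a price and an area rather than a raw record.

```python
# Square metres per unit. Thai units: 1 rai = 4 ngan = 400 square wah.
SQM_PER_UNIT = {
    'sqm': 1.0,
    'sqft': 0.09290304,
    'hectare': 10000.0,
    'acre': 4046.8564224,
    'rai': 1600.0,
    'ngan': 400.0,
    'sq_wah': 4.0,
}

# ISO 4217 currency codes for the countries covered in this guide
CURRENCY_BY_COUNTRY = {
    'SG': 'SGD', 'ID': 'IDR', 'PH': 'PHP',
    'TH': 'THB', 'MY': 'MYR', 'VN': 'VND',
}


def convert_area(value, unit='sqm'):
    """Convert an area to square metres; None passes through unchanged."""
    if value is None:
        return None
    factor = SQM_PER_UNIT.get(unit)
    if factor is None:
        raise ValueError(f"Unknown area unit: {unit!r}")
    return float(value) * factor


def calc_psm(price, area_sqm):
    """Price per square metre, or None when either input is missing or zero."""
    if not price or not area_sqm:
        return None
    return price / area_sqm
```

Raising on an unknown unit, rather than silently assuming square metres, makes mislabeled source data visible early.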

Analysis and Intelligence

Market Analysis

import statistics


class PropertyMarketAnalyzer:
    """Analyze property market data from collected records."""

    def price_trend_analysis(self, db, country, city, property_type, months=24):
        """Analyze price trends for a specific market segment."""
        transactions = db.get_transactions(
            country=country, city=city,
            property_type=property_type, months=months
        )

        monthly_data = {}
        for tx in transactions:
            # Skip transactions without a usable price per square metre
            if not tx.get('price_per_sqm'):
                continue
            month = tx['transaction_date'][:7]  # 'YYYY-MM'
            monthly_data.setdefault(month, []).append(tx['price_per_sqm'])

        trend = []
        for month in sorted(monthly_data):
            prices = monthly_data[month]
            trend.append({
                'month': month,
                'median_psm': statistics.median(prices),
                'avg_psm': sum(prices) / len(prices),
                'volume': len(prices),
                # Sum of per-sqm prices, not of transaction prices
                'total_value': sum(prices)
            })

        return trend

    def hotspot_analysis(self, db, country, property_type, months=6):
        """Identify property market hotspots by transaction volume and growth."""
        transactions = db.get_transactions(
            country=country, property_type=property_type, months=months
        )

        district_data = {}
        for tx in transactions:
            district = tx.get('district', 'Unknown')
            if district not in district_data:
                district_data[district] = {
                    'transactions': 0,
                    'total_value': 0,
                    'prices_psm': []
                }
            district_data[district]['transactions'] += 1
            # Treat a missing or None price as zero
            district_data[district]['total_value'] += tx.get('price_local') or 0
            if tx.get('price_per_sqm'):
                district_data[district]['prices_psm'].append(tx['price_per_sqm'])

        hotspots = []
        for district, data in district_data.items():
            hotspots.append({
                'district': district,
                'transactions': data['transactions'],
                'total_value': data['total_value'],
                'avg_psm': sum(data['prices_psm']) / len(data['prices_psm']) if data['prices_psm'] else 0
            })

        return sorted(hotspots, key=lambda x: x['transactions'], reverse=True)

    def cross_country_comparison(self, db, property_type, city_pairs):
        """Compare property markets across ASEAN cities."""
        comparison = []
        for country, city in city_pairs:
            trend = self.price_trend_analysis(db, country, city, property_type, 12)
            if trend:
                latest = trend[-1]
                comparison.append({
                    'country': country,
                    'city': city,
                    'median_psm_usd': self._to_usd(latest['median_psm'], country),
                    'volume': latest['volume'],
                    'yoy_change': self._calc_yoy(trend)
                })

        return sorted(comparison, key=lambda x: x['median_psm_usd'])
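
`cross_country_comparison` calls a `_calc_yoy` helper that is not defined above. One possible reading, sketched here as a standalone function, compares the latest month with the same calendar month one year earlier. The function name, the `field` parameter, and the choice to return a fraction rather than a percentage are assumptions.

```python
def calc_yoy(trend, field='median_psm'):
    """Fractional change of the latest month vs the same month a year earlier.

    trend is a list of dicts with a 'month' key in 'YYYY-MM' form, sorted
    ascending, as produced by price_trend_analysis. Returns None when the
    prior-year month is absent or its value is zero.
    """
    if not trend:
        return None
    by_month = {row['month']: row[field] for row in trend}
    latest_month = trend[-1]['month']
    year, month = latest_month.split('-')
    prior_month = f"{int(year) - 1:04d}-{month}"
    prior = by_month.get(prior_month)
    if not prior:
        return None
    return (by_month[latest_month] - prior) / prior
```

Note that a 12-month window may not contain the prior-year month at all, so a caller that wants a year-on-year figure would likely need to request at least 13 months of data.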

Ownership Pattern Analysis

def analyze_ownership_patterns(db, area, country):
    """Analyze property ownership patterns in an area."""
    records = db.get_ownership_records(area=area, country=country)

    analysis = {
        'total_properties': len(records),
        'by_owner_type': {},
        'by_tenure': {},
        'concentration': [],
        'foreign_ownership': 0  # placeholder; not computed in this function
    }

    owner_counts = {}
    for record in records:
        owner = record.get('owner_name', 'Unknown')
        owner_counts[owner] = owner_counts.get(owner, 0) + 1

        owner_type = record.get('owner_type', 'individual')
        analysis['by_owner_type'][owner_type] = \
            analysis['by_owner_type'].get(owner_type, 0) + 1

        tenure = record.get('tenure', 'unknown')
        analysis['by_tenure'][tenure] = \
            analysis['by_tenure'].get(tenure, 0) + 1

    # Top property holders
    analysis['concentration'] = sorted(
        [{'owner': k, 'properties': v} for k, v in owner_counts.items()],
        key=lambda x: x['properties'],
        reverse=True
    )[:20]

    return analysis
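
The `concentration` list produced above can be reduced to a single figure, such as the share of properties held by the largest owners. The helper below is a minimal sketch; `top_holder_share` is a hypothetical name, and it assumes the list is already sorted in descending order as in `analyze_ownership_patterns`.

```python
def top_holder_share(concentration, total_properties, top_n=5):
    """Share of all properties held by the top_n owners (0.0 to 1.0).

    concentration is a list of {'owner': ..., 'properties': ...} dicts
    sorted by 'properties' in descending order.
    """
    if total_properties <= 0:
        return 0.0
    held = sum(entry['properties'] for entry in concentration[:top_n])
    return held / total_properties
```

Records with a missing owner name are grouped under 'Unknown' by the function above, so that bucket may need to be filtered out before computing the share.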

Legal and Ethical Considerations

Public vs. Restricted Data

Property records exist on a spectrum of accessibility:

  • Fully public: Transaction statistics, price indices, zoning data
  • Semi-public: Basic title information, registered owner names
  • Restricted: Detailed personal information, financial details of transactions
  • Confidential: National security-related properties, protected person data

Always respect the boundaries between public and restricted data.

Privacy Regulations

Property records contain personal data. Comply with:

  • Singapore PDPA: Personal data protection requirements
  • Indonesia UU PDP: Data protection law
  • Philippines DPA: Data Privacy Act
  • Malaysia PDPA: Personal Data Protection Act

Responsible Use

  • Collect only data necessary for your legitimate purpose
  • Implement access controls on stored property data
  • Do not create comprehensive individual profiles from property records
  • Follow each jurisdiction’s property data access rules

DataResearchTools for Property Data

DataResearchTools provides the proxy infrastructure for property record collection:

  • ASEAN-wide mobile proxies for accessing land registry and property portals across the region
  • Sticky sessions for navigating complex property search interfaces
  • Native carrier IPs that land registry websites trust as legitimate local access
  • High reliability for consistent property data monitoring
  • Reasonable request distribution that avoids overwhelming government property databases

Our proxy infrastructure is designed for the specific requirements of government website access, including the sensitive nature of land registry portals.

Getting Started

Phase 1: Public Data Collection

Start with freely available public property datasets:

  • Singapore HDB resale transactions and URA data
  • Malaysia JPPH property index
  • Publicly available transaction statistics

Phase 2: Expanded Monitoring

Add title search capabilities, zoning data monitoring, and transaction alert systems.

Phase 3: Market Intelligence

Build market analysis tools, ownership pattern detection, and cross-country comparison capabilities.

Conclusion

Land registry and property records across Southeast Asia are increasingly accessible through digital platforms. Building automated collection capabilities with DataResearchTools’ proxy infrastructure enables organizations to transform scattered government property data into comprehensive market intelligence.

Whether you are an investor seeking market opportunities, a financial institution verifying collateral, or a researcher analyzing housing markets, systematic property data collection is the foundation of effective analysis. Start with publicly available data in your primary markets, build robust collection pipelines, and expand to create an ASEAN-wide property intelligence capability that drives better real estate decisions.

