How to Scrape JobsDB Hong Kong + Thailand Job Listings (2026)

JobsDB is one of the dominant job boards across Hong Kong and Thailand, and scraping it in 2026 means dealing with Cloudflare, dynamic React rendering, and aggressive rate limiting that has tightened significantly over the past year. if you’re building a salary benchmarking tool, a recruitment pipeline, or a regional labor market dataset, this guide covers exactly what you need to get clean data at scale.

what JobsDB’s architecture looks like in 2026

JobsDB (owned by SEEK Group) serves both hk.jobsdb.com and th.jobsdb.com from the same underlying platform. both endpoints run a React SPA backed by a private REST API at https://xapi.supercharge-srp.co. the search results page renders server-side HTML for the first paint (good for basic crawling), but job details, pagination, and filters are loaded via XHR calls to that xapi endpoint.

Cloudflare is active on both domains with bot score challenges. a plain requests.get() will return a 403 or a JS challenge page within a few requests. you need either a headless browser or a residential proxy with proper TLS fingerprinting. this is the same class of protection you’ll find on Seek.com.au Australian job listings, which makes sense given they’re the same parent company.

two realistic approaches: HTML parsing vs. the private API

approach 1: scrape the rendered HTML

the search results pages at https://hk.jobsdb.com/jobs-in-hong-kong and https://th.jobsdb.com/jobs-in-thailand do ship structured HTML for the listing cards. each card contains title, company, location, salary (when disclosed), and a job URL. use Playwright to get past Cloudflare. the example below only disables the AutomationControlled flag, so add a stealth plugin separately if you need one:

import asyncio

from playwright.async_api import async_playwright


async def fetch_jobsdb_page(url: str, proxy: str) -> str:
    """Load a JobsDB page through a proxy and return the rendered HTML."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            proxy={"server": proxy},
            # hide the Blink automation hint that bot checks look for
            args=["--disable-blink-features=AutomationControlled"],
        )
        try:
            context = await browser.new_context(
                user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                           "AppleWebKit/537.36 (KHTML, like Gecko) "
                           "Chrome/124.0.0.0 Safari/537.36",
                locale="en-HK",
            )
            page = await context.new_page()
            # networkidle waits until the XHR-driven content has stopped loading
            await page.goto(url, wait_until="networkidle", timeout=30000)
            return await page.content()
        finally:
            # always release the browser, even if navigation times out
            await browser.close()

parse the returned HTML with BeautifulSoup. job cards are the elements matching the [data-automation="jobListing"] selector, which has been stable since 2024.
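the card extraction can be sketched without any dependencies. this version swaps BeautifulSoup for the standard library html.parser so it runs anywhere. only the data-automation="jobListing" attribute comes from the text above. the inner markup of a card is an assumption, so the sketch just collects each card's first link and its visible text instead of guessing field-level selectors.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class JobCardParser(HTMLParser):
    """Collect the first link and the text of each jobListing element."""

    def __init__(self) -> None:
        super().__init__()
        self.cards = []   # one dict per card: {"url": ..., "text": [...]}
        self._depth = 0   # nesting depth inside the current card, 0 = outside

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if self._depth == 0:
            if attr.get("data-automation") != "jobListing":
                return
            self.cards.append({"url": None, "text": []})
        # Remember the first href seen inside the card as the job URL.
        if tag == "a" and self.cards[-1]["url"] is None:
            self.cards[-1]["url"] = attr.get("href")
        if tag not in VOID_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._depth and tag not in VOID_TAGS:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth and data.strip():
            self.cards[-1]["text"].append(data.strip())


def parse_job_cards(html: str) -> list:
    """Return [{"url": ..., "text": "a | b | c"}, ...] for every card found."""
    parser = JobCardParser()
    parser.feed(html)
    parser.close()
    return [{"url": c["url"], "text": " | ".join(c["text"])} for c in parser.cards]
```

with BeautifulSoup installed, soup.select('[data-automation="jobListing"]') targets the same elements in one line. once you have inspected a live card, replace the joined text with per-field lookups.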

approach 2: intercept the xapi calls

the cleaner path for bulk collection is intercepting the API calls the browser makes. open DevTools on a search page and filter by xapi.supercharge-srp.co. you’ll see a call like:

GET https://xapi.supercharge-srp.co/job-search/graphql?...

it’s a GraphQL endpoint. the query accepts country, locale, keyword, pageSize (max 30), and page parameters. responses are clean JSON with full job metadata including job ID, advertiser name, salary range, work type, and posting date. replaying these requests directly with httpx is significantly faster than driving a browser, but you’ll need a residential IP that has already passed Cloudflare’s bot score threshold for the session cookie to be valid.
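a minimal sketch of assembling those parameters before replaying a request. only the five parameter names and the pageSize cap come from the text above. the function name, the 1-based page numbering, and the flat dict shape are assumptions, and the GraphQL query document itself is not shown here because it has to be copied from the call you capture in DevTools.

```python
MAX_PAGE_SIZE = 30  # cap quoted in the text above


def build_search_params(country: str, locale: str, keyword: str,
                        page: int = 1, page_size: int = MAX_PAGE_SIZE) -> dict:
    """Return the search parameters with pageSize clamped to the documented cap."""
    if page < 1:
        # assumption: pages are numbered from 1
        raise ValueError("page numbers are assumed to start at 1")
    return {
        "country": country,
        "locale": locale,
        "keyword": keyword,
        "pageSize": max(1, min(page_size, MAX_PAGE_SIZE)),
        "page": page,
    }
```

how these values are encoded in the real request (query string versus a GraphQL variables object) should mirror the captured call exactly, along with the cookies from the browser session that passed Cloudflare.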

proxy setup and rate limits

JobsDB enforces rate limits at the IP level. based on testing in Q1 2026, the thresholds look roughly like this:

proxy type             | requests before block | session reuse         | recommended for
datacenter             | 5-10                  | no                    | not viable
ISP/static residential | 40-80                 | yes                   | low-volume monitoring
rotating residential   | 200-500/IP            | rotate every 20-30 req | bulk collection
mobile (4G SG/HK)      | 300-600/IP            | yes                   | high-trust, highest success

mobile proxies from Singapore or Hong Kong carry the highest trust score with Cloudflare on hk.jobsdb.com. if you’re targeting Thailand listings specifically, Thai mobile IPs perform noticeably better on th.jobsdb.com than SG IPs. rotate your proxy on every new search query, not mid-pagination: switching IPs partway through a result set breaks your session cookie and triggers re-challenges.
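one way to enforce the "rotate per query, never mid-pagination" rule is to pin a proxy to each search query. this is a sketch under stated assumptions: the class name, the query_key format, and the proxy URLs in the usage note are all hypothetical.

```python
from itertools import cycle


class QueryStickyProxies:
    """Hand out one proxy per search query and reuse it for every page of that query."""

    def __init__(self, proxies: list) -> None:
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._pool = cycle(proxies)
        self._assigned = {}  # query_key -> proxy currently pinned to it

    def proxy_for(self, query_key: str) -> str:
        # The first page of a query takes the next proxy; later pages reuse it.
        if query_key not in self._assigned:
            self._assigned[query_key] = next(self._pool)
        return self._assigned[query_key]

    def release(self, query_key: str) -> None:
        # Call after the last page, or to force a fresh proxy after a block.
        self._assigned.pop(query_key, None)
```

call proxy_for("hk|accounting|7d") before each page request and release() once the query is finished, so the session cookie always travels with the IP that earned it.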

for comparison, the rate limit patterns on Monster job listings and SimplyHired job postings are more forgiving, making JobsDB one of the stricter targets in the APAC job board space.

pagination and data extraction

JobsDB caps search results at 26 pages (30 jobs per page = 780 jobs max per query). to collect beyond that, segment your queries by:

  1. job category (there are ~40 top-level categories available in the filter menu)
  2. posting date range (use postedAt filters: last 24h, 3 days, 7 days, 30 days)
  3. location district (HK has 18 districts; Thailand queries can filter by province)
  4. salary band (where disclosed)

combining category + date range slices gets you full coverage without hitting the 780-job ceiling on any single query. this is the same segmentation strategy that works for JobStreet Southeast Asia listings, which runs on a nearly identical SEEK platform underneath.
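the category + date range slicing can be sketched as a simple cross product. the 26-page and 30-job figures come from the text above. the window labels and category names are illustrative placeholders, not the values the site actually expects.

```python
from itertools import product

MAX_PAGES = 26
PAGE_SIZE = 30
RESULT_CEILING = MAX_PAGES * PAGE_SIZE  # 780 jobs reachable per query

# Placeholder labels for the four postedAt windows listed above.
DATE_WINDOWS = ("24h", "3d", "7d", "30d")


def build_slices(categories, date_windows=DATE_WINDOWS) -> list:
    """Return one query slice per (category, date window) pair."""
    return [{"category": c, "postedAt": w} for c, w in product(categories, date_windows)]


def needs_further_split(total_results: int) -> bool:
    """True when a slice reports more jobs than pagination can reach."""
    return total_results > RESULT_CEILING
```

the date windows are nested (the last 3 days include the last 24 hours), so the same job will show up in several slices and needs deduplicating on jobId. when a slice still exceeds the ceiling, split it again by district or salary band.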

key fields available in the API response:

  • jobId, title, advertiser.description (company name)
  • salary.minimum, salary.maximum, salary.label
  • workTypes array (full-time, part-time, contract, casual)
  • listingDate, expiryDate
  • locations array with region/district hierarchy
  • bulletPoints (the short selling points shown in listing cards)

salary data is only present when the advertiser discloses it: about 35% of HK listings and ~25% of Thailand listings in a May 2026 sample.
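because salary is missing on most listings, the parsing step should treat it as optional. here is a minimal sketch that flattens one raw job object using the field names listed above. the JobRecord class and the assumption that each job arrives as a plain nested dict are illustrative, not confirmed by the API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobRecord:
    job_id: str
    title: str
    company: str
    salary_min: Optional[float]
    salary_max: Optional[float]
    salary_label: Optional[str]
    work_types: List[str]
    listing_date: Optional[str]


def normalize_job(raw: dict) -> JobRecord:
    """Flatten one raw job dict; an undisclosed salary becomes None, not an error."""
    salary = raw.get("salary") or {}
    advertiser = raw.get("advertiser") or {}
    return JobRecord(
        job_id=str(raw["jobId"]),  # jobId is required, so fail loudly if absent
        title=raw.get("title", ""),
        company=advertiser.get("description", ""),
        salary_min=salary.get("minimum"),
        salary_max=salary.get("maximum"),
        salary_label=salary.get("label"),
        work_types=list(raw.get("workTypes") or []),
        listing_date=raw.get("listingDate"),
    )
```

keeping undisclosed salaries as None rather than 0 stops them from dragging down averages in a benchmarking dataset.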

handling errors and anti-bot responses

a well-structured scraper needs to handle these response patterns:

  • 403 with Cloudflare challenge page: rotate proxy, re-initialize browser context, retry with 15-30s delay
  • 429 Too Many Requests: back off 60-120s, reduce concurrency from your current IP
  • 200 with empty results array: usually a Cloudflare JS challenge pass-through, not a genuine empty result
  • GraphQL error with UNAUTHENTICATED: your session cookie has expired, re-authenticate via browser init

keep concurrency at 2-3 parallel workers per residential IP pool to stay below detection thresholds. more aggressive parallelism is possible with mobile proxies, but test incrementally. the same discipline applies when you’re scraping Indeed job listings with proxies, which is the gold standard reference for job board proxy rotation patterns.
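the four response patterns above can be folded into one classifier that tells the worker what to do next. this is a sketch, not the definitive handling: the Action names are made up, the UNAUTHENTICATED check is a plain substring match on the errors payload, and data.jobs.jobs as the location of the results array is an assumption to verify against a real response.

```python
import json
from enum import Enum


class Action(Enum):
    PARSE = "parse"
    ROTATE_PROXY = "rotate_proxy"        # 403, or challenge HTML on a 200
    BACK_OFF = "back_off"                # 429
    REFRESH_SESSION = "refresh_session"  # expired session cookie
    SUSPECT_EMPTY = "suspect_empty"      # 200 with no results: re-check later


# Delay ranges in seconds, taken from the bullets above.
DELAYS = {Action.ROTATE_PROXY: (15, 30), Action.BACK_OFF: (60, 120)}


def classify_response(status: int, body: str) -> Action:
    """Map an HTTP status and response body to the next action for the worker."""
    if status == 403:
        return Action.ROTATE_PROXY
    if status == 429:
        return Action.BACK_OFF
    try:
        payload = json.loads(body)
    except ValueError:
        # Non-JSON on a 200 is treated as a Cloudflare challenge page.
        return Action.ROTATE_PROXY
    if not isinstance(payload, dict):
        return Action.ROTATE_PROXY
    if "UNAUTHENTICATED" in json.dumps(payload.get("errors") or []):
        return Action.REFRESH_SESSION
    jobs = (payload.get("data") or {}).get("jobs") or {}
    # Assumed layout: the results array lives at data.jobs.jobs.
    if not jobs.get("jobs"):
        return Action.SUSPECT_EMPTY
    return Action.PARSE
```

a worker can then sleep for random.uniform(*DELAYS[action]) before retrying, and treat SUSPECT_EMPTY as a reason to repeat the query on a fresh session rather than recording zero results.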

quick error-handling checklist:

  • validate response is JSON before parsing (Cloudflare HTML slips through on 200s)
  • check data.jobs.total against expected page count before paginating
  • log jobId + listingDate to a dedup store so re-runs don’t double-insert
  • store raw API responses alongside parsed records for schema change recovery
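the last two checklist items can share one table: key each row on jobId + listingDate and keep the raw JSON next to it. a minimal sketch using the standard library sqlite3 module follows. the table and function names are illustrative.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the dedup store; the primary key enforces uniqueness."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " job_id TEXT NOT NULL,"
        " listing_date TEXT NOT NULL,"
        " raw_json TEXT NOT NULL,"
        " PRIMARY KEY (job_id, listing_date))"
    )
    return conn


def save_job(conn: sqlite3.Connection, raw: dict) -> bool:
    """Store one raw job; return False if this jobId + listingDate already exists."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO jobs VALUES (?, ?, ?)",
        (str(raw["jobId"]), raw.get("listingDate") or "", json.dumps(raw, sort_keys=True)),
    )
    conn.commit()
    # rowcount is 1 for a new row and 0 when the insert was ignored as a duplicate
    return cur.rowcount == 1
```

re-running a crawl then becomes safe: duplicates are skipped at insert time, and the stored raw_json lets you re-parse old records if the response schema changes.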

bottom line

JobsDB Hong Kong and Thailand are scrapable in 2026 by bootstrapping a session in a Playwright-driven browser and then making direct GraphQL API calls for bulk collection. use rotating residential or mobile proxies, segment queries by category and date range to beat the 780-job ceiling, and build robust error handling for Cloudflare 200/403 ambiguity. DRT covers the full landscape of APAC and global job board scraping. if you’re building a multi-market labor dataset, start with this site’s guides on the other major regional platforms before assuming your stack generalizes cleanly.
