JobStreet is one of the most-scraped job boards in Southeast Asia, and for good reason: it covers Singapore, Malaysia, the Philippines, Indonesia, and Vietnam under one domain structure, making it a high-value target for recruiters building talent pipelines, researchers tracking hiring trends, and HR tech companies benchmarking salaries. Scraping JobStreet in 2026 is harder than it looks. Seek’s acquisition of JobStreet brought infrastructure upgrades that tightened bot detection, and naive Requests-based scrapers now hit walls within minutes.
What JobStreet’s Anti-Bot Stack Looks Like in 2026
JobStreet runs Cloudflare with JavaScript challenges on most listing pages. The search results endpoint (/en/job-search/) renders server-side HTML, but the pagination and filter state are managed via URL query parameters, which is good news for structured scraping. The job detail pages load salary and company data dynamically via XHR calls to internal APIs.
The key fingerprinting vectors to watch:
- TLS fingerprint (JA3/JA4): Node’s default `https` module and Python’s `requests` both fail here
- Browser challenge cookies: `cf_clearance` is required on subsequent requests
- Rate limiting: more than 30 requests per minute from one IP triggers a 429 or a silent redirect to a CAPTCHA wall
- Headless detection: `navigator.webdriver` checks are active on job detail pages
If you’ve already dealt with similar Cloudflare setups on other regional platforms, the pattern will feel familiar. The approach for How to Scrape Seek.com.au Australian Job Listings in 2026 maps closely to what works on JobStreet, since both are Seek-owned properties with shared infrastructure.
Recommended Scraping Stack
For most production use cases, you want one of three approaches depending on volume and budget:
| Approach | Best for | Cost | Reliability |
|---|---|---|---|
| Playwright + residential proxy | Low-to-mid volume, full data | $50-200/mo | High |
| Scraping API (Apify/Zyte) | Medium volume, managed | $100-400/mo | High |
| Raw HTTP + TLS spoofing (curl-cffi) | High volume, cost-sensitive | $20-80/mo | Medium |
For most teams scraping fewer than 10,000 listings per day, Playwright with a residential proxy rotation is the most reliable path. For higher volumes, Zyte’s Smart Browser handles Cloudflare challenges automatically and exposes a simple HTTP API.
The curl-cffi library in Python is worth knowing about: it mimics browser TLS fingerprints at the HTTP layer without spinning up a full browser, which cuts infrastructure cost significantly. It works on JobStreet’s listing index pages but breaks on detail pages that fire JavaScript challenges.
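As a sketch of that approach, the pattern below fetches a listing-index page with a browser-like TLS fingerprint. It assumes curl-cffi is installed (`pip install curl-cffi`); the `impersonate` value is one of the library's documented browser profiles, and the guarded import lets the snippet degrade cleanly where the library is absent.

```python
# Sketch: fetch a JobStreet listing-index page with a Chrome-like TLS
# fingerprint via curl-cffi, without spinning up a full browser.
try:
    from curl_cffi import requests as cf_requests
    HAVE_CURL_CFFI = True
except ImportError:
    HAVE_CURL_CFFI = False

def fetch_index_page(url: str):
    """Fetch a listing-index page; raises if curl-cffi is unavailable."""
    if not HAVE_CURL_CFFI:
        raise RuntimeError("curl-cffi is not installed (pip install curl-cffi)")
    # impersonate tells curl-cffi which browser's JA3/JA4 fingerprint to send
    return cf_requests.get(url, impersonate="chrome110", timeout=30)
```

Expect this to work on search-index pages only; detail pages that fire JavaScript challenges still need a real browser.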
Extracting Listing Data
JobStreet’s search results page at https://www.jobstreet.com.sg/en/job-search/ accepts query parameters for keyword, location, and classification. The HTML structure is stable enough to parse with BeautifulSoup after you’ve passed the Cloudflare check.
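Building those search URLs by hand invites encoding bugs; a small stdlib helper keeps parameters safe. This is a sketch using the parameter names shown in the URL above, not an exhaustive list of supported filters:

```python
from urllib.parse import urlencode

BASE = "https://www.jobstreet.com.sg/en/job-search/"

def search_url(keywords: str, location: str, page: int = 1) -> str:
    # urlencode escapes spaces and non-ASCII in keywords/location
    return BASE + "?" + urlencode({"keywords": keywords,
                                   "location": location,
                                   "page": page})
```

For example, `search_url("data engineer", "Singapore", 2)` yields a properly escaped page-2 query.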
Here’s a minimal working pattern using Playwright:
```python
from urllib.parse import quote_plus

from playwright.async_api import async_playwright

async def scrape_jobstreet(keyword: str, location: str, pages: int = 5):
    results = []
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
            proxy={"server": "http://YOUR_RESIDENTIAL_PROXY:PORT",
                   "username": "user", "password": "pass"},
        )
        page = await context.new_page()
        for pg in range(1, pages + 1):
            # quote_plus escapes spaces and non-ASCII in the query values
            url = (f"https://www.jobstreet.com.sg/en/job-search/"
                   f"?keywords={quote_plus(keyword)}"
                   f"&location={quote_plus(location)}&page={pg}")
            await page.goto(url, wait_until="domcontentloaded")
            await page.wait_for_selector("[data-automation='jobListing']",
                                         timeout=15000)
            cards = await page.query_selector_all("[data-automation='jobListing']")
            for card in cards:
                title = await card.query_selector("[data-automation='jobTitle']")
                company = await card.query_selector("[data-automation='jobCompany']")
                results.append({
                    "title": await title.inner_text() if title else None,
                    "company": await company.inner_text() if company else None,
                })
        await browser.close()
    return results
```

The `data-automation` attributes on JobStreet are relatively stable, though they’ve changed twice since 2023, so build your selectors around them but keep a fallback XPath for the job ID embedded in the listing URL.
Salary data is not always in the card. You’ll need to visit individual job detail pages, where the salary appears in an element tagged with `data-automation="job-detail-salary"`. On listings where salary is hidden, that element is absent entirely rather than masked, so a simple `None` check is sufficient.
Country and Language Variants
JobStreet operates slightly differently per country, with separate domains and locale-specific URL structures:
- Singapore: `jobstreet.com.sg/en/job-search/`
- Malaysia: `jobstreet.com.my/en/job-search/`
- Philippines: `jobstreet.com.ph/en/job-search/`
- Indonesia: `jobstreet.co.id/id/job-search/` (Indonesian-language default)
- Vietnam: `jobstreet.com.vn/en/job-search/`
Indonesia and Vietnam are worth noting because they use country-code TLDs rather than .com subdomains, and the Indonesian version defaults to Bahasa Indonesia, meaning classification labels and location names won’t match if you’re trying to normalize across countries. Run separate scrapers per domain and normalize at the data layer.
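The per-country endpoints collapse into a small config table. A minimal sketch, with URLs copied from the list above:

```python
# country code -> (domain, locale segment)
JOBSTREET_DOMAINS = {
    "SG": ("https://www.jobstreet.com.sg", "en"),
    "MY": ("https://www.jobstreet.com.my", "en"),
    "PH": ("https://www.jobstreet.com.ph", "en"),
    "ID": ("https://www.jobstreet.co.id", "id"),  # Bahasa Indonesia default
    "VN": ("https://www.jobstreet.com.vn", "en"),
}

def search_base(country: str) -> str:
    """Return the job-search base URL for a two-letter country code."""
    domain, locale = JOBSTREET_DOMAINS[country.upper()]
    return f"{domain}/{locale}/job-search/"
```

Driving one scraper per entry in this table keeps the per-domain differences (TLD, locale) out of the parsing code.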
If you’re building a multi-platform SEA hiring dataset, also look at what’s available on How to Scrape JobsDB Hong Kong + Thailand Job Listings (2026), since JobsDB covers markets that JobStreet doesn’t.
Proxy and Rate Limit Strategy
Residential proxies are non-negotiable for production JobStreet scraping. Datacenter IP ranges are flagged at the ASN level by Cloudflare within hours. Singapore residential IPs work best for .com.sg, Malaysian IPs for .com.my, and so on — using a geo-matched proxy reduces the chance of CAPTCHA challenges from location mismatch signals.
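Geo-matching can be automated from the target URL's TLD. The proxy endpoint and the `-country-xx` username suffix below are hypothetical placeholders; residential providers each have their own format, so check your vendor's docs:

```python
PROXY_HOST = "proxy.example.com:8000"  # hypothetical provider endpoint

def geo_proxy(listing_url: str, user: str, password: str) -> dict:
    # Map the JobStreet TLD to a proxy exit country so the IP's
    # geolocation matches the site being scraped
    tld_to_country = {".com.sg": "sg", ".com.my": "my", ".com.ph": "ph",
                      ".co.id": "id", ".com.vn": "vn"}
    country = next((c for tld, c in tld_to_country.items()
                    if tld in listing_url), "sg")
    # Many residential providers encode the exit country in the username;
    # the exact syntax varies by vendor
    return {"server": f"http://{PROXY_HOST}",
            "username": f"{user}-country-{country}",
            "password": password}
```

The returned dict matches the shape Playwright's `proxy` context option expects, so it can be passed straight into `browser.new_context()`.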
Rate limits to stay within:
- No more than 1 request per 3 seconds per IP
- Rotate IP after every 15-20 requests at most
- Add random 1-4 second jitter between requests
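Those three limits can be encoded in a small throttle. A sketch only; the rotation signal is left to the caller, since proxy-rotation APIs vary by provider:

```python
import random
import time

class Throttle:
    """Enforce the limits above: a 3s base delay plus 1-4s jitter per
    request, and a rotation signal every `rotate_every` requests."""

    def __init__(self, rotate_every: int = 15):
        self.rotate_every = rotate_every
        self.count = 0

    def delay(self) -> float:
        # 3s base + 1-4s jitter => 4-7s between requests
        return 3.0 + random.uniform(1.0, 4.0)

    def record_request(self) -> bool:
        """Count a request; return True when the caller should rotate IPs."""
        self.count += 1
        if self.count >= self.rotate_every:
            self.count = 0
            return True
        return False

    def wait(self) -> bool:
        time.sleep(self.delay())
        return self.record_request()
```

In the Playwright loop, call `throttle.wait()` between `page.goto()` calls and swap the proxy whenever it returns `True`.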
For comparison, How to Scrape Foundit (Monster India) Job Postings in 2026 covers a platform with looser rate limiting than JobStreet, so if you’re running a multi-board scraper, JobStreet should dictate your throttle ceiling.
If you’re operating at the data infrastructure level and need to understand how proxy sourcing works behind the scenes, the How to Scrape Cars.com Vehicle Listings and Dealer Data (2026) guide has a solid breakdown of residential proxy pool management that applies to any high-volume scraping operation.
For teams managing proprietary data sources alongside public scraping, How to Scrape Tracxn Free Tier Pages in 2026 covers a useful adjacent case: extracting structured company-level data that pairs well with hiring signal data from job boards.
Storing and Structuring the Output
Raw JobStreet data needs cleanup before it’s useful. Common issues:
- Salary ranges are free-text strings (“SGD 4,000 – SGD 7,000 per month”), so you’ll need a regex normalization pass
- Company names include legal suffixes (“Pte. Ltd.”, “Sdn. Bhd.”) that fragment grouping if not stripped
- Location fields mix district names with MRT station references in Singapore
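The first two cleanup passes can be sketched with the standard library. The salary pattern below is written against the example string above and should be extended as new formats appear; the suffix list is illustrative, not exhaustive:

```python
import re

# "SGD 4,000 - SGD 7,000 per month" -> currency + min/max integers
_SALARY_RE = re.compile(
    r"(?P<cur>[A-Z]{3})\s*(?P<min>[\d,]+)\s*[–—-]\s*(?:[A-Z]{3}\s*)?(?P<max>[\d,]+)"
)

# Common legal suffixes that fragment company-level grouping
_LEGAL_SUFFIXES = re.compile(
    r"\s*(Pte\.?\s*Ltd\.?|Sdn\.?\s*Bhd\.?|Inc\.?|Co\.?,?\s*Ltd\.?)\s*$",
    re.IGNORECASE,
)

def parse_salary(text: str):
    """Return {currency, salary_min, salary_max} or None if unparseable."""
    m = _SALARY_RE.search(text)
    if not m:
        return None
    to_int = lambda s: int(s.replace(",", ""))
    return {"currency": m.group("cur"),
            "salary_min": to_int(m.group("min")),
            "salary_max": to_int(m.group("max"))}

def normalize_company(name: str) -> str:
    """Strip a trailing legal suffix so grouping keys line up."""
    return _LEGAL_SUFFIXES.sub("", name).strip()
```

The field names match the Postgres schema below, so the parsed dict can be merged straight into a listing row.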
A minimal Postgres schema:
```sql
CREATE TABLE jobstreet_listings (
    id          TEXT PRIMARY KEY,  -- extracted from the listing URL
    country     CHAR(2),
    title       TEXT,
    company     TEXT,
    salary_min  INT,
    salary_max  INT,
    currency    CHAR(3),
    location    TEXT,
    posted_at   DATE,
    scraped_at  TIMESTAMPTZ DEFAULT NOW()
);
```

Deduplicating on the URL-embedded job ID works in most cases, but JobStreet reuses IDs across reposts, so a composite key of `(id, posted_at)` is safer if you’re tracking repost frequency.
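The ID extraction and composite key can live in code as well. A sketch; the trailing-digits URL pattern is an assumption, so verify it against live JobStreet listing URLs:

```python
import re

def job_id_from_url(url: str):
    """Pull the numeric job ID from a listing URL. Assumes the ID is the
    trailing run of digits before any query string -- verify on live URLs."""
    m = re.search(r"(\d+)/?(?:\?.*)?$", url)
    return m.group(1) if m else None

def dedup_key(listing: dict) -> tuple:
    # (id, posted_at): the date disambiguates a repost from the original
    # posting, since JobStreet reuses IDs across reposts
    return (listing["id"], listing["posted_at"])
```

Feeding `dedup_key` into a seen-set (or a unique index on the same two columns) keeps repost tracking and deduplication consistent between the scraper and the database.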
Bottom Line
JobStreet is scrapeable in 2026 with Playwright plus residential proxies, but you need geo-matched IPs, conservative rate limits, and selector monitoring since the data-automation attributes drift. For most teams, Zyte Smart Browser is the fastest path to stable production data. DRT covers scraping setups like this across dozens of platforms, so if you’re building a broader SEA data pipeline, check the rest of the job board coverage in the series.