Ashby has quietly become the ATS of choice for fast-growing startups and Series B+ companies, which makes scraping Ashby career sites a high-signal move for talent pipeline builders, recruiting agencies, and competitive intelligence teams. The problem is that there's no central directory of Ashby job boards — each company gets its own board under jobs.ashbyhq.com/{company-slug} — and the pages are React-rendered, which trips up naive scrapers.
How Ashby Job Pages Are Structured
Every Ashby career site follows a predictable URL schema:
```
https://jobs.ashbyhq.com/{company-slug}
https://jobs.ashbyhq.com/{company-slug}/{job-id}
```

The listing page renders a JSON payload into the DOM, but Ashby also exposes a public API endpoint that returns structured job data without JavaScript rendering:
```
GET https://api.ashbyhq.com/posting-api/job-board/{company-slug}
```

This is the cleanest extraction path. The response is JSON with fields like `title`, `team`, `location`, `isRemote`, `employmentType`, and `applicationFormDefinition`. No authentication required, no browser needed.
```python
import httpx

SLUG = "linear"  # replace with target company slug

resp = httpx.get(f"https://api.ashbyhq.com/posting-api/job-board/{SLUG}", timeout=15)
resp.raise_for_status()
data = resp.json()

# Each entry in "jobs" is one open posting for the company
for job in data.get("jobs", []):
    print(job["title"], "|", job.get("location", {}).get("name"), "|", job["id"])
```

Run this against a list of target slugs and you have a structured talent pipeline feed in minutes.
Finding Company Slugs at Scale
The slug discovery problem is where most pipelines break. There’s no public directory of all Ashby customers, so you need to build your own list.
Three approaches that work in 2026:
- Google dork: `site:jobs.ashbyhq.com` returns thousands of indexed subpaths. Paginate through results and extract the slug from the URL path.
- LinkedIn scrape: Filter companies by ATS tech stack using tools like Clay or PhantomBuster, which surface the ATS provider from careers-page redirects.
- Common Crawl: Query the March 2026 crawl for `jobs.ashbyhq.com` hostnames and extract unique slugs from the `url` column in Athena or BigQuery.
For a talent agency scraping 500+ companies, a seeded Common Crawl query gives the highest coverage per compute dollar.
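If you'd rather not stand up Athena for a first pass, Common Crawl's public CDX index API covers the same ground. A minimal sketch, with the caveat that the crawl label CC-MAIN-2026-13 is an assumption; check index.commoncrawl.org for the actual ID of the March 2026 crawl:

```python
import json
from urllib.parse import urlsplit

import httpx

# Assumed crawl label; verify the real ID at https://index.commoncrawl.org/
INDEX = "https://index.commoncrawl.org/CC-MAIN-2026-13-index"

resp = httpx.get(INDEX, params={"url": "jobs.ashbyhq.com/*", "output": "json"}, timeout=60)
resp.raise_for_status()

slugs = set()
for line in resp.text.splitlines():
    record = json.loads(line)
    # The first path segment of each captured URL is the company slug
    path = urlsplit(record["url"]).path.strip("/")
    if path:
        slugs.add(path.split("/")[0])

print(f"{len(slugs)} unique slugs found")
```

Large result sets are paginated (the index server supports a page parameter), so for a full sweep the Athena route over the columnar index is still the cheaper bulk option.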
Anti-Bot Behaviour and Rate Limits
The posting API (api.ashbyhq.com/posting-api) is intentionally public and low-friction. Ashby wants jobs indexed. That said, hammering it with concurrent requests will get your IP soft-blocked within minutes.
Realistic limits from testing:
| Behaviour | Observed limit |
|---|---|
| Concurrent requests (same IP) | ~5 before 429s appear |
| Requests per minute (single IP) | ~60 sustained |
| Cooldown after 429 | 30-90 seconds |
| User-agent rejection | Not enforced on API |
| Bot detection on HTML pages | Cloudflare Turnstile (varies by company) |
The HTML job listing pages (jobs.ashbyhq.com) are a different story. Some companies enable Cloudflare Turnstile on the front-end, which means rendering them requires a headless browser or a Turnstile solver. For bulk data extraction, stick to the API — avoid the HTML path entirely unless you need application form fields that aren’t exposed in the JSON.
Rotate IPs per company slug, not per request. A residential proxy pool with 1 request per slug per session keeps your fingerprint clean and stays well within Ashby’s tolerance. If you’re also scraping other ATS platforms in the same pipeline — say, Recruitee or Personio — use separate proxy sessions per provider to avoid cross-contamination of block signals.
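Here's a sketch of that session discipline, assuming a provider that maps a session ID in the proxy username to a sticky residential exit; the proxy URL format is hypothetical, so adapt it to your vendor's convention:

```python
import httpx

# Hypothetical provider format: session ID embedded in the username
PROXY_TEMPLATE = "http://user-session-{session}:secret@proxy.example.com:8000"

def fetch_board(slug: str) -> dict | None:
    # One session, and therefore one residential IP, per company slug
    proxy = PROXY_TEMPLATE.format(session=slug)
    # httpx >= 0.26 takes proxy=; older versions use proxies=
    with httpx.Client(proxy=proxy, timeout=15) as client:
        resp = client.get(f"https://api.ashbyhq.com/posting-api/job-board/{slug}")
        if resp.status_code == 429:
            return None  # requeue the slug after a cooldown rather than hammering
        resp.raise_for_status()
        return resp.json()
```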
Normalising Ashby Data for Cross-ATS Pipelines
Raw Ashby output doesn’t map cleanly to other ATS schemas. If you’re building a unified talent intelligence feed that also pulls from iCIMS or Taleo, normalisation is the unglamorous work that determines whether your pipeline is actually useful.
Ashby-specific fields to watch:
- `location.name` can be `"Remote"`, a city, or a hybrid string like `"New York, NY (Hybrid)"` — parse these consistently
- `employmentType` uses Ashby's own enum: `"FullTime"`, `"PartTime"`, `"Contract"`, `"Temporary"` — remap to your schema
- `team` is a nested object with `id` and `name`, not a flat string
- `compensationTier` appears only when the company has salary transparency enabled — treat it as optional
A canonical schema across ATS providers should use ISO 3166-1 alpha-2 for country codes, a remote_type enum (full, hybrid, none), and Unix timestamps for posted_at. Ashby’s createdAt field is UTC ISO 8601, which is straightforward to convert.
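A sketch of that mapping, following the field shapes listed above; the `remote_type` enum and Unix `posted_at` come from the canonical schema just described, while the `employment_type` values are illustrative names of our own:

```python
from datetime import datetime

# Remap Ashby's employmentType enum to an illustrative cross-ATS vocabulary
EMPLOYMENT_MAP = {
    "FullTime": "full_time",
    "PartTime": "part_time",
    "Contract": "contract",
    "Temporary": "temporary",
}

def remote_type(location_name: str, is_remote: bool) -> str:
    # Hybrid strings like "New York, NY (Hybrid)" take precedence over the flag
    if "hybrid" in (location_name or "").lower():
        return "hybrid"
    return "full" if is_remote else "none"

def normalise(job: dict) -> dict:
    location = (job.get("location") or {}).get("name", "")
    created = job.get("createdAt")  # UTC ISO 8601, per the note above
    return {
        "job_id": job["id"],
        "title": job["title"],
        "team": (job.get("team") or {}).get("name"),
        "employment_type": EMPLOYMENT_MAP.get(job.get("employmentType"), "unknown"),
        "remote_type": remote_type(location, bool(job.get("isRemote"))),
        # Unix timestamp, as the canonical schema requires
        "posted_at": int(datetime.fromisoformat(created.replace("Z", "+00:00")).timestamp())
        if created
        else None,
    }
```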
The same normalisation discipline applies when you’re pulling structured data from completely different verticals — the schema design lessons in How to Scrape Latin American Real Estate Sites cover multi-source field unification patterns that transfer directly to multi-ATS pipelines.
Running This at Scale
For a production pipeline covering 1,000+ Ashby companies, the architecture is straightforward:
- Orchestration: Temporal or a simple cron on a VPS — Ashby jobs don’t change by the minute, so daily or twice-daily refreshes are enough
- Queue: Redis or SQS with one task per company slug
- Workers: 10-20 concurrent workers, each with a dedicated residential IP session
- Storage: Postgres with an `ats_jobs` table and a `(company_slug, job_id, scraped_at)` composite key for deduplication
- Change detection: Hash the job list per slug on each run and only emit events when the hash changes (see the sketch after this list)
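The change-detection step can be as simple as hashing a canonical serialisation of each board's jobs. A minimal sketch; the hash store is a dict here but would live in Postgres in production:

```python
import hashlib
import json

def board_hash(jobs: list[dict]) -> str:
    # Sort by job id so reordering alone never triggers a false change event
    canonical = json.dumps(sorted(jobs, key=lambda j: j["id"]), sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def has_changed(slug: str, jobs: list[dict], seen: dict[str, str]) -> bool:
    new_hash = board_hash(jobs)
    if seen.get(slug) == new_hash:
        return False
    seen[slug] = new_hash
    return True  # emit a downstream event for this slug
```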
Short bullet checklist before going to production:
- Confirm the slug list covers your target company set (test 10 manually)
- Set the `httpx` timeout to 15s and retry twice with exponential backoff on 5xx (sketched after this checklist)
- Log 429s with the slug and timestamp — patterns reveal which companies have extra rate protection
- Store raw JSON alongside normalised rows — Ashby’s schema has changed twice in the past 18 months
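The retry item from the checklist as a small helper; this is a hand-rolled sketch rather than any particular retry library:

```python
import time

import httpx

def get_with_retry(client: httpx.Client, url: str, retries: int = 2) -> httpx.Response:
    # Retry 5xx with exponential backoff; 429s are logged and requeued, not retried inline
    for attempt in range(retries + 1):
        resp = client.get(url)
        if resp.status_code < 500:
            return resp
        if attempt < retries:
            time.sleep(2 ** attempt)  # 1s, then 2s
    return resp
```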
For monitoring, track the ratio of slugs returning zero jobs vs. a non-empty list. A sudden spike in zero-job responses usually means your IP pool is blocked, not that all your targets froze hiring simultaneously.
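A back-of-envelope version of that health check; the 30% threshold is an arbitrary starting point to tune against your own baseline:

```python
def pool_looks_blocked(job_counts: dict[str, int]) -> bool:
    """job_counts maps slug -> number of jobs returned on the latest run."""
    if not job_counts:
        return False
    zero_ratio = sum(1 for n in job_counts.values() if n == 0) / len(job_counts)
    return zero_ratio > 0.3  # more likely an IP pool block than a mass hiring freeze
```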
Bottom Line
The Ashby posting API is genuinely scraper-friendly — use it instead of rendering HTML, rotate IPs at the slug level, and invest the saved complexity into normalisation and deduplication. If you’re building a serious multi-ATS talent pipeline, Ashby is one of the easier integrations; the harder work is schema consistency across providers. DRT covers ATS scraping patterns, proxy infrastructure, and data pipeline design in depth — the same principles apply whether you’re pulling from five job boards or five hundred.