Bun for Web Scraping in 2026: Real Speed Gains, Real Caveats
Bun pitched itself as a drop-in Node.js replacement when it launched, but for web scraping engineers the real question is whether runtime speed gains survive contact with I/O-heavy, anti-bot-aware crawl workloads. In 2026, with Bun 1.x stable and a growing ecosystem, the honest answer is: mostly yes, with real caveats.
What Bun actually changes for scrapers
Bun is a JavaScript runtime built on JavaScriptCore (Safari’s engine) rather than V8. It ships with a native HTTP client, a fast SQLite driver, and a built-in test runner. For scrapers, the relevant gains are startup time and raw HTTP throughput.
Node.js cold-starts a simple fetch loop in roughly 80-120ms. Bun does it in 20-35ms. That gap doesn’t matter much for a long-running crawler, but it adds up fast for serverless or short-lived scraping jobs that spin up per-domain. Bun’s native fetch also benchmarks at roughly 2x the throughput of Node’s undici on concurrent requests in controlled tests, though real-world gains depend heavily on proxy latency and the target server.
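Throughput numbers like these are sensitive to proxy latency and target behavior, so it's worth re-measuring on your own workload rather than trusting controlled benchmarks. A minimal sketch of a timing helper (the `timed` function and the sleep-based stand-in for a fetch call are illustrative, not from any benchmark suite):

```typescript
// Wrap any async task and report how long it took, in milliseconds.
async function timed<T>(task: () => Promise<T>): Promise<{ ms: number; value: T }> {
  const start = performance.now();
  const value = await task();
  return { ms: performance.now() - start, value };
}

// Stand-in for a real fetch call so the harness stays self-contained;
// swap in `() => fetch(url)` to time an actual target.
const fakeFetch = () => new Promise<string>((r) => setTimeout(() => r("ok"), 50));

const { ms, value } = await timed(fakeFetch);
console.log(`${value} in ${ms.toFixed(1)}ms`);
```

Run the same harness under both `bun` and `node` against your actual targets before committing to a migration.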
If your stack is Rust-native and you need async patterns Node can’t express cleanly, Web Scraping with Reqwest + Tokio in Rust: Async Patterns (2026) is worth reading alongside this — the architectural philosophy is similar even if the language isn’t.
Bun for static HTML scraping
For sites that return complete HTML (no JavaScript rendering needed), Bun plus a lightweight parser like cheerio, node-html-parser, or linkedom works well. Here’s a minimal concurrent scraper using linkedom that processes a list of URLs with configurable concurrency:
```ts
import { parseHTML } from "linkedom";

const urls = [
  "https://example.com/products/1",
  "https://example.com/products/2",
  "https://example.com/products/3",
];

const CONCURRENCY = 10;

async function scrape(url: string) {
  const res = await fetch(url, {
    headers: { "User-Agent": "Mozilla/5.0 (compatible; BunBot/1.0)" },
  });
  const html = await res.text();
  const { document } = parseHTML(html);
  return {
    url,
    title: document.querySelector("h1")?.textContent?.trim(),
    price: document.querySelector("[data-price]")?.getAttribute("data-price"),
  };
}

const results: Awaited<ReturnType<typeof scrape>>[] = [];
for (let i = 0; i < urls.length; i += CONCURRENCY) {
  const batch = urls.slice(i, i + CONCURRENCY).map(scrape);
  results.push(...(await Promise.all(batch)));
}

await Bun.write("output.json", JSON.stringify(results, null, 2));
```

`Bun.write` is noticeably faster than Node’s `fs.writeFile` for large output files, and linkedom parses HTML without full DOM overhead, which helps when you’re processing thousands of pages per run.
Bun vs Node.js vs Deno
| Runtime | Cold start | Native fetch | TypeScript | Playwright support | Best for |
|---|---|---|---|---|---|
| Node.js 22 | ~100ms | via undici | compile step | full | mature ecosystem, long-running crawlers |
| Bun 1.x | ~25ms | built-in | native | partial (via pw-bun) | short-lived jobs, high-throughput fetch |
| Deno 2.x | ~60ms | built-in | native | full | security-sensitive scraping, WASM |
Playwright compatibility is the biggest caveat. The playwright npm package installs and runs under Bun, but pw-bun (the native Bun driver) is still experimental as of early 2026 and occasionally breaks on browser version updates. For browser-based scraping, Node’s still the safer bet. If you’re deciding between browser automation frameworks anyway, Cypress vs Playwright for Web Scraping: When to Pick Each (2026) covers the tradeoffs in depth.
When Bun wins (and when it doesn’t)
Bun’s speed advantages show up most clearly in these scenarios:
- High-volume static scraping: 10,000+ pages per run with pure HTTP fetch, where startup and per-request overhead compound
- Serverless scrape workers: AWS Lambda or Fly.io machines where cold-start cost hits your bill directly
- Local dev iteration: native TypeScript execution means no `ts-node` or `tsx` wrapper, which genuinely cuts iteration time
- SQLite result storage: Bun’s built-in `bun:sqlite` is 3-4x faster than `better-sqlite3` under Node for bulk inserts
But Bun doesn’t win everywhere. It loses on:
- Headless browser scraping (Playwright compatibility is still rough)
- Scraping pipelines that depend on C-native Node addons, some of which break under Bun’s FFI layer
- Teams with heavy `express` middleware stacks, where migration cost outweighs runtime gains
- Anti-bot bypass requiring browser fingerprint matching (DevTools protocol reliability under Bun is unproven at scale)
For browser-based extraction without a JavaScript runtime overhead at all, Go Web Scraping with chromedp: Headless Chrome in Pure Go (2026) shows how a compiled language handles the same problem without any runtime dependency. And if concurrency is your primary concern rather than raw speed, Elixir Web Scraping with Crawly: BEAM Concurrency for Scrapers (2026) makes a strong case that the BEAM actor model beats any single-threaded runtime for fault-tolerant crawlers.
Proxy and anti-bot considerations
Bun’s fetch respects standard proxy environment variables (HTTP_PROXY, HTTPS_PROXY), but it doesn’t yet support SOCKS5 proxies natively. For residential or mobile proxy rotation, you need a wrapper:
```ts
import { ProxyAgent } from "undici";

const agent = new ProxyAgent("http://user:pass@proxy.example.com:8080");

const res = await fetch("https://target.com/data", {
  // @ts-ignore -- Bun accepts dispatcher as fetch option
  dispatcher: agent,
});
```

undici works under Bun because the Node.js compatibility layer is good enough for most npm packages. Native SOCKS5 support is on the roadmap but not shipped yet, which is worth knowing before you commit to Bun if your proxy infrastructure relies on it.
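For rotation, the dispatcher pattern generalizes: keep one `ProxyAgent` per upstream and cycle through them per request. The rotator below is a hypothetical helper, pure bookkeeping with no undici dependency so the selection logic is easy to test in isolation:

```typescript
// Round-robin over a fixed proxy list; each call returns the next proxy URL.
function makeRotator(proxies: string[]): () => string {
  if (proxies.length === 0) throw new Error("need at least one proxy");
  let i = 0;
  return () => proxies[i++ % proxies.length];
}

const nextProxy = makeRotator([
  "http://user:pass@proxy-a.example.com:8080",
  "http://user:pass@proxy-b.example.com:8080",
]);

// Per request, construct (or look up a cached) ProxyAgent for nextProxy()
// and pass it as the fetch dispatcher, as shown earlier.
console.log(nextProxy());
```

Caching one `ProxyAgent` per proxy URL matters in practice: agents hold connection pools, and rebuilding them per request throws that pooling away.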
For more complex setups — rotating proxies, anti-bot APIs, managed scraping endpoints — the Best Web Scraping APIs for Developers 2026: Build Scrapers Faster roundup covers services that handle fingerprinting and IP management so you can focus on parsing logic rather than infrastructure.
Bottom line
Use Bun for static HTTP scraping where throughput and startup time matter, particularly for serverless or batch workloads. The performance gains are real, and Node compatibility is solid enough for most fetch-based pipelines. Hold off on migrating browser-based scrapers until Playwright’s native Bun driver stabilizes. DRT will keep covering Bun’s ecosystem as it matures through late 2026.
Related guides on dataresearchtools.com
- Go Web Scraping with chromedp: Headless Chrome in Pure Go (2026)
- Elixir Web Scraping with Crawly: BEAM Concurrency for Scrapers (2026)
- Cypress vs Playwright for Web Scraping: When to Pick Each (2026)
- Web Scraping with Reqwest + Tokio in Rust: Async Patterns (2026)
- Best Web Scraping APIs for Developers 2026: Build Scrapers Faster