Deno scraping libraries 2026 reviewed

Deno scraping libraries reached a stability tipping point when Deno 2.0 shipped in late 2024 with full npm compatibility, native package management without package.json, and stable JSR (JavaScript Registry) support. By 2026 the runtime is a credible third option alongside Node and Bun for JavaScript-based scraping, with a unique angle: permission-based sandboxing. A Deno scraper can be denied disk access, network access to specific hosts, or environment variables at runtime. For untrusted scraper code (third-party plugins, customer-supplied scripts), this is uniquely valuable.

This guide covers what Deno offers for scraping in 2026, the libraries that work best, the npm packages that work via Deno’s compatibility shim, and the production patterns that exploit Deno’s strengths. Code is TypeScript throughout. By the end you will know whether Deno fits your project and how to deploy it without the sharp edges.

Why Deno for scraping

Deno’s specific strengths for scrapers:

  • Permission system: granular runtime permissions for filesystem, network, env vars
  • TypeScript native: no transpile step, no tsconfig wrangling
  • Web Standards APIs: fetch, ReadableStream, WebCrypto are all Web API spec
  • JSR registry: a faster, more secure alternative to npm with better TypeScript support
  • Built-in formatter, linter, tester, bundler: no separate tools
  • Single binary: easy install, no node_modules
  • Deno Deploy: edge serverless that runs Deno natively, free egress

For Deno’s official documentation, see docs.deno.com.

Where Deno does not lead

  • Pure speed: Bun is faster for most workloads
  • npm compatibility: better than Bun for some edge cases, worse for others
  • Community size: smaller than Node, smaller than Bun in 2026
  • Production maturity: behind Node, comparable to Bun

For a project where speed is the deciding factor, Bun. For maximum compatibility, Node. For permission-sandboxed code, Deno.

Installing Deno

curl -fsSL https://deno.land/install.sh | sh
# or via brew
brew install deno

deno --version  # 2.0+ in 2026

A first scraper

// scrape.ts
import { DOMParser } from "jsr:@b-fuze/deno-dom";

async function scrape(url: string) {
  const resp = await fetch(url, {
    headers: {
      "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
                   + "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    },
  });
  const html = await resp.text();
  const doc = new DOMParser().parseFromString(html, "text/html");

  if (!doc) {
    throw new Error("Failed to parse HTML");
  }

  const titles = Array.from(doc.querySelectorAll("h2.title"))
    .map((el) => el.textContent.trim());

  return titles;
}

const url = Deno.args[0];
if (!url) {
  console.error("Usage: deno run --allow-net scrape.ts <url>");
  Deno.exit(1);
}

const titles = await scrape(url);
console.log(JSON.stringify(titles, null, 2));

Run with explicit network permission:

deno run --allow-net=example.com scrape.ts https://example.com/products

The --allow-net=example.com restricts network access to only that host. Try to fetch any other URL and Deno blocks it. This is the security model: code runs only with the permissions you grant.
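
To see the model in action, here is a minimal sketch (a hypothetical blocked.ts): only example.com is granted, so the second fetch rejects. In an interactive terminal Deno prompts before denying; in CI or with --no-prompt it rejects outright:

// blocked.ts — run with: deno run --allow-net=example.com blocked.ts
const ok = await fetch("https://example.com");
console.log("example.com:", ok.status); // allowed by the grant

try {
  await fetch("https://example.org"); // not in the granted host list
} catch (err) {
  // Surfaces as a permission error, not a network error.
  console.error("blocked:", (err as Error).message);
}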

The permission model

Deno permissions for scrapers:

| permission | flag | use |
| --- | --- | --- |
| Network | --allow-net=host1,host2 | outbound fetch |
| Read FS | --allow-read=path | read files |
| Write FS | --allow-write=path | write files |
| Env vars | --allow-env=VAR1,VAR2 | read environment |
| Subprocesses | --allow-run | spawn external processes |
| FFI | --allow-ffi | native library calls |
| Workers | included by default | start Web Workers |
| All | --allow-all (or -A) | bypass all checks |

For a scraper, typical permissions:

deno run \
    --allow-net=target.example.com,api.example.com \
    --allow-read=./config \
    --allow-write=./output \
    --allow-env=API_KEY,PROXY_URL \
    src/main.ts

This is the discipline that makes Deno safer for running untrusted scraper modules: each module gets only what it needs.
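
Permission failures otherwise surface only when a code path runs, so it is worth verifying grants up front with the Deno.permissions API. A fail-fast sketch, with placeholder hostnames and env var names:

// Verify at startup that everything this scraper needs was granted.
const needed: Deno.PermissionDescriptor[] = [
  { name: "net", host: "target.example.com" },
  { name: "env", variable: "API_KEY" },
];

for (const desc of needed) {
  const status = await Deno.permissions.query(desc);
  if (status.state !== "granted") {
    console.error(`missing permission: ${JSON.stringify(desc)}`);
    Deno.exit(1);
  }
}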

Library survey

The major libraries and their 2026 state:

| library | purpose | source | maturity |
| --- | --- | --- | --- |
| deno-dom | HTML parsing, DOM API | jsr:@b-fuze/deno-dom | excellent |
| cheerio | HTML parsing, jQuery-style | npm:cheerio | excellent (via npm: specifier) |
| linkedom | HTML parsing, DOM API | npm:linkedom | excellent |
| puppeteer | browser automation | npm:puppeteer | good (Node compat) |
| playwright | browser automation | npm:playwright | partial (Node compat) |
| astral | Deno-native browser automation | jsr:@astral/astral | very good |
| got | HTTP client | npm:got | excellent (via npm:) |
| axios | HTTP client | npm:axios | excellent (via npm:) |
| Crawlee | crawler framework | npm:crawlee | excellent |
| p-queue | concurrency control | npm:p-queue | excellent |

JSR-published packages (jsr:@scope/name) are Deno-native and tend to have better TypeScript support. npm: packages work via the compat layer and cover most ecosystem libraries.
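
The two specifier styles coexist in a single module — a minimal illustration:

// JSR: Deno-native, types ship with the package
import { DOMParser } from "jsr:@b-fuze/deno-dom";
// npm: resolved through the Node compatibility layer
import * as cheerio from "npm:cheerio@1.0.0";

const doc = new DOMParser().parseFromString("<p>hi</p>", "text/html");
const $ = cheerio.load("<p>hi</p>");
console.log(doc?.querySelector("p")?.textContent, $("p").text());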

deno-dom for HTML parsing

deno-dom is the standard HTML parser for Deno. WASM-backed, fast, and exposes the browser DOM API:

import { DOMParser } from "jsr:@b-fuze/deno-dom";

const html = await fetch("https://example.com").then(r => r.text());
const doc = new DOMParser().parseFromString(html, "text/html");

// Standard DOM API
const title = doc?.querySelector("h1")?.textContent;
const links = Array.from(doc?.querySelectorAll("a[href]") || [])
  .map(a => a.getAttribute("href"));
const products = Array.from(doc?.querySelectorAll("article.product") || [])
  .map(p => ({
    title: p.querySelector("h2")?.textContent?.trim(),
    price: p.querySelector(".price")?.textContent?.trim(),
  }));

Performance is comparable to cheerio for typical HTML sizes, and the two stay roughly equal even on very large documents.
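
If you prefer the jQuery-style API instead, the npm build of cheerio runs unchanged under Deno. A sketch equivalent to the deno-dom extraction above:

import * as cheerio from "npm:cheerio@1.0.0";

const html = await fetch("https://example.com").then((r) => r.text());
const $ = cheerio.load(html);
const products = $("article.product")
  .map((_, el) => ({
    title: $(el).find("h2").text().trim(),
    price: $(el).find(".price").text().trim(),
  }))
  .get();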

Astral for browser automation

Astral is the Deno-native equivalent of Puppeteer. It runs Chromium with a TypeScript-first API:

import { launch } from "jsr:@astral/astral";

const browser = await launch();
const page = await browser.newPage("https://example.com/products");

// Wait for content to render
await page.waitForSelector("article.product");

// Extract via page evaluation
const products = await page.evaluate(() => {
  return Array.from(document.querySelectorAll("article.product")).map((el) => ({
    title: el.querySelector("h2")?.textContent?.trim(),
    price: el.querySelector(".price")?.textContent?.trim(),
  }));
});

await browser.close();
console.log(products);

Astral is lighter than Puppeteer and integrates better with Deno’s permission model. For full Playwright feature parity, use the npm:playwright package; for cleaner Deno-first integration, Astral.

Crawlee on Deno

Crawlee, originally a Node framework, runs on Deno via npm compat:

import { CheerioCrawler } from "npm:crawlee";

const crawler = new CheerioCrawler({
  async requestHandler({ request, $, enqueueLinks }) {
    const title = $("h1").text();
    console.log(`${request.url}: ${title}`);

    // Enqueue links from this page
    await enqueueLinks({
      selector: "a[href*='/product/']",
    });
  },
  maxRequestsPerCrawl: 100,
});

await crawler.run(["https://example.com/products"]);

Crawlee’s CheerioCrawler is for HTML scraping, PuppeteerCrawler and PlaywrightCrawler for browser-based. All three work on Deno.
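
The browser-based variants share the same handler shape. A sketch with PlaywrightCrawler — hedged per the survey table, since Playwright’s Node compat on Deno is partial, so verify it against your Deno version first:

import { PlaywrightCrawler } from "npm:crawlee";

const browserCrawler = new PlaywrightCrawler({
  async requestHandler({ page, request }) {
    // Real browser context: wait for client-side rendering to finish.
    await page.waitForSelector("article.product");
    const count = await page.locator("article.product").count();
    console.log(`${request.url}: ${count} products rendered`);
  },
  maxRequestsPerCrawl: 50,
});

await browserCrawler.run(["https://example.com/products"]);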

Stealth on Deno

Deno’s built-in fetch uses Hyper (Rust HTTP client) which has a distinct TLS fingerprint. For TLS-fingerprinted targets:

  1. Use curl-impersonate via subprocess: requires --allow-run (see the sketch after this list)
  2. Use Astral or Puppeteer for a full browser: heavier, but bypasses the TLS check entirely
  3. Use undici via npm: with a custom agent: limited stealth options
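
Option 1 is a thin subprocess wrapper. A minimal sketch, assuming a curl-impersonate Chrome wrapper binary (curl_chrome116 here — adjust to whatever your install provides) is on the PATH and the script runs with --allow-run:

// Route the request through curl-impersonate so the TLS fingerprint
// matches a real Chrome build instead of Deno's Hyper client.
async function impersonatedFetch(url: string): Promise<string> {
  const cmd = new Deno.Command("curl_chrome116", {
    args: ["-s", "--max-time", "15", url],
    stdout: "piped",
  });
  const { code, stdout } = await cmd.output();
  if (code !== 0) throw new Error(`curl-impersonate exited with code ${code}`);
  return new TextDecoder().decode(stdout);
}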

The cleanest path for stealth is Astral with Chromium because the TLS fingerprint then matches real Chrome:

import { launch } from "jsr:@astral/astral";

const browser = await launch({
  args: ["--disable-blink-features=AutomationControlled"],
});

const page = await browser.newPage();
await page.goto("https://target.example.com");
const html = await page.content();
await browser.close();

For broader fingerprinting context, see TLS fingerprinting in 2026.

Comparison: Deno vs Bun vs Node

| dimension | Deno 2 | Bun 1.1 | Node 20 |
| --- | --- | --- | --- |
| TypeScript native | yes | yes | no (transpile) |
| Web Standards APIs | full | most | partial |
| Permission system | yes | no | no |
| Built-in test runner | yes | yes | yes |
| Built-in fmt/lint | yes | yes | no |
| npm compat | very good | very good | native |
| JSR registry | yes | partial | no |
| Speed (typical scraping) | medium | fast | slow |
| Memory footprint | medium | small | large |
| Production maturity | good | good | excellent |

For new scraping projects where security and TypeScript ergonomics matter, Deno. For raw speed, Bun. For library compatibility above all, Node.

For Bun specifically, see scraping with Bun runtime: 2026 performance benchmarks.

Deno Deploy: edge serverless scraping

Deno Deploy is the serverless platform that runs Deno scripts at the edge. Similar to Cloudflare Workers but Deno-native:

// main.ts
Deno.serve(async (req) => {
  const url = new URL(req.url).searchParams.get("url");
  if (!url) return new Response("Missing url param", { status: 400 });

  try {
    const resp = await fetch(url);
    const html = await resp.text();

    // Extract titles
    const titles = [...html.matchAll(/<h2[^>]*>(.*?)<\/h2>/g)].map(m => m[1]);

    return Response.json({ url, titles });
  } catch (err) {
    // err is typed unknown in TypeScript; narrow before reading .message
    const message = err instanceof Error ? err.message : String(err);
    return Response.json({ error: message }, { status: 500 });
  }
});

Deploy:

deployctl deploy --project=my-scraper main.ts

Deno Deploy gives you global edge distribution, free egress, and negligible cold starts. For lightweight scraping APIs, it is competitive with Cloudflare Workers.

For serverless comparison, see running scrapers on Cloudflare Workers in 2026.

Production patterns

A production Deno scraper layout:

my-scraper/
├── src/
│   ├── main.ts
│   ├── fetch.ts
│   ├── parse.ts
│   └── store.ts
├── deno.json          # config + dependencies + tasks
├── deno.lock          # lockfile
└── Dockerfile

deno.json example:

{
  "tasks": {
    "dev": "deno run --watch --allow-net --allow-read --allow-write src/main.ts",
    "start": "deno run --allow-net --allow-read --allow-write src/main.ts",
    "test": "deno test --allow-net=test.example.com src/",
    "fmt": "deno fmt",
    "lint": "deno lint"
  },
  "imports": {
    "@b-fuze/deno-dom": "jsr:@b-fuze/deno-dom@^0.1.45",
    "cheerio": "npm:cheerio@^1.0.0",
    "p-queue": "npm:p-queue@^8.0.1"
  }
}

Run tasks via deno task dev, deno task test, etc.

Container deployment:

FROM denoland/deno:2.0

WORKDIR /app
COPY deno.json deno.lock ./
COPY src ./src
RUN deno cache src/main.ts

USER deno
EXPOSE 8000
CMD ["run", "--allow-net", "--allow-read", "src/main.ts"]

Pre-cache dependencies at build time so runtime is fast.

Long-running scraping

For continuous scrapers:

// src/main.ts
let shutdown = false;

Deno.addSignalListener("SIGINT", () => { shutdown = true; });
Deno.addSignalListener("SIGTERM", () => { shutdown = true; });

// getNextURL() pulls the next URL from your queue; scrapeOne() fetches and parses it.
async function main() {
  while (!shutdown) {
    const url = await getNextURL();
    if (!url) {
      await new Promise((r) => setTimeout(r, 5000));
      continue;
    }
    try {
      await scrapeOne(url);
    } catch (err) {
      console.error(`Error on ${url}:`, err);
    }
  }
  console.log("Shutting down");
}

await main();

Deno’s signal handling matches Node’s pattern, just with the Deno.addSignalListener API.

Concurrency: Workers and parallel scraping

Deno supports Web Workers natively:

// src/main.ts
const worker = new Worker(new URL("./scrape-worker.ts", import.meta.url).href, {
  type: "module",
  deno: {
    permissions: { net: ["target.example.com"] },
  },
});

worker.onmessage = (e) => console.log("Worker result:", e.data);
worker.postMessage({ url: "https://target.example.com/page1" });

Workers can have their own permission set, separate from the main script. This is unique to Deno among the JS runtimes.
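
The worker side is plain Web Worker code. A minimal sketch for the scrape-worker.ts referenced above, assuming it only fetches and reports a result back:

/// <reference lib="deno.worker" />
// src/scrape-worker.ts — runs with only the permissions granted at creation.
self.onmessage = async (e: MessageEvent<{ url: string }>) => {
  const resp = await fetch(e.data.url); // rejects if the host was not granted
  const html = await resp.text();
  // Swap in real parsing here; post back a trivial result for the demo.
  self.postMessage({ url: e.data.url, bytes: html.length });
};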

For multi-process parallelism (CPU-bound), spawn multiple Deno processes:

const procs = await Promise.all(
  Array.from({ length: 4 }, (_, i) =>
    new Deno.Command("deno", {
      args: ["run", "--allow-net", "src/scrape-worker.ts", String(i)],
    }).output()
  )
);
// Each entry exposes .code, .stdout, and .stderr (Uint8Array) for inspection.

Common pitfalls

  • npm: imports require explicit version: pin in deno.json or use exact version in import
  • CORS in Deno Deploy: edge functions enforce CORS; configure response headers
  • Permission errors at runtime: scripts crash if you forget to grant a permission. Test with --allow-all then narrow down.
  • node:fs is partial: not every Node fs method works in Deno’s compat shim (see the sketch after this list)
  • Process management: Deno spawns processes via Deno.Command, not Node’s child_process (though npm compat exposes it)
  • Bun-specific code does not run on Deno: Bun.write, bun:sqlite need rewrites
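
For the node:fs item above, the safer pattern is to prefer Deno’s native file APIs in your own code. Both calls below read the same (hypothetical) config file, but the second avoids the compat shim entirely:

// Via the Node compat shim — works for the common promise-based calls:
import { readFile } from "node:fs/promises";
const viaNode = await readFile("./config/settings.json", "utf8");

// Deno-native equivalent, no shim involved:
const viaDeno = await Deno.readTextFile("./config/settings.json");
console.log(viaNode === viaDeno); // true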

Operational checklist

For production Deno scrapers in 2026:

  • Deno 2.0+ on Linux for production
  • denoland/deno:2.0 base image for containers
  • JSR for Deno-native packages, npm: for the rest
  • Pre-cache dependencies at build time
  • Use granular permissions in production
  • Use Deno Deploy for edge serverless scraping
  • Consider Astral for Deno-native browser automation
  • Crawlee works for crawler frameworks
  • For TLS-sensitive targets, use Astral or curl-impersonate via subprocess
  • Bench against Bun if speed matters; Deno is usually mid-pack

When to choose Deno over Bun

The cases where Deno wins despite being slower:

  • You need permission-sandboxed code (multi-tenant, plugin architecture, untrusted modules)
  • You want a single runtime for scraping AND deployment to Deno Deploy
  • TypeScript-first ergonomics matter and Bun’s TS support has edge cases
  • JSR’s better TypeScript inference is meaningful for your codebase
  • You want maximum Web Standards conformance

For pure speed, Bun. For permission control or Deno Deploy fit, Deno.

FAQ

Q: how complete is npm compatibility in Deno 2 in 2026?
Very high. Most npm packages work via the npm: specifier, or need at most a minor adjustment. Native binding modules (sharp, sqlite3) are the most common holdouts. Pure JavaScript packages almost always work.

Q: should I rewrite my Node scrapers in Deno?
Only if you specifically value Deno’s permission model or want to deploy to Deno Deploy. For pure speed gain, switch to Bun instead. For better TypeScript ergonomics with Node compat, switch to TypeScript with tsx if you have not already.

Q: how does Deno Deploy compare to Cloudflare Workers?
Both are edge serverless with free egress. Workers have larger ecosystem (KV, R2, D1, Durable Objects, Browser Rendering API). Deno Deploy is leaner but integrates with Deno KV. For complex distributed scraping, Workers. For lightweight Deno-native APIs, Deno Deploy.

Q: what about Deno’s built-in KV store?
Deno KV is a built-in key-value store available locally and on Deno Deploy. For scraper state (visited URLs, simple results), it works well. Less feature-rich than Cloudflare KV but native to Deno.

Q: is Deno faster than Node for scraping?
Modestly, yes, for typical I/O patterns; for compute-heavy work the two are comparable. Bun is faster than both, so the usual ordering is Bun > Deno > Node, with gaps on the order of 20-50% depending on the workload.

Common pitfalls in production Deno scraping

The first failure mode is permission scope drift in Workers. When you spawn a Web Worker with deno: { permissions: { net: ["target.example.com"] } }, the worker can only fetch from that domain. If your scraper later needs to fetch from a CDN (target.cdn-cgi.com), the fetch rejects with a permission error at runtime, and in aggregated logs that is easy to misread as a network failure rather than a permissions issue. The fix is to pre-compute the set of all hostnames the worker might touch (including subdomains, CDNs, and analytics endpoints) and grant them all at worker creation, or use the blanket net: true in development and tighten in production:

const worker = new Worker(workerUrl, {
  type: "module",
  deno: {
    permissions: {
      // Deno net permissions match exact hosts, so list every hostname
      // the worker may touch, including each subdomain and CDN.
      net: [
        "target.example.com",
        "www.target.example.com",
        "cdn.target.com",
        "fonts.googleapis.com",  // common transitive
      ],
    },
  },
});

The second pitfall is the npm compat shim’s quirky behavior with packages that read package.json at runtime. Some npm packages (like axios’s adapter selection logic) introspect their own package.json to detect the runtime environment. Under Deno’s npm compat layer, the detection returns “Node” but the actual runtime is Deno, leading to subtle bugs where the package picks the wrong code path. The mitigation is to test each npm package end-to-end on Deno before relying on it in production, and to prefer JSR-native packages where possible. Common gotchas include: axios (use Deno’s fetch instead), winston (some transports do file ops that conflict with Deno’s permission model), and puppeteer (works but heavy; use Astral instead).
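
That end-to-end testing is cheap with Deno’s built-in test runner. A minimal per-dependency smoke-test sketch using @std/assert:

// npm_compat_test.ts — run with: deno test npm_compat_test.ts
import { assertEquals } from "jsr:@std/assert";
import * as cheerio from "npm:cheerio@1.0.0";

Deno.test("cheerio works under the npm compat layer", () => {
  const $ = cheerio.load("<h2 class='t'>hello</h2>");
  assertEquals($("h2.t").text(), "hello");
});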

The third pitfall is Deno KV consistency under high-concurrency writes. Deno KV uses optimistic concurrency control with versioned reads. If you have 50 workers all trying to update the same dedupe set with kv.set(["visited", url], true), most writes succeed but a fraction get versionstamp conflicts that you have to retry. The fix is atomic transactions with explicit conflict handling:

// Reuse one KV handle; opening it per call adds needless latency.
const kv = await Deno.openKv();

async function markVisited(url: string): Promise<boolean> {
  for (let attempt = 0; attempt < 3; attempt++) {
    const existing = await kv.get(["visited", url]);
    if (existing.value !== null) return false;  // already visited
    const result = await kv.atomic()
      .check({ key: ["visited", url], versionstamp: existing.versionstamp })
      .set(["visited", url], { ts: Date.now() })
      .commit();
    if (result.ok) return true;  // we won the race
    // Another worker changed the key between read and commit; retry.
  }
  return false;  // gave up after retries
}

Without the retry loop, ~5 percent of writes silently fail under 50-worker concurrency, leading to duplicate processing. With the retry loop, the duplicate rate drops to under 0.1 percent.

Real-world example: Deno Deploy edge scraper for 200 sites

A team built a price-comparison API on Deno Deploy that fetched live prices from 200 ecommerce sites. Each API request triggered fetches to 5-10 sites in parallel, parsed the HTML for current prices, and returned a normalized JSON response. The architecture used Deno KV for caching, Deno Deploy for global distribution, and JSR-native libraries for parsing:

// main.ts
import { DOMParser } from "@b-fuze/deno-dom";

const kv = await Deno.openKv();

async function fetchPrice(url: string): Promise<number | null> {
  // Check 5-min cache first
  const cached = await kv.get<{price: number, ts: number}>(["price", url]);
  if (cached.value && Date.now() - cached.value.ts < 5 * 60 * 1000) {
    return cached.value.price;
  }

  try {
    const resp = await fetch(url, {
      headers: {
        "user-agent": "Mozilla/5.0 (compatible; PriceComparator/1.0)",
        "accept": "text/html",
      },
      signal: AbortSignal.timeout(8000),
    });
    if (!resp.ok) return null;
    const html = await resp.text();
    const doc = new DOMParser().parseFromString(html, "text/html");
    const priceEl = doc?.querySelector('[itemprop="price"]') ||
                    doc?.querySelector('.price') ||
                    doc?.querySelector('[data-price]');
    const price = parseFloat(
      priceEl?.getAttribute("content") || priceEl?.textContent || ""
    );
    if (isNaN(price)) return null;
    await kv.set(["price", url], { price, ts: Date.now() }, { expireIn: 600_000 });
    return price;
  } catch {
    return null;
  }
}

Deno.serve(async (req) => {
  const url = new URL(req.url);
  const targets = url.searchParams.getAll("url");
  const prices = await Promise.all(targets.map(fetchPrice));
  return new Response(
    JSON.stringify(targets.map((u, i) => ({ url: u, price: prices[i] }))),
    { headers: { "content-type": "application/json" } },
  );
});

Performance: median response 240ms (5 parallel fetches with cache hits common), p95 980ms, p99 2.1s. Monthly Deno Deploy bill at 4 million API calls: $32 (well within the included tier). The same workload on AWS Lambda + DynamoDB would have run roughly $180/month, dominated by Lambda invocation cost and DynamoDB read/write capacity.

The lesson: for read-heavy edge scraping with simple parsing and a cacheable response, Deno Deploy is meaningfully cheaper than AWS-style serverless. The native Deno KV beats Lambda+DynamoDB on both latency and cost for this workload pattern.

Comparison: JSR vs npm imports for scraping libraries

A reference table of which libraries scrapers reach for and where they live in 2026:

| library | available on | recommendation |
| --- | --- | --- |
| @b-fuze/deno-dom | JSR | use for HTML parsing, Deno-native |
| cheerio | npm only | works via npm: import, slightly slower |
| axiod (axios for Deno) | JSR | discouraged, use built-in fetch |
| @astral/astral | JSR | use for browser automation, Deno-native |
| puppeteer | npm | works but heavy; prefer Astral |
| @hono/hono | JSR | excellent for APIs |
| crawlee | npm | works, Node compat is solid |
| zod | JSR | runtime validation, used heavily in scrapers |
| @std/cli | JSR | Deno standard library, CLI argument parsing |
| postgres | JSR + npm | both work, JSR version more current |

Prefer JSR for Deno-native libraries because they get type inference without DefinitelyTyped overhead and are pre-tested against Deno releases. Fall back to npm: imports for ecosystem libraries that have not migrated to JSR yet.

Wrapping up

Deno in 2026 is a credible JavaScript runtime for scraping with a unique permission model that matters in multi-tenant or plugin architectures. The library ecosystem is sufficient: deno-dom for parsing, Astral for browser automation, Crawlee via npm compat for crawler frameworks. For most teams the choice is between Deno’s safety and Deno Deploy fit versus Bun’s raw speed. Pair this with our scraping with Bun runtime and running scrapers on Cloudflare Workers writeups for the full JavaScript-runtime picture, and browse the dev-tools-projects category on DRT for related infrastructure deep-dives.
