Cost benchmark 2026: AI scraping per 10,000 pages

AI scraping cost in 2026 is the question teams ask most often when they evaluate a move from traditional Playwright pipelines to LLM-driven approaches. Engineering managers want a number. CFOs want a number. The honest answer is that the number depends on your target, your model, your proxy mix, and your engineers’ prompt skill, but the right benchmarks let you pin it within a tight range. That is what this guide gives you.

We ran the same scraping task (extract product title, price, currency, and stock status from a real ecommerce listing) on 10,000 pages across the nine setups in the headline table below, ranging from hand-tuned Playwright to full agent loops. The numbers come from production runs in March and April 2026 and reflect what the platforms actually billed.

What we tested

Target: a mix of 10,000 product pages from Lazada Singapore, Shopee Singapore, and Amazon US. Roughly equal split. Real URLs, real bot defenses, real ecommerce HTML.

Schema: title (string), price (number), currency (3-letter code), in_stock (boolean). All fields required.

Proxies: rotating residential pool with around 50,000 IPs. Same pool for every test.

Compute: each setup ran on its native infrastructure (self-hosted Playwright on a Fargate task, Browserbase via their cloud, Scrapybara via theirs).

Success criterion: returned record passes Pydantic validation against the schema.
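
A minimal sketch of what that success check can look like. It is a dependency-free stand-in for a Pydantic model, not the benchmark's actual code; `ProductRecord` and `is_success` are hypothetical names, and the non-empty title and uppercase currency rules are assumptions layered on top of the schema above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductRecord:
    """Stand-in for the benchmark schema; all four fields are required."""
    title: str
    price: float
    currency: str
    in_stock: bool

    def __post_init__(self) -> None:
        # Assumption: an empty title counts as a failed extraction.
        if not isinstance(self.title, str) or not self.title.strip():
            raise ValueError("title must be a non-empty string")
        # bool is a subclass of int, so reject it explicitly for price.
        if isinstance(self.price, bool) or not isinstance(self.price, (int, float)):
            raise ValueError("price must be a number")
        # Assumption: currency codes are three uppercase letters, e.g. SGD.
        if not (isinstance(self.currency, str) and len(self.currency) == 3
                and self.currency.isalpha() and self.currency.isupper()):
            raise ValueError("currency must be a 3-letter code")
        if not isinstance(self.in_stock, bool):
            raise ValueError("in_stock must be a boolean")


def is_success(raw: dict) -> bool:
    """Return True when a raw extraction passes validation."""
    try:
        ProductRecord(**raw)  # missing or extra keys raise TypeError
    except (TypeError, ValueError):
        return False
    return True
```

A record with a string price or a missing field fails here, which is the kind of output that would be counted as an unsuccessful extraction.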

Headline numbers

Approach	Total cost per 10,000 pages	Successful extractions	Cost per success
Self-hosted Playwright with hand-tuned selectors	$25	9,640	$0.0026
Self-hosted Playwright with GPT-4o-mini extraction	$90	9,810	$0.0092
Stagehand with GPT-4o-mini	$310	9,650	$0.032
browser-use with GPT-4o-mini	$390	9,580	$0.041
Browserbase + Stagehand with GPT-4o-mini	$410	9,720	$0.042
Browserbase + Stagehand with GPT-4o	$1,950	9,830	$0.198
browser-use with Claude Sonnet 4.5	$510	9,640	$0.053
Anthropic Computer Use with self-hosted browser	$2,100	9,810	$0.214
OpenAI Operator API with self-hosted browser	$2,750	9,720	$0.283

The cheapest approach (hand-tuned Playwright) costs about 100x less than the most expensive (Operator API). Both produce useful data. The decision is about engineering trade-offs, not pure cost.
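
The last column of the table is simply total cost divided by successful extractions. A short sketch of that arithmetic, using three rows from the table (the function name is ours, for illustration):

```python
def cost_per_success(total_cost_usd: float, successes: int) -> float:
    """Total spend divided by the number of records that passed validation."""
    if successes <= 0:
        raise ValueError("need at least one successful extraction")
    return total_cost_usd / successes


# (total cost in USD, successful extractions) copied from the headline table
runs = {
    "Self-hosted Playwright with hand-tuned selectors": (25, 9640),
    "Stagehand with GPT-4o-mini": (310, 9650),
    "OpenAI Operator API with self-hosted browser": (2750, 9720),
}

for name, (cost, successes) in runs.items():
    print(f"{name}: ${cost_per_success(cost, successes):.4f} per success")
```

Dividing by successes rather than by pages attempted means a setup with more failures is penalized twice: it pays for the failed attempts and has fewer records to spread the cost over.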

Cost breakdown components

Every AI scraping cost has four parts:

Compute: the headless browser runtime
Proxy: residential or mobile IP traffic
LLM tokens: the agent loop and extraction
Engineering time: not counted in the table above but real

For 10,000 pages on a typical AI scraping setup:

Component	Cost
Compute (Browserbase or self-hosted)	$40-$80
Proxy (residential, ~5MB per page)	$50-$200
LLM tokens (GPT-4o-mini for extraction)	$30-$60
LLM tokens (full agent loop)	$200-$400

For extraction-only setups, proxies are often the largest line item, not LLM tokens; only a full agent loop reliably costs more than the proxy traffic. Optimize your proxy mix first.

Methodology notes

The benchmark ran each setup over 4 hours with the same 10,000 URL list. Failures were retried once with the same setup; failures after retry were counted as failures and the cost of both attempts is included in the total.
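
The retry accounting can be sketched as follows. This is an illustration of the rule described above, not the benchmark harness itself; `Attempt` and `scrape_with_one_retry` are hypothetical names, and the attempt function is assumed to report its own cost.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: takes a URL, returns (record or None, cost in USD).
Attempt = Callable[[str], Tuple[Optional[dict], float]]


def scrape_with_one_retry(url: str, attempt: Attempt) -> Tuple[Optional[dict], float]:
    """Try once, retry once on failure; the cost of every attempt is summed."""
    total_cost = 0.0
    for _ in range(2):  # first attempt plus a single retry
        record, cost = attempt(url)
        total_cost += cost
        if record is not None:
            return record, total_cost
    # Both attempts failed: the page counts as a failure, but the spend still counts.
    return None, total_cost
```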

Each setup ran with default settings of the framework, then a “tuned” pass after spending 30 minutes optimizing prompts, schemas, and timing. The numbers reported are from the tuned pass. Untuned numbers were 30 to 60 percent higher across the board, which is itself a useful data point: out-of-the-box AI scraping is more expensive than necessary.

Currency: all costs in USD as of April 2026. Vendor pricing changes; re-run your own benchmarks before committing to a multi-quarter setup.

Cost-quality trade-off

Cheap setups extract correctly most of the time. Expensive setups extract correctly almost always. The question is whether the last 1-2 percent is worth 10x the cost.

For low-stakes data (price monitoring, casual research), use the cheap setup and accept the error rate. For regulated, decision-critical data (financial information, healthcare), spend the money for the higher-quality extraction.

In our 10,000-page run, the failure modes by approach were:

Approach	Common failure modes
Hand-tuned Playwright	Site layout changed, selector broke
Playwright + LLM extraction	Page rendered late, extraction got placeholder
Stagehand	Agent picked wrong product (related items page)
browser-use	Agent hallucinated price on broken page
Operator/Computer Use	Cost spike on hard pages, occasional retries hit budget

Per-page cost trajectory

Cost per page as you scale from 100 to 10 million pages on each setup:

Approach	100 pages	10K pages	1M pages	10M pages
Hand-tuned Playwright	$0.025	$0.0025	$0.0021	$0.0019
Playwright + LLM extraction	$0.020	$0.009	$0.008	$0.007
Stagehand	$0.040	$0.031	$0.029	$0.028
browser-use	$0.045	$0.039	$0.037	$0.036
Operator/Computer Use	$0.30	$0.275	$0.265	$0.260

The cost curve flattens for AI approaches because the LLM cost dominates and does not benefit much from scale. Hand-tuned Playwright benefits most from scale because the engineering cost amortizes.

The crossover point: at around 1 million pages per month, hand-tuned Playwright with a small LLM extraction layer beats pure AI agent approaches on total cost, because the per-page savings finally outweigh the extra engineering time. Below that, AI agents save more in engineering time than they cost in tokens.

Real-world cost per workflow

Five common workflows and their realistic cost in 2026:

Workflow	Pages per month	Recommended setup	Monthly cost
Casual price monitoring (5 sites)	5,000	Stagehand + GPT-4o-mini	$200
Competitor catalog tracking	50,000	browser-use + GPT-4o-mini + mobile proxy	$2,000
Lead enrichment from web	100,000	Playwright + LLM extraction	$1,200
Cross-marketplace ecommerce monitoring	1,000,000	Hand-tuned Playwright with LLM fallback	$4,000
News and content aggregation	10,000,000	Hand-tuned Playwright	$20,000

The smaller the volume, the better AI agents look. Above 1 million pages per month, AI agents start to look expensive on unit economics.

Cost spread by target site

Same setup (Stagehand + GPT-4o-mini), different sites, including several outside the three-marketplace benchmark mix:

Site	Avg cost per page	Notes
Hacker News	$0.018	Stable, cheap, no JS rendering needed
Lazada SG	$0.034	Heavy SPA, mobile proxy required
Shopee SG	$0.038	Stronger bot defense than Lazada
Amazon US	$0.029	Big DOM but stable
eBay	$0.026	Mostly static HTML
Booking.com	$0.052	Multi-step navigation
LinkedIn job posts	$0.045	Login-gated, careful pacing
Walmart	$0.031	Routine ecommerce shape

The 3x range across sites is normal. Plan budgets per-site, not per-pipeline.

Cost reduction patterns

Three patterns cut AI scraping cost without hurting quality.

Cache extraction by content hash. If you have seen the page before, reuse the extraction. For sites that change rarely, this can cut LLM cost by 50-80 percent on follow-up runs.
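
A minimal sketch of content-hash caching, assuming an in-memory store and a hypothetical `extract` callable that wraps the LLM call. A production version would more likely persist to Redis or a database.

```python
import hashlib
from typing import Callable, Dict


class ExtractionCache:
    """In-memory cache keyed by a hash of the page content."""

    def __init__(self, extract: Callable[[str], dict]) -> None:
        self._extract = extract  # hypothetical LLM extraction call
        self._store: Dict[str, dict] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(html: str) -> str:
        # The key depends only on content, never on fetch time.
        return hashlib.sha256(html.encode("utf-8")).hexdigest()

    def get(self, html: str) -> dict:
        key = self.key_for(html)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._extract(html)  # only pay for the LLM on a miss
        self._store[key] = result
        return result
```

Hashing the trimmed page rather than the raw HTML tends to raise the hit rate, since rotating ad or tracking markup would otherwise change the key on every fetch.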

Two-tier model selection. Try GPT-4o-mini first, fall back to GPT-4o on validation failure. About 90 percent of pages succeed on the cheap path; the fallback handles the hard ones.
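
The fallback logic is small enough to sketch directly. The `cheap` and `strong` callables are placeholders for whatever client code calls GPT-4o-mini and GPT-4o; nothing here is a real SDK signature.

```python
from typing import Callable, Optional, Tuple

Extractor = Callable[[str], Optional[dict]]  # hypothetical: HTML in, record or None out
Validator = Callable[[dict], bool]


def extract_two_tier(
    html: str, cheap: Extractor, strong: Extractor, is_valid: Validator
) -> Tuple[Optional[dict], str]:
    """Try the cheap model first; call the strong model only on validation failure."""
    record = cheap(html)
    if record is not None and is_valid(record):
        return record, "cheap"
    record = strong(html)
    if record is not None and is_valid(record):
        return record, "strong"
    return None, "failed"
```

Returning the tier alongside the record makes it easy to log what share of pages actually needs the expensive path.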

Trim HTML before extraction. An 800KB page becomes 30KB after stripping scripts, styles, and navigation. Token count drops roughly in proportion. See our LLM extraction patterns guide for details.
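
One possible way to do the trimming with only the standard library. This sketch goes further than stripping three tag types: it keeps visible text only and discards all markup and attributes, which is an assumption that may not suit sites that store prices in attributes.

```python
from html.parser import HTMLParser

SKIP_TAGS = {"script", "style", "nav"}


class _TextKeeper(HTMLParser):
    """Collects text that is not inside a script, style, or nav element."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def trim_html(html: str) -> str:
    """Return the page's visible text, one fragment per line."""
    parser = _TextKeeper()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

An unclosed nav element would suppress everything after it, so it is worth comparing trimmed length against raw length and falling back to the raw page when the ratio looks implausible.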

Proxy cost considerations

Proxy traffic is often the biggest single line item. Three considerations:

Proxy type	Cost per GB	Use case
Datacenter	$0.10 – $1.00	Friendly sites, no bot defense
Residential	$4 – $12	Standard ecommerce, social
Mobile carrier	$15 – $35	Hardest defenses, banking, ASEAN ecommerce

For ASEAN scraping that needs mobile IPs able to pass strict carrier-level checks, a Singapore mobile proxy runs $15-$25 per GB and is the most reliable option for sites that expect Singtel or StarHub traffic.

For US/EU scraping, Bright Data, Oxylabs, and Smartproxy are the typical residential picks. See our best residential proxy providers 2026 review for current ranking.

Engineering cost (the hidden line item)

Engineering hours per scraper, by setup:

Approach	Initial build	Maintenance per month
Hand-tuned Playwright	4-8 hours per site	1-2 hours per site
Playwright + LLM extraction	2-4 hours per site	0.5 hours per site
Stagehand	30-60 min per site	<0.25 hours per site
browser-use	30-60 min per site	<0.25 hours per site
Operator/Computer Use	30 min per workflow	minimal

At a $100/hour fully-loaded engineering cost, hand-tuned Playwright maintenance for 10 sites runs $1,000-$2,000 per month in engineering time. AI agents can pay for themselves on this line alone.
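
The arithmetic behind that range, as a short sketch. The function name is ours, and the 0.25 hours used for the agent case is the upper bound from the table, not a measured figure.

```python
def monthly_maintenance_cost(
    sites: int, hours_per_site: float, hourly_rate: float = 100.0
) -> float:
    """Engineering maintenance cost per month in USD."""
    return sites * hours_per_site * hourly_rate


playwright_low = monthly_maintenance_cost(10, 1)      # 1 hour per site
playwright_high = monthly_maintenance_cost(10, 2)     # 2 hours per site
agent_ceiling = monthly_maintenance_cost(10, 0.25)    # upper bound for Stagehand or browser-use

print(playwright_low, playwright_high, agent_ceiling)
```

On these figures the gap is $750 to $1,750 per month across 10 sites, which is the amount the extra LLM spend has to stay under for agents to come out ahead on maintenance alone.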

Hourly engineering cost in detail

We tracked engineer time over the four-hour benchmark window:

Setup	Engineer minutes spent	Engineer cost @ $100/hr
Hand-tuned Playwright	92	$153
Playwright + LLM extraction	51	$85
Stagehand	28	$47
browser-use	26	$43
Browserbase + Stagehand	31	$52
Operator/Computer Use	35	$58

Hand-tuned Playwright wins on per-page cost but loses on engineer cost. For workloads with multiple new sites per quarter, the engineer time savings on AI agents pay for the LLM bill many times over.

Long-tail target cost variance

The 10,000-page benchmark used a balanced mix. Real production workloads have long tails: 5 percent of pages are 10x harder than the median.

Across the benchmark, the per-page cost distribution looked like:

Percentile	GPT-4o-mini cost	GPT-4o cost
p50	$0.027	$0.18
p90	$0.045	$0.31
p99	$0.110	$0.78
max	$0.34	$2.10

The p99 cost is roughly 4x the median. For budget planning, treat p99 as the worst-case unit cost, and set the total budget from expected page count and typical unit cost plus a 30 percent safety margin.
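
A sketch of how such a distribution and budget can be derived from per-page cost logs. The nearest-rank percentile method and the reading of the budget rule (typical unit cost times page count, plus margin) are our assumptions; `percentile` and `budget` are illustrative names.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not values:
        raise ValueError("no cost samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def budget(pages: int, unit_cost: float, margin: float = 0.30) -> float:
    """Planned spend: pages times unit cost, plus a safety margin."""
    return pages * unit_cost * (1 + margin)
```

With the GPT-4o-mini column above, `budget(10_000, 0.027)` gives a planned spend of $351, while 10,000 pages at the p99 unit cost of $0.110 gives a worst-case ceiling of $1,100.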

Comparison to alternatives

For workflows where AI scraping is overkill, the right answer might be a managed scraping API.

Service	Cost per 10K pages	Best fit
ScraperAPI	$50-$150	Standard ecommerce
ZenRows	$40-$120	JS-heavy with Cloudflare
ScrapingBee	$50-$140	General use
Bright Data Web Scraper API	$80-$200	Enterprise
Apify Actor Marketplace	$30-$100	Pre-built scrapers

Managed APIs hit a sweet spot for teams that do not want to manage infrastructure but also do not need the agentic flexibility. See our best web scraping APIs 2026 for the full ranking.

Decision matrix

Pick your stack based on volume and target shape:

Volume	Target shape	Recommended stack
<10K pages/month	Any	Stagehand + Browserbase or browser-use
10K-100K	Stable	Playwright + LLM extraction
10K-100K	Changing often	Stagehand + LLM extraction
100K-1M	Stable	Self-hosted Playwright + LLM extraction
100K-1M	Changing often	Hybrid: Playwright fast path, browser-use fallback
>1M	Stable	Hand-tuned Playwright
>1M	Changing often	Hybrid + dedicated scraping engineer

ROI analysis: when AI scraping pays back

The right way to evaluate any AI scraping setup is total cost of ownership, not unit cost. A worked example for a 5-engineer scraping team:

Setup	Annual unit cost	Annual engineer cost	Total
Hand-tuned Playwright (10 sites)	$25,000	$120,000 (1 FTE)	$145,000
AI agents on 10 sites	$75,000	$30,000 (0.25 FTE)	$105,000

The AI setup costs more in unit terms but frees three quarters of an engineer’s time. If that engineer is doing other valuable work, the AI setup is a $40,000 annual saving.

Where this calculus breaks: if the engineer would just be sitting idle without the scraper to maintain, the unit cost dominates and Playwright wins. In practice, scraping engineers always have more work than time, so AI agents pay back.
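
The total-cost-of-ownership comparison from the table can be reproduced in a few lines. The `Setup` class is an illustrative name; the figures are the ones from the worked example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setup:
    name: str
    annual_unit_cost: int      # USD per year for compute, proxy, and tokens
    annual_engineer_cost: int  # USD per year of engineering time

    @property
    def total(self) -> int:
        return self.annual_unit_cost + self.annual_engineer_cost


playwright = Setup("Hand-tuned Playwright (10 sites)", 25_000, 120_000)
agents = Setup("AI agents on 10 sites", 75_000, 30_000)

saving = playwright.total - agents.total
print(f"{agents.name} saves ${saving:,} per year")
```

Setting `annual_engineer_cost` to the same value for both setups models the idle-engineer case: the comparison then reduces to unit cost, and Playwright wins by $50,000.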

Hidden cost categories

A few costs that benchmark tables typically miss:

Logging and storage. AI scraping produces detailed traces; persisting them for 90 days runs $50 to $200 per million records depending on storage tier.

Observability vendor cost. LangSmith, Arize, Honeycomb, Datadog all charge per span. AI scraping is span-heavy. Budget $100 to $400 per month for a small production deployment.

LLM rate-limit overage. A scraper that hits tier-2 rate limits during a backfill might need to upgrade tier or accept slower throughput. Tier-3 access requires sustained usage, which itself is a cost.

Compliance review. Some legal teams require additional review on AI-driven extraction. Budget engineering and legal review hours on first deployment.

Replatforming. Most teams switch frameworks at least once in the first two years as the AI scraping space evolves. Budget for a half-quarter migration window.

Production observability

Whatever stack you pick, log cost per page in structured form. Fields to capture: timestamp, source URL, model used, input tokens, output tokens, proxy GB, browser session minutes, validation result.

This data lets you spot cost regressions and target sites that are unexpectedly expensive. Most teams discover one or two outlier sites that consume 10x the median cost; once flagged, they can be optimized or moved to a different scraping path.
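
A sketch of a structured log record with the fields listed above, plus a simple outlier check. The `cost_usd` field, the field names, and the rule of comparing each site's median against the median across sites are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass
from statistics import median
from typing import Dict, List


@dataclass
class PageCostLog:
    timestamp: str
    source_url: str
    model: str
    input_tokens: int
    output_tokens: int
    proxy_gb: float
    browser_session_minutes: float
    validation_passed: bool
    cost_usd: float  # assumption: computed upstream from the fields above


def to_json_line(entry: PageCostLog) -> str:
    """Serialize one record as a single JSON line."""
    return json.dumps(asdict(entry), sort_keys=True)


def outlier_sites(costs_by_site: Dict[str, List[float]], factor: float = 10.0) -> List[str]:
    """Sites whose median per-page cost is at least `factor` times the cross-site median."""
    site_medians = {site: median(costs) for site, costs in costs_by_site.items() if costs}
    if not site_medians:
        return []
    overall = median(site_medians.values())
    return sorted(site for site, m in site_medians.items() if m >= factor * overall)
```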

Cost across model vendors

Same Stagehand setup, different LLM models, same 10,000 pages:

Model	Cost per 10K	Accuracy	p99 latency
GPT-4o-mini	$310	96.5%	4.4 s
GPT-4o	$1,950	98.4%	6.1 s
Claude Haiku 3.5	$370	95.8%	3.9 s
Claude Sonnet 4.5	$2,180	98.7%	7.2 s
Gemini 1.5 Flash	$185	95.0%	3.1 s
Gemini 1.5 Pro	$1,520	97.4%	5.4 s
Llama 3.3 70B (self-host on H100)	$90	92.3%	2.8 s

Headline: Gemini Flash is the cheapest closed-source option in this comparison. Self-hosted Llama 3.3 is cheaper still, but its accuracy is roughly 4 points below GPT-4o-mini.

For most production workloads in 2026, the value pick is GPT-4o-mini. The cost-conscious pick is Gemini Flash. The privacy-conscious pick is self-hosted Llama or Qwen.

Cost over 12 months

A worked projection for a hypothetical 100k-pages-per-month workload:

Setup	Year 1 cost	Year 2 cost (with optimization)
Hand-tuned Playwright	$14,400	$14,400
Playwright + LLM	$14,400	$11,500
Stagehand	$42,000	$33,000
browser-use	$50,000	$40,000

Optimization typically cuts AI agent cost by 20 to 30 percent in year two as caching, prompt tuning, and HTML trimming mature. Hand-tuned Playwright cost stays flat because engineer time dominates.

Frequently asked questions

Why is my actual cost higher than these benchmarks?
Three common causes: agent loops on confused pages (set max_iterations), oversized HTML sent to extraction (trim before extracting), or expensive model used by default (downgrade to mini variants).

Do these benchmarks include retries and failures?
Yes. The cost numbers include the cost of failed runs. Successful extractions per 10K pages are listed in the third column of the headline table.

What about Gemini-based scraping?
Gemini 1.5 Flash is the cheapest production-quality model in 2026. Substituting Flash for GPT-4o-mini in any of the AI agent setups cuts LLM cost by another 30-50 percent.

How do I forecast cost for a new scraping target?
Run 100 pages, measure cost, multiply by your expected volume. Add 30 percent buffer for retries and harder pages. Re-measure monthly.

Are open-source models a real option for cost control?
Yes for extraction (Llama 3.3 70B, Qwen 2.5 72B work well). Mostly no for full agent loops (current open-source models still trail GPT-4o and Claude Sonnet on tool use reliability).

How do I budget for unexpected target site changes?
Add a 30 percent contingency to your annual cost projection. Sites change formats, bot defenses get harder, and new targets land in scope. Without contingency, a single big site overhaul can wipe out a quarter’s headroom.

Is per-page cost really the right metric?
For commodity scraping, yes. For high-stakes data, cost-per-correct-record is more useful. A 99 percent accurate extraction at $0.04 beats a 96 percent accurate one at $0.01 if errors trigger downstream review at $5 each.
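
The comparison in that answer can be checked with a one-line formula: expected cost per page is the extraction cost plus the error rate times the review cost. A sketch, with an illustrative function name:

```python
def expected_cost_per_page(
    unit_cost: float, accuracy: float, review_cost_per_error: float
) -> float:
    """Extraction cost plus the expected downstream review cost per page."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return unit_cost + (1.0 - accuracy) * review_cost_per_error


premium = expected_cost_per_page(0.04, 0.99, 5.0)      # about $0.09
budget_tier = expected_cost_per_page(0.01, 0.96, 5.0)  # about $0.21

print(f"premium: ${premium:.2f}, budget: ${budget_tier:.2f}")
```

With these two options the break-even review cost is $1 per error; below that, the cheaper and less accurate extraction comes out ahead.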

Can I buy AI scraping as a managed service instead of building?
Yes. Apify’s Smart Crawler, Bright Data’s Web Scraper API, and several startups offer managed AI scraping. Per-page cost is roughly 2 to 3x DIY because the vendor adds margin. The trade-off is zero engineer time on infrastructure.

Common cost gotchas

A handful of patterns that drain AI scraping budgets faster than expected.

The agent loops on a confused page, burning 50,000 tokens before timing out. Cap iterations and abort hard.
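
A sketch of a hard cap on both iterations and tokens. The step function, the exception name, and the default limits are hypothetical; real agent frameworks expose their own settings for this.

```python
from typing import Callable, Optional, Tuple

# Hypothetical step function: returns (result or None, tokens used in this step).
Step = Callable[[], Tuple[Optional[dict], int]]


class BudgetExceeded(RuntimeError):
    """Raised when the agent hits its iteration or token cap."""


def run_agent(step: Step, max_iterations: int = 8, max_tokens: int = 20_000) -> Tuple[dict, int]:
    """Run the agent loop, aborting hard once either cap is reached."""
    tokens = 0
    for _ in range(max_iterations):
        result, used = step()
        tokens += used
        if result is not None:
            return result, tokens
        if tokens >= max_tokens:
            raise BudgetExceeded(f"token budget exhausted after {tokens} tokens")
    raise BudgetExceeded(f"no result after {max_iterations} iterations")
```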

A spike in a target’s bot defenses doubles the proxy cost overnight. Track per-target proxy cost and alert on changes greater than 30 percent week-over-week.
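
A sketch of the week-over-week check, assuming per-target proxy spend has already been aggregated into two dictionaries (target name to USD). The function name and input shape are illustrative.

```python
from typing import Dict


def proxy_cost_alerts(
    last_week: Dict[str, float], this_week: Dict[str, float], threshold: float = 0.30
) -> Dict[str, float]:
    """Return targets whose proxy spend changed by more than `threshold`."""
    alerts = {}
    for target, current in this_week.items():
        previous = last_week.get(target)
        if not previous:  # new target or zero baseline: no ratio to compute
            continue
        change = (current - previous) / previous
        if abs(change) > threshold:
            alerts[target] = change
    return alerts
```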

Verbose logging captures the full screenshot in JSON. If that payload is fed back into the model context, you pay for it as input tokens on every subsequent call. Keep logs brief.

Scheduling all scrapes at midnight means hitting peak provider load. Spread across the hour.
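
One simple way to spread jobs is to give each an evenly spaced start offset within the hour. A sketch, with an illustrative function name; random jitter would work as well.

```python
from typing import Dict, Sequence


def spread_across_hour(job_ids: Sequence[str], window_seconds: int = 3600) -> Dict[str, int]:
    """Assign each job a start offset in seconds past the hour, evenly spaced."""
    count = len(job_ids)
    return {job: (index * window_seconds) // count for index, job in enumerate(job_ids)}
```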

Cache keys that include the timestamp instead of the content hash. Every lookup is then a cache miss, even when the page has not changed.

For broader patterns on the AI scraping stack, browse the AI modern scraping category.