Cost benchmark 2026: AI scraping per 10,000 pages

AI scraping cost in 2026 is the question every team asks when evaluating a move from traditional Playwright pipelines to LLM-driven approaches. Engineering managers want a number. CFOs want a number. The honest answer is that the number depends on your target, your model, your proxy mix, and your engineers’ prompting skill, but the right benchmarks let you pin it within a tight range. That is what this guide gives you.

We ran the same scraping task (extract product title, price, currency, stock from a real ecommerce listing) on 10,000 pages across 5 different AI scraping approaches and 3 traditional baselines. Numbers below are from production runs in March and April 2026, billed by the actual platforms.

What we tested

Target: a mix of 10,000 product pages from Lazada Singapore, Shopee Singapore, and Amazon US. Roughly equal split. Real URLs, real bot defenses, real ecommerce HTML.

Schema: title (string), price (number), currency (3-letter code), in_stock (boolean). All fields required.

Proxies: rotating residential pool with around 50,000 IPs. Same pool for every test.

Compute: each setup ran on its native infrastructure (self-hosted Playwright on a Fargate task, Browserbase via their cloud, Scrapybara via theirs).

Success criterion: returned record passes Pydantic validation against the schema.
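The success check can be sketched in a few lines. The benchmark itself used Pydantic; the version below is a standard-library stand-in that enforces the same four required fields, and the names `ProductRecord` and `validate_record` are illustrative rather than taken from the benchmark code.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ProductRecord:
    title: str
    price: float
    currency: str  # 3-letter code, e.g. "SGD"
    in_stock: bool


def validate_record(raw: dict[str, Any]) -> Optional[ProductRecord]:
    """Return a ProductRecord if all four fields are present and well-typed, else None."""
    title = raw.get("title")
    price = raw.get("price")
    currency = raw.get("currency")
    in_stock = raw.get("in_stock")

    if not isinstance(title, str) or not title.strip():
        return None
    # bool is a subclass of int, so reject it explicitly for the price field.
    if isinstance(price, bool) or not isinstance(price, (int, float)):
        return None
    if not isinstance(currency, str) or len(currency) != 3 or not currency.isalpha():
        return None
    if not isinstance(in_stock, bool):
        return None
    return ProductRecord(title.strip(), float(price), currency.upper(), in_stock)
```

A page counts as a success only when `validate_record` returns a record; a missing or mistyped field counts as a failure.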

Headline numbers

Approach | Total cost per 10,000 pages | Successful extractions | Cost per success
Self-hosted Playwright with hand-tuned selectors | $25 | 9,640 | $0.0026
Self-hosted Playwright with GPT-4o-mini extraction | $90 | 9,810 | $0.0092
Stagehand with GPT-4o-mini | $310 | 9,650 | $0.032
browser-use with GPT-4o-mini | $390 | 9,580 | $0.041
Browserbase + Stagehand with GPT-4o-mini | $410 | 9,720 | $0.042
Browserbase + Stagehand with GPT-4o | $1,950 | 9,830 | $0.198
browser-use with Claude Sonnet 4.5 | $510 | 9,640 | $0.053
Anthropic Computer Use with self-hosted browser | $2,100 | 9,810 | $0.214
OpenAI Operator API with self-hosted browser | $2,750 | 9,720 | $0.283

The cheapest approach (hand-tuned Playwright) costs about 100x less than the most expensive (Operator API). Both produce useful data. The decision is about engineering trade-offs, not pure cost.
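The cost-per-success column is total cost divided by successful extractions. Here is a minimal sketch that reproduces it from the headline figures; the dictionary keys are shortened labels chosen for this example.

```python
# (total cost in USD per 10,000 pages, successful extractions), from the headline table
HEADLINE = {
    "playwright_hand_tuned": (25, 9640),
    "playwright_gpt4o_mini_extraction": (90, 9810),
    "stagehand_gpt4o_mini": (310, 9650),
    "browser_use_gpt4o_mini": (390, 9580),
    "browserbase_stagehand_gpt4o_mini": (410, 9720),
    "browserbase_stagehand_gpt4o": (1950, 9830),
    "browser_use_claude_sonnet_45": (510, 9640),
    "anthropic_computer_use": (2100, 9810),
    "openai_operator_api": (2750, 9720),
}


def cost_per_success(total_usd: float, successes: int) -> float:
    """Total spend, including failed attempts, divided by validated records."""
    if successes <= 0:
        raise ValueError("no successful extractions to divide by")
    return total_usd / successes


per_success = {name: cost_per_success(*row) for name, row in HEADLINE.items()}
cheapest = min(per_success, key=per_success.get)
priciest = max(per_success, key=per_success.get)
spread = per_success[priciest] / per_success[cheapest]
```

Dividing by successes rather than by pages attempted matters most for setups with lower success rates, because failed attempts still cost money.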

Cost breakdown components

Every AI scraping cost has four parts:

  • Compute: the headless browser runtime
  • Proxy: residential or mobile IP traffic
  • LLM tokens: the agent loop and extraction
  • Engineering time: not counted in the table above but real

For 10,000 pages on a typical AI scraping setup:

Component | Cost
Compute (Browserbase or self-hosted) | $40-$80
Proxy (residential, ~5MB per page) | $50-$200
LLM tokens (GPT-4o-mini for extraction) | $30-$60
LLM tokens (full agent loop) | $200-$400

Proxies, not LLM tokens, are the largest line item for many setups, especially extraction-only pipelines where tokens run $30-$60 per 10,000 pages. Optimize your proxy mix first.

Methodology notes

The benchmark ran each setup over 4 hours with the same 10,000 URL list. Failures were retried once with the same setup; failures after retry were counted as failures and the cost of both attempts is included in the total.
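The retry accounting described above can be sketched as follows. `scrape` is a hypothetical callable standing in for whichever setup is under test; it returns whether the record validated and what the attempt cost.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Attempt:
    ok: bool
    cost_usd: float


@dataclass
class RunTotals:
    pages: int = 0
    successes: int = 0
    total_cost_usd: float = 0.0


def run_with_one_retry(urls: Iterable[str], scrape: Callable[[str], Attempt]) -> RunTotals:
    """Retry each failed page once; the cost of every attempt counts toward the total."""
    totals = RunTotals()
    for url in urls:
        totals.pages += 1
        first = scrape(url)
        totals.total_cost_usd += first.cost_usd
        if first.ok:
            totals.successes += 1
            continue
        second = scrape(url)  # single retry with the same setup
        totals.total_cost_usd += second.cost_usd
        if second.ok:
            totals.successes += 1
    return totals
```

Because failed attempts are billed, a setup with many retries shows a higher total even when its final success count looks similar.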

Each setup ran with default settings of the framework, then a “tuned” pass after spending 30 minutes optimizing prompts, schemas, and timing. The numbers reported are from the tuned pass. Untuned numbers were 30 to 60 percent higher across the board, which is itself a useful data point: out-of-the-box AI scraping is more expensive than necessary.

Currency: all costs in USD as of April 2026. Vendor pricing changes; re-run your own benchmarks before committing to a multi-quarter setup.

Cost-quality trade-off

Cheap setups extract correctly most of the time. Expensive setups extract correctly almost always. The question is whether the last 1-2 percent is worth 10x the cost.

For low-stakes data (price monitoring, casual research), use the cheap setup and accept the error rate. For regulated, decision-critical data (financial information, healthcare), spend the money for the higher-quality extraction.

In our 10,000-page run, the failure modes by approach were:

Approach | Common failure modes
Hand-tuned Playwright | Site layout changed, selector broke
Playwright + LLM extraction | Page rendered late, extraction got placeholder
Stagehand | Agent picked wrong product (related items page)
browser-use | Agent hallucinated price on broken page
Operator/Computer Use | Cost spike on hard pages, occasional retries hit budget

Per-page cost trajectory

Cost per page as you scale from 100 to 10 million pages on each setup:

Approach | 100 pages | 10K pages | 1M pages | 10M pages
Hand-tuned Playwright | $0.025 | $0.0025 | $0.0021 | $0.0019
Playwright + LLM extraction | $0.020 | $0.009 | $0.008 | $0.007
Stagehand | $0.040 | $0.031 | $0.029 | $0.028
browser-use | $0.045 | $0.039 | $0.037 | $0.036
Operator/Computer Use | $0.30 | $0.275 | $0.265 | $0.260

The cost curve flattens for AI approaches because the LLM cost dominates and does not benefit much from scale. Hand-tuned Playwright benefits most from scale because the engineering cost amortizes.

The crossover point: at around 1 million pages per month, hand-tuned Playwright with a small LLM extraction layer beats pure AI agent approaches on unit cost. Below that, AI agents save more in engineering time than they cost in tokens.

Real-world cost per workflow

Five common workflows and their realistic cost in 2026:

Workflow | Pages per month | Recommended setup | Monthly cost
Casual price monitoring (5 sites) | 5,000 | Stagehand + GPT-4o-mini | $200
Competitor catalog tracking | 50,000 | browser-use + GPT-4o-mini + mobile proxy | $2,000
Lead enrichment from web | 100,000 | Playwright + LLM extraction | $1,200
Cross-marketplace ecommerce monitoring | 1,000,000 | Hand-tuned Playwright with LLM fallback | $4,000
News and content aggregation | 10,000,000 | Hand-tuned Playwright | $20,000

The smaller the volume, the better AI agents look. Above 1 million pages per month, AI agents start to look expensive on unit economics.

Cost spread by target site

Same setup (Stagehand + GPT-4o-mini), different sites in our benchmark:

Site | Avg cost per page | Notes
Hacker News | $0.018 | Stable, cheap, no JS rendering needed
Lazada SG | $0.034 | Heavy SPA, mobile proxy required
Shopee SG | $0.038 | Stronger bot defense than Lazada
Amazon US | $0.029 | Big DOM but stable
eBay | $0.026 | Mostly static HTML
Booking.com | $0.052 | Multi-step navigation
LinkedIn job posts | $0.045 | Login-gated, careful pacing
Walmart | $0.031 | Routine ecommerce shape

The 3x range across sites is normal. Plan budgets per-site, not per-pipeline.

Cost reduction patterns

Three patterns cut AI scraping cost without hurting quality.

Cache extraction by content hash. If you have seen the page before, reuse the extraction. For sites that change rarely, this can cut LLM cost by 50-80 percent on follow-up runs.
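Here is a minimal in-memory sketch of content-hash caching. The class name `ExtractionCache` and the `extract` callable are assumptions for illustration; a production version would likely back this with Redis or a database.

```python
import hashlib
from typing import Callable


class ExtractionCache:
    """Reuse a previous extraction when the page content is byte-for-byte identical."""

    def __init__(self) -> None:
        self._store: dict[str, dict] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(content: str) -> str:
        # The key depends on content only: no timestamp, no URL query noise.
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def get_or_extract(self, content: str, extract: Callable[[str], dict]) -> dict:
        key = self.key_for(content)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = extract(content)  # the only place LLM tokens are spent
        self._store[key] = result
        return result
```

Hashing the trimmed content rather than the raw HTML usually raises the hit rate, since raw pages often embed session tokens or render timestamps that change on every fetch.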

Two-tier model selection. Try GPT-4o-mini first, fall back to GPT-4o on validation failure. About 90 percent of pages succeed on the cheap path; the fallback handles the hard ones.
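The fallback logic is small enough to sketch in full. `cheap`, `strong`, and `is_valid` are hypothetical callables: in practice the first two would wrap calls to the mini and full-size models, and the third would be your schema validation.

```python
from typing import Callable, Optional

Extractor = Callable[[str], Optional[dict]]


def extract_two_tier(
    html: str,
    cheap: Extractor,
    strong: Extractor,
    is_valid: Callable[[dict], bool],
) -> tuple[Optional[dict], str]:
    """Try the cheap model first; pay for the strong one only on validation failure."""
    record = cheap(html)
    if record is not None and is_valid(record):
        return record, "cheap"
    record = strong(html)
    if record is not None and is_valid(record):
        return record, "strong"
    return None, "failed"
```

Returning the tier alongside the record lets you log how often the fallback fires, which is the number that determines whether the two-tier setup is actually saving money.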

Trim HTML before extraction. An 800KB page becomes 30KB after stripping scripts, styles, and navigation. Tokens drop proportionally. See our LLM extraction patterns guide for details.
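As a rough illustration of trimming, the sketch below uses only the standard library and reduces a page to its visible text outside `script`, `style`, and `nav` elements. Dropping all markup is an assumption made to keep the example short; a real pipeline may prefer to keep some structure, such as headings or table cells.

```python
from html.parser import HTMLParser

SKIP_TAGS = {"script", "style", "nav"}


class _VisibleText(HTMLParser):
    """Collect text nodes that are not inside a skipped element."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.parts.append(text)


def trim_html(html: str) -> str:
    """Return the page's visible text, one text node per line."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

Measuring `len(html)` against `len(trim_html(html))` on a sample of pages gives a quick estimate of the token savings before you change anything in production.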

Proxy cost considerations

Proxy traffic is often the biggest single line item. Three considerations:

Proxy type | Cost per GB | Use case
Datacenter | $0.10 – $1.00 | Friendly sites, no bot defense
Residential | $4 – $12 | Standard ecommerce, social
Mobile carrier | $15 – $35 | Hardest defenses, banking, ASEAN ecommerce

For ASEAN scraping with mobile IPs that pass strict carrier-level checks, Singapore mobile proxy is in the $15-$25 per GB range and dominates Singtel/StarHub-protected sites.

For US/EU scraping, Bright Data, Oxylabs, and Smartproxy are the typical residential picks. See our best residential proxy providers 2026 review for current ranking.

Engineering cost (the hidden line item)

Engineering hours per scraper, by setup:

Approach | Initial build | Maintenance per month
Hand-tuned Playwright | 4-8 hours per site | 1-2 hours per site
Playwright + LLM extraction | 2-4 hours per site | 0.5 hours per site
Stagehand | 30-60 min per site | <0.25 hours per site
browser-use | 30-60 min per site | <0.25 hours per site
Operator/Computer Use | 30 min per workflow | minimal

At a $100/hour fully-loaded engineering cost, hand-tuned Playwright maintenance for 10 sites runs $1,000-$2,000 per month in engineering time. AI agents can pay for themselves on this line alone.

Hourly engineering cost in detail

We tracked engineer time over the four-hour benchmark window:

Setup | Engineer minutes spent | Engineer cost @ $100/hr
Hand-tuned Playwright | 92 | $153
Playwright + LLM extraction | 51 | $85
Stagehand | 28 | $47
browser-use | 26 | $43
Browserbase + Stagehand | 31 | $52
Operator/Computer Use | 35 | $58

Hand-tuned Playwright wins on per-page cost but loses on engineer cost. For workloads with multiple new sites per quarter, the engineer time savings on AI agents pay for the LLM bill many times over.

Long-tail target cost variance

The 10,000-page benchmark used a balanced mix. Real production workloads have long tails: 5 percent of pages are 10x harder than the median.

Across the benchmark, the per-page cost distribution looked like:

Percentile | GPT-4o-mini cost | GPT-4o cost
p50 | $0.027 | $0.18
p90 | $0.045 | $0.31
p99 | $0.110 | $0.78
max | $0.34 | $2.10

The p99 cost is roughly 4x the median. For budget planning, treat p99 as the worst-case unit cost, and set the total budget from your expected page count plus a 30 percent safety margin.
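One way to turn sampled per-page costs into these figures is sketched below. The reading of the budgeting rule is an assumption: the planned budget is mean cost times expected pages plus the margin, and the worst case prices every page at p99.

```python
import statistics


def budget_from_samples(costs_usd, expected_pages, margin=0.30):
    """Summarize sampled per-page costs and derive two budget figures."""
    if len(costs_usd) < 2:
        raise ValueError("need at least two cost samples")
    cuts = statistics.quantiles(costs_usd, n=100)  # 99 cut points: p1..p99
    p50, p90, p99 = cuts[49], cuts[89], cuts[98]
    mean_cost = statistics.fmean(costs_usd)
    return {
        "p50": p50,
        "p90": p90,
        "p99": p99,
        # Mean cost scaled to volume, plus the safety margin.
        "planned_budget": mean_cost * expected_pages * (1 + margin),
        # Every page priced at the p99 unit cost.
        "worst_case_budget": p99 * expected_pages,
    }
```

With only a few hundred samples the p99 estimate is noisy, so it is worth recomputing as more pages are logged.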

Comparison to alternatives

For workflows where AI scraping is overkill, the right answer might be a managed scraping API.

Service | Cost per 10K pages | Best fit
ScraperAPI | $50-$150 | Standard ecommerce
ZenRows | $40-$120 | JS-heavy with Cloudflare
ScrapingBee | $50-$140 | General use
Bright Data Web Scraper API | $80-$200 | Enterprise
Apify Actor Marketplace | $30-$100 | Pre-built scrapers

Managed APIs hit a sweet spot for teams that do not want to manage infrastructure but also do not need the agentic flexibility. See our best web scraping APIs 2026 for the full ranking.

Decision matrix

Pick your stack based on volume and target shape:

Volume | Target shape | Recommended stack
<10K pages/month | Any | Stagehand + Browserbase or browser-use
10K-100K | Stable | Playwright + LLM extraction
10K-100K | Changing often | Stagehand + LLM extraction
100K-1M | Stable | Self-hosted Playwright + LLM extraction
100K-1M | Changing often | Hybrid: Playwright fast path, browser-use fallback
>1M | Stable | Hand-tuned Playwright
>1M | Changing often | Hybrid + dedicated scraping engineer

ROI analysis: when AI scraping pays back

The right way to evaluate any AI scraping setup is total cost of ownership, not unit cost. A worked example for a 5-engineer scraping team:

Setup | Annual unit cost | Annual engineer cost | Total
Hand-tuned Playwright (10 sites) | $25,000 | $120,000 (1 FTE) | $145,000
AI agents on 10 sites | $75,000 | $30,000 (0.25 FTE) | $105,000

The AI setup costs more in unit terms but frees three quarters of an engineer’s time. If that engineer is doing other valuable work, the AI setup is a $40,000 annual saving.

Where this calculus breaks: if the engineer would just be sitting idle without the scraper to maintain, the unit cost dominates and Playwright wins. In practice, scraping engineers always have more work than time, so AI agents pay back.
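The arithmetic behind the worked example is easy to reproduce with your own figures. The $120,000 fully-loaded FTE cost is the one implied by the example; the function name is illustrative.

```python
FTE_COST_USD = 120_000  # implied by the worked example: 1 FTE = $120,000 per year


def total_cost_of_ownership(annual_unit_cost, fte_fraction, fte_cost=FTE_COST_USD):
    """Annual scraping spend plus the share of an engineer the setup consumes."""
    return annual_unit_cost + fte_fraction * fte_cost


playwright = total_cost_of_ownership(25_000, 1.0)   # 145,000
ai_agents = total_cost_of_ownership(75_000, 0.25)   # 105,000
saving = playwright - ai_agents                     # 40,000
```

Setting `fte_cost` to zero models the idle-engineer case described above: the comparison collapses to unit cost alone, and Playwright wins.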

Hidden cost categories

A few costs that benchmark tables typically miss:

Logging and storage. AI scraping produces detailed traces; persisting them for 90 days runs $50 to $200 per million records depending on storage tier.

Observability vendor cost. LangSmith, Arize, Honeycomb, Datadog all charge per span. AI scraping is span-heavy. Budget $100 to $400 per month for a small production deployment.

LLM rate-limit overage. A scraper that hits tier-2 rate limits during a backfill might need to upgrade tier or accept slower throughput. Tier-3 access requires sustained usage, which itself is a cost.

Compliance review. Some legal teams require additional review on AI-driven extraction. Budget engineering and legal review hours on first deployment.

Replatforming. Most teams switch frameworks at least once in the first two years as the AI scraping space evolves. Budget for a half-quarter migration window.

Production observability

Whatever stack you pick, log cost per page in structured form. Fields to capture: timestamp, source URL, model used, input tokens, output tokens, proxy GB, browser session minutes, validation result.
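A structured log entry with those fields might look like the sketch below. The field names and the JSON-lines format are assumptions; adapt them to whatever your logging pipeline expects.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class PageCostLog:
    timestamp: str  # ISO 8601, UTC
    source_url: str
    model: str
    input_tokens: int
    output_tokens: int
    proxy_gb: float
    browser_session_minutes: float
    validation_ok: bool

    def to_json_line(self) -> str:
        """Serialize to a single JSON line with stable key order."""
        return json.dumps(asdict(self), sort_keys=True)


def now_utc_iso() -> str:
    return datetime.now(timezone.utc).isoformat()


entry = PageCostLog(
    timestamp=now_utc_iso(),
    source_url="https://example.com/product/123",  # placeholder URL
    model="gpt-4o-mini",
    input_tokens=7400,
    output_tokens=120,
    proxy_gb=0.005,
    browser_session_minutes=0.4,
    validation_ok=True,
)
line = entry.to_json_line()
```

Logging raw usage rather than a precomputed dollar figure means historical records stay valid when vendor pricing changes.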

This data lets you spot cost regressions and target sites that are unexpectedly expensive. Most teams discover one or two outlier sites that consume 10x the median cost; once flagged, they can be optimized or moved to a different scraping path.
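Flagging outlier sites from per-page cost data can be sketched as follows. The comparison of each site's mean against the median across all pages, and the default multiple of 10, are assumptions drawn from the 10x figure above.

```python
from collections import defaultdict
from statistics import fmean, median


def flag_outlier_sites(page_costs, multiple=10.0):
    """page_costs: iterable of (site, cost_usd) pairs.

    Returns {site: mean_cost} for sites whose mean per-page cost is at least
    `multiple` times the median per-page cost across all pages.
    """
    page_costs = list(page_costs)
    if not page_costs:
        return {}
    overall_median = median(cost for _, cost in page_costs)
    by_site = defaultdict(list)
    for site, cost in page_costs:
        by_site[site].append(cost)
    flagged = {}
    for site, costs in by_site.items():
        site_mean = fmean(costs)
        if site_mean >= multiple * overall_median:
            flagged[site] = site_mean
    return flagged
```

Running this weekly over the cost log is usually enough; a lower `multiple` such as 3 catches sites that are drifting upward before they become true outliers.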

Cost across model vendors

Same Stagehand setup, different LLM models, same 10,000 pages:

Model | Cost per 10K | Accuracy | p99 latency
GPT-4o-mini | $310 | 96.5% | 4.4 s
GPT-4o | $1,950 | 98.4% | 6.1 s
Claude Haiku 3.5 | $370 | 95.8% | 3.9 s
Claude Sonnet 4.5 | $2,180 | 98.7% | 7.2 s
Gemini 1.5 Flash | $185 | 95.0% | 3.1 s
Gemini 1.5 Pro | $1,520 | 97.4% | 5.4 s
Llama 3.3 70B (self-host on H100) | $90 | 92.3% | 2.8 s

Headline: Gemini Flash is the cheapest of the top-tier closed-source options. Llama 3.3 self-hosted is even cheaper, but its accuracy is roughly 4 points lower than GPT-4o-mini's.

For most production workloads in 2026, the value pick is GPT-4o-mini. The cost-conscious pick is Gemini Flash. The privacy-conscious pick is self-hosted Llama or Qwen.

Cost over 12 months

A worked projection for a hypothetical 100k-pages-per-month workload:

Setup | Year 1 cost | Year 2 cost (with optimization)
Hand-tuned Playwright | $14,400 | $14,400
Playwright + LLM | $14,400 | $11,500
Stagehand | $42,000 | $33,000
browser-use | $50,000 | $40,000

Optimization typically cuts AI agent cost by 20 to 30 percent in year two as caching, prompt tuning, and HTML trimming mature. Hand-tuned Playwright cost stays flat because engineer time dominates.

Frequently asked questions

Why is my actual cost higher than these benchmarks?
Three common causes: agent loops on confused pages (set max_iterations), oversized HTML sent to extraction (trim before extracting), or expensive model used by default (downgrade to mini variants).

Do these benchmarks include retries and failures?
Yes. The cost numbers include the cost of failed runs. The number of successful extractions per 10K is listed in the "Successful extractions" column of the headline table.

What about Gemini-based scraping?
Gemini 1.5 Flash is the cheapest production-quality model in 2026. Substituting Flash for GPT-4o-mini in any of the AI agent setups cuts LLM cost by another 30-50 percent.

How do I forecast cost for a new scraping target?
Run 100 pages, measure cost, multiply by your expected volume. Add 30 percent buffer for retries and harder pages. Re-measure monthly.

Are open-source models a real option for cost control?
Yes for extraction (Llama 3.3 70B, Qwen 2.5 72B work well). Mostly no for full agent loops (current open-source models still trail GPT-4o and Claude Sonnet on tool use reliability).

How do I budget for unexpected target site changes?
Add a 30 percent contingency to your annual cost projection. Sites change formats, bot defenses get harder, and new targets land in scope. Without contingency, a single big site overhaul can wipe out a quarter’s headroom.

Is per-page cost really the right metric?
For commodity scraping, yes. For high-stakes data, cost-per-correct-record is more useful. A 99 percent accurate extraction at $0.04 beats a 96 percent accurate one at $0.01 if errors trigger downstream review at $5 each.
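The comparison in that answer works out as follows, assuming every incorrect record triggers exactly one downstream review. The function name is illustrative.

```python
def cost_per_page_with_review(unit_cost: float, accuracy: float, review_cost: float) -> float:
    """Expected cost per page once the downstream cost of errors is included."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return unit_cost + (1.0 - accuracy) * review_cost


accurate = cost_per_page_with_review(0.04, 0.99, 5.0)  # 0.04 + 0.01 * 5, about $0.09
cheap = cost_per_page_with_review(0.01, 0.96, 5.0)     # 0.01 + 0.04 * 5, about $0.21
```

At a $5 review cost the more accurate setup is less than half the effective price, even though its unit cost is four times higher. The break-even review cost in this example is $1: below that, the cheap setup wins.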

Can I buy AI scraping as a managed service instead of building?
Yes. Apify’s Smart Crawler, Bright Data’s Web Scraper API, and several startups offer managed AI scraping. Per-page cost is roughly 2 to 3x DIY because the vendor adds margin. The trade-off is zero engineer time on infrastructure.

Common cost gotchas

A handful of patterns that drain AI scraping budgets faster than expected.

The agent loops on a confused page, burning 50,000 tokens before timing out. Cap iterations and abort hard.
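A hard cap can be enforced in the loop that drives the agent. In this sketch `step` is a hypothetical callable that runs one agent iteration and reports whether the task is done and how many tokens it used; the default limits are placeholders, not recommendations from the benchmark.

```python
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when an agent run hits its iteration or token cap."""


def run_agent(
    step: Callable[[], tuple[bool, int]],
    max_iterations: int = 15,
    max_tokens: int = 20_000,
) -> tuple[int, int]:
    """Drive the agent until done; abort hard on either cap.

    Returns (iterations_used, tokens_used) on success.
    """
    tokens = 0
    for iteration in range(1, max_iterations + 1):
        done, used = step()
        tokens += used
        if done:
            return iteration, tokens
        if tokens >= max_tokens:
            raise BudgetExceeded(f"token cap hit after {iteration} steps ({tokens} tokens)")
    raise BudgetExceeded(f"iteration cap hit ({max_iterations} steps, {tokens} tokens)")
```

Catching `BudgetExceeded` at the call site lets you record the page as a failure and move on instead of letting one confused page consume the budget for hundreds of normal ones.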

A spike in a target’s bot defenses doubles the proxy cost overnight. Track per-target proxy cost and alert on changes greater than 30 percent week-over-week.
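The week-over-week check is a few lines once per-target proxy spend is tracked. The dictionaries of weekly totals are an assumed input shape, and this sketch flags changes in either direction.

```python
def proxy_cost_alerts(last_week: dict, this_week: dict, threshold: float = 0.30) -> dict:
    """Return {target: fractional_change} where proxy spend moved by more than the threshold.

    Targets with no spend last week are skipped because there is no baseline.
    """
    alerts = {}
    for target, current in this_week.items():
        previous = last_week.get(target)
        if not previous:
            continue
        change = (current - previous) / previous
        if abs(change) > threshold:
            alerts[target] = change
    return alerts
```

For example, a target whose weekly proxy spend goes from $100 to $210 is flagged with a change of 1.1, while one moving from $80 to $90 stays under the threshold.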

Verbose logging captures the full screenshot in JSON. If that payload is fed back into the agent's context, it is billed as input tokens on every subsequent step. Keep logs brief.

Scheduling all scrapes at midnight means hitting peak provider load. Spread across the hour.
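One simple way to spread jobs is to give each a stable offset within the hour derived from its identifier. The hashing approach below is an assumption chosen for the example; random jitter works too but makes runs harder to reproduce.

```python
import hashlib


def offset_within_hour(job_id: str) -> int:
    """Map a job identifier to a stable offset in seconds, from 0 to 3599."""
    digest = hashlib.sha256(job_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 3600
```

A job scheduled for midnight then starts at midnight plus `offset_within_hour(job_id)` seconds, and it starts at the same offset every day.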

Cache keys that include the timestamp instead of the content never match on a later run, so every expected “cache hit” is actually a cache miss. Derive keys from content only.

For broader patterns on the AI scraping stack, browse the AI modern scraping category.
