HTTP 429 Too Many Requests is the most common wall scrapers hit, and most teams handle it wrong the first time. They catch the status code, sleep for a flat 5 seconds, retry, and wonder why they keep getting banned. The real fix is a layered backoff strategy that respects rate limit signals, randomizes timing, and pairs with proxy rotation so you are not hammering the same IP repeatedly. This guide covers what actually works in 2026.
## Why Flat Sleeps Do Not Work
A flat retry sleep has two failure modes. First, if your rate is already too high, sleeping 5 seconds and resuming at the same rate just delays the next 429. Second, deterministic sleep patterns are easy for bot-detection systems to fingerprint. Akamai Bot Manager and Cloudflare’s bot score both flag traffic that resumes at predictable intervals after 429s.
The root issue is that 429 handling is not just about slowing down. It is about communicating to the target server that you are a responsible client. The pillar guide on 429 rate limiting covers the full error taxonomy, but for scraping specifically, the key insight is that you need to respect `Retry-After` headers when present, and fall back to exponential backoff with jitter when they are absent.
## Exponential Backoff with Full Jitter
Exponential backoff means doubling your wait time on each successive failure. Full jitter adds a random fraction so that concurrent workers do not synchronize and slam the server at the same moment (known as the thundering herd problem).
Here is a minimal Python implementation:
```python
import time
import random

import requests


def backoff_sleep(attempt: int, base: float = 1.0, cap: float = 120.0) -> None:
    # Exponential backoff with full jitter: sleep a random amount
    # in [0, min(cap, base * 2**attempt)).
    sleep = min(cap, base * (2 ** attempt))
    time.sleep(random.uniform(0, sleep))


def fetch_with_retry(url: str, session: requests.Session, max_attempts: int = 6) -> requests.Response:
    for attempt in range(max_attempts):
        resp = session.get(url)
        if resp.status_code == 429:
            retry_after = resp.headers.get("Retry-After")
            if retry_after and retry_after.isdigit():
                # Server-provided delay in seconds, plus jitter so workers
                # do not retry in lockstep. (Retry-After can also be an
                # HTTP-date; that form falls through to backoff below.)
                time.sleep(int(retry_after) + random.uniform(0.5, 2.0))
            else:
                backoff_sleep(attempt)
            continue
        resp.raise_for_status()
        return resp
    raise RuntimeError(f"Failed after {max_attempts} attempts: {url}")
```

Key details: the `Retry-After` check comes first because it is always more accurate than your own estimate. Adding 0.5 to 2 seconds of jitter on top of the server-provided delay prevents synchronized retries across your worker pool.
## Proxy Rotation Strategy
Backoff alone will not save you on high-volume jobs. If you are rotating through the same 5 proxies with a 30-second backoff, the server still sees 5 IPs hammering it. Effective proxy rotation means:
- Using a pool large enough that each IP is used infrequently relative to the target's per-IP rate limit
- Retiring IPs that receive a 429 for at least the duration of the `Retry-After` window
- Preferring residential or mobile IPs for consumer-facing targets (e-commerce, travel, social)
For B2B data collection at scale, proxy-integrated tools handle this IP retirement automatically and are often worth the cost over managing your own pool. For tightly rate-limited targets like ticket platforms, where per-IP limits are enforced aggressively, the live ticket price monitoring guide has specific proxy recommendations.
### Backoff + Proxy Pairing
The correct model is: on 429, retire the current IP and apply backoff before reassigning a new IP to that task. If you retire the IP but immediately reassign a fresh one at full speed, you are just burning through your pool without reducing pressure on the target.
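A sketch of that sequencing, assuming a hypothetical pool object with `get()` and `retire(proxy, seconds)` methods (substitute whatever your pool manager actually exposes):

```python
import random
import time

import requests


def fetch_with_rotation(url: str, session: requests.Session, pool, max_attempts: int = 6):
    for attempt in range(max_attempts):
        proxy = pool.get()  # hypothetical pool: get() returns a proxy URL
        resp = session.get(url, proxies={"http": proxy, "https": proxy})
        if resp.status_code == 429:
            # Retire this IP for at least the server-advised window.
            retry_after = resp.headers.get("Retry-After")
            penalty = int(retry_after) if retry_after and retry_after.isdigit() else 60
            pool.retire(proxy, seconds=penalty)  # hypothetical pool method
            # Back off BEFORE drawing a fresh IP, so the task's request
            # rate actually drops instead of just cycling the pool.
            time.sleep(min(120, 2 ** attempt) * random.uniform(0.5, 1.5))
            continue
        resp.raise_for_status()
        return resp
    raise RuntimeError(f"exhausted {max_attempts} attempts for {url}")
```

The ordering is the whole point: the sleep happens between retiring the old IP and requesting a new one, so each 429 slows the task down as well as rotating it.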
## Concurrency Limiting and Token Bucket Rate Control
Most scraping frameworks let you set a global concurrency limit. That is not the same as rate limiting. You can have 10 concurrent workers, each firing 1 request per second, for a total of 10 RPS. If the target allows 5 RPS across your proxy pool, you will 429 constantly regardless of backoff.
A token bucket controls the actual request rate. Each request consumes a token, tokens replenish at a fixed rate, and requests that cannot get a token wait. Libraries like `ratelimiter` (Python) or `bottleneck` (Node.js) implement this in a few lines.
| Concurrency Model | Controls Parallelism | Controls Request Rate | Correct for 429 Prevention |
|---|---|---|---|
| `asyncio.Semaphore` | yes | no | partial |
| Token bucket (`ratelimiter`) | no | yes | yes |
| Both combined | yes | yes | best |
| Flat sleep between requests | no | loosely | weak |
The combination is the right default. Semaphore prevents unbounded coroutine spawning. Token bucket enforces the actual throughput ceiling you have measured for the target.
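A minimal sketch of the pairing, with a hand-rolled bucket rather than a library (the 5 RPS and 10-worker figures mirror the example above; the fetch itself is stubbed out):

```python
import asyncio
import time


class TokenBucket:
    """Grants one token per acquire(); tokens refill at `rate` per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.updated = time.monotonic()
        self.lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self.lock:  # serialize token grants across workers
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens < 1:
                await asyncio.sleep((1 - self.tokens) / self.rate)  # wait for refill
                self.tokens = 1
                self.updated = time.monotonic()
            self.tokens -= 1


async def worker(url: str, bucket: TokenBucket, sem: asyncio.Semaphore) -> None:
    async with sem:             # semaphore caps how many tasks run at once
        await bucket.acquire()  # token bucket caps actual requests per second
        print(f"would fetch {url}")  # stand-in for the real request


async def main() -> None:
    bucket = TokenBucket(rate=5, capacity=5)  # measured ceiling: 5 RPS
    sem = asyncio.Semaphore(10)               # at most 10 tasks in flight
    urls = [f"https://example.com/page/{i}" for i in range(50)]
    await asyncio.gather(*(worker(u, bucket, sem) for u in urls))


asyncio.run(main())
```

With the semaphore alone, all 50 tasks would fire as fast as the event loop allows; the bucket is what holds throughput at the measured ceiling.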
## Reading the Target: Adaptive Rate Detection
Some targets publish rate limits in response headers. Others do not. For targets that do, look for:
- `X-RateLimit-Limit`: total requests allowed in the window
- `X-RateLimit-Remaining`: how many are left
- `X-RateLimit-Reset`: Unix timestamp when the window resets
When `X-RateLimit-Remaining` drops below 10% of the limit, slow down preemptively rather than waiting for the 429. This keeps your scraper in the “good client” zone that bot detection systems treat less aggressively.
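A sketch of that preemptive check (header names vary between sites, so treat these as the common-case spelling). Call it after every response and sleep the returned amount; it stays at zero until the quota runs low:

```python
import time


def preemptive_delay(headers) -> float:
    """Extra seconds to wait before the next request, based on rate limit headers."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
        reset = int(headers["X-RateLimit-Reset"])  # Unix timestamp of window reset
    except (KeyError, ValueError):
        return 0.0  # headers absent or nonstandard: rely on 429-driven backoff
    if remaining < 0.1 * limit:
        window_left = max(0.0, reset - time.time())
        # Spread whatever quota is left evenly over the rest of the window.
        return window_left / max(remaining, 1)
    return 0.0
```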
For targets without these headers, the signal is 429 frequency itself. Track your 429 rate over a rolling 60-second window. If it exceeds 5%, halve your request rate. If it drops to zero for 120 seconds, increase by 20%. This converges on the effective limit without hardcoding it.
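A sketch of that controller: feed it every response's status code and read `rps` into your token bucket. The thresholds are the ones above; the `min` rate floor is an added guard:

```python
import time
from collections import deque


class AdaptiveRate:
    """Rolling-window 429 tracker: halve the rate above 5% 429s, +20% after 120s clean."""

    def __init__(self, start_rps: float = 5.0):
        self.rps = start_rps
        self.window = deque()  # (timestamp, was_429) pairs from the last 60 seconds
        self.last_429 = time.monotonic()

    def record(self, status_code: int) -> None:
        now = time.monotonic()
        self.window.append((now, status_code == 429))
        while self.window and self.window[0][0] < now - 60:  # drop stale entries
            self.window.popleft()
        if status_code == 429:
            self.last_429 = now
        ratio = sum(hit for _, hit in self.window) / len(self.window)
        if ratio > 0.05:
            self.rps = max(0.1, self.rps / 2)  # back off hard
            self.window.clear()                # measure the new rate from scratch
        elif now - self.last_429 > 120:
            self.rps *= 1.2                    # probe upward by 20%
            self.last_429 = now                # one increase per clean 120s stretch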
Selector and page-structure changes are a related signal. If you are scraping structured search output like Google Shopping and the response structure shifts, that is often a soft block arriving before a hard 429. The Google Shopping scraping guide using the `sh-dgr__content` selector shows what stable selector anchors look like on a target that rate limits heavily, and for review scraping on consumer platforms, the Airbnb review scraping guide using `data-review-id` covers how session management interacts with rate limit windows.
## Retry Budgets and Failure Accounting
One thing teams skip: bounding total retries across the entire job, not just per request. If your job has 10,000 URLs and you allow 6 retries each, you could issue 60,000 retry requests on top of the initial attempts before the job fails. Set a job-level retry budget; a sketch of one follows the checklist below.
- Estimate expected total requests: URL count times one plus the expected retries per URL, given your observed 429 rate
- Set a hard cap: if total 429s exceed 15% of total attempts, abort and alert
- Log every 429 with timestamp, IP, URL pattern, and response headers
- Use that log to tune per-domain rate limits for future runs
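A minimal budget tracker for the 15% cap. The `min_attempts` floor is an added assumption so a small early sample cannot trip the abort:

```python
class RetryBudget:
    """Abort the whole job once 429s exceed `max_429_ratio` of all attempts."""

    def __init__(self, max_429_ratio: float = 0.15, min_attempts: int = 100):
        self.attempts = 0
        self.too_many = 0
        self.max_429_ratio = max_429_ratio
        self.min_attempts = min_attempts  # don't abort on a tiny sample

    def record(self, status_code: int) -> None:
        self.attempts += 1
        if status_code == 429:
            self.too_many += 1
        if (self.attempts >= self.min_attempts
                and self.too_many / self.attempts > self.max_429_ratio):
            raise RuntimeError(
                f"retry budget blown: {self.too_many}/{self.attempts} responses were 429"
            )
```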
This logging discipline also tells you which targets are getting harder over time. A target that needed a 2% retry budget six months ago and now needs 12% is tightening its defenses. You adjust strategy before it becomes a blocker.
## Bottom Line
Use exponential backoff with full jitter, respect Retry-After headers when present, pair IP retirement with backoff on every 429, and add a token bucket to control actual request rate rather than just concurrency. If you are running volume jobs and managing your own proxy pool is slowing you down, most of the better scraping platforms now handle 429-aware rotation natively. DRT covers that tooling landscape regularly as the space evolves.
## Related guides on dataresearchtools.com
- Tools That Integrate Proxies for B2B Data Collection at Scale (2026)
- Best Tools to Track Ticket Prices in 2026: Live Monitoring Setup
- Scraping Google Shopping with sh-dgr__content Selector (2026 Guide)
- Scraping Airbnb Reviews with data-review-id Selector (2026 Guide)
- Pillar: 429 Too Many Requests: Rate Limiting Fix Guide