HTTP 429 Too Many Requests is the most common wall scrapers hit, and most teams handle it wrong the first time. They catch the status code, sleep for a flat 5 seconds, retry, and wonder why they keep getting banned. The real fix is a layered backoff strategy that respects rate limit signals, randomizes timing, and pairs with proxy rotation so you are not hammering the same IP repeatedly. This guide covers what actually works in 2026.
## Why Flat Sleeps Do Not Work
A flat retry sleep has two failure modes. First, if your rate is already too high, sleeping 5 seconds and resuming at the same rate just delays the next 429. Second, deterministic sleep patterns are easy for bot-detection systems to fingerprint. Akamai Bot Manager and Cloudflare’s bot score both flag traffic that resumes at predictable intervals after 429s.
The root issue is that 429 handling is not just about slowing down. It is about communicating to the target server that you are a responsible client. The pillar guide on 429 rate limiting covers the full error taxonomy, but for scraping specifically, the key insight is that you need to respect `Retry-After` headers when present, and fall back to exponential backoff with jitter when they are absent.
## Exponential Backoff with Full Jitter
Exponential backoff means doubling your wait time on each successive failure. Full jitter adds a random fraction so that concurrent workers do not synchronize and slam the server at the same moment (known as the thundering herd problem).
Here is a minimal Python implementation:
```python
import time
import random

import requests


def backoff_sleep(attempt: int, base: float = 1.0, cap: float = 120.0) -> None:
    # Exponential backoff with full jitter: sleep a random amount
    # in [0, min(cap, base * 2**attempt)).
    sleep = min(cap, base * (2 ** attempt))
    time.sleep(random.uniform(0, sleep))


def fetch_with_retry(url: str, session: requests.Session, max_attempts: int = 6) -> requests.Response:
    for attempt in range(max_attempts):
        resp = session.get(url)
        if resp.status_code == 429:
            retry_after = resp.headers.get("Retry-After")
            if retry_after and retry_after.isdigit():
                # Server-provided delay in seconds, plus jitter so workers
                # do not retry in lockstep. (Retry-After can also be an
                # HTTP-date; that form falls through to backoff below.)
                time.sleep(int(retry_after) + random.uniform(0.5, 2.0))
            else:
                backoff_sleep(attempt)
            continue
        resp.raise_for_status()
        return resp
    raise RuntimeError(f"Failed after {max_attempts} attempts: {url}")
```

Key details: the `Retry-After` check comes first because it is always more accurate than your own estimate. Adding 0.5 to 2 seconds of jitter on top of the server-provided delay prevents synchronized retries across your worker pool.
## Proxy Rotation Strategy
Backoff alone will not save you on high-volume jobs. If you are rotating through the same 5 proxies with a 30-second backoff, the server still sees 5 IPs hammering it. Effective proxy rotation means:
- Using a pool large enough that each IP is used infrequently relative to the target's per-IP rate limit
- Retiring IPs that receive a 429 for at least the duration of the `Retry-After` window
- Preferring residential or mobile IPs for consumer-facing targets (e-commerce, travel, social)
For B2B data collection at scale, proxy-integrated tools handle this IP retirement automatically and are often worth the cost over managing your own pool. For tightly rate-limited targets like ticket platforms, where per-IP limits are enforced aggressively, the live ticket price monitoring guide has specific proxy recommendations.
### Backoff + Proxy Pairing
The correct model is: on 429, retire the current IP and apply backoff before reassigning a new IP to that task. If you retire the IP but immediately reassign a fresh one at full speed, you are just burning through your pool without reducing pressure on the target.
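A sketch of that sequencing, assuming a hypothetical pool object with `get()` and `retire(proxy, seconds)` methods (substitute whatever your pool manager actually exposes):

```python
import random
import time

import requests


def fetch_with_rotation(url: str, session: requests.Session, pool, max_attempts: int = 6):
    for attempt in range(max_attempts):
        proxy = pool.get()  # hypothetical pool: get() returns a proxy URL
        resp = session.get(url, proxies={"http": proxy, "https": proxy})
        if resp.status_code == 429:
            # Retire this IP for at least the server-advised window.
            retry_after = resp.headers.get("Retry-After")
            penalty = int(retry_after) if retry_after and retry_after.isdigit() else 60
            pool.retire(proxy, seconds=penalty)  # hypothetical pool method
            # Back off BEFORE drawing a fresh IP, so the task's request
            # rate actually drops instead of just cycling the pool.
            time.sleep(min(120, 2 ** attempt) * random.uniform(0.5, 1.5))
            continue
        resp.raise_for_status()
        return resp
    raise RuntimeError(f"exhausted {max_attempts} attempts for {url}")
```

The ordering is the whole point: the sleep happens between retiring the old IP and requesting a new one, so each 429 slows the task down as well as rotating it.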
## Concurrency Limiting and Token Bucket Rate Control
Most scraping frameworks let you set a global concurrency limit. That is not the same as rate limiting. You can have 10 concurrent workers, each firing 1 request per second, for a total of 10 RPS. If the target allows 5 RPS across your proxy pool, you will 429 constantly regardless of backoff.
A token bucket controls the actual request rate. Each request consumes a token, tokens replenish at a fixed rate, and requests that cannot get a token wait. Libraries like `ratelimiter` (Python) or `bottleneck` (Node.js) implement this in a few lines.
| Concurrency Model | Controls Parallelism | Controls Request Rate | Correct for 429 Prevention |
|---|---|---|---|
| `asyncio.Semaphore` | yes | no | partial |
| Token bucket (`ratelimiter`) | no | yes | yes |
| Both combined | yes | yes | best |
| Flat sleep between requests | no | loosely | weak |
The combination is the right default. Semaphore prevents unbounded coroutine spawning. Token bucket enforces the actual throughput ceiling you have measured for the target.
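A minimal sketch of the pairing, with a hand-rolled bucket rather than a library (the 5 RPS and 10-worker figures mirror the example above; the fetch itself is stubbed out):

```python
import asyncio
import time


class TokenBucket:
    """Grants one token per acquire(); tokens refill at `rate` per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.updated = time.monotonic()
        self.lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self.lock:  # serialize token grants across workers
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens < 1:
                await asyncio.sleep((1 - self.tokens) / self.rate)  # wait for refill
                self.tokens = 1
                self.updated = time.monotonic()
            self.tokens -= 1


async def worker(url: str, bucket: TokenBucket, sem: asyncio.Semaphore) -> None:
    async with sem:             # semaphore caps how many tasks run at once
        await bucket.acquire()  # token bucket caps actual requests per second
        print(f"would fetch {url}")  # stand-in for the real request


async def main() -> None:
    bucket = TokenBucket(rate=5, capacity=5)  # measured ceiling: 5 RPS
    sem = asyncio.Semaphore(10)               # at most 10 tasks in flight
    urls = [f"https://example.com/page/{i}" for i in range(50)]
    await asyncio.gather(*(worker(u, bucket, sem) for u in urls))


asyncio.run(main())
```

With the semaphore alone, all 50 tasks would fire as fast as the event loop allows; the bucket is what holds throughput at the measured ceiling.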
## Reading the Target: Adaptive Rate Detection
Some targets publish rate limits in response headers. Others do not. For targets that do, look for:
- `X-RateLimit-Limit`: total requests allowed in the window
- `X-RateLimit-Remaining`: how many are left
- `X-RateLimit-Reset`: Unix timestamp when the window resets
When `X-RateLimit-Remaining` drops below 10% of the limit, slow down preemptively rather than waiting for the 429. This keeps your scraper in the “good client” zone that bot detection systems treat less aggressively.
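A sketch of that preemptive check (header names vary between sites, so treat these as the common-case spelling). Call it after every response and sleep the returned amount; it stays at zero until the quota runs low:

```python
import time


def preemptive_delay(headers) -> float:
    """Extra seconds to wait before the next request, based on rate limit headers."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
        reset = int(headers["X-RateLimit-Reset"])  # Unix timestamp of window reset
    except (KeyError, ValueError):
        return 0.0  # headers absent or nonstandard: rely on 429-driven backoff
    if remaining < 0.1 * limit:
        window_left = max(0.0, reset - time.time())
        # Spread whatever quota is left evenly over the rest of the window.
        return window_left / max(remaining, 1)
    return 0.0
```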
For targets without these headers, the signal is 429 frequency itself. Track your 429 rate over a rolling 60-second window. If it exceeds 5%, halve your request rate. If it drops to zero for 120 seconds, increase by 20%. This converges on the effective limit without hardcoding it.
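A sketch of that controller: feed it every response's status code and read `rps` into your token bucket. The thresholds are the ones above; the `min` rate floor is an added guard:

```python
import time
from collections import deque


class AdaptiveRate:
    """Rolling-window 429 tracker: halve the rate above 5% 429s, +20% after 120s clean."""

    def __init__(self, start_rps: float = 5.0):
        self.rps = start_rps
        self.window = deque()  # (timestamp, was_429) pairs from the last 60 seconds
        self.last_429 = time.monotonic()

    def record(self, status_code: int) -> None:
        now = time.monotonic()
        self.window.append((now, status_code == 429))
        while self.window and self.window[0][0] < now - 60:  # drop stale entries
            self.window.popleft()
        if status_code == 429:
            self.last_429 = now
        ratio = sum(hit for _, hit in self.window) / len(self.window)
        if ratio > 0.05:
            self.rps = max(0.1, self.rps / 2)  # back off hard
            self.window.clear()                # measure the new rate from scratch
        elif now - self.last_429 > 120:
            self.rps *= 1.2                    # probe upward by 20%
            self.last_429 = now                # one increase per clean 120s stretch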
Selector and page-structure changes are a related signal. If you are scraping structured search output like Google Shopping and the response structure shifts, that is often a soft block arriving before a hard 429. The Google Shopping scraping guide using the `sh-dgr__content` selector shows what stable selector anchors look like on a target that rate limits heavily, and for review scraping on consumer platforms, the Airbnb review scraping guide using `data-review-id` covers how session management interacts with rate limit windows.
## Retry Budgets and Failure Accounting
One thing teams skip: bounding total retries across the entire job, not just per request. If your job has 10,000 URLs and you allow 6 retries each, you could issue 60,000 retry requests on top of the initial attempts before the job fails. Set a job-level retry budget; a sketch of one follows the checklist below.
- Estimate expected total requests: URL count times one plus the expected retries per URL, given your observed 429 rate
- Set a hard cap: if total 429s exceed 15% of total attempts, abort and alert
- Log every 429 with timestamp, IP, URL pattern, and response headers
- Use that log to tune per-domain rate limits for future runs
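A minimal budget tracker for the 15% cap. The `min_attempts` floor is an added assumption so a small early sample cannot trip the abort:

```python
class RetryBudget:
    """Abort the whole job once 429s exceed `max_429_ratio` of all attempts."""

    def __init__(self, max_429_ratio: float = 0.15, min_attempts: int = 100):
        self.attempts = 0
        self.too_many = 0
        self.max_429_ratio = max_429_ratio
        self.min_attempts = min_attempts  # don't abort on a tiny sample

    def record(self, status_code: int) -> None:
        self.attempts += 1
        if status_code == 429:
            self.too_many += 1
        if (self.attempts >= self.min_attempts
                and self.too_many / self.attempts > self.max_429_ratio):
            raise RuntimeError(
                f"retry budget blown: {self.too_many}/{self.attempts} responses were 429"
            )
```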
This logging discipline also tells you which targets are getting harder over time. A target that needed a 2% retry budget six months ago and now needs 12% is tightening its defenses. You adjust strategy before it becomes a blocker.
## Bottom Line
Use exponential backoff with full jitter, respect Retry-After headers when present, pair IP retirement with backoff on every 429, and add a token bucket to control actual request rate rather than just concurrency. If you are running volume jobs and managing your own proxy pool is slowing you down, most of the better scraping platforms now handle 429-aware rotation natively. DRT covers that tooling landscape regularly as the space evolves.
## Related guides on dataresearchtools.com
- Tools That Integrate Proxies for B2B Data Collection at Scale (2026)
- Best Tools to Track Ticket Prices in 2026: Live Monitoring Setup
- Scraping Google Shopping with sh-dgr__content Selector (2026 Guide)
- Scraping Airbnb Reviews with data-review-id Selector (2026 Guide)
- Pillar: 429 Too Many Requests: Rate Limiting Fix Guide