Proxy Protocol Performance Benchmarks: Speed, Latency & Throughput Testing
Proxy performance varies dramatically based on protocol, proxy type, geographic location, and provider infrastructure. A 50ms difference per request compounds to hours when scraping millions of pages. This guide provides a rigorous benchmarking methodology, real-world test results, and Python tools to benchmark your own proxy infrastructure.
What to Measure
Proxy performance has several dimensions:
| Metric | What It Measures | Why It Matters |
|---|---|---|
| Connection latency | Time to establish proxy connection | Overhead per new connection |
| First byte latency | Time until first response byte | Total request delay |
| Throughput | Data transfer rate (MB/s) | Large page/file download speed |
| Concurrent capacity | Max parallel connections | Scraping throughput |
| Error rate | Failed request percentage | Reliability |
| DNS resolution | Time for proxy to resolve target | Hidden latency source |
Benchmarking Framework
import asyncio
import json
import statistics
import time
from dataclasses import dataclass, field

import httpx
@dataclass
class BenchmarkResult:
    proxy_type: str
    proxy_url: str
    target_url: str
    total_requests: int = 0
    successful_requests: int = 0
    failed_requests: int = 0
    latencies: list = field(default_factory=list)
    first_byte_times: list = field(default_factory=list)
    throughputs: list = field(default_factory=list)
    errors: list = field(default_factory=list)

    @property
    def success_rate(self):
        if self.total_requests == 0:
            return 0
        return self.successful_requests / self.total_requests * 100

    @property
    def avg_latency(self):
        return statistics.mean(self.latencies) if self.latencies else 0

    @property
    def p50_latency(self):
        return statistics.median(self.latencies) if self.latencies else 0

    @property
    def p95_latency(self):
        if not self.latencies:
            return 0
        sorted_lat = sorted(self.latencies)
        idx = int(len(sorted_lat) * 0.95)
        return sorted_lat[idx]

    @property
    def p99_latency(self):
        if not self.latencies:
            return 0
        sorted_lat = sorted(self.latencies)
        idx = int(len(sorted_lat) * 0.99)
        return sorted_lat[idx]

    def summary(self):
        return {
            "proxy_type": self.proxy_type,
            "total": self.total_requests,
            "success_rate": f"{self.success_rate:.1f}%",
            "avg_latency_ms": f"{self.avg_latency * 1000:.0f}",
            "p50_ms": f"{self.p50_latency * 1000:.0f}",
            "p95_ms": f"{self.p95_latency * 1000:.0f}",
            "p99_ms": f"{self.p99_latency * 1000:.0f}",
            # KiB per second (bytes / 1024), not kilobits
            "avg_throughput_kbs": f"{statistics.mean(self.throughputs) / 1024:.0f}"
            if self.throughputs else "0",
            "errors": len(self.errors),
        }
class ProxyBenchmark:
    """Comprehensive proxy performance benchmarking tool."""

    def __init__(self, target_url="https://httpbin.org/get"):
        self.target_url = target_url

    async def benchmark_proxy(
        self,
        proxy_url: str,
        proxy_type: str,
        num_requests: int = 100,
        concurrency: int = 10,
    ) -> BenchmarkResult:
        """Run benchmark against a single proxy."""
        result = BenchmarkResult(
            proxy_type=proxy_type,
            proxy_url=proxy_url,
            target_url=self.target_url,
        )
        semaphore = asyncio.Semaphore(concurrency)

        async def single_request():
            async with semaphore:
                try:
                    # A fresh client per request, so each sample includes
                    # connection setup. http2=True requires `pip install httpx[http2]`.
                    async with httpx.AsyncClient(
                        proxy=proxy_url,
                        timeout=30,
                        http2=True,
                    ) as client:
                        start = time.monotonic()
                        response = await client.get(self.target_url)
                        end = time.monotonic()

                        latency = end - start
                        content_length = len(response.content)
                        throughput = content_length / latency if latency > 0 else 0

                        result.total_requests += 1
                        result.successful_requests += 1
                        result.latencies.append(latency)
                        result.throughputs.append(throughput)
                except Exception as e:
                    result.total_requests += 1
                    result.failed_requests += 1
                    result.errors.append(str(e))

        tasks = [single_request() for _ in range(num_requests)]
        await asyncio.gather(*tasks)
        return result

    async def compare_proxies(self, proxies: dict, **kwargs):
        """Compare multiple proxy configurations."""
        results = {}
        for name, proxy_url in proxies.items():
            print(f"Benchmarking {name}...")
            result = await self.benchmark_proxy(
                proxy_url=proxy_url,
                proxy_type=name,
                **kwargs
            )
            results[name] = result
            print(f"  {json.dumps(result.summary(), indent=2)}")
        return results
# Usage
async def main():
    bench = ProxyBenchmark(target_url="https://httpbin.org/get")

    proxies = {
        "datacenter_http": "http://user:pass@dc-proxy.example.com:8080",
        "residential_http": "http://user:pass@res-proxy.example.com:8080",
        "mobile_4g": "http://user:pass@mobile-proxy.example.com:8080",
        "socks5": "socks5://user:pass@socks-proxy.example.com:1080",
    }

    results = await bench.compare_proxies(
        proxies,
        num_requests=200,
        concurrency=20,
    )

    # Print comparison table
    print("\n=== RESULTS ===")
    print(f"{'Type':<20} {'Avg(ms)':<10} {'P50(ms)':<10} "
          f"{'P95(ms)':<10} {'P99(ms)':<10} {'Success':<10}")
    print("-" * 70)
    for name, result in results.items():
        s = result.summary()
        print(f"{name:<20} {s['avg_latency_ms']:<10} {s['p50_ms']:<10} "
              f"{s['p95_ms']:<10} {s['p99_ms']:<10} {s['success_rate']:<10}")

asyncio.run(main())

Typical Benchmark Results
Based on testing from a US-East server to httpbin.org:
=== PROXY TYPE COMPARISON ===
Type                 Avg(ms)   P50(ms)   P95(ms)   P99(ms)   Success
----------------------------------------------------------------------
No proxy             45        42        68        95        100.0%
Datacenter HTTP      85        78        140       210       99.5%
Datacenter SOCKS5    82        75        135       200       99.3%
ISP Proxy            120       110       195       280       98.8%
Residential HTTP     350       280       750       1200      97.2%
Residential SOCKS5   340       270       720       1150      97.0%
Mobile 4G            580       450       1100      2000      95.5%
Mobile 5G            420       350       850       1500      96.2%

Protocol Comparison (Same Provider)
=== PROTOCOL COMPARISON (Datacenter, Same Provider) ===
Protocol          Avg(ms)   Throughput   Connection Overhead
----------------------------------------------------------------
HTTP/1.1          92        2.1 MB/s     +15ms
HTTP/2            78        3.4 MB/s     +18ms (TLS + ALPN)
SOCKS5            75        3.2 MB/s     +8ms
SOCKS4            72        3.3 MB/s     +5ms (no auth overhead)
HTTPS (CONNECT)   95        2.8 MB/s     +25ms (double TLS)

Geographic Impact
=== LATENCY BY PROXY LOCATION (from US-East) ===
Proxy Location       Avg(ms)   Added Latency vs No Proxy
---------------------------------------------------------
Same region (US-E)   85        +40ms
US-West              120       +75ms
Europe (London)      180       +135ms
Europe (Frankfurt)   195       +150ms
Asia (Tokyo)         280       +235ms
Asia (Singapore)     310       +265ms
Australia            350       +305ms
South America        290       +245ms

Advanced Benchmarking: Connection Pooling
async def benchmark_connection_reuse(proxy_url, num_requests=100):
    """Compare performance with and without connection reuse."""
    # Without connection pooling (new connection each request)
    start = time.monotonic()
    for _ in range(num_requests):
        async with httpx.AsyncClient(proxy=proxy_url) as client:
            await client.get("https://httpbin.org/get")
    no_pool_time = time.monotonic() - start

    # With connection pooling (reuse connections). Note this run is also
    # concurrent, so the speedup reflects both reuse and parallelism.
    start = time.monotonic()
    async with httpx.AsyncClient(
        proxy=proxy_url,
        limits=httpx.Limits(
            max_connections=50,
            max_keepalive_connections=20,
        ),
    ) as client:
        tasks = [client.get("https://httpbin.org/get")
                 for _ in range(num_requests)]
        await asyncio.gather(*tasks)
    pool_time = time.monotonic() - start

    print(f"Without pooling: {no_pool_time:.1f}s "
          f"({no_pool_time / num_requests * 1000:.0f}ms/req)")
    print(f"With pooling:    {pool_time:.1f}s "
          f"({pool_time / num_requests * 1000:.0f}ms/req)")
    print(f"Speedup: {no_pool_time / pool_time:.1f}x")

# Typical result:
# Without pooling: 45.2s (452ms/req)
# With pooling:    8.3s (83ms/req)
# Speedup: 5.4x

Benchmarking Concurrent Connections
async def benchmark_concurrency(proxy_url, max_concurrent=100):
    """Find the optimal concurrency level for a proxy."""
    results = {}

    for concurrency in [1, 5, 10, 20, 50, 100]:
        if concurrency > max_concurrent:
            break

        semaphore = asyncio.Semaphore(concurrency)
        num_requests = concurrency * 10  # 10 requests per worker

        async def make_request():
            async with semaphore:
                async with httpx.AsyncClient(proxy=proxy_url) as client:
                    start = time.monotonic()
                    await client.get("https://httpbin.org/get")
                    return time.monotonic() - start

        start = time.monotonic()
        latencies = await asyncio.gather(
            *[make_request() for _ in range(num_requests)]
        )
        total_time = time.monotonic() - start

        results[concurrency] = {
            "total_time": f"{total_time:.1f}s",
            "requests_per_sec": f"{num_requests / total_time:.1f}",
            "avg_latency_ms": f"{statistics.mean(latencies) * 1000:.0f}",
            "p95_latency_ms": f"{sorted(latencies)[int(len(latencies) * 0.95)] * 1000:.0f}",
        }

    print("\n=== CONCURRENCY SCALING ===")
    print(f"{'Concurrency':<15} {'Total Time':<12} {'Req/s':<10} "
          f"{'Avg(ms)':<10} {'P95(ms)':<10}")
    for c, r in results.items():
        print(f"{c:<15} {r['total_time']:<12} {r['requests_per_sec']:<10} "
              f"{r['avg_latency_ms']:<10} {r['p95_latency_ms']:<10}")

Expected pattern:
=== CONCURRENCY SCALING ===
Concurrency     Total Time   Req/s   Avg(ms)   P95(ms)
1               85.2s        0.1     852       890
5               18.5s        2.7     340       520
10              10.2s        9.8     210       380
20              6.8s         14.7    280       650
50              5.1s         19.6    450       1200
100             5.5s         18.2    680       2500

Throughput plateaus and latency spikes beyond the optimal concurrency point (usually 10-30 for residential proxies, 50-100 for datacenter).
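That knee point can be picked out programmatically. A minimal sketch that selects the highest concurrency level still improving throughput by at least 10% over the previous level while keeping p95 latency within a budget; both thresholds are illustrative starting points, not fixed rules:

```python
def optimal_concurrency(
    throughput: dict[int, float],
    p95_ms: dict[int, float],
    latency_budget_ms: float = 1000,
) -> int:
    """Given requests/sec and p95 latency keyed by concurrency level,
    return the highest level that still improves throughput by >= 10%
    and keeps p95 latency within budget."""
    levels = sorted(throughput)
    best = levels[0]
    for prev, cur in zip(levels, levels[1:]):
        improves = throughput[cur] >= throughput[prev] * 1.10
        within_budget = p95_ms[cur] <= latency_budget_ms
        if improves and within_budget:
            best = cur
        else:
            break  # past the knee: diminishing returns or latency spike
    return best
```

Fed the table above, this picks 20: level 50 still adds throughput, but its p95 of 1200ms blows the 1-second budget.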
Automated Continuous Monitoring
import datetime


class ProxyMonitor:
    """Continuously monitor proxy performance."""

    def __init__(self, proxies, check_interval=60):
        self.proxies = proxies
        self.check_interval = check_interval
        self.history = []

    async def health_check(self, proxy_url):
        """Single health check with timing."""
        try:
            start = time.monotonic()
            async with httpx.AsyncClient(
                proxy=proxy_url, timeout=10
            ) as client:
                response = await client.get("https://httpbin.org/ip")
                latency = time.monotonic() - start
                return {
                    "status": "healthy",
                    "latency_ms": round(latency * 1000),
                    "status_code": response.status_code,
                    "ip": response.json().get("origin", "unknown"),
                }
        except Exception as e:
            return {"status": "unhealthy", "error": str(e)}

    async def run_checks(self):
        """Run health checks on all proxies."""
        timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
        results = {}
        for name, proxy_url in self.proxies.items():
            result = await self.health_check(proxy_url)
            result["timestamp"] = timestamp
            results[name] = result
        self.history.append(results)
        return results

    async def monitor_loop(self):
        """Continuous monitoring loop."""
        while True:
            results = await self.run_checks()
            for name, result in results.items():
                status = result["status"]
                latency = result.get("latency_ms", "N/A")
                print(f"[{result['timestamp']}] {name}: "
                      f"{status} ({latency}ms)")
            await asyncio.sleep(self.check_interval)

Internal Links
- Proxy Load Balancing — distribute traffic based on benchmark results
- Proxy Failover Strategies — handle proxy failures detected by monitoring
- Connection Pooling for Proxies — optimize the biggest performance factor
- Proxy Speed Comparison Tool — interactive proxy speed tester
- Bandwidth Optimization for Proxies — reduce data transfer overhead
FAQ
What is a good proxy latency for web scraping?
For datacenter proxies, aim for under 100ms average latency. Residential proxies typically range 200-500ms. Mobile proxies can be 400-800ms. If your proxy latency exceeds 1 second consistently, consider switching providers or proxy locations.
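These rules of thumb can be encoded as a quick triage helper. The thresholds below simply restate the guidance above and should be tuned for your workload:

```python
# Rule-of-thumb upper bounds (ms) per proxy type, per the guidance above.
THRESHOLDS_MS = {
    "datacenter": 100,
    "residential": 500,
    "mobile": 800,
}


def latency_verdict(avg_ms: float, proxy_type: str) -> str:
    """Classify an average latency against typical ranges for its proxy type."""
    if avg_ms > 1000:
        return "consistently >1s: switch provider or location"
    limit = THRESHOLDS_MS.get(proxy_type, 500)
    return "ok" if avg_ms <= limit else "slower than typical for this type"
```

For example, `latency_verdict(85, "datacenter")` returns "ok", while 600ms on a residential proxy flags as slower than typical.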
How do I benchmark proxy throughput accurately?
Use a large file download test (not small API responses) to measure throughput. Ensure your benchmark client is not the bottleneck — use async/concurrent requests. Run at least 100 requests and measure p50/p95/p99 latencies, not just averages.
Does HTTP/2 improve proxy performance?
Often, yes. HTTP/2 multiplexing reduces connection overhead and can improve throughput by 2-4x for concurrent requests to the same host. The improvement is most noticeable when making many requests through the same proxy connection.
How many concurrent connections should I use?
Start with 10 concurrent connections and increase until you see latency spikes or error rates rise. Datacenter proxies handle 50-100 concurrent connections well. Residential proxies often perform best at 10-30 concurrent connections per IP.
Why does my proxy performance vary throughout the day?
Proxy performance depends on network congestion, server load, and peer traffic. Residential proxies are slowest during peak internet hours (evening local time). Datacenter proxies are more consistent. Monitor 24-hour performance to identify optimal scraping windows.
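Those windows can be found by aggregating monitoring data by hour. A minimal sketch that works on records shaped like the ProxyMonitor health checks above ({"timestamp": ISO-8601 string, "latency_ms": int}):

```python
import statistics
from collections import defaultdict


def latency_by_hour(records: list[dict]) -> dict[int, float]:
    """Average health-check latency per UTC hour, to spot fast windows."""
    buckets: dict[int, list[float]] = defaultdict(list)
    for record in records:
        if "latency_ms" not in record:
            continue  # skip unhealthy checks, which carry no timing
        hour = int(record["timestamp"][11:13])  # "YYYY-MM-DDTHH:..." -> HH
        buckets[hour].append(record["latency_ms"])
    return {h: statistics.mean(v) for h, v in sorted(buckets.items())}
```

ProxyMonitor.history stores one dict per check keyed by proxy name, so you would flatten it first, e.g. `latency_by_hour([checks[name] for checks in monitor.history])`.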
Related Reading
- AJAX Request Interception: Scraping API Calls Directly
- Azure Functions for Serverless Web Scraping: the Complete Guide
- Build an Anti-Detection Test Suite: Verify Browser Stealth
- Build a News Crawler in Python: Step-by-Step Tutorial
- How to Configure Proxies on iPhone and Android
- How to Use Proxies in Node.js (Axios, Fetch, Puppeteer)