HTTP Headers Reference: Complete Guide for Proxy Users & Web Scrapers
HTTP headers are the metadata sent between clients and servers with every request and response. For proxy users, web scrapers, and API developers, understanding headers is essential. The wrong headers can get you blocked, while the right ones help your requests blend in with normal browser traffic. This reference covers the most important headers with practical examples.
Request Headers
Request headers are sent by the client (browser, cURL, script) to the server.
Essential Request Headers
User-Agent
Identifies the client software. One of the most important headers for web scraping.
# Browser-like User-Agent
curl -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" \
https://example.com
# Mobile User-Agent
curl -H "User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1" \
https://example.com
Accept
Tells the server what content types the client can handle:
# JSON API
curl -H "Accept: application/json" https://api.example.com/data
# HTML page
curl -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" https://example.com
# Any content
curl -H "Accept: */*" https://example.com
Accept-Language
Specifies preferred languages. Important for geo-targeted content:
curl -H "Accept-Language: en-US,en;q=0.9" https://example.com
curl -H "Accept-Language: ja-JP,ja;q=0.9,en;q=0.5" https://example.com
Accept-Encoding
Indicates supported compression:
curl -H "Accept-Encoding: gzip, deflate, br" --compressed https://example.com
Referer
Shows which page linked to the current request:
curl -H "Referer: https://www.google.com/" https://example.com/page
Cookie
Sends stored cookies to the server:
curl -b "session_id=abc123; user_pref=dark_mode" https://example.com/dashboard
Authentication Headers
Authorization
# Basic Auth
curl -H "Authorization: Basic dXNlcjpwYXNz" https://api.example.com
# Bearer Token
curl -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIs..." https://api.example.com
# API Key
curl -H "X-API-Key: your-api-key-here" https://api.example.com
Content Headers (for POST/PUT/PATCH)
Content-Type
# JSON
curl -H "Content-Type: application/json" -d '{"key":"value"}' https://api.example.com
# Form data
curl -H "Content-Type: application/x-www-form-urlencoded" -d "key=value" https://example.com
# Multipart (file upload) - cURL sets this automatically with -F
curl -F "file=@photo.jpg" https://example.com/upload
Content-Length
Automatically set by cURL based on the body size. Rarely needs manual configuration.
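As a rough Python counterpart to the cURL examples above, the sketch below serializes a JSON body and derives the matching Content-Type and Content-Length values. The helper name `build_json_body` is hypothetical and not part of any library. The point it illustrates is that Content-Length counts encoded bytes, not characters.

```python
import json


def build_json_body(payload):
    """Serialize payload and return (body_bytes, headers) for a JSON POST.

    build_json_body is a hypothetical helper name used for illustration.
    """
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Content-Length is the size of the encoded body in bytes.
        "Content-Length": str(len(body)),
    }
    return body, headers


body, headers = build_json_body({"key": "value"})
print(headers)
```

Most HTTP clients compute Content-Length for you, so a helper like this is mainly useful for checking what a client is about to send.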
Response Headers
Headers returned by the server to the client.
Status and Content Headers
| Header | Description | Example Value |
|---|---|---|
| Content-Type | Response body format | application/json; charset=utf-8 |
| Content-Length | Response body size in bytes | 4523 |
| Content-Encoding | Compression used | gzip |
| Content-Language | Response language | en-US |
| Content-Disposition | Download filename hint | attachment; filename="report.pdf" |
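When a script needs the individual parts of these values, Python's standard library can split them apart. This is a minimal sketch, assuming the raw header strings are already in hand. `parse_content_headers` is a hypothetical helper name, and it relies on the legacy `email.message.Message` API for parameter parsing.

```python
from email.message import Message


def parse_content_headers(content_type, content_disposition=None):
    """Return (media_type, charset, filename) parsed from raw header values.

    Missing parts come back as None. The helper name is illustrative only.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if content_disposition is not None:
        msg["Content-Disposition"] = content_disposition
    return msg.get_content_type(), msg.get_content_charset(), msg.get_filename()


print(parse_content_headers(
    "application/json; charset=utf-8",
    'attachment; filename="report.pdf"',
))
```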
Caching Headers
| Header | Description | Example Value |
|---|---|---|
| Cache-Control | Caching directives | max-age=3600, public |
| ETag | Resource version identifier | "33a64df551425fcc55e4d42a148795d9f25f89d4" |
| Last-Modified | When resource was last changed | Wed, 15 Jan 2025 08:00:00 GMT |
| Expires | When cached content expires | Thu, 16 Jan 2025 08:00:00 GMT |
| Age | Time since response was cached (seconds) | 3600 |
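The Cache-Control and Age values combine into a remaining freshness lifetime: roughly, max-age minus Age. The sketch below is a deliberate simplification that looks only at the max-age directive and ignores s-maxage, no-cache, and Expires. The function name `remaining_freshness` is hypothetical.

```python
def remaining_freshness(cache_control, age=0):
    """Return seconds of freshness left, or None if there is no max-age.

    Simplified sketch: only the max-age directive is considered.
    """
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return max(int(value) - int(age), 0)
    return None


# Using the example values from the table: the cached copy has just expired.
print(remaining_freshness("max-age=3600, public", age=3600))
```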
Security Headers
| Header | Description | Example Value |
|---|---|---|
| Strict-Transport-Security | Force HTTPS | max-age=31536000; includeSubDomains |
| X-Content-Type-Options | Prevent MIME sniffing | nosniff |
| X-Frame-Options | Prevent clickjacking | DENY |
| Content-Security-Policy | Control resource loading | default-src 'self' |
| X-XSS-Protection | Legacy XSS filter (deprecated; ignored by modern browsers) | 1; mode=block |
| Referrer-Policy | Control Referer header | strict-origin-when-cross-origin |
Rate Limiting Headers
| Header | Description | Example Value |
|---|---|---|
| X-RateLimit-Limit | Max requests per window | 100 |
| X-RateLimit-Remaining | Requests left in window | 47 |
| X-RateLimit-Reset | When window resets (Unix timestamp) | 1705312800 |
| Retry-After | Seconds to wait (or an HTTP date) after 429/503 | 60 |
# Check rate limit headers
curl -s -I https://api.example.com/data | grep -i "rate\|retry"
CORS Headers
| Header | Description | Example Value |
|---|---|---|
| Access-Control-Allow-Origin | Allowed origins | * or https://app.example.com |
| Access-Control-Allow-Methods | Allowed HTTP methods | GET, POST, PUT, DELETE |
| Access-Control-Allow-Headers | Allowed request headers | Authorization, Content-Type |
| Access-Control-Max-Age | Preflight cache time (seconds) | 86400 |
Proxy-Specific Headers
Headers Added by Proxies
| Header | Description | Impact |
|---|---|---|
| X-Forwarded-For | Client’s original IP | Reveals your real IP and that a proxy is in use |
| X-Forwarded-Proto | Original protocol (http/https) | Can expose proxy usage |
| X-Forwarded-Host | Original Host header | May reveal proxy |
| Via | Proxy chain information | Directly identifies proxy |
| Forwarded | Standardized forwarding info | Standardized (RFC 7239) replacement for the X-Forwarded-* headers |
| X-Real-IP | Single client IP | Common in Nginx setups |
Detecting Proxy Headers
Check if your proxy leaks identifying headers:
# Check what headers the target sees. Use plain HTTP for this test:
# an HTTPS request is tunneled via CONNECT, so the proxy cannot add headers to it.
curl -s -x http://proxy:8080 http://httpbin.org/headers | jq '.'
# Look for proxy-revealing headers
curl -s -x http://proxy:8080 http://httpbin.org/headers | \
jq '.headers | to_entries[] | select(.key | test("forward|via|proxy|real.ip"; "i"))'
Proxy Authentication Headers
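# Proxy-Authorization (sent to proxy): --proxy-user lets cURL build the header
curl -x http://proxy:8080 \
--proxy-user "user:pass" \
https://example.com
# Or set it by hand with --proxy-header, which sends the header to the proxy only
curl -x http://proxy:8080 \
--proxy-header "Proxy-Authorization: Basic dXNlcjpwYXNz" \
https://example.com
# vs Authorization (sent to target server)
curl -H "Authorization: Bearer token" https://api.example.com
Key difference: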
- Authorization: Authenticates with the target server
- Proxy-Authorization: Authenticates with the proxy server
- Proxy-Authenticate: Proxy tells client that proxy auth is needed (407 response)
- WWW-Authenticate: Server tells client that server auth is needed (401 response)
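Both request headers carry the same kind of value when the Basic scheme is used: the word `Basic` followed by the base64 encoding of `username:password`. The sketch below reproduces the `dXNlcjpwYXNz` value used in the cURL examples. The helper name `basic_credentials` is hypothetical. Base64 is an encoding, not encryption, so these values are only safe over TLS.

```python
import base64


def basic_credentials(username, password):
    """Build a Basic scheme value for Authorization or Proxy-Authorization."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


headers = {
    # Checked by the target server (a rejection is a 401).
    "Authorization": basic_credentials("user", "pass"),
    # Checked by the proxy (a rejection is a 407).
    "Proxy-Authorization": basic_credentials("user", "pass"),
}
print(headers["Authorization"])
```

In practice most HTTP clients build Proxy-Authorization themselves from the credentials in the proxy URL, so setting it by hand is rarely necessary.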
Headers for Web Scraping Anti-Detection
Realistic Browser Headers
curl -s \
-H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" \
-H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8" \
-H "Accept-Language: en-US,en;q=0.9" \
-H "Accept-Encoding: gzip, deflate, br" \
-H "Connection: keep-alive" \
-H "Upgrade-Insecure-Requests: 1" \
-H "Sec-Fetch-Dest: document" \
-H "Sec-Fetch-Mode: navigate" \
-H "Sec-Fetch-Site: none" \
-H "Sec-Fetch-User: ?1" \
-H "Cache-Control: max-age=0" \
--compressed \
https://example.com
Python Header Rotation
import random

import requests

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.2 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
ACCEPT_LANGUAGES = [
    "en-US,en;q=0.9",
    "en-GB,en;q=0.9",
    "en-US,en;q=0.9,es;q=0.8",
]


def get_realistic_headers():
    """Generate realistic browser-like headers."""
    ua = random.choice(USER_AGENTS)
    headers = {
        "User-Agent": ua,
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
        "Accept-Language": random.choice(ACCEPT_LANGUAGES),
        # requests only decodes "br" if a Brotli package (brotli or brotlicffi) is installed
        "Accept-Encoding": "gzip, deflate, br",
        "Connection": "keep-alive",
        "Upgrade-Insecure-Requests": "1",
        "Sec-Fetch-Dest": "document",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Site": "none",
        "Sec-Fetch-User": "?1",
        "Cache-Control": "max-age=0",
    }
    # Add Chrome-specific client hints only for Chrome User-Agents
    if "Chrome" in ua:
        # Keep the platform hint consistent with the OS named in the User-Agent
        if "Windows" in ua:
            platform = '"Windows"'
        elif "Macintosh" in ua:
            platform = '"macOS"'
        else:
            platform = '"Linux"'
        headers["Sec-Ch-Ua"] = '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"'
        headers["Sec-Ch-Ua-Mobile"] = "?0"
        headers["Sec-Ch-Ua-Platform"] = platform
    return headers


# Usage with proxy
response = requests.get(
    "https://example.com",
    headers=get_realistic_headers(),
    proxies={"https": "http://user:pass@proxy:8080"},
    timeout=30,
)
Inspecting Headers with cURL
View Response Headers Only
curl -I https://example.com
# or
curl --head https://example.com
View Both Request and Response Headers
curl -v https://example.com 2>&1 | grep -E "^[<>]"
# > lines = sent (request headers)
# < lines = received (response headers)
View Headers Through a Proxy
curl -v -x http://proxy:8080 https://example.com 2>&1 | grep -E "^[<>]"
Save Headers to File
curl -D headers.txt -o body.html https://example.com
cat headers.txt
Header Troubleshooting Table
| Issue | Symptom | Header to Check/Fix |
|---|---|---|
| Blocked as bot | 403 Forbidden | User-Agent, Accept, Sec-Fetch headers |
| Wrong content type | Garbled response | Accept, Accept-Encoding |
| Authentication failed | 401 Unauthorized | Authorization |
| Proxy auth failed | 407 Proxy Auth Required | Proxy-Authorization |
| Rate limited | 429 Too Many Requests | X-RateLimit-Remaining, Retry-After |
| Redirect loop | Too many redirects | Location, Referer |
| CORS error | Browser blocks response | Access-Control-Allow-Origin |
| Caching issues | Stale data | Cache-Control, ETag, If-None-Match |
| Proxy detected | Different results vs browser | X-Forwarded-For, Via |
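For scrapers that log failures, the status-code rows of the table can be encoded as a small lookup. This is only a sketch of that idea: the names `HEADERS_TO_CHECK` and `headers_to_check` are hypothetical, and rows without a status code (redirect loops, CORS, caching, proxy detection) are left out.

```python
# Status code -> headers worth inspecting, taken from the troubleshooting table.
HEADERS_TO_CHECK = {
    401: ["Authorization"],
    403: ["User-Agent", "Accept", "Sec-Fetch-*"],
    407: ["Proxy-Authorization"],
    429: ["X-RateLimit-Remaining", "Retry-After"],
}


def headers_to_check(status_code):
    """Return the headers to review for a failing status code (empty if unknown)."""
    return HEADERS_TO_CHECK.get(status_code, [])


print(headers_to_check(407))
```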
FAQ
What HTTP headers should I set for web scraping?
At minimum, set User-Agent to a current browser string, Accept to match what browsers send, Accept-Language to a common locale, and Accept-Encoding: gzip, deflate, br. For better stealth, also include Sec-Fetch-Dest, Sec-Fetch-Mode, Sec-Fetch-Site, and Sec-Ch-Ua headers that match your User-Agent. Rotate User-Agents between requests and set Referer to mimic natural browsing patterns.
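A header set is only convincing if its parts agree with each other. The sketch below checks a candidate set for the minimum headers listed above and for a Sec-Ch-Ua value that contradicts the User-Agent. The function name `find_header_mismatches` is hypothetical, and the two consistency rules are illustrative assumptions, not a complete fingerprint check.

```python
import re


def find_header_mismatches(headers):
    """Return warnings for header sets that are incomplete or self-contradictory."""
    lowered = {name.lower(): value for name, value in headers.items()}
    warnings = []
    for required in ("user-agent", "accept", "accept-language", "accept-encoding"):
        if required not in lowered:
            warnings.append(f"missing {required}")
    client_hints = lowered.get("sec-ch-ua")
    chrome = re.search(r"Chrome/(\d+)", lowered.get("user-agent", ""))
    if client_hints is not None:
        if chrome is None:
            # Client hints alongside a Firefox or Safari User-Agent look inconsistent.
            warnings.append("sec-ch-ua sent with a non-Chrome User-Agent")
        elif f'v="{chrome.group(1)}"' not in client_hints:
            warnings.append("sec-ch-ua version does not match User-Agent")
    return warnings


print(find_header_mismatches({"User-Agent": "curl/8.0"}))
```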
How do I check if my proxy is leaking headers?
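Send a request through your proxy to a header-inspection service like https://httpbin.org/headers or https://ifconfig.me/all. Where the service allows it, use the plain http:// form of the URL for this test: an HTTPS request is tunneled through the proxy with CONNECT, so the proxy cannot add headers to it. Look for X-Forwarded-For, Via, X-Real-IP, or Forwarded headers in the response. Elite/high-anonymity proxies should not add any of these headers. Transparent proxies typically pass your real IP in X-Forwarded-For, while anonymous proxies identify themselves (for example with Via) but mask your real IP.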
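Once the inspection service's response is parsed into a dictionary of header names and values, the check itself is a simple filter. The sketch below uses a pattern that covers the header names mentioned above. `find_proxy_headers` is a hypothetical helper, and the sample dictionary is made-up illustration data, not real output.

```python
import re

# Matches X-Forwarded-*, Forwarded, Via, Proxy-*, and X-Real-IP style names.
PROXY_HEADER_PATTERN = re.compile(r"forward|via|proxy|real.ip", re.IGNORECASE)


def find_proxy_headers(headers):
    """Return only the headers whose names suggest a proxy is in the path."""
    return {
        name: value
        for name, value in headers.items()
        if PROXY_HEADER_PATTERN.search(name)
    }


# Illustrative data only; a real check would use the parsed service response.
seen_by_target = {
    "Host": "httpbin.org",
    "User-Agent": "curl/8.0",
    "X-Forwarded-For": "203.0.113.7",
    "Via": "1.1 proxy",
}
print(find_proxy_headers(seen_by_target))
```

An empty result suggests the proxy is not adding identifying headers, at least for that request.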
What is the difference between Authorization and Proxy-Authorization headers?
Authorization authenticates you with the target web server (e.g., API authentication with Basic auth or Bearer tokens). Proxy-Authorization authenticates you with the proxy server itself. Both can be present simultaneously in the same request. A 401 status code means the target server rejected your Authorization header, while a 407 means the proxy rejected your Proxy-Authorization header.
How do I handle rate limit headers in my scraper?
Check X-RateLimit-Remaining in each response. When it reaches zero, read X-RateLimit-Reset for the reset timestamp and Retry-After for the wait duration (Retry-After may be a number of seconds or an HTTP date). In Python: remaining = int(response.headers.get("X-RateLimit-Remaining", 1)). If remaining is zero, sleep until the reset time. Where the limit is enforced per IP address, distributing requests across multiple proxies raises your effective allowance.
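The logic described here can be sketched as a single function that turns response headers into a wait time. It assumes X-RateLimit-Reset is a Unix timestamp, as in the table earlier in this guide; some APIs use other conventions. The name `seconds_to_wait` is hypothetical. The `headers` argument can be any mapping, but a plain dict is case-sensitive, unlike the `response.headers` object in requests.

```python
import time
from email.utils import parsedate_to_datetime


def seconds_to_wait(headers, now=None):
    """Return how long to pause before the next request, in seconds."""
    now = time.time() if now is None else now
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        if retry_after.strip().isdigit():
            return float(retry_after)
        try:
            # Retry-After may also be an HTTP date.
            return max(parsedate_to_datetime(retry_after).timestamp() - now, 0.0)
        except (TypeError, ValueError):
            pass  # Unparseable value: fall back to the rate limit headers.
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset = headers.get("X-RateLimit-Reset")
    if remaining <= 0 and reset is not None:
        return max(float(reset) - now, 0.0)
    return 0.0


print(seconds_to_wait({"Retry-After": "60"}))
```

A scraper would typically call `time.sleep(seconds_to_wait(response.headers))` before issuing the next request.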
Why do some websites return different content when I use cURL vs a browser?
Websites use header fingerprinting to detect non-browser clients. cURL’s default headers (User-Agent: curl/8.x, no Accept-Language, no Sec-Fetch headers) are easily identified. Websites may serve different content, block requests, or serve CAPTCHAs. To get browser-identical responses, replicate the full set of headers your target browser sends, including Sec-Ch-Ua, Sec-Fetch-* headers, and proper Accept values. Use browser DevTools Network tab to copy exact headers.
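Headers copied from the DevTools Network tab arrive as raw "Name: value" lines. A minimal sketch for turning that text into a dictionary a script can reuse follows. `parse_raw_headers` is a hypothetical helper. It skips HTTP/2 pseudo-headers such as `:authority`, which are not regular headers and should not be sent by hand.

```python
def parse_raw_headers(raw):
    """Turn copied 'Name: value' lines into a dict of request headers."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers (":authority", ":method", ...).
        if not line or line.startswith(":"):
            continue
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return headers


copied = """
:authority: example.com
Accept-Language: en-US,en;q=0.9
Referer: https://www.google.com/
"""
print(parse_raw_headers(copied))
```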
Related Reading
- 403 Forbidden Error: What It Means & How to Fix It
- 407 Proxy Authentication Required: Fix Guide
- Anti-Bot Detection Glossary: 50+ Terms Defined
- Anti-Bot Terminology Glossary: Complete A-Z Reference 2026
- Backconnect Proxies Deep Dive: Architecture and Real-World Performance
- Best Proxies in Southeast Asia: Singapore, Thailand, Indonesia, Philippines