WhatWaf is an open-source Python tool that identifies which WAF (web application firewall) protects a web target before you attempt scraping. it tests for 80+ WAF signatures, including Cloudflare, Sucuri, Akamai, and AWS WAF. running it first can save hours of debugging blocked requests.
why WAF detection matters for scrapers
different WAFs call for different bypass strategies. Cloudflare’s JS challenge usually needs a headless browser. Akamai Bot Manager fingerprints clients, including their TLS handshake, so it usually requires a spoofed TLS fingerprint. Sucuri-protected sites are typically handled with residential proxies and realistic headers. knowing the WAF before writing your scraper means you pick the right approach from the start.
WhatWaf was created by Ekultek and has over 3,800 GitHub stars. it sends a series of test payloads and analyzes the response headers, body, and status codes to fingerprint the WAF. it supports both direct URL testing and integration into Python scripts.
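to make the fingerprinting idea concrete, here is a minimal sketch of signature matching against response headers and body. it is not WhatWaf's code: the `Signature` class, the `SIGNATURES` list, and the `fingerprint` function are hypothetical names, and the patterns are a small illustrative set rather than the tool's real signature database.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Signature:
    """Illustrative WAF signature (hypothetical, not WhatWaf's format)."""
    name: str
    header_patterns: dict = field(default_factory=dict)  # header name -> regex
    body_pattern: str = ""                               # optional regex for the body


# small illustrative set; a real tool ships far more signatures
SIGNATURES = [
    Signature("Cloudflare", {"server": r"cloudflare", "cf-ray": r".+"}),
    Signature("Sucuri", {"x-sucuri-id": r".+"}, r"sucuri website firewall"),
    Signature("Akamai", {"server": r"akamaighost"}),
]


def fingerprint(headers, body=""):
    """Return (waf_name, hit_count) pairs, strongest match first."""
    # header names are case-insensitive, so normalize them once
    lowered = {k.lower(): v for k, v in headers.items()}
    matches = []
    for sig in SIGNATURES:
        hits = sum(
            1
            for name, pattern in sig.header_patterns.items()
            if re.search(pattern, lowered.get(name, ""), re.IGNORECASE)
        )
        if sig.body_pattern and re.search(sig.body_pattern, body, re.IGNORECASE):
            hits += 1
        if hits:
            matches.append((sig.name, hits))
    return sorted(matches, key=lambda m: m[1], reverse=True)


print(fingerprint({"Server": "cloudflare", "CF-RAY": "abc123"}))
```

counting hits per signature gives a rough ranking when a response matches more than one WAF, which is the same problem a confidence score addresses.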
installation
git clone https://github.com/Ekultek/WhatWaf.git
cd WhatWaf
pip install -r requirements.txt
WhatWaf requires Python 3.6+. it uses requests, beautifulsoup4, and colorlog. on first run it downloads the latest WAF signatures from its GitHub repository.
basic command-line usage
the simplest usage is passing a URL with the -u flag. WhatWaf will return the detected WAF name and confidence score.
# test a single URL
python whatwaf.py -u https://target-site.com
# test with a random user agent
python whatwaf.py -u https://target-site.com --ra
# batch test from a file of URLs
python whatwaf.py -l urls.txt
# output results to JSON
python whatwaf.py -u https://target-site.com --format json -o results.json
example output for a Cloudflare-protected site:
[+] detected WAF: Cloudflare
[+] confidence: 95%
[+] detected tamper scripts: between, randomcase, space2comment
interpreting results
WhatWaf categorizes detections into three confidence tiers. a score above 80% means the signature match is reliable. a score between 50% and 80% means more than one WAF could match. a score below 50% suggests the site uses a custom or uncommon WAF. use the JSON output to build automated decision logic in your scraping pipeline.
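the three tiers can be sketched as a small classifier. this is an illustration only: `confidence_tier` and `parse_output` are hypothetical helpers, the tier labels are made up for this example, and the parser assumes the console output follows the `[+] detected WAF:` / `[+] confidence:` format of the example output.

```python
import re


def confidence_tier(score):
    """Map a confidence percentage to one of the three tiers."""
    if score > 80:
        return "reliable"    # signature match can be trusted
    if score >= 50:
        return "ambiguous"   # more than one WAF could match
    return "unknown"         # likely a custom or uncommon WAF


def parse_output(text):
    """Extract WAF name, confidence, and tier from console output.

    Assumes the line format shown in the example output; returns None
    when either line is missing.
    """
    waf = re.search(r"detected WAF:\s*(.+)", text)
    score = re.search(r"confidence:\s*(\d+)%", text)
    if not waf or not score:
        return None
    value = int(score.group(1))
    return {
        "waf": waf.group(1).strip(),
        "confidence": value,
        "tier": confidence_tier(value),
    }


sample = "[+] detected WAF: Cloudflare\n[+] confidence: 95%"
print(parse_output(sample))
# → {'waf': 'Cloudflare', 'confidence': 95, 'tier': 'reliable'}
```

a pipeline could proceed automatically on "reliable" results and flag the other two tiers for manual review.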
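import json
import subprocess

def detect_waf(url, whatwaf_dir="/path/to/WhatWaf", out_path="/tmp/waf.json"):
    # run WhatWaf from its checkout and have it write a JSON report
    result = subprocess.run(
        ["python", "whatwaf.py", "-u", url, "--format", "json", "-o", out_path],
        capture_output=True, text=True, cwd=whatwaf_dir
    )
    # surface a failed run instead of reading a stale or missing report
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "WhatWaf exited with an error")
    # load the report WhatWaf wrote to disk
    with open(out_path) as f:
        return json.load(f)

waf_info = detect_waf("https://example.com")
print(waf_info)
supported WAFs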
WhatWaf detects 80+ WAFs including: Cloudflare, Sucuri, Akamai, AWS WAF, Imperva Incapsula, F5 BIG-IP ASM, Barracuda, Citrix NetScaler, Reblaze, StackPath, ModSecurity, and many more. the full list is in the content/wafdetector.py file in the repository.
using WhatWaf programmatically
for integration into a scraping framework, you can import WhatWaf’s detection module directly. this avoids subprocess overhead and lets you run detection in-process.
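import sys

# make the WhatWaf checkout importable; adjust the path to your clone
sys.path.insert(0, "/path/to/WhatWaf")
from lib.core.waf_detections import WafDetections

def check_protection(url):
    # build a detector for the target and return whatever it reports
    detector = WafDetections(url)
    detections = detector.get_page()
    return detections
bypass strategy selection based on WAF type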
once you know the WAF, apply the right bypass strategy. Cloudflare: use Playwright with stealth mode or a browser fingerprint-spoofing library. Sucuri: residential proxies and realistic headers. Akamai: rotate TLS fingerprints using curl-impersonate or tls-client. ModSecurity: encode payloads and avoid SQLi-like patterns in parameters.
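the mapping above can be sketched as a simple lookup. this is a minimal illustration, not part of WhatWaf: `STRATEGIES`, `DEFAULT_STRATEGY`, and `pick_strategy` are hypothetical names, and the strategy strings just restate the recommendations in the paragraph.

```python
# detected WAF keyword -> bypass approach (restating the text above)
STRATEGIES = {
    "cloudflare": "Playwright with stealth mode or a fingerprint-spoofing library",
    "sucuri": "residential proxies and realistic headers",
    "akamai": "rotated TLS fingerprints via curl-impersonate or tls-client",
    "modsecurity": "encoded payloads, no SQLi-like patterns in parameters",
}

# fallback for WAFs the table does not cover (an assumption for this sketch)
DEFAULT_STRATEGY = "manual investigation"


def pick_strategy(waf_name):
    """Return the bypass approach for a detected WAF name.

    Matching is by case-insensitive substring, so a detection such as
    "Akamai Bot Manager" still resolves to the "akamai" entry.
    """
    lowered = waf_name.lower()
    for keyword, strategy in STRATEGIES.items():
        if keyword in lowered:
            return strategy
    return DEFAULT_STRATEGY


print(pick_strategy("Cloudflare"))
print(pick_strategy("Akamai Bot Manager"))
```

keeping the table as plain data makes it easy to extend as you meet new WAFs, without touching the lookup logic.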
for more on proxy types relevant to bypassing WAFs, see what is a proxy server and SOCKS5 vs HTTP proxy. for scraping fundamentals, see what is web scraping.
sources and further reading
- WhatWaf GitHub repository
- OWASP Web Application Firewall overview
- tls-client for TLS fingerprint spoofing