DNS over HTTPS (DoH) with Proxies: Privacy & Anti-Detection Guide

You carefully route all your HTTP traffic through a proxy, but if your application or operating system resolves hostnames locally, those DNS queries still go directly to your ISP’s resolver, completely unencrypted. This DNS leak reveals the domains you look up, even though the traffic itself is proxied. DNS over HTTPS (DoH) solves this by encrypting DNS queries inside HTTPS requests, which can themselves be routed through your proxy.

This guide explains how DoH works, why it matters for proxy users, and how to implement it correctly for scraping and privacy.

The DNS Leak Problem

Without DoH:
┌─────────┐   DNS query (plaintext)    ┌──────────┐
│  Your   │ ──────────────────────────→ │ ISP DNS  │  ← ISP sees everything
│ Machine │                             │ Resolver │
│         │   HTTP via proxy            │          │
│         │ ──→ Proxy ──→ Target        └──────────┘
└─────────┘

Your ISP sees: "User queried competitor-prices.com"
This happens even though the HTTP traffic goes through the proxy.

With DoH through proxy:
┌─────────┐   DoH via proxy            ┌──────────┐
│  Your   │ ──→ Proxy ──→ DoH Server   │ ISP DNS  │  ← ISP sees nothing
│ Machine │                             │ Resolver │
│         │   HTTP via proxy            │ (unused) │
│         │ ──→ Proxy ──→ Target        └──────────┘
└─────────┘

What DNS Leaks Reveal

| Without DoH | With DoH |
|---|---|
| ISP sees all domains you query | Encrypted, ISP sees nothing |
| Network admin logs all lookups | Cannot log DNS queries |
| DNS queries linked to your real IP | Queries routed through proxy IP |
| Potential DNS poisoning/hijacking | Authenticated via HTTPS |
| MITM attacks on DNS possible | TLS protects integrity |

How DNS over HTTPS Works

DoH wraps DNS queries inside standard HTTPS POST or GET requests:

import httpx
import base64
import struct

class DoHResolver:
    """DNS over HTTPS resolver."""

    # Popular DoH providers
    PROVIDERS = {
        'cloudflare': 'https://cloudflare-dns.com/dns-query',
        'google': 'https://dns.google/dns-query',
        'quad9': 'https://dns.quad9.net:5053/dns-query',
        'nextdns': 'https://dns.nextdns.io/dns-query',
    }

    def __init__(self, provider='cloudflare', proxy=None):
        self.doh_url = self.PROVIDERS.get(provider, provider)
        self.proxy = proxy
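        self.client = httpx.Client(
            http2=True,  # requires the optional extra: pip install "httpx[http2]"
            proxy=proxy,  # httpx 0.26+; older releases use proxies=
            timeout=10,
        )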

    def resolve(self, domain, record_type='A'):
        """Resolve domain using DoH."""
        # Build DNS query packet
        dns_query = self._build_query(domain, record_type)

        # Send as HTTPS POST with DNS wireformat
        response = self.client.post(
            self.doh_url,
            content=dns_query,
            headers={
                'Content-Type': 'application/dns-message',
                'Accept': 'application/dns-message',
            }
        )

        if response.status_code == 200:
            return self._parse_response(response.content)
        else:
            raise Exception(f"DoH query failed: {response.status_code}")

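    def resolve_json(self, domain, record_type='A'):
        """Resolve using the JSON API (simpler, but not part of RFC 8484).

        Cloudflare serves JSON at /dns-query. Google serves it at
        https://dns.google/resolve, so pass that URL as the provider for Google.
        """
        response = self.client.get(
            self.doh_url,
            params={
                'name': domain,
                'type': record_type,
            },
            headers={
                'Accept': 'application/dns-json',
            }
        )
        # Fail loudly on HTTP errors instead of trying to parse an error page
        response.raise_for_status()

        data = response.json()
        # 'Answer' is absent when the name has no records of this type
        return [answer['data'] for answer in data.get('Answer', [])]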

    def _build_query(self, domain, record_type):
        """Build a minimal DNS query packet."""
        # Transaction ID
        query = struct.pack('>H', 0x1234)
        # Flags: standard query
        query += struct.pack('>H', 0x0100)
        # Questions: 1, Answers: 0, Authority: 0, Additional: 0
        query += struct.pack('>HHHH', 1, 0, 0, 0)

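        # Encode domain name as length-prefixed labels
        # (strip a trailing dot so it does not produce an empty label)
        for part in domain.rstrip('.').split('.'):
            query += struct.pack('B', len(part)) + part.encode('ascii')
        query += b'\x00'  # End of domain name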

        # Query type and class
        types = {'A': 1, 'AAAA': 28, 'CNAME': 5, 'MX': 15, 'TXT': 16}
        query += struct.pack('>HH', types.get(record_type, 1), 1)

        return query

    def _parse_response(self, data):
        """Parse DNS response packet (simplified)."""
        # Skip header (12 bytes) and question section
        offset = 12
        # Skip question
        while data[offset] != 0:
            offset += data[offset] + 1
        offset += 5  # null byte + type + class

        answers = []
        answer_count = struct.unpack('>H', data[6:8])[0]

        for _ in range(answer_count):
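            # Skip the owner name: labels ending in a null byte or a
            # 2-byte compression pointer
            while data[offset] != 0 and (data[offset] & 0xC0) != 0xC0:
                offset += data[offset] + 1
            offset += 2 if (data[offset] & 0xC0) == 0xC0 else 1

            # Fixed fields: type (2), class (2), TTL (4), rdlength (2)
            rtype = struct.unpack('>H', data[offset:offset+2])[0]
            rdlength = struct.unpack('>H', data[offset+8:offset+10])[0]
            offset += 10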

            if rtype == 1:  # A record
                ip = '.'.join(str(b) for b in data[offset:offset+rdlength])
                answers.append(ip)
            offset += rdlength

        return answers

# Usage
resolver = DoHResolver(
    provider='cloudflare',
    proxy='http://user:pass@proxy.example.com:8080'
)

ips = resolver.resolve_json('example.com')
print(f"example.com resolves to: {ips}")
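
The class above sends wire-format queries with POST only. RFC 8484 also defines a GET form, in which the same packet is base64url-encoded without padding and passed as the `dns` query parameter. The following is a minimal sketch of that encoding, not part of the original resolver: the helper names `build_query` and `build_doh_get_url` are illustrative, and the transaction ID is set to 0, which RFC 8484 recommends so that HTTP caches can reuse responses.

```python
import base64
import struct

QTYPES = {'A': 1, 'AAAA': 28, 'CNAME': 5, 'MX': 15, 'TXT': 16}

def build_query(domain, record_type='A'):
    """Build a minimal DNS query packet with transaction ID 0."""
    # Header: ID=0, flags=0x0100 (recursion desired), 1 question, no other records
    packet = struct.pack('>HHHHHH', 0, 0x0100, 1, 0, 0, 0)
    for label in domain.rstrip('.').split('.'):
        encoded = label.encode('ascii')
        packet += struct.pack('B', len(encoded)) + encoded
    packet += b'\x00'  # root label terminates the name
    packet += struct.pack('>HH', QTYPES[record_type], 1)  # type, class IN
    return packet

def build_doh_get_url(doh_url, domain, record_type='A'):
    """Return the GET URL for a DoH query (illustrative helper)."""
    wire = build_query(domain, record_type)
    # base64url with the trailing '=' padding removed
    encoded = base64.urlsafe_b64encode(wire).rstrip(b'=').decode('ascii')
    return f"{doh_url}?dns={encoded}"

url = build_doh_get_url('https://cloudflare-dns.com/dns-query', 'www.example.com')
print(url)
```

The resulting URL can be fetched with `Accept: application/dns-message`, and the response body is the same wire-format packet that `_parse_response` expects.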

Configuring DoH for Web Scraping

Python requests/httpx with Custom DNS

import httpx
import asyncio

class ProxiedDoHScraper:
    """Scraper that uses DoH through proxy for all DNS resolution."""

    def __init__(self, proxy_url, doh_provider='cloudflare'):
        self.proxy_url = proxy_url
        self.resolver = DoHResolver(
            provider=doh_provider,
            proxy=proxy_url
        )
        self.dns_cache = {}

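    async def resolve_and_scrape(self, url):
        """Resolve DNS via DoH, then scrape through proxy."""
        from urllib.parse import urlparse
        parsed = urlparse(url)
        domain = parsed.hostname

        # Resolve via DoH through the proxy. The resolver is synchronous,
        # so run it in a worker thread to avoid blocking the event loop.
        if domain not in self.dns_cache:
            ips = await asyncio.to_thread(self.resolver.resolve_json, domain)
            if ips:
                self.dns_cache[domain] = ips[0]

        # Scrape through the proxy. Note: the request below still carries the
        # hostname, so the proxy performs its own lookup; the cached DoH answer
        # is kept for inspection rather than used for the connection.
        async with httpx.AsyncClient(proxy=self.proxy_url) as client:
            response = await client.get(url)
            return response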

    def clear_cache(self):
        self.dns_cache.clear()

System-Level DoH Configuration

Linux (systemd-resolved):

# /etc/systemd/resolved.conf
[Resolve]
DNS=1.1.1.1#cloudflare-dns.com
DNSOverTLS=yes
# For DoH, use a stub resolver like dnscrypt-proxy

# Install dnscrypt-proxy
sudo apt install dnscrypt-proxy

# /etc/dnscrypt-proxy/dnscrypt-proxy.toml
server_names = ['cloudflare', 'google']
# Must match the port used in the iptables redirect below
listen_addresses = ['127.0.0.1:5353']

# Force all outgoing DNS to the local DoH resolver using iptables
# (redirect port 53 to the dnscrypt-proxy listener)
sudo iptables -t nat -A OUTPUT -p udp --dport 53 -j REDIRECT --to-port 5353
sudo iptables -t nat -A OUTPUT -p tcp --dport 53 -j REDIRECT --to-port 5353

macOS:

# Use dnscrypt-proxy via Homebrew
brew install dnscrypt-proxy

# Edit /opt/homebrew/etc/dnscrypt-proxy.toml (under /usr/local/etc/ on Intel Macs)
# Set proxy if needed:
# proxy = 'socks5://127.0.0.1:1080'

# Start the service
sudo brew services start dnscrypt-proxy

# Set DNS to local resolver
sudo networksetup -setdnsservers Wi-Fi 127.0.0.1

Browser-Level DoH

from playwright.sync_api import sync_playwright

def scrape_with_doh(url, proxy_server):
    """Use browser with DoH enabled through proxy."""
    with sync_playwright() as p:
        browser = p.chromium.launch(
            proxy={"server": proxy_server},
            args=[
                # Request DoH in Chromium. Support for these switches varies
                # by Chromium version, so verify they take effect in your build.
                '--enable-features=DnsOverHttps',
                '--dns-over-https-mode=secure',
                # One argument: adjacent string literals are concatenated
                '--dns-over-https-templates='
                'https://cloudflare-dns.com/dns-query',
            ]
        )
        try:
            page = browser.new_page()
            page.goto(url)
            return page.content()
        finally:
            # Close the browser even if navigation fails
            browser.close()
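
Firefox exposes its built-in DoH resolver (the Trusted Recursive Resolver) through ordinary preferences, which Playwright can set at launch via `firefox_user_prefs`. The sketch below is an alternative to the Chromium flags, not the original author's method; the function names are illustrative, and it assumes Playwright's Firefox build is installed. As with any browser behind a proxy, hostname resolution for proxied requests may still be delegated to the proxy, so treat this as a way to control lookups the browser makes itself.

```python
def firefox_doh_prefs(doh_url="https://cloudflare-dns.com/dns-query", strict=True):
    """Return Firefox preferences that enable its built-in DoH resolver."""
    return {
        # 2 = try DoH first and fall back to system DNS, 3 = DoH only
        "network.trr.mode": 3 if strict else 2,
        "network.trr.uri": doh_url,
    }

def scrape_with_firefox_doh(url, proxy_server):
    """Fetch a page in Firefox with the DoH preferences applied."""
    # Imported lazily so the preference helper works without Playwright
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch(
            proxy={"server": proxy_server},
            firefox_user_prefs=firefox_doh_prefs(),
        )
        try:
            page = browser.new_page()
            page.goto(url)
            return page.content()
        finally:
            browser.close()

print(firefox_doh_prefs())
```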

DNS Leak Testing

Always verify your DNS is not leaking:

import httpx

def test_dns_leak(proxy_url=None):
    """Run basic visibility checks through the given proxy (or directly).

    These requests show what remote services see. They cannot observe
    lookups made by your local resolver, so also run a browser-based
    DNS leak test or a packet capture on port 53.
    """
    with httpx.Client(proxy=proxy_url, timeout=10) as client:
        # Method 1: Confirm the DoH JSON endpoint is reachable on this path.
        # whoami.cloudflare is normally a CHAOS-class query, so the answer
        # section may be empty here.
        response = client.get('https://cloudflare-dns.com/dns-query',
                              params={'name': 'whoami.cloudflare', 'type': 'TXT'},
                              headers={'Accept': 'application/dns-json'})
        dns_data = response.json()
        print(f"Cloudflare DoH response: {dns_data}")

        # Method 2: Use a DNS leak test service
        # (field names depend on the service; 'dns' may be absent)
        response = client.get('https://ipleak.net/json/')
        leak_data = response.json()
        print(f"Your visible IP: {leak_data.get('ip')}")
        print(f"DNS servers: {leak_data.get('dns', 'N/A')}")

        # Method 3: Check that the exit IP location matches the proxy location
        response = client.get('https://ipinfo.io/json')
        ip_data = response.json()
        print(f"IP location: {ip_data.get('city')}, {ip_data.get('country')}")

# Test without proxy
print("--- Without proxy ---")
test_dns_leak()

# Test with proxy
print("\n--- With proxy ---")
test_dns_leak("http://user:pass@proxy.example.com:8080")

DoH vs DoT vs DNSCrypt

| Feature | DoH | DoT | DNSCrypt |
|---|---|---|---|
| Protocol | HTTPS (port 443) | TLS (port 853) | Custom (port 443/5443) |
| Blends with traffic | Yes (looks like HTTPS) | No (dedicated port) | Partially |
| Proxy compatible | Yes (standard HTTPS) | Limited | With SOCKS |
| Blocked easily | Hard (same as HTTPS) | Easy (block port 853) | Medium |
| Speed | Slightly slower | Fast | Fast |
| Browser support | Chrome, Firefox, Edge | Android, iOS | Requires client |

For proxy users, DoH is the best choice because it uses standard HTTPS, which proxies already handle.

Performance Impact

DNS Resolution Latency Comparison:

Standard DNS (UDP):     ~15ms   (unencrypted, can be snooped)
DNS over TLS (DoT):     ~25ms   (encrypted, dedicated port)
DNS over HTTPS (DoH):   ~35ms   (encrypted, blends with HTTPS)
DoH through proxy:      ~85ms   (encrypted, proxied, most private)
Cached (any method):    ~0.1ms  (no network call)

Impact on scraping 10,000 pages (one uncached lookup per page):
├─ Standard DNS:    +150s total DNS time
├─ DoH direct:      +350s total DNS time
├─ DoH via proxy:   +850s total DNS time
└─ DoH + cache:     ~0s (after initial resolution)
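
The totals above are simply the per-lookup latency multiplied by the number of lookups. A short sketch of that arithmetic follows; the latency values are copied from the comparison, while the figure of 20 distinct domains is an assumption made purely for illustration.

```python
# Per-lookup latencies in milliseconds, copied from the comparison above
LATENCY_MS = {
    'Standard DNS': 15,
    'DoH direct': 35,
    'DoH via proxy': 85,
}

def total_dns_seconds(latency_ms, lookups):
    """Total time spent on DNS if every lookup pays the full latency."""
    return latency_ms * lookups / 1000

PAGES = 10_000
for method, latency in LATENCY_MS.items():
    print(f"{method}: +{total_dns_seconds(latency, PAGES):.0f}s")

# With a cache, only the first lookup per distinct domain pays the cost.
# Assumption for illustration: the 10,000 pages span 20 distinct domains.
DISTINCT_DOMAINS = 20
cached = total_dns_seconds(LATENCY_MS['DoH via proxy'], DISTINCT_DOMAINS)
print(f"DoH via proxy + cache: +{cached:.1f}s")
```

Under that assumption the cached case costs under two seconds in total, which is why the comparison rounds it to roughly zero.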

Caching is essential, since most scraping jobs hit the same few domains repeatedly and those domains keep resolving to the same IPs:

import time

class CachedDoHResolver:
    def __init__(self, proxy_url, ttl=300):
        # Use the caller's proxy rather than a hardcoded address
        self.resolver = DoHResolver(proxy=proxy_url)
        self.ttl = ttl  # seconds to keep each answer
        self._cache = {}

    def resolve(self, domain):
        now = time.time()
        if domain in self._cache:
            result, timestamp = self._cache[domain]
            if now - timestamp < self.ttl:
                return result

        result = self.resolver.resolve_json(domain)
        self._cache[domain] = (result, now)
        return result
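
A fixed TTL is the simplest policy, but the JSON responses from Cloudflare and Google also carry a `TTL` field on each answer, so a cache can expire entries when the records themselves do. The following is a sketch of that idea under stated assumptions: `TTLAwareCache` is a hypothetical class, and it takes the lookup function and the clock as parameters so it can be exercised without a network.

```python
import time

class TTLAwareCache:
    """Cache DNS answers for the shortest TTL in the response (sketch)."""

    def __init__(self, lookup, max_ttl=300, clock=time.monotonic):
        # lookup(domain) must return a list of dicts with 'data' and 'TTL'
        # keys, like the 'Answer' list in a DoH JSON response.
        self.lookup = lookup
        self.max_ttl = max_ttl
        self.clock = clock
        self._cache = {}

    def resolve(self, domain):
        now = self.clock()
        entry = self._cache.get(domain)
        if entry and now < entry[1]:
            return entry[0]

        answers = self.lookup(domain)
        ips = [a['data'] for a in answers]
        # Expire at the shortest record TTL, capped at max_ttl
        ttl = min([a.get('TTL', self.max_ttl) for a in answers] + [self.max_ttl])
        self._cache[domain] = (ips, now + ttl)
        return ips

# Demonstration with a fake lookup and a controllable clock
calls = []
def fake_lookup(domain):
    calls.append(domain)
    return [{'data': '93.184.216.34', 'TTL': 60}]

fake_now = [0.0]
cache = TTLAwareCache(fake_lookup, clock=lambda: fake_now[0])
cache.resolve('example.com')   # performs a lookup
cache.resolve('example.com')   # served from cache
fake_now[0] = 61.0
cache.resolve('example.com')   # TTL expired, looked up again
print(f"lookups performed: {len(calls)}")
```

In real use, `lookup` could wrap a call to the JSON endpoint that returns the raw `Answer` list instead of only the `data` values. Empty responses are cached for `max_ttl`, which may or may not be what you want for negative answers.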

FAQ

Does DNS over HTTPS prevent my ISP from seeing which sites I visit?

DoH encrypts DNS queries, so your ISP cannot read your domain lookups. However, the ISP can still see the IP addresses you connect to, and the hostname is usually visible in the TLS SNI field. For full privacy, combine DoH with Encrypted Client Hello (ECH) and proxy/VPN usage.

Will DoH slow down my web scraping?

Initial DNS lookups add 20-50ms of latency compared to standard DNS. However, with proper caching, subsequent lookups are instant. For scraping thousands of pages on the same domains, the impact is negligible after the first resolution.

Do proxy providers handle DNS resolution on their end?

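Yes, most proxy providers resolve DNS on the proxy server side when using CONNECT tunnels. The client sends the hostname to the proxy inside the CONNECT request, and the proxy resolves it using its own DNS, so your machine does not need to query your ISP’s resolver for that request. Note that the hostname is only encrypted in transit if the connection to the proxy itself uses TLS. Resolving via DoH through the proxy additionally keeps any lookups your own software makes hidden from both your ISP and the proxy provider’s ISP.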
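
To see why the proxy ends up doing the lookup, it helps to look at what a client sends when it opens an HTTPS tunnel: the CONNECT request names the target by hostname, not by IP address. The sketch below builds such a request by hand. The function name is illustrative, and real HTTP clients construct this for you.

```python
import base64

def build_connect_request(host, port=443, username=None, password=None):
    """Build the raw CONNECT request an HTTP client sends to a proxy."""
    lines = [
        f"CONNECT {host}:{port} HTTP/1.1",
        f"Host: {host}:{port}",
    ]
    if username is not None and password is not None:
        # Basic proxy authentication: base64 of "username:password"
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        lines.append(f"Proxy-Authorization: Basic {token}")
    # A blank line terminates the request headers
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_connect_request("example.com")
print(request.decode("ascii"))
```

Because the hostname is sent as-is, the client never has to resolve it locally. On a plain `http://` proxy connection, this request is also readable by anyone on the path between you and the proxy.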

Can websites detect that I am using DNS over HTTPS?

Websites cannot directly detect DoH usage. They receive connections from your proxy IP regardless of how DNS was resolved. However, some advanced fingerprinting techniques might detect timing differences between DoH and standard DNS resolution.

Should I use Cloudflare or Google for DoH?

Cloudflare (1.1.1.1) has a strong privacy policy and commits to deleting query logs within roughly a day. Google (8.8.8.8) keeps logs longer but offers slightly more reliable resolution. For scraping, Cloudflare is generally preferred for privacy. You can also self-host a DoH resolver for maximum control.

