Subnet Diversity Explained: Why It Matters for Web Scraping

Subnet diversity refers to the distribution of proxy IP addresses across different IP subnets (network blocks). When web scraping at scale, using IPs from many different subnets makes your traffic appear to originate from unrelated sources rather than from a concentrated block of addresses owned by a single network. Websites actively monitor subnet patterns, and poor subnet diversity is one of the fastest ways to get your entire proxy set banned at once.

Understanding subnets and how to maximize diversity across them is a critical skill for anyone running large-scale data collection operations.

Understanding IP Subnets

What Is a Subnet?

A subnet (short for subnetwork) is a logical subdivision of an IP network. IP addresses within the same subnet share a common prefix and are typically managed by the same organization or allocated from the same block.

For IPv4, subnets are described using CIDR notation:

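| CIDR Notation | Subnet Mask     | Number of IPs | Example Range                 |
|---------------|-----------------|---------------|-------------------------------|
| /32           | 255.255.255.255 | 1             | Single IP                     |
| /24           | 255.255.255.0   | 256           | 192.168.1.0 – 192.168.1.255   |
| /16           | 255.255.0.0     | 65,536        | 192.168.0.0 – 192.168.255.255 |
| /8            | 255.0.0.0       | 16,777,216    | 192.0.0.0 – 192.255.255.255   |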
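The mask and address count for any prefix length can be checked with Python's standard `ipaddress` module. This is a small illustrative sketch; the helper name `describe_prefix` is made up for this example.

```python
import ipaddress

def describe_prefix(prefix_length):
    """Return (subnet mask, number of addresses) for an IPv4 prefix length."""
    # 0.0.0.0 is used only as a placeholder base address; the mask and
    # address count depend on the prefix length alone.
    network = ipaddress.ip_network(f"0.0.0.0/{prefix_length}")
    return str(network.netmask), network.num_addresses

for prefix_length in (32, 24, 16, 8):
    mask, count = describe_prefix(prefix_length)
    print(f"/{prefix_length}: mask {mask}, {count:,} addresses")
```

Each step down in prefix length doubles the block, so a /16 holds 256 times as many addresses as a /24.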

The /24 Subnet (Class C Block)

The /24 subnet (often called a “Class C” block, a holdover from classful addressing) is the most commonly referenced subnet size in the proxy industry. A /24 contains 256 IP addresses that share the first three octets:

Example /24 subnet: 192.168.1.0/24
Includes: 192.168.1.0 through 192.168.1.255
All IPs share: 192.168.1.x

When proxy providers talk about “subnet diversity,” they are almost always referring to diversity across /24 blocks.
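To make the "same /24" idea concrete, the sketch below maps an address to its /24 block using the standard `ipaddress` module. The function names `subnet_24` and `same_24` are illustrative, and only IPv4 input is assumed.

```python
import ipaddress

def subnet_24(ip):
    """Return the /24 network that contains an IPv4 address, in CIDR form."""
    # strict=False masks off the host bits, so a host address is accepted.
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))

def same_24(ip_a, ip_b):
    """True if both addresses fall inside the same /24 block."""
    return subnet_24(ip_a) == subnet_24(ip_b)

print(subnet_24("192.168.1.77"))               # 192.168.1.0/24
print(same_24("192.168.1.5", "192.168.1.250"))  # True
print(same_24("192.168.1.5", "192.168.2.5"))    # False
```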

Why /24 Matters

A /24 is generally the smallest block routed on the public internet, so organizations receive address space in /24 blocks or larger. All 256 IPs in a /24 block usually belong to the same owner and are hosted in the same location. Websites use this knowledge to:

  • Identify related IPs: If one IP in a /24 is flagged, others in the same /24 become suspect
  • Apply subnet-level bans: Block the entire /24 instead of individual IPs
  • Detect proxy networks: Many IPs from the same /24 making similar requests signals automation

Why Subnet Diversity Matters for Scraping

The Subnet Ban Problem

When a website detects suspicious activity from an IP address, it often does not just ban that single IP — it bans the entire subnet:

Scenario: You have 50 datacenter proxy IPs from the same /24 subnet

IP 45.67.89.10 gets banned for aggressive scraping
    ↓
Website bans entire 45.67.89.0/24
    ↓
All 50 of your proxy IPs (45.67.89.*) are blocked simultaneously
    ↓
Your scraping operation stops completely

Compare this with subnet-diverse proxies:

Scenario: You have 50 datacenter proxy IPs across 50 different /24 subnets

IP 45.67.89.10 gets banned
    ↓
Website bans 45.67.89.0/24
    ↓
Only 1 of your 50 proxies is affected
    ↓
Your scraping operation continues with 49 healthy IPs
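The two scenarios can be reproduced with a short simulation. This is a sketch under simple assumptions: the site bans exactly the /24 around the offending IP, and the proxy pools are synthetic lists built for illustration. The name `surviving_proxies` is hypothetical.

```python
import ipaddress

def surviving_proxies(proxy_ips, banned_ip, prefix_length=24):
    """Return the proxies left after the block around banned_ip is banned."""
    banned_block = ipaddress.ip_network(f"{banned_ip}/{prefix_length}", strict=False)
    return [ip for ip in proxy_ips if ipaddress.ip_address(ip) not in banned_block]

# 50 IPs packed into one /24 versus 50 IPs spread over 50 different /24s.
concentrated = [f"45.67.89.{host}" for host in range(10, 60)]
diverse = [f"45.67.{third_octet}.10" for third_octet in range(89, 139)]

print(len(surviving_proxies(concentrated, "45.67.89.10")))  # 0
print(len(surviving_proxies(diverse, "45.67.89.10")))       # 49
```

Passing `prefix_length=16` models a harsher /16-level ban; in this synthetic example every IP in both pools shares 45.67.x.x, so neither pool would survive it.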

Real-World Impact

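| Subnet Diversity         | IPs Banned per Block Ban | Operational Impact     |
|--------------------------|--------------------------|------------------------|
| 100 IPs from 1 subnet    | All 100                  | Complete shutdown      |
| 100 IPs from 10 subnets  | ~10 per ban              | Significant disruption |
| 100 IPs from 50 subnets  | ~2 per ban               | Minor disruption       |
| 100 IPs from 100 subnets | 1 per ban                | Minimal impact         |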

How Websites Detect Poor Subnet Diversity

Pattern Analysis

Anti-bot systems analyze the source IPs of incoming requests for subnet patterns:

  1. Concentration detection: Multiple requests from the same /24 within a short window
  2. Sequential IP patterns: Requests from 10.20.30.1, 10.20.30.2, 10.20.30.3 are obviously from the same block
  3. ASN correlation: All IPs from the same ASN, regardless of subnet, may be flagged
  4. Behavioral similarity: IPs from the same subnet exhibiting identical request patterns
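Seen from the defender's side, concentration detection can be pictured as a sliding-window counter keyed by /24. The sketch below is only an illustration of that idea: the class name, the window length, and the request threshold are all assumptions, since real anti-bot systems do not publish their rules.

```python
from collections import defaultdict, deque

class SubnetConcentrationDetector:
    """Flags a /24 once it sends too many requests inside a time window."""

    def __init__(self, window_seconds=60.0, max_requests=100):
        # Both limits are made-up values for illustration.
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        self.hits = defaultdict(deque)  # /24 prefix -> recent timestamps

    def record(self, ip, timestamp):
        """Record one request; return True if the IP's /24 exceeds the limit.

        Timestamps are assumed to arrive in non-decreasing order.
        """
        subnet = ".".join(ip.split(".")[:3])
        window = self.hits[subnet]
        window.append(timestamp)
        # Drop requests that have fallen out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_requests
```

Because the counter is keyed by subnet rather than by address, rotating through 10.20.30.1, 10.20.30.2, and 10.20.30.3 does nothing to stay under the limit.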

Subnet Fingerprinting

Advanced anti-bot systems go beyond simple /24 analysis:

  • /16 analysis: Check if many IPs share the same first two octets
  • ASN grouping: Correlate IPs by autonomous system number
  • Hosting provider detection: Identify known datacenter IP ranges
  • BGP prefix analysis: Check which network announces the IP block
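ASN grouping and BGP prefix analysis both come down to asking which announced prefix covers an IP and who originates it. The sketch below shows a longest-prefix match against a tiny hand-written table. The table is entirely hypothetical (documentation address ranges and private-use ASNs); a real system would load routing data from an external source.

```python
import ipaddress
from collections import Counter

# Hypothetical announced prefixes mapped to origin ASNs (illustration only).
ANNOUNCED_PREFIXES = {
    "198.51.100.0/24": 64512,
    "203.0.113.0/24": 64512,
    "192.0.2.0/24": 64513,
}

def origin_asn(ip, table=ANNOUNCED_PREFIXES):
    """Return the ASN of the most specific prefix covering ip, or None."""
    address = ipaddress.ip_address(ip)
    best = None  # (prefix length, asn) of the best match so far
    for cidr, asn in table.items():
        network = ipaddress.ip_network(cidr)
        if address in network and (best is None or network.prefixlen > best[0]):
            best = (network.prefixlen, asn)
    return best[1] if best else None

def asn_counts(ips):
    """Count how many IPs map to each origin ASN."""
    return Counter(origin_asn(ip) for ip in ips)

print(asn_counts(["198.51.100.7", "203.0.113.9", "192.0.2.1"]))
```

Here the first two IPs sit in different /24 blocks yet resolve to the same ASN, which is exactly the correlation that /24 diversity alone does not hide.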

Measuring Subnet Diversity

Key Metrics

from collections import Counter

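def analyze_subnet_diversity(ip_list):
    """Analyze the subnet diversity of a list of IPv4 address strings"""
    # Guard against an empty list, which would otherwise crash on the
    # most_common lookup and the ratio divisions below
    if not ip_list:
        raise ValueError("ip_list must contain at least one IP address")
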
    # Extract /24 subnets
    subnets_24 = ['.'.join(ip.split('.')[:3]) for ip in ip_list]
    # Extract /16 subnets
    subnets_16 = ['.'.join(ip.split('.')[:2]) for ip in ip_list]

    unique_24 = len(set(subnets_24))
    unique_16 = len(set(subnets_16))

    subnet_counts = Counter(subnets_24)
    max_concentration = subnet_counts.most_common(1)[0][1]

    print(f"Total IPs: {len(ip_list)}")
    print(f"Unique /24 subnets: {unique_24}")
    print(f"Unique /16 subnets: {unique_16}")
    print(f"Diversity ratio (/24): {unique_24/len(ip_list)*100:.1f}%")
    print(f"Max IPs in single /24: {max_concentration}")
    print(f"Average IPs per /24: {len(ip_list)/unique_24:.1f}")

    return {
        "total_ips": len(ip_list),
        "unique_24": unique_24,
        "unique_16": unique_16,
        "diversity_ratio": unique_24 / len(ip_list),
        "max_concentration": max_concentration
    }

# Example usage
proxy_ips = ["45.67.89.10", "45.67.89.11", "123.45.67.1",
             "98.76.54.32", "98.76.54.33", "200.100.50.1"]
analyze_subnet_diversity(proxy_ips)

What Good Diversity Looks Like

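| Metric              | Poor  | Acceptable | Excellent |
|---------------------|-------|------------|-----------|
| /24 diversity ratio | < 20% | 40-70%     | > 80%     |
| Max IPs per /24     | > 20  | 5-10       | 1-3       |
| /16 diversity ratio | < 10% | 20-50%     | > 60%     |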

A diversity ratio of 100% means every IP is from a different /24 subnet — the ideal scenario.

Subnet Diversity by Proxy Type

Datacenter Proxies

Datacenter proxies face the biggest subnet diversity challenges:

  • Problem: Datacenter IPs are often allocated in large contiguous blocks
  • Typical concentration: 10-50 IPs per /24 subnet
  • Solution: Premium datacenter proxy providers acquire IPs from multiple allocations across different hosting providers

When purchasing datacenter proxies, always ask about subnet distribution. A provider offering “1,000 IPs from 500+ subnets” is far more valuable than “1,000 IPs from 10 subnets.”

Residential Proxies

Residential proxies naturally provide excellent subnet diversity:

  • Why: IPs come from millions of individual home users across thousands of ISPs
  • Typical diversity: Near 100% unique /24 subnets
  • Advantage: Even without actively managing subnet diversity, residential IPs are inherently distributed

Mobile Proxies

Mobile proxies have good diversity at the IP level but limited ASN diversity:

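  • IP diversity: High. Carriers place subscribers behind CGNAT, so a device's public IP is drawn from a shared pool and changes frequently
  • ASN diversity: Low. All IPs come from a small number of mobile carrier ASNs
  • Trade-off: Websites expect heavy traffic from mobile carrier ASNs, and each carrier IP is shared by many real users, so ASN concentration is acceptable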

Best Practices for Maximizing Subnet Diversity

When Buying Datacenter Proxies

  1. Request subnet distribution data before purchasing
  2. Prioritize providers with diverse sourcing: IPs from multiple data centers across regions
  3. Mix providers: Use IPs from 2-3 different datacenter proxy providers
  4. Avoid sequential IPs: Request randomized IP assignment, not sequential blocks

When Configuring Rotation

  1. Rotate across subnets, not just IPs: Ensure consecutive requests come from different /24 blocks
  2. Subnet-aware cooldowns: After a ban, cool down the entire /24, not just the single IP
  3. Balance subnet usage: Avoid sending disproportionate traffic through any one subnet
  4. Monitor per-subnet success rates: Track which subnets perform best on your target sites
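Points 2 and 4 can be combined in one small bookkeeping class: it records outcomes per /24 and, when a ban is observed, rests the whole block rather than the single IP. This is a sketch under stated assumptions; the class name, the 600-second default cooldown, and the injectable `clock` parameter are all illustrative choices.

```python
import time
from collections import defaultdict

class SubnetHealth:
    """Tracks per-/24 success rates and subnet-wide cooldowns."""

    def __init__(self, cooldown_seconds=600.0, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds  # assumed default
        self.clock = clock  # injectable so the logic can be tested
        self.cooling_until = {}
        self.stats = defaultdict(lambda: {"ok": 0, "fail": 0})

    @staticmethod
    def subnet_of(ip):
        return ".".join(ip.split(".")[:3])

    def record(self, ip, success, banned=False):
        """Record a request outcome; a ban cools down the entire /24."""
        subnet = self.subnet_of(ip)
        self.stats[subnet]["ok" if success else "fail"] += 1
        if banned:
            self.cooling_until[subnet] = self.clock() + self.cooldown_seconds

    def is_available(self, ip):
        """True if the IP's /24 is not currently cooling down."""
        deadline = self.cooling_until.get(self.subnet_of(ip), float("-inf"))
        return self.clock() >= deadline

    def success_rate(self, ip):
        """Success rate for the IP's /24, or None if nothing was recorded."""
        counts = self.stats[self.subnet_of(ip)]
        total = counts["ok"] + counts["fail"]
        return counts["ok"] / total if total else None
```

A rotator would call `is_available` before handing out a proxy and `record` after each response, so that a ban on 45.67.89.10 also rests 45.67.89.11 until the cooldown expires.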

Subnet-Aware Rotation Example

import random
from collections import defaultdict

class SubnetAwareRotator:
    def __init__(self, proxies):
        self.subnet_groups = defaultdict(list)
        for proxy in proxies:
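            # Extract the IP from the proxy URL: strip an optional scheme
            # and optional user:pass@ credentials, then drop the port
            host = proxy.split('://')[-1].rsplit('@', 1)[-1]
            ip = host.split(':')[0]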
            subnet = '.'.join(ip.split('.')[:3])
            self.subnet_groups[subnet].append(proxy)

        self.last_subnet = None

    def get_proxy(self):
        """Get a proxy from a different subnet than the last request"""
        available_subnets = [
            s for s in self.subnet_groups.keys()
            if s != self.last_subnet
        ]

        if not available_subnets:
            available_subnets = list(self.subnet_groups.keys())

        subnet = random.choice(available_subnets)
        proxy = random.choice(self.subnet_groups[subnet])
        self.last_subnet = subnet

        return proxy

Frequently Asked Questions

What is the ideal subnet diversity ratio?

For datacenter proxies, aim for at least 50% diversity (one unique /24 per two IPs) and ideally 80%+. For residential proxies, diversity is typically 90%+ by default. The more sensitive your target websites, the higher your subnet diversity should be.

Can websites ban entire /16 subnets?

Yes, but it is less common because /16 blocks contain 65,536 IPs and banning them would likely block legitimate users. However, aggressive anti-bot systems do apply /16-level rate limiting and increased scrutiny. This is why /16 diversity also matters.

Do residential proxies need subnet diversity management?

Generally no. Residential proxies inherently come from diverse subnets because they originate from millions of individual home connections across thousands of ISPs and networks. This natural diversity is one of the key advantages of residential proxies over datacenter proxies.

How do I check the subnet diversity of my proxy provider?

Request a sample IP list from your provider and analyze the /24 distribution. Alternatively, run 100-500 requests through their gateway, log the IPs, and calculate the diversity ratio. Any provider unable or unwilling to share subnet diversity data may have poor diversity.

Is subnet diversity the same as ASN diversity?

No. Subnet diversity refers to distribution across /24 IP blocks, while ASN diversity refers to distribution across autonomous systems (network organizations). A single ISP such as Comcast (AS7922) can announce thousands of /24 subnets, so IPs from many different /24 blocks may still share one ASN. Ideally, you want both subnet and ASN diversity for maximum protection.

