How to Scrape E-Commerce Prices Without Getting Blocked (Shopee, Lazada, Amazon)

If you’re running a price monitoring operation, competitive analysis, or market research project, scraping e-commerce platforms is essential. But platforms like Shopee, Lazada, and Amazon have sophisticated anti-bot systems specifically designed to block scrapers. This guide walks you through the technical realities: why they block you, which proxy type works for each platform, and how to structure your scraping workflow for consistent success.

Why E-Commerce Platforms Block Scrapers

Before we get to solutions, understand the defense mechanisms you’re up against.

JavaScript Rendering Requirements

Shopee and Lazada don’t serve their product catalogs as static HTML, and Amazon loads parts of its pages dynamically as well. They use client-side JavaScript to render prices, inventory status, and product details, so a basic HTTP request often returns a skeleton page with minimal data. You need a headless browser or JavaScript rendering engine to extract the actual content.

Cloudflare and WAF Protection

Shopee and Lazada sit behind Cloudflare or similar Web Application Firewalls. These systems don’t just block based on IP reputation—they analyze request patterns. If you send 500 requests per second from the same IP, Cloudflare will challenge you with a CAPTCHA or block you outright, regardless of proxy quality.

CAPTCHA Challenges

Amazon is aggressive with CAPTCHA challenges. A single scraper IP requesting product pages at human speeds can trigger a CAPTCHA wall. Rotating residential IPs helps, but behavioral patterns matter too: pausing between requests, varying request intervals, and matching typical user browsing patterns all reduce challenge rates.

Device Fingerprinting

Platforms track browser fingerprints: User-Agent, TLS fingerprint, HTTP/2 header order, and WebGL capabilities. If your scraper has a consistent fingerprint across thousands of requests, it stands out. Mobile proxies are harder to fingerprint at scale, which is why they’re premium.

Rate Limiting and Behavioral Analysis

E-commerce sites enforce per-IP rate limits (requests/minute) and per-account limits (purchases, logins, page views). They also track behavioral patterns: a real user browses product pages, adds to cart, removes items, and makes purchases at varying rates. A scraper hitting 100 product pages in 30 seconds triggers immediate blocks.

The Proxy Ladder: Datacenter, Residential, Mobile

Not all proxies are equal. Choose the wrong type, and you’ll burn through IPs while getting zero data. Here’s the practical hierarchy. For a deeper comparison across all proxy types, see our guide on residential vs. datacenter vs. mobile proxies.

Datacenter Proxies: Fast, Cheap, Easily Blocked

Cost: $0.50–$2 per GB of traffic

Best for: Static HTML scraping, low-security targets, testing

Datacenter proxies are hosted on cloud servers (AWS, Azure, etc.). They’re fast and cheap, but e-commerce platforms know datacenter IP ranges and prioritize them for blocking.

Use datacenter proxies only if:

  • You’re scraping a single product page once to test your setup
  • You’re targeting platforms with weak anti-bot (unlikely for Shopee, Lazada, Amazon)
  • You need speed for non-sensitive operations

Don’t rely on them for persistent scraping of Shopee, Lazada, or Amazon. You’ll exhaust your IP pool in days.

Residential Proxies: Mid-Tier, Good for Low-Volume Scraping

Cost: $5–$15 per GB of traffic

Best for: Price monitoring, moderate-volume scraping, session-based operations

Residential proxies route traffic through real ISP-assigned IPs. To the platform, they look identical to home internet users. They’re harder to fingerprint and trigger fewer CAPTCHAs than datacenter IPs.

Use residential proxies when:

  • Scraping 100–500 product pages per day
  • Monitoring competitor prices weekly or monthly
  • You need geographic coverage (Singapore, Malaysia, Thailand for Shopee/Lazada)
  • You can accept slight latency in exchange for reliability

Residential IPs rotate less frequently (every 10–60 minutes) than datacenter IPs, so session management becomes critical. See the workflow section below.

Mobile Proxies: Highest Trust, Best for Persistent Scraping

Cost: $15–$40+ per GB of traffic

Best for: High-volume scraping, account monitoring, seller analytics

Mobile proxies route through actual mobile devices or mobile carrier networks. Carriers share each public IP among many subscribers (carrier-grade NAT), so blocking one mobile IP risks blocking many legitimate shoppers, and fingerprinting individual devices at scale is much harder than fingerprinting desktop browsers. E-commerce platforms therefore tend to treat mobile traffic as lower-risk and enforce lighter rate limits.

Use mobile proxies when:

  • Scraping thousands of products daily
  • Monitoring multiple seller accounts
  • Operating for extended periods without IP rotation
  • Running multi-account operations (see related reading)

The premium cost is justified if you’re running serious price monitoring or competitive analysis. At the prices listed above, mobile costs roughly $10–$25 more per GB than residential, but a banned residential IP can cost you more than that in lost data.

Scraping Shopee: Heavy JavaScript, Cloudflare, Session-Critical

Shopee is one of the tougher targets. It’s heavy on client-side rendering and sits behind Cloudflare.

Shopee’s Anti-Scraping Stack

  • Rendering: Product pages require JavaScript execution. Prices and product metadata are in the DOM after JS runs.
  • Cloudflare Protection: Shopee uses Cloudflare’s Bot Management. Suspicious requests get challenged with a JavaScript puzzle.
  • Session Tracking: Shopee tracks session tokens in cookies. Requests without valid session cookies are deprioritized.
  • Geo-Blocking: Regional variations (SG, MY, TH) have different rate limits.

Practical Approach for Shopee

1. Use a headless browser (Puppeteer, Selenium, Playwright)

Never attempt to scrape Shopee with static HTTP requests. Use a headless browser:

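from playwright.async_api import async_playwright

async def scrape_shopee(product_url, proxy):
    async with async_playwright() as p:
        browser = await p.chromium.launch(proxy={"server": proxy})
        try:
            page = await browser.new_page()
            await page.goto(product_url, wait_until="networkidle")
            # '.product-price' is a placeholder; inspect the live page for the real selector
            price = await page.query_selector('.product-price')
            # text_content() is a coroutine in the async API, so it must be awaited
            text = await price.text_content() if price else None
            print(text)
            return text
        finally:
            await browser.close()  # Always release the browser, even on errors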

2. Rotate residential or mobile IPs

A single Shopee session can handle 20–50 product pages before needing rotation. Rotate IPs every 30–50 requests:

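proxies = load_proxy_list()  # Residential or mobile (load_proxy_list is your own helper)

async def scrape_all(product_list):
    for i, product_url in enumerate(product_list):
        # Keep the same proxy for 30 consecutive requests, then move to the next one
        proxy = proxies[(i // 30) % len(proxies)]
        # scrape_shopee opens a fresh browser per call, so no cookies carry over
        await scrape_shopee(product_url, proxy)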

3. Maintain realistic request delays

Insert delays between requests. A real user browses for 5–15 seconds per product page:

import random
import asyncio

delay = random.uniform(5, 15)  # 5–15 second random delay
await asyncio.sleep(delay)

4. Set proper headers

Match realistic browser headers:

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Referer": "https://shopee.sg/",
    "Connection": "keep-alive"
}

Shopee Result: With residential or mobile proxies and proper session management, you can sustain 200–500 product pages per day per account without blocks.

Scraping Lazada: Alibaba-Backed Anti-Bot, Regional Rate Limits

Lazada (owned by Alibaba) has similar defenses to Shopee but with different rate-limiting rules per region.

Lazada’s Anti-Scraping Stack

  • JavaScript Rendering: Like Shopee, product pages render client-side.
  • Per-Region Rate Limits: Thailand has stricter limits than Malaysia. Singapore has different thresholds.
  • Account-Level Blocks: Repeated scraping from the same account triggers account suspension.
  • Behavioral Flags: Too many rapid product views flag an account as bot.

Practical Approach for Lazada

1. Separate accounts by region

Create region-specific accounts (Lazada TH, MY, SG):

regions = {
    "TH": "https://www.lazada.co.th",
    "MY": "https://www.lazada.com.my",
    "SG": "https://www.lazada.sg"
}

2. Use mobile proxies with sticky sessions

Lazada heavily monitors account-level activity. Use mobile proxies with sticky sessions (the same IP for 20–30 minutes):

# Sticky session: same proxy for 30 requests, then rotate
# (run inside an async function; scrape_lazada is your own helper)
for i, product_url in enumerate(product_list):
    session_index = i // 30
    proxy = mobile_proxies[session_index % len(mobile_proxies)]
    await scrape_lazada(product_url, proxy)

Learn more about sticky sessions and why they matter in mobile proxy testing to optimize your scraping workflow.
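
The loop above rotates by request count only. If you also want the time limit described in this step, one option is a small helper that holds a proxy until either a time limit or a request limit is hit. This is a minimal sketch, not a provider API: the class name StickySession, its parameters, and the injectable clock are all illustrative assumptions.

```python
import time
from itertools import cycle


class StickySession:
    """Hold one proxy until a time limit or a request limit is reached."""

    def __init__(self, proxy_pool, ttl_seconds=1800, max_requests=30,
                 clock=time.monotonic):
        self._pool = cycle(proxy_pool)   # round-robin over the pool
        self.ttl_seconds = ttl_seconds
        self.max_requests = max_requests
        self._clock = clock              # injectable so the logic can be tested
        self._proxy = None
        self._started = 0.0
        self._used = 0

    def get_proxy(self):
        now = self._clock()
        expired = (
            self._proxy is None
            or now - self._started >= self.ttl_seconds
            or self._used >= self.max_requests
        )
        if expired:
            # Start a new sticky window on the next proxy in the pool
            self._proxy = next(self._pool)
            self._started = now
            self._used = 0
        self._used += 1
        return self._proxy
```

In the Lazada loop you would create one StickySession per account and call get_proxy() before each request, so the account keeps a stable IP inside each window.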

3. Respect regional rate limits

  • Thailand: 100 requests/hour
  • Malaysia: 150 requests/hour
  • Singapore: 120 requests/hour

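rate_limits = {
    "TH": 100,
    "MY": 150,
    "SG": 120
}  # requests per hour

# Seconds to wait between requests in each region
delays = {region: 3600 / limit for region, limit in rate_limits.items()}
# delays["TH"] is 36.0, delays["MY"] is 24.0, delays["SG"] is 30.0
# Sleep for delays[region] between requests to that region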

4. Implement login and account rotation

Alternate between multiple accounts to avoid per-account blocks:

accounts = [
    {"username": "account1@mail.com", "password": "..."},
    {"username": "account2@mail.com", "password": "..."},
]

# Run inside an async function; login_and_scrape is your own helper
for i, product_url in enumerate(product_list):
    # Keep one account and its paired proxy for 50 requests, then switch
    slot = (i // 50) % len(accounts)
    account = accounts[slot]
    proxy = mobile_proxies[slot % len(mobile_proxies)]
    await login_and_scrape(account, product_url, proxy)

Lazada Result: With mobile proxies, region-aware rate limiting, and account rotation, sustain 300–600 product pages per day across regions without blocks.

Scraping Amazon: CAPTCHA, Device Fingerprinting, Seller Accounts

Amazon is the hardest target. It combines sophisticated CAPTCHA systems, device fingerprinting, and account-level tracking.

Amazon’s Anti-Scraping Stack

  • Aggressive CAPTCHA: Amazon challenges suspicious IPs with CAPTCHAs on first request.
  • Device Fingerprinting: TLS fingerprint, HTTP/2 header order, WebGL data all matter.
  • Account Suspension: Even residential proxies can trigger account-level suspensions after 500–1,000 requests.
  • Seller Account Monitoring: Amazon tracks seller login patterns and geographic inconsistencies.

Practical Approach for Amazon

1. Start with mobile proxies—residential won’t cut it

  • Datacenter proxies: 90% CAPTCHA rate
  • Residential proxies: 40–60% CAPTCHA rate
  • Mobile proxies: 10–20% CAPTCHA rate

# Mobile proxy non-negotiable for Amazon
proxy = select_mobile_proxy()
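
To see what those CAPTCHA rates mean for traffic, here is a rough calculation of how many requests each successful page load costs. It assumes every request is challenged independently with the quoted probability, which is a simplification (real challenge rates depend on session history), and the function name expected_attempts is illustrative.

```python
def expected_attempts(captcha_rate):
    """Average requests needed per successful page if each request is
    independently challenged with probability captcha_rate."""
    if not 0 <= captcha_rate < 1:
        raise ValueError("captcha_rate must be in [0, 1)")
    return 1 / (1 - captcha_rate)


# Midpoints of the CAPTCHA rate ranges quoted above
rates = {"datacenter": 0.90, "residential": 0.50, "mobile": 0.15}
for proxy_type, rate in rates.items():
    print(f"{proxy_type}: {expected_attempts(rate):.2f} requests per page")
```

Under that assumption, datacenter IPs need about ten requests per usable page, residential about two, and mobile about 1.2, which is where much of the wasted traffic on cheaper proxies comes from.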

2. Rotate IPs more frequently

Amazon enforces per-IP request limits at ~50–100 requests before CAPTCHA. Rotate every 40 requests:

for i, product_url in enumerate(product_list):
    if i % 40 == 0:
        proxy = next_mobile_proxy()  # Rotate IP

    await scrape_amazon(product_url, proxy)
    await asyncio.sleep(random.uniform(8, 12))

3. Mimic realistic browser behavior

Add realistic user actions:

# User-like behavior: page load, scroll, then scrape
await page.goto(product_url, wait_until="networkidle")
await page.mouse.move(random.randint(0, 1920), random.randint(0, 1080))
await page.mouse.wheel(0, random.randint(100, 500))  # Scroll
await asyncio.sleep(random.uniform(3, 7))
# Now extract data

4. For seller account monitoring, use dedicated residential/mobile proxies

Seller accounts need consistent IP/location pairs:

# Seller account: keep the same IP for 500 requests, then rotate within the same region
seller_proxies = {
    "account1": [proxy1_Singapore, proxy2_Singapore, proxy3_Singapore],
    "account2": [proxy4_Malaysia, proxy5_Malaysia, proxy6_Malaysia],
}

# Track proxy position and request count separately for each seller account
state = {seller_id: {"index": 0, "count": 0} for seller_id in seller_proxies}

# seller_product_list is assumed to hold (seller_id, product_url) pairs
for seller_id, product_url in seller_product_list:
    s = state[seller_id]
    proxy = seller_proxies[seller_id][s["index"]]

    await login_seller_account(seller_id, proxy)
    await scrape_amazon(product_url, proxy)

    s["count"] += 1
    if s["count"] >= 500:
        s["index"] = (s["index"] + 1) % len(seller_proxies[seller_id])
        s["count"] = 0
        await asyncio.sleep(300)  # Cool-down before next proxy

Amazon Result: With mobile proxies, IP rotation every 40 requests, and realistic delays, sustain 400–800 product pages per day. Seller accounts: 150–300 per account per day without suspension.

Real Scraping Workflow: From Proxy Selection to Data

Here’s a complete workflow for price monitoring across all three platforms. For region-specific strategies, see our guide on Singapore mobile proxy use cases and best setup.

Step 1: Choose Your Proxy Type

Use Case | Recommendation | Reason
Testing setup | Datacenter | Cheap, fast, acceptable to fail
Weekly price check | Residential | Sufficient for low-volume, cost-effective
Daily monitoring 100+ products | Residential + Mobile mix | Residentials for base load, mobile for high-volume
1,000+ products daily | Mobile | Avoid blocks, sustain consistent data collection
Multi-account seller monitoring | Mobile (dedicated) | Account-level consistency critical
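
If you want this decision in code, for example to pick a proxy pool per job, the table can be sketched as a small function. The function name and arguments are illustrative, and treating anything under 100 products per day as the "weekly price check" tier is an assumption, since the table gives no lower bound.

```python
def recommend_proxy(products_per_day, multi_account=False, testing=False):
    """Map a workload to a proxy type, following the table above."""
    if testing:
        return "datacenter"
    if multi_account:
        return "mobile (dedicated)"
    if products_per_day >= 1000:
        return "mobile"
    if products_per_day >= 100:
        return "residential + mobile mix"
    # Assumption: low-volume jobs map to the residential tier
    return "residential"


print(recommend_proxy(50))
print(recommend_proxy(250))
print(recommend_proxy(5000))
```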

Step 2: Configure Request Headers and Browser Settings

import random

def get_realistic_headers():
    user_agents = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
    ]

    return {
        "User-Agent": random.choice(user_agents),
        "Accept-Language": "en-US,en;q=0.9",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate, br",
        "Connection": "keep-alive",
        "Upgrade-Insecure-Requests": "1",
        "Sec-Fetch-Dest": "document",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Site": "none",
    }

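def launch_browser_with_proxy(proxy):
    # Returns launch options only, e.g. for p.chromium.launch(**options);
    # it does not start a browser itself
    return {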
        "headless": True,
        "proxy": {"server": proxy},
        "args": [
            "--disable-blink-features=AutomationControlled",
            "--disable-web-resources",
        ]
    }

Step 3: Implement Rate Limiting and Delays

import asyncio
import random
from datetime import datetime

class RateLimiter:
    def __init__(self, requests_per_hour):
        self.requests_per_hour = requests_per_hour
        self.delay = 3600 / requests_per_hour
        self.last_request = datetime.now()

    async def wait(self):
        elapsed = (datetime.now() - self.last_request).total_seconds()
        if elapsed < self.delay:
            await asyncio.sleep(self.delay - elapsed)
        self.last_request = datetime.now()

# Usage
limiter = RateLimiter(requests_per_hour=150)
for product_url in product_list:
    await limiter.wait()
    await scrape(product_url)

Step 4: Manage Sessions (When to Rotate, When to Stick)

import random

class SessionManager:
    def __init__(self, rotation_interval=30):
        self.rotation_interval = rotation_interval
        self.request_count = 0
        self.current_proxy = None

    def should_rotate(self):
        self.request_count += 1
        if self.request_count >= self.rotation_interval:
            self.request_count = 0
            return True
        return False

    def get_proxy(self, proxy_pool):
        # Pick a proxy on the first call (otherwise the first requests would
        # run with proxy=None), then again after every rotation_interval requests
        if self.current_proxy is None or self.should_rotate():
            self.current_proxy = random.choice(proxy_pool)
        return self.current_proxy

# Usage
session_mgr = SessionManager(rotation_interval=30)
for product_url in product_list:
    proxy = session_mgr.get_proxy(proxy_pool)
    await scrape(product_url, proxy)

Step 5: Handle Errors and Retries

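import asyncio
import random

class CaptchaException(Exception):
    """Raise this from scrape() when a CAPTCHA page is detected."""

class RateLimitException(Exception):
    """Raise this from scrape() when the site signals rate limiting."""

async def scrape_with_retry(url, proxy, max_retries=3):
    for attempt in range(max_retries):
        try:
            return await scrape(url, proxy)
        except CaptchaException:
            # Rotate IP and retry (get_next_proxy is your own pool helper)
            proxy = get_next_proxy()
            await asyncio.sleep(60)  # Cool-down
        except RateLimitException:
            # Hit rate limit, wait longer
            await asyncio.sleep(300)
        except ConnectionError:
            # Network issue, retry
            await asyncio.sleep(random.uniform(5, 10))
        except Exception:
            # Unknown failure: give up on the last attempt, otherwise back off
            if attempt == max_retries - 1:
                return None
            await asyncio.sleep(random.uniform(10, 20))

    return None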
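
The retry function above uses fixed waits. A common alternative, not something the workflow above requires, is exponential backoff with jitter, so repeated failures wait progressively longer and retries do not line up on a fixed schedule. The sketch below is an assumption-laden example: the name backoff_delay and the defaults (5 s base, 300 s cap, chosen to match the shortest and longest waits above) are illustrative.

```python
import random


def backoff_delay(attempt, base=5.0, cap=300.0, rng=random):
    """Return a randomized delay in seconds for a zero-based retry attempt.

    The upper bound doubles with each attempt up to `cap`; the delay is
    drawn from the top half of that range so it never collapses to zero.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(ceiling / 2, ceiling)


for attempt in range(5):
    print(f"attempt {attempt}: wait {backoff_delay(attempt):.1f}s")
```

You could replace a fixed sleep with await asyncio.sleep(backoff_delay(attempt)) in any of the except branches.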

Cost-Benefit Analysis: Datacenter vs. Residential vs. Mobile

Let’s get practical about cost.

Datacenter Proxies

  • Cost: $0.50–$2 per GB
  • Blocked after: 50–100 requests (Shopee/Lazada), 20–50 (Amazon)
  • Effective cost per 1,000 requests: $50–$200
  • When justified: Testing, static HTML targets, budget-constrained pilots

Residential Proxies

  • Cost: $5–$15 per GB
  • Blocked after: 200–500 requests (Shopee/Lazada), 100–200 (Amazon)
  • Effective cost per 1,000 requests: $10–$20
  • When justified: Weekly/monthly monitoring, <500 products/day, price-sensitive operations

Mobile Proxies

  • Cost: $15–$40+ per GB
  • Blocked after: 1,000–3,000 requests (Shopee), 800–2,000 (Lazada), 400–800 (Amazon)
  • Effective cost per 1,000 requests: $5–$15
  • When justified: Daily monitoring, 500+ products/day, high-volume, account safety critical

Real Example: Scraping 1,000 Amazon product pages per day

  • Datacenter: 40–50 IPs needed, ~$100–200/day in blocked traffic
  • Residential: 10–15 IPs, ~$20–30/day
  • Mobile: 2–5 IPs, ~$8–15/day

Over a year, mobile proxies save roughly $2,500–$5,000 compared with residential despite the higher per-GB cost, because you rotate less and waste less traffic on blocked IPs.
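
The yearly figure can be sanity-checked from the daily ranges in the example. This sketch assumes the comparison is mobile versus residential and simply multiplies the daily difference by 365; the dictionary and function names are illustrative.

```python
# Daily cost ranges in USD from the example above: (low, high)
daily_cost = {
    "datacenter": (100, 200),
    "residential": (20, 30),
    "mobile": (8, 15),
}


def annual_savings(baseline, alternative, days=365):
    """Return (smallest, largest) yearly saving from switching proxy types."""
    base_low, base_high = daily_cost[baseline]
    alt_low, alt_high = daily_cost[alternative]
    return ((base_low - alt_high) * days, (base_high - alt_low) * days)


print(annual_savings("residential", "mobile"))  # (1825, 8030)
print(annual_savings("datacenter", "mobile"))   # (31025, 70080)
```

The quoted $2,500–$5,000 sits inside the residential-versus-mobile band, while the gap to datacenter proxies is an order of magnitude larger.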

Frequently Asked Questions

Q: Will I get my account banned if I scrape Amazon?

A: Account suspension is possible after 1,000+ requests from suspicious IPs. Mitigate this by rotating IPs every 40 requests, using mobile proxies, and spreading requests across multiple accounts. If you’re running a legitimate price monitoring service, consider using the Amazon Product Advertising API instead.

Q: Can I scrape Shopee/Lazada data for commercial resale?

A: Most e-commerce platforms’ terms of service prohibit scraping. Check local jurisdictions: some countries allow it for research/personal use, others don’t. Consult legal counsel before building a commercial scraping operation.

Q: What’s the difference between sticky sessions and rotating proxies?

A: Sticky sessions keep you on the same IP for 15–60 minutes, then rotate. Rotating proxies switch IPs after every request or every N requests. Sticky sessions are better for session-based operations (login, cart actions); rotating proxies are better for evasion (avoiding per-IP rate limits). See our full guide on sticky sessions and why they matter in mobile proxy testing.

Q: How do I avoid CAPTCHA challenges?

A: CAPTCHA frequency correlates with proxy trust. Going by the Amazon rates quoted earlier (about 90% for datacenter, 40–60% for residential, 10–20% for mobile), mobile proxies see roughly 2–6x fewer CAPTCHAs than residential, which in turn see about 1.5–2x fewer than datacenter. Realistic behavioral patterns (delays, scrolling, varied request intervals) reduce challenges further. CAPTCHA-solving services exist but are expensive and often detected. It is usually better to rotate IPs or move up to a more trusted proxy type.

Q: Can I use the same proxy for multiple accounts?

A: Not safely. E-commerce platforms flag multiple logins from the same IP. Assign different proxies to different accounts. Learn more about secure multi-account setup and best practices.

Q: How often should I rotate IPs for Lazada?

A: Rotate every 30–50 requests or every 20–30 minutes. Lazada’s per-IP rate limit is ~100–150 requests/hour. Rotate before hitting the ceiling to avoid triggering rate-limit detection algorithms.


Last updated: February 28, 2026

Article type: Technical Guide / Informational

Difficulty level: Intermediate to Advanced
