Playwright Proxy Configuration: Step-by-Step Scraping Guide
Playwright has become the preferred browser automation tool for many scraping teams, and for good reason. It offers native support for per-context proxy configuration, built-in authentication handling, multi-browser support (Chromium, Firefox, WebKit), and a modern async API. Where Selenium requires workarounds for proxy authentication and Puppeteer lacks per-page proxy switching, Playwright handles both natively.
This guide covers Playwright proxy configuration from basic setup through advanced patterns including per-context routing, request interception, and production-ready scraper architectures. Examples are provided in both Python and Node.js.
Why Playwright for Proxy-Based Scraping
Playwright was built by the team that created Puppeteer, and it addresses many of Puppeteer’s limitations. For proxy-based scraping specifically, Playwright offers several advantages.
Native Proxy Authentication
Unlike Selenium and Puppeteer, Playwright supports proxy authentication as a first-class feature. No extensions, no third-party libraries, no local proxy forwarders — just pass credentials directly in the proxy configuration.
Browser Contexts with Independent Proxies
Playwright’s browser context model lets you create isolated browsing sessions within a single browser instance. Each context can have its own proxy, cookies, storage, and viewport. This means you can scrape multiple sites with different proxies simultaneously without launching multiple browsers.
Multi-Browser Support
Test your scraping setup across Chromium, Firefox, and WebKit from the same codebase. Different anti-bot systems may respond differently to different browsers, and Playwright lets you switch with a single parameter change.
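One way to exercise that single-parameter switch is to keep the launch arguments in one place and look the engine up by name. A minimal sketch, assuming placeholder proxy details (`proxy-host`, `your-username`); the helper names are illustrative, not Playwright API:

```python
def build_launch_args(proxy_server: str, username: str, password: str) -> dict:
    """Keyword arguments shared by every engine's launch() call."""
    return {
        'headless': True,
        'proxy': {'server': proxy_server, 'username': username, 'password': password},
    }

def fetch_with_engine(engine: str, url: str, launch_args: dict) -> str:
    """Run the same scrape on chromium, firefox, or webkit -- only `engine` changes."""
    from playwright.sync_api import sync_playwright  # lazy import keeps the module importable

    with sync_playwright() as p:
        browser = getattr(p, engine).launch(**launch_args)  # the one line that varies
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
        return html

args = build_launch_args('http://proxy-host:proxy-port', 'your-username', 'your-password')
# for engine in ('chromium', 'firefox', 'webkit'):
#     fetch_with_engine(engine, 'https://httpbin.org/ip', args)
```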
Basic Proxy Setup
Python
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(
        headless=True,
        proxy={
            'server': 'http://proxy-host:proxy-port',
            'username': 'your-username',
            'password': 'your-password'
        }
    )
    page = browser.new_page()
    page.goto('https://httpbin.org/ip')
    print(page.content())
    browser.close()
```

Node.js
```javascript
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({
    headless: true,
    proxy: {
      server: 'http://proxy-host:proxy-port',
      username: 'your-username',
      password: 'your-password'
    }
  });
  const page = await browser.newPage();
  await page.goto('https://httpbin.org/ip');
  console.log(await page.content());
  await browser.close();
})();
```

When you set the proxy at the browser level, all pages and contexts created from that browser instance use the same proxy. This is the simplest configuration and works well for single-proxy scraping tasks.
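Before a long crawl, it is worth confirming the exit IP the target will actually see. One way to sketch that check is against httpbin.org/ip, which echoes the caller's IP as JSON (the proxy details below are placeholders):

```python
import json

def parse_httpbin_ip(body: str) -> str:
    """Pull the 'origin' field out of an httpbin.org/ip JSON body."""
    return json.loads(body)['origin']

def current_exit_ip(proxy: dict) -> str:
    """Load httpbin.org/ip through the given proxy and return the IP it reports."""
    from playwright.sync_api import sync_playwright  # lazy import

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True, proxy=proxy)
        page = browser.new_page()
        response = page.goto('https://httpbin.org/ip')
        ip = parse_httpbin_ip(response.text())
        browser.close()
        return ip

# print(current_exit_ip({'server': 'http://proxy-host:proxy-port',
#                        'username': 'your-username', 'password': 'your-password'}))
```

If the printed IP matches your own machine's address rather than the proxy's, the proxy configuration is not being applied.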
SOCKS5 Proxy
```python
browser = p.chromium.launch(
    proxy={'server': 'socks5://proxy-host:proxy-port'}
)
```

Playwright handles SOCKS5 natively, including DNS resolution through the proxy. Note that Chromium does not support username/password authentication for SOCKS5 proxies, so use an IP-whitelisted SOCKS5 endpoint, or switch to an HTTP proxy when credentials are required.
Per-Context Proxy Configuration
This is where Playwright shines compared to other tools. You can assign different proxies to different browser contexts within a single browser instance.
Python Example
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Launch browser WITHOUT a proxy -- proxies are set per context
    browser = p.chromium.launch(headless=True)

    # Context 1: Singapore mobile proxy
    context_sg = browser.new_context(
        proxy={
            'server': 'http://sg-mobile-proxy:port',
            'username': 'user_sg',
            'password': 'pass_sg'
        },
        viewport={'width': 1920, 'height': 1080},
        user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
    )

    # Context 2: US mobile proxy
    context_us = browser.new_context(
        proxy={
            'server': 'http://us-mobile-proxy:port',
            'username': 'user_us',
            'password': 'pass_us'
        },
        viewport={'width': 1920, 'height': 1080},
        user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
    )

    # Scrape with different proxies simultaneously
    page_sg = context_sg.new_page()
    page_us = context_us.new_page()
    page_sg.goto('https://example.sg/products')
    page_us.goto('https://example.com/products')

    sg_content = page_sg.content()
    us_content = page_us.content()

    context_sg.close()
    context_us.close()
    browser.close()
```

Node.js Example
```javascript
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({ headless: true });

  // Create contexts with different proxies
  const contextSG = await browser.newContext({
    proxy: {
      server: 'http://sg-mobile-proxy:port',
      username: 'user_sg',
      password: 'pass_sg'
    }
  });
  const contextUS = await browser.newContext({
    proxy: {
      server: 'http://us-mobile-proxy:port',
      username: 'user_us',
      password: 'pass_us'
    }
  });

  const pageSG = await contextSG.newPage();
  const pageUS = await contextUS.newPage();
  await Promise.all([
    pageSG.goto('https://example.sg/products'),
    pageUS.goto('https://example.com/products')
  ]);
  // Extract data from both...

  await browser.close();
})();
```

Memory Advantage
Each browser context uses significantly less memory than a separate browser instance. A Chromium browser with 10 contexts uses roughly 500 MB, compared to 1.5-3 GB for 10 separate browser instances. For multi-account scraping operations that need distinct proxies per account, this is a major advantage.
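A sketch of that pattern: one shared browser, one lightweight context per account, each with its own proxy. The account dictionaries and helper names below are hypothetical, not Playwright API:

```python
def context_kwargs(account: dict) -> dict:
    """new_context() arguments for one account: its own proxy, locale, and viewport."""
    return {
        'proxy': account['proxy'],
        'locale': account.get('locale', 'en-US'),
        'viewport': {'width': 1920, 'height': 1080},
    }

def open_account_contexts(browser, accounts: list) -> dict:
    """One shared browser instance, one isolated context per account."""
    return {a['name']: browser.new_context(**context_kwargs(a)) for a in accounts}

# Usage sketch (proxy endpoints are placeholders):
# with sync_playwright() as p:
#     browser = p.chromium.launch(headless=True)
#     accounts = [
#         {'name': 'acct1', 'proxy': {'server': 'http://proxy1:port'}, 'locale': 'en-SG'},
#         {'name': 'acct2', 'proxy': {'server': 'http://proxy2:port'}},
#     ]
#     contexts = open_account_contexts(browser, accounts)
```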
Learn more about managing multiple accounts with proxies in our multi-account proxy guide.
Request Interception
Playwright’s route API lets you intercept, modify, or block requests. This is useful for reducing bandwidth, injecting headers, and debugging proxy issues.
Blocking Unnecessary Resources
```python
async def block_resources(route, request):
    blocked_types = ['image', 'stylesheet', 'font', 'media']
    if request.resource_type in blocked_types:
        await route.abort()
    else:
        await route.continue_()

page = await context.new_page()
await page.route('**/*', block_resources)
await page.goto('https://example.com')
```

Blocking images, fonts, and stylesheets can reduce bandwidth consumption by 50-70%, which directly reduces proxy costs when paying per GB.
Modifying Request Headers
```python
async def add_custom_headers(route, request):
    headers = {
        **request.headers,
        'Accept-Language': 'en-SG,en;q=0.9',
        'Accept-Encoding': 'gzip, deflate, br'
    }
    await route.continue_(headers=headers)

await page.route('**/*', add_custom_headers)
```

Intercepting API Responses
Sometimes the data you need is in API responses loaded by the page, not in the rendered HTML. Playwright can intercept these:
```python
from playwright.sync_api import sync_playwright
import json

api_data = []

def handle_response(response):
    if '/api/products' in response.url:
        try:
            data = response.json()
            api_data.append(data)
        except Exception:
            pass

with sync_playwright() as p:
    browser = p.chromium.launch(
        proxy={'server': 'http://proxy:port', 'username': 'user', 'password': 'pass'}
    )
    page = browser.new_page()
    page.on('response', handle_response)
    page.goto('https://example.com/products')
    page.wait_for_timeout(5000)  # Wait for API calls to complete
    print(json.dumps(api_data, indent=2))
    browser.close()
```

This technique is often more efficient than parsing rendered HTML, especially for SPAs that load data via JSON APIs.
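When you know which API call you are waiting for, Playwright's `expect_response` context manager avoids the fixed `wait_for_timeout` sleep by blocking until a matching response arrives. A minimal sketch; the endpoint path and URL are illustrative:

```python
def is_products_response(response) -> bool:
    """Predicate passed to expect_response: match the JSON endpoint we care about."""
    return '/api/products' in response.url and response.status == 200

def scrape_products_api(proxy: dict):
    from playwright.sync_api import sync_playwright  # lazy import

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True, proxy=proxy)
        page = browser.new_page()
        # Blocks until a response matching the predicate arrives (default 30 s timeout)
        with page.expect_response(is_products_response) as response_info:
            page.goto('https://example.com/products')
        data = response_info.value.json()
        browser.close()
        return data
```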
Proxy Rotation Strategies
Context-Based Rotation
Create a new context with a different proxy for each batch of requests:
```python
from playwright.sync_api import sync_playwright
import random

proxies = [
    {'server': 'http://proxy1:port', 'username': 'user1', 'password': 'pass1'},
    {'server': 'http://proxy2:port', 'username': 'user2', 'password': 'pass2'},
    {'server': 'http://proxy3:port', 'username': 'user3', 'password': 'pass3'},
]

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    urls = ['https://example.com/page1', 'https://example.com/page2', ...]

    context = None
    for i, url in enumerate(urls):
        if i % 5 == 0:  # New context every 5 URLs
            if context is not None:
                context.close()
            proxy = random.choice(proxies)
            context = browser.new_context(proxy=proxy)
        page = context.new_page()
        page.goto(url, wait_until='networkidle')
        content = page.content()
        # Process content...
        page.close()

    if context is not None:
        context.close()
    browser.close()
```

Rotating Gateway
With a rotating proxy gateway, the provider handles IP rotation server-side:
```python
browser = p.chromium.launch(
    proxy={
        'server': 'http://rotating-gateway.provider.com:port',
        'username': 'user',
        'password': 'pass'
    }
)

# Each new context or page may get a different IP
for url in urls:
    context = browser.new_context()
    page = context.new_page()
    page.goto(url)
    # Scrape...
    context.close()
```

Sticky Session Rotation
For scraping that requires session continuity (login-based scraping, pagination), use sticky sessions:
```python
# Use a proxy endpoint that supports sticky sessions via session ID in username
context = browser.new_context(
    proxy={
        'server': 'http://sticky-gateway.provider.com:port',
        'username': 'user-session-abc123',  # Session ID keeps same IP
        'password': 'pass'
    }
)
page = context.new_page()

# All requests in this context use the same IP
page.goto('https://example.com/login')
# ... login ...
page.goto('https://example.com/dashboard')
# ... scrape multiple pages ...

context.close()
```

Anti-Detection with Playwright
Playwright is harder to detect than stock Selenium, but it still requires configuration to avoid fingerprinting.
Stealth Configuration
```python
context = browser.new_context(
    viewport={'width': 1920, 'height': 1080},
    user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
               '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    locale='en-SG',
    timezone_id='Asia/Singapore',
    proxy={'server': 'http://sg-proxy:port', 'username': 'user', 'password': 'pass'}
)
page = context.new_page()

# Remove automation indicators before any page script runs
page.add_init_script("""
    Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
    window.chrome = { runtime: {} };
""")
```

Playwright Stealth Plugin (Node.js)
```javascript
// playwright-extra exports pre-wrapped browser launchers that accept plugins
const { chromium } = require('playwright-extra');
const stealth = require('puppeteer-extra-plugin-stealth')();

chromium.use(stealth);

(async () => {
  const browser = await chromium.launch({
    headless: true,
    proxy: {
      server: 'http://mobile-proxy:port',
      username: 'user',
      password: 'pass'
    }
  });
  // ...
})();
```

Geographic Consistency
When using a Singapore mobile proxy, your browser context should reflect a Singapore user:
```python
context = browser.new_context(
    proxy={'server': 'http://sg-mobile-proxy:port', 'username': 'user', 'password': 'pass'},
    locale='en-SG',
    timezone_id='Asia/Singapore',
    geolocation={'latitude': 1.3521, 'longitude': 103.8198},
    permissions=['geolocation'],
    viewport={'width': 1920, 'height': 1080}
)
```

Anti-bot systems cross-reference IP geolocation with browser timezone and locale. A mismatch is a detection signal. For more on how specific anti-bot platforms analyze these signals, see our guides on Cloudflare bypass and Akamai bypass.
Production Scraper: Python Async
Here is a production-ready async scraper using Playwright with mobile proxies:
```python
import asyncio
import logging
import random

from playwright.async_api import async_playwright

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class PlaywrightProxyScraper:
    def __init__(self, proxy_config, concurrency=3):
        self.proxy_config = proxy_config
        self.concurrency = concurrency
        self.browser = None
        self.playwright = None

    async def start(self):
        self.playwright = await async_playwright().start()
        self.browser = await self.playwright.chromium.launch(headless=True)

    async def scrape_url(self, url, semaphore):
        async with semaphore:
            context = await self.browser.new_context(
                proxy=self.proxy_config,
                viewport={'width': 1920, 'height': 1080},
                user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                           'AppleWebKit/537.36 (KHTML, like Gecko) '
                           'Chrome/120.0.0.0 Safari/537.36',
                locale='en-SG',
                timezone_id='Asia/Singapore'
            )
            page = await context.new_page()

            # Block heavy resources
            await page.route('**/*.{png,jpg,jpeg,gif,svg,css,woff,woff2}',
                             lambda route: route.abort())
            try:
                await asyncio.sleep(random.uniform(0.5, 2.0))
                await page.goto(url, wait_until='networkidle', timeout=30000)

                title = await page.title()
                if any(x in title.lower() for x in ['blocked', 'denied', 'captcha']):
                    logger.warning(f"Blocked on {url}")
                    return {'url': url, 'success': False, 'content': None}

                content = await page.content()
                logger.info(f"Scraped {url}")
                return {'url': url, 'success': True, 'content': content}
            except Exception as e:
                logger.error(f"Failed {url}: {e}")
                return {'url': url, 'success': False, 'content': None}
            finally:
                await context.close()

    async def scrape_many(self, urls):
        semaphore = asyncio.Semaphore(self.concurrency)
        tasks = [self.scrape_url(url, semaphore) for url in urls]
        return await asyncio.gather(*tasks)

    async def close(self):
        if self.browser:
            await self.browser.close()
        if self.playwright:
            await self.playwright.stop()

# Usage
async def main():
    scraper = PlaywrightProxyScraper(
        proxy_config={
            'server': 'http://mobile-proxy.example.com:port',
            'username': 'your-username',
            'password': 'your-password'
        },
        concurrency=5
    )
    await scraper.start()
    urls = [f'https://example.com/page/{i}' for i in range(1, 51)]
    results = await scraper.scrape_many(urls)
    successful = sum(1 for r in results if r['success'])
    logger.info(f"Scraped {successful}/{len(urls)} pages successfully")
    await scraper.close()

asyncio.run(main())
```

Production Scraper: Node.js
```javascript
const { chromium } = require('playwright');

class PlaywrightScraper {
  constructor(proxyConfig, concurrency = 3) {
    this.proxyConfig = proxyConfig;
    this.concurrency = concurrency;
    this.browser = null;
  }

  async start() {
    this.browser = await chromium.launch({ headless: true });
  }

  async scrapeUrl(url) {
    const context = await this.browser.newContext({
      proxy: this.proxyConfig,
      viewport: { width: 1920, height: 1080 },
      userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
      locale: 'en-SG',
      timezoneId: 'Asia/Singapore'
    });
    const page = await context.newPage();
    await page.route('**/*.{png,jpg,jpeg,gif,svg,css,woff,woff2}',
      route => route.abort()
    );
    try {
      await page.waitForTimeout(Math.random() * 1500 + 500);
      await page.goto(url, { waitUntil: 'networkidle', timeout: 30000 });
      const content = await page.content();
      return { url, success: true, content };
    } catch (err) {
      console.error(`Failed ${url}: ${err.message}`);
      return { url, success: false, content: null };
    } finally {
      await context.close();
    }
  }

  async scrapeMany(urls) {
    const results = [];
    for (let i = 0; i < urls.length; i += this.concurrency) {
      const batch = urls.slice(i, i + this.concurrency);
      const batchResults = await Promise.all(
        batch.map(url => this.scrapeUrl(url))
      );
      results.push(...batchResults);
    }
    return results;
  }

  async close() {
    if (this.browser) await this.browser.close();
  }
}

// Usage
(async () => {
  const scraper = new PlaywrightScraper({
    server: 'http://mobile-proxy.example.com:port',
    username: 'your-username',
    password: 'your-password'
  }, 5);

  await scraper.start();
  const urls = Array.from({ length: 50 }, (_, i) =>
    `https://example.com/page/${i + 1}`
  );
  const results = await scraper.scrapeMany(urls);
  const successful = results.filter(r => r.success).length;
  console.log(`Scraped ${successful}/${urls.length} pages`);
  await scraper.close();
})();
```

Playwright vs. Puppeteer vs. Selenium for Proxy Scraping
| Feature | Playwright | Puppeteer | Selenium |
|---|---|---|---|
| Native proxy auth | Yes | No (needs workaround) | No (needs extension/Wire) |
| Per-context proxies | Yes | No (per-browser only) | No (per-driver only) |
| Multi-browser | Chromium, Firefox, WebKit | Chromium only | All browsers |
| Async API | Native | Native | Via async wrapper |
| Stealth ecosystem | Growing | Mature | Mature (undetected-chromedriver) |
| Memory per session | Low (contexts) | High (instances) | High (instances) |
For most new scraping projects, Playwright is the recommended choice. Its proxy handling is cleaner, its context model is more memory-efficient, and its API is more intuitive.
For Puppeteer-specific setups, see our Puppeteer proxy guide. For Selenium, check the Selenium proxy guide.
Conclusion
Playwright’s native proxy support — especially per-context configuration with built-in authentication — makes it the most scraping-friendly browser automation tool available. Combined with mobile proxies, you get a stack that handles both IP trust and browser fingerprinting in a clean, maintainable architecture.
The per-context model is particularly powerful for multi-account operations where each account needs its own proxy and cookie store. Instead of running dozens of browser instances, you run one browser with dozens of lightweight contexts.
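That pattern pairs naturally with Playwright's storage-state API, which snapshots a context's cookies and localStorage to disk so each account's session survives between runs. A sketch, assuming a hypothetical `state/` directory for the snapshot files:

```python
import os

def state_path(account_name: str) -> str:
    """Where one account's cookie/localStorage snapshot lives between runs."""
    return f'state/{account_name}.json'

def open_account(browser, account_name: str, proxy: dict):
    """Reopen an account's context with its own proxy and its saved cookie store."""
    path = state_path(account_name)
    return browser.new_context(
        proxy=proxy,
        storage_state=path if os.path.exists(path) else None,
    )

def save_account(context, account_name: str) -> None:
    """Snapshot cookies/localStorage so the next run resumes the same session."""
    os.makedirs('state', exist_ok=True)
    context.storage_state(path=state_path(account_name))
```

Call `save_account` before closing each context; on the next run, `open_account` restores the logged-in session without repeating the login flow.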
DataResearchTools mobile proxies work seamlessly with Playwright’s proxy configuration. Get started with our scraping proxy plans and connect them to Playwright in under five minutes.
Related Reading
- Mobile Proxies for Web Scraping: Why They Work When Others Don’t
- How to Use Mobile Proxies with Puppeteer for Web Scraping
- Selenium Proxy Setup: Complete Guide for Web Scraping
- Python Requests + Proxies: Scraping Setup from Scratch
- Scrapy Proxy Middleware: Rotate Mobile Proxies for Large-Scale Crawls
- Headless Browser + Proxy Setup: The Anti-Detection Stack
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- aiohttp + BeautifulSoup: Async Python Scraping
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix
- Axios + Cheerio: Lightweight Node.js Scraping
- How to Build an Ethical Web Scraping Policy for Your Company
- How to Scrape Amazon Product Data with Proxies: 2026 Python Guide
- How to Scrape Bing Search Results with Python and Proxies