Proxies for NFT Marketplace Scraping (OpenSea, Blur, Magic Eden)
NFT marketplace data powers trading algorithms, rarity analysis tools, portfolio trackers, and market research platforms. OpenSea, Blur, and Magic Eden collectively host millions of NFTs with constantly changing floor prices, listing activity, and sales data. Scraping this data at scale requires proxy infrastructure that handles aggressive anti-bot protections while maintaining the throughput needed for real-time market monitoring.
This guide covers the technical requirements for scraping each major NFT marketplace, including proxy selection, anti-detection strategies, and practical code examples.
Why NFT Marketplace Scraping Requires Proxies
NFT marketplaces implement multiple layers of anti-scraping protection:
- Cloudflare Bot Management: OpenSea and Magic Eden use Cloudflare’s enterprise bot detection, which fingerprints browsers, analyzes request patterns, and serves CAPTCHAs to suspicious traffic.
- API Rate Limits: OpenSea’s API limits requests to 5 per second on free tiers. Even paid tiers cap at 30 requests per second.
- IP Reputation Scoring: Marketplaces track request volume per IP and progressively degrade service for high-volume IPs before eventually blocking them.
- JavaScript Challenges: Many marketplace pages require JavaScript execution to render content, blocking simple HTTP scrapers.
Without proxies, a scraper monitoring even a modest set of 100 NFT collections will exhaust rate limits and trigger blocks within minutes.
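The arithmetic makes this concrete. A back-of-the-envelope sketch, assuming three API calls per collection per 30-second polling cycle (both figures illustrative):

collections = 100
calls_per_collection = 3        # e.g. stats + listings + events per cycle
poll_interval_s = 30            # one refresh every 30 seconds

required_rps = collections * calls_per_collection / poll_interval_s
free_tier_rps = 5               # OpenSea free-tier limit cited above

print(f"required: {required_rps:.1f} req/s vs allowed: {free_tier_rps} req/s")
print(f"keys/IPs needed: {required_rps / free_tier_rps:.1f}")
# required: 10.0 req/s vs allowed: 5 req/s -> at least 2 keys/IPs,
# before accounting for retries, bursts, or any web-page scraping on top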
For comprehensive guidance on web scraping with proxies, including rotation patterns and anti-detection fundamentals, the dedicated guide covers the technical foundation.
Proxy Selection for NFT Scraping
By Marketplace
| Marketplace | Recommended Proxy | Reason |
|---|---|---|
| OpenSea | Mobile proxies | Cloudflare enterprise + aggressive fingerprinting |
| Blur | Mobile proxies | Heavy bot detection + Cloudflare |
| Magic Eden | Residential or mobile | Less aggressive but still Cloudflare-protected |
| LooksRare | Residential | Moderate protection |
| Tensor (Solana) | Residential | API-first approach, moderate limits |
Mobile proxies consistently outperform other types for NFT scraping because Cloudflare assigns the highest trust scores to mobile carrier IPs. This means fewer CAPTCHAs, fewer blocks, and higher sustained throughput.
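Every scraper in this guide rotates proxies round-robin. A minimal shared pool, assuming proxy strings in user:pass@host:port format (the same format the code below parses), might look like this:

import itertools

class ProxyPool:
    """Round-robin rotation over a fixed proxy list.

    Assumes entries are formatted as user:pass@host:port.
    """
    def __init__(self, proxies: list):
        self._cycle = itertools.cycle(proxies)

    def next_url(self) -> str:
        # aiohttp accepts the proxy as a full URL with credentials inline
        return f"http://{next(self._cycle)}"

pool = ProxyPool(["user:pass@203.0.113.1:8080", "user:pass@203.0.113.2:8080"])
print(pool.next_url())  # http://user:pass@203.0.113.1:8080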
Scraping OpenSea
Using the OpenSea API
OpenSea provides a REST API, which should be your primary target for data collection:
import aiohttp
import asyncio
import time
class OpenSeaScraper:
BASE_URL = "https://api.opensea.io/api/v2"
def __init__(self, api_keys: list, proxy_pool: list):
self.api_keys = api_keys
self.proxy_pool = proxy_pool
self.key_index = 0
self.proxy_index = 0
def _get_headers(self) -> dict:
key = self.api_keys[self.key_index % len(self.api_keys)]
self.key_index += 1
return {
"accept": "application/json",
"x-api-key": key,
}
def _get_proxy(self) -> str:
proxy = self.proxy_pool[self.proxy_index % len(self.proxy_pool)]
self.proxy_index += 1
return proxy
async def get_collection_stats(self, session, collection_slug):
"""Fetch collection statistics."""
url = f"{self.BASE_URL}/collections/{collection_slug}/stats"
proxy = self._get_proxy()
async with session.get(
url,
headers=self._get_headers(),
proxy=f"http://{proxy}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"collection": collection_slug,
"floor_price": data.get("total", {}).get("floor_price"),
"total_volume": data.get("total", {}).get("volume"),
"total_sales": data.get("total", {}).get("sales"),
"num_owners": data.get("total", {}).get("num_owners"),
"market_cap": data.get("total", {}).get("market_cap"),
"timestamp": time.time()
}
elif resp.status == 429:
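                # rate limited: pause briefly so the next request through
                # this key/proxy is not immediately rejected as well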
await asyncio.sleep(2)
return None
return None
async def get_collection_listings(self, session, collection_slug,
limit=50):
"""Fetch active listings for a collection."""
url = f"{self.BASE_URL}/listings/collection/{collection_slug}/all"
proxy = self._get_proxy()
params = {"limit": limit}
async with session.get(
url,
headers=self._get_headers(),
params=params,
proxy=f"http://{proxy}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
listings = []
for listing in data.get("listings", []):
price_data = listing.get("price", {}).get("current", {})
listings.append({
"token_id": listing.get("protocol_data", {})
.get("parameters", {}).get("offer", [{}])[0]
.get("identifierOrCriteria"),
"price_eth": float(price_data.get("value", 0)) / 1e18,
"currency": price_data.get("currency"),
"expiration": listing.get("protocol_data", {})
.get("parameters", {}).get("endTime"),
})
return listings
return []
async def get_collection_events(self, session, collection_slug,
event_type="sale"):
"""Fetch recent sales or other events."""
url = f"{self.BASE_URL}/events/collection/{collection_slug}"
proxy = self._get_proxy()
params = {"event_type": event_type, "limit": 50}
async with session.get(
url,
headers=self._get_headers(),
params=params,
proxy=f"http://{proxy}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
return await resp.json()
        return None
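A minimal driver for the class above, with placeholder API keys and a placeholder proxy:

async def main():
    scraper = OpenSeaScraper(
        api_keys=["YOUR_OPENSEA_KEY"],               # placeholder
        proxy_pool=["user:pass@203.0.113.1:8080"],   # placeholder
    )
    async with aiohttp.ClientSession() as session:
        stats = await scraper.get_collection_stats(session, "boredapeyachtclub")
        print(stats)

asyncio.run(main())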
Scraping OpenSea’s Web Interface
When API limits are insufficient, scraping the web interface provides additional data. This requires a headless browser:
from playwright.async_api import async_playwright
class OpenSeaWebScraper:
async def scrape_collection_page(self, collection_slug: str,
proxy: str):
async with async_playwright() as p:
browser = await p.chromium.launch(
headless=True,
proxy={
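                    # assumes proxy strings formatted as user:pass@host:port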
"server": f"http://{proxy.split('@')[1]}",
"username": proxy.split(':')[0],
"password": proxy.split(':')[1].split('@')[0],
}
)
context = await browser.new_context(
viewport={"width": 1920, "height": 1080},
user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
"AppleWebKit/537.36 Chrome/120.0.0.0 Safari/537.36"
)
page = await context.new_page()
await page.goto(
f"https://opensea.io/collection/{collection_slug}",
wait_until="networkidle"
)
# Wait for listings to load
await page.wait_for_selector('[data-testid="ItemCard"]',
timeout=15000)
# Extract listing data
listings = await page.evaluate('''() => {
const cards = document.querySelectorAll(
'[data-testid="ItemCard"]'
);
return Array.from(cards).map(card => ({
name: card.querySelector(
'[data-testid="ItemCardFooter-name"]'
)?.textContent,
price: card.querySelector(
'[data-testid="ItemCardPrice"]'
)?.textContent,
}));
}''')
await browser.close()
            return listings
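Usage mirrors the API scraper; the collection slug and proxy below are illustrative:

scraper = OpenSeaWebScraper()
listings = asyncio.run(scraper.scrape_collection_page(
    "boredapeyachtclub", proxy="user:pass@203.0.113.1:8080"
))
print(listings[:3])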
Scraping Blur
Blur’s API is less documented than OpenSea’s but provides valuable data for NFT traders:
class BlurScraper:
BASE_URL = "https://core-api.prod.blur.io/v1"
def __init__(self, proxy_pool: list):
self.proxy_pool = proxy_pool
self.proxy_index = 0
def _get_proxy(self):
proxy = self.proxy_pool[self.proxy_index % len(self.proxy_pool)]
self.proxy_index += 1
return proxy
async def get_collection_stats(self, session, contract_address):
url = f"{self.BASE_URL}/collections/{contract_address}"
proxy = self._get_proxy()
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
"AppleWebKit/537.36 Chrome/120.0.0.0",
"Accept": "application/json",
"Origin": "https://blur.io",
"Referer": "https://blur.io/",
}
async with session.get(
url,
headers=headers,
proxy=f"http://{proxy}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
collection = data.get("collection", {})
return {
"contract": contract_address,
"floor_price": collection.get("floorPrice"),
"total_supply": collection.get("totalSupply"),
"num_owners": collection.get("numberOwners"),
"volume_24h": collection.get("volume24h"),
"source": "blur"
}
return None
async def get_floor_listings(self, session, contract_address):
url = f"{self.BASE_URL}/collections/{contract_address}/tokens"
proxy = self._get_proxy()
params = {"sort": "price", "order": "asc", "limit": 50}
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
"AppleWebKit/537.36 Chrome/120.0.0.0",
"Origin": "https://blur.io",
"Referer": "https://blur.io/",
}
async with session.get(
url,
headers=headers,
params=params,
proxy=f"http://{proxy}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return data.get("tokens", [])
            return []
Scraping Magic Eden
Magic Eden serves both Solana and Ethereum collections through separate API roots:
class MagicEdenScraper:
# Ethereum collections
ETH_BASE = "https://api-mainnet.magiceden.dev/v3/rtp/ethereum"
# Solana collections
SOL_BASE = "https://api-mainnet.magiceden.dev/v2"
def __init__(self, proxy_pool: list, api_key: str = None):
self.proxy_pool = proxy_pool
self.api_key = api_key
self.proxy_index = 0
def _get_proxy(self):
proxy = self.proxy_pool[self.proxy_index % len(self.proxy_pool)]
self.proxy_index += 1
return proxy
async def get_solana_collection_stats(self, session, symbol):
url = f"{self.SOL_BASE}/collections/{symbol}/stats"
proxy = self._get_proxy()
headers = {}
if self.api_key:
headers["Authorization"] = f"Bearer {self.api_key}"
async with session.get(
url, headers=headers,
proxy=f"http://{proxy}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"symbol": symbol,
"floor_price_sol": data.get("floorPrice", 0) / 1e9,
"listed_count": data.get("listedCount"),
"volume_all": data.get("volumeAll", 0) / 1e9,
"avg_price_24h": data.get("avgPrice24hr", 0) / 1e9,
"source": "magic_eden"
}
        return None
Multi-Marketplace Aggregator
Combining the scrapers above gives a single view of a collection across marketplaces:
class NFTMarketAggregator:
def __init__(self, proxy_pool: list, opensea_keys: list):
self.opensea = OpenSeaScraper(opensea_keys, proxy_pool)
self.blur = BlurScraper(proxy_pool)
self.magic_eden = MagicEdenScraper(proxy_pool)
async def get_collection_overview(self, collection_slug: str,
contract_address: str):
"""Get data from all marketplaces for a single collection."""
        async with aiohttp.ClientSession() as session:
            tasks = {
                "opensea": self.opensea.get_collection_stats(
                    session, collection_slug
                ),
                "blur": self.blur.get_collection_stats(
                    session, contract_address
                ),
            }
            # Run both marketplace requests concurrently; capture
            # per-source errors instead of letting one failure
            # abort the other request.
            outcomes = await asyncio.gather(
                *tasks.values(), return_exceptions=True
            )
            return {
                source: ({"error": str(out)} if isinstance(out, Exception)
                         else out)
                for source, out in zip(tasks, outcomes)
            }
async def monitor_collections(self, collections: list,
interval: float = 30):
"""Continuously monitor multiple collections."""
while True:
for col in collections:
data = await self.get_collection_overview(
col["slug"], col["contract"]
)
# Store or process data
print(f"{col['slug']}: {data}")
            await asyncio.sleep(interval)
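Starting the monitor is then a single call. The slug and contract address below are the well-known BAYC values, used purely as an example; keys and proxies are placeholders:

collections = [
    {"slug": "boredapeyachtclub",
     "contract": "0xBC4CA0EdA7647A8aB7C2061c2E118A18a936f13D"},
]
aggregator = NFTMarketAggregator(
    proxy_pool=["user:pass@203.0.113.1:8080"],   # placeholder
    opensea_keys=["YOUR_OPENSEA_KEY"],           # placeholder
)
# runs until interrupted
asyncio.run(aggregator.monitor_collections(collections, interval=30))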
Proxy Sizing Recommendations
| Scraping Scope | Collections | Data Points | Proxies Needed |
|---|---|---|---|
| Hobby tracker | 5-10 | Floor prices | 2-3 mobile |
| Trading tool | 50-100 | Full listings + sales | 5-10 mobile |
| Analytics platform | 500+ | All available data | 15-30 mobile |
| Enterprise | 5,000+ | Complete market data | 50+ mobile |
Handling Cloudflare Challenges
When Cloudflare serves a challenge page, your scraper needs to get past it. Options include:
- Use headless browsers with stealth plugins for web scraping
- Switch to a fresh mobile proxy — often the challenge is IP-specific
- Implement CAPTCHA solving services as a fallback
The best approach is prevention: mobile proxies trigger far fewer Cloudflare challenges than other proxy types, and rotating IPs before hitting challenge thresholds keeps your scraper running smoothly.
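In code, prevention plus rotation can be as simple as treating challenge-like responses as a signal to switch IPs. A sketch reusing the ProxyPool from earlier; the status and content-type checks are heuristics, not an official Cloudflare contract:

async def fetch_with_rotation(session, url, pool, max_attempts=3):
    """Retry through fresh proxies when a response looks like a
    Cloudflare challenge (403/503, or HTML served by a JSON endpoint)."""
    for _ in range(max_attempts):
        async with session.get(
            url,
            proxy=pool.next_url(),
            timeout=aiohttp.ClientTimeout(total=10)
        ) as resp:
            challenged = (
                resp.status in (403, 503)
                or "text/html" in resp.headers.get("Content-Type", "")
            )
            if not challenged:
                return await resp.json()
            # challenges are usually IP-specific, so the loop simply
            # rotates to a fresh proxy and tries again
    return None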
Data Storage Best Practices
Store NFT market data in a time-series format for historical analysis:
import sqlite3
def init_nft_db(db_path: str):
conn = sqlite3.connect(db_path)
conn.execute('''CREATE TABLE IF NOT EXISTS floor_prices (
id INTEGER PRIMARY KEY AUTOINCREMENT,
collection TEXT NOT NULL,
marketplace TEXT NOT NULL,
floor_price REAL,
volume_24h REAL,
listed_count INTEGER,
timestamp REAL NOT NULL
)''')
conn.execute('''CREATE INDEX IF NOT EXISTS idx_collection_ts
ON floor_prices(collection, timestamp)''')
conn.commit()
    return conn
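Writing a row is then a single insert. The field names below loosely follow the dicts returned by the scrapers above; adapt them per marketplace:

import time

def store_floor_price(conn, collection: str, marketplace: str, stats: dict):
    # missing keys are stored as NULL rather than raising
    conn.execute(
        "INSERT INTO floor_prices "
        "(collection, marketplace, floor_price, volume_24h, listed_count, "
        "timestamp) VALUES (?, ?, ?, ?, ?, ?)",
        (collection, marketplace, stats.get("floor_price"),
         stats.get("volume_24h"), stats.get("listed_count"), time.time()),
    )
    conn.commit()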
For a technical overview of proxy rotation and how different proxy types handle rate limiting, the glossary provides useful background.
Conclusion
NFT marketplace scraping is a demanding use case that requires the right combination of proxy infrastructure, anti-detection techniques, and marketplace-specific scraping logic. Mobile proxies provide the trust scores needed to bypass Cloudflare protection on OpenSea and Blur, while proper rate limiting and request distribution ensure sustained access. Start with API-based scraping and fall back to headless browser scraping only when API limits are insufficient for your data needs.
Related Reading
- How to Avoid IP-Based Sybil Detection in Crypto Protocols
- Best Proxies for Binance, Bybit, and OKX API Trading
- How to Collect Cryptocurrency Price Data Across Exchanges
- How to Scrape Stock Market Data with Mobile Proxies
- 403 Forbidden Error: What It Means & How to Fix It
- 403 Forbidden in Web Scraping: How to Fix It