How to Scrape Pinterest with Proxies in 2026
Pinterest is a visual discovery platform with more than 500 million monthly active users, hosting billions of pins across virtually every interest and niche. For marketers, trend researchers, designers, and e-commerce businesses, Pinterest data provides unique insights into visual trends, consumer interests, and content performance.
This guide covers how to scrape Pinterest data including pins, boards, images, and engagement metrics using Python with proxy rotation.
Why Scrape Pinterest?
Pinterest data enables several valuable applications:
- Trend research — Identify emerging visual and product trends before they go mainstream
- Content strategy — Analyze what types of pins perform best in your niche
- Competitive analysis — Monitor competitor boards, pin frequency, and engagement
- E-commerce product research — Find trending products based on pin popularity
- Image dataset creation — Build visual datasets for machine learning training
- Influencer analysis — Evaluate Pinterest influencer reach and engagement
- SEO research — Pinterest is a major traffic source; understand what ranks on the platform
Pinterest’s API vs Scraping
Pinterest offers an official API, but with significant limitations:
Official Pinterest API
- Requires developer application and approval
- Limited to your own account data or approved scopes
- Rate limited to 200 calls per hour for most endpoints
- Does not provide competitor data or engagement metrics for other users’ pins
- No access to search results or trending data
Why Scraping Is Often Necessary
- Access to any public pin, board, or profile
- Search results with full engagement data
- No API rate limits (though you must self-limit)
- Trending pins and popular content discovery
- Full image URLs at original resolution
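Because scraping sidesteps the API's rate limits, the pacing has to come from your own code. The following is a minimal sketch of a self-imposed throttle; the `Throttle` class, its default delays, and the injectable `clock` and `sleep` arguments are illustrative assumptions, not part of any Pinterest or `requests` API.

```python
import random
import time


class Throttle:
    """Enforce a randomized minimum gap between consecutive requests."""

    def __init__(self, min_delay=2.0, max_delay=5.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous request, if any

    def wait(self) -> float:
        """Sleep only as long as needed; return the seconds slept."""
        delay = random.uniform(self.min_delay, self.max_delay)
        slept = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            if elapsed < delay:
                slept = delay - elapsed
                self._sleep(slept)
        self._last = self._clock()
        return slept
```

Calling `throttle.wait()` immediately before each request keeps the gap between requests inside the chosen range, even when parsing between requests takes a variable amount of time.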
Data Points to Extract
| Data Point | Source | Notes |
|---|---|---|
| Pin image URL | Pin element | Multiple resolutions available |
| Pin description | Pin overlay / detail | User-written description |
| Pin title | Pin detail page | Often the page title of source |
| Source URL | Pin metadata | Link back to original content |
| Repins count | Engagement data | How many times pin was saved |
| Comments count | Engagement data | Comment engagement |
| Board name | Board / pin context | Which board the pin belongs to |
| Board follower count | Board page | Board popularity |
| Pinner profile | Pin metadata | Who pinned it |
| Pin category | Metadata | Pinterest’s categorization |
| Related pins | Pin detail page | Visually similar content |
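One way to keep these data points consistent across scraping runs is to define a record type up front. This is a sketch only: `PinRecord` and its field names are hypothetical, chosen to mirror the table above rather than any official Pinterest schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class PinRecord:
    """One scraped pin; fields mirror the data points listed above."""
    pin_id: str
    image_url: Optional[str] = None
    description: Optional[str] = None
    title: Optional[str] = None
    source_url: Optional[str] = None
    repin_count: int = 0
    comment_count: int = 0
    board_name: Optional[str] = None
    pinner_username: Optional[str] = None


# Missing fields fall back to defaults, so partial records stay valid.
record = PinRecord(pin_id="123", title="Oak desk", repin_count=42)
row = asdict(record)  # plain dict, ready for json.dump or a CSV writer
```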
Understanding Pinterest’s Anti-Bot Measures
Pinterest’s defenses are moderate compared to Facebook or Airbnb:
- Rate limiting — Moderately strict per-IP rate limits
- JavaScript rendering — Infinite scroll requires JS execution
- Session tokens — CSRF tokens required for API-like requests
- User-Agent checks — Blocks obvious bot User-Agents
- Login walls — Some content requires authentication after a few pages
- IP blocking — Datacenter IPs are flagged relatively quickly
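A scraper can react sensibly to these defenses only if it can tell them apart. The sketch below labels a response with a coarse outcome. The function name and the heuristics are assumptions: in particular, treating `/login` in the final URL as a login wall and 401/403 as an IP block are guesses about typical behavior, not documented Pinterest responses.

```python
def classify_response(status_code: int, final_url: str, body: str) -> str:
    """Return a coarse label for a response (heuristics are assumptions)."""
    if status_code == 429:
        return "rate_limited"   # back off and rotate the proxy
    if status_code in (401, 403):
        return "blocked"        # likely a flagged IP or User-Agent
    if "/login" in final_url:
        return "login_wall"     # redirected to authentication
    if status_code == 200 and body.strip():
        return "ok"
    return "unknown"
```

With `requests`, `final_url` would be `response.url` after redirects and `body` would be `response.text`.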
Setting Up Your Environment
pip install requests beautifulsoup4 playwright fake-useragent
playwright install chromium
Python Code: Scraping Pinterest with Proxies
Approach 1: Using Pinterest’s Internal API
Pinterest’s web app makes API calls to internal endpoints. Intercepting these gives you structured JSON data:
import requests
import json
import time
import random
import logging
from fake_useragent import UserAgent

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class PinterestScraper:
    def __init__(self, proxy_list: list):
        self.proxy_list = proxy_list
        self.ua = UserAgent()
        self.session = requests.Session()
        self.pins = []
        self.base_url = "https://www.pinterest.com"
        self.csrf_token = ""  # set by init_session()

    def get_proxy(self) -> dict:
        """Pick a random proxy and format it for requests."""
        proxy = random.choice(self.proxy_list)
        return {"http": f"http://{proxy}", "https": f"http://{proxy}"}

    def get_headers(self) -> dict:
        """Headers that resemble the web app's XHR calls."""
        return {
            "User-Agent": self.ua.random,
            "Accept": "application/json, text/javascript, */*; q=0.01",
            "Accept-Language": "en-US,en;q=0.9",
            "X-Requested-With": "XMLHttpRequest",
            "Referer": "https://www.pinterest.com/",
            "Origin": "https://www.pinterest.com"
        }

    def init_session(self):
        """Initialize session and get CSRF token."""
        self.session.get(
            self.base_url,
            headers={"User-Agent": self.ua.random},
            proxies=self.get_proxy(),
            timeout=30
        )
        # Extract csrftoken from cookies
        self.csrf_token = self.session.cookies.get("csrftoken", "")
        logger.info(f"Session initialized, CSRF token: {self.csrf_token[:20]}...")
    def search_pins(self, query: str, max_results: int = 200):
        """Search Pinterest for pins matching a query."""
        self.init_session()
        bookmark = ""
        results_collected = 0
        failures = 0  # consecutive failed requests; prevents an endless retry loop
        while results_collected < max_results and failures < 5:
            params = {
                "source_url": f"/search/pins/?q={query}",
                "data": json.dumps({
                    "options": {
                        "query": query,
                        "scope": "pins",
                        "bookmarks": [bookmark] if bookmark else [],
                        "page_size": 25
                    },
                    "context": {}
                })
            }
            headers = self.get_headers()
            headers["X-CSRFToken"] = self.csrf_token
            try:
                response = self.session.get(
                    f"{self.base_url}/resource/BaseSearchResource/get/",
                    params=params,
                    headers=headers,
                    proxies=self.get_proxy(),
                    timeout=30
                )
                if response.status_code == 200:
                    failures = 0
                    data = response.json()
                    results = data.get("resource_response", {}).get("data", {})
                    if isinstance(results, dict):
                        pins_data = results.get("results", [])
                    elif isinstance(results, list):
                        pins_data = results
                    else:
                        break
                    if not pins_data:
                        logger.info("No more results")
                        break
                    for pin_data in pins_data:
                        if results_collected >= max_results:
                            break  # do not overshoot the requested count
                        pin = self.parse_pin_data(pin_data)
                        if pin:
                            self.pins.append(pin)
                            results_collected += 1
                    # Get bookmark for next page
                    bookmark = data.get("resource_response", {}).get("bookmark", "")
                    if not bookmark:
                        break
                    logger.info(f"Collected {results_collected} pins so far")
                elif response.status_code == 429:
                    failures += 1
                    logger.warning("Rate limited -- waiting")
                    time.sleep(random.uniform(30, 60))
                    continue
                else:
                    logger.error(f"Status {response.status_code}")
                    break
            except Exception as e:
                failures += 1
                logger.error(f"Search request failed: {e}")
            # Polite delay between pages
            time.sleep(random.uniform(2, 5))
    def parse_pin_data(self, data: dict) -> dict:
        """Parse pin data from API response."""
        if not isinstance(data, dict):
            return None
        pin = {
            "pin_id": data.get("id"),
            "description": data.get("description"),
            "title": data.get("title"),
            "link": data.get("link"),
            "created_at": data.get("created_at"),
            "domain": data.get("domain"),
            "repin_count": data.get("repin_count", 0),
            "comment_count": data.get("comment_count", 0),
        }
        # Image URLs at various resolutions
        images = data.get("images", {})
        if images:
            pin["image_original"] = images.get("orig", {}).get("url")
            pin["image_736"] = images.get("736x", {}).get("url")
            pin["image_236"] = images.get("236x", {}).get("url")
        # Pinner info
        pinner = data.get("pinner", {})
        if pinner:
            pin["pinner_username"] = pinner.get("username")
            pin["pinner_full_name"] = pinner.get("full_name")
            pin["pinner_follower_count"] = pinner.get("follower_count")
        # Board info
        board = data.get("board", {})
        if board:
            pin["board_name"] = board.get("name")
            pin["board_url"] = board.get("url")
        return pin
    def scrape_board(self, username: str, board_slug: str,
                     max_pins: int = 200):
        """Scrape all pins from a specific board."""
        self.init_session()
        bookmark = ""
        collected = 0
        failures = 0  # consecutive failed requests; prevents an endless retry loop
        while collected < max_pins and failures < 5:
            params = {
                "source_url": f"/{username}/{board_slug}/",
                "data": json.dumps({
                    "options": {
                        "board_url": f"/{username}/{board_slug}/",
                        "bookmarks": [bookmark] if bookmark else [],
                        "page_size": 25
                    },
                    "context": {}
                })
            }
            headers = self.get_headers()
            headers["X-CSRFToken"] = self.csrf_token
            try:
                response = self.session.get(
                    f"{self.base_url}/resource/BoardFeedResource/get/",
                    params=params,
                    headers=headers,
                    proxies=self.get_proxy(),
                    timeout=30
                )
                if response.status_code == 200:
                    failures = 0
                    data = response.json()
                    pins_data = data.get("resource_response", {}).get("data", [])
                    if not pins_data:
                        break
                    for pin_data in pins_data:
                        if collected >= max_pins:
                            break  # do not overshoot the requested count
                        pin = self.parse_pin_data(pin_data)
                        if pin:
                            self.pins.append(pin)
                            collected += 1
                    bookmark = data.get("resource_response", {}).get("bookmark", "")
                    if not bookmark:
                        break
                    logger.info(f"Board pins collected: {collected}")
                else:
                    logger.error(f"Status {response.status_code}")
                    break
            except Exception as e:
                failures += 1
                logger.error(f"Board scrape failed: {e}")
            # Polite delay between pages
            time.sleep(random.uniform(2, 4))
# Usage
if __name__ == "__main__":
    proxies = [
        "user:pass@residential1.proxy.com:8080",
        "user:pass@residential2.proxy.com:8080",
        "user:pass@residential3.proxy.com:8080",
    ]
    scraper = PinterestScraper(proxy_list=proxies)
    # Search for pins
    scraper.search_pins("minimalist home decor", max_results=100)
    print(f"Found {len(scraper.pins)} pins")
    # Save results
    with open("pinterest_pins.json", "w") as f:
        json.dump(scraper.pins, f, indent=2)
Approach 2: Handling Infinite Scroll with Playwright
For scraping Pinterest’s visual grid with infinite scroll:
import asyncio
import random
from urllib.parse import quote_plus

from playwright.async_api import async_playwright


async def scrape_pinterest_visual(query: str, proxy: str, max_scrolls: int = 20):
    """Scrape Pinterest search results using headless browser."""
    async with async_playwright() as p:
        # Proxy string is expected as user:pass@host:port
        auth, server = proxy.rsplit("@", 1)
        user, password = auth.split(":", 1)
        browser = await p.chromium.launch(
            headless=True,
            proxy={
                "server": f"http://{server}",
                "username": user,
                "password": password
            }
        )
        page = await browser.new_page()
        # URL-encode the query so spaces and symbols survive
        url = f"https://www.pinterest.com/search/pins/?q={quote_plus(query)}"
        await page.goto(url, wait_until="networkidle")
        all_pins = set()
        for i in range(max_scrolls):
            # Scroll down to trigger more pins loading
            await page.evaluate("window.scrollBy(0, 800)")
            await page.wait_for_timeout(random.randint(1500, 3000))
            # Extract pin URLs from current page state
            pin_links = await page.query_selector_all("a[href*='/pin/']")
            for link in pin_links:
                href = await link.get_attribute("href")
                if href:
                    all_pins.add(href)
            print(f"Scroll {i+1}: {len(all_pins)} unique pins found")
        await browser.close()
        return list(all_pins)

# Run with:
# asyncio.run(scrape_pinterest_visual("minimalist home decor", "user:pass@residential1.proxy.com:8080"))
Image Extraction and Downloading
Pinterest pins are fundamentally about images. Here is how to download pin images at full resolution:
import os
import random
import time
import requests
from urllib.parse import urlparse


def download_pin_images(pins: list, output_dir: str, proxy_list: list):
    """Download pin images at original resolution."""
    os.makedirs(output_dir, exist_ok=True)
    for pin in pins:
        # Prefer the original image, fall back to the 736px rendition
        image_url = pin.get("image_original") or pin.get("image_736")
        if not image_url:
            continue
        pin_id = pin.get("pin_id", "unknown")
        ext = os.path.splitext(urlparse(image_url).path)[1] or ".jpg"
        filename = f"{pin_id}{ext}"
        filepath = os.path.join(output_dir, filename)
        if os.path.exists(filepath):
            continue  # already downloaded
        try:
            proxy = random.choice(proxy_list)
            response = requests.get(
                image_url,
                proxies={"http": f"http://{proxy}", "https": f"http://{proxy}"},
                timeout=30,
                stream=True
            )
            if response.status_code == 200:
                with open(filepath, "wb") as f:
                    for chunk in response.iter_content(chunk_size=8192):
                        f.write(chunk)
                print(f"Downloaded: {filename}")
        except Exception as e:
            print(f"Failed to download {pin_id}: {e}")
        time.sleep(random.uniform(0.5, 1.5))
Proxy Rotation Strategy for Pinterest
Pinterest’s rate limiting is moderate, making it more accessible than platforms like Facebook or Airbnb:
- Residential rotating proxies — Best choice for sustained scraping. Rotate every 3-5 requests.
- Datacenter proxies — Can work for small-scale scraping but get blocked faster.
- US/EU IPs — Pinterest is primarily a US and European platform. Use IPs from these regions.
- Session management — Maintain session cookies across requests on the same IP for better success rates.
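The "rotate every 3-5 requests" advice above can be sketched as a small helper that sticks with one proxy for a few requests before switching. `StickyProxyRotator` is a hypothetical class, and the proxy strings are assumed to use the same `user:pass@host:port` format as the earlier examples.

```python
import random


class StickyProxyRotator:
    """Reuse one proxy for a few requests, then switch to a different one."""

    def __init__(self, proxy_list, min_uses=3, max_uses=5):
        if not proxy_list:
            raise ValueError("proxy_list must not be empty")
        self.proxy_list = list(proxy_list)
        self.min_uses = min_uses
        self.max_uses = max_uses
        self._current = None
        self._remaining = 0

    def next_proxy(self):
        """Return (proxies_dict, rotated); rotated is True when the IP changed."""
        rotated = False
        if self._remaining <= 0:
            # Avoid picking the same proxy twice in a row when alternatives exist.
            candidates = [p for p in self.proxy_list if p != self._current]
            self._current = random.choice(candidates or self.proxy_list)
            self._remaining = random.randint(self.min_uses, self.max_uses)
            rotated = True
        self._remaining -= 1
        url = f"http://{self._current}"
        return {"http": url, "https": url}, rotated
```

When `rotated` is true, starting a fresh `requests.Session()` keeps cookies from one IP from following you to the next, which matches the session-management point above.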
Troubleshooting
Problem: Pinterest returns login page instead of search results
- Pinterest gates content after a few unauthenticated page views. You may need to include session cookies from a logged-in account.
- Try accessing with a fresh session and new proxy IP.
Problem: API requests return empty results
- The CSRF token may have expired. Re-initialize your session to get a fresh token.
- Verify the API endpoint URL has not changed (Pinterest updates these periodically).
Problem: Images downloading as broken files
- Some Pinterest image URLs expire after a period. Download images immediately after scraping URLs.
- Check that you are using the correct image resolution URL (orig, 736x, 236x).
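A quick way to catch broken downloads is to check the first bytes of each file against well-known image signatures, since an expired URL often returns an HTML or XML error body instead of image data. This is a sketch; `looks_like_image` is a hypothetical helper and covers only JPEG, PNG, GIF, and WebP.

```python
def looks_like_image(first_bytes: bytes) -> bool:
    """Return True if the data starts with a known image file signature."""
    signatures = (
        b"\xff\xd8\xff",        # JPEG
        b"\x89PNG\r\n\x1a\n",   # PNG
        b"GIF87a",              # GIF (1987 spec)
        b"GIF89a",              # GIF (1989 spec)
    )
    if first_bytes.startswith(signatures):
        return True
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP"
    return first_bytes[:4] == b"RIFF" and first_bytes[8:12] == b"WEBP"
```

Reading the first 12 bytes of a saved file and passing them to this function is enough to decide whether to keep the file or queue the pin for another attempt.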
Problem: Rate limited (429 errors)
- Increase delays between requests to 5-10 seconds.
- Rotate to a fresh proxy IP after receiving a 429 response.
- Reduce the page_size parameter in API requests.
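The delay advice above can be turned into exponential backoff with jitter, so that repeated 429 responses lead to progressively longer waits. This is a sketch under assumed defaults: `backoff_delay`, the 10-second base, and the 120-second cap are illustrative values, chosen so the first retry lands in the 5-10 second range.

```python
import random


def backoff_delay(attempt: int, base: float = 10.0, cap: float = 120.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`; the actual delay is
    drawn from the upper half of that range so retries are never instant.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(ceiling / 2, ceiling)
```

A caller would sleep for `backoff_delay(n)` after the n-th consecutive 429 and reset `n` to zero after the next successful response.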
Problem: Infinite scroll stops loading new content
- Pinterest may have detected automation. Add random pauses and vary scroll distances.
- Try scrolling up slightly before scrolling down again to simulate human behavior.
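The two suggestions above can be combined into a randomized scroll plan that a browser automation loop then replays. This is a sketch: `scroll_plan`, the pixel and pause ranges, and the 15% chance of an upward nudge are all assumed values, not measured human behavior.

```python
import random


def scroll_plan(steps: int, rng=None):
    """Build a list of (delta_y_pixels, pause_ms) scroll actions.

    Every step scrolls down by a varying distance; some steps are preceded
    by a short upward scroll. Negative delta_y means scrolling up.
    """
    if rng is None:
        rng = random.Random()
    plan = []
    for _ in range(steps):
        # Occasionally nudge upward first, but never as the very first action.
        if plan and rng.random() < 0.15:
            plan.append((-rng.randint(100, 300), rng.randint(500, 1200)))
        plan.append((rng.randint(500, 1100), rng.randint(1500, 3500)))
    return plan
```

In the Playwright loop, each `(dy, pause)` pair could be replayed with `await page.mouse.wheel(0, dy)` followed by `await page.wait_for_timeout(pause)` in place of the fixed 800-pixel scroll.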
Legal and Ethical Considerations
Pinterest scraping involves several legal and ethical issues:
- Terms of Service — Pinterest’s ToS prohibits scraping. Violation may result in account termination and potential legal claims.
- Copyright — Pin images are often copyrighted by their creators. Downloading and using images may infringe on copyright, particularly for commercial purposes.
- robots.txt — Pinterest’s robots.txt restricts automated access to many paths. Respecting robots.txt is a best practice, though its legal enforceability varies.
- Image rights — Even if you can technically download images, using them without permission from the copyright holder is legally risky.
- Data privacy — Pinner profiles contain personal information. Handle usernames, follower counts, and profile data carefully under GDPR/CCPA.
- Rate limiting — Aggressive scraping that impacts Pinterest’s performance for real users could strengthen legal claims against you.
For image datasets, consider using Pinterest’s official API or licensed image datasets from providers like Unsplash, Pexels, or Getty for your training data needs.
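The robots.txt point above can be checked in code with Python's standard `urllib.robotparser` before a path is requested. The sketch below parses robots.txt text that has already been fetched; `SAMPLE_ROBOTS` is an invented example for illustration and does not reproduce Pinterest's actual file, and `is_allowed` and the `my-research-bot` agent name are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Invented rules for illustration only; fetch the real file before relying on it.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /search/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


search_ok = is_allowed(SAMPLE_ROBOTS, "my-research-bot",
                       "https://www.pinterest.com/search/pins/?q=decor")
about_ok = is_allowed(SAMPLE_ROBOTS, "my-research-bot",
                      "https://www.pinterest.com/about/")
```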
Conclusion
Pinterest is a moderately difficult scraping target with excellent data for visual trend analysis and content research. The internal API approach provides the most structured data, while the headless browser approach handles infinite scroll reliably. Residential proxies with rotation provide good success rates without the cost of mobile proxies. Focus on extracting structured API data over HTML parsing for the most reliable results, and always implement respectful rate limiting.
Related Reading
- How to Scrape Airbnb Listings with Proxies in 2026
- How to Scrape Facebook Marketplace with Proxies in 2026
- aiohttp + BeautifulSoup: Async Python Scraping
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix