How to Scrape Pinterest with Proxies in 2026

Pinterest is a visual discovery platform with over 450 million monthly active users, hosting billions of pins across virtually every interest and niche. For marketers, trend researchers, designers, and e-commerce businesses, Pinterest data provides unique insights into visual trends, consumer interests, and content performance.

This guide covers how to scrape Pinterest data including pins, boards, images, and engagement metrics using Python with proxy rotation.

Why Scrape Pinterest?

Pinterest data enables several valuable applications:

  • Trend research — Identify emerging visual and product trends before they go mainstream
  • Content strategy — Analyze what types of pins perform best in your niche
  • Competitive analysis — Monitor competitor boards, pin frequency, and engagement
  • E-commerce product research — Find trending products based on pin popularity
  • Image dataset creation — Build visual datasets for machine learning training
  • Influencer analysis — Evaluate Pinterest influencer reach and engagement
  • SEO research — Pinterest is a major traffic source; understand what ranks on the platform

Pinterest’s API vs Scraping

Pinterest offers an official API, but with significant limitations:

Official Pinterest API

  • Requires developer application and approval
  • Limited to your own account data or approved scopes
  • Rate limited to 200 calls per hour for most endpoints
  • Does not provide competitor data or engagement metrics for other users’ pins
  • No access to search results or trending data

Why Scraping Is Often Necessary

  • Access to any public pin, board, or profile
  • Search results with full engagement data
  • No API rate limits (though you must self-limit)
  • Trending pins and popular content discovery
  • Full image URLs at original resolution

Data Points to Extract

Data Point | Source | Notes
Pin image URL | Pin element | Multiple resolutions available
Pin description | Pin overlay / detail | User-written description
Pin title | Pin detail page | Often the page title of source
Source URL | Pin metadata | Link back to original content
Repins count | Engagement data | How many times pin was saved
Comments count | Engagement data | Comment engagement
Board name | Board / pin context | Which board the pin belongs to
Board follower count | Board page | Board popularity
Pinner profile | Pin metadata | Who pinned it
Pin category | Metadata | Pinterest’s categorization
Related pins | Pin detail page | Visually similar content

Understanding Pinterest’s Anti-Bot Measures

Pinterest’s defenses are moderate compared to Facebook or Airbnb:

  1. Rate limiting — Moderately strict per-IP rate limits
  2. JavaScript rendering — Infinite scroll requires JS execution
  3. Session tokens — CSRF tokens required for API-like requests
  4. User-Agent checks — Blocks obvious bot User-Agents
  5. Login walls — Some content requires authentication after a few pages
  6. IP blocking — Datacenter IPs are flagged relatively quickly

Setting Up Your Environment

pip install requests beautifulsoup4 playwright fake-useragent
playwright install chromium

Python Code: Scraping Pinterest with Proxies

Approach 1: Using Pinterest’s Internal API

Pinterest’s web app makes API calls to internal endpoints. Intercepting these gives you structured JSON data:

import requests
import json
import time
import random
import logging
from fake_useragent import UserAgent

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class PinterestScraper:
    def __init__(self, proxy_list: list):
        self.proxy_list = proxy_list
        self.ua = UserAgent()
        self.session = requests.Session()
        self.pins = []
        self.base_url = "https://www.pinterest.com"

    def get_proxy(self) -> dict:
        proxy = random.choice(self.proxy_list)
        return {"http": f"http://{proxy}", "https": f"http://{proxy}"}

    def get_headers(self) -> dict:
        return {
            "User-Agent": self.ua.random,
            "Accept": "application/json, text/javascript, */*; q=0.01",
            "Accept-Language": "en-US,en;q=0.9",
            "X-Requested-With": "XMLHttpRequest",
            "Referer": "https://www.pinterest.com/",
            "Origin": "https://www.pinterest.com"
        }

    def init_session(self):
        """Initialize session and get CSRF token."""
        response = self.session.get(
            self.base_url,
            headers={"User-Agent": self.ua.random},
            proxies=self.get_proxy(),
            timeout=30
        )
        # Extract csrftoken from cookies
        self.csrf_token = self.session.cookies.get("csrftoken", "")
        logger.info(f"Session initialized, CSRF token: {self.csrf_token[:20]}...")

    def search_pins(self, query: str, max_results: int = 200):
        """Search Pinterest for pins matching a query."""
        self.init_session()
        bookmark = ""
        results_collected = 0
        failures = 0       # consecutive 429s or request errors
        max_failures = 5   # give up instead of retrying forever

        while results_collected < max_results:
            params = {
                "source_url": f"/search/pins/?q={query}",
                "data": json.dumps({
                    "options": {
                        "query": query,
                        "scope": "pins",
                        "bookmarks": [bookmark] if bookmark else [],
                        "page_size": 25
                    },
                    "context": {}
                })
            }

            headers = self.get_headers()
            headers["X-CSRFToken"] = self.csrf_token

            try:
                response = self.session.get(
                    f"{self.base_url}/resource/BaseSearchResource/get/",
                    params=params,
                    headers=headers,
                    proxies=self.get_proxy(),
                    timeout=30
                )

                if response.status_code == 200:
                    failures = 0
                    data = response.json()
                    results = data.get("resource_response", {}).get("data", {})

                    # The payload is either a dict with a "results" list or a bare list
                    if isinstance(results, dict):
                        pins_data = results.get("results", [])
                    elif isinstance(results, list):
                        pins_data = results
                    else:
                        break

                    if not pins_data:
                        logger.info("No more results")
                        break

                    for pin_data in pins_data:
                        pin = self.parse_pin_data(pin_data)
                        if pin:
                            self.pins.append(pin)
                            results_collected += 1
                        # Stop mid-page once the requested number is reached
                        if results_collected >= max_results:
                            break

                    # Get bookmark for next page
                    bookmark = data.get("resource_response", {}).get("bookmark", "")
                    if not bookmark:
                        break

                    logger.info(f"Collected {results_collected} pins so far")

                elif response.status_code == 429:
                    failures += 1
                    if failures >= max_failures:
                        logger.error("Still rate limited after retries -- stopping")
                        break
                    logger.warning("Rate limited -- waiting")
                    time.sleep(random.uniform(30, 60))
                    continue
                else:
                    logger.error(f"Status {response.status_code}")
                    break

            except Exception as e:
                logger.error(f"Search request failed: {e}")
                failures += 1
                if failures >= max_failures:
                    break

            time.sleep(random.uniform(2, 5))

    def parse_pin_data(self, data: dict) -> "dict | None":
        """Parse pin data from API response; returns None for non-dict input."""
        if not isinstance(data, dict):
            return None

        pin = {
            "pin_id": data.get("id"),
            "description": data.get("description"),
            "title": data.get("title"),
            "link": data.get("link"),
            "created_at": data.get("created_at"),
            "domain": data.get("domain"),
            "repin_count": data.get("repin_count", 0),
            "comment_count": data.get("comment_count", 0),
        }

        # Image URLs at various resolutions
        images = data.get("images", {})
        if images:
            pin["image_original"] = images.get("orig", {}).get("url")
            pin["image_736"] = images.get("736x", {}).get("url")
            pin["image_236"] = images.get("236x", {}).get("url")

        # Pinner info
        pinner = data.get("pinner", {})
        if pinner:
            pin["pinner_username"] = pinner.get("username")
            pin["pinner_full_name"] = pinner.get("full_name")
            pin["pinner_follower_count"] = pinner.get("follower_count")

        # Board info
        board = data.get("board", {})
        if board:
            pin["board_name"] = board.get("name")
            pin["board_url"] = board.get("url")

        return pin

    def scrape_board(self, username: str, board_slug: str,
                     max_pins: int = 200):
        """Scrape all pins from a specific board."""
        self.init_session()
        bookmark = ""
        collected = 0
        failures = 0       # consecutive request errors
        max_failures = 5   # give up instead of retrying forever

        while collected < max_pins:
            params = {
                "source_url": f"/{username}/{board_slug}/",
                "data": json.dumps({
                    "options": {
                        "board_url": f"/{username}/{board_slug}/",
                        "bookmarks": [bookmark] if bookmark else [],
                        "page_size": 25
                    },
                    "context": {}
                })
            }

            headers = self.get_headers()
            headers["X-CSRFToken"] = self.csrf_token

            try:
                response = self.session.get(
                    f"{self.base_url}/resource/BoardFeedResource/get/",
                    params=params,
                    headers=headers,
                    proxies=self.get_proxy(),
                    timeout=30
                )

                if response.status_code == 200:
                    failures = 0
                    data = response.json()
                    pins_data = data.get("resource_response", {}).get("data", [])

                    if not pins_data:
                        break

                    for pin_data in pins_data:
                        pin = self.parse_pin_data(pin_data)
                        if pin:
                            self.pins.append(pin)
                            collected += 1

                    bookmark = data.get("resource_response", {}).get("bookmark", "")
                    if not bookmark:
                        break

                    logger.info(f"Board pins collected: {collected}")
                else:
                    logger.error(f"Status {response.status_code}")
                    break

            except Exception as e:
                logger.error(f"Board scrape failed: {e}")
                failures += 1
                if failures >= max_failures:
                    break

            time.sleep(random.uniform(2, 4))


# Usage
if __name__ == "__main__":
    proxies = [
        "user:pass@residential1.proxy.com:8080",
        "user:pass@residential2.proxy.com:8080",
        "user:pass@residential3.proxy.com:8080",
    ]

    scraper = PinterestScraper(proxy_list=proxies)

    # Search for pins
    scraper.search_pins("minimalist home decor", max_results=100)
    print(f"Found {len(scraper.pins)} pins")

    # Save results
    with open("pinterest_pins.json", "w") as f:
        json.dump(scraper.pins, f, indent=2)
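
When one run combines several searches, or a search and a board scrape, the same pin can land in the results more than once. A minimal sketch for removing repeats before saving, assuming pin dicts shaped like the output of parse_pin_data above; the helper name `dedupe_pins` is hypothetical:

```python
def dedupe_pins(pins):
    """Return pins with repeated pin_id values removed, keeping the first of each."""
    seen = set()
    unique = []
    for pin in pins:
        pin_id = pin.get("pin_id")
        # Pins without an ID cannot be compared, so they are all kept
        if pin_id is not None:
            if pin_id in seen:
                continue
            seen.add(pin_id)
        unique.append(pin)
    return unique
```

Calling `dedupe_pins(scraper.pins)` just before `json.dump` keeps the saved file free of repeats without changing the scraper itself.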

Approach 2: Handling Infinite Scroll with Playwright

For scraping Pinterest’s visual grid with infinite scroll:

import asyncio
import random
from urllib.parse import quote_plus

from playwright.async_api import async_playwright

async def scrape_pinterest_visual(query: str, proxy: str, max_scrolls: int = 20):
    """Scrape Pinterest search results using headless browser."""
    async with async_playwright() as p:
        # Proxy is expected in the form "user:pass@host:port"
        auth, server = proxy.rsplit("@", 1)
        user, password = auth.split(":", 1)

        browser = await p.chromium.launch(
            headless=True,
            proxy={
                "server": f"http://{server}",
                "username": user,
                "password": password
            }
        )

        page = await browser.new_page()
        # URL-encode the query so spaces and characters like & or # survive
        url = f"https://www.pinterest.com/search/pins/?q={quote_plus(query)}"
        await page.goto(url, wait_until="networkidle")

        all_pins = set()

        for i in range(max_scrolls):
            # Scroll down to trigger more pins loading
            await page.evaluate("window.scrollBy(0, 800)")
            await page.wait_for_timeout(random.randint(1500, 3000))

            # Extract pin URLs from current page state
            pin_links = await page.query_selector_all("a[href*='/pin/']")
            for link in pin_links:
                href = await link.get_attribute("href")
                if href:
                    all_pins.add(href)

            print(f"Scroll {i+1}: {len(all_pins)} unique pins found")

        await browser.close()
        return list(all_pins)
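
The function above returns raw href values, which may be relative paths or full URLs. A short sketch for reducing them to pin IDs, assuming pin links follow the `/pin/<digits>/` pattern (an assumption about Pinterest's URL format that may not hold for every pin); `pin_ids_from_hrefs` is a hypothetical helper name:

```python
import re

# Assumed URL shape: ".../pin/<numeric id>/..."
PIN_HREF = re.compile(r"/pin/(\d+)")


def pin_ids_from_hrefs(hrefs):
    """Extract unique numeric pin IDs from hrefs, in order of first appearance."""
    ids = []
    seen = set()
    for href in hrefs:
        match = PIN_HREF.search(href)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            ids.append(match.group(1))
    return ids
```

Links that do not match the pattern are skipped rather than raising, so the helper can be run directly on everything the scroll loop collected.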

Image Extraction and Downloading

Pinterest pins are fundamentally about images. Here is how to download pin images at full resolution:

import os
import random
import time
import requests
from urllib.parse import urlparse

def download_pin_images(pins: list, output_dir: str, proxy_list: list):
    """Download pin images at original resolution."""
    os.makedirs(output_dir, exist_ok=True)

    for pin in pins:
        image_url = pin.get("image_original") or pin.get("image_736")
        if not image_url:
            continue

        pin_id = pin.get("pin_id", "unknown")
        ext = os.path.splitext(urlparse(image_url).path)[1] or ".jpg"
        filename = f"{pin_id}{ext}"
        filepath = os.path.join(output_dir, filename)

        if os.path.exists(filepath):
            continue

        try:
            proxy = random.choice(proxy_list)
            response = requests.get(
                image_url,
                proxies={"http": f"http://{proxy}", "https": f"http://{proxy}"},
                timeout=30,
                stream=True
            )

            if response.status_code == 200:
                with open(filepath, "wb") as f:
                    for chunk in response.iter_content(chunk_size=8192):
                        f.write(chunk)
                print(f"Downloaded: {filename}")

        except Exception as e:
            print(f"Failed to download {pin_id}: {e}")

        time.sleep(random.uniform(0.5, 1.5))

Proxy Rotation Strategy for Pinterest

Pinterest’s rate limiting is moderate, which makes it more accessible than platforms like Facebook or Airbnb. The following choices tend to work well:

  • Residential rotating proxies — Best choice for sustained scraping. Rotate every 3-5 requests.
  • Datacenter proxies — Can work for small-scale scraping but get blocked faster.
  • US/EU IPs — Pinterest is primarily a US and European platform. Use IPs from these regions.
  • Session management — Maintain session cookies across requests on the same IP for better success rates.
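
The "rotate every 3-5 requests" advice can be sketched as a small helper that keeps handing out the same proxy for a few requests before moving on. This is an illustration under stated assumptions, not a fixed recipe: the class name `StickyProxyRotator` and its parameters are hypothetical, and proxies are assumed to be `user:pass@host:port` strings as in the earlier examples.

```python
import itertools
import random


class StickyProxyRotator:
    """Reuse one proxy for a few consecutive requests, then move to the next."""

    def __init__(self, proxy_list, min_uses=3, max_uses=5, seed=None):
        if not proxy_list:
            raise ValueError("proxy_list must not be empty")
        self._cycle = itertools.cycle(proxy_list)
        self._rng = random.Random(seed)
        self._min_uses = min_uses
        self._max_uses = max_uses
        self._current = None
        self._remaining = 0

    def next_proxies(self):
        """Return a requests-style proxies dict for the next request."""
        if self._remaining <= 0:
            # Budget used up: advance to the next proxy and draw a new budget
            self._current = next(self._cycle)
            self._remaining = self._rng.randint(self._min_uses, self._max_uses)
        self._remaining -= 1
        url = f"http://{self._current}"
        return {"http": url, "https": url}
```

Passing `rotator.next_proxies()` as the `proxies=` argument in place of a fresh random choice keeps consecutive requests on one IP, which fits the session-management point above. Starting a new `requests.Session` whenever the proxy changes avoids carrying cookies from one IP to another.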

Estimate your proxy costs with our proxy cost calculator.

Troubleshooting

Problem: Pinterest returns login page instead of search results

  • Pinterest gates content after a few unauthenticated page views. You may need to include session cookies from a logged-in account.
  • Try accessing with a fresh session and new proxy IP.

Problem: API requests return empty results

  • The CSRF token may have expired. Re-initialize your session to get a fresh token.
  • Verify the API endpoint URL has not changed (Pinterest updates these periodically).

Problem: Images downloading as broken files

  • Some Pinterest image URLs expire after a period. Download images immediately after scraping URLs.
  • Check that you are using the correct image resolution URL (orig, 736x, 236x).
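
One way to catch broken downloads early is to check the first bytes of each saved file against well-known image signatures, since a blocked or expired request often returns an HTML error page instead of image data. A minimal sketch; `sniff_image_type` is a hypothetical helper, and it covers only JPEG, PNG, GIF, and WebP:

```python
def sniff_image_type(header: bytes):
    """Guess the image format from the leading bytes, or return None if unknown."""
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP files are RIFF containers with "WEBP" at byte offset 8
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    return None
```

Reading the first 12 bytes of a downloaded file and passing them to this function is enough; a `None` result suggests the file should be deleted and fetched again.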

Problem: Rate limited (429 errors)

  • Increase delays between requests to 5-10 seconds.
  • Rotate to a fresh proxy IP after receiving a 429 response.
  • Reduce the page_size parameter in API requests.
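
The delay advice can be turned into a wait time that grows with each consecutive 429. A minimal sketch with randomized exponential backoff; the function name `backoff_delay` and its defaults are illustrative, with the 5-second base taken from the suggestion above:

```python
import random


def backoff_delay(attempt, base=5.0, cap=120.0, rng=random):
    """Seconds to wait before retry number `attempt` (0-based).

    The upper bound doubles with each attempt and is capped at `cap`; the
    actual delay is drawn at random between `base` and that bound.
    """
    ceiling = min(cap, base * (2 ** attempt))
    if ceiling <= base:
        return base
    return rng.uniform(base, ceiling)
```

After a 429, sleeping for `backoff_delay(n)` and switching proxies before retrying, then resetting `n` to zero after the next successful response, keeps retries from hammering the same limit.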

Problem: Infinite scroll stops loading new content

  • Pinterest may have detected automation. Add random pauses and vary scroll distances.
  • Try scrolling up slightly before scrolling down again to simulate human behavior.
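
Both suggestions can be expressed as a precomputed list of scroll distances. The sketch below is one possible approach; `scroll_plan`, the pixel ranges, and the `back_chance` parameter are assumptions chosen for illustration rather than values known to avoid detection:

```python
import random


def scroll_plan(steps, seed=None, back_chance=0.2):
    """Build a list of vertical scroll deltas in pixels.

    Each step scrolls down by a random distance; with probability
    `back_chance` it is preceded by a short upward scroll (negative delta).
    """
    rng = random.Random(seed)
    plan = []
    for _ in range(steps):
        if rng.random() < back_chance:
            plan.append(-rng.randint(100, 300))
        plan.append(rng.randint(600, 1200))
    return plan
```

In the Playwright loop, each delta can be applied with `await page.evaluate("dy => window.scrollBy(0, dy)", delta)` followed by a random pause, in place of the fixed 800-pixel scroll.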

Verify your proxy location with our IP lookup tool.

Legal and Ethical Considerations

Pinterest scraping involves several legal and ethical issues:

  • Terms of Service — Pinterest’s ToS prohibits scraping. Violation may result in account termination and potential legal claims.
  • Copyright — Pin images are often copyrighted by their creators. Downloading and using images may infringe on copyright, particularly for commercial purposes.
  • robots.txt — Pinterest’s robots.txt restricts automated access to many paths. Respecting robots.txt is a best practice, though its legal enforceability varies.
  • Image rights — Even if you can technically download images, using them without permission from the copyright holder is legally risky.
  • Data privacy — Pinner profiles contain personal information. Handle usernames, follower counts, and profile data carefully under GDPR/CCPA.
  • Rate limiting — Aggressive scraping that impacts Pinterest’s performance for real users could strengthen legal claims against you.
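
Checking a URL against robots.txt rules before requesting it can be done with Python's standard library. A minimal sketch; the `is_allowed` helper is hypothetical, and `EXAMPLE_RULES` is made-up sample content, not Pinterest's actual robots.txt:

```python
from urllib.robotparser import RobotFileParser

# Made-up rules for illustration only
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For a live site, `RobotFileParser` can also download the file itself through `set_url()` followed by `read()`; parsing a string as shown here makes the check easy to test offline.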

For image datasets, consider using Pinterest’s official API or licensed image datasets from providers like Unsplash, Pexels, or Getty for your training data needs.

Conclusion

Pinterest is a moderately difficult scraping target with excellent data for visual trend analysis and content research. The internal API approach provides the most structured data, while the headless browser approach handles infinite scroll reliably. Residential proxies with rotation provide good success rates without the cost of mobile proxies. Focus on extracting structured API data over HTML parsing for the most reliable results, and always implement respectful rate limiting.

