How to Scrape Etsy Product Listings 2026

Etsy is one of the largest marketplaces for handmade, vintage, and unique goods, hosting over 90 million active buyers and 7.5 million active sellers. For e-commerce researchers, competitor analysts, and market intelligence professionals, scraping Etsy product listings provides invaluable data on pricing trends, popular categories, and seller performance.

In this comprehensive guide, you’ll learn how to scrape Etsy product data using Python, handle their anti-bot protections, and build a reliable data pipeline.

What Data Can You Extract from Etsy?

Etsy product listings contain a wealth of structured data that’s useful for market research and competitive analysis:

  • Product titles and descriptions
  • Pricing (including sale prices and original prices)
  • Review counts and ratings
  • Seller information (shop name, location, total sales)
  • Product images and thumbnails
  • Categories and tags
  • Shipping information
  • Variation options (sizes, colors, etc.)

Example JSON Output

{
  "product_id": "1234567890",
  "title": "Handmade Ceramic Mug - Blue Glaze",
  "price": 28.99,
  "original_price": 34.99,
  "currency": "USD",
  "rating": 4.8,
  "review_count": 1243,
  "seller": {
    "shop_name": "CeramicStudioCo",
    "location": "Portland, Oregon",
    "total_sales": 15420
  },
  "categories": ["Home & Living", "Kitchen & Dining", "Drinkware", "Mugs"],
  "tags": ["handmade mug", "ceramic mug", "blue mug", "pottery"],
  "shipping": {
    "free_shipping": true,
    "estimated_delivery": "3-5 business days"
  },
  "variations": [
    {"type": "Color", "options": ["Blue", "Green", "White"]},
    {"type": "Size", "options": ["12oz", "16oz"]}
  ],
  "url": "https://www.etsy.com/listing/1234567890/"
}

Prerequisites

Before you start scraping Etsy, make sure you have the following installed:

pip install requests beautifulsoup4 lxml fake-useragent

You’ll also want a reliable proxy service to avoid IP blocks. Residential proxies are recommended for Etsy scraping because they route requests through IP addresses assigned to real home connections, which are far less likely to be flagged.
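
Before pointing a scraper at Etsy, it's worth confirming that the proxy actually works. A minimal sketch — the echo URL is an assumption, any IP-echo service will do:

```python
import requests

def build_proxies(proxy_url):
    """Return a requests-style proxy mapping, or None when no proxy is configured."""
    if not proxy_url:
        return None
    return {"http": proxy_url, "https": proxy_url}

def check_proxy(proxy_url, timeout=10):
    """Fetch an IP-echo endpoint through the proxy to confirm it is reachable.
    httpbin.org/ip is just a convenient echo service, not an Etsy endpoint."""
    resp = requests.get(
        "https://httpbin.org/ip",
        proxies=build_proxies(proxy_url),
        timeout=timeout,
    )
    resp.raise_for_status()
    return resp.json()["origin"]  # the IP address the target site would see
```

If `check_proxy` raises or returns your real IP, fix the proxy before scraping.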

Method 1: Scraping Etsy with Requests and BeautifulSoup

This method works well for extracting data from Etsy search results and individual product pages.

Scraping Etsy Search Results

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent
import json
import time
import random

class EtsyScraper:
    def __init__(self, proxy_url=None):
        self.session = requests.Session()
        self.ua = UserAgent()
        self.proxy_url = proxy_url
        self.base_url = "https://www.etsy.com"

    def _get_headers(self):
        return {
            "User-Agent": self.ua.random,
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Accept-Language": "en-US,en;q=0.9",
            "Accept-Encoding": "gzip, deflate, br",
            "Referer": "https://www.etsy.com/",
            "DNT": "1",
            "Connection": "keep-alive",
        }

    def _get_proxies(self):
        if self.proxy_url:
            return {
                "http": self.proxy_url,
                "https": self.proxy_url
            }
        return None

    def search_products(self, query, max_pages=5):
        """Scrape Etsy search results for a given query."""
        all_products = []

        for page in range(1, max_pages + 1):
            try:
                # Pass the query via params so requests URL-encodes it
                response = self.session.get(
                    f"{self.base_url}/search",
                    params={"q": query, "page": page},
                    headers=self._get_headers(),
                    proxies=self._get_proxies(),
                    timeout=30
                )
                response.raise_for_status()

                soup = BeautifulSoup(response.text, "lxml")
                products = self._parse_search_results(soup)
                all_products.extend(products)

                print(f"Page {page}: Found {len(products)} products")

                # Respectful rate limiting
                time.sleep(random.uniform(2, 5))

            except requests.RequestException as e:
                print(f"Error on page {page}: {e}")
                continue

        return all_products

    def _parse_search_results(self, soup):
        """Parse product data from search results page."""
        products = []

        # Etsy renders product cards in a grid
        listings = soup.select("div.v2-listing-card")

        for listing in listings:
            try:
                product = {}

                # Extract title
                title_elem = listing.select_one("h3.v2-listing-card__title")
                product["title"] = title_elem.get_text(strip=True) if title_elem else None

                # Extract price
                price_elem = listing.select_one("span.currency-value")
                product["price"] = float(price_elem.get_text(strip=True).replace(",", "")) if price_elem else None

                # Extract URL
                link_elem = listing.select_one("a.listing-link")
                product["url"] = link_elem["href"] if link_elem else None

                # Extract listing ID from URL
                if product["url"]:
                    product["listing_id"] = product["url"].split("/listing/")[1].split("/")[0]

                # Extract shop name
                shop_elem = listing.select_one("p.v2-listing-card__shop")
                product["shop_name"] = shop_elem.get_text(strip=True) if shop_elem else None

                # Extract rating
                rating_elem = listing.select_one("span.v2-listing-card__rating")
                if rating_elem:
                    product["rating"] = float(rating_elem.get_text(strip=True))

                products.append(product)

            except Exception as e:
                print(f"Error parsing listing: {e}")
                continue

        return products

    def scrape_product_page(self, url):
        """Scrape detailed data from an individual product page."""
        try:
            response = self.session.get(
                url,
                headers=self._get_headers(),
                proxies=self._get_proxies(),
                timeout=30
            )
            response.raise_for_status()

            soup = BeautifulSoup(response.text, "lxml")

            # Etsy embeds structured data in JSON-LD
            script_tags = soup.find_all("script", type="application/ld+json")
            for script in script_tags:
                try:
                    data = json.loads(script.string)
                except (TypeError, json.JSONDecodeError):
                    # script.string can be None, which json.loads rejects
                    continue
                if isinstance(data, dict) and data.get("@type") == "Product":
                    return self._parse_structured_data(data, soup)

            # Fallback to HTML parsing
            return self._parse_product_html(soup)

        except requests.RequestException as e:
            print(f"Error scraping product: {e}")
            return None

    def _parse_structured_data(self, data, soup):
        """Parse product data from JSON-LD structured data."""
        product = {
            "title": data.get("name"),
            "description": data.get("description"),
            "url": data.get("url"),
            "image": data.get("image"),
        }

        # Extract pricing
        offers = data.get("offers", {})
        if isinstance(offers, list):
            offers = offers[0]
        product["price"] = offers.get("price")
        product["currency"] = offers.get("priceCurrency")

        # Extract reviews
        aggregate_rating = data.get("aggregateRating", {})
        product["rating"] = aggregate_rating.get("ratingValue")
        product["review_count"] = aggregate_rating.get("reviewCount")

        # Extract seller info from HTML
        shop_elem = soup.select_one("a[data-shop-name]")
        if shop_elem:
            product["shop_name"] = shop_elem.get("data-shop-name")

        return product

    def _parse_product_html(self, soup):
        """Fallback HTML parsing for product pages."""
        product = {}

        title = soup.select_one("h1[data-buy-box-listing-title]")
        product["title"] = title.get_text(strip=True) if title else None

        price = soup.select_one("div[data-buy-box-region='price'] p.wt-text-title-larger")
        if price:
            price_text = price.get_text(strip=True).replace("$", "").replace(",", "")
            try:
                product["price"] = float(price_text)
            except ValueError:
                product["price"] = price_text

        return product


# Usage example
if __name__ == "__main__":
    # Initialize with proxy
    scraper = EtsyScraper(proxy_url="http://user:pass@proxy-server:port")

    # Search for products
    results = scraper.search_products("handmade ceramic mug", max_pages=3)

    # Scrape individual product details
    for product in results[:5]:
        if product.get("url"):
            details = scraper.scrape_product_page(product["url"])
            print(json.dumps(details, indent=2))
            time.sleep(random.uniform(3, 6))

Method 2: Scraping Etsy with Selenium

For pages that rely heavily on JavaScript rendering, Selenium provides a more robust solution.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import json
import time

class EtsySeleniumScraper:
    def __init__(self, proxy=None):
        chrome_options = Options()
        chrome_options.add_argument("--headless")
        chrome_options.add_argument("--no-sandbox")
        chrome_options.add_argument("--disable-dev-shm-usage")
        chrome_options.add_argument("--disable-blink-features=AutomationControlled")

        if proxy:
            chrome_options.add_argument(f"--proxy-server={proxy}")

        self.driver = webdriver.Chrome(options=chrome_options)
        self.driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
            "source": "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
        })

    def search_products(self, query, max_pages=3):
        """Search Etsy and extract product data."""
        products = []

        for page in range(1, max_pages + 1):
            query_param = query.replace(" ", "+")
            url = f"https://www.etsy.com/search?q={query_param}&page={page}"
            self.driver.get(url)

            # Wait for product cards to load; skip the page if they never
            # appear (Selenium raises TimeoutException, a subclass of Exception)
            try:
                WebDriverWait(self.driver, 15).until(
                    EC.presence_of_all_elements_located(
                        (By.CSS_SELECTOR, "div.v2-listing-card")
                    )
                )
            except Exception:
                print(f"Page {page}: product cards did not load, skipping")
                continue

            # Scroll to trigger lazy-loaded content
            self._scroll_page()

            cards = self.driver.find_elements(By.CSS_SELECTOR, "div.v2-listing-card")

            for card in cards:
                try:
                    product = {
                        "title": card.find_element(By.CSS_SELECTOR, "h3").text,
                        "url": card.find_element(By.CSS_SELECTOR, "a").get_attribute("href"),
                    }

                    try:
                        price_elem = card.find_element(By.CSS_SELECTOR, "span.currency-value")
                        product["price"] = price_elem.text
                    except Exception:
                        product["price"] = None

                    products.append(product)
                except Exception:
                    continue

            time.sleep(3)

        return products

    def _scroll_page(self):
        """Scroll down the page to trigger lazy loading."""
        total_height = self.driver.execute_script("return document.body.scrollHeight")
        for i in range(0, total_height, 500):
            self.driver.execute_script(f"window.scrollTo(0, {i});")
            time.sleep(0.3)

    def close(self):
        self.driver.quit()


# Usage
scraper = EtsySeleniumScraper(proxy="http://proxy-server:port")
results = scraper.search_products("vintage jewelry")
print(json.dumps(results[:5], indent=2))
scraper.close()

Handling Etsy’s Anti-Bot Protections

Etsy employs several anti-scraping measures that you need to handle:

1. Rate Limiting

Etsy will throttle or block your requests if you send too many in a short period. Implement respectful delays:

import random
import time

def respectful_delay(min_seconds=2, max_seconds=6):
    """Add a random delay between requests."""
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)

2. CAPTCHA Challenges

Etsy uses CAPTCHA to verify human visitors. To minimize CAPTCHA triggers:

  • Rotate user agents frequently
  • Use residential proxies that appear as regular users
  • Maintain consistent session cookies
  • Avoid rapid successive requests
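
It also helps to detect a likely challenge page before trying to parse it. The marker strings below are assumptions based on common bot-protection vendors; inspect a real blocked response to confirm what Etsy actually serves:

```python
def looks_like_captcha(status_code, html):
    """Heuristic check for a bot-challenge page.

    The marker strings are assumptions; verify them against a real
    blocked response before relying on this in production.
    """
    if status_code in (403, 429):
        return True
    lowered = html.lower()
    markers = ("captcha", "datadome", "please verify you are a human")
    return any(marker in lowered for marker in markers)
```

When this returns True, pause the scraper and rotate to a fresh proxy and session rather than retrying immediately.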

3. JavaScript Rendering

Many Etsy pages load content dynamically. If you’re getting empty responses, switch to Selenium or consider using a headless browser with stealth plugins.
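
A cheap heuristic for deciding when to fall back to Selenium is to check whether the static HTML contains the product grid at all. The class name matches the selector used in Method 1 and should be verified against the live markup:

```python
def needs_js_rendering(html):
    """Return True when the static HTML lacks both the product grid and
    JSON-LD data, suggesting the page was rendered client-side (or a
    challenge page was served instead).
    """
    return "v2-listing-card" not in html and "application/ld+json" not in html
```

Run this on the raw `response.text` from Method 1; if it returns True, route the URL to the Selenium scraper instead.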

4. Request Headers

Always send complete, realistic headers. Missing headers are a common trigger for anti-bot systems:

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
    "Referer": "https://www.etsy.com/",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "same-origin",
}

Proxy Recommendations for Etsy Scraping

Etsy is moderately aggressive with anti-bot detection. Here’s what works best:

Proxy Type           | Effectiveness | Cost | Best For
Residential Rotating | High          | $$$  | Large-scale scraping
ISP Proxies          | High          | $$   | Consistent sessions
Datacenter           | Low           | $    | Small tests only
Mobile Proxies       | Very High     | $$$$ | When other methods fail

For most Etsy scraping projects, rotating residential proxies provide the best balance of reliability and cost. Rotate IPs every 3-5 requests to stay under the radar.
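
The rotate-every-few-requests policy can be sketched as a small pool class (the proxy URLs below are placeholders, not real endpoints):

```python
import itertools

class RotatingProxyPool:
    """Cycle through a proxy list, moving to the next proxy after
    every `rotate_every` requests."""

    def __init__(self, proxy_urls, rotate_every=4):
        self._cycle = itertools.cycle(proxy_urls)
        self.rotate_every = rotate_every
        self._count = 0
        self._current = next(self._cycle)

    def get(self):
        """Return a requests-style proxies dict for the next request."""
        if self._count and self._count % self.rotate_every == 0:
            self._current = next(self._cycle)
        self._count += 1
        return {"http": self._current, "https": self._current}
```

Pass `pool.get()` as the `proxies=` argument on each request instead of a fixed `_get_proxies()` result.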

Etsy API Alternative

Before scraping, consider using the Etsy Open API v3. It provides structured access to:

  • Active listings
  • Shop information
  • Receipts and transactions (for shop owners)
  • Taxonomy and categories

However, the API has rate limits (10 requests per second) and requires OAuth authentication, which is why many opt for web scraping for large-scale data collection.
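
For completeness, a hedged sketch of calling the active-listings endpoint — the base URL and `x-api-key` header follow the v3 documentation, but confirm parameter names against the current docs before relying on this:

```python
import requests

API_BASE = "https://openapi.etsy.com/v3/application"

def build_listing_query(keywords, limit=25, offset=0):
    """Query parameters for the active-listings endpoint."""
    return {"keywords": keywords, "limit": limit, "offset": offset}

def get_active_listings(api_key, keywords, limit=25):
    """Fetch active listings via the Etsy Open API v3.

    Requires an API key from an Etsy developer account. Endpoint path and
    parameter names are taken from the v3 docs; verify before production use.
    """
    resp = requests.get(
        f"{API_BASE}/listings/active",
        headers={"x-api-key": api_key},
        params=build_listing_query(keywords, limit=limit),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```

Staying under the documented rate limit is still your responsibility; the API returns 429 responses when you exceed it.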

Legal Considerations

Before scraping Etsy, be aware of these legal aspects:

  1. Terms of Service: Etsy’s ToS prohibits automated data collection. Scraping may put your account or IP at risk.
  2. Copyright: Product descriptions and images are copyrighted by sellers. Don’t republish scraped content.
  3. Personal Data: Avoid collecting personally identifiable information (PII) about sellers or buyers. Comply with GDPR and other privacy regulations.
  4. Rate Limiting: Excessive scraping can impact Etsy’s servers. Always implement respectful rate limits.
  5. Commercial Use: Using scraped data commercially may have additional legal implications. Consult a lawyer for your specific use case.

For a deeper dive into the legal landscape, check out our guide to web scraping compliance.

Rate Limiting Best Practices

To scrape Etsy sustainably without getting blocked:

  1. Start slow: Begin with 1 request every 5 seconds and gradually increase
  2. Random delays: Use randomized intervals (2-6 seconds) between requests
  3. Session management: Rotate sessions every 50-100 requests
  4. Off-peak hours: Scrape during low-traffic hours (2 AM – 6 AM EST)
  5. Exponential backoff: If you receive a 429 status code, wait progressively longer before retrying

import time

def exponential_backoff(attempt, base_delay=5, max_delay=300):
    """Calculate delay with exponential backoff."""
    delay = min(base_delay * (2 ** attempt), max_delay)
    time.sleep(delay)
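
The backoff schedule above can be wrapped in a retry loop. A self-contained sketch, where `fetch` is any zero-argument callable returning a requests-style response:

```python
import time

def fetch_with_backoff(fetch, max_attempts=5, base_delay=5, max_delay=300):
    """Retry `fetch()` on HTTP 429 with exponentially growing delays.

    `fetch` is any zero-argument callable returning an object with a
    `status_code` attribute, e.g. a functools.partial around session.get.
    """
    response = None
    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code != 429:
            return response
        # Same schedule as exponential_backoff: 5s, 10s, 20s, ... capped at 300s
        delay = min(base_delay * (2 ** attempt), max_delay)
        time.sleep(delay)
    return response  # still rate-limited after all attempts
```

A call site might look like `fetch_with_backoff(lambda: session.get(url, headers=headers, timeout=30))`.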

Data Storage and Export

Once you’ve scraped Etsy data, store it efficiently:

import pandas as pd
import json

# Save to CSV (`products` is the list returned by EtsyScraper.search_products)
df = pd.DataFrame(products)
df.to_csv("etsy_products.csv", index=False)

# Save to JSON
with open("etsy_products.json", "w") as f:
    json.dump(products, f, indent=2)

Conclusion

Scraping Etsy product listings gives you powerful market intelligence for e-commerce research, competitive analysis, and trend monitoring. By using the techniques in this guide — proper headers, rotating proxies, respectful rate limiting, and robust parsing — you can build a reliable Etsy scraping pipeline.

For the best results, pair your scraper with high-quality residential proxies and always respect Etsy’s servers with appropriate rate limiting. Check out our complete guide to e-commerce scraping for more platform-specific strategies.

