How to Scrape App Store Reviews (iOS)
Apple’s App Store hosts over 1.8 million apps with billions of user reviews. For app developers, product managers, and competitive intelligence teams, App Store review data provides direct insight into user satisfaction, feature requests, and competitive positioning.
What Data Can You Extract?
- App metadata (name, developer, category, price, rating)
- User reviews (text, rating, date, helpful count)
- Version-specific reviews
- App screenshots and descriptions
- In-app purchase details
- App update history
- Developer information
Example JSON Output
{
  "app_id": "1234567890",
  "name": "ProxyManager",
  "developer": "Tech Corp",
  "rating": 4.7,
  "review_count": 15432,
  "price": "Free",
  "category": "Utilities",
  "reviews": [{
    "id": "9876543210",
    "title": "Great app!",
    "content": "This is exactly what I needed for managing my proxies...",
    "rating": 5,
    "author": "TechUser42",
    "date": "2026-02-28",
    "version": "3.2.1",
    "helpful_count": 15
  }]
}
Prerequisites
pip install requests beautifulsoup4 lxml
Apple’s App Store data can be accessed via the iTunes API. No proxies are typically needed for API access.
Method 1: iTunes Search and Lookup API
import requests
import time

class AppStoreScraper:
    def __init__(self, country="us"):
        self.country = country
        self.session = requests.Session()

    def search_apps(self, term, limit=25):
        """Search the App Store catalog via the iTunes Search API."""
        url = "https://itunes.apple.com/search"
        params = {"term": term, "country": self.country, "media": "software", "limit": limit}
        response = self.session.get(url, params=params, timeout=30)
        data = response.json()
        return [{
            "id": app.get("trackId"),
            "name": app.get("trackName"),
            "developer": app.get("artistName"),
            "rating": app.get("averageUserRating"),
            "rating_count": app.get("userRatingCount"),
            "price": app.get("formattedPrice"),
            "category": app.get("primaryGenreName"),
            "description": app.get("description", "")[:500],
            "url": app.get("trackViewUrl"),
            "icon": app.get("artworkUrl512"),
            "version": app.get("version"),
            "bundle_id": app.get("bundleId"),
        } for app in data.get("results", [])]

    def get_app_details(self, app_id):
        """Look up full metadata for a single app by its numeric ID."""
        url = "https://itunes.apple.com/lookup"
        params = {"id": app_id, "country": self.country}
        response = self.session.get(url, params=params, timeout=30)
        data = response.json()
        results = data.get("results", [])
        return results[0] if results else None

    def get_reviews(self, app_id, page=1, sort="mostRecent"):
        """Fetch one page of reviews from the public RSS feed."""
        url = (f"https://itunes.apple.com/{self.country}/rss/customerreviews/"
               f"id={app_id}/page={page}/sortBy={sort}/json")
        response = self.session.get(url, timeout=30)
        if response.status_code != 200:
            return []
        data = response.json()
        entries = data.get("feed", {}).get("entry", [])
        reviews = []
        for entry in entries:
            # The feed's first entry is app metadata, not a review;
            # only review entries carry a "content" field.
            if isinstance(entry, dict) and "content" in entry:
                reviews.append({
                    "id": entry.get("id", {}).get("label"),
                    "title": entry.get("title", {}).get("label"),
                    "content": entry.get("content", {}).get("label"),
                    "rating": entry.get("im:rating", {}).get("label"),
                    "author": entry.get("author", {}).get("name", {}).get("label"),
                    "version": entry.get("im:version", {}).get("label"),
                    "vote_count": entry.get("im:voteCount", {}).get("label"),
                })
        return reviews

    def get_all_reviews(self, app_id, max_pages=10):
        """Page through the feed until it runs out of reviews."""
        all_reviews = []
        for page in range(1, max_pages + 1):
            reviews = self.get_reviews(app_id, page=page)
            if not reviews:
                break
            all_reviews.extend(reviews)
            time.sleep(1)  # be polite between pages
        return all_reviews

# Usage
scraper = AppStoreScraper(country="us")
apps = scraper.search_apps("proxy vpn", limit=10)
for app in apps[:3]:
    reviews = scraper.get_all_reviews(app["id"], max_pages=5)
    print(f"{app['name']} ({app['rating']}): {len(reviews)} reviews")
Proxy Recommendations
Proxies are rarely needed for the iTunes API, but use them for high-volume scraping or region-specific data. Residential proxies from the target country provide accurate regional data.
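If you do route traffic through a proxy, the standard approach with requests is a per-scheme proxies mapping on the session. A minimal sketch; the proxy URL below is a placeholder, not a real endpoint:

```python
import requests

def make_proxied_session(proxy_url=None):
    """Build a requests.Session that optionally routes traffic through a proxy.

    proxy_url is a placeholder, e.g. "http://user:pass@proxy.example.com:8080".
    """
    session = requests.Session()
    if proxy_url:
        # requests selects the proxy by the scheme of each outgoing request
        session.proxies = {"http": proxy_url, "https": proxy_url}
    return session

# Hypothetical endpoint; the iTunes API works without a proxy at low volume
session = make_proxied_session("http://user:pass@proxy.example.com:8080")
```

Pass a session built this way into the scraper classes above in place of their own `requests.Session()` if you need region-specific results.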
Legal Considerations
- iTunes API: The RSS feed API is designed for public consumption.
- Review Content: Reviews are user-generated and copyrighted.
- Apple Guidelines: Follow Apple’s API usage guidelines.
- Rate Limits: Respect API rate limits.
See our compliance guide.
Method 2: Scraping with Selenium
For data not available via the iTunes API, use Selenium to scrape the App Store web interface:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import time

class AppStoreSeleniumScraper:
    def __init__(self, proxy=None):
        options = Options()
        options.add_argument("--headless=new")
        options.add_argument("--no-sandbox")
        if proxy:
            options.add_argument(f"--proxy-server={proxy}")
        self.driver = webdriver.Chrome(options=options)

    def scrape_app_page(self, url):
        self.driver.get(url)
        time.sleep(3)
        try:
            WebDriverWait(self.driver, 15).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, "h1"))
            )
        except Exception:
            return None
        data = self.driver.execute_script('''
            const result = {};
            const title = document.querySelector("h1");
            result.title = title ? title.innerText.trim() : null;
            const rating = document.querySelector("[class*='rating']");
            result.rating = rating ? rating.innerText.trim() : null;
            const description = document.querySelector("[class*='description']");
            result.description = description ? description.innerText.substring(0, 500) : null;
            // Extract from JSON-LD, which carries structured app metadata
            const scripts = document.querySelectorAll('script[type="application/ld+json"]');
            for (const script of scripts) {
                try {
                    const json = JSON.parse(script.textContent);
                    if (json["@type"] === "SoftwareApplication") {
                        result.structured_data = json;
                    }
                } catch {}
            }
            return result;
        ''')
        return data

    def scrape_reviews_page(self, app_url, max_reviews=50):
        reviews_url = app_url.replace("/app/", "/app/reviews/")
        self.driver.get(reviews_url)
        time.sleep(3)
        # Scroll to trigger lazy loading of reviews
        for _ in range(5):
            self.driver.execute_script("window.scrollBy(0, 800);")
            time.sleep(1)
        reviews = self.driver.execute_script('''
            const items = [];
            document.querySelectorAll("[class*='review']").forEach(el => {
                const title = el.querySelector("[class*='title']");
                const body = el.querySelector("[class*='body']");
                const stars = el.querySelector("[class*='star']");
                items.push({
                    title: title ? title.innerText.trim() : null,
                    body: body ? body.innerText.trim() : null,
                    stars: stars ? stars.getAttribute("aria-label") : null,
                });
            });
            return items;
        ''')
        return reviews[:max_reviews]

    def close(self):
        self.driver.quit()
Handling App Store Anti-Bot Protections
1. Rate Limiting on iTunes API
The iTunes RSS feed API has rate limits. Implement delays of 1-2 seconds between requests and avoid bursts.
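The delay logic above can be sketched as a small exponential-backoff helper. The 403/429 status codes and the delay schedule are reasonable defaults, not documented Apple limits:

```python
import time

def backoff_delay(attempt, base_delay=1.0, cap=30.0):
    """Delay before retry number `attempt` (0-indexed): 1s, 2s, 4s, ... capped."""
    return min(base_delay * (2 ** attempt), cap)

def get_with_backoff(session, url, max_retries=4, base_delay=1.0):
    """GET through an existing session, backing off on 403/429 responses.

    `session` is anything with a requests-style .get() method,
    e.g. requests.Session().
    """
    response = None
    for attempt in range(max_retries):
        response = session.get(url, timeout=30)
        if response.status_code not in (403, 429):
            break  # success or a non-throttling error; let the caller decide
        time.sleep(backoff_delay(attempt, base_delay=base_delay))
    return response
```

Dropping this into `AppStoreScraper.get_reviews` in place of the bare `session.get` call makes page loops resilient to occasional throttling.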
2. Regional Content
App Store content varies by country. Use the country parameter in API requests to get region-specific data:
# Available country codes (a sample; Apple supports many more storefronts)
countries = ["us", "gb", "jp", "au", "ca", "de", "fr", "in", "br"]
for country in countries:
    scraper = AppStoreScraper(country=country)
    reviews = scraper.get_reviews(app_id, page=1)
    print(f"{country}: {len(reviews)} reviews")
    time.sleep(1)
3. Review Pagination
The RSS feed API limits reviews to 10 pages (approximately 500 reviews). For comprehensive review collection, combine multiple sort orders and country codes.
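A sketch of the combining step: merge the per-sort (or per-country) result lists and deduplicate on the review ID the feed returns. The commented usage assumes the AppStoreScraper class from Method 1:

```python
def merge_reviews(*review_lists):
    """Merge review lists from different sort orders or countries,
    deduplicating by review ID. First occurrence wins, so pass the
    preferred ordering first; reviews without an ID are kept as-is."""
    seen = set()
    merged = []
    for reviews in review_lists:
        for review in reviews:
            rid = review.get("id")
            if rid is not None:
                if rid in seen:
                    continue
                seen.add(rid)
            merged.append(review)
    return merged

# Usage with the AppStoreScraper from Method 1 (makes network calls):
# recent  = scraper.get_all_reviews(app_id)  # sortBy=mostRecent
# helpful = [r for p in range(1, 11)
#            for r in scraper.get_reviews(app_id, page=p, sort="mostHelpful")]
# combined = merge_reviews(recent, helpful)
```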
4. App Store Connect API
For app developers, Apple’s App Store Connect API provides additional data including sales, downloads, and crash reports. This requires developer account authentication.
Method 3: Scraping App Rankings and Charts
Beyond individual app reviews, you can scrape App Store charts and category rankings to track competitive positioning:
import requests

class AppStoreChartScraper:
    def __init__(self, country="us"):
        self.country = country
        self.session = requests.Session()

    def get_top_apps(self, genre_id=36, limit=100, chart="topfreeapplications"):
        """Fetch top apps from a specific category.

        Genre IDs: 36=All, 6007=Productivity, 6005=Social Networking, 6002=Utilities
        Charts: topfreeapplications, toppaidapplications, topgrossingapplications
        """
        url = f"https://itunes.apple.com/{self.country}/rss/{chart}/limit={limit}/genre={genre_id}/json"
        response = self.session.get(url, timeout=30)
        if response.status_code != 200:
            return []
        data = response.json()
        entries = data.get("feed", {}).get("entry", [])
        apps = []
        for i, entry in enumerate(entries):
            apps.append({
                "rank": i + 1,  # feed entries arrive in chart order
                "name": entry.get("im:name", {}).get("label"),
                "app_id": entry.get("id", {}).get("attributes", {}).get("im:id"),
                "developer": entry.get("im:artist", {}).get("label"),
                "category": entry.get("category", {}).get("attributes", {}).get("label"),
                "price": entry.get("im:price", {}).get("attributes", {}).get("amount"),
                "summary": entry.get("summary", {}).get("label", "")[:200],
                "icon": entry.get("im:image", [{}])[-1].get("label") if entry.get("im:image") else None,
            })
        return apps

    def track_ranking_changes(self, app_id, genre_id=36, limit=200):
        """Check an app's current ranking position."""
        apps = self.get_top_apps(genre_id=genre_id, limit=limit)
        for app in apps:
            if str(app["app_id"]) == str(app_id):
                return app["rank"]
        return None  # Not in top charts

# Usage
chart_scraper = AppStoreChartScraper(country="us")
top_apps = chart_scraper.get_top_apps(genre_id=6002, limit=25)
for app in top_apps[:10]:
    print(f"#{app['rank']}: {app['name']} by {app['developer']}")
Data Export and Analysis
import json
import pandas as pd

# Export reviews to various formats
def export_reviews(reviews, app_name):
    # JSON
    with open(f"{app_name}_reviews.json", "w") as f:
        json.dump(reviews, f, indent=2)
    # CSV
    df = pd.DataFrame(reviews)
    df.to_csv(f"{app_name}_reviews.csv", index=False)
    # Sentiment analysis preparation: the RSS feed returns ratings as strings
    df["rating_numeric"] = pd.to_numeric(df["rating"], errors="coerce")
    avg_rating = df["rating_numeric"].mean()
    print(f"Average rating: {avg_rating:.2f}")
    print(f"Total reviews: {len(reviews)}")
    print(f"5-star: {len(df[df['rating_numeric'] == 5])}")
    print(f"1-star: {len(df[df['rating_numeric'] == 1])}")
Monitoring Review Trends Over Time
For ongoing app intelligence, set up automated review monitoring to detect shifts in user sentiment, bug reports after updates, or competitor activity:
import sqlite3

class ReviewMonitor:
    def __init__(self, db_path="app_reviews.db"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute('''CREATE TABLE IF NOT EXISTS reviews
            (review_id TEXT PRIMARY KEY, app_id TEXT, rating INTEGER,
             title TEXT, content TEXT, author TEXT, version TEXT,
             scraped_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)''')

    def store_reviews(self, app_id, reviews):
        """Insert reviews, skipping duplicates; returns the number actually added."""
        new_count = 0
        for review in reviews:
            cursor = self.conn.execute(
                "INSERT OR IGNORE INTO reviews (review_id, app_id, rating, title, content, author, version) VALUES (?, ?, ?, ?, ?, ?, ?)",
                (review.get("id"), app_id, review.get("rating"), review.get("title"),
                 review.get("content"), review.get("author"), review.get("version"))
            )
            # INSERT OR IGNORE never raises IntegrityError on a duplicate key;
            # rowcount is 1 only when a row was actually inserted.
            new_count += cursor.rowcount
        self.conn.commit()
        return new_count

    def get_rating_trend(self, app_id, days=30):
        """Map rating -> count for reviews scraped within the last `days` days."""
        cursor = self.conn.execute(
            "SELECT rating, COUNT(*) FROM reviews WHERE app_id = ? AND scraped_at > datetime('now', ?) GROUP BY rating",
            (app_id, f"-{days} days")
        )
        return dict(cursor.fetchall())
Frequently Asked Questions
Does Apple have an official review API?
Yes, Apple provides the iTunes RSS Feed API for public review access (no authentication needed) and the App Store Connect API for developers (requires developer account). The RSS feed is limited to 500 reviews per app per country.
How often are App Store reviews updated?
New reviews appear in the RSS feed within 24-48 hours of submission. For real-time monitoring, poll every few hours.
Can I scrape competitor app reviews?
Yes, the iTunes RSS feed provides public access to any app’s reviews. Use the app’s numeric ID (available in the App Store URL) to fetch reviews.
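The numeric ID sits after "/id" in the URL path, so a small regex helper covers the common URL shapes (the example URL is illustrative):

```python
import re

def extract_app_id(url):
    """Pull the numeric app ID out of an App Store URL.

    e.g. https://apps.apple.com/us/app/some-app/id1234567890 -> "1234567890"
    Returns None when no ID is present.
    """
    match = re.search(r"/id(\d+)", url)
    return match.group(1) if match else None
```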
What’s the difference between iTunes API and App Store Connect?
The iTunes API provides public data (search, reviews, app metadata) for any app. App Store Connect is Apple’s private API for app developers to access their own app’s detailed analytics, financial reports, and management tools.
How do I handle apps with millions of reviews?
The iTunes RSS feed caps at approximately 500 reviews per country. To maximize coverage, scrape across multiple countries and combine both “mostRecent” and “mostHelpful” sort orders. For apps you own, use App Store Connect which provides complete review access. Third-party services like AppFollow or Appfigures aggregate reviews beyond the RSS feed limit.
Can I track review changes over time?
Yes. Store reviews in a database with timestamps and run scheduled scrapes daily or weekly. Compare new results against stored data to identify new reviews, removed reviews, and rating trend shifts. This is especially useful for monitoring user sentiment after app updates.
Conclusion
Apple’s App Store provides relatively accessible review data through the iTunes RSS feed API. For comprehensive data, combine API access with multi-country scraping and Selenium for additional web-based data.
Visit dataresearchtools.com for proxy recommendations and our app store optimization guide.
Related Reading
- How to Scrape AliExpress Product Data
- How to Scrape Amazon Product Reviews in 2026
- aiohttp + BeautifulSoup: Async Python Scraping
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix