How to Scrape App Store Reviews (iOS)
Apple’s App Store hosts over 1.8 million apps with billions of user reviews. For app developers, product managers, and competitive intelligence teams, App Store review data provides direct insight into user satisfaction, feature requests, and competitive positioning.
What Data Can You Extract?
- App metadata (name, developer, category, price, rating)
- User reviews (text, rating, date, helpful count)
- Version-specific reviews
- App screenshots and descriptions
- In-app purchase details
- App update history
- Developer information
Example JSON Output
{
  "app_id": "1234567890",
  "name": "ProxyManager",
  "developer": "Tech Corp",
  "rating": 4.7,
  "review_count": 15432,
  "price": "Free",
  "category": "Utilities",
  "reviews": [{
    "id": "9876543210",
    "title": "Great app!",
    "content": "This is exactly what I needed for managing my proxies...",
    "rating": 5,
    "author": "TechUser42",
    "date": "2026-02-28",
    "version": "3.2.1",
    "helpful_count": 15
  }]
}
Prerequisites
pip install requests beautifulsoup4 lxml
Apple’s App Store data can be accessed via the iTunes API. No proxies are typically needed for API access.
Method 1: iTunes Search and Lookup API
import requests
import time

class AppStoreScraper:
    def __init__(self, country="us"):
        self.country = country
        self.session = requests.Session()

    def search_apps(self, term, limit=25):
        """Search the App Store catalog via the iTunes Search API."""
        url = "https://itunes.apple.com/search"
        params = {"term": term, "country": self.country, "media": "software", "limit": limit}
        response = self.session.get(url, params=params, timeout=30)
        data = response.json()
        return [{
            "id": app.get("trackId"),
            "name": app.get("trackName"),
            "developer": app.get("artistName"),
            "rating": app.get("averageUserRating"),
            "rating_count": app.get("userRatingCount"),
            "price": app.get("formattedPrice"),
            "category": app.get("primaryGenreName"),
            "description": app.get("description", "")[:500],
            "url": app.get("trackViewUrl"),
            "icon": app.get("artworkUrl512"),
            "version": app.get("version"),
            "bundle_id": app.get("bundleId"),
        } for app in data.get("results", [])]

    def get_app_details(self, app_id):
        """Look up full metadata for a single app by its numeric ID."""
        url = "https://itunes.apple.com/lookup"
        params = {"id": app_id, "country": self.country}
        response = self.session.get(url, params=params, timeout=30)
        data = response.json()
        results = data.get("results", [])
        return results[0] if results else None

    def get_reviews(self, app_id, page=1, sort="mostRecent"):
        """Fetch one page of reviews from the public RSS feed."""
        url = (f"https://itunes.apple.com/{self.country}/rss/customerreviews/"
               f"id={app_id}/page={page}/sortBy={sort}/json")
        response = self.session.get(url, timeout=30)
        if response.status_code != 200:
            return []
        data = response.json()
        entries = data.get("feed", {}).get("entry", [])
        reviews = []
        for entry in entries:
            # The feed's first entry is app metadata, not a review;
            # only review entries carry a "content" field.
            if isinstance(entry, dict) and "content" in entry:
                reviews.append({
                    "id": entry.get("id", {}).get("label"),
                    "title": entry.get("title", {}).get("label"),
                    "content": entry.get("content", {}).get("label"),
                    "rating": entry.get("im:rating", {}).get("label"),
                    "author": entry.get("author", {}).get("name", {}).get("label"),
                    "version": entry.get("im:version", {}).get("label"),
                    "vote_count": entry.get("im:voteCount", {}).get("label"),
                })
        return reviews

    def get_all_reviews(self, app_id, max_pages=10):
        """Page through the feed until it runs out of reviews."""
        all_reviews = []
        for page in range(1, max_pages + 1):
            reviews = self.get_reviews(app_id, page=page)
            if not reviews:
                break
            all_reviews.extend(reviews)
            time.sleep(1)  # be polite between pages
        return all_reviews

# Usage
scraper = AppStoreScraper(country="us")
apps = scraper.search_apps("proxy vpn", limit=10)
for app in apps[:3]:
    reviews = scraper.get_all_reviews(app["id"], max_pages=5)
    print(f"{app['name']} ({app['rating']}): {len(reviews)} reviews")
Proxy Recommendations
Proxies are rarely needed for the iTunes API, but use them for high-volume scraping or region-specific data. Residential proxies from the target country provide accurate regional data.
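If you do route traffic through a proxy, the standard approach with requests is a per-scheme proxies mapping on the session. A minimal sketch; the proxy URL below is a placeholder, not a real endpoint:

```python
import requests

def make_proxied_session(proxy_url=None):
    """Build a requests.Session that optionally routes traffic through a proxy.

    proxy_url is a placeholder, e.g. "http://user:pass@proxy.example.com:8080".
    """
    session = requests.Session()
    if proxy_url:
        # requests selects the proxy by the scheme of each outgoing request
        session.proxies = {"http": proxy_url, "https": proxy_url}
    return session

# Hypothetical endpoint; the iTunes API works without a proxy at low volume
session = make_proxied_session("http://user:pass@proxy.example.com:8080")
```

Pass a session built this way into the scraper classes above in place of their own `requests.Session()` if you need region-specific results.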
Legal Considerations
- iTunes API: The RSS feed API is designed for public consumption.
- Review Content: Reviews are user-generated and copyrighted.
- Apple Guidelines: Follow Apple’s API usage guidelines.
- Rate Limits: Respect API rate limits.
See our compliance guide.
Method 2: Scraping with Selenium
For data not available via the iTunes API, use Selenium to scrape the App Store web interface:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import time

class AppStoreSeleniumScraper:
    def __init__(self, proxy=None):
        options = Options()
        options.add_argument("--headless=new")
        options.add_argument("--no-sandbox")
        if proxy:
            options.add_argument(f"--proxy-server={proxy}")
        self.driver = webdriver.Chrome(options=options)

    def scrape_app_page(self, url):
        self.driver.get(url)
        time.sleep(3)
        try:
            WebDriverWait(self.driver, 15).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, "h1"))
            )
        except Exception:
            return None
        data = self.driver.execute_script('''
            const result = {};
            const title = document.querySelector("h1");
            result.title = title ? title.innerText.trim() : null;
            const rating = document.querySelector("[class*='rating']");
            result.rating = rating ? rating.innerText.trim() : null;
            const description = document.querySelector("[class*='description']");
            result.description = description ? description.innerText.substring(0, 500) : null;
            // Extract from JSON-LD, which carries structured app metadata
            const scripts = document.querySelectorAll('script[type="application/ld+json"]');
            for (const script of scripts) {
                try {
                    const json = JSON.parse(script.textContent);
                    if (json["@type"] === "SoftwareApplication") {
                        result.structured_data = json;
                    }
                } catch {}
            }
            return result;
        ''')
        return data

    def scrape_reviews_page(self, app_url, max_reviews=50):
        reviews_url = app_url.replace("/app/", "/app/reviews/")
        self.driver.get(reviews_url)
        time.sleep(3)
        # Scroll to trigger lazy loading of reviews
        for _ in range(5):
            self.driver.execute_script("window.scrollBy(0, 800);")
            time.sleep(1)
        reviews = self.driver.execute_script('''
            const items = [];
            document.querySelectorAll("[class*='review']").forEach(el => {
                const title = el.querySelector("[class*='title']");
                const body = el.querySelector("[class*='body']");
                const stars = el.querySelector("[class*='star']");
                items.push({
                    title: title ? title.innerText.trim() : null,
                    body: body ? body.innerText.trim() : null,
                    stars: stars ? stars.getAttribute("aria-label") : null,
                });
            });
            return items;
        ''')
        return reviews[:max_reviews]

    def close(self):
        self.driver.quit()
Handling App Store Anti-Bot Protections
1. Rate Limiting on iTunes API
The iTunes RSS feed API has rate limits. Implement delays of 1-2 seconds between requests and avoid bursts.
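The delay logic above can be sketched as a small exponential-backoff helper. The 403/429 status codes and the delay schedule are reasonable defaults, not documented Apple limits:

```python
import time

def backoff_delay(attempt, base_delay=1.0, cap=30.0):
    """Delay before retry number `attempt` (0-indexed): 1s, 2s, 4s, ... capped."""
    return min(base_delay * (2 ** attempt), cap)

def get_with_backoff(session, url, max_retries=4, base_delay=1.0):
    """GET through an existing session, backing off on 403/429 responses.

    `session` is anything with a requests-style .get() method,
    e.g. requests.Session().
    """
    response = None
    for attempt in range(max_retries):
        response = session.get(url, timeout=30)
        if response.status_code not in (403, 429):
            break  # success or a non-throttling error; let the caller decide
        time.sleep(backoff_delay(attempt, base_delay=base_delay))
    return response
```

Dropping this into `AppStoreScraper.get_reviews` in place of the bare `session.get` call makes page loops resilient to occasional throttling.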
2. Regional Content
App Store content varies by country. Use the country parameter in API requests to get region-specific data:
# Available country codes (a sample; Apple supports many more storefronts)
countries = ["us", "gb", "jp", "au", "ca", "de", "fr", "in", "br"]
for country in countries:
    scraper = AppStoreScraper(country=country)
    reviews = scraper.get_reviews(app_id, page=1)
    print(f"{country}: {len(reviews)} reviews")
    time.sleep(1)
3. Review Pagination
The RSS feed API limits reviews to 10 pages (approximately 500 reviews). For comprehensive review collection, combine multiple sort orders and country codes.
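A sketch of the combining step: merge the per-sort (or per-country) result lists and deduplicate on the review ID the feed returns. The commented usage assumes the AppStoreScraper class from Method 1:

```python
def merge_reviews(*review_lists):
    """Merge review lists from different sort orders or countries,
    deduplicating by review ID. First occurrence wins, so pass the
    preferred ordering first; reviews without an ID are kept as-is."""
    seen = set()
    merged = []
    for reviews in review_lists:
        for review in reviews:
            rid = review.get("id")
            if rid is not None:
                if rid in seen:
                    continue
                seen.add(rid)
            merged.append(review)
    return merged

# Usage with the AppStoreScraper from Method 1 (makes network calls):
# recent  = scraper.get_all_reviews(app_id)  # sortBy=mostRecent
# helpful = [r for p in range(1, 11)
#            for r in scraper.get_reviews(app_id, page=p, sort="mostHelpful")]
# combined = merge_reviews(recent, helpful)
```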
4. App Store Connect API
For app developers, Apple’s App Store Connect API provides additional data including sales, downloads, and crash reports. This requires developer account authentication.
Method 3: Scraping App Rankings and Charts
Beyond individual app reviews, you can scrape App Store charts and category rankings to track competitive positioning:
import requests

class AppStoreChartScraper:
    def __init__(self, country="us"):
        self.country = country
        self.session = requests.Session()

    def get_top_apps(self, genre_id=36, limit=100, chart="topfreeapplications"):
        """Fetch top apps from a specific category.

        Genre IDs: 36=All, 6007=Productivity, 6005=Social Networking, 6002=Utilities
        Charts: topfreeapplications, toppaidapplications, topgrossingapplications
        """
        url = f"https://itunes.apple.com/{self.country}/rss/{chart}/limit={limit}/genre={genre_id}/json"
        response = self.session.get(url, timeout=30)
        if response.status_code != 200:
            return []
        data = response.json()
        entries = data.get("feed", {}).get("entry", [])
        apps = []
        for i, entry in enumerate(entries):
            apps.append({
                "rank": i + 1,  # feed entries arrive in chart order
                "name": entry.get("im:name", {}).get("label"),
                "app_id": entry.get("id", {}).get("attributes", {}).get("im:id"),
                "developer": entry.get("im:artist", {}).get("label"),
                "category": entry.get("category", {}).get("attributes", {}).get("label"),
                "price": entry.get("im:price", {}).get("attributes", {}).get("amount"),
                "summary": entry.get("summary", {}).get("label", "")[:200],
                "icon": entry.get("im:image", [{}])[-1].get("label") if entry.get("im:image") else None,
            })
        return apps

    def track_ranking_changes(self, app_id, genre_id=36, limit=200):
        """Check an app's current ranking position."""
        apps = self.get_top_apps(genre_id=genre_id, limit=limit)
        for app in apps:
            if str(app["app_id"]) == str(app_id):
                return app["rank"]
        return None  # Not in top charts

# Usage
chart_scraper = AppStoreChartScraper(country="us")
top_apps = chart_scraper.get_top_apps(genre_id=6002, limit=25)
for app in top_apps[:10]:
    print(f"#{app['rank']}: {app['name']} by {app['developer']}")
Data Export and Analysis
import json
import pandas as pd

# Export reviews to various formats
def export_reviews(reviews, app_name):
    # JSON
    with open(f"{app_name}_reviews.json", "w") as f:
        json.dump(reviews, f, indent=2)
    # CSV
    df = pd.DataFrame(reviews)
    df.to_csv(f"{app_name}_reviews.csv", index=False)
    # Sentiment analysis preparation: the RSS feed returns ratings as strings
    df["rating_numeric"] = pd.to_numeric(df["rating"], errors="coerce")
    avg_rating = df["rating_numeric"].mean()
    print(f"Average rating: {avg_rating:.2f}")
    print(f"Total reviews: {len(reviews)}")
    print(f"5-star: {len(df[df['rating_numeric'] == 5])}")
    print(f"1-star: {len(df[df['rating_numeric'] == 1])}")
Monitoring Review Trends Over Time
For ongoing app intelligence, set up automated review monitoring to detect shifts in user sentiment, bug reports after updates, or competitor activity:
import sqlite3

class ReviewMonitor:
    def __init__(self, db_path="app_reviews.db"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute('''CREATE TABLE IF NOT EXISTS reviews
            (review_id TEXT PRIMARY KEY, app_id TEXT, rating INTEGER,
             title TEXT, content TEXT, author TEXT, version TEXT,
             scraped_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)''')

    def store_reviews(self, app_id, reviews):
        """Insert reviews, skipping duplicates; returns the number actually added."""
        new_count = 0
        for review in reviews:
            cursor = self.conn.execute(
                "INSERT OR IGNORE INTO reviews (review_id, app_id, rating, title, content, author, version) VALUES (?, ?, ?, ?, ?, ?, ?)",
                (review.get("id"), app_id, review.get("rating"), review.get("title"),
                 review.get("content"), review.get("author"), review.get("version"))
            )
            # INSERT OR IGNORE never raises IntegrityError on a duplicate key;
            # rowcount is 1 only when a row was actually inserted.
            new_count += cursor.rowcount
        self.conn.commit()
        return new_count

    def get_rating_trend(self, app_id, days=30):
        """Map rating -> count for reviews scraped within the last `days` days."""
        cursor = self.conn.execute(
            "SELECT rating, COUNT(*) FROM reviews WHERE app_id = ? AND scraped_at > datetime('now', ?) GROUP BY rating",
            (app_id, f"-{days} days")
        )
        return dict(cursor.fetchall())
Frequently Asked Questions
Does Apple have an official review API?
Yes, Apple provides the iTunes RSS Feed API for public review access (no authentication needed) and the App Store Connect API for developers (requires developer account). The RSS feed is limited to 500 reviews per app per country.
How often are App Store reviews updated?
New reviews appear in the RSS feed within 24-48 hours of submission. For real-time monitoring, poll every few hours.
Can I scrape competitor app reviews?
Yes, the iTunes RSS feed provides public access to any app’s reviews. Use the app’s numeric ID (available in the App Store URL) to fetch reviews.
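The numeric ID sits after "/id" in the URL path, so a small regex helper covers the common URL shapes (the example URL is illustrative):

```python
import re

def extract_app_id(url):
    """Pull the numeric app ID out of an App Store URL.

    e.g. https://apps.apple.com/us/app/some-app/id1234567890 -> "1234567890"
    Returns None when no ID is present.
    """
    match = re.search(r"/id(\d+)", url)
    return match.group(1) if match else None
```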
What’s the difference between iTunes API and App Store Connect?
The iTunes API provides public data (search, reviews, app metadata) for any app. App Store Connect is Apple’s private API for app developers to access their own app’s detailed analytics, financial reports, and management tools.
How do I handle apps with millions of reviews?
The iTunes RSS feed caps at approximately 500 reviews per country. To maximize coverage, scrape across multiple countries and combine both “mostRecent” and “mostHelpful” sort orders. For apps you own, use App Store Connect which provides complete review access. Third-party services like AppFollow or Appfigures aggregate reviews beyond the RSS feed limit.
Can I track review changes over time?
Yes. Store reviews in a database with timestamps and run scheduled scrapes daily or weekly. Compare new results against stored data to identify new reviews, removed reviews, and rating trend shifts. This is especially useful for monitoring user sentiment after app updates.
Conclusion
Apple’s App Store provides relatively accessible review data through the iTunes RSS feed API. For comprehensive data, combine API access with multi-country scraping and Selenium for additional web-based data.
Visit dataresearchtools.com for proxy recommendations and our app store optimization guide.
Related Reading
- How to Scrape AliExpress Product Data
- How to Scrape Amazon Product Reviews in 2026
- aiohttp + BeautifulSoup: Async Python Scraping
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix