Scraping Grocery Delivery Platforms: HappyFresh, Pandamart, GrabMart
Grocery delivery has become a massive segment within Southeast Asia’s food ecosystem. Platforms like GrabMart, Pandamart (Foodpanda), HappyFresh, and Shopee Supermarket now offer same-day or instant delivery of groceries, household goods, and fresh produce across major cities. For CPG brands, grocery retailers, and market analysts, these platforms generate a continuous stream of pricing, product availability, and promotional data.
This guide covers how to scrape grocery delivery platforms in Southeast Asia for competitive intelligence and market analysis.
The Grocery Delivery Data Opportunity
Market Context
Southeast Asia’s online grocery market is growing rapidly:
- GrabMart: Integrated into the Grab super-app, available across all Grab markets
- Pandamart: Foodpanda’s grocery vertical, operating dark stores and partnering with retailers
- HappyFresh: Dedicated grocery delivery platform in Malaysia, Indonesia, and Thailand
- Shopee Supermarket: Shopee’s grocery offering, tied to the e-commerce ecosystem
- Lazada Groceries: Alibaba-backed grocery delivery
Data Categories
| Data Type | Business Use |
|---|---|
| Product catalog | Competitor product assortment analysis |
| Pricing | Price monitoring and competitive benchmarking |
| Stock availability | Supply chain intelligence |
| Promotions | Promotional strategy analysis |
| Store coverage | Delivery zone and logistics intelligence |
| Product images | Brand compliance monitoring |
| Categories | Category management insights |
Platform-Specific Scraping Approaches
GrabMart
GrabMart operates through the Grab app, serving grocery items from dark stores, convenience stores, and retail partners.
```python
import requests
import time
import random
from datetime import datetime


class GrabMartScraper:
    def __init__(self, proxy_user, proxy_pass, country="SG"):
        self.session = requests.Session()
        self.country = country
        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        self.session.proxies = {
            "http": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080",
            "https": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080"
        }
        self.session.headers.update({
            "User-Agent": "Grab/5.x (Android 14; Samsung Galaxy S24)",
            "Accept": "application/json",
            "Accept-Language": "en-US,en;q=0.9"
        })

    def get_nearby_stores(self, latitude, longitude):
        """Find GrabMart stores near a location."""
        response = self.session.get(
            "https://api.grab.com/mart/v2/stores",
            params={
                "latitude": latitude,
                "longitude": longitude,
                "limit": 50
            }
        )
        if response.status_code != 200:
            return []
        stores = response.json().get("stores", [])
        return [
            {
                "id": s.get("id"),
                "name": s.get("name"),
                "type": s.get("store_type"),  # dark_store, retail, convenience
                "distance_km": s.get("distance"),
                "delivery_fee": s.get("delivery_fee"),
                "delivery_time": s.get("estimated_delivery_minutes"),
                "is_open": s.get("is_open"),
                "min_order": s.get("minimum_order")
            }
            for s in stores
        ]

    def get_store_catalog(self, store_id):
        """Get the full product catalog for a store, page by page."""
        all_products = []
        page = 0
        while True:
            response = self.session.get(
                f"https://api.grab.com/mart/v2/stores/{store_id}/products",
                params={"page": page, "limit": 50}
            )
            if response.status_code != 200:
                break
            data = response.json()
            products = data.get("products", [])
            if not products:
                break
            for product in products:
                all_products.append({
                    "product_id": product.get("id"),
                    "name": product.get("name"),
                    "brand": product.get("brand", ""),
                    "category": product.get("category", ""),
                    "subcategory": product.get("subcategory", ""),
                    "price": product.get("price"),
                    "original_price": product.get("original_price"),
                    "unit": product.get("unit", ""),
                    "unit_price": product.get("price_per_unit"),
                    "in_stock": product.get("available", True),
                    "image_url": product.get("image_url"),
                    "weight": product.get("weight"),
                    "store_id": store_id
                })
            page += 1
            time.sleep(random.uniform(1, 3))
        return all_products
```
Pandamart
Pandamart operates Foodpanda’s dark-store grocery delivery:
```python
class PandamartScraper:
    def __init__(self, proxy_user, proxy_pass, country="SG"):
        self.session = requests.Session()
        self.country = country
        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        self.session.proxies = {
            "http": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080",
            "https": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080"
        }
        self.domains = {
            "SG": "www.foodpanda.sg",
            "MY": "www.foodpanda.my",
            "TH": "www.foodpanda.co.th",
            "PH": "www.foodpanda.ph"
        }
        self.domain = self.domains.get(country, self.domains["SG"])
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36",
            "Accept": "application/json"
        })

    def get_pandamart_stores(self, latitude, longitude):
        """Find Pandamart stores near a location."""
        response = self.session.get(
            f"https://{self.domain}/api/v5/vendors",
            params={
                "latitude": latitude,
                "longitude": longitude,
                "vertical": "shop",
                "limit": 20
            }
        )
        if response.status_code != 200:
            return []
        vendors = response.json().get("data", {}).get("items", [])
        pandamart_stores = [
            v for v in vendors
            if "pandamart" in v.get("name", "").lower() or v.get("is_pandamart")
        ]
        return pandamart_stores

    def get_store_products(self, vendor_code):
        """Get all products from a Pandamart store."""
        response = self.session.get(
            f"https://{self.domain}/api/v5/vendors/{vendor_code}/menu"
        )
        if response.status_code != 200:
            return []
        data = response.json()
        products = []
        for menu in data.get("data", {}).get("menus", []):
            for category in menu.get("menu_categories", []):
                category_name = category.get("name", "")
                for product in category.get("products", []):
                    variations = product.get("product_variations", [{}])
                    primary = variations[0] if variations else {}
                    products.append({
                        "product_id": product.get("id"),
                        "name": product.get("name"),
                        "description": product.get("description", ""),
                        "category": category_name,
                        "price": primary.get("price", 0),
                        "original_price": primary.get("original_price"),
                        "is_available": product.get("is_available", True),
                        "image_url": product.get("file_path", ""),
                        "vendor_code": vendor_code
                    })
        return products
```
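Whichever platform the rows come from, each scrape run is only useful later if it is persisted with a capture timestamp, so successive snapshots can be diffed for price and stock changes. A minimal CSV writer, sketched against the product dicts built above (`save_snapshot` and its column layout are illustrative, not part of any platform API):

```python
import csv
from datetime import datetime, timezone


def save_snapshot(products, path):
    """Write one scrape run to CSV, stamping every row with its capture time."""
    if not products:
        return 0
    captured_at = datetime.now(timezone.utc).isoformat()
    # Column order follows the first product dict, plus the timestamp.
    fieldnames = list(products[0].keys()) + ["captured_at"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows({**p, "captured_at": captured_at} for p in products)
    return len(products)
```

One file per run (e.g. keyed by platform, store, and date) keeps snapshots immutable and makes price-history queries a matter of reading files in order.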
HappyFresh
HappyFresh partners with existing grocery retailers for delivery:
```python
class HappyFreshScraper:
    def __init__(self, proxy_user, proxy_pass, country="MY"):
        self.session = requests.Session()
        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        self.session.proxies = {
            "http": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080",
            "https": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080"
        }
        self.base_url = "https://www.happyfresh.com"
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36",
            "Accept": "application/json"
        })

    def get_available_stores(self, latitude, longitude):
        """Find HappyFresh partner stores."""
        response = self.session.get(
            f"{self.base_url}/api/v3/stores",
            params={"lat": latitude, "lng": longitude}
        )
        if response.status_code != 200:
            return []
        return response.json().get("stores", [])

    def get_category_products(self, store_id, category_id, page=1):
        """Get products in a category from a specific store."""
        response = self.session.get(
            f"{self.base_url}/api/v3/stores/{store_id}/categories/{category_id}/products",
            params={"page": page, "per_page": 48}
        )
        if response.status_code != 200:
            return []
        return response.json().get("products", [])

    def scrape_full_store(self, store_id):
        """Scrape all products from a HappyFresh partner store."""
        # First get the store's category tree
        response = self.session.get(
            f"{self.base_url}/api/v3/stores/{store_id}/categories"
        )
        if response.status_code != 200:
            return []
        categories = response.json().get("categories", [])
        all_products = []
        for category in categories:
            page = 1
            while True:
                products = self.get_category_products(store_id, category["id"], page)
                if not products:
                    break
                for p in products:
                    all_products.append({
                        "product_id": p.get("id"),
                        "name": p.get("name"),
                        "brand": p.get("brand", {}).get("name", ""),
                        "category": category.get("name"),
                        "price": p.get("price"),
                        "original_price": p.get("original_price"),
                        "weight": p.get("weight_or_volume"),
                        "in_stock": p.get("in_stock", True),
                        "store_id": store_id,
                        "image_url": p.get("image_url")
                    })
                if len(products) < 48:
                    break
                page += 1
                time.sleep(random.uniform(1, 2))
            time.sleep(random.uniform(1, 3))
        return all_products
```
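All three scrapers above treat any non-200 response as terminal. In sustained catalog crawls, occasional 429s and transient 5xx responses are common and worth retrying before giving up on a page. A generic wrapper with exponential backoff, sketched here for any requests-style GET callable such as `session.get` (the status-code list and timing constants are reasonable defaults, not platform requirements):

```python
import time
import random


def get_with_retries(get_func, url, params=None, max_attempts=4):
    """Call a requests-style GET with exponential backoff on network
    errors, 429 rate limits, and transient 5xx responses."""
    for attempt in range(max_attempts):
        try:
            response = get_func(url, params=params, timeout=20)
        except Exception:
            response = None  # connection error or timeout: retryable
        if response is not None and response.status_code == 200:
            return response
        retryable = response is None or response.status_code in (429, 500, 502, 503)
        if not retryable:
            return response  # e.g. 404: retrying will not help
        if attempt < max_attempts - 1:
            # Backoff with jitter: roughly 1-2s, 2-4s, 4-8s between attempts
            time.sleep(2 ** attempt * random.uniform(1.0, 2.0))
    return None
```

Dropping this in place of the bare `self.session.get(...)` calls turns one flaky page into a short pause instead of a truncated catalog.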
Cross-Platform Price Comparison
Building a Unified Product Database
```python
from difflib import SequenceMatcher


def build_unified_catalog(grabmart_products, pandamart_products, happyfresh_products):
    """Merge products across platforms for comparison."""
    unified = []
    # Use one platform as the base catalog and match the others against it
    for product in grabmart_products:
        entry = {
            "name": product["name"],
            "brand": product.get("brand", ""),
            "category": product.get("category", ""),
            "platforms": {
                "grabmart": {
                    "price": product["price"],
                    "in_stock": product.get("in_stock", True),
                    "store_id": product.get("store_id")
                }
            }
        }
        # Try to match on Pandamart
        pandamart_match = find_best_match(product["name"], pandamart_products)
        if pandamart_match:
            entry["platforms"]["pandamart"] = {
                "price": pandamart_match["price"],
                "in_stock": pandamart_match.get("is_available", True),
                "vendor_code": pandamart_match.get("vendor_code")
            }
        # Try to match on HappyFresh
        hf_match = find_best_match(product["name"], happyfresh_products)
        if hf_match:
            entry["platforms"]["happyfresh"] = {
                "price": hf_match["price"],
                "in_stock": hf_match.get("in_stock", True),
                "store_id": hf_match.get("store_id")
            }
        unified.append(entry)
    return unified


def find_best_match(product_name, candidate_products, threshold=0.75):
    """Find the best matching product from a list of candidates."""
    best_match = None
    best_score = 0
    name_lower = product_name.lower()
    for candidate in candidate_products:
        candidate_lower = candidate["name"].lower()
        score = SequenceMatcher(None, name_lower, candidate_lower).ratio()
        if score > best_score and score >= threshold:
            best_score = score
            best_match = candidate
    return best_match
```

Price Analysis
```python
def analyze_price_differences(unified_catalog):
    """Analyze price differences across platforms."""
    results = {
        "total_products_compared": 0,
        "cheapest_platform_counts": {},
        "avg_price_differences": {},
        "significant_differences": []
    }
    for product in unified_catalog:
        platforms = product["platforms"]
        if len(platforms) < 2:
            continue
        results["total_products_compared"] += 1
        prices = {p: data["price"] for p, data in platforms.items() if data["price"]}
        if not prices:
            continue
        cheapest = min(prices, key=prices.get)
        results["cheapest_platform_counts"][cheapest] = \
            results["cheapest_platform_counts"].get(cheapest, 0) + 1
        # Check for significant price differences (>15%)
        min_price = min(prices.values())
        max_price = max(prices.values())
        if min_price > 0:
            diff_pct = (max_price - min_price) / min_price * 100
            if diff_pct > 15:
                results["significant_differences"].append({
                    "product": product["name"],
                    "prices": prices,
                    "difference_pct": round(diff_pct, 1),
                    "cheapest_at": cheapest,
                    "potential_saving": round(max_price - min_price, 2)
                })
    return results
```

Stock Availability Monitoring
```python
class StockMonitor:
    def __init__(self, scrapers):
        self.scrapers = scrapers
        self.stock_history = {}

    def check_stock_status(self, product_ids, platform):
        """Check current stock status for a list of products.

        Assumes each scraper exposes a get_product_detail(product_id)
        helper returning a product dict (not shown in the classes above).
        """
        scraper = self.scrapers[platform]
        statuses = []
        for pid in product_ids:
            product = scraper.get_product_detail(pid)
            if not product:
                continue
            status = {
                "product_id": pid,
                "name": product.get("name"),
                "in_stock": product.get("in_stock", True),
                "timestamp": datetime.utcnow().isoformat(),
                "platform": platform
            }
            # Flag transitions against the previous observation
            key = f"{platform}:{pid}"
            if key in self.stock_history:
                prev = self.stock_history[key]
                if prev["in_stock"] != status["in_stock"]:
                    status["stock_changed"] = True
                    status["was_in_stock"] = prev["in_stock"]
            self.stock_history[key] = status
            statuses.append(status)
        return statuses

    def get_out_of_stock_report(self, all_statuses):
        """Generate report on out-of-stock items."""
        if not all_statuses:
            return {"total_checked": 0, "out_of_stock": 0}
        oos_items = [s for s in all_statuses if not s["in_stock"]]
        newly_oos = [s for s in oos_items
                     if s.get("stock_changed") and s.get("was_in_stock")]
        oos_by_platform = {}
        for s in oos_items:
            oos_by_platform[s["platform"]] = oos_by_platform.get(s["platform"], 0) + 1
        return {
            "total_checked": len(all_statuses),
            "out_of_stock": len(oos_items),
            "oos_rate": f"{len(oos_items) / len(all_statuses) * 100:.1f}%",
            "newly_out_of_stock": len(newly_oos),
            "oos_by_platform": oos_by_platform,
            "oos_items": oos_items
        }
```

Category Analysis
```python
def analyze_category_coverage(platform_catalogs):
    """Compare category coverage across platforms."""
    category_data = {}
    for platform, products in platform_catalogs.items():
        categories = {}
        for product in products:
            cat = product.get("category", "Unknown")
            if cat not in categories:
                categories[cat] = {"count": 0, "avg_price": [], "brands": set()}
            categories[cat]["count"] += 1
            if product.get("price"):
                categories[cat]["avg_price"].append(product["price"])
            if product.get("brand"):
                categories[cat]["brands"].add(product["brand"])
        for cat, data in categories.items():
            if cat not in category_data:
                category_data[cat] = {}
            category_data[cat][platform] = {
                "product_count": data["count"],
                "avg_price": round(sum(data["avg_price"]) / len(data["avg_price"]), 2)
                             if data["avg_price"] else 0,
                "brand_count": len(data["brands"])
            }
    return category_data
```
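The resulting category-by-platform matrix makes assortment gaps easy to surface: categories where one platform lists far fewer SKUs than its best-stocked rival. A self-contained sketch operating on that dict shape (`find_assortment_gaps` and the 0.5 ratio are illustrative choices, not fixed methodology):

```python
def find_assortment_gaps(category_data, ratio=0.5):
    """Flag (category, platform) pairs where a platform carries fewer than
    `ratio` times the SKU count of the category's best-stocked platform."""
    gaps = []
    for category, platforms in category_data.items():
        if len(platforms) < 2:
            continue  # nothing to compare against
        best = max(d["product_count"] for d in platforms.values())
        for platform, d in platforms.items():
            if d["product_count"] < best * ratio:
                gaps.append({
                    "category": category,
                    "platform": platform,
                    "product_count": d["product_count"],
                    "category_leader_count": best,
                })
    return gaps


# Illustrative input: pandamart carries 40 dairy SKUs vs grabmart's 120.
sample = {"Dairy": {
    "grabmart": {"product_count": 120, "avg_price": 3.1, "brand_count": 18},
    "pandamart": {"product_count": 40, "avg_price": 2.9, "brand_count": 9},
}}
gaps = find_assortment_gaps(sample)  # pandamart's dairy range is flagged
```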
Proxy Requirements for Grocery Scraping
Grocery delivery platforms present specific scraping challenges:
- Large catalogs: A single store may have 5,000-20,000 products, requiring many paginated requests
- Fresh inventory changes: Stock levels change frequently, requiring real-time access
- Location sensitivity: Product availability and pricing vary by delivery zone
- Multiple platforms: Cross-platform comparison requires reliable access to all targets
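The first bullet translates directly into request budgets. At 50 products per page, a 20,000-SKU store means 400 paginated calls, and with the 1-3 second delays used in the scrapers above, one store takes on the order of 13-20 minutes. A quick planning sketch (store sizes and delays here are illustrative):

```python
def estimate_crawl_minutes(product_count, page_size=50, min_delay=1.0, max_delay=3.0):
    """Rough wall-clock estimate for one store's paginated catalog crawl,
    assuming the per-page sleep dominates over request latency."""
    pages = -(-product_count // page_size)   # ceiling division
    avg_delay = (min_delay + max_delay) / 2  # expected sleep per page
    return pages * avg_delay / 60

# A 20,000-SKU dark store: 400 pages at ~2s average delay each.
print(round(estimate_crawl_minutes(20_000), 1))  # 13.3
```

Multiply by store count and platform count to size a daily monitoring job, and to decide how many concurrent proxy sessions the schedule actually needs.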
DataResearchTools mobile proxies support the high-volume, sustained scraping that grocery catalog monitoring demands. With mobile carrier IPs across all major SEA markets, you can reliably access GrabMart, Pandamart, HappyFresh, and other platforms without detection.
Practical Applications
For CPG Brands
- Monitor your product pricing across all grocery delivery platforms
- Track competitor product availability and promotional activity
- Verify distribution coverage and store-level availability
- Analyze market share by category
For Grocery Retailers
- Benchmark your pricing against competing platforms
- Monitor delivery fee structures and minimum order thresholds
- Track product assortment gaps
- Analyze category trends
For Market Analysts
- Size the online grocery market by category and geography
- Track platform growth through product catalog expansion
- Analyze pricing trends across fresh produce, FMCG, and household goods
- Monitor promotional intensity as a market maturation indicator
Conclusion
Grocery delivery platforms in Southeast Asia represent a rich data source for competitive intelligence. By using DataResearchTools mobile proxies to systematically scrape GrabMart, Pandamart, HappyFresh, and other platforms, businesses can build comprehensive views of pricing, availability, and promotional activity across the region’s fast-growing online grocery market.
Start with one platform and one market, validate your data collection pipeline, and expand as needed. The key is building robust product matching logic that enables meaningful cross-platform comparisons despite differences in naming conventions and catalog structures.
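For that product matching logic, plain string similarity (as in `find_best_match` above) is fragile on its own: "1L" vs "1 Litre" or a stray comma can sink the ratio below threshold. Normalizing names before comparison helps considerably; here is one possible sketch (the unit-synonym table is illustrative and should be extended per market and language):

```python
import re

# Assumed synonym table -- extend with local spellings per market.
UNIT_SYNONYMS = {"litre": "l", "liter": "l", "millilitre": "ml",
                 "gram": "g", "kilogram": "kg", "pack": "pk"}


def normalize_product_name(name):
    """Lowercase, unify unit spellings, join number+unit ('1 l' -> '1l'),
    and strip punctuation (keeping decimal points) before fuzzy matching."""
    s = name.lower()
    for word, short in UNIT_SYNONYMS.items():
        s = re.sub(rf"\b{word}s?\b", short, s)
    s = re.sub(r"(\d)\s+(ml|l|g|kg|pk)\b", r"\1\2", s)  # join number + unit
    s = re.sub(r"[^\w\s.]", " ", s)                      # drop punctuation
    return " ".join(s.split())                           # collapse whitespace

print(normalize_product_name("Meiji Fresh Milk, 1 Litre"))  # meiji fresh milk 1l
```

Running both sides of a comparison through this before `SequenceMatcher` noticeably raises match rates at the same threshold.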
Related Reading
- Best Proxies for Food Delivery Platform Scraping
- How Cloud Kitchens Use Proxies for Competitive Menu Analysis
- aiohttp + BeautifulSoup: Async Python Scraping
- How to Scrape AliExpress Product Data Without Getting Blocked
- Amazon Buy Box Monitoring: Proxy Setup for Continuous Tracking
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)