Scraping Grocery Delivery Platforms: HappyFresh, Pandamart, GrabMart

Grocery delivery has become a massive segment within Southeast Asia’s food ecosystem. Platforms like GrabMart, Pandamart (Foodpanda), HappyFresh, and Shopee Supermarket now offer same-day or instant delivery of groceries, household goods, and fresh produce across major cities. For CPG brands, grocery retailers, and market analysts, these platforms generate a continuous stream of pricing, product availability, and promotional data.

This guide covers how to scrape grocery delivery platforms in Southeast Asia for competitive intelligence and market analysis.

The Grocery Delivery Data Opportunity

Market Context

Southeast Asia’s online grocery market is growing rapidly, with several major platforms competing for share:

  • GrabMart: Integrated into the Grab super-app, available across all Grab markets
  • Pandamart: Foodpanda’s grocery vertical, operating dark stores and partnering with retailers
  • HappyFresh: Dedicated grocery delivery platform in Malaysia, Indonesia, and Thailand
  • Shopee Supermarket: Shopee’s grocery offering, tied to the e-commerce ecosystem
  • Lazada Groceries: Alibaba-backed grocery delivery

Data Categories

  • Product catalog: Competitor product assortment analysis
  • Pricing: Price monitoring and competitive benchmarking
  • Stock availability: Supply chain intelligence
  • Promotions: Promotional strategy analysis
  • Store coverage: Delivery zone and logistics intelligence
  • Product images: Brand compliance monitoring
  • Categories: Category management insights
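Whatever platform the data comes from, downstream analysis is much easier if every scraper emits the same record shape. A minimal sketch of such a normalized record (field names here are illustrative choices, not taken from any platform's API):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GroceryProduct:
    """Normalized product record shared across platform scrapers."""
    platform: str
    store_id: str
    product_id: str
    name: str
    price: float
    brand: str = ""
    category: str = ""
    original_price: Optional[float] = None
    in_stock: bool = True
    image_url: str = ""

    @property
    def discount_pct(self) -> float:
        """Promotion depth, derived from list price vs. current price."""
        if self.original_price and self.original_price > self.price:
            return round(
                (self.original_price - self.price) / self.original_price * 100, 1
            )
        return 0.0
```

Each platform-specific scraper then only needs a small mapping step from its raw JSON into this shape, and everything downstream (price comparison, stock monitoring, category analysis) works on one schema.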

Platform-Specific Scraping Approaches

GrabMart

GrabMart operates through the Grab app, serving grocery items from dark stores, convenience stores, and retail partners.

import requests
import time
import random
from datetime import datetime

class GrabMartScraper:
    def __init__(self, proxy_user, proxy_pass, country="SG"):
        self.session = requests.Session()
        self.country = country

        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        self.session.proxies = {
            "http": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080",
            "https": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080"
        }
        self.session.headers.update({
            "User-Agent": "Grab/5.x (Android 14; Samsung Galaxy S24)",
            "Accept": "application/json",
            "Accept-Language": "en-US,en;q=0.9"
        })

    def get_nearby_stores(self, latitude, longitude):
        """Find GrabMart stores near a location."""
        response = self.session.get(
            "https://api.grab.com/mart/v2/stores",
            params={
                "latitude": latitude,
                "longitude": longitude,
                "limit": 50
            }
        )

        if response.status_code != 200:
            return []

        stores = response.json().get("stores", [])
        return [
            {
                "id": s.get("id"),
                "name": s.get("name"),
                "type": s.get("store_type"),  # dark_store, retail, convenience
                "distance_km": s.get("distance"),
                "delivery_fee": s.get("delivery_fee"),
                "delivery_time": s.get("estimated_delivery_minutes"),
                "is_open": s.get("is_open"),
                "min_order": s.get("minimum_order")
            }
            for s in stores
        ]

    def get_store_catalog(self, store_id):
        """Get full product catalog for a store."""
        all_products = []
        page = 0

        while True:
            response = self.session.get(
                f"https://api.grab.com/mart/v2/stores/{store_id}/products",
                params={"page": page, "limit": 50}
            )

            if response.status_code != 200:
                break

            data = response.json()
            products = data.get("products", [])

            if not products:
                break

            for product in products:
                all_products.append({
                    "product_id": product.get("id"),
                    "name": product.get("name"),
                    "brand": product.get("brand", ""),
                    "category": product.get("category", ""),
                    "subcategory": product.get("subcategory", ""),
                    "price": product.get("price"),
                    "original_price": product.get("original_price"),
                    "unit": product.get("unit", ""),
                    "unit_price": product.get("price_per_unit"),
                    "in_stock": product.get("available", True),
                    "image_url": product.get("image_url"),
                    "weight": product.get("weight"),
                    "store_id": store_id
                })

            page += 1
            time.sleep(random.uniform(1, 3))

        return all_products

Pandamart

Pandamart operates Foodpanda’s dark store grocery delivery:

class PandamartScraper:
    def __init__(self, proxy_user, proxy_pass, country="SG"):
        self.session = requests.Session()
        self.country = country

        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        self.session.proxies = {
            "http": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080",
            "https": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080"
        }

        self.domains = {
            "SG": "www.foodpanda.sg",
            "MY": "www.foodpanda.my",
            "TH": "www.foodpanda.co.th",
            "PH": "www.foodpanda.ph"
        }
        self.domain = self.domains.get(country, self.domains["SG"])

        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36",
            "Accept": "application/json"
        })

    def get_pandamart_stores(self, latitude, longitude):
        """Find Pandamart stores near a location."""
        response = self.session.get(
            f"https://{self.domain}/api/v5/vendors",
            params={
                "latitude": latitude,
                "longitude": longitude,
                "vertical": "shop",
                "limit": 20
            }
        )

        if response.status_code != 200:
            return []

        vendors = response.json().get("data", {}).get("items", [])
        pandamart_stores = [
            v for v in vendors
            if "pandamart" in v.get("name", "").lower() or v.get("is_pandamart")
        ]

        return pandamart_stores

    def get_store_products(self, vendor_code):
        """Get all products from a Pandamart store."""
        response = self.session.get(
            f"https://{self.domain}/api/v5/vendors/{vendor_code}/menu"
        )

        if response.status_code != 200:
            return []

        data = response.json()
        products = []

        for menu in data.get("data", {}).get("menus", []):
            for category in menu.get("menu_categories", []):
                category_name = category.get("name", "")
                for product in category.get("products", []):
                    variations = product.get("product_variations", [{}])
                    primary = variations[0] if variations else {}

                    products.append({
                        "product_id": product.get("id"),
                        "name": product.get("name"),
                        "description": product.get("description", ""),
                        "category": category_name,
                        "price": primary.get("price", 0),
                        "original_price": primary.get("original_price"),
                        "is_available": product.get("is_available", True),
                        "image_url": product.get("file_path", ""),
                        "vendor_code": vendor_code
                    })

        return products

HappyFresh

HappyFresh partners with existing grocery retailers for delivery:

class HappyFreshScraper:
    def __init__(self, proxy_user, proxy_pass, country="MY"):
        self.session = requests.Session()

        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        self.session.proxies = {
            "http": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080",
            "https": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080"
        }

        self.base_url = "https://www.happyfresh.com"
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36",
            "Accept": "application/json"
        })

    def get_available_stores(self, latitude, longitude):
        """Find HappyFresh partner stores."""
        response = self.session.get(
            f"{self.base_url}/api/v3/stores",
            params={"lat": latitude, "lng": longitude}
        )

        if response.status_code != 200:
            return []

        return response.json().get("stores", [])

    def get_category_products(self, store_id, category_id, page=1):
        """Get products in a category from a specific store."""
        response = self.session.get(
            f"{self.base_url}/api/v3/stores/{store_id}/categories/{category_id}/products",
            params={"page": page, "per_page": 48}
        )

        if response.status_code != 200:
            return []

        return response.json().get("products", [])

    def scrape_full_store(self, store_id):
        """Scrape all products from a HappyFresh partner store."""
        # First get categories
        response = self.session.get(
            f"{self.base_url}/api/v3/stores/{store_id}/categories"
        )

        if response.status_code != 200:
            return []

        categories = response.json().get("categories", [])
        all_products = []

        for category in categories:
            page = 1
            while True:
                products = self.get_category_products(store_id, category["id"], page)
                if not products:
                    break

                for p in products:
                    all_products.append({
                        "product_id": p.get("id"),
                        "name": p.get("name"),
                        "brand": p.get("brand", {}).get("name", ""),
                        "category": category.get("name"),
                        "price": p.get("price"),
                        "original_price": p.get("original_price"),
                        "weight": p.get("weight_or_volume"),
                        "in_stock": p.get("in_stock", True),
                        "store_id": store_id,
                        "image_url": p.get("image_url")
                    })

                if len(products) < 48:
                    break
                page += 1
                time.sleep(random.uniform(1, 2))

            time.sleep(random.uniform(1, 3))

        return all_products

Cross-Platform Price Comparison

Building a Unified Product Database

def build_unified_catalog(grabmart_products, pandamart_products, happyfresh_products):
    """Merge products across platforms for comparison."""

    unified = []

    # Use one platform as base and match others
    for product in grabmart_products:
        entry = {
            "name": product["name"],
            "brand": product.get("brand", ""),
            "category": product.get("category", ""),
            "platforms": {
                "grabmart": {
                    "price": product["price"],
                    "in_stock": product.get("in_stock", True),
                    "store_id": product.get("store_id")
                }
            }
        }

        # Try to match on Pandamart
        pandamart_match = find_best_match(product["name"], pandamart_products)
        if pandamart_match:
            entry["platforms"]["pandamart"] = {
                "price": pandamart_match["price"],
                "in_stock": pandamart_match.get("is_available", True),
                "vendor_code": pandamart_match.get("vendor_code")
            }

        # Try to match on HappyFresh
        hf_match = find_best_match(product["name"], happyfresh_products)
        if hf_match:
            entry["platforms"]["happyfresh"] = {
                "price": hf_match["price"],
                "in_stock": hf_match.get("in_stock", True),
                "store_id": hf_match.get("store_id")
            }

        unified.append(entry)

    return unified

def find_best_match(product_name, candidate_products, threshold=0.75):
    """Find the best matching product from a list of candidates."""
    from difflib import SequenceMatcher
    best_match = None
    best_score = 0

    name_lower = product_name.lower()
    for candidate in candidate_products:
        candidate_lower = candidate["name"].lower()
        score = SequenceMatcher(None, name_lower, candidate_lower).ratio()
        if score > best_score and score >= threshold:
            best_score = score
            best_match = candidate

    return best_match

Price Analysis

def analyze_price_differences(unified_catalog):
    """Analyze price differences across platforms."""
    results = {
        "total_products_compared": 0,
        "cheapest_platform_counts": {},
        "avg_price_differences": {},
        "significant_differences": []
    }

    for product in unified_catalog:
        platforms = product["platforms"]
        if len(platforms) < 2:
            continue

        results["total_products_compared"] += 1

        prices = {p: data["price"] for p, data in platforms.items() if data["price"]}
        if not prices:
            continue

        cheapest = min(prices, key=prices.get)
        results["cheapest_platform_counts"][cheapest] = \
            results["cheapest_platform_counts"].get(cheapest, 0) + 1

        # Check for significant price differences (>15%)
        min_price = min(prices.values())
        max_price = max(prices.values())
        if min_price > 0:
            diff_pct = (max_price - min_price) / min_price * 100
            if diff_pct > 15:
                results["significant_differences"].append({
                    "product": product["name"],
                    "prices": prices,
                    "difference_pct": round(diff_pct, 1),
                    "cheapest_at": cheapest,
                    "potential_saving": round(max_price - min_price, 2)
                })

    return results

Stock Availability Monitoring

class StockMonitor:
    def __init__(self, scrapers):
        self.scrapers = scrapers
        self.stock_history = {}

    def check_stock_status(self, product_ids, platform):
        """Check current stock status for a list of products."""
        scraper = self.scrapers[platform]
        statuses = []

        for pid in product_ids:
            # Assumes each platform scraper exposes a get_product_detail(product_id)
            # method returning a dict with at least "name" and "in_stock"
            product = scraper.get_product_detail(pid)
            if product:
                status = {
                    "product_id": pid,
                    "name": product.get("name"),
                    "in_stock": product.get("in_stock", True),
                    "timestamp": datetime.utcnow().isoformat(),
                    "platform": platform
                }
                statuses.append(status)

                # Track stock changes
                key = f"{platform}:{pid}"
                if key in self.stock_history:
                    prev = self.stock_history[key]
                    if prev["in_stock"] != status["in_stock"]:
                        status["stock_changed"] = True
                        status["was_in_stock"] = prev["in_stock"]
                self.stock_history[key] = status

        return statuses

    def get_out_of_stock_report(self, all_statuses):
        """Generate report on out-of-stock items."""
        if not all_statuses:
            return {"total_checked": 0, "out_of_stock": 0, "oos_rate": "0.0%",
                    "newly_out_of_stock": 0, "oos_by_platform": {}, "oos_items": []}

        oos_items = [s for s in all_statuses if not s["in_stock"]]
        newly_oos = [s for s in oos_items if s.get("stock_changed") and s.get("was_in_stock")]

        oos_by_platform = {}
        for item in oos_items:
            platform = item["platform"]
            oos_by_platform[platform] = oos_by_platform.get(platform, 0) + 1

        return {
            "total_checked": len(all_statuses),
            "out_of_stock": len(oos_items),
            "oos_rate": f"{len(oos_items) / len(all_statuses) * 100:.1f}%",
            "newly_out_of_stock": len(newly_oos),
            "oos_by_platform": oos_by_platform,
            "oos_items": oos_items
        }

Category Analysis

def analyze_category_coverage(platform_catalogs):
    """Compare category coverage across platforms."""
    category_data = {}

    for platform, products in platform_catalogs.items():
        categories = {}
        for product in products:
            cat = product.get("category", "Unknown")
            if cat not in categories:
                categories[cat] = {"count": 0, "avg_price": [], "brands": set()}
            categories[cat]["count"] += 1
            if product.get("price"):
                categories[cat]["avg_price"].append(product["price"])
            if product.get("brand"):
                categories[cat]["brands"].add(product["brand"])

        for cat, data in categories.items():
            if cat not in category_data:
                category_data[cat] = {}
            category_data[cat][platform] = {
                "product_count": data["count"],
                "avg_price": round(sum(data["avg_price"]) / len(data["avg_price"]), 2) if data["avg_price"] else 0,
                "brand_count": len(data["brands"])
            }

    return category_data

Proxy Requirements for Grocery Scraping

Grocery delivery platforms present specific scraping challenges:

  1. Large catalogs: A single store may have 5,000-20,000 products, requiring many paginated requests
  2. Fresh inventory changes: Stock levels change frequently, requiring real-time access
  3. Location sensitivity: Product availability and pricing vary by delivery zone
  4. Multiple platforms: Cross-platform comparison requires reliable access to all targets
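Because a full catalog crawl makes thousands of paginated requests per store, transient failures and rate limits are inevitable on long runs. A small retry helper with jittered exponential backoff keeps the pagination loops above resilient; this is a generic sketch, not tied to any specific platform's rate-limit semantics:

```python
import random
import time

def backoff_delay(attempt, base_delay=2.0, cap=60.0):
    """Exponential delay with full jitter: uniform(0, min(cap, base * 2^attempt)).

    Jitter prevents many workers from retrying in lockstep after a shared
    rate-limit event.
    """
    return random.uniform(0, min(cap, base_delay * (2 ** attempt)))

def fetch_with_backoff(session, url, params=None, max_retries=4, base_delay=2.0):
    """GET with retries on rate-limit (429) and transient server errors.

    Returns the response on success, or None once retries are exhausted
    or a non-retryable status is seen.
    """
    for attempt in range(max_retries + 1):
        response = session.get(url, params=params, timeout=30)
        if response.status_code == 200:
            return response
        if response.status_code in (429, 500, 502, 503) and attempt < max_retries:
            time.sleep(backoff_delay(attempt, base_delay))
            continue
        return None  # non-retryable status (e.g. 404) or retries exhausted
    return None
```

Swapping `self.session.get(...)` calls in the scrapers above for `fetch_with_backoff(self.session, ...)` lets a crawl survive intermittent 429s instead of silently truncating a catalog at the first failed page.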

DataResearchTools mobile proxies support the high-volume, sustained scraping that grocery catalog monitoring demands. With mobile carrier IPs across all major SEA markets, you can reliably access GrabMart, Pandamart, HappyFresh, and other platforms without detection.

Practical Applications

For CPG Brands

  • Monitor your product pricing across all grocery delivery platforms
  • Track competitor product availability and promotional activity
  • Verify distribution coverage and store-level availability
  • Analyze market share by category

For Grocery Retailers

  • Benchmark your pricing against competing platforms
  • Monitor delivery fee structures and minimum order thresholds
  • Track product assortment gaps
  • Analyze category trends

For Market Analysts

  • Size the online grocery market by category and geography
  • Track platform growth through product catalog expansion
  • Analyze pricing trends across fresh produce, FMCG, and household goods
  • Monitor promotional intensity as a market maturation indicator

Conclusion

Grocery delivery platforms in Southeast Asia represent a rich data source for competitive intelligence. By using DataResearchTools mobile proxies to systematically scrape GrabMart, Pandamart, HappyFresh, and other platforms, businesses can build comprehensive views of pricing, availability, and promotional activity across the region’s fast-growing online grocery market.

Start with one platform and one market, validate your data collection pipeline, and expand as needed. The key is building robust product matching logic that enables meaningful cross-platform comparisons despite differences in naming conventions and catalog structures.
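A practical first step for that matching logic is normalizing names before fuzzy comparison, since pack sizes and units are where platform naming diverges most ("Milk 1L" vs. "Milk 1 L"). A rough sketch to run ahead of the `find_best_match` comparison above; the unit patterns are illustrative and will need tuning per market:

```python
import re

# Pack-size tokens like "1L", "400g", "500 ml". Requiring a leading digit
# keeps words such as "pack" intact when they carry no size information.
SIZE_PATTERN = re.compile(
    r"\b\d+(\.\d+)?\s*(ml|l|g|kg|pcs|pack|x\s*\d+)\b", re.IGNORECASE
)

def normalize_name(name):
    """Lowercase, strip size/unit tokens and punctuation, collapse whitespace."""
    name = name.lower()
    name = SIZE_PATTERN.sub(" ", name)
    name = re.sub(r"[^\w\s]", " ", name)
    return " ".join(name.split())
```

Feeding `normalize_name(...)` output into `SequenceMatcher` instead of the raw strings raises match rates markedly, because two listings for the same SKU usually differ only in how they write the size.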

