Scraping Street Food and Hawker Centre Data in Singapore and Malaysia

Hawker centres in Singapore and street food stalls in Malaysia form one of Southeast Asia’s most distinctive dining ecosystems. These affordable food destinations have increasingly moved onto food delivery platforms, creating a digital data trail that makes a traditionally hard-to-analyze market segment far more visible.

For F&B analysts, food delivery platforms, and hawker stall operators themselves, scraping this data provides insights into pricing trends, popular dishes, competitive dynamics, and the evolving relationship between traditional street food and digital delivery.

The Hawker and Street Food Data Opportunity

Singapore’s Hawker Ecosystem

Singapore has over 100 hawker centres with approximately 6,000 hawker stalls. Many are now listed on food delivery platforms:

  • GrabFood: Extensive hawker centre coverage, especially in CBD and residential areas
  • Foodpanda: Growing hawker presence with dedicated hawker collections
  • WhyQ: Specialized platform focused on hawker food delivery
  • Deliveroo: Selected hawker stalls in popular centres

Key data available:

  • Individual stall listings within hawker centres
  • Menu items and pricing (often well below restaurant prices)
  • Customer ratings specific to individual stalls
  • Delivery fees and minimum orders
  • Operating hours and availability

Malaysia’s Street Food Landscape

Malaysia’s street food scene includes:

  • Mamak stalls: Indian Muslim eateries, many of them open 24 hours
  • Kopitiams: Traditional coffee shops with multiple food vendors
  • Pasar malam: Night market food stalls
  • Roadside stalls: Individual operators across cities

These are increasingly listed on GrabFood Malaysia and Foodpanda Malaysia.

Scraping Hawker Centre Data

Identifying Hawker Stalls on Platforms

Hawker stalls on food delivery platforms have distinct characteristics that help identify them:

import requests
import time
import random
from datetime import datetime

class HawkerDataScraper:
    def __init__(self, proxy_user, proxy_pass, country="SG"):
        self.session = requests.Session()
        self.country = country

        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        self.session.proxies = {
            "http": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080",
            "https": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080"
        }
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Linux; Android 14; Samsung Galaxy S24) "
                         "AppleWebKit/537.36",
            "Accept": "application/json"
        })

    def identify_hawker_stalls(self, restaurants):
        """Identify hawker stalls from a list of restaurant listings."""
        hawker_indicators = [
            "hawker", "food centre", "food center", "kopitiam", "coffee shop",
            "market", "stall", "corner", "eating house", "food court"
        ]

        # Singapore hawker centre names
        sg_hawker_centres = [
            "maxwell", "old airport", "chomp chomp", "lau pa sat",
            "tiong bahru", "amoy street", "golden mile", "albert centre",
            "bedok", "tampines", "ang mo kio", "toa payoh", "clementi",
            "holland village", "newton", "satay by the bay", "tekka",
            "chinatown complex", "hong lim", "peoples park"
        ]

        hawker_stalls = []
        for restaurant in restaurants:
            # "or" guards against fields that are present but set to None
            name = (restaurant.get("name") or "").lower()
            address = (restaurant.get("address") or "").lower()

            is_hawker = False
            hawker_centre = None

            # Check name indicators
            for indicator in hawker_indicators:
                if indicator in name or indicator in address:
                    is_hawker = True
                    break

            # Check known hawker centres
            for centre in sg_hawker_centres:
                if centre in name or centre in address:
                    is_hawker = True
                    hawker_centre = centre.title()
                    break

            # Additional heuristic: a low average item price suggests a hawker
            # stall. This is a weak signal and can flag cheap non-hawker outlets.
            if not is_hawker:
                avg_price = restaurant.get("avg_item_price") or 0
                if 0 < avg_price < 8:  # SGD
                    is_hawker = True

            if is_hawker:
                restaurant["is_hawker"] = True
                restaurant["hawker_centre"] = hawker_centre
                hawker_stalls.append(restaurant)

        return hawker_stalls
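
Plain substring checks like the ones above can misfire: "market" also matches "supermarket". The following is a minimal sketch of a stricter check, assuming whole-word matching suits your listings. The names `build_indicator_pattern` and `looks_like_hawker` are illustrative and not part of any platform API, and the sample listings are made up.

```python
import re

# Subset of the indicator list used above
HAWKER_INDICATORS = [
    "hawker", "food centre", "food center", "kopitiam", "coffee shop",
    "market", "stall", "eating house", "food court",
]

def build_indicator_pattern(indicators):
    """Compile one regex that matches any indicator as a whole word or phrase."""
    # Longest phrases first so "food centre" is tried before shorter terms
    escaped = sorted((re.escape(i) for i in indicators), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

INDICATOR_RE = build_indicator_pattern(HAWKER_INDICATORS)

def looks_like_hawker(name, address=""):
    """Return True if the name or address contains an indicator as a whole word."""
    text = f"{name or ''} {address or ''}"
    return INDICATOR_RE.search(text) is not None

print(looks_like_hawker("Ah Seng Chicken Rice", "Maxwell Food Centre"))  # True
print(looks_like_hawker("Sunny Supermarket"))                            # False
```

Whole-word matching trades recall for precision, so it is worth spot-checking what it drops before relying on it.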

Scraping by Hawker Centre Location

# Major Singapore hawker centres with coordinates
SG_HAWKER_CENTRES = {
    "Maxwell Food Centre": {"lat": 1.2803, "lng": 103.8448, "stall_count_est": 100},
    "Old Airport Road Food Centre": {"lat": 1.3080, "lng": 103.8833, "stall_count_est": 150},
    "Chinatown Complex Food Centre": {"lat": 1.2830, "lng": 103.8440, "stall_count_est": 260},
    "Tiong Bahru Market": {"lat": 1.2867, "lng": 103.8270, "stall_count_est": 83},
    "Amoy Street Food Centre": {"lat": 1.2793, "lng": 103.8463, "stall_count_est": 100},
    "Lau Pa Sat": {"lat": 1.2803, "lng": 103.8505, "stall_count_est": 60},
    "Newton Food Centre": {"lat": 1.3120, "lng": 103.8395, "stall_count_est": 100},
    "Tekka Centre": {"lat": 1.3060, "lng": 103.8497, "stall_count_est": 50},
    "Golden Mile Food Centre": {"lat": 1.3027, "lng": 103.8637, "stall_count_est": 50},
    "Adam Road Food Centre": {"lat": 1.3243, "lng": 103.8131, "stall_count_est": 30},
    "Chomp Chomp Food Centre": {"lat": 1.3699, "lng": 103.8672, "stall_count_est": 35},
    "Bedok 85": {"lat": 1.3234, "lng": 103.9364, "stall_count_est": 40},
    "Tampines Round Market": {"lat": 1.3536, "lng": 103.9448, "stall_count_est": 50},
    "Toa Payoh Lorong 8 Market": {"lat": 1.3374, "lng": 103.8524, "stall_count_est": 60}
}

# Major Malaysia hawker locations
MY_HAWKER_LOCATIONS = {
    "Jalan Alor": {"lat": 3.1453, "lng": 101.7100, "city": "KL"},
    "Petaling Street": {"lat": 3.1432, "lng": 101.6969, "city": "KL"},
    "Gurney Drive": {"lat": 5.4375, "lng": 100.3098, "city": "Penang"},
    "Lebuh Chulia": {"lat": 5.4145, "lng": 100.3378, "city": "Penang"},
    "Jonker Street": {"lat": 2.1961, "lng": 102.2482, "city": "Malacca"},
    "SS2 Food Court": {"lat": 3.1188, "lng": 101.6224, "city": "PJ"}
}

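# Method of HawkerDataScraper (shown unindented). It relies on a
# self._haversine(lat1, lng1, lat2, lng2) helper that returns kilometres.
def scrape_hawker_centre(self, centre_name, centre_info):
    """Scrape all delivery-available stalls from a hawker centre."""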
    lat, lng = centre_info["lat"], centre_info["lng"]

    # Search for restaurants very close to the hawker centre
    try:
        response = self.session.get(
            "https://food.grab.com/api/v1/restaurants",
            params={
                "latitude": lat,
                "longitude": lng,
                "limit": 50,
                "sort": "distance"
            },
            timeout=30  # avoid hanging on a stalled proxy connection
        )
    except requests.RequestException:
        return []

    if response.status_code != 200:
        return []

    all_restaurants = response.json().get("restaurants", [])

    # Filter for stalls within 100m of the hawker centre
    hawker_stalls = []
    for r in all_restaurants:
        r_lat = r.get("latitude", 0)
        r_lng = r.get("longitude", 0)
        distance = self._haversine(lat, lng, r_lat, r_lng)

        if distance < 0.1:  # Within 100 meters
            stall = {
                "name": r.get("name"),
                "hawker_centre": centre_name,
                "platform": "grabfood",
                "cuisine": r.get("cuisine_type", ""),
                "rating": r.get("rating"),
                "review_count": r.get("total_reviews"),
                "delivery_fee": r.get("delivery_fee"),
                "min_order": r.get("minimum_order"),
                "distance_from_centre": round(distance * 1000),  # in meters
                "is_open": r.get("is_open"),
                "platform_id": r.get("id")
            }
            hawker_stalls.append(stall)

    return hawker_stalls
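
The method above calls `self._haversine`, which is not shown. Below is a minimal sketch of such a helper, assuming it should return kilometres (consistent with the `distance < 0.1` check for 100 meters). The standalone name `haversine_km` is illustrative; inside the class it could be attached with `_haversine = staticmethod(haversine_km)`.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two (lat, lng) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Maxwell Food Centre vs Amoy Street Food Centre, using the coordinates listed above
d = haversine_km(1.2803, 103.8448, 1.2793, 103.8463)
print(f"{d * 1000:.0f} m apart")
```

With the listed coordinates the two centres come out roughly 200 m apart, so a 100 m radius should keep their results separate, provided the platform's stall coordinates are accurate.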

Analyzing Hawker Food Data

Price Analysis

Hawker food pricing deserves its own analysis because it represents the affordable end of the dining spectrum:

def analyze_hawker_pricing(hawker_stalls_with_menus):
    """Analyze pricing patterns across hawker stalls."""
    all_items = []
    for stall in hawker_stalls_with_menus:
        for item in stall.get("menu_items", []):
            item["stall_name"] = stall["name"]
            item["hawker_centre"] = stall["hawker_centre"]
            item["cuisine"] = stall.get("cuisine", "")
            all_items.append(item)

    # Overall price statistics (guarded so an empty price list does not raise)
    from statistics import median
    prices = [i["price"] for i in all_items if i.get("price")]
    price_stats = {
        "total_items": len(all_items),
        "avg_price": round(sum(prices) / len(prices), 2) if prices else 0,
        "median_price": median(prices) if prices else 0,
        "min_price": min(prices) if prices else 0,
        "max_price": max(prices) if prices else 0,
        "under_5_sgd": len([p for p in prices if p < 5]),
        "5_to_10_sgd": len([p for p in prices if 5 <= p < 10]),
        "over_10_sgd": len([p for p in prices if p >= 10])
    }

    # Price by cuisine type
    cuisine_prices = {}
    for item in all_items:
        # The cuisine is set to "" above when missing, so fall back explicitly
        cuisine = item.get("cuisine") or "Unknown"
        if cuisine not in cuisine_prices:
            cuisine_prices[cuisine] = []
        if item.get("price"):
            cuisine_prices[cuisine].append(item["price"])

    price_by_cuisine = {
        cuisine: {
            "avg_price": round(sum(prices) / len(prices), 2),
            "item_count": len(prices),
            "price_range": f"{min(prices):.2f} - {max(prices):.2f}"
        }
        for cuisine, prices in cuisine_prices.items()
        if prices
    }

    # Price by hawker centre
    centre_prices = {}
    for item in all_items:
        centre = item.get("hawker_centre", "Unknown")
        if centre not in centre_prices:
            centre_prices[centre] = []
        if item.get("price"):
            centre_prices[centre].append(item["price"])

    price_by_centre = {
        centre: {
            "avg_price": round(sum(prices) / len(prices), 2),
            "item_count": len(prices)
        }
        for centre, prices in centre_prices.items()
        if prices
    }

    return {
        "overall_stats": price_stats,
        "by_cuisine": price_by_cuisine,
        "by_hawker_centre": price_by_centre
    }

Popular Dishes Analysis

def analyze_popular_dishes(hawker_menus):
    """Identify the most common and highest-rated hawker dishes."""
    from collections import Counter

    dish_names = []
    dish_data = {}

    for stall in hawker_menus:
        stall_rating = stall.get("rating", 0)
        for item in stall.get("menu_items", []):
            name = item.get("name", "").lower().strip()
            dish_names.append(name)

            if name not in dish_data:
                dish_data[name] = {
                    "prices": [],
                    "stall_ratings": [],
                    "stall_count": 0,
                    "original_names": set()
                }

            dish_data[name]["prices"].append(item.get("price", 0))
            dish_data[name]["stall_ratings"].append(stall_rating)
            dish_data[name]["stall_count"] += 1
            dish_data[name]["original_names"].add(item.get("name", ""))

    # Identify iconic hawker dishes
    iconic_dishes = [
        "chicken rice", "char kway teow", "laksa", "nasi lemak",
        "satay", "hokkien mee", "bak chor mee", "carrot cake",
        "roti prata", "mee goreng", "nasi goreng", "rojak",
        "wanton mee", "fish ball noodle", "duck rice", "curry puff",
        "popiah", "ice kacang", "chendol", "kaya toast",
        "mee rebus", "mee siam", "ban mian", "lor mee"
    ]

    iconic_analysis = {}
    for dish in iconic_dishes:
        matching = [
            name for name in dish_data.keys()
            if dish in name
        ]
        if matching:
            combined_prices = []
            combined_ratings = []
            total_stalls = 0
            for match in matching:
                # Skip missing or zero values so they do not drag down the stats
                combined_prices.extend(p for p in dish_data[match]["prices"] if p)
                combined_ratings.extend(r for r in dish_data[match]["stall_ratings"] if r)
                total_stalls += dish_data[match]["stall_count"]

            iconic_analysis[dish] = {
                "available_at_stalls": total_stalls,
                "avg_price": round(sum(combined_prices) / len(combined_prices), 2)
                    if combined_prices else 0,
                "price_range": f"{min(combined_prices):.2f} - {max(combined_prices):.2f}"
                    if combined_prices else "",
                "avg_stall_rating": round(
                    sum(combined_ratings) / len(combined_ratings), 2
                ) if combined_ratings else 0
            }

    return {
        "most_common_items": Counter(dish_names).most_common(20),
        "iconic_dish_analysis": iconic_analysis
    }

Hawker Centre Ranking

def rank_hawker_centres(centres_data):
    """Rank hawker centres by various metrics."""
    rankings = {}

    for centre_name, stalls in centres_data.items():
        if not stalls:
            continue

        ratings = [s.get("rating", 0) for s in stalls if s.get("rating")]
        prices = []
        for s in stalls:
            for item in s.get("menu_items", []):
                if item.get("price"):
                    prices.append(item["price"])

        rankings[centre_name] = {
            "stall_count_on_delivery": len(stalls),
            "avg_rating": round(sum(ratings) / len(ratings), 2) if ratings else 0,
            "highest_rated_stall": max(stalls, key=lambda s: s.get("rating") or 0).get("name", ""),
            "avg_item_price": round(sum(prices) / len(prices), 2) if prices else 0,
            "cheapest_item": min(prices) if prices else 0,
            "total_menu_items": sum(len(s.get("menu_items", [])) for s in stalls),
            # Share of listed stalls that were open at scrape time
            "delivery_coverage": f"{len([s for s in stalls if s.get('is_open')]) / len(stalls) * 100:.0f}%"
        }

    return dict(sorted(
        rankings.items(),
        key=lambda x: x[1]["avg_rating"],
        reverse=True
    ))

Tracking Hawker Food Trends

Price Trend Monitoring

def track_hawker_price_trends(price_history, dish_name):
    """Track price changes for iconic hawker dishes over time."""
    by_date = {}
    for entry in price_history:
        if dish_name.lower() in entry.get("item_name", "").lower():
            date = entry["scraped_at"].strftime("%Y-%m-%d")
            if date not in by_date:
                by_date[date] = []
            by_date[date].append(entry["price"])

    trend = []
    for date in sorted(by_date.keys()):
        prices = by_date[date]
        trend.append({
            "date": date,
            "avg_price": round(sum(prices) / len(prices), 2),
            "min_price": min(prices),
            "max_price": max(prices),
            "data_points": len(prices)
        })

    if len(trend) >= 2 and trend[0]["avg_price"]:  # avoid dividing by zero
        first_avg = trend[0]["avg_price"]
        last_avg = trend[-1]["avg_price"]
        change_pct = round((last_avg - first_avg) / first_avg * 100, 1)
    else:
        change_pct = 0

    return {
        "dish": dish_name,
        "trend_data": trend,
        "price_change_pct": change_pct,
        "direction": "increasing" if change_pct > 2 else
                    "decreasing" if change_pct < -2 else "stable"
    }

Delivery Adoption Rate

Track how many hawker stalls are joining delivery platforms over time:

def track_delivery_adoption(hawker_snapshots_over_time):
    """Track the rate of hawker stall adoption on delivery platforms."""
    adoption_timeline = {}

    for snapshot in hawker_snapshots_over_time:
        date = snapshot["date"]
        centre = snapshot["hawker_centre"]

        if centre not in adoption_timeline:
            adoption_timeline[centre] = {}

        adoption_timeline[centre][date] = {
            "stalls_on_delivery": snapshot["stall_count"],
            "total_stalls_est": snapshot.get("total_stalls_estimated", 0)
        }

    # Calculate adoption rates
    for centre, timeline in adoption_timeline.items():
        dates = sorted(timeline.keys())
        if len(dates) >= 2:
            first_count = timeline[dates[0]]["stalls_on_delivery"]
            last_count = timeline[dates[-1]]["stalls_on_delivery"]
            timeline["growth"] = {
                "first_observed": first_count,
                "latest": last_count,
                "change": last_count - first_count,
                "growth_pct": round(
                    (last_count - first_count) / first_count * 100, 1
                ) if first_count > 0 else 0
            }

    return adoption_timeline
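
Growth in raw counts says little without a denominator. The sketch below expresses adoption as a share of a centre's estimated stalls, assuming each snapshot carries the `stall_count` and `total_stalls_estimated` fields used above. `adoption_share` is an illustrative helper and the sample numbers are made up.

```python
def adoption_share(stalls_on_delivery, total_stalls_est):
    """Percentage of a centre's estimated stalls that are listed for delivery.

    Returns None when no estimate is available, so a missing denominator
    is not mistaken for 0% adoption.
    """
    if not total_stalls_est or total_stalls_est <= 0:
        return None
    return round(stalls_on_delivery / total_stalls_est * 100, 1)

# Made-up snapshots for illustration only
snapshots = [
    {"date": "2024-01-01", "hawker_centre": "Maxwell Food Centre",
     "stall_count": 20, "total_stalls_estimated": 100},
    {"date": "2024-06-01", "hawker_centre": "Maxwell Food Centre",
     "stall_count": 26, "total_stalls_estimated": 100},
]

for snap in snapshots:
    share = adoption_share(snap["stall_count"], snap.get("total_stalls_estimated", 0))
    print(snap["date"], snap["hawker_centre"], share)
```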

Unique Challenges of Hawker Data

Naming Inconsistencies

Hawker stalls often have informal names that differ across platforms:

def normalize_hawker_name(name):
    """Normalize hawker stall names for matching."""
    import re
    name = name.lower().strip()

    # Remove stall numbers
    name = re.sub(r'#\d+-\d+', '', name)
    name = re.sub(r'stall\s*\d+', '', name)

    # Remove hawker centre name
    centres = ["maxwell", "amoy", "old airport", "chinatown"]
    for centre in centres:
        name = name.replace(centre, "")

    # Remove generic words (whole words only, so "seafood" is left intact)
    name = re.sub(r'\b(food|stall|hawker|centre|center)\b', ' ', name)
    name = ' '.join(name.split()).strip()

    return name
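
Normalization alone rarely makes names identical across platforms, so a similarity score is usually needed as well. Here is a minimal sketch using the standard library's `difflib.SequenceMatcher`, assuming the inputs have already been normalized. `match_stalls`, the 0.8 threshold, and the stall names are illustrative assumptions; the threshold in particular should be tuned on your own data.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()

def match_stalls(names_a, names_b, threshold=0.8):
    """Pair each name in names_a with its most similar name in names_b.

    Only pairs scoring at or above the threshold are kept.
    """
    matches = []
    for a in names_a:
        best, best_score = None, 0.0
        for b in names_b:
            score = name_similarity(a, b)
            if score > best_score:
                best, best_score = b, score
        if best is not None and best_score >= threshold:
            matches.append((a, best, round(best_score, 2)))
    return matches

# Hypothetical stall names as they might appear on two platforms
platform_a = ["ah hock fried kway teow"]
platform_b = ["ah hock kway teow", "lim kee wanton mee"]
print(match_stalls(platform_a, platform_b))
```

Character-level similarity can confuse stalls that sell the same dish under similar names, so combining it with the hawker centre and stall number, where available, is safer than matching on names alone.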

Low Price Points

Hawker food prices are extremely low (S$3-8 per item), making percentage-based price changes look dramatic when absolute changes are small. Use absolute price tracking alongside percentages.
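
As a small worked example of why both figures matter, the sketch below reports the absolute and percentage change side by side. `describe_price_change` is an illustrative helper and the prices are hypothetical.

```python
def describe_price_change(old_price, new_price):
    """Return the absolute and percentage change between two prices."""
    abs_change = round(new_price - old_price, 2)
    # Percentage change is undefined when the old price is zero or missing
    pct_change = round(abs_change / old_price * 100, 1) if old_price else None
    return {"abs_change": abs_change, "pct_change": pct_change}

# The same S$0.50 rise on a S$3.50 hawker dish and on a S$25 restaurant main
print(describe_price_change(3.50, 4.00))    # {'abs_change': 0.5, 'pct_change': 14.3}
print(describe_price_change(25.00, 25.50))  # {'abs_change': 0.5, 'pct_change': 2.0}
```

The 50-cent increase reads as a 14.3% jump at hawker prices but only 2.0% at restaurant prices, which is why a percentage threshold tuned for restaurants will over-report hawker price movements.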

Operating Hour Variability

Hawker stalls have irregular hours. Some operate only for breakfast, others only for dinner. Track availability patterns to build accurate operating hour profiles.
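
One way to build such a profile is to record the `is_open` flag on every scrape and aggregate it by hour of day. The sketch below assumes timestamps are already in local time (Singapore and Malaysia both use UTC+8). `build_hour_profile` is an illustrative helper and the observations are made up.

```python
from collections import defaultdict
from datetime import datetime

def build_hour_profile(observations):
    """Estimate, per hour of day, how often a stall was seen open.

    observations: iterable of (datetime, is_open) pairs from repeated scrapes.
    Returns {hour: share_open} for hours with at least one observation.
    """
    seen = defaultdict(int)
    open_count = defaultdict(int)
    for ts, is_open in observations:
        seen[ts.hour] += 1
        if is_open:
            open_count[ts.hour] += 1
    return {h: round(open_count[h] / seen[h], 2) for h in sorted(seen)}

# Made-up observations for one stall across two days
observations = [
    (datetime(2024, 1, 1, 7, 0), True),
    (datetime(2024, 1, 2, 7, 30), True),
    (datetime(2024, 1, 1, 13, 0), True),
    (datetime(2024, 1, 2, 13, 0), False),
    (datetime(2024, 1, 1, 19, 0), False),
    (datetime(2024, 1, 2, 19, 0), False),
]
print(build_hour_profile(observations))  # {7: 1.0, 13: 0.5, 19: 0.0}
```

A stall that is reliably open at 7am and never at 7pm looks like a breakfast operator, but a closed flag can also mean the stall paused delivery orders, so the profile reflects delivery availability rather than physical opening hours.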

Proxy Requirements for Hawker Data

Hawker centre data scraping requires:

  1. Singapore/Malaysia mobile IPs: Platforms serve location-specific hawker data
  2. Fine-grained geo-targeting: Hawker centres are small geographic areas
  3. Frequent access: Hawker stall availability changes throughout the day
  4. Multi-platform coverage: Stalls appear on different platforms

DataResearchTools mobile proxies provide Singapore and Malaysian carrier IPs that enable accurate, location-specific hawker data collection from all major food delivery platforms.

Conclusion

Hawker centre and street food data represents a unique niche within Southeast Asia’s food delivery landscape. By scraping this data using DataResearchTools mobile proxies, analysts and F&B operators can track pricing trends for iconic dishes, measure delivery platform adoption among traditional food vendors, and identify opportunities in one of the region’s most culturally significant food segments.

The combination of rising delivery adoption and the cultural importance of hawker food makes this data increasingly valuable. Start by mapping the hawker centres in your target area, scrape available stall listings, and build a monitoring system that tracks pricing and availability over time.

