Scraping Street Food and Hawker Centre Data in Singapore and Malaysia
Hawker centres in Singapore and street food stalls in Malaysia represent one of Southeast Asia’s most distinctive dining ecosystems. These affordable food destinations have increasingly moved onto food delivery platforms, creating a digital data trail that offers unprecedented visibility into a market segment that has traditionally been difficult to analyze.
For F&B analysts, food delivery platforms, and hawker stall operators themselves, scraping this data provides insights into pricing trends, popular dishes, competitive dynamics, and the evolving relationship between traditional street food and digital delivery.
The Hawker and Street Food Data Opportunity
Singapore’s Hawker Ecosystem
Singapore has over 100 hawker centres with approximately 6,000 hawker stalls. Many are now listed on food delivery platforms:
- GrabFood: Extensive hawker centre coverage, especially in CBD and residential areas
- Foodpanda: Growing hawker presence with dedicated hawker collections
- WhyQ: Specialized platform focused on hawker food delivery
- Deliveroo: Selected hawker stalls in popular centres
Key data available:
- Individual stall listings within hawker centres
- Menu items and pricing (often remarkably low compared to restaurants)
- Customer ratings specific to individual stalls
- Delivery fees and minimum orders
- Operating hours and availability
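These fields can be captured in a simple record type before analysis. A minimal sketch — the field names are illustrative and not any platform's actual schema:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class HawkerStallListing:
    """One stall's listing as captured from a delivery platform."""
    name: str
    hawker_centre: Optional[str] = None
    platform: str = "grabfood"
    rating: Optional[float] = None      # stall-level customer rating
    review_count: int = 0
    delivery_fee: float = 0.0           # in local currency (SGD/MYR)
    min_order: float = 0.0
    is_open: bool = False
    menu_items: list = field(default_factory=list)  # [{"name": ..., "price": ...}]

stall = HawkerStallListing(
    name="Tian Tian Hainanese Chicken Rice",
    hawker_centre="Maxwell Food Centre",
    rating=4.6,
    review_count=1200,
)
```

Normalizing every platform's response into one record like this makes cross-platform comparison much easier later.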
Malaysia’s Street Food Landscape
Malaysia’s street food scene includes:
- Mamak stalls: 24-hour Indian Muslim eateries
- Kopitiams: Traditional coffee shops with multiple food vendors
- Pasar malam: Night market food stalls
- Roadside stalls: Individual operators across cities
These are increasingly listed on GrabFood Malaysia and Foodpanda Malaysia.
Scraping Hawker Centre Data
Identifying Hawker Stalls on Platforms
Hawker stalls on food delivery platforms have distinct characteristics that help identify them:
```python
import requests
import time
import random
from datetime import datetime


class HawkerDataScraper:
    def __init__(self, proxy_user, proxy_pass, country="SG"):
        self.session = requests.Session()
        self.country = country
        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        self.session.proxies = {
            "http": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080",
            "https": f"http://{proxy_user}:{proxy_pass}@{proxy_host}:8080"
        }
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Linux; Android 14; Samsung Galaxy S24) "
                          "AppleWebKit/537.36",
            "Accept": "application/json"
        })

    def identify_hawker_stalls(self, restaurants):
        """Identify hawker stalls from a list of restaurant listings."""
        hawker_indicators = [
            "hawker", "food centre", "food center", "kopitiam", "coffee shop",
            "market", "stall", "corner", "eating house", "food court"
        ]
        # Singapore hawker centre names
        sg_hawker_centres = [
            "maxwell", "old airport", "chomp chomp", "lau pa sat",
            "tiong bahru", "amoy street", "golden mile", "albert centre",
            "bedok", "tampines", "ang mo kio", "toa payoh", "clementi",
            "holland village", "newton", "satay by the bay", "tekka",
            "chinatown complex", "hong lim", "peoples park"
        ]
        hawker_stalls = []
        for restaurant in restaurants:
            name = restaurant.get("name", "").lower()
            address = restaurant.get("address", "").lower()
            is_hawker = False
            hawker_centre = None
            # Check name indicators
            for indicator in hawker_indicators:
                if indicator in name or indicator in address:
                    is_hawker = True
                    break
            # Check known hawker centres
            for centre in sg_hawker_centres:
                if centre in name or centre in address:
                    is_hawker = True
                    hawker_centre = centre.title()
                    break
            # Additional heuristics
            if not is_hawker:
                # Low price + high volume = likely hawker
                avg_price = restaurant.get("avg_item_price", 0)
                if 0 < avg_price < 8:  # SGD
                    is_hawker = True
            if is_hawker:
                restaurant["is_hawker"] = True
                restaurant["hawker_centre"] = hawker_centre
                hawker_stalls.append(restaurant)
        return hawker_stalls
```

Scraping by Hawker Centre Location
```python
# Major Singapore hawker centres with coordinates
SG_HAWKER_CENTRES = {
    "Maxwell Food Centre": {"lat": 1.2803, "lng": 103.8448, "stall_count_est": 100},
    "Old Airport Road Food Centre": {"lat": 1.3080, "lng": 103.8833, "stall_count_est": 150},
    "Chinatown Complex Food Centre": {"lat": 1.2830, "lng": 103.8440, "stall_count_est": 260},
    "Tiong Bahru Market": {"lat": 1.2867, "lng": 103.8270, "stall_count_est": 83},
    "Amoy Street Food Centre": {"lat": 1.2793, "lng": 103.8463, "stall_count_est": 100},
    "Lau Pa Sat": {"lat": 1.2803, "lng": 103.8505, "stall_count_est": 60},
    "Newton Food Centre": {"lat": 1.3120, "lng": 103.8395, "stall_count_est": 100},
    "Tekka Centre": {"lat": 1.3060, "lng": 103.8497, "stall_count_est": 50},
    "Golden Mile Food Centre": {"lat": 1.3027, "lng": 103.8637, "stall_count_est": 50},
    "Adam Road Food Centre": {"lat": 1.3243, "lng": 103.8131, "stall_count_est": 30},
    "Chomp Chomp Food Centre": {"lat": 1.3699, "lng": 103.8672, "stall_count_est": 35},
    "Bedok 85": {"lat": 1.3234, "lng": 103.9364, "stall_count_est": 40},
    "Tampines Round Market": {"lat": 1.3536, "lng": 103.9448, "stall_count_est": 50},
    "Toa Payoh Lorong 8 Market": {"lat": 1.3374, "lng": 103.8524, "stall_count_est": 60}
}

# Major Malaysia hawker locations
MY_HAWKER_LOCATIONS = {
    "Jalan Alor": {"lat": 3.1453, "lng": 101.7100, "city": "KL"},
    "Petaling Street": {"lat": 3.1432, "lng": 101.6969, "city": "KL"},
    "Gurney Drive": {"lat": 5.4375, "lng": 100.3098, "city": "Penang"},
    "Lebuh Chulia": {"lat": 5.4145, "lng": 100.3378, "city": "Penang"},
    "Jonker Street": {"lat": 2.1961, "lng": 102.2482, "city": "Malacca"},
    "SS2 Food Court": {"lat": 3.1188, "lng": 101.6224, "city": "PJ"}
}


# The methods below continue the HawkerDataScraper class defined earlier.

    def scrape_hawker_centre(self, centre_name, centre_info):
        """Scrape all delivery-available stalls from a hawker centre."""
        lat, lng = centre_info["lat"], centre_info["lng"]
        # Search for restaurants very close to the hawker centre
        response = self.session.get(
            "https://food.grab.com/api/v1/restaurants",
            params={
                "latitude": lat,
                "longitude": lng,
                "limit": 50,
                "sort": "distance"
            }
        )
        if response.status_code != 200:
            return []
        all_restaurants = response.json().get("restaurants", [])
        # Filter for stalls within 100m of the hawker centre
        hawker_stalls = []
        for r in all_restaurants:
            r_lat = r.get("latitude", 0)
            r_lng = r.get("longitude", 0)
            distance = self._haversine(lat, lng, r_lat, r_lng)
            if distance < 0.1:  # Within 100 meters
                stall = {
                    "name": r.get("name"),
                    "hawker_centre": centre_name,
                    "platform": "grabfood",
                    "cuisine": r.get("cuisine_type", ""),
                    "rating": r.get("rating"),
                    "review_count": r.get("total_reviews"),
                    "delivery_fee": r.get("delivery_fee"),
                    "min_order": r.get("minimum_order"),
                    "distance_from_centre": round(distance * 1000),  # in meters
                    "is_open": r.get("is_open"),
                    "platform_id": r.get("id")
                }
                hawker_stalls.append(stall)
        return hawker_stalls

    @staticmethod
    def _haversine(lat1, lng1, lat2, lng2):
        """Great-circle distance in kilometres (standard haversine formula)."""
        from math import radians, sin, cos, asin, sqrt
        lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
        a = (sin((lat2 - lat1) / 2) ** 2
             + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
        return 2 * 6371 * asin(sqrt(a))
```

Analyzing Hawker Food Data
Price Analysis
Hawker food pricing is uniquely interesting because it represents the affordable end of the dining spectrum:
```python
def analyze_hawker_pricing(hawker_stalls_with_menus):
    """Analyze pricing patterns across hawker stalls."""
    all_items = []
    for stall in hawker_stalls_with_menus:
        for item in stall.get("menu_items", []):
            item["stall_name"] = stall["name"]
            item["hawker_centre"] = stall["hawker_centre"]
            item["cuisine"] = stall.get("cuisine", "")
            all_items.append(item)

    # Overall price statistics
    prices = [i["price"] for i in all_items if i.get("price")]
    price_stats = {
        "total_items": len(all_items),
        "avg_price": round(sum(prices) / len(prices), 2),
        "median_price": sorted(prices)[len(prices) // 2],
        "min_price": min(prices),
        "max_price": max(prices),
        "under_5_sgd": len([p for p in prices if p < 5]),
        "5_to_10_sgd": len([p for p in prices if 5 <= p < 10]),
        "over_10_sgd": len([p for p in prices if p >= 10])
    }

    # Price by cuisine type
    cuisine_prices = {}
    for item in all_items:
        cuisine = item.get("cuisine", "Unknown")
        if cuisine not in cuisine_prices:
            cuisine_prices[cuisine] = []
        if item.get("price"):
            cuisine_prices[cuisine].append(item["price"])
    price_by_cuisine = {
        cuisine: {
            "avg_price": round(sum(prices) / len(prices), 2),
            "item_count": len(prices),
            "price_range": f"{min(prices):.2f} - {max(prices):.2f}"
        }
        for cuisine, prices in cuisine_prices.items()
        if prices
    }

    # Price by hawker centre
    centre_prices = {}
    for item in all_items:
        centre = item.get("hawker_centre", "Unknown")
        if centre not in centre_prices:
            centre_prices[centre] = []
        if item.get("price"):
            centre_prices[centre].append(item["price"])
    price_by_centre = {
        centre: {
            "avg_price": round(sum(prices) / len(prices), 2),
            "item_count": len(prices)
        }
        for centre, prices in centre_prices.items()
        if prices
    }

    return {
        "overall_stats": price_stats,
        "by_cuisine": price_by_cuisine,
        "by_hawker_centre": price_by_centre
    }
```

Popular Dishes Analysis
```python
def analyze_popular_dishes(hawker_menus):
    """Identify the most common and highest-rated hawker dishes."""
    from collections import Counter

    dish_names = []
    dish_data = {}
    for stall in hawker_menus:
        stall_rating = stall.get("rating", 0)
        for item in stall.get("menu_items", []):
            name = item.get("name", "").lower().strip()
            dish_names.append(name)
            if name not in dish_data:
                dish_data[name] = {
                    "prices": [],
                    "stall_ratings": [],
                    "stall_count": 0,
                    "original_names": set()
                }
            dish_data[name]["prices"].append(item.get("price", 0))
            dish_data[name]["stall_ratings"].append(stall_rating)
            dish_data[name]["stall_count"] += 1
            dish_data[name]["original_names"].add(item.get("name", ""))

    # Identify iconic hawker dishes
    iconic_dishes = [
        "chicken rice", "char kway teow", "laksa", "nasi lemak",
        "satay", "hokkien mee", "bak chor mee", "carrot cake",
        "roti prata", "mee goreng", "nasi goreng", "rojak",
        "wanton mee", "fish ball noodle", "duck rice", "curry puff",
        "popiah", "ice kacang", "chendol", "kaya toast",
        "mee rebus", "mee siam", "ban mian", "lor mee"
    ]
    iconic_analysis = {}
    for dish in iconic_dishes:
        matching = [name for name in dish_data.keys() if dish in name]
        if matching:
            combined_prices = []
            combined_ratings = []
            total_stalls = 0
            for match in matching:
                combined_prices.extend(dish_data[match]["prices"])
                combined_ratings.extend(dish_data[match]["stall_ratings"])
                total_stalls += dish_data[match]["stall_count"]
            iconic_analysis[dish] = {
                "available_at_stalls": total_stalls,
                "avg_price": round(sum(combined_prices) / len(combined_prices), 2),
                "price_range": f"{min(combined_prices):.2f} - {max(combined_prices):.2f}",
                "avg_stall_rating": round(
                    sum(combined_ratings) / len(combined_ratings), 2
                ) if combined_ratings else 0
            }

    return {
        "most_common_items": Counter(dish_names).most_common(20),
        "iconic_dish_analysis": iconic_analysis
    }
```

Hawker Centre Ranking
```python
def rank_hawker_centres(centres_data):
    """Rank hawker centres by various metrics."""
    rankings = {}
    for centre_name, stalls in centres_data.items():
        if not stalls:
            continue
        ratings = [s.get("rating", 0) for s in stalls if s.get("rating")]
        prices = []
        for s in stalls:
            for item in s.get("menu_items", []):
                if item.get("price"):
                    prices.append(item["price"])
        rankings[centre_name] = {
            "stall_count_on_delivery": len(stalls),
            "avg_rating": round(sum(ratings) / len(ratings), 2) if ratings else 0,
            "highest_rated_stall": max(stalls, key=lambda s: s.get("rating", 0))["name"],
            "avg_item_price": round(sum(prices) / len(prices), 2) if prices else 0,
            "cheapest_item": min(prices) if prices else 0,
            "total_menu_items": sum(len(s.get("menu_items", [])) for s in stalls),
            "delivery_coverage": f"{len([s for s in stalls if s.get('is_open')]) / len(stalls) * 100:.0f}%"
        }
    return dict(sorted(
        rankings.items(),
        key=lambda x: x[1]["avg_rating"],
        reverse=True
    ))
```

Tracking Hawker Food Trends
Price Trend Monitoring
```python
def track_hawker_price_trends(price_history, dish_name):
    """Track price changes for iconic hawker dishes over time."""
    by_date = {}
    for entry in price_history:
        if dish_name.lower() in entry.get("item_name", "").lower():
            date = entry["scraped_at"].strftime("%Y-%m-%d")
            if date not in by_date:
                by_date[date] = []
            by_date[date].append(entry["price"])

    trend = []
    for date in sorted(by_date.keys()):
        prices = by_date[date]
        trend.append({
            "date": date,
            "avg_price": round(sum(prices) / len(prices), 2),
            "min_price": min(prices),
            "max_price": max(prices),
            "data_points": len(prices)
        })

    if len(trend) >= 2:
        first_avg = trend[0]["avg_price"]
        last_avg = trend[-1]["avg_price"]
        change_pct = round((last_avg - first_avg) / first_avg * 100, 1)
    else:
        change_pct = 0

    return {
        "dish": dish_name,
        "trend_data": trend,
        "price_change_pct": change_pct,
        "direction": "increasing" if change_pct > 2 else
                     "decreasing" if change_pct < -2 else "stable"
    }
```

Delivery Adoption Rate
Track how many hawker stalls are joining delivery platforms over time:
```python
def track_delivery_adoption(hawker_snapshots_over_time):
    """Track the rate of hawker stall adoption on delivery platforms."""
    adoption_timeline = {}
    for snapshot in hawker_snapshots_over_time:
        date = snapshot["date"]
        centre = snapshot["hawker_centre"]
        if centre not in adoption_timeline:
            adoption_timeline[centre] = {}
        adoption_timeline[centre][date] = {
            "stalls_on_delivery": snapshot["stall_count"],
            "total_stalls_est": snapshot.get("total_stalls_estimated", 0)
        }

    # Calculate adoption rates
    for centre, timeline in adoption_timeline.items():
        dates = sorted(timeline.keys())
        if len(dates) >= 2:
            first_count = timeline[dates[0]]["stalls_on_delivery"]
            last_count = timeline[dates[-1]]["stalls_on_delivery"]
            timeline["growth"] = {
                "first_observed": first_count,
                "latest": last_count,
                "change": last_count - first_count,
                "growth_pct": round(
                    (last_count - first_count) / first_count * 100, 1
                ) if first_count > 0 else 0
            }
    return adoption_timeline
```

Unique Challenges of Hawker Data
Naming Inconsistencies
Hawker stalls often have informal names that differ across platforms:
```python
import re


def normalize_hawker_name(name):
    """Normalize hawker stall names for matching."""
    name = name.lower().strip()
    # Remove stall numbers
    name = re.sub(r'#\d+-\d+', '', name)
    name = re.sub(r'stall\s*\d+', '', name)
    # Remove hawker centre name
    centres = ["maxwell", "amoy", "old airport", "chinatown"]
    for centre in centres:
        name = name.replace(centre, "")
    # Remove common suffixes
    name = re.sub(r'\s*(food|stall|hawker|centre|center)\s*', ' ', name)
    name = ' '.join(name.split()).strip()
    return name
```

Low Price Points
Hawker food prices are extremely low (S$3-8 per item), making percentage-based price changes look dramatic when absolute changes are small. Use absolute price tracking alongside percentages.
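A small helper illustrates the point: report the absolute change alongside the percentage, so an S$0.50 rise on an S$4 dish isn't read as a dramatic swing in isolation. The function name is ours, not part of the scraper above:

```python
def price_change_summary(old_price, new_price):
    """Return both absolute and percentage change for a menu item price."""
    abs_change = round(new_price - old_price, 2)
    pct_change = round(abs_change / old_price * 100, 1) if old_price else 0.0
    return {"abs_change": abs_change, "pct_change": pct_change}

# An S$0.50 increase on an S$4.00 dish: tiny in absolute terms,
# but a double-digit percentage move.
print(price_change_summary(4.00, 4.50))  # {'abs_change': 0.5, 'pct_change': 12.5}
```

Storing both values lets dashboards flag genuinely large moves without exaggerating cents-level fluctuations.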
Operating Hour Variability
Hawker stalls have irregular hours. Some operate only for breakfast, others only for dinner. Track availability patterns to build accurate operating hour profiles.
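One way to approximate those profiles is to aggregate repeated `is_open` observations by hour of day across several days of scraping. A minimal sketch — the snapshot format here is an assumption, not a platform response:

```python
from collections import defaultdict

def build_availability_profile(snapshots, open_threshold=0.5):
    """Estimate a stall's operating hours from repeated open/closed checks.

    snapshots: list of {"hour": 0-23, "is_open": bool} observations
    collected over multiple days.
    """
    by_hour = defaultdict(list)
    for snap in snapshots:
        by_hour[snap["hour"]].append(snap["is_open"])
    # An hour counts as "open" if the stall was observed open in at
    # least `open_threshold` of the checks made at that hour.
    return sorted(
        hour for hour, obs in by_hour.items()
        if sum(obs) / len(obs) >= open_threshold
    )

# A breakfast-only stall, checked hourly across three days:
obs = [{"hour": h, "is_open": 7 <= h <= 10} for h in range(24)] * 3
print(build_availability_profile(obs))  # [7, 8, 9, 10]
```

The threshold smooths over one-off closures, so a stall that skipped a single morning still shows its usual breakfast window.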
Proxy Requirements for Hawker Data
Hawker centre data scraping requires:
- Singapore/Malaysia mobile IPs: Platforms serve location-specific hawker data
- Fine-grained geo-targeting: Hawker centres are small geographic areas
- Frequent access: Hawker stall availability changes throughout the day
- Multi-platform coverage: Stalls appear on different platforms
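These requirements can be wrapped in a small configuration helper so the same scraping code covers both markets. The hostname pattern follows the `HawkerDataScraper` example earlier; the credentials are placeholders:

```python
def geo_proxy_config(country, proxy_user, proxy_pass):
    """Build a requests-style proxies dict for a country-specific
    mobile proxy endpoint (Singapore or Malaysia)."""
    if country not in ("SG", "MY"):
        raise ValueError("only SG and MY endpoints are assumed here")
    host = f"{country.lower()}-mobile.dataresearchtools.com"
    url = f"http://{proxy_user}:{proxy_pass}@{host}:8080"
    return {"http": url, "https": url}

# Assign to a requests.Session for each market:
#   session.proxies = geo_proxy_config("MY", user, password)
sg_proxies = geo_proxy_config("SG", "user", "pass")
my_proxies = geo_proxy_config("MY", "user", "pass")
```

Keeping one session per market avoids mixing Singapore and Malaysia traffic through the same exit IPs.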
DataResearchTools mobile proxies provide Singapore and Malaysian carrier IPs that enable accurate, location-specific hawker data collection from all major food delivery platforms.
Conclusion
Hawker centre and street food data represents a unique niche within Southeast Asia’s food delivery landscape. By scraping this data using DataResearchTools mobile proxies, analysts and F&B operators can track pricing trends for iconic dishes, measure delivery platform adoption among traditional food vendors, and identify opportunities in one of the region’s most culturally significant food segments.
The combination of rising delivery adoption and the cultural importance of hawker food makes this data increasingly valuable. Start by mapping the hawker centres in your target area, scrape available stall listings, and build a monitoring system that tracks pricing and availability over time.
Related Reading
- Best Proxies for Food Delivery Platform Scraping
- How Cloud Kitchens Use Proxies for Competitive Menu Analysis
- aiohttp + BeautifulSoup: Async Python Scraping
- How to Scrape AliExpress Product Data Without Getting Blocked
- Amazon Buy Box Monitoring: Proxy Setup for Continuous Tracking
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)