How to Scrape Food Influencer Content for Marketing Intelligence
Food influencers wield enormous power in Southeast Asia’s dining and food delivery markets. From Instagram food photographers in Singapore to TikTok food reviewers in Thailand and YouTube mukbang creators in Indonesia, these content creators shape consumer preferences and drive restaurant traffic. For F&B brands, understanding this influencer landscape through systematic data collection provides a significant marketing advantage.
This guide covers how to scrape and analyze food influencer content across social media platforms in Southeast Asia.
The Food Influencer Landscape in SEA
Platform Distribution
Food influencer content is spread across multiple platforms, each with different content formats and audiences:
| Platform | Content Type | Key SEA Markets | Audience Profile |
|---|---|---|---|
| Instagram | Photos, Reels, Stories | All SEA markets | 18-35, visual-first |
| TikTok | Short videos, reviews | TH, ID, PH, MY | 16-30, trend-driven |
| YouTube | Long-form reviews, vlogs | All SEA markets | 20-40, research-oriented |
| Facebook | Reviews, live streams | PH, TH, MY | 25-45, community-driven |
| XiaoHongShu | Photo reviews, guides | SG, MY (Chinese speakers) | 20-35, lifestyle-focused |
Data Opportunities
Scraping food influencer content reveals:
- Trending restaurants: Which restaurants are getting influencer attention
- Popular cuisines: What food types are generating the most content
- Promotion effectiveness: How sponsored content performs vs. organic
- Sentiment patterns: What influencers praise or criticize
- Competitor coverage: Which competitors are investing in influencer marketing
- Content gaps: Underserved niches in food content
Building an Influencer Intelligence System
Core Architecture
import requests
import time
import random
import json
from datetime import datetime
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InfluencerProfile:
    platform: str
    username: str
    display_name: str
    followers: int
    following: int
    post_count: int
    bio: str
    country: str
    engagement_rate: float = 0.0
    avg_likes: int = 0
    avg_comments: int = 0
    food_content_ratio: float = 0.0
    categories: List[str] = field(default_factory=list)


@dataclass
class FoodPost:
    platform: str
    post_id: str
    author: str
    content_text: str
    hashtags: List[str]
    mentions: List[str]
    likes: int
    comments: int
    shares: int
    posted_at: datetime
    location: Optional[str] = None
    restaurant_mentioned: Optional[str] = None
    is_sponsored: bool = False
    media_urls: List[str] = field(default_factory=list)
    engagement_rate: float = 0.0


class FoodInfluencerScraper:
    def __init__(self, proxy_user, proxy_pass):
        self.proxy_user = proxy_user
        self.proxy_pass = proxy_pass

    def _get_session(self, country="SG"):
        session = requests.Session()
        proxy_host = f"{country.lower()}-mobile.dataresearchtools.com"
        session.proxies = {
            "http": f"http://{self.proxy_user}:{self.proxy_pass}@{proxy_host}:8080",
            "https": f"http://{self.proxy_user}:{self.proxy_pass}@{proxy_host}:8080"
        }
        session.headers.update({
            "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) "
                          "AppleWebKit/605.1.15 Mobile/15E148",
            "Accept": "application/json"
        })
        return session

Instagram Food Content Scraping
    def scrape_instagram_hashtag(self, hashtag, country="SG", max_posts=100):
        """Scrape posts from a food-related Instagram hashtag."""
        session = self._get_session(country)
        posts = []
        # Use Instagram's web API
        response = session.get(
            f"https://www.instagram.com/api/v1/tags/{hashtag}/sections/",
            headers={"X-IG-App-ID": "936619743392459"}
        )
        if response.status_code != 200:
            return posts
        data = response.json()
        sections = data.get("sections", [])
        for section in sections:
            medias = section.get("layout_content", {}).get("medias", [])
            for media_item in medias:
                media = media_item.get("media", {})
                post = self._parse_instagram_post(media)
                if post:
                    posts.append(post)
                if len(posts) >= max_posts:
                    # Return here so the cap stops the outer loop too,
                    # not just the current section
                    return posts
        return posts
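Between hashtag requests, pacing matters: bursts of requests from a single IP trip rate limits quickly. The `time` and `random` imports at the top of the script support a simple jittered delay; a minimal sketch (the `polite_delay` helper is illustrative, and the base and jitter values are assumptions to tune):

```python
import random
import time

def polite_delay(base=2.0, jitter=3.0):
    """Sleep for a base interval plus random jitter so request
    timing doesn't form a detectable fixed pattern."""
    delay = base + random.uniform(0, jitter)
    time.sleep(delay)
    return delay

# Usage between hashtag scrapes:
# for tag in ["sgfood", "sgeats"]:
#     posts = scraper.scrape_instagram_hashtag(tag)
#     polite_delay()
```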
    def _parse_instagram_post(self, media):
        """Parse an Instagram media object into a FoodPost."""
        caption = media.get("caption", {})
        caption_text = caption.get("text", "") if caption else ""
        hashtags = self._extract_hashtags(caption_text)
        mentions = self._extract_mentions(caption_text)
        user = media.get("user", {})
        return FoodPost(
            platform="instagram",
            post_id=str(media.get("pk", "")),
            author=user.get("username", ""),
            content_text=caption_text,
            hashtags=hashtags,
            mentions=mentions,
            likes=media.get("like_count", 0),
            comments=media.get("comment_count", 0),
            shares=0,
            posted_at=datetime.fromtimestamp(media.get("taken_at", 0)),
            location=media.get("location", {}).get("name") if media.get("location") else None,
            is_sponsored="paid_partnership" in str(media.get("sponsor_tags", [])),
            media_urls=[
                media.get("image_versions2", {}).get("candidates", [{}])[0].get("url", "")
            ]
        )
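The parser leaves `FoodPost.engagement_rate` at its 0.0 default; once the author's follower count is known (for instance from a scraped `InfluencerProfile`), it can be filled in. A small helper, with an illustrative name of our choosing:

```python
def compute_engagement_rate(likes, comments, shares, followers):
    """Engagement as a percentage of the author's follower count."""
    if followers <= 0:
        return 0.0
    return round((likes + comments + shares) / followers * 100, 2)

# e.g. 900 likes + 80 comments + 20 shares on a 10,000-follower account
# compute_engagement_rate(900, 80, 20, 10000) -> 10.0
```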
    def _extract_hashtags(self, text):
        """Extract hashtags from text."""
        import re
        return re.findall(r'#(\w+)', text)

    def _extract_mentions(self, text):
        """Extract @mentions from text."""
        import re
        return re.findall(r'@(\w+)', text)

TikTok Food Content Scraping
    def scrape_tiktok_food_content(self, hashtag, country="TH", max_videos=50):
        """Scrape food-related TikTok videos."""
        session = self._get_session(country)
        videos = []
        response = session.get(
            "https://www.tiktok.com/api/challenge/item_list/",
            params={
                "challengeName": hashtag,
                "count": 30,
                "cursor": 0
            }
        )
        if response.status_code != 200:
            return videos
        data = response.json()
        items = data.get("itemList", [])
        for item in items[:max_videos]:
            author = item.get("author", {})
            stats = item.get("stats", {})
            video = FoodPost(
                platform="tiktok",
                post_id=item.get("id", ""),
                author=author.get("uniqueId", ""),
                content_text=item.get("desc", ""),
                hashtags=[c.get("title", "") for c in item.get("challenges", [])],
                mentions=self._extract_mentions(item.get("desc", "")),
                likes=stats.get("diggCount", 0),
                comments=stats.get("commentCount", 0),
                shares=stats.get("shareCount", 0),
                posted_at=datetime.fromtimestamp(item.get("createTime", 0)),
                media_urls=[item.get("video", {}).get("cover", "")]
            )
            # Detect restaurant mentions
            video.restaurant_mentioned = self._detect_restaurant_mention(
                video.content_text, video.hashtags
            )
            videos.append(video)
        return videos

Analyzing Food Influencer Data
Restaurant Mention Analysis
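The `_detect_restaurant_mention` helper called in the TikTok scraper is not defined in the snippets above. A minimal keyword-based sketch is shown here as a standalone function; the `KNOWN_RESTAURANTS` list is purely illustrative, and in practice it would come from a per-market database of restaurant and chain names:

```python
KNOWN_RESTAURANTS = ["jollibee", "din tai fung", "ya kun kaya toast"]

def detect_restaurant_mention(text, hashtags):
    """Return the first known restaurant name found in the caption or hashtags."""
    haystack = (text + " " + " ".join(hashtags)).lower()
    for name in KNOWN_RESTAURANTS:
        # Also compare without spaces, since hashtags drop them (#dintaifung)
        if name in haystack or name.replace(" ", "") in haystack:
            return name
    return None
```

Attached to the scraper class, this would become `_detect_restaurant_mention(self, text, hashtags)`.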
def analyze_restaurant_mentions(posts, known_restaurants=None):
    """Analyze which restaurants are most frequently mentioned by influencers."""
    restaurant_mentions = {}

    def _record(name, post):
        entry = restaurant_mentions.setdefault(name, {
            "mention_count": 0,
            "total_engagement": 0,
            "avg_engagement": 0,
            "posts": [],
            "platforms": set()
        })
        entry["mention_count"] += 1
        entry["total_engagement"] += post.likes + post.comments + post.shares
        entry["posts"].append(post.post_id)
        entry["platforms"].add(post.platform)

    for post in posts:
        # Check location tags
        if post.location:
            _record(post.location, post)
        # Check text mentions
        if post.restaurant_mentioned:
            _record(post.restaurant_mentioned, post)

    # Calculate averages and format
    for name, data in restaurant_mentions.items():
        data["avg_engagement"] = round(data["total_engagement"] / data["mention_count"])
        data["platforms"] = list(data["platforms"])

    return dict(sorted(
        restaurant_mentions.items(),
        key=lambda x: x[1]["total_engagement"],
        reverse=True
    ))

Trending Food Analysis
def analyze_food_trends(posts, timeframe_days=30):
    """Identify trending food topics from influencer content."""
    from collections import Counter
    from datetime import timedelta

    cutoff = datetime.utcnow() - timedelta(days=timeframe_days)
    recent_posts = [p for p in posts if p.posted_at >= cutoff]

    # Analyze hashtags
    all_hashtags = []
    for post in recent_posts:
        all_hashtags.extend([h.lower() for h in post.hashtags])

    food_related_hashtags = [
        h for h in all_hashtags
        if any(keyword in h for keyword in [
            "food", "eat", "restaurant", "cafe", "makan", "กิน", "makanan",
            "delivery", "yummy", "delicious", "brunch", "dinner", "lunch",
            "noodle", "rice", "chicken", "burger", "pizza", "sushi",
            "coffee", "boba", "tea", "dessert", "cake"
        ])
    ]
    hashtag_counts = Counter(food_related_hashtags)

    # Calculate engagement per hashtag
    hashtag_engagement = {}
    for post in recent_posts:
        engagement = post.likes + post.comments + post.shares
        for hashtag in post.hashtags:
            h = hashtag.lower()
            if h not in hashtag_engagement:
                hashtag_engagement[h] = {"total": 0, "count": 0}
            hashtag_engagement[h]["total"] += engagement
            hashtag_engagement[h]["count"] += 1

    trending = []
    for hashtag, count in hashtag_counts.most_common(50):
        eng_data = hashtag_engagement.get(hashtag, {"total": 0, "count": 1})
        trending.append({
            "hashtag": f"#{hashtag}",
            "post_count": count,
            "total_engagement": eng_data["total"],
            "avg_engagement": round(eng_data["total"] / eng_data["count"]),
            "trend_score": count * (eng_data["total"] / eng_data["count"]) / 1000
        })

    trending.sort(key=lambda x: x["trend_score"], reverse=True)
    return trending

Influencer Identification and Scoring
def score_food_influencers(profiles, posts_by_author):
    """Score and rank food influencers by relevance and influence."""
    food_keywords = ["food", "eat", "restaurant", "makan", "delicious", "yummy"]
    scored_influencers = []

    for profile in profiles:
        author_posts = posts_by_author.get(profile.username, [])
        if not author_posts:
            continue

        # Calculate engagement metrics
        total_engagement = sum(
            p.likes + p.comments + p.shares for p in author_posts
        )
        avg_engagement = total_engagement / len(author_posts)
        engagement_rate = (avg_engagement / profile.followers * 100) if profile.followers > 0 else 0

        # Food content ratio: posts whose hashtags contain a food keyword
        food_posts = [
            p for p in author_posts
            if any(kw in h.lower() for h in p.hashtags for kw in food_keywords)
        ]
        food_ratio = len(food_posts) / len(author_posts)

        # Sponsored content ratio
        sponsored = [p for p in author_posts if p.is_sponsored]
        sponsored_ratio = len(sponsored) / len(author_posts)

        # Calculate influence score
        influence_score = (
            min(profile.followers / 10000, 30) +   # Reach (max 30 pts)
            min(engagement_rate * 10, 30) +        # Engagement (max 30 pts)
            food_ratio * 20 +                      # Food focus (max 20 pts)
            min(len(author_posts) / 10, 10) +      # Activity (max 10 pts)
            (10 if profile.country else 0)         # Location verified (10 pts)
        )

        scored_influencers.append({
            "username": profile.username,
            "platform": profile.platform,
            "followers": profile.followers,
            "engagement_rate": round(engagement_rate, 2),
            "food_content_ratio": round(food_ratio, 2),
            "sponsored_ratio": round(sponsored_ratio, 2),
            "influence_score": round(influence_score, 1),
            "avg_engagement": round(avg_engagement),
            "country": profile.country,
            "tier": classify_influencer_tier(profile.followers)
        })

    return sorted(scored_influencers, key=lambda x: x["influence_score"], reverse=True)
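`score_food_influencers` expects posts already grouped by author. One way to build that mapping from a flat list of scraped `FoodPost` objects (helper name is our own):

```python
from collections import defaultdict

def group_posts_by_author(posts):
    """Build the posts_by_author mapping that score_food_influencers expects."""
    grouped = defaultdict(list)
    for post in posts:
        grouped[post.author].append(post)
    return dict(grouped)
```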
def classify_influencer_tier(followers):
    """Classify influencer by follower count."""
    if followers >= 1000000:
        return "mega"
    elif followers >= 100000:
        return "macro"
    elif followers >= 10000:
        return "mid"
    elif followers >= 1000:
        return "micro"
    else:
        return "nano"

SEA-Specific Food Hashtags
Track these popular food hashtags across SEA markets:
Singapore
- #sgfood, #sgeats, #singaporefood, #sgfoodie, #burpple, #hungrygowhere
- #hawkerfood, #sgcafe, #sgfoodporn, #sgrestaurant
Malaysia
- #myfood, #malaysianfood, #klfood, #makansedap, #penangfood
- #mamak, #kopitiam, #streetfoodmalaysia
Thailand
- #bangkokfood, #thaifood, #กินเที่ยว, #อาหารอร่อย, #ร้านอาหาร
- #streetfoodthailand, #bangkokeats, #thaifoodie
Philippines
- #foodph, #manilafood, #filipinofood, #kainanph, #eatsph
- #foodmanila, #cebufoodeats
Indonesia
- #kuliner, #kulinerindonesia, #makanenak, #jakartafood
- #kulinerjakarta, #makanmakan, #jajanan
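For a recurring monitoring job, the hashtags above can be kept as a per-market lookup table. A sketch with illustrative subsets of the lists (extend with the full sets as needed):

```python
# Seed hashtags per SEA market; illustrative subsets of the lists above.
SEA_FOOD_HASHTAGS = {
    "SG": ["sgfood", "sgeats", "hawkerfood"],
    "MY": ["malaysianfood", "klfood", "mamak"],
    "TH": ["bangkokfood", "thaifood", "streetfoodthailand"],
    "PH": ["foodph", "manilafood", "filipinofood"],
    "ID": ["kuliner", "jakartafood", "makanenak"],
}

def hashtags_for_market(country_code):
    """Look up seed hashtags for a market, defaulting to an empty list."""
    return SEA_FOOD_HASHTAGS.get(country_code.upper(), [])
```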
Competitive Influencer Marketing Analysis
def analyze_competitor_influencer_strategy(competitor_name, posts):
    """Analyze how a competitor uses food influencers."""
    name = competitor_name.lower()
    competitor_mentions = [
        p for p in posts
        if name in p.content_text.lower() or
        name in [m.lower() for m in p.mentions]
    ]
    if not competitor_mentions:
        return {"competitor": competitor_name, "influencer_activity": "none_detected"}

    sponsored = [p for p in competitor_mentions if p.is_sponsored]
    organic = [p for p in competitor_mentions if not p.is_sponsored]
    authors = set(p.author for p in competitor_mentions)
    platforms = set(p.platform for p in competitor_mentions)

    return {
        "competitor": competitor_name,
        "total_mentions": len(competitor_mentions),
        "sponsored_posts": len(sponsored),
        "organic_posts": len(organic),
        "unique_influencers": len(authors),
        "platforms_used": list(platforms),
        "total_engagement": sum(
            p.likes + p.comments + p.shares for p in competitor_mentions
        ),
        "avg_engagement_per_post": round(
            sum(p.likes + p.comments + p.shares for p in competitor_mentions)
            / len(competitor_mentions)
        ),
        "top_influencers": list(authors)[:10],
        "estimated_sponsored_spend": estimate_influencer_spend(sponsored)
    }

def estimate_influencer_spend(sponsored_posts):
    """Estimate influencer marketing spend (USD, rough per-post rates)."""
    total_estimate = 0
    for post in sponsored_posts:
        # Rough follower estimate, assuming likes run ~5% of followers
        followers_estimate = post.likes * 20
        if followers_estimate >= 1000000:
            total_estimate += 5000   # Mega influencer
        elif followers_estimate >= 100000:
            total_estimate += 1500   # Macro
        elif followers_estimate >= 10000:
            total_estimate += 500    # Mid
        else:
            total_estimate += 150    # Micro
    return total_estimate

Why Mobile Proxies for Social Media Scraping
Social media platforms implement aggressive anti-scraping measures that make mobile proxies essential:
- Rate limiting by IP: social platforms restrict requests per IP, and mobile IPs carry higher trust
- Geo-restricted content: content relevance depends on location, and mobile proxies provide authentic geo-targeting
- Mobile-first APIs: social apps serve different content to mobile users than to desktop users
- Account safety: mobile IPs reduce the risk of triggering security challenges
DataResearchTools mobile proxies provide the authentic mobile carrier IPs needed to access social media content across all SEA markets, ensuring you collect comprehensive food influencer data without detection.
Conclusion
Food influencer intelligence gives F&B brands a powerful lens into consumer trends, competitive marketing strategies, and brand perception across Southeast Asia. By systematically scraping and analyzing influencer content with DataResearchTools mobile proxies, businesses can identify trending restaurants, discover effective content strategies, and make data-driven decisions about their own influencer marketing investments.
The key is building a systematic monitoring pipeline that tracks relevant hashtags, influencer profiles, and competitor mentions across platforms. Over time, this data reveals patterns in consumer preferences and marketing effectiveness that are impossible to see through manual observation alone.
Related Reading
- Best Proxies for Food Delivery Platform Scraping
- How Cloud Kitchens Use Proxies for Competitive Menu Analysis
- aiohttp + BeautifulSoup: Async Python Scraping
- How to Scrape AliExpress Product Data Without Getting Blocked
- Amazon Buy Box Monitoring: Proxy Setup for Continuous Tracking
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)