Scraping J&T Express, Ninja Van, and Flash Express Tracking Data
Southeast Asia’s e-commerce boom has created a massive courier network dominated by regional carriers like J&T Express, Ninja Van, and Flash Express. These carriers collectively handle billions of parcels annually across Indonesia, Thailand, Vietnam, the Philippines, Malaysia, and Singapore. For logistics companies, e-commerce platforms, and supply chain analysts, access to tracking data from these carriers is essential for monitoring delivery performance, optimizing carrier selection, and ensuring customer satisfaction.
This guide covers the technical aspects of collecting tracking data from these three major SEA courier networks, including the proxy infrastructure, coding techniques, and data management strategies you need.
Overview of Major SEA Courier Networks
J&T Express
J&T Express is one of the largest courier companies in Southeast Asia, with operations in Indonesia, Vietnam, Thailand, the Philippines, Malaysia, Singapore, and China. Founded in Indonesia in 2015, J&T has grown explosively alongside the region’s e-commerce boom.
Tracking system characteristics:
- Web-based tracking available at each country-specific domain (jtexpress.co.id, jtexpress.co.th, etc.)
- API endpoints available for bulk tracking in some markets
- Tracking pages use a combination of server-side rendering and JavaScript
- Rate limiting is moderate but increases during peak periods
- Each country’s tracking system operates somewhat independently
Ninja Van
Ninja Van operates across Singapore, Malaysia, Indonesia, Vietnam, Thailand, and the Philippines. Known for its technology-first approach, Ninja Van has robust tracking capabilities and API infrastructure.
Tracking system characteristics:
- Centralized tracking portal at ninjavan.co with country-specific paths (for example, ninjavan.co/sg)
- Well-structured API that returns JSON responses
- More sophisticated anti-bot protections than some competitors
- Real-time tracking updates with detailed status codes
- Webhook capabilities for programmatic tracking
Flash Express
Flash Express is a major player in Thailand and has expanded to other SEA markets. It is known for competitive pricing and strong network coverage in Thailand.
Tracking system characteristics:
- Primary tracking at flashexpress.com with Thai-focused interface
- REST API endpoints for tracking queries
- Moderate anti-bot protections
- Status updates include hub-level detail
- Thai language responses by default for Thailand operations
Why You Need Proxies for Courier Tracking Data
Country-Specific Access Requirements
Each carrier’s tracking system is primarily designed for domestic users. Accessing tracking data from outside the target country creates several issues:
Content localization: Tracking pages from a foreign IP may redirect to English-language versions that contain less detail than the local language versions. Hub names, status descriptions, and timing information may be simplified or omitted.
Access restrictions: Some carrier tracking APIs reject requests from non-local IP addresses or apply stricter rate limits to international traffic.
Data accuracy: Tracking timestamps and estimated delivery times may be adjusted based on the requester’s timezone, potentially causing confusion in data analysis.
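One way to guard against the timezone ambiguity described above is to convert every carrier timestamp to UTC before storing it. The sketch below assumes timestamps arrive as ISO 8601 strings and that naive values are in the parcel country's local time. `COUNTRY_UTC_OFFSETS` and `to_utc` are illustrative names, not part of any carrier API. Fixed offsets are used because none of these countries currently observes daylight saving time, and Indonesia is simplified to Western Indonesia Time even though the country spans three zones.

```python
from datetime import datetime, timedelta, timezone

# Assumed UTC offsets in hours per country code (Indonesia simplified to WIB).
COUNTRY_UTC_OFFSETS = {
    "ID": 7,
    "TH": 7,
    "VN": 7,
    "PH": 8,
    "MY": 8,
    "SG": 8,
}


def to_utc(timestamp: str, country: str) -> str:
    """Convert an ISO 8601 timestamp to a UTC ISO 8601 string."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Naive timestamps are assumed to be in the country's local time.
        local = timezone(timedelta(hours=COUNTRY_UTC_OFFSETS[country]))
        parsed = parsed.replace(tzinfo=local)
    return parsed.astimezone(timezone.utc).isoformat()


print(to_utc("2024-03-01T14:30:00", "TH"))  # 2024-03-01T07:30:00+00:00
```

Storing UTC alongside the original string keeps the raw value available if an offset assumption later turns out to be wrong.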
Anti-Bot Protections
Courier tracking pages are common targets for automated access, and carriers have implemented various protections:
CAPTCHA challenges: Many tracking pages present CAPTCHA when they detect automated patterns. Mobile IPs from DataResearchTools rarely trigger these challenges because they appear as legitimate mobile users checking their deliveries.
Rate limiting: Carriers limit the number of tracking queries per IP address per time period. Mobile proxies with automatic rotation distribute queries across many IPs, staying well within per-IP limits.
Browser fingerprinting: Advanced tracking systems check browser characteristics to identify automation tools. Proper configuration of browser automation tools combined with mobile proxies creates a convincing profile.
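For the rate-limiting case above, a collector can also slow itself down when a carrier signals throttling. This is a minimal sketch, assuming the carrier answers with HTTP 429 when a limit is hit, which is not confirmed for any of the three carriers. `backoff_delay` and `fetch_with_backoff` are hypothetical helper names, and the fetch function is injected so the logic can be exercised without network access.

```python
import random
import time
from typing import Callable, Optional


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 60.0) -> float:
    """Exponential backoff ceiling in seconds for a zero-based attempt."""
    return min(cap, base * (2 ** attempt))


def fetch_with_backoff(
    fetch: Callable[[], int],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Call fetch() until it returns a non-429 status or attempts run out.

    fetch is any zero-argument callable returning an HTTP status code.
    """
    for attempt in range(max_attempts):
        status = fetch()
        if status != 429:
            return status
        # Full jitter: wait a random amount up to the backoff ceiling.
        sleep(random.uniform(0, backoff_delay(attempt)))
    return None


# Simulated carrier that throttles twice, then succeeds.
responses = iter([429, 429, 200])
waits = []
result = fetch_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, len(waits))  # 200 2
```

Randomizing the wait avoids many workers retrying at the same instant after a shared throttle event.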
Building a Multi-Carrier Tracking System
Architecture Design
A robust tracking data collection system needs several components:
Tracking Numbers      Proxy Layer            Collectors           Storage
(from orders DB)      (DataResearchTools)    (per carrier)        (Database)
       |                     |                     |                  |
       v                     v                     v                  v
Order System   --->   Mobile Proxies   --->  J&T Collector   ---> PostgreSQL
                      (country-specific)     Ninja Collector      TimescaleDB
                                             Flash Collector
                                                   |
                                                   v
                                             Parser/Normalizer
                                                   |
                                                   v
                                             Alert Engine

Step 1: Configure Country-Specific Proxies
Set up proxy connections for each country where you need to track parcels:
class CourierProxyConfig:
    """Manage proxy connections for multi-country courier tracking."""

    CARRIER_COUNTRIES = {
        "jt_express": ["ID", "TH", "VN", "PH", "MY", "SG"],
        "ninja_van": ["SG", "MY", "ID", "VN", "TH", "PH"],
        "flash_express": ["TH", "PH", "MY", "VN"],
    }

    def __init__(self, proxy_base, username, password):
        self.proxy_base = proxy_base
        self.username = username
        self.password = password

    def get_proxy(self, country_code, session_id=None):
        """Get proxy for specific country with optional sticky session."""
        url = (
            f"http://{self.username}:{self.password}"
            f"@{country_code.lower()}.{self.proxy_base}"
        )
        if session_id:
            # Sticky-session syntax varies by provider; adjust as needed.
            url += f"?session={session_id}"
        return {"http": url, "https": url}

    def get_carrier_proxy(self, carrier, tracking_number):
        """Determine the correct country proxy for a tracking number."""
        country = self._detect_country(carrier, tracking_number)
        return self.get_proxy(country)

    def _detect_country(self, carrier, tracking_number):
        """Detect country from tracking number prefix or format."""
        # Fall back to the carrier's first listed market
        # (Indonesia for J&T, as before).
        default = self.CARRIER_COUNTRIES.get(carrier, ["ID"])[0]
        if carrier != "jt_express":
            return default
        # J&T tracking numbers often have country-specific prefixes.
        # These prefixes are illustrative; verify them against real numbers.
        prefixes = {
            "JP": "ID",  # J&T Indonesia
            "JT": "TH",  # J&T Thailand
            "JV": "VN",  # J&T Vietnam
        }
        prefix = tracking_number[:2].upper()
        return prefixes.get(prefix, default)

Step 2: Build Carrier-Specific Collectors
Each carrier requires a tailored collection approach:
import requests
from datetime import datetime
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TrackingEvent:
    timestamp: str
    status: str
    description: str
    location: str
    raw_status_code: Optional[str] = None


@dataclass
class TrackingResult:
    tracking_number: str
    carrier: str
    country: str
    current_status: str
    events: List[TrackingEvent] = field(default_factory=list)
    estimated_delivery: Optional[str] = None
    collected_at: str = ""

class JTExpressCollector:
    """Collect tracking data from J&T Express."""

    COUNTRY_URLS = {
        "ID": "https://jtexpress.co.id/tracking",
        "TH": "https://jtexpress.co.th/tracking",
        "VN": "https://jtexpress.vn/tracking",
        "PH": "https://jtexpress.ph/tracking",
        "MY": "https://jtexpress.my/tracking",
        "SG": "https://jtexpress.sg/tracking",
    }

    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.session = requests.Session()

    def track(self, tracking_number, country="ID"):
        """Track a J&T Express parcel."""
        proxy = self.proxy_config.get_proxy(country)
        self.session.proxies = proxy
        self.session.headers.update({
            "User-Agent": (
                "Mozilla/5.0 (Linux; Android 13; SM-A546B) "
                "AppleWebKit/537.36 Chrome/120.0.0.0 Mobile Safari/537.36"
            ),
            "Accept": "application/json",
            "Content-Type": "application/json",
        })
        try:
            # J&T typically has an API endpoint behind the tracking page.
            # The path and payload here are illustrative; confirm them
            # against the live site for each country.
            response = self.session.post(
                f"{self.COUNTRY_URLS[country]}/api/track",
                json={"tracking_number": tracking_number},
                timeout=30,
            )
            if response.status_code == 200:
                return self._parse_response(
                    response.json(), tracking_number, country
                )
        except requests.RequestException as e:
            print(f"J&T tracking error for {tracking_number}: {e}")
        # Non-200 responses and request errors both yield None.
        return None

    def _parse_response(self, data, tracking_number, country):
        """Parse J&T API response into TrackingResult."""
        events = []
        for event in data.get("details", []):
            events.append(TrackingEvent(
                timestamp=event.get("date", ""),
                status=event.get("status", ""),
                description=event.get("description", ""),
                location=event.get("city", ""),
                raw_status_code=event.get("code"),
            ))
        return TrackingResult(
            tracking_number=tracking_number,
            carrier="jt_express",
            country=country,
            current_status=data.get("status", "unknown"),
            events=events,
            estimated_delivery=data.get("estimated_delivery"),
            collected_at=datetime.utcnow().isoformat(),
        )

class NinjaVanCollector:
    """Collect tracking data from Ninja Van."""

    COUNTRY_URLS = {
        "SG": "https://www.ninjavan.co/sg",
        "MY": "https://www.ninjavan.co/my",
        "ID": "https://www.ninjavan.co/id",
        "VN": "https://www.ninjavan.co/vn",
        "TH": "https://www.ninjavan.co/th",
        "PH": "https://www.ninjavan.co/ph",
    }

    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.session = requests.Session()

    def track(self, tracking_number, country="SG"):
        """Track a Ninja Van parcel."""
        proxy = self.proxy_config.get_proxy(country)
        self.session.proxies = proxy
        self.session.headers.update({
            "User-Agent": (
                "Mozilla/5.0 (iPhone; CPU iPhone OS 17_2 like Mac OS X) "
                "AppleWebKit/605.1.15 (KHTML, like Gecko) "
                "Version/17.2 Mobile/15E148 Safari/604.1"
            ),
            "Accept": "application/json",
        })
        try:
            # The endpoint path here is illustrative; confirm it
            # against the live site.
            response = self.session.get(
                f"{self.COUNTRY_URLS[country]}/api/tracking/{tracking_number}",
                timeout=30,
            )
            if response.status_code == 200:
                return self._parse_response(
                    response.json(), tracking_number, country
                )
        except requests.RequestException as e:
            print(f"Ninja Van tracking error for {tracking_number}: {e}")
        # Non-200 responses and request errors both yield None.
        return None

    def _parse_response(self, data, tracking_number, country):
        """Parse Ninja Van API response into TrackingResult."""
        events = []
        for event in data.get("events", []):
            events.append(TrackingEvent(
                timestamp=event.get("timestamp", ""),
                status=event.get("status", ""),
                description=event.get("description", ""),
                location=event.get("hub_name", ""),
                raw_status_code=event.get("status_code"),
            ))
        return TrackingResult(
            tracking_number=tracking_number,
            carrier="ninja_van",
            country=country,
            current_status=data.get("current_status", "unknown"),
            events=events,
            estimated_delivery=data.get("eta"),
            collected_at=datetime.utcnow().isoformat(),
        )

class FlashExpressCollector:
    """Collect tracking data from Flash Express."""

    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.session = requests.Session()

    def track(self, tracking_number, country="TH"):
        """Track a Flash Express parcel."""
        proxy = self.proxy_config.get_proxy(country)
        self.session.proxies = proxy
        self.session.headers.update({
            "User-Agent": (
                "Mozilla/5.0 (Linux; Android 14; OPPO A78) "
                "AppleWebKit/537.36 Chrome/121.0.0.0 Mobile Safari/537.36"
            ),
            "Accept": "application/json",
            "Accept-Language": "th-TH,th;q=0.9,en;q=0.8",
        })
        try:
            # The endpoint path here is illustrative; confirm it
            # against the live site.
            response = self.session.get(
                "https://flashexpress.com/api/tracking",
                params={"tracking_number": tracking_number},
                timeout=30,
            )
            if response.status_code == 200:
                return self._parse_response(
                    response.json(), tracking_number, country
                )
        except requests.RequestException as e:
            print(f"Flash Express tracking error for {tracking_number}: {e}")
        # Non-200 responses and request errors both yield None.
        return None

    def _parse_response(self, data, tracking_number, country):
        """Parse Flash Express response into TrackingResult."""
        events = []
        for event in data.get("tracking_details", []):
            events.append(TrackingEvent(
                timestamp=event.get("datetime", ""),
                status=event.get("status_text", ""),
                description=event.get("detail", ""),
                location=event.get("location", ""),
                raw_status_code=event.get("status_code"),
            ))
        return TrackingResult(
            tracking_number=tracking_number,
            carrier="flash_express",
            country=country,
            current_status=data.get("status", "unknown"),
            events=events,
            estimated_delivery=data.get("expected_delivery"),
            collected_at=datetime.utcnow().isoformat(),
        )

Step 3: Implement Batch Tracking
For monitoring large numbers of shipments, implement efficient batch processing:
import time
import random


class BatchTracker:
    """Track multiple parcels across carriers with rate limiting."""

    def __init__(self, collectors, max_workers=3):
        self.collectors = collectors
        # Stored for a future concurrent version; this implementation
        # processes requests one at a time.
        self.max_workers = max_workers

    def track_batch(self, tracking_requests):
        """
        Track a batch of parcels sequentially, grouped by carrier.

        tracking_requests: list of dicts with keys:
            carrier, tracking_number, country
        """
        results = []
        # Group by carrier to manage rate limits per platform
        by_carrier = {}
        for req in tracking_requests:
            carrier = req["carrier"]
            if carrier not in by_carrier:
                by_carrier[carrier] = []
            by_carrier[carrier].append(req)
        for carrier, requests_list in by_carrier.items():
            collector = self.collectors.get(carrier)
            if not collector:
                # Skip carriers that have no registered collector.
                continue
            for req in requests_list:
                result = collector.track(
                    req["tracking_number"], req["country"]
                )
                if result:
                    results.append(result)
                # Rate limiting: wait between requests
                time.sleep(random.uniform(2, 4))
        return results

Step 4: Normalize Tracking Statuses
Different carriers use different status codes. Normalize them for consistent analysis:
class StatusNormalizer:
    """Normalize tracking statuses across carriers to common format."""

    STATUS_MAP = {
        # J&T Express statuses
        "picked_up": "PICKED_UP",
        "in_transit": "IN_TRANSIT",
        "arrived_at_sorting_center": "IN_TRANSIT",
        "out_for_delivery": "OUT_FOR_DELIVERY",
        "delivered": "DELIVERED",
        "delivery_failed": "FAILED_ATTEMPT",
        "returned": "RETURNED",
        # Ninja Van statuses
        "Pending Pickup": "PENDING_PICKUP",
        "En-route to Sorting Hub": "IN_TRANSIT",
        "Arrived at Sorting Hub": "IN_TRANSIT",
        "Arrived at Origin Hub": "IN_TRANSIT",
        "On Vehicle for Delivery": "OUT_FOR_DELIVERY",
        "Completed": "DELIVERED",
        "Pending Reschedule": "FAILED_ATTEMPT",
        "Returned to Sender": "RETURNED",
        # Flash Express statuses
        "รับพัสดุ": "PICKED_UP",
        "กำลังจัดส่ง": "IN_TRANSIT",
        "อยู่ระหว่างการจัดส่ง": "OUT_FOR_DELIVERY",
        "จัดส่งสำเร็จ": "DELIVERED",
        "จัดส่งไม่สำเร็จ": "FAILED_ATTEMPT",
    }

    STANDARD_STATUSES = [
        "PENDING_PICKUP", "PICKED_UP", "IN_TRANSIT",
        "OUT_FOR_DELIVERY", "DELIVERED", "FAILED_ATTEMPT",
        "RETURNED", "UNKNOWN"
    ]

    def normalize(self, raw_status):
        """Convert carrier-specific status to standard status."""
        return self.STATUS_MAP.get(raw_status, "UNKNOWN")

Step 5: Build Delivery Performance Analytics
With normalized tracking data, analyze carrier performance:
import pandas as pd
from datetime import datetime, timedelta


class DeliveryPerformanceAnalyzer:
    """Analyze delivery performance from collected tracking data."""

    def calculate_delivery_times(self, tracking_results):
        """Calculate actual delivery times from tracking events."""
        normalizer = StatusNormalizer()
        delivery_times = []
        for result in tracking_results:
            # current_status holds the raw carrier status, so normalize it
            # before comparing against the standard DELIVERED value.
            if normalizer.normalize(result.current_status) != "DELIVERED":
                continue
            pickup_time = None
            delivery_time = None
            for event in result.events:
                if not event.timestamp:
                    # Skip events without a timestamp; they cannot be parsed.
                    continue
                normalized = normalizer.normalize(event.status)
                if normalized == "PICKED_UP" and not pickup_time:
                    pickup_time = datetime.fromisoformat(event.timestamp)
                elif normalized == "DELIVERED":
                    delivery_time = datetime.fromisoformat(event.timestamp)
            if pickup_time and delivery_time:
                transit_hours = (
                    delivery_time - pickup_time
                ).total_seconds() / 3600
                delivery_times.append({
                    "carrier": result.carrier,
                    "country": result.country,
                    "tracking_number": result.tracking_number,
                    "transit_hours": transit_hours,
                    "transit_days": transit_hours / 24,
                })
        return pd.DataFrame(delivery_times)

    def carrier_comparison(self, delivery_df):
        """Compare delivery performance across carriers."""
        summary = delivery_df.groupby(["carrier", "country"]).agg({
            "transit_hours": ["mean", "median", "std", "min", "max", "count"],
        }).round(2)
        return summary

    def on_time_rate(self, delivery_df, sla_hours=72):
        """Calculate on-time delivery rate based on SLA threshold."""
        # Work on a copy so the caller's DataFrame is not modified.
        delivery_df = delivery_df.copy()
        delivery_df["on_time"] = delivery_df["transit_hours"] <= sla_hours
        rates = delivery_df.groupby("carrier").agg(
            total=("on_time", "count"),
            on_time_count=("on_time", "sum"),
        )
        rates["on_time_pct"] = (
            rates["on_time_count"] / rates["total"] * 100
        ).round(1)
        return rates

Handling Common Challenges
Rate Limiting Across Carriers
Each carrier has different rate limit thresholds. With DataResearchTools mobile proxies and automatic rotation, you can maintain higher throughput while staying within acceptable limits:
- J&T Express: Moderate rate limits. Use 3-5 second delays between queries.
- Ninja Van: Stricter rate limiting. Use 5-8 second delays and rotate sessions.
- Flash Express: Moderate limits. Use 3-6 second delays.
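The delay ranges above can be kept in one place and looked up per carrier. This is a small sketch: `CARRIER_DELAYS` and `next_delay` are illustrative names, the carrier keys mirror those used earlier in this guide, and the fallback range for unknown carriers is an assumption.

```python
import random

# Delay ranges in seconds, taken from the guidance above.
CARRIER_DELAYS = {
    "jt_express": (3.0, 5.0),
    "ninja_van": (5.0, 8.0),
    "flash_express": (3.0, 6.0),
}
DEFAULT_DELAY = (5.0, 8.0)  # assumed conservative fallback


def next_delay(carrier: str) -> float:
    """Pick a random delay within the carrier's configured range."""
    low, high = CARRIER_DELAYS.get(carrier, DEFAULT_DELAY)
    return random.uniform(low, high)


for name in CARRIER_DELAYS:
    print(name, round(next_delay(name), 2))
```

In `BatchTracker.track_batch`, a call such as `time.sleep(next_delay(carrier))` could stand in for the fixed pause between requests.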
Multi-Language Tracking Data
Flash Express returns tracking data in Thai by default. J&T Express in Indonesia returns data in Bahasa Indonesia. Handle multi-language data:
def translate_status(status_text, source_language):
    """Map common courier status phrases to English."""
    translations = {
        "th": {
            "รับพัสดุแล้ว": "Parcel picked up",
            "ถึงศูนย์คัดแยก": "Arrived at sorting center",
            "กำลังนำส่ง": "Out for delivery",
            "นำส่งสำเร็จ": "Delivered successfully",
        },
        "id": {
            "Paket telah dipickup": "Parcel picked up",
            "Paket diterima di gudang": "Arrived at warehouse",
            "Paket sedang dikirim": "Out for delivery",
            "Paket telah diterima": "Delivered",
        },
        "vi": {
            "Đã lấy hàng": "Parcel picked up",
            "Đang vận chuyển": "In transit",
            "Đang giao hàng": "Out for delivery",
            "Giao thành công": "Delivered",
        },
    }
    lang_map = translations.get(source_language, {})
    # Unknown phrases are returned unchanged.
    return lang_map.get(status_text, status_text)

Stale and Missing Tracking Data
Not all tracking queries return useful data. Handle edge cases:
- No data found: The parcel may not have been scanned yet. Retry after a delay.
- Stale data: The last event may be days old, indicating a potential issue. Flag for investigation.
- Incomplete data: Some events may be missing from the tracking history. Cross-reference with carrier customer service if critical.
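The first two cases can be detected mechanically. The sketch below assumes event timestamps are ISO 8601 strings that carry a UTC offset, and it uses a 48-hour staleness threshold chosen purely for illustration. `classify_freshness` is a hypothetical helper, and delivered parcels should be filtered out before calling it, since their last event is expected to age.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def classify_freshness(
    event_timestamps: List[str],
    now: Optional[datetime] = None,
    stale_after: timedelta = timedelta(hours=48),
) -> str:
    """Return 'no_data', 'stale', or 'ok' for a parcel's event history."""
    if not event_timestamps:
        return "no_data"
    now = now or datetime.now(timezone.utc)
    # The newest event decides freshness, regardless of list order.
    latest = max(datetime.fromisoformat(ts) for ts in event_timestamps)
    return "stale" if now - latest > stale_after else "ok"


now = datetime(2024, 3, 5, 12, 0, tzinfo=timezone.utc)
print(classify_freshness([], now))                             # no_data
print(classify_freshness(["2024-03-01T08:00:00+00:00"], now))  # stale
print(classify_freshness(["2024-03-05T09:00:00+00:00"], now))  # ok
```

Parcels classified as `no_data` can be queued for a later retry, while `stale` ones can be flagged for investigation.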
Practical Applications
E-Commerce Customer Service
Automate tracking updates to customers by collecting data from carriers via proxies and pushing notifications through your own channels. This reduces “where is my package?” inquiries by proactively informing customers.
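Proactive notifications depend on knowing which parcels changed status since the last collection run. A minimal sketch, assuming statuses have already been normalized and are kept as a mapping from tracking number to status. `detect_status_changes` and the sample tracking numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def detect_status_changes(
    previous: Dict[str, str], current: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (tracking_number, old_status, new_status) for changed parcels.

    Parcels seen for the first time are reported with old_status 'UNKNOWN'.
    """
    changes = []
    for tracking_number, new_status in current.items():
        old_status = previous.get(tracking_number, "UNKNOWN")
        if new_status != old_status:
            changes.append((tracking_number, old_status, new_status))
    return changes


previous = {"TN001": "IN_TRANSIT", "TN002": "OUT_FOR_DELIVERY"}
current = {
    "TN001": "OUT_FOR_DELIVERY",
    "TN002": "OUT_FOR_DELIVERY",
    "TN003": "PICKED_UP",
}
changes = detect_status_changes(previous, current)
print(changes)
```

Each returned tuple can then be handed to whatever notification channel is in use, so customers hear about a change only once.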
Carrier Performance Benchmarking
Compare carriers on actual delivery performance rather than promises. DataResearchTools proxies enable you to collect tracking data at scale across all major SEA carriers, building a comprehensive performance database.
SLA Monitoring and Enforcement
For logistics companies with SLA commitments, automated tracking data collection enables real-time SLA monitoring. Set up alerts when parcels approach SLA deadlines without delivery.
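An alert rule of this kind can be reduced to a comparison between elapsed transit time and the SLA window. The sketch below reuses the 72-hour default from `on_time_rate`. The 80 percent warning threshold and the `sla_state` name are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def sla_state(
    picked_up_at: datetime,
    now: datetime,
    sla_hours: float = 72,
    warn_fraction: float = 0.8,
) -> str:
    """Classify an undelivered parcel as 'ok', 'at_risk', or 'breached'."""
    elapsed_hours = (now - picked_up_at).total_seconds() / 3600
    if elapsed_hours > sla_hours:
        return "breached"
    if elapsed_hours >= sla_hours * warn_fraction:
        return "at_risk"
    return "ok"


pickup = datetime(2024, 3, 1, 0, 0, tzinfo=timezone.utc)
print(sla_state(pickup, pickup + timedelta(hours=24)))  # ok
print(sla_state(pickup, pickup + timedelta(hours=60)))  # at_risk
print(sla_state(pickup, pickup + timedelta(hours=80)))  # breached
```

Running this check on every undelivered parcel after each collection cycle gives the alert engine a simple three-level signal to act on.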
Route and Network Analysis
Analyzing tracking event locations reveals carrier network structures, identifying which hubs serve which areas and where bottlenecks occur. This intelligence informs route optimization and carrier selection.
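A first step toward this kind of analysis is counting how often parcels move between each pair of hubs. The sketch below assumes each parcel's event locations are already in chronological order. `hub_transitions` and the hub names are hypothetical, and spotting bottlenecks would also require dwell times, which this example does not compute.

```python
from collections import Counter
from typing import Iterable, List


def hub_transitions(parcels: Iterable[List[str]]) -> Counter:
    """Count consecutive hub-to-hub movements across parcels."""
    edges: Counter = Counter()
    for locations in parcels:
        # Drop blank locations and collapse repeated scans at one hub.
        path: List[str] = []
        for loc in locations:
            if loc and (not path or path[-1] != loc):
                path.append(loc)
        # Each adjacent pair in the cleaned path is one movement.
        edges.update(zip(path, path[1:]))
    return edges


parcels = [
    ["Hub A", "Hub A", "Hub B", "Hub C"],
    ["Hub A", "Hub B", "", "Hub D"],
]
edges = hub_transitions(parcels)
print(edges.most_common(1))  # [(('Hub A', 'Hub B'), 2)]
```

The resulting counts form a weighted directed graph of the carrier's network, which can be loaded into a graph library for further study.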
Conclusion
Scraping tracking data from J&T Express, Ninja Van, and Flash Express is a foundational capability for any company operating in Southeast Asian e-commerce logistics. The combination of carrier-specific collectors, proper proxy infrastructure from DataResearchTools, and normalized data storage creates a powerful monitoring system.
DataResearchTools mobile proxies are essential for this use case because they provide authentic mobile connections from each SEA country, matching the profile of the millions of consumers and sellers who check tracking information on their phones daily. This ensures reliable access without triggering the anti-bot protections that carriers have implemented.
Whether you are building a multi-carrier tracking aggregator, monitoring your own shipping performance, or analyzing the logistics landscape for strategic decisions, systematic tracking data collection is a capability worth investing in.
Related Reading
- Best Proxies for Logistics and Supply Chain Data Collection
- Building a Delivery SLA Monitoring System with Proxies
- aiohttp + BeautifulSoup: Async Python Scraping
- How to Scrape AliExpress Product Data Without Getting Blocked
- Amazon Buy Box Monitoring: Proxy Setup for Continuous Tracking
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
last updated: April 3, 2026