Scraping J&T Express, Ninja Van, and Flash Express Tracking Data

Southeast Asia’s e-commerce boom has created a massive courier network dominated by regional carriers like J&T Express, Ninja Van, and Flash Express. These carriers collectively handle billions of parcels annually across Indonesia, Thailand, Vietnam, the Philippines, Malaysia, and Singapore. For logistics companies, e-commerce platforms, and supply chain analysts, access to tracking data from these carriers is essential for monitoring delivery performance, optimizing carrier selection, and ensuring customer satisfaction.

This guide covers the technical aspects of collecting tracking data from these three major SEA courier networks, including the proxy infrastructure, coding techniques, and data management strategies you need.

Overview of Major SEA Courier Networks

J&T Express

J&T Express is one of the largest courier companies in Southeast Asia, with operations in Indonesia, Vietnam, Thailand, the Philippines, Malaysia, Singapore, and China. Founded in Indonesia in 2015, J&T has grown explosively alongside the region’s e-commerce boom.

Tracking system characteristics:

  • Web-based tracking available at each country-specific domain (jtexpress.co.id, jtexpress.co.th, etc.)
  • API endpoints available for bulk tracking in some markets
  • Tracking pages use a combination of server-side rendering and JavaScript
  • Rate limiting is moderate but increases during peak periods
  • Each country’s tracking system operates somewhat independently

Ninja Van

Ninja Van operates across Singapore, Malaysia, Indonesia, Vietnam, Thailand, and the Philippines. Known for its technology-first approach, Ninja Van has robust tracking capabilities and API infrastructure.

Tracking system characteristics:

  • Centralized tracking portal at ninjavan.co with country-specific sections (for example, ninjavan.co/sg)
  • Well-structured API that returns JSON responses
  • More sophisticated anti-bot protections than some competitors
  • Real-time tracking updates with detailed status codes
  • Webhook capabilities for programmatic tracking

Flash Express

Flash Express is a major player in Thailand and has expanded to other SEA markets. It is known for competitive pricing and strong network coverage in Thailand.

Tracking system characteristics:

  • Primary tracking at flashexpress.com with Thai-focused interface
  • REST API endpoints for tracking queries
  • Moderate anti-bot protections
  • Status updates include hub-level detail
  • Thai language responses by default for Thailand operations

Why You Need Proxies for Courier Tracking Data

Country-Specific Access Requirements

Each carrier’s tracking system is primarily designed for domestic users. Accessing tracking data from outside the target country creates several issues:

Content localization: Tracking pages from a foreign IP may redirect to English-language versions that contain less detail than the local language versions. Hub names, status descriptions, and timing information may be simplified or omitted.

Access restrictions: Some carrier tracking APIs reject requests from non-local IP addresses or apply stricter rate limits to international traffic.

Data accuracy: Tracking timestamps and estimated delivery times may be adjusted based on the requester’s timezone, potentially causing confusion in data analysis.
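
One way to sidestep timezone ambiguity is to convert every event timestamp to UTC at collection time. The sketch below assumes carriers return naive local timestamps in `YYYY-MM-DD HH:MM:SS` form and that one fixed offset per country is good enough; both are assumptions, not documented carrier behavior. `COUNTRY_UTC_OFFSETS` and `to_utc` are illustrative names.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets per country (none of these observe DST).
# Indonesia spans UTC+7 to UTC+9; Jakarta time (UTC+7) is used here.
COUNTRY_UTC_OFFSETS = {
    "ID": 7, "TH": 7, "VN": 7,
    "PH": 8, "MY": 8, "SG": 8,
}

def to_utc(local_timestamp: str, country: str) -> str:
    """Convert a naive local timestamp string to an ISO 8601 UTC string."""
    offset = timezone(timedelta(hours=COUNTRY_UTC_OFFSETS[country]))
    local = datetime.strptime(local_timestamp, "%Y-%m-%d %H:%M:%S")
    return local.replace(tzinfo=offset).astimezone(timezone.utc).isoformat()

print(to_utc("2024-03-01 09:30:00", "TH"))  # 2024-03-01T02:30:00+00:00
```

Storing UTC alongside the raw carrier string keeps cross-country comparisons consistent while preserving the original value for debugging.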

Anti-Bot Protections

Courier tracking pages are common targets for automated access, and carriers have implemented various protections:

CAPTCHA challenges: Many tracking pages present a CAPTCHA when they detect automated patterns. Mobile IPs from DataResearchTools rarely trigger these challenges because their traffic looks like that of ordinary mobile users checking their deliveries.

Rate limiting: Carriers limit the number of tracking queries per IP address per time period. Mobile proxies with automatic rotation distribute queries across many IPs, staying well within per-IP limits.

Browser fingerprinting: Advanced tracking systems check browser characteristics to identify automation tools. Proper configuration of browser automation tools combined with mobile proxies creates a convincing profile.
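
Proxy rotation spreads load, but it still helps to cap how often any single identity hits a carrier. The following is a minimal sketch of a per-key throttle; the key is whatever identifies one exit identity in your setup (for example a carrier name paired with a sticky session ID). `PerKeyThrottle` is a hypothetical helper, not part of any proxy provider's SDK.

```python
import time

class PerKeyThrottle:
    """Enforce a minimum interval between requests that share a key."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last_request = {}  # key -> time of the most recent request

    def wait(self, key):
        """Block until `key` may be used again; return seconds waited."""
        now = self._clock()
        last = self._last_request.get(key)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when the request is actually allowed to go out
        self._last_request[key] = now + delay
        return delay
```

Calling `throttle.wait(("ninja_van", session_id))` before each request keeps a single session from exceeding the chosen interval, while requests under other keys proceed independently.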

Building a Multi-Carrier Tracking System

Architecture Design

A robust tracking data collection system needs several components:

Tracking Numbers     Proxy Layer           Collectors          Storage
(from orders DB)     (DataResearchTools)   (per carrier)       (Database)
     |                    |                    |                   |
     v                    v                    v                   v
Order System  --->  Mobile Proxies  --->  J&T Collector  --->  PostgreSQL
                    (country-specific)    Ninja Collector      TimescaleDB
                                         Flash Collector
                                              |
                                              v
                                         Parser/Normalizer
                                              |
                                              v
                                         Alert Engine
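
The storage end of this pipeline can be sketched as follows. Because the same parcel is polled repeatedly, each poll returns events that are already stored, so the table needs a uniqueness rule that makes inserts idempotent. This example uses the standard-library `sqlite3` module as a stand-in for PostgreSQL; the table name, columns, and `store_events` helper are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS tracking_events (
    carrier          TEXT NOT NULL,
    tracking_number  TEXT NOT NULL,
    event_time       TEXT NOT NULL,
    status           TEXT NOT NULL,
    location         TEXT,
    UNIQUE (carrier, tracking_number, event_time, status)
)
"""

def _row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM tracking_events").fetchone()[0]

def store_events(conn, carrier, tracking_number, events):
    """Insert events, skipping ones already stored; return rows added."""
    before = _row_count(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO tracking_events VALUES (?, ?, ?, ?, ?)",
        [
            (carrier, tracking_number, e["timestamp"], e["status"],
             e.get("location"))
            for e in events
        ],
    )
    conn.commit()
    return _row_count(conn) - before

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first_poll = [
    {"timestamp": "2024-03-01T08:00:00", "status": "PICKED_UP", "location": "Hub A"},
    {"timestamp": "2024-03-01T20:00:00", "status": "IN_TRANSIT", "location": "Hub B"},
]
print(store_events(conn, "jt_express", "JP0001", first_poll))  # 2
print(store_events(conn, "jt_express", "JP0001", first_poll))  # 0 (duplicates skipped)
```

In PostgreSQL the equivalent of `INSERT OR IGNORE` is `INSERT ... ON CONFLICT DO NOTHING` against the same unique constraint.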

Step 1: Configure Country-Specific Proxies

Set up proxy connections for each country where you need to track parcels:

class CourierProxyConfig:
    """Manage proxy connections for multi-country courier tracking."""

    CARRIER_COUNTRIES = {
        "jt_express": ["ID", "TH", "VN", "PH", "MY", "SG"],
        "ninja_van": ["SG", "MY", "ID", "VN", "TH", "PH"],
        "flash_express": ["TH", "PH", "MY", "VN"],
    }

    def __init__(self, proxy_base, username, password):
        self.proxy_base = proxy_base
        self.username = username
        self.password = password

    def get_proxy(self, country_code, session_id=None):
        """Get proxy for specific country with optional sticky session."""
        url = (
            f"http://{self.username}:{self.password}"
            f"@{country_code.lower()}.{self.proxy_base}"
        )
        if session_id:
            url += f"?session={session_id}"
        return {"http": url, "https": url}

    def get_carrier_proxy(self, carrier, tracking_number):
        """Determine the correct country proxy for a tracking number."""
        country = self._detect_country(carrier, tracking_number)
        return self.get_proxy(country)

    def _detect_country(self, carrier, tracking_number):
        """Detect country from tracking number prefix or format."""
        # J&T tracking numbers often have country-specific prefixes
        jt_prefixes = {
            "JP": "ID",   # J&T Indonesia
            "JT": "TH",   # J&T Thailand
            "JV": "VN",   # J&T Vietnam
        }
        if carrier == "jt_express":
            prefix = tracking_number[:2].upper()
            if prefix in jt_prefixes:
                return jt_prefixes[prefix]
        # No usable prefix: fall back to the carrier's first listed country
        return self.CARRIER_COUNTRIES.get(carrier, ["ID"])[0]

Step 2: Build Carrier-Specific Collectors

Each carrier requires a tailored collection approach:

import requests
from datetime import datetime
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TrackingEvent:
    timestamp: str
    status: str
    description: str
    location: str
    raw_status_code: Optional[str] = None

@dataclass
class TrackingResult:
    tracking_number: str
    carrier: str
    country: str
    current_status: str
    events: List[TrackingEvent] = field(default_factory=list)
    estimated_delivery: Optional[str] = None
    collected_at: str = ""

class JTExpressCollector:
    """Collect tracking data from J&T Express."""

    COUNTRY_URLS = {
        "ID": "https://jtexpress.co.id/tracking",
        "TH": "https://jtexpress.co.th/tracking",
        "VN": "https://jtexpress.vn/tracking",
        "PH": "https://jtexpress.ph/tracking",
        "MY": "https://jtexpress.my/tracking",
        "SG": "https://jtexpress.sg/tracking",
    }

    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.session = requests.Session()

    def track(self, tracking_number, country="ID"):
        """Track a J&T Express parcel."""
        proxy = self.proxy_config.get_proxy(country)
        self.session.proxies = proxy
        self.session.headers.update({
            "User-Agent": (
                "Mozilla/5.0 (Linux; Android 13; SM-A546B) "
                "AppleWebKit/537.36 Chrome/120.0.0.0 Mobile Safari/537.36"
            ),
            "Accept": "application/json",
            "Content-Type": "application/json",
        })

        try:
            # J&T typically has an API endpoint behind the tracking page
            response = self.session.post(
                f"{self.COUNTRY_URLS[country]}/api/track",
                json={"tracking_number": tracking_number},
                timeout=30,
            )
            if response.status_code == 200:
                return self._parse_response(
                    response.json(), tracking_number, country
                )
        except requests.RequestException as e:
            print(f"J&T tracking error for {tracking_number}: {e}")
            return None

    def _parse_response(self, data, tracking_number, country):
        """Parse J&T API response into TrackingResult."""
        events = []
        for event in data.get("details", []):
            events.append(TrackingEvent(
                timestamp=event.get("date", ""),
                status=event.get("status", ""),
                description=event.get("description", ""),
                location=event.get("city", ""),
                raw_status_code=event.get("code"),
            ))

        return TrackingResult(
            tracking_number=tracking_number,
            carrier="jt_express",
            country=country,
            current_status=data.get("status", "unknown"),
            events=events,
            estimated_delivery=data.get("estimated_delivery"),
            collected_at=datetime.utcnow().isoformat(),
        )


class NinjaVanCollector:
    """Collect tracking data from Ninja Van."""

    COUNTRY_URLS = {
        "SG": "https://www.ninjavan.co/sg",
        "MY": "https://www.ninjavan.co/my",
        "ID": "https://www.ninjavan.co/id",
        "VN": "https://www.ninjavan.co/vn",
        "TH": "https://www.ninjavan.co/th",
        "PH": "https://www.ninjavan.co/ph",
    }

    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.session = requests.Session()

    def track(self, tracking_number, country="SG"):
        """Track a Ninja Van parcel."""
        proxy = self.proxy_config.get_proxy(country)
        self.session.proxies = proxy
        self.session.headers.update({
            "User-Agent": (
                "Mozilla/5.0 (iPhone; CPU iPhone OS 17_2 like Mac OS X) "
                "AppleWebKit/605.1.15 (KHTML, like Gecko) "
                "Version/17.2 Mobile/15E148 Safari/604.1"
            ),
            "Accept": "application/json",
        })

        try:
            response = self.session.get(
                f"{self.COUNTRY_URLS[country]}/api/tracking/{tracking_number}",
                timeout=30,
            )
            if response.status_code == 200:
                return self._parse_response(
                    response.json(), tracking_number, country
                )
        except requests.RequestException as e:
            print(f"Ninja Van tracking error for {tracking_number}: {e}")
            return None

    def _parse_response(self, data, tracking_number, country):
        """Parse Ninja Van API response into TrackingResult."""
        events = []
        for event in data.get("events", []):
            events.append(TrackingEvent(
                timestamp=event.get("timestamp", ""),
                status=event.get("status", ""),
                description=event.get("description", ""),
                location=event.get("hub_name", ""),
                raw_status_code=event.get("status_code"),
            ))

        return TrackingResult(
            tracking_number=tracking_number,
            carrier="ninja_van",
            country=country,
            current_status=data.get("current_status", "unknown"),
            events=events,
            estimated_delivery=data.get("eta"),
            collected_at=datetime.utcnow().isoformat(),
        )


class FlashExpressCollector:
    """Collect tracking data from Flash Express."""

    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.session = requests.Session()

    def track(self, tracking_number, country="TH"):
        """Track a Flash Express parcel."""
        proxy = self.proxy_config.get_proxy(country)
        self.session.proxies = proxy
        self.session.headers.update({
            "User-Agent": (
                "Mozilla/5.0 (Linux; Android 14; OPPO A78) "
                "AppleWebKit/537.36 Chrome/121.0.0.0 Mobile Safari/537.36"
            ),
            "Accept": "application/json",
            "Accept-Language": "th-TH,th;q=0.9,en;q=0.8",
        })

        try:
            response = self.session.get(
                "https://flashexpress.com/api/tracking",
                params={"tracking_number": tracking_number},
                timeout=30,
            )
            if response.status_code == 200:
                return self._parse_response(
                    response.json(), tracking_number, country
                )
        except requests.RequestException as e:
            print(f"Flash Express tracking error for {tracking_number}: {e}")
            return None

    def _parse_response(self, data, tracking_number, country):
        """Parse Flash Express response into TrackingResult."""
        events = []
        for event in data.get("tracking_details", []):
            events.append(TrackingEvent(
                timestamp=event.get("datetime", ""),
                status=event.get("status_text", ""),
                description=event.get("detail", ""),
                location=event.get("location", ""),
                raw_status_code=event.get("status_code"),
            ))

        return TrackingResult(
            tracking_number=tracking_number,
            carrier="flash_express",
            country=country,
            current_status=data.get("status", "unknown"),
            events=events,
            estimated_delivery=data.get("expected_delivery"),
            collected_at=datetime.utcnow().isoformat(),
        )

Step 3: Implement Batch Tracking

For monitoring large numbers of shipments, implement efficient batch processing:

import time
import random
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor, as_completed

class BatchTracker:
    """Track multiple parcels across carriers with rate limiting."""

    def __init__(self, collectors, max_workers=3):
        self.collectors = collectors
        self.max_workers = max_workers

    def _track_carrier(self, collector, requests_list):
        """Query one carrier sequentially, pausing between requests."""
        results = []
        for req in requests_list:
            result = collector.track(req["tracking_number"], req["country"])
            if result:
                results.append(result)
            # Rate limiting: wait between requests to the same carrier
            time.sleep(random.uniform(2, 4))
        return results

    def track_batch(self, tracking_requests):
        """
        Track a batch of parcels with controlled concurrency.

        tracking_requests: list of dicts with keys:
            carrier, tracking_number, country
        """
        # Group by carrier to manage rate limits per platform
        by_carrier = defaultdict(list)
        for req in tracking_requests:
            by_carrier[req["carrier"]].append(req)

        results = []
        # One task per carrier: each collector owns its own session, so
        # carriers run in parallel while each carrier's requests stay serial
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            futures = [
                pool.submit(self._track_carrier, self.collectors[carrier], reqs)
                for carrier, reqs in by_carrier.items()
                if carrier in self.collectors
            ]
            for future in as_completed(futures):
                results.extend(future.result())

        return results

Step 4: Normalize Tracking Statuses

Different carriers use different status codes. Normalize them for consistent analysis:

class StatusNormalizer:
    """Normalize tracking statuses across carriers to common format."""

    STATUS_MAP = {
        # J&T Express statuses
        "picked_up": "PICKED_UP",
        "in_transit": "IN_TRANSIT",
        "arrived_at_sorting_center": "IN_TRANSIT",
        "out_for_delivery": "OUT_FOR_DELIVERY",
        "delivered": "DELIVERED",
        "delivery_failed": "FAILED_ATTEMPT",
        "returned": "RETURNED",

        # Ninja Van statuses
        "Pending Pickup": "PENDING_PICKUP",
        "En-route to Sorting Hub": "IN_TRANSIT",
        "Arrived at Sorting Hub": "IN_TRANSIT",
        "Arrived at Origin Hub": "IN_TRANSIT",
        "On Vehicle for Delivery": "OUT_FOR_DELIVERY",
        "Completed": "DELIVERED",
        "Pending Reschedule": "FAILED_ATTEMPT",
        "Returned to Sender": "RETURNED",

        # Flash Express statuses
        "รับพัสดุ": "PICKED_UP",
        "กำลังจัดส่ง": "IN_TRANSIT",
        "อยู่ระหว่างการจัดส่ง": "OUT_FOR_DELIVERY",
        "จัดส่งสำเร็จ": "DELIVERED",
        "จัดส่งไม่สำเร็จ": "FAILED_ATTEMPT",
    }

    STANDARD_STATUSES = [
        "PENDING_PICKUP", "PICKED_UP", "IN_TRANSIT",
        "OUT_FOR_DELIVERY", "DELIVERED", "FAILED_ATTEMPT",
        "RETURNED", "UNKNOWN"
    ]

    def normalize(self, raw_status):
        """Convert carrier-specific status to standard status."""
        return self.STATUS_MAP.get(raw_status, "UNKNOWN")
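
Exact-match lookups break as soon as a carrier changes capitalization or spacing in a status label. A minimal sketch of a more tolerant lookup is shown below, using a small subset of the mappings above; it also counts labels that have no mapping so new carrier statuses surface quickly. `normalize_tolerant` and `unmapped` are illustrative names.

```python
from collections import Counter

RAW_STATUS_MAP = {
    "delivered": "DELIVERED",
    "Completed": "DELIVERED",
    "On Vehicle for Delivery": "OUT_FOR_DELIVERY",
    "จัดส่งสำเร็จ": "DELIVERED",
}

def _key(raw_status):
    """Collapse whitespace and ignore case so near-identical labels match."""
    return " ".join(raw_status.split()).casefold()

LOOKUP = {_key(raw): standard for raw, standard in RAW_STATUS_MAP.items()}
unmapped = Counter()  # raw label -> number of times it had no mapping

def normalize_tolerant(raw_status):
    """Return the standard status, recording any label with no mapping."""
    standard = LOOKUP.get(_key(raw_status))
    if standard is None:
        unmapped[raw_status] += 1
        return "UNKNOWN"
    return standard

print(normalize_tolerant("  on vehicle  for delivery "))  # OUT_FOR_DELIVERY
print(normalize_tolerant("Held at customs"))              # UNKNOWN
```

Reviewing `unmapped.most_common()` after each collection run shows which labels most need a mapping.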

Step 5: Build Delivery Performance Analytics

With normalized tracking data, analyze carrier performance:

import pandas as pd
from datetime import datetime, timedelta

class DeliveryPerformanceAnalyzer:
    """Analyze delivery performance from collected tracking data."""

    def calculate_delivery_times(self, tracking_results):
        """Calculate actual delivery times from tracking events."""
        delivery_times = []

        normalizer = StatusNormalizer()

        for result in tracking_results:
            # current_status holds the carrier's raw label, so normalize
            # it before comparing against the standard "DELIVERED" value
            if normalizer.normalize(result.current_status) != "DELIVERED":
                continue

            pickup_time = None
            delivery_time = None

            for event in result.events:
                normalized = normalizer.normalize(event.status)
                if normalized == "PICKED_UP" and not pickup_time:
                    pickup_time = datetime.fromisoformat(event.timestamp)
                elif normalized == "DELIVERED":
                    delivery_time = datetime.fromisoformat(event.timestamp)

            if pickup_time and delivery_time:
                transit_hours = (
                    delivery_time - pickup_time
                ).total_seconds() / 3600
                delivery_times.append({
                    "carrier": result.carrier,
                    "country": result.country,
                    "tracking_number": result.tracking_number,
                    "transit_hours": transit_hours,
                    "transit_days": transit_hours / 24,
                })

        return pd.DataFrame(delivery_times)

    def carrier_comparison(self, delivery_df):
        """Compare delivery performance across carriers."""
        summary = delivery_df.groupby(["carrier", "country"]).agg({
            "transit_hours": ["mean", "median", "std", "min", "max", "count"],
        }).round(2)

        return summary

    def on_time_rate(self, delivery_df, sla_hours=72):
        """Calculate on-time delivery rate based on SLA threshold."""
        # assign() returns a copy, so the caller's DataFrame is not mutated
        delivery_df = delivery_df.assign(
            on_time=delivery_df["transit_hours"] <= sla_hours
        )

        rates = delivery_df.groupby("carrier").agg(
            total=("on_time", "count"),
            on_time_count=("on_time", "sum"),
        )
        rates["on_time_pct"] = (
            rates["on_time_count"] / rates["total"] * 100
        ).round(1)

        return rates
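
To make the arithmetic behind these metrics concrete, here is a small worked example in plain Python. The four shipments are made-up values chosen for illustration, and the 72-hour SLA matches the default used above.

```python
from datetime import datetime

shipments = [  # (picked up, delivered), made-up values
    ("2024-03-01T08:00:00", "2024-03-02T14:00:00"),  # 30 h
    ("2024-03-01T09:00:00", "2024-03-04T09:00:00"),  # 72 h
    ("2024-03-01T10:00:00", "2024-03-05T10:00:00"),  # 96 h
    ("2024-03-02T12:00:00", "2024-03-04T00:00:00"),  # 36 h
]

def transit_hours(picked_up, delivered):
    """Hours elapsed between the pickup and delivery timestamps."""
    delta = datetime.fromisoformat(delivered) - datetime.fromisoformat(picked_up)
    return delta.total_seconds() / 3600

hours = [transit_hours(p, d) for p, d in shipments]
sla_hours = 72
on_time = [h <= sla_hours for h in hours]
on_time_pct = round(sum(on_time) / len(on_time) * 100, 1)

print(hours)        # [30.0, 72.0, 96.0, 36.0]
print(on_time_pct)  # 75.0
```

Three of the four parcels arrive within 72 hours (a parcel at exactly 72 hours counts as on time), giving a 75% on-time rate.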

Handling Common Challenges

Rate Limiting Across Carriers

Each carrier has different rate limit thresholds. With DataResearchTools mobile proxies and automatic rotation, you can maintain higher throughput while staying within acceptable limits:

  • J&T Express: Moderate rate limits. Use 3-5 second delays between queries.
  • Ninja Van: Stricter rate limiting. Use 5-8 second delays and rotate sessions.
  • Flash Express: Moderate limits. Use 3-6 second delays.
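
These per-carrier ranges can be kept in one table so the collection loop does not hard-code a single delay. The sketch below uses the ranges listed above; `CARRIER_DELAYS`, `DEFAULT_DELAY`, and `next_delay` are illustrative names, and the conservative default for unlisted carriers is an assumption.

```python
import random

# (min_seconds, max_seconds) between queries, taken from the list above
CARRIER_DELAYS = {
    "jt_express": (3.0, 5.0),
    "ninja_van": (5.0, 8.0),
    "flash_express": (3.0, 6.0),
}
DEFAULT_DELAY = (5.0, 8.0)  # assumption: use the strictest range if unknown

def next_delay(carrier, rng=random):
    """Pick a randomized delay within the carrier's configured range."""
    low, high = CARRIER_DELAYS.get(carrier, DEFAULT_DELAY)
    return rng.uniform(low, high)
```

A collection loop would then call `time.sleep(next_delay(carrier))` between requests instead of using one fixed range for every carrier.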

Multi-Language Tracking Data

Flash Express returns tracking data in Thai by default. J&T Express in Indonesia returns data in Bahasa Indonesia. Handle multi-language data:

def translate_status(status_text, source_language):
    """Map common courier status phrases to English."""
    translations = {
        "th": {
            "รับพัสดุแล้ว": "Parcel picked up",
            "ถึงศูนย์คัดแยก": "Arrived at sorting center",
            "กำลังนำส่ง": "Out for delivery",
            "นำส่งสำเร็จ": "Delivered successfully",
        },
        "id": {
            "Paket telah dipickup": "Parcel picked up",
            "Paket diterima di gudang": "Arrived at warehouse",
            "Paket sedang dikirim": "Out for delivery",
            "Paket telah diterima": "Delivered",
        },
        "vi": {
            "Đã lấy hàng": "Parcel picked up",
            "Đang vận chuyển": "In transit",
            "Đang giao hàng": "Out for delivery",
            "Giao thành công": "Delivered",
        },
    }
    lang_map = translations.get(source_language, {})
    return lang_map.get(status_text, status_text)

Stale and Missing Tracking Data

Not all tracking queries return useful data. Handle edge cases:

  • No data found: The parcel may not have been scanned yet. Retry after a delay.
  • Stale data: The last event may be days old, indicating a potential issue. Flag for investigation.
  • Incomplete data: Some events may be missing from the tracking history. Cross-reference with carrier customer service if critical.
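
The first two cases can be detected automatically. The following sketch classifies a parcel's event history and produces a backoff schedule for re-querying parcels with no scans yet. The 48-hour staleness threshold, the 30-minute base delay, and the function names are assumptions to adjust for your own shipments; timestamps are assumed to be ISO 8601 strings with a UTC offset.

```python
from datetime import datetime, timedelta, timezone

TERMINAL = {"DELIVERED", "RETURNED"}  # no further events expected

def classify(events, normalized_status, now, stale_after=timedelta(hours=48)):
    """Return 'no_data', 'stale', or 'ok' for one parcel's event history."""
    if not events:
        return "no_data"
    if normalized_status in TERMINAL:
        return "ok"
    last_event = max(datetime.fromisoformat(e["timestamp"]) for e in events)
    return "stale" if now - last_event > stale_after else "ok"

def retry_delays(base_minutes=30, factor=2, attempts=4):
    """Exponential backoff schedule (in minutes) for 'no_data' parcels."""
    return [base_minutes * factor ** i for i in range(attempts)]

now = datetime(2024, 3, 5, tzinfo=timezone.utc)
events = [{"timestamp": "2024-03-01T10:00:00+00:00"}]
print(classify(events, "IN_TRANSIT", now))  # stale
print(retry_delays())                       # [30, 60, 120, 240]
```

Parcels classified as `stale` can be routed to a review queue, while `no_data` parcels are re-queued according to the backoff schedule.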

Practical Applications

E-Commerce Customer Service

Automate tracking updates to customers by collecting data from carriers via proxies and pushing notifications through your own channels. This reduces “where is my package?” inquiries by proactively informing customers.

Carrier Performance Benchmarking

Compare carriers on actual delivery performance rather than promises. DataResearchTools proxies enable you to collect tracking data at scale across all major SEA carriers, building a comprehensive performance database.

SLA Monitoring and Enforcement

For logistics companies with SLA commitments, automated tracking data collection enables real-time SLA monitoring. Set up alerts when parcels approach SLA deadlines without delivery.
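
A minimal sketch of such an alert check follows. It assumes each parcel record carries a normalized status and a timezone-aware pickup time; the record layout, the 12-hour warning window, and the `sla_alerts` name are illustrative.

```python
from datetime import datetime, timedelta, timezone

def sla_alerts(parcels, now, sla_hours=72, warn_hours=12):
    """Return (tracking_number, level) for undelivered parcels near or past SLA."""
    alerts = []
    for p in parcels:
        if p["status"] == "DELIVERED":
            continue
        deadline = p["picked_up_at"] + timedelta(hours=sla_hours)
        remaining = deadline - now
        if remaining <= timedelta(0):
            alerts.append((p["tracking_number"], "breached"))
        elif remaining <= timedelta(hours=warn_hours):
            alerts.append((p["tracking_number"], "at_risk"))
    return alerts

utc = timezone.utc
now = datetime(2024, 3, 4, 12, 0, tzinfo=utc)
parcels = [
    {"tracking_number": "A", "status": "IN_TRANSIT",
     "picked_up_at": datetime(2024, 3, 1, 10, 0, tzinfo=utc)},
    {"tracking_number": "B", "status": "OUT_FOR_DELIVERY",
     "picked_up_at": datetime(2024, 3, 2, 0, 0, tzinfo=utc)},
    {"tracking_number": "C", "status": "IN_TRANSIT",
     "picked_up_at": datetime(2024, 3, 3, 0, 0, tzinfo=utc)},
    {"tracking_number": "D", "status": "DELIVERED",
     "picked_up_at": datetime(2024, 3, 1, 0, 0, tzinfo=utc)},
]
print(sla_alerts(parcels, now))  # [('A', 'breached'), ('B', 'at_risk')]
```

Parcel A is past its 72-hour deadline, B has exactly 12 hours left, C still has 36 hours, and D is already delivered.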

Route and Network Analysis

Analyzing tracking event locations reveals carrier network structures, identifying which hubs serve which areas and where bottlenecks occur. This intelligence informs route optimization and carrier selection.
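
As a rough sketch of this kind of analysis, the function below counts how often parcels move between each pair of hubs and averages the hours between the two scans. It assumes events carry a `location` and an ISO 8601 `timestamp` in a consistent format; the hub names and timestamps in the sample data are invented.

```python
from collections import Counter, defaultdict
from datetime import datetime

def hub_stats(parcels):
    """Count hub-to-hub legs and average the hours between their scans."""
    legs = Counter()
    leg_hours = defaultdict(list)
    for events in parcels:
        # ISO 8601 strings in one format sort chronologically as text
        ordered = sorted(events, key=lambda e: e["timestamp"])
        for prev, curr in zip(ordered, ordered[1:]):
            if prev["location"] == curr["location"]:
                continue  # repeated scan at the same hub, not a leg
            leg = (prev["location"], curr["location"])
            hours = (
                datetime.fromisoformat(curr["timestamp"])
                - datetime.fromisoformat(prev["timestamp"])
            ).total_seconds() / 3600
            legs[leg] += 1
            leg_hours[leg].append(hours)
    averages = {leg: sum(h) / len(h) for leg, h in leg_hours.items()}
    return legs, averages

sample = [
    [{"location": "Bangkok Hub", "timestamp": "2024-03-01T08:00:00"},
     {"location": "Chiang Mai Hub", "timestamp": "2024-03-01T20:00:00"}],
    [{"location": "Bangkok Hub", "timestamp": "2024-03-02T06:00:00"},
     {"location": "Chiang Mai Hub", "timestamp": "2024-03-03T00:00:00"}],
]
legs, averages = hub_stats(sample)
print(legs[("Bangkok Hub", "Chiang Mai Hub")])      # 2
print(averages[("Bangkok Hub", "Chiang Mai Hub")])  # 15.0
```

Legs with unusually high average hours relative to their distance are candidates for bottleneck investigation.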

Conclusion

Scraping tracking data from J&T Express, Ninja Van, and Flash Express is a foundational capability for any company operating in Southeast Asian e-commerce logistics. The combination of carrier-specific collectors, proper proxy infrastructure from DataResearchTools, and normalized data storage creates a powerful monitoring system.

DataResearchTools mobile proxies suit this use case because they provide authentic mobile connections from each SEA country, matching the profile of the millions of consumers and sellers who check tracking information on their phones daily. This makes access more reliable and less likely to trigger the anti-bot protections that carriers have implemented.

Whether you are building a multi-carrier tracking aggregator, monitoring your own shipping performance, or analyzing the logistics landscape for strategic decisions, systematic tracking data collection is a capability worth investing in.


last updated: April 3, 2026
