How to Scrape Freight Rate Data from Shipping Platforms

Freight rates fluctuate constantly. Ocean shipping costs can swing by 30-50% within a single quarter depending on demand surges, fuel prices, port congestion, and geopolitical disruptions. For freight forwarders, shippers, and logistics technology companies, having real-time access to rate data across multiple platforms is not a luxury but a necessity for making informed decisions.

This guide walks through the practical process of collecting freight rate data from shipping platforms, including the tools, proxy infrastructure, and techniques you need to build a reliable rate monitoring system.

Understanding the Freight Rate Data Landscape

Freight rate data lives across dozens of platforms, each with different structures, access methods, and anti-scraping protections. Understanding this landscape is the first step to building an effective collection system.

Major Freight Rate Platforms

Freightos is one of the largest digital freight marketplaces, offering rates for ocean, air, and land freight. Their platform aggregates quotes from multiple carriers and freight forwarders. Rate data on Freightos changes frequently and varies by origin-destination pair, container type, and booking timeline.

Xeneta provides benchmarking data based on contracted and spot rates from a large network of shippers. While much of their data is behind a subscription, publicly accessible rate indices and trends can be collected for market intelligence purposes.

Container xChange focuses on container leasing and trading, with valuable data on container availability and one-way leasing rates. Their platform provides insights into equipment costs that directly impact total shipping expenses.

Individual carrier portals from companies like Maersk, MSC, CMA CGM, Evergreen, and Hapag-Lloyd each publish their own spot rates, surcharges, and service schedules. These are primary sources for actual carrier pricing.

Regional platforms serving Southeast Asian trade lanes focus on intra-Asia shipping routes, which often carry different pricing dynamics than the major East-West trade lanes.

Types of Freight Rate Data

The data you can collect from these platforms includes:

  • Spot rates: Current market prices for immediate or near-term shipments
  • Contract rates: Longer-term negotiated rates (often partially visible)
  • Surcharges: Fuel surcharges (BAF/BAS), currency adjustment factors (CAF), peak season surcharges
  • Transit times: Door-to-door and port-to-port estimated durations
  • Service schedules: Sailing frequencies, vessel assignments, port rotation sequences
  • Equipment availability: Container type availability by location

Why Proxies Are Essential for Freight Rate Scraping

Shipping platforms employ multiple layers of protection against automated data collection. Understanding these protections helps you design a more effective collection strategy.

Platform Protection Mechanisms

IP-based rate limiting is the most common defense. Platforms track the number of requests per IP address and throttle or block addresses that exceed normal usage patterns. A human user might check 5-10 rate quotes per session, while a data collection script might attempt hundreds or thousands.

Geographic content serving means that the same rate query can return different results depending on where the request originates. A rate quote requested from a Singapore IP may differ from one requested from a US IP, reflecting local pricing, currency, and service availability.

Bot detection systems analyze request patterns, browser fingerprints, and behavioral signals to distinguish automated traffic from human users. Modern systems use JavaScript challenges, CAPTCHA, and behavioral analysis.
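Because IP-based rate limiting is the first line of defense, it pays to centralize request pacing in your collector. Below is a minimal per-host throttle sketch; the 3-7 second delay window mirrors the delays used later in this guide and should be tuned per platform:

```python
import time
import random

class HostThrottle:
    """Enforce a minimum randomized delay between requests to each host."""

    def __init__(self, min_delay=3.0, max_delay=7.0):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.last_request = {}  # host -> timestamp of the previous request

    def wait(self, host):
        """Block until enough time has passed since the last request to host."""
        now = time.monotonic()
        elapsed = now - self.last_request.get(host, 0.0)
        delay = random.uniform(self.min_delay, self.max_delay)
        if elapsed < delay:
            time.sleep(delay - elapsed)
        self.last_request[host] = time.monotonic()
```

Call `throttle.wait("platform.example.com")` before each request; the randomized interval avoids the perfectly regular timing that bot detection systems flag.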

Why Mobile Proxies Excel for Freight Data

Mobile proxies are the most effective proxy type for freight rate collection because:

  1. High trust level: Mobile IPs are shared among thousands of users through carrier-grade NAT (CGNAT), so platforms rarely block them outright because doing so would cut off large numbers of legitimate users
  2. Natural traffic patterns: Requests from mobile IPs match the profile of users checking rates on their phones, which is increasingly common in the logistics industry
  3. Geographic authenticity: Mobile proxies from DataResearchTools provide genuine connections through local carriers, ensuring you receive locally accurate rate data

DataResearchTools offers mobile proxy connections through carriers across Southeast Asia, which is particularly valuable for collecting rate data on intra-Asia trade lanes that are underserved by Western-focused proxy providers.

Step-by-Step Guide to Scraping Freight Rates

Step 1: Identify Your Target Routes and Data Points

Before writing any code, define exactly what data you need:

Target routes:
- Singapore to Bangkok (ocean FCL, LCL)
- Jakarta to Ho Chi Minh City (ocean FCL)
- Manila to Kuala Lumpur (air cargo)
- Shenzhen to Singapore (ocean FCL, 20ft and 40ft)

Data points per route:
- Base rate per container/weight
- Fuel surcharge
- Terminal handling charges
- Transit time
- Carrier name
- Valid from/to dates
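These route definitions translate directly into a small config structure. A sketch in Python using UN/LOCODE-style port codes; the specific codes are illustrative, and the dict keys match what the collection script in Step 4 expects:

```python
# Route definitions using UN/LOCODE-style port codes. "origin_country"
# selects the proxy; the remaining keys feed the rate query itself.
ROUTES = [
    {"origin_country": "singapore", "origin": "SGSIN",
     "destination": "THBKK", "container_type": "40HC"},
    {"origin_country": "indonesia", "origin": "IDJKT",
     "destination": "VNSGN", "container_type": "20GP"},
    {"origin_country": "philippines", "origin": "PHMNL",
     "destination": "MYKUL", "container_type": "AIR"},
]
```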

Step 2: Analyze Target Platform Structure

Before building scrapers, manually explore each target platform to understand its structure. Use your browser’s developer tools to examine:

  • Page structure: How rate information is displayed on the page
  • API calls: Many platforms load rate data through AJAX/API calls that return structured JSON, which is much easier to parse than HTML
  • Authentication: Whether rate queries require login or can be accessed anonymously
  • Request parameters: What parameters are needed for rate queries (origin, destination, container type, date)

# Example: Analyzing a shipping platform's API calls
# After inspecting network traffic, you might find an API endpoint like:
# GET /api/v2/rates?origin=SGSIN&destination=THBKK&container=40HC&date=2026-03-15

# This is much more efficient than scraping rendered HTML

Step 3: Set Up Your Proxy Infrastructure

Configure your DataResearchTools mobile proxies for the collection job:

import requests

# Configure proxy endpoints for different SEA countries
proxy_endpoints = {
    "singapore": "http://user:pass@sg.dataresearchtools.com:port",
    "thailand": "http://user:pass@th.dataresearchtools.com:port",
    "indonesia": "http://user:pass@id.dataresearchtools.com:port",
    "vietnam": "http://user:pass@vn.dataresearchtools.com:port",
    "philippines": "http://user:pass@ph.dataresearchtools.com:port",
    "malaysia": "http://user:pass@my.dataresearchtools.com:port",
}

def get_proxy_for_route(origin_country):
    """Select proxy based on the origin country of the freight route."""
    proxy_url = proxy_endpoints.get(origin_country, proxy_endpoints["singapore"])
    return {"http": proxy_url, "https": proxy_url}
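Besides selecting a proxy by origin country, you can rotate round-robin through the whole pool when a route has no dedicated endpoint, which spreads load evenly. A sketch using `itertools.cycle` (the endpoint URLs are placeholders, as above):

```python
from itertools import cycle

# Placeholder endpoints; substitute the credentials and ports from
# your DataResearchTools dashboard.
proxy_endpoints = {
    "singapore": "http://user:pass@sg.dataresearchtools.com:port",
    "thailand": "http://user:pass@th.dataresearchtools.com:port",
}

# Round-robin iterator over all configured endpoints, used as a
# fallback when a route's origin country has no dedicated proxy.
_rotation = cycle(proxy_endpoints.values())

def next_proxy():
    """Return the next proxy in rotation, formatted for requests."""
    proxy_url = next(_rotation)
    return {"http": proxy_url, "https": proxy_url}
```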

Step 4: Build Your Rate Collection Scripts

Here is a structured approach to building freight rate scrapers:

import requests
import json
import time
import random
from datetime import datetime
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class FreightRate:
    origin: str
    destination: str
    carrier: str
    container_type: str
    base_rate: float
    currency: str
    fuel_surcharge: float
    transit_days: int
    valid_from: str
    valid_to: str
    collected_at: str
    source_platform: str

class FreightRateCollector:
    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.session = requests.Session()
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Linux; Android 13; SM-S918B) "
                          "AppleWebKit/537.36 (KHTML, like Gecko) "
                          "Chrome/120.0.0.0 Mobile Safari/537.36",
            "Accept": "application/json, text/html",
            "Accept-Language": "en-US,en;q=0.9",
        })

    def collect_rate(self, origin, destination, container_type, proxy):
        """Collect a single freight rate quote."""
        self.session.proxies = proxy

        try:
            response = self.session.get(
                "https://platform.example.com/api/rates",
                params={
                    "origin": origin,
                    "destination": destination,
                    "equipment": container_type,
                },
                timeout=30
            )
            response.raise_for_status()
            return self.parse_rate_response(response.json(), origin, destination)
        except requests.RequestException as e:
            print(f"Error collecting rate {origin}-{destination}: {e}")
            return None

    def parse_rate_response(self, data, origin, destination):
        """Parse API response into FreightRate objects."""
        rates = []
        for quote in data.get("quotes", []):
            rate = FreightRate(
                origin=origin,
                destination=destination,
                carrier=quote["carrier_name"],
                container_type=quote["equipment"],
                base_rate=quote["total_rate"],
                currency=quote["currency"],
                fuel_surcharge=quote.get("baf", 0),
                transit_days=quote["transit_time"],
                valid_from=quote["valid_from"],
                valid_to=quote["valid_to"],
                collected_at=datetime.utcnow().isoformat(),
                source_platform="platform_name"
            )
            rates.append(rate)
        return rates

    def collect_all_routes(self, routes):
        """Collect rates for all defined routes with delays."""
        all_rates = []
        for route in routes:
            proxy = get_proxy_for_route(route["origin_country"])
            rates = self.collect_rate(
                route["origin"],
                route["destination"],
                route["container_type"],
                proxy
            )
            if rates:
                all_rates.extend(rates)
            # Respectful delay between requests
            time.sleep(random.uniform(3, 7))
        return all_rates
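Mobile proxy connections can drop mid-request, so failed collections are worth retrying rather than discarding. A sketch of an exponential-backoff wrapper that works with the collector above; it assumes, as in `collect_rate`, that a failed call returns `None`:

```python
import time
import random

def with_backoff(func, *args, max_attempts=4, base_delay=2.0, **kwargs):
    """Retry a collection call with exponential backoff and jitter.

    Returns the first non-None result, or None after max_attempts failures.
    """
    for attempt in range(max_attempts):
        result = func(*args, **kwargs)
        if result is not None:
            return result
        # Exponential backoff: base, 2x, 4x ... plus random jitter
        time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return None
```

Usage would be `with_backoff(collector.collect_rate, origin, destination, container_type, proxy)` in place of the direct call.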

Step 5: Handle Anti-Bot Challenges

Freight platforms increasingly use JavaScript rendering and bot detection. For these cases, use browser automation with proxy support:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def create_proxied_browser(proxy_url):
    """Create a Selenium browser instance routed through a proxy."""
    chrome_options = Options()
    chrome_options.add_argument(f"--proxy-server={proxy_url}")
    chrome_options.add_argument("--disable-blink-features=AutomationControlled")
    chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])

    driver = webdriver.Chrome(options=chrome_options)
    return driver

Step 6: Store and Analyze Rate Data

Store collected rates in a structured database for trend analysis:

import sqlite3
from datetime import datetime

def store_rates(rates, db_path="freight_rates.db"):
    """Store collected freight rates in SQLite database."""
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    cursor.execute("""
        CREATE TABLE IF NOT EXISTS freight_rates (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            origin TEXT,
            destination TEXT,
            carrier TEXT,
            container_type TEXT,
            base_rate REAL,
            currency TEXT,
            fuel_surcharge REAL,
            transit_days INTEGER,
            valid_from TEXT,
            valid_to TEXT,
            collected_at TEXT,
            source_platform TEXT
        )
    """)

    for rate in rates:
        cursor.execute("""
            INSERT INTO freight_rates
            (origin, destination, carrier, container_type, base_rate,
             currency, fuel_surcharge, transit_days, valid_from,
             valid_to, collected_at, source_platform)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, (
            rate.origin, rate.destination, rate.carrier,
            rate.container_type, rate.base_rate, rate.currency,
            rate.fuel_surcharge, rate.transit_days, rate.valid_from,
            rate.valid_to, rate.collected_at, rate.source_platform
        ))

    conn.commit()
    conn.close()

Handling Common Challenges

Dynamic Pricing Pages

Many freight platforms load rate data dynamically through JavaScript. Static HTML scraping will not capture this data. Solutions include:

  • Browser automation (Selenium, Playwright) to render JavaScript before extracting data
  • API interception to identify and directly call the underlying data APIs
  • Headless browser services for scalable JavaScript rendering

Multi-Step Rate Queries

Some platforms require multi-step interactions: selecting origin, then destination, then container type, before displaying rates. Use sticky sessions with DataResearchTools to maintain the same IP throughout a multi-step interaction:

# Use sticky session to maintain the same IP for a complete rate query
session_proxy = "http://user:pass@sg.dataresearchtools.com:port?session=rate_query_001"
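A sticky session pairs naturally with `requests.Session`, so every step of a multi-step quote flow leaves through the same exit IP. A sketch; the `?session=` URL format, subdomain, and credentials are placeholders to be replaced with values from your DataResearchTools dashboard:

```python
import requests

def make_sticky_session(session_id, country="sg"):
    """Create a requests.Session pinned to one proxy IP via a session tag.

    The proxy URL format here is a placeholder; use the sticky-session
    format documented in your provider dashboard.
    """
    proxy_url = (f"http://user:pass@{country}.dataresearchtools.com:port"
                 f"?session={session_id}")
    session = requests.Session()
    session.proxies = {"http": proxy_url, "https": proxy_url}
    return session

# All steps of a multi-step quote then reuse one exit IP:
# s = make_sticky_session("rate_query_001")
# s.get(".../select-origin"); s.get(".../select-destination"); ...
```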

Rate Data Validation

Not all collected data is accurate. Implement validation rules:

def validate_rate(rate):
    """Basic validation for collected freight rates."""
    if rate.base_rate <= 0 or rate.base_rate > 50000:
        return False  # Unreasonable rate
    if rate.transit_days <= 0 or rate.transit_days > 90:
        return False  # Unreasonable transit time
    if rate.valid_from > rate.valid_to:
        return False  # Invalid date range (ISO-8601 date strings compare correctly as text)
    return True

Scheduling and Automation

Freight rates change frequently, so set up automated collection schedules:

  • Spot rates: Collect daily or twice daily
  • Contract rate benchmarks: Collect weekly
  • Surcharges: Collect weekly or when alerts indicate changes
  • Service schedules: Collect weekly

Use cron jobs, Airflow, or similar schedulers to automate your collection pipeline. Each run should use DataResearchTools proxies to ensure reliable access.

Practical Applications of Collected Freight Rate Data

Rate Benchmarking

Compare your contracted rates against market spot rates to identify renegotiation opportunities. A database of historical rates lets you demonstrate market trends during carrier negotiations.

Route Optimization

Identify the most cost-effective routes by comparing rates across multiple carriers and transshipment options. Sometimes a slightly longer route through a different hub port offers significantly lower rates.

Cost Forecasting

Historical rate data enables statistical modeling of future rate trends. Machine learning models trained on collected rate data, combined with external factors like fuel prices and demand indicators, can predict rate movements with useful accuracy.
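Long before you reach machine learning, a simple moving average over stored rates gives a usable baseline forecast. A sketch with the standard library; this is a naive baseline for sanity-checking, not a substitute for the modeling described above:

```python
import statistics

def moving_average_forecast(history, window=4):
    """Naive next-period forecast: mean of the last `window` observed rates.

    `history` is a chronologically ordered list of base rates for one route.
    """
    if len(history) < window:
        window = len(history)
    return statistics.fmean(history[-window:])
```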

Customer Quoting

Freight forwarders use collected rate data to generate competitive customer quotes quickly, knowing they are pricing based on current market conditions rather than outdated spreadsheets.

Conclusion

Scraping freight rate data from shipping platforms is technically challenging but immensely valuable for any company in the logistics space. The combination of well-structured collection scripts, reliable proxy infrastructure from DataResearchTools, and thoughtful data storage creates a powerful intelligence system.

By using mobile proxies from DataResearchTools, particularly for Southeast Asian trade lanes where their coverage is strongest, you ensure consistent access to rate data without the disruptions that come from IP blocking and geographic restrictions. Start with a focused set of routes and platforms, prove the value with your team, and expand your collection scope as the system demonstrates its worth.

The companies that win in freight are the ones with the best information. Building a robust rate scraping pipeline is one of the highest-ROI investments a logistics company can make.


last updated: April 3, 2026
