Mobile Proxies for Scraping Car Marketplace Apps (Carro, Carsome)

Southeast Asia’s used car market is increasingly dominated by mobile-first platforms. Carro, Carsome, and similar car marketplace apps have transformed how vehicles are bought and sold across the region. These platforms process thousands of transactions monthly and hold vast datasets of vehicle pricing, condition reports, and market demand signals.

For businesses that need this data, scraping these mobile apps presents unique challenges that demand a specialized approach. This guide covers everything you need to know about using mobile proxies to extract data from car marketplace apps.

Why Mobile Apps Are Different from Websites

API-First Architecture

Unlike traditional websites that serve HTML pages, mobile apps communicate with backend servers through APIs. These APIs return structured data, usually in JSON format, which is actually easier to parse than HTML once you can access it.

However, these APIs implement their own security measures:

  • API key authentication: Requests must include valid API keys that are embedded in the app binary
  • Certificate pinning: The app validates the server’s SSL certificate, preventing standard proxy interception
  • Device fingerprinting: The API validates device identifiers, OS version, and app version
  • Connection type validation: Some APIs check whether the request originates from a mobile network

Why Mobile Proxies Are Essential

The last point above, connection type validation, is critical. Car marketplace apps in Southeast Asia often validate the connection type of incoming API requests. A request that claims to come from a mobile device but arrives through a datacenter IP is likely to be flagged or rejected.

Mobile proxies solve this problem by routing your traffic through actual mobile network connections. When your API requests arrive from a genuine mobile carrier IP (like Singtel, Maxis, or AIS), they match the network profile the API expects from real app users.

DataResearchTools specializes in mobile proxies from Southeast Asian carriers, making them the ideal infrastructure for scraping apps like Carro and Carsome.

Understanding Carro’s Technical Architecture

Platform Overview

Carro operates across Singapore, Malaysia, Thailand, and Indonesia. The platform offers:

  • Buy and sell used cars
  • Car financing
  • Insurance
  • After-sale services

API Structure

Carro’s mobile app typically communicates through REST APIs with endpoints organized by function:

# Typical Carro API endpoints (illustrative)
GET /api/v2/listings - Search and list vehicles
GET /api/v2/listings/{id} - Get specific listing details
GET /api/v2/listings/{id}/inspection - Vehicle inspection report
GET /api/v2/makes - List available car makes
GET /api/v2/models?make_id={id} - List models for a make
POST /api/v2/search - Advanced search with filters

Data Available Through Carro

  • Vehicle make, model, year, and variant
  • Price and financing options
  • Mileage and condition details
  • Inspection report findings
  • Photos (usually 30-50 per listing)
  • Seller information
  • Vehicle history
  • Market value estimation

Understanding Carsome’s Technical Architecture

Platform Overview

Carsome is the largest integrated car e-commerce platform in Southeast Asia, operating in Malaysia, Singapore, Thailand, and Indonesia. They handle both C2B (selling your car) and B2C (buying a certified car) transactions.

API Structure

Carsome’s API is typically more tightly secured than Carro’s:

# Typical Carsome API patterns (illustrative)
GET /api/cars - List available cars
GET /api/cars/{id} - Car details
GET /api/cars/{id}/report - Inspection report
POST /api/cars/search - Filtered search
GET /api/valuations - Market valuation data

Unique Data Points

Carsome’s certified pre-owned program provides particularly valuable data:

  • 175-point inspection results
  • Certified pricing vs. market pricing
  • Warranty and after-sale terms
  • Refurbishment details
  • Price transparency metrics

Setting Up Mobile Proxy Infrastructure

Choosing the Right Carrier

Different carriers in each country may yield different results:

Singapore:

  • Singtel – Largest carrier, widest IP pool
  • StarHub – Good IP diversity
  • M1 – Smaller pool but less commonly blocked

Malaysia:

  • Maxis – One of the largest mobile networks
  • Celcom – Wide coverage (merged with Digi in 2022 to form CelcomDigi)
  • Digi – Good for data-heavy operations (now part of CelcomDigi)

Thailand:

  • AIS – Market leader
  • DTAC – Strong data network
  • TrueMove H – Growing coverage

Indonesia:

  • Telkomsel – Dominant carrier
  • Indosat – Good alternative
  • XL Axiata – Useful for rotation

DataResearchTools provides access to IPs from all major carriers across these countries, giving you the flexibility to rotate between carriers if one becomes less effective.
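
Rotating between carriers can be driven by observed success rates rather than guesswork. The following is a minimal sketch of that idea; the `CarrierRotator` class and the carrier slugs are hypothetical names for illustration and are not part of any provider's API.

```python
from collections import defaultdict


class CarrierRotator:
    """Tracks request outcomes per carrier and prefers the healthiest one."""

    def __init__(self, carriers, min_samples=5):
        self.carriers = list(carriers)
        self.min_samples = min_samples
        self.stats = defaultdict(lambda: {"ok": 0, "fail": 0})

    def record(self, carrier, success):
        # Count one successful or failed request for this carrier.
        self.stats[carrier]["ok" if success else "fail"] += 1

    def success_rate(self, carrier):
        s = self.stats[carrier]
        total = s["ok"] + s["fail"]
        # Carriers with too little data are treated optimistically so they get tried.
        if total < self.min_samples:
            return 1.0
        return s["ok"] / total

    def pick(self):
        # max() returns the first carrier on ties, so list order is the tie-breaker.
        return max(self.carriers, key=self.success_rate)
```

The value returned by `pick()` could then be passed as the `carrier` argument when requesting a proxy, with `record()` called after each response.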

Proxy Configuration for Mobile API Scraping

import time
from uuid import uuid4


class MobileAppProxyConfig:
    # The username/password layout used here is provider-specific; check your provider's docs.

    def __init__(self, api_key):
        self.api_key = api_key
        self.base_endpoint = "mobile.dataresearchtools.com"

    def get_proxy(self, country, carrier=None):
        # A new random session ID on every call requests a rotating connection
        session_id = str(uuid4())[:8]
        auth = f"{self.api_key}:country-{country}-type-mobile"

        if carrier:
            auth += f"-carrier-{carrier}"

        auth += f"-session-{session_id}"

        # requests sends both http and https traffic through the same HTTP proxy URL
        return {
            "http": f"http://{auth}@{self.base_endpoint}:8080",
            "https": f"http://{auth}@{self.base_endpoint}:8080"
        }

    def get_sticky_proxy(self, country, duration_sec=600):
        # A TTL asks the provider to hold the same session for duration_sec seconds
        session_id = f"sticky-{int(time.time())}"
        auth = f"{self.api_key}:country-{country}-type-mobile-session-{session_id}-ttl-{duration_sec}"

        return {
            "http": f"http://{auth}@{self.base_endpoint}:8080",
            "https": f"http://{auth}@{self.base_endpoint}:8080"
        }

Extracting Data from Carro

Method 1: API Replay

The most efficient method is to capture and replay the app’s API requests:

  1. Set up a MITM proxy (like mitmproxy) on your device
  2. Install the MITM proxy’s CA certificate on a test device you control; if the app pins certificates, you will also need to disable pinning (see Certificate Pinning below)
  3. Browse through the app normally, capturing all API requests
  4. Analyze the captured requests to understand endpoints, headers, and parameters
  5. Replay these requests through your mobile proxy infrastructure
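
Steps 4 and 5 can be sketched as a small conversion from a captured request to something the `requests` library can replay. The dict shape used for `captured` is an assumption made for illustration (it is not mitmproxy's own flow object), and the list of dropped headers is a reasonable starting point rather than a definitive rule.

```python
# Headers that the HTTP client should regenerate rather than copy verbatim.
DROP_HEADERS = {"host", "content-length", "connection", "accept-encoding"}


def build_replay_spec(captured):
    """Convert a captured request (hypothetical dict shape) into kwargs for requests.request()."""
    headers = {
        name: value
        for name, value in captured.get("headers", {}).items()
        if name.lower() not in DROP_HEADERS
    }
    spec = {
        "method": captured["method"].upper(),
        "url": captured["url"],
        "headers": headers,
    }
    # Only attach a JSON body when the captured request actually had one.
    if captured.get("body") is not None:
        spec["json"] = captured["body"]
    return spec
```

The resulting spec can be replayed with `requests.request(**spec, proxies=proxy, timeout=30)`, where `proxy` comes from the proxy configuration shown earlier.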
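
import requests
from uuid import uuid4


# Base URL, headers, and response fields below are illustrative; confirm them against captured traffic.
class CarroScraper: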
    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.base_url = "https://api.carro.co"
        self.headers = {
            "User-Agent": "Carro/5.2.1 (iPhone; iOS 17.0; Scale/3.0)",
            "Accept": "application/json",
            "Accept-Language": "en-SG",
            "X-App-Version": "5.2.1",
            "X-Platform": "ios",
            "X-Device-Id": self.generate_device_id(),
        }

    def search_listings(self, country="sg", make=None, model=None, page=1):
        proxy = self.proxy_config.get_proxy(country.upper())

        params = {
            "page": page,
            "per_page": 30,
            "sort": "latest",
            "country": country,
        }

        if make:
            params["make"] = make
        if model:
            params["model"] = model

        response = requests.get(
            f"{self.base_url}/api/v2/listings",
            params=params,
            headers=self.headers,
            proxies=proxy,
            timeout=30
        )

        if response.status_code == 200:
            return response.json()
        return None

    def get_listing_detail(self, listing_id, country="sg"):
        proxy = self.proxy_config.get_sticky_proxy(country.upper())

        response = requests.get(
            f"{self.base_url}/api/v2/listings/{listing_id}",
            headers=self.headers,
            proxies=proxy,
            timeout=30
        )

        if response.status_code == 200:
            data = response.json()
            return {
                "id": data.get("id"),
                "make": data.get("make"),
                "model": data.get("model"),
                "year": data.get("year"),
                "price": data.get("price"),
                # Use the same key name as the Carsome parser so records can be merged later
                "mileage_km": data.get("mileage_km"),
                "transmission": data.get("transmission"),
                "fuel_type": data.get("fuel_type"),
                "inspection_score": data.get("inspection", {}).get("score"),
                "photos": [p.get("url") for p in data.get("photos", [])],
                "features": data.get("features", []),
                "description": data.get("description"),
            }
        return None

    def generate_device_id(self):
        return str(uuid4()).upper()
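
`search_listings` returns a single page, so collecting a full inventory needs a paging loop. The sketch below shows one way to write it. It assumes each response is a dict with the listings under a `"data"` key and that an empty page marks the end; both are assumptions to verify against real responses. `iter_all_pages` and `fetch_page` are illustrative names.

```python
import random
import time


def iter_all_pages(fetch_page, max_pages=50, delay_range=(5.0, 15.0), sleep=time.sleep):
    """Yield listings page by page until an empty or failed page is returned.

    fetch_page(page) should return a dict like {"data": [...]} or None on failure;
    the "data" key is an assumption about the response shape.
    """
    for page in range(1, max_pages + 1):
        result = fetch_page(page)
        items = (result or {}).get("data") or []
        if not items:
            break
        yield from items
        # Pause between pages to stay close to a natural browsing rhythm.
        sleep(random.uniform(*delay_range))
```

A call such as `list(iter_all_pages(lambda p: scraper.search_listings(country="sg", page=p)))` would then walk the pages for one country.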

Method 2: Web Interface Scraping

If API access proves too difficult, fall back to scraping Carro’s web interface:

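from urllib.parse import urlsplit

from playwright.sync_api import sync_playwright

def scrape_carro_web(proxy_config, country="sg"):
    proxy = proxy_config.get_proxy(country.upper())
    # Playwright takes proxy credentials as separate fields, not embedded in the server URL
    parts = urlsplit(proxy["http"])

    with sync_playwright() as p:
        browser = p.chromium.launch(proxy={
            "server": f"http://{parts.hostname}:{parts.port}",
            "username": parts.username,
            "password": parts.password,
        })
        context = browser.new_context(
            user_agent="Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)",
            viewport={"width": 390, "height": 844}
        )

        page = context.new_page()
        # URL pattern and selectors are illustrative; confirm them for each country site
        page.goto(f"https://carro.{country}/buy-car")
        page.wait_for_selector('[class*="listing"]')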

        # Extract listing data
        listings = page.evaluate("""
            () => {
                const cards = document.querySelectorAll('[class*="car-card"]');
                return Array.from(cards).map(card => ({
                    title: card.querySelector('h3')?.textContent,
                    price: card.querySelector('[class*="price"]')?.textContent,
                    mileage: card.querySelector('[class*="mileage"]')?.textContent,
                    year: card.querySelector('[class*="year"]')?.textContent,
                }));
            }
        """)

        browser.close()
        return listings
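
The page scrape returns raw strings such as "S$ 85,800" or "45,000 km", which need converting before they can be compared with API data. A minimal sketch of that cleanup follows; it assumes commas are used as thousands separators, so listings formatted with periods (common for Indonesian rupiah prices) would need a different rule. `parse_number` is an illustrative helper name.

```python
import re


def parse_number(text):
    """Extract the first integer-like number from text such as 'S$ 85,800' or '45,000 km'.

    Returns None when no digits are present (e.g. 'Price on request').
    """
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))
```

Applied to the scraped cards, this would turn the `price` and `mileage` strings into integers while leaving missing values as `None`.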

Extracting Data from Carsome

API Access Strategy

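import requests


# Base URL, headers, and response fields below are illustrative; confirm them against captured traffic.
class CarsomeScraper: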
    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.base_url = "https://www.carsome.my/api"
        self.headers = {
            "User-Agent": "Carsome/4.8.0 (Linux; Android 13; SM-S908B)",
            "Accept": "application/json",
            "Content-Type": "application/json",
            "X-App-Version": "4.8.0",
            "X-Platform": "android",
        }

    def search_cars(self, country="my", filters=None, page=1):
        proxy = self.proxy_config.get_proxy(country.upper())

        payload = {
            "page": page,
            "pageSize": 24,
            "filters": filters or {},
            "sort": {"field": "latest", "order": "desc"}
        }

        response = requests.post(
            f"{self.base_url}/cars/search",
            json=payload,
            headers=self.headers,
            proxies=proxy,
            timeout=30
        )

        return response.json() if response.status_code == 200 else None

    def get_car_details(self, car_id, country="my"):
        proxy = self.proxy_config.get_sticky_proxy(country.upper())

        response = requests.get(
            f"{self.base_url}/cars/{car_id}",
            headers=self.headers,
            proxies=proxy,
            timeout=30
        )

        if response.status_code == 200:
            data = response.json()
            return self.parse_car_details(data)
        return None

    def parse_car_details(self, data):
        return {
            "id": data.get("id"),
            "make": data.get("make"),
            "model": data.get("model"),
            "variant": data.get("variant"),
            "year": data.get("year"),
            "price": data.get("price"),
            "original_price": data.get("originalPrice"),
            "mileage_km": data.get("mileage"),
            "transmission": data.get("transmission"),
            "fuel_type": data.get("fuelType"),
            "body_type": data.get("bodyType"),
            "color": data.get("color"),
            "inspection_points": data.get("inspectionReport", {}).get("totalPoints"),
            "inspection_passed": data.get("inspectionReport", {}).get("passedPoints"),
            "warranty_months": data.get("warranty", {}).get("durationMonths"),
            "photos": data.get("photos", []),
            "certified": data.get("isCertified", False),
        }

Handling Common Challenges

Certificate Pinning

Many apps implement SSL certificate pinning, which prevents standard proxy interception. Solutions include:

  • Frida framework: Dynamic instrumentation to bypass pinning at runtime
  • Modified APKs: Repackage the app with pinning disabled (for analysis purposes only)
  • Web fallback: Use the mobile web version, which runs in a browser and is not subject to the app’s pinning

Request Signing

Some APIs sign requests using HMAC or similar mechanisms. Your scraper must replicate this signing:

import hmac
import hashlib

def sign_request(url, timestamp, secret_key):
    message = f"{url}:{timestamp}"
    signature = hmac.new(
        secret_key.encode(),
        message.encode(),
        hashlib.sha256
    ).hexdigest()
    return signature
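
A signature is only useful once it is attached to the request alongside the timestamp it covers. The sketch below shows one plausible arrangement. The header names `X-Timestamp` and `X-Signature` are placeholders, and the `url:timestamp` message layout simply mirrors the example above; a real app defines its own scheme, which has to be learned from captured traffic.

```python
import hashlib
import hmac
import time


def build_signed_headers(url, secret_key, timestamp=None):
    """Return headers carrying an HMAC-SHA256 signature over 'url:timestamp'.

    The header names are placeholders; a real app defines its own.
    """
    ts = str(int(time.time()) if timestamp is None else timestamp)
    digest = hmac.new(secret_key.encode(), f"{url}:{ts}".encode(), hashlib.sha256).hexdigest()
    return {"X-Timestamp": ts, "X-Signature": digest}


def verify_signature(url, headers, secret_key):
    """Recompute the signature and compare in constant time."""
    expected = build_signed_headers(url, secret_key, headers["X-Timestamp"])["X-Signature"]
    return hmac.compare_digest(expected, headers["X-Signature"])
```

`verify_signature` is mainly useful as a self-check: if a replayed request is rejected, comparing your signature with one captured from the app shows whether the message layout or the key is wrong.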

Rate Limiting

Mobile APIs typically have strict per-device rate limits. Strategies to manage this:

  • Rotate device IDs alongside proxy rotation
  • Implement delays that mimic natural app browsing speed (5-15 seconds between requests)
  • Use DataResearchTools’ session management to maintain consistent identities per session
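
These strategies can be combined in one small helper that keeps a device ID and a proxy session together and spaces out requests. This is a sketch under assumptions: `PacedIdentity` is a hypothetical name, and the budget of 40 requests per identity is an arbitrary placeholder rather than a known limit of either platform.

```python
import random
import time
from uuid import uuid4


class PacedIdentity:
    """Pairs one device ID with one proxy session and spaces out its requests."""

    def __init__(self, max_requests=40, delay_range=(5.0, 15.0), sleep=time.sleep):
        self.max_requests = max_requests
        self.delay_range = delay_range
        self.sleep = sleep
        self._rotate()

    def _rotate(self):
        # A fresh device ID and session ID are issued together so they never mix.
        self.device_id = str(uuid4()).upper()
        self.session_id = uuid4().hex[:8]
        self.used = 0

    def before_request(self):
        """Call before each request: rotates when the budget is spent, then waits."""
        if self.used >= self.max_requests:
            self._rotate()
        # The first request of an identity goes out immediately; later ones are delayed.
        if self.used > 0:
            self.sleep(random.uniform(*self.delay_range))
        self.used += 1
        return self.device_id, self.session_id
```

The returned device ID would go into the device header, and the session ID into the proxy session string, so that one device always appears behind one connection.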

App Version Updates

Mobile APIs change with app updates. Monitor for:

  • New required headers
  • Changed endpoint paths
  • Updated authentication mechanisms
  • Modified response formats

Build your scraper to gracefully handle API changes and alert you when responses no longer match expected formats.
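
A lightweight way to detect such changes is to compare each parsed record against the fields and types you expect. The sketch below illustrates the idea; the field list in `EXPECTED_FIELDS` is an assumption based on the parsers in this guide and should be adapted to the responses you actually receive.

```python
# Expected fields and their types; adjust to match the real response format.
EXPECTED_FIELDS = {"id": int, "make": str, "model": str, "year": int, "price": (int, float)}


def find_schema_drift(record, expected=EXPECTED_FIELDS):
    """Return a list of human-readable problems; an empty list means the record matches."""
    problems = []
    for field, types in expected.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], types):
            problems.append(f"unexpected type for {field}: {type(record[field]).__name__}")
    return problems
```

Running this on a sample of each batch and alerting when the problem list is non-empty gives early warning that an app update has changed the API.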

Building a Cross-Platform Data Pipeline

Combine data from multiple car marketplace apps into a unified pipeline:

class CarMarketplacePipeline:
    # collect_from_carro, collect_from_carsome, and normalize_prices are not shown here;
    # implement them per platform so that each returns a list of listing dicts.

    def __init__(self, proxy_config):
        self.carro = CarroScraper(proxy_config)
        self.carsome = CarsomeScraper(proxy_config)

    def collect_all_listings(self, country):
        all_listings = []

        # Collect from Carro
        carro_data = self.collect_from_carro(country)
        all_listings.extend(carro_data)

        # Collect from Carsome
        carsome_data = self.collect_from_carsome(country)
        all_listings.extend(carsome_data)

        # Deduplicate based on vehicle characteristics
        deduplicated = self.deduplicate(all_listings)

        # Normalize pricing
        normalized = self.normalize_prices(deduplicated, country)

        return normalized

    def deduplicate(self, listings):
        # Treat listings with the same make, model, year, and mileage as one vehicle.
        # Listings without mileage are kept as-is, since the key would be too coarse.
        seen = set()
        unique = []
        for listing in listings:
            mileage = listing.get("mileage_km")
            if mileage is None:
                unique.append(listing)
                continue
            key = (listing.get("make"), listing.get("model"), listing.get("year"), mileage)
            if key not in seen:
                seen.add(key)
                unique.append(listing)
        return unique
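
The pipeline calls `normalize_prices` without showing it. One possible shape is sketched below as a standalone function that converts prices to a common currency. The exchange rates are deliberately left to the caller, since any hard-coded value would go stale; the `price_usd` field name and the choice of USD as the common currency are assumptions.

```python
# Local currency used for listings in each supported country.
COUNTRY_CURRENCY = {"SG": "SGD", "MY": "MYR", "TH": "THB", "ID": "IDR"}


def normalize_prices(listings, country, rates_to_usd):
    """Return copies of listings with 'currency' and 'price_usd' added.

    rates_to_usd maps a currency code to the USD value of one unit and must be
    supplied by the caller (for example from a daily FX feed).
    """
    currency = COUNTRY_CURRENCY[country.upper()]
    rate = rates_to_usd[currency]
    normalized = []
    for listing in listings:
        item = dict(listing)  # copy so the input records are not mutated
        item["currency"] = currency
        price = item.get("price")
        item["price_usd"] = round(price * rate, 2) if price is not None else None
        normalized.append(item)
    return normalized
```

Inside the pipeline class, a thin method could wrap this function and pass in whatever rate table the pipeline holds.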

Scaling Your Mobile App Scraping

Parallel Collection

Run scrapers for different countries and platforms concurrently:

from concurrent.futures import ThreadPoolExecutor, as_completed

def scrape_all_markets(proxy_config):
    countries = ["SG", "MY", "TH", "ID"]
    pipeline = CarMarketplacePipeline(proxy_config)
    results = {}

    with ThreadPoolExecutor(max_workers=len(countries)) as executor:
        futures = {
            executor.submit(pipeline.collect_all_listings, country): country
            for country in countries
        }

        # Handle each country as it finishes; one failure should not abort the rest
        for future in as_completed(futures):
            country = futures[future]
            try:
                results[country] = future.result()
            except Exception as exc:
                print(f"{country} failed: {exc}")
                results[country] = []

    return results

Data Quality Monitoring

Implement checks to ensure your scraped data remains accurate:

  • Validate price ranges against historical norms
  • Check for missing required fields
  • Monitor scraper success rates by platform and country
  • Alert when data volumes drop unexpectedly
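
Several of these checks fit into one per-batch report. The sketch below is a simplified illustration: it flags price outliers relative to the median of the current batch rather than true historical norms, and the thresholds (`drop_threshold`, `price_factor`) are placeholder values to tune against your own data. `quality_report` is a hypothetical helper name.

```python
from statistics import median

REQUIRED_FIELDS = ("make", "model", "year", "price")


def quality_report(listings, previous_count=None, drop_threshold=0.5, price_factor=5.0):
    """Summarize missing fields, price outliers, and volume drops for one batch."""
    missing = sum(
        1 for row in listings if any(row.get(f) in (None, "") for f in REQUIRED_FIELDS)
    )
    prices = [row["price"] for row in listings if isinstance(row.get("price"), (int, float))]
    mid = median(prices) if prices else None
    # Flag prices more than price_factor times above or below the batch median.
    outliers = [
        p for p in prices if mid and (p > mid * price_factor or p < mid / price_factor)
    ]
    # A batch much smaller than the previous one suggests blocking or an API change.
    volume_drop = (
        previous_count is not None
        and previous_count > 0
        and len(listings) < previous_count * drop_threshold
    )
    return {
        "total": len(listings),
        "missing_required": missing,
        "price_outliers": len(outliers),
        "volume_drop": volume_drop,
    }
```

Storing one report per platform and country after each run makes it straightforward to chart success rates and trigger alerts.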

Conclusion

Scraping car marketplace apps like Carro and Carsome works best with mobile proxies that match the platforms’ expected traffic patterns. Where an API validates that requests originate from genuine mobile network connections, datacenter proxies, and often residential proxies too, are likely to be blocked.

DataResearchTools mobile proxies provide the carrier-level IPs needed to access these APIs reliably. With coverage across all major Southeast Asian carriers in Singapore, Malaysia, Thailand, and Indonesia, DataResearchTools gives you the infrastructure to build comprehensive automotive data pipelines from the region’s leading car marketplace apps.

Whether you are building a price comparison tool, conducting market research, or powering a competing platform, mobile proxy access to Carro and Carsome data provides a foundation for data-driven decision-making in Southeast Asia’s rapidly evolving automotive market.

