Mobile Proxies for Scraping Car Marketplace Apps (Carro, Carsome)
Southeast Asia’s used car market is increasingly dominated by mobile-first platforms. Carro, Carsome, and similar car marketplace apps have transformed how vehicles are bought and sold across the region. These platforms process thousands of transactions monthly and hold vast datasets of vehicle pricing, condition reports, and market demand signals.
For businesses that need this data, scraping these mobile apps presents challenges that ordinary website scraping does not. This guide covers how these apps communicate with their backends, how to set up mobile proxy infrastructure, and how to extract and combine listing data from Carro and Carsome.
Why Mobile Apps Are Different from Websites
API-First Architecture
Unlike traditional websites that serve HTML pages, mobile apps communicate with backend servers through APIs. These APIs return structured data, usually in JSON format, which is actually easier to parse than HTML once you can access it.
However, these APIs implement their own security measures:
- API key authentication: Requests must include valid API keys that are embedded in the app binary
- Certificate pinning: The app accepts only a specific server certificate or public key, which prevents standard proxy interception
- Device fingerprinting: The API validates device identifiers, OS version, and app version
- Connection type validation: Some APIs check whether the request originates from a mobile network
Why Mobile Proxies Are Essential
This last point is critical. Car marketplace apps in Southeast Asia often validate the connection type of incoming API requests. A request that claims to come from a mobile device but arrives through a datacenter IP will be flagged and rejected.
Mobile proxies solve this problem by routing your traffic through actual mobile network connections. When your API requests arrive from a genuine mobile carrier IP (like Singtel, Maxis, or AIS), the network type matches what the API expects from a real handset.
DataResearchTools specializes in mobile proxies from Southeast Asian carriers, making them the ideal infrastructure for scraping apps like Carro and Carsome.
Understanding Carro’s Technical Architecture
Platform Overview
Carro operates across Singapore, Malaysia, Thailand, and Indonesia. The platform offers:
- Buy and sell used cars
- Car financing
- Insurance
- After-sale services
API Structure
Carro’s mobile app typically communicates through REST APIs with endpoints organized by function:
# Typical Carro API endpoints (illustrative)
GET /api/v2/listings - Search and list vehicles
GET /api/v2/listings/{id} - Get specific listing details
GET /api/v2/listings/{id}/inspection - Vehicle inspection report
GET /api/v2/makes - List available car makes
GET /api/v2/models?make_id={id} - List models for a make
POST /api/v2/search - Advanced search with filters

Data Available Through Carro
- Vehicle make, model, year, and variant
- Price and financing options
- Mileage and condition details
- Inspection report findings
- Photos (usually 30-50 per listing)
- Seller information
- Vehicle history
- Market value estimation
Understanding Carsome’s Technical Architecture
Platform Overview
Carsome describes itself as the largest integrated car e-commerce platform in Southeast Asia, operating in Malaysia, Singapore, Thailand, and Indonesia. It handles both C2B (selling your car) and B2C (buying a certified car) transactions.
API Structure
Carsome’s API is typically more tightly secured than Carro’s:
# Typical Carsome API patterns (illustrative)
GET /api/cars - List available cars
GET /api/cars/{id} - Car details
GET /api/cars/{id}/report - Inspection report
POST /api/cars/search - Filtered search
GET /api/valuations - Market valuation data

Unique Data Points
Carsome’s certified pre-owned program provides particularly valuable data:
- 175-point inspection results
- Certified pricing vs. market pricing
- Warranty and after-sale terms
- Refurbishment details
- Price transparency metrics
Setting Up Mobile Proxy Infrastructure
Choosing the Right Carrier
Different carriers in each country may yield different results:
Singapore:
- Singtel – Largest carrier, widest IP pool
- StarHub – Good IP diversity
- M1 – Smaller pool but less commonly blocked
Malaysia:
- Maxis – Largest mobile network
- Celcom – Wide coverage
- Digi – Good for data-heavy operations
Thailand:
- AIS – Market leader
- DTAC – Strong data network
- TrueMove H – Growing coverage
Indonesia:
- Telkomsel – Dominant carrier
- Indosat – Good alternative
- XL Axiata – Useful for rotation
DataResearchTools provides access to IPs from all major carriers across these countries, giving you the flexibility to rotate between carriers if one becomes less effective.
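Rotating between carriers can be driven by recent success rates rather than done by hand. The following is a minimal sketch of that idea, not a provider feature: the `CarrierRotator` class, the lowercase carrier identifiers, and the `min_attempts` threshold are all assumptions made for illustration, and the identifiers your proxy provider accepts may differ.

```python
from collections import defaultdict

# Carrier names follow the lists above; the exact identifiers a proxy
# provider accepts are an assumption and may differ.
CARRIERS = {
    "SG": ["singtel", "starhub", "m1"],
    "MY": ["maxis", "celcom", "digi"],
    "TH": ["ais", "dtac", "truemove"],
    "ID": ["telkomsel", "indosat", "xl"],
}


class CarrierRotator:
    """Pick the carrier with the best recorded success rate for a country."""

    def __init__(self, carriers=CARRIERS, min_attempts=5):
        self.carriers = carriers
        self.min_attempts = min_attempts
        self.stats = defaultdict(lambda: {"ok": 0, "fail": 0})

    def record(self, country, carrier, success):
        """Store the outcome of one request made through a carrier."""
        self.stats[(country, carrier)]["ok" if success else "fail"] += 1

    def success_rate(self, country, carrier):
        s = self.stats[(country, carrier)]
        total = s["ok"] + s["fail"]
        # Carriers with little history count as healthy so they still get tried.
        if total < self.min_attempts:
            return 1.0
        return s["ok"] / total

    def pick(self, country):
        # max() keeps the first carrier among ties, so list order is the tiebreak.
        return max(
            self.carriers[country],
            key=lambda c: self.success_rate(country, c),
        )


rotator = CarrierRotator(min_attempts=2)
first_choice = rotator.pick("SG")
rotator.record("SG", "singtel", success=False)
rotator.record("SG", "singtel", success=False)
second_choice = rotator.pick("SG")
print(first_choice, second_choice)
```

The value returned by `pick` can be passed as the `carrier` argument when building a proxy, with `record` called after each response.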
Proxy Configuration for Mobile API Scraping
import time
from uuid import uuid4


class MobileAppProxyConfig:
    """Builds requests-style proxy dicts for the mobile proxy gateway."""

    def __init__(self, api_key):
        self.api_key = api_key
        self.base_endpoint = "mobile.dataresearchtools.com"

    def get_proxy(self, country, carrier=None):
        # A new random session ID on every call, so each call gets its own
        # proxy session.
        session_id = str(uuid4())[:8]
        auth = f"{self.api_key}:country-{country}-type-mobile"
        if carrier:
            auth += f"-carrier-{carrier}"
        auth += f"-session-{session_id}"
        return {
            "http": f"http://{auth}@{self.base_endpoint}:8080",
            "https": f"http://{auth}@{self.base_endpoint}:8080"
        }

    def get_sticky_proxy(self, country, duration_sec=600):
        # Session ID plus a TTL. Reuse the returned dict across requests to
        # stay on the same session; calling this again starts a new one.
        session_id = f"sticky-{int(time.time())}"
        auth = f"{self.api_key}:country-{country}-type-mobile-session-{session_id}-ttl-{duration_sec}"
        return {
            "http": f"http://{auth}@{self.base_endpoint}:8080",
            "https": f"http://{auth}@{self.base_endpoint}:8080"
        }

Extracting Data from Carro
Method 1: API Replay
The most efficient method is to capture and replay the app’s API requests:
- Set up a MITM proxy (like mitmproxy) on your device
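- Install the MITM proxy’s CA certificate on a test device so the device trusts the intercepted TLS connection (apps that pin certificates also need one of the bypasses described under Certificate Pinning)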
- Browse through the app normally, capturing all API requests
- Analyze the captured requests to understand endpoints, headers, and parameters
- Replay these requests through your mobile proxy infrastructure
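The analysis step above can be partly automated. The sketch below assumes the captured traffic has been exported as a HAR file, a JSON format that many interception tools can produce, and groups requests by method, host, and path so that the headers and query parameters each endpoint uses are visible at a glance. The function name `summarize_har` and the `api.example.com` sample data are placeholders, not real captured traffic.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def summarize_har(har):
    """Group captured requests by (method, host, path).

    For each endpoint, collect the request count, the lowercased header
    names, and the query parameter names that were seen.
    """
    endpoints = defaultdict(
        lambda: {"count": 0, "headers": set(), "params": set()}
    )
    for entry in har["log"]["entries"]:
        req = entry["request"]
        url = urlsplit(req["url"])
        info = endpoints[(req["method"], url.netloc, url.path)]
        info["count"] += 1
        info["headers"].update(h["name"].lower() for h in req.get("headers", []))
        info["params"].update(q["name"] for q in req.get("queryString", []))
    return dict(endpoints)


# Hand-written sample in HAR shape; a real export would come from json.load().
sample_har = {"log": {"entries": [
    {"request": {
        "method": "GET",
        "url": "https://api.example.com/api/v2/listings?page=1",
        "headers": [{"name": "X-App-Version", "value": "5.2.1"},
                    {"name": "Accept", "value": "application/json"}],
        "queryString": [{"name": "page", "value": "1"}],
    }},
    {"request": {
        "method": "GET",
        "url": "https://api.example.com/api/v2/listings?page=2&make=honda",
        "headers": [{"name": "X-App-Version", "value": "5.2.1"}],
        "queryString": [{"name": "page", "value": "2"},
                        {"name": "make", "value": "honda"}],
    }},
    {"request": {
        "method": "POST",
        "url": "https://api.example.com/api/v2/search",
        "headers": [{"name": "Content-Type", "value": "application/json"}],
        "queryString": [],
    }},
]}}

for (method, host, path), info in summarize_har(sample_har).items():
    print(method, host, path, info["count"], sorted(info["headers"]))
```

Headers that appear on every request to an endpoint are the ones most likely to be required when replaying it.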
from uuid import uuid4

import requests


class CarroScraper:
    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.base_url = "https://api.carro.co"
        # Headers mirror what was captured from the app; keep the app
        # version consistent between User-Agent and X-App-Version.
        self.headers = {
            "User-Agent": "Carro/5.2.1 (iPhone; iOS 17.0; Scale/3.0)",
            "Accept": "application/json",
            "Accept-Language": "en-SG",
            "X-App-Version": "5.2.1",
            "X-Platform": "ios",
            "X-Device-Id": self.generate_device_id(),
        }

    def search_listings(self, country="sg", make=None, model=None, page=1):
        proxy = self.proxy_config.get_proxy(country.upper())
        params = {
            "page": page,
            "per_page": 30,
            "sort": "latest",
            "country": country,
        }
        if make:
            params["make"] = make
        if model:
            params["model"] = model
        response = requests.get(
            f"{self.base_url}/api/v2/listings",
            params=params,
            headers=self.headers,
            proxies=proxy,
            timeout=30
        )
        if response.status_code == 200:
            return response.json()
        return None

    def get_listing_detail(self, listing_id, country="sg"):
        proxy = self.proxy_config.get_sticky_proxy(country.upper())
        response = requests.get(
            f"{self.base_url}/api/v2/listings/{listing_id}",
            headers=self.headers,
            proxies=proxy,
            timeout=30
        )
        if response.status_code == 200:
            data = response.json()
            return {
                "id": data.get("id"),
                "make": data.get("make"),
                "model": data.get("model"),
                "year": data.get("year"),
                "price": data.get("price"),
                # Output key matches the Carsome parser and the pipeline's
                # deduplication key.
                "mileage_km": data.get("mileage_km"),
                "transmission": data.get("transmission"),
                "fuel_type": data.get("fuel_type"),
                "inspection_score": data.get("inspection", {}).get("score"),
                "photos": [p.get("url") for p in data.get("photos", [])],
                "features": data.get("features", []),
                "description": data.get("description"),
            }
        return None

    def generate_device_id(self):
        # iOS-style identifier: an uppercase UUID string.
        return str(uuid4()).upper()

Method 2: Web Interface Scraping
If API access proves too difficult, fall back to scraping Carro’s web interface:
from urllib.parse import urlsplit

from playwright.sync_api import sync_playwright


def scrape_carro_web(proxy_config, country="sg"):
    proxy = proxy_config.get_proxy(country.upper())
    # Playwright takes proxy credentials as separate fields rather than
    # embedded in the server URL, so split them out of the proxy URL.
    parts = urlsplit(proxy["http"])
    with sync_playwright() as p:
        browser = p.chromium.launch(proxy={
            "server": f"{parts.scheme}://{parts.hostname}:{parts.port}",
            "username": parts.username,
            "password": parts.password,
        })
        context = browser.new_context(
            user_agent="Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)",
            viewport={"width": 390, "height": 844}
        )
        page = context.new_page()
        page.goto(f"https://carro.{country}/buy-car")
        page.wait_for_selector('[class*="listing"]')
        # Extract listing data from each card in the page
        listings = page.evaluate("""
            () => {
                const cards = document.querySelectorAll('[class*="car-card"]');
                return Array.from(cards).map(card => ({
                    title: card.querySelector('h3')?.textContent,
                    price: card.querySelector('[class*="price"]')?.textContent,
                    mileage: card.querySelector('[class*="mileage"]')?.textContent,
                    year: card.querySelector('[class*="year"]')?.textContent,
                }));
            }
        """)
        browser.close()
        return listings

Extracting Data from Carsome
API Access Strategy
import requests


class CarsomeScraper:
    def __init__(self, proxy_config):
        self.proxy_config = proxy_config
        self.base_url = "https://www.carsome.my/api"
        self.headers = {
            "User-Agent": "Carsome/4.8.0 (Linux; Android 13; SM-S908B)",
            "Accept": "application/json",
            "Content-Type": "application/json",
            "X-App-Version": "4.8.0",
            "X-Platform": "android",
        }

    def search_cars(self, country="my", filters=None, page=1):
        proxy = self.proxy_config.get_proxy(country.upper())
        payload = {
            "page": page,
            "pageSize": 24,
            "filters": filters or {},
            "sort": {"field": "latest", "order": "desc"}
        }
        response = requests.post(
            f"{self.base_url}/cars/search",
            json=payload,
            headers=self.headers,
            proxies=proxy,
            timeout=30
        )
        return response.json() if response.status_code == 200 else None

    def get_car_details(self, car_id, country="my"):
        proxy = self.proxy_config.get_sticky_proxy(country.upper())
        response = requests.get(
            f"{self.base_url}/cars/{car_id}",
            headers=self.headers,
            proxies=proxy,
            timeout=30
        )
        if response.status_code == 200:
            data = response.json()
            return self.parse_car_details(data)
        return None

    def parse_car_details(self, data):
        # Map the API's camelCase fields onto snake_case output keys.
        return {
            "id": data.get("id"),
            "make": data.get("make"),
            "model": data.get("model"),
            "variant": data.get("variant"),
            "year": data.get("year"),
            "price": data.get("price"),
            "original_price": data.get("originalPrice"),
            "mileage_km": data.get("mileage"),
            "transmission": data.get("transmission"),
            "fuel_type": data.get("fuelType"),
            "body_type": data.get("bodyType"),
            "color": data.get("color"),
            "inspection_points": data.get("inspectionReport", {}).get("totalPoints"),
            "inspection_passed": data.get("inspectionReport", {}).get("passedPoints"),
            "warranty_months": data.get("warranty", {}).get("durationMonths"),
            "photos": data.get("photos", []),
            "certified": data.get("isCertified", False),
        }

Handling Common Challenges
Certificate Pinning
Many apps implement SSL certificate pinning, which prevents standard proxy interception. Solutions include:
- Frida framework: Dynamic instrumentation to bypass pinning at runtime
- Modified APKs: Repackage the app with pinning disabled (for analysis purposes only)
- Web fallback: Use the mobile web version, which runs in a browser and is not subject to the app’s pinning
Request Signing
Some APIs sign requests using HMAC or similar mechanisms. Your scraper must replicate this signing:
import hmac
import hashlib


def sign_request(url, timestamp, secret_key):
    message = f"{url}:{timestamp}"
    signature = hmac.new(
        secret_key.encode(),
        message.encode(),
        hashlib.sha256
    ).hexdigest()
    return signature

Rate Limiting
Mobile APIs typically have strict per-device rate limits. Strategies to manage this:
- Rotate device IDs alongside proxy rotation
- Implement delays that mimic natural app browsing speed (5-15 seconds between requests)
- Use DataResearchTools’ session management to maintain consistent identities per session
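The three strategies above can be combined in one small helper. This is a sketch under stated assumptions rather than a tuned implementation: the `PacedSession` class, the `max_requests` budget of 40, and the idea of passing `session_id` into the proxy credentials are hypothetical, while the 5 to 15 second delay window comes from the list above.

```python
import random
import time
from uuid import uuid4


class PacedSession:
    """Pair one device ID with one proxy session and pace its requests."""

    def __init__(self, min_delay=5.0, max_delay=15.0, max_requests=40,
                 rng=None, sleep=time.sleep):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.max_requests = max_requests  # hypothetical per-identity budget
        self.rng = rng or random.Random()
        self.sleep = sleep  # injectable so the pacing can be tested
        self.total_requests = 0
        self._new_identity()

    def _new_identity(self):
        # Rotate the device ID and the proxy session together so one
        # device is never seen hopping between sessions.
        self.device_id = str(uuid4()).upper()
        self.session_id = uuid4().hex[:8]
        self.requests_made = 0

    def next_delay(self):
        """Random delay in seconds within the configured window."""
        return self.rng.uniform(self.min_delay, self.max_delay)

    def before_request(self):
        """Call before each request; returns the identity to send with it."""
        if self.requests_made >= self.max_requests:
            self._new_identity()
        if self.total_requests > 0:
            self.sleep(self.next_delay())
        self.requests_made += 1
        self.total_requests += 1
        return {"device_id": self.device_id, "session_id": self.session_id}


# Demonstration with a fake sleep so the example finishes immediately.
delays = []
session = PacedSession(max_requests=2, sleep=delays.append)
identities = [session.before_request() for _ in range(3)]
print(len(delays), identities[0] == identities[1], identities[0] == identities[2])
```

In a scraper, `device_id` would go into the device header and `session_id` into the proxy credentials, so both change at the same moment.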
App Version Updates
Mobile APIs change with app updates. Monitor for:
- New required headers
- Changed endpoint paths
- Updated authentication mechanisms
- Modified response formats
Build your scraper to gracefully handle API changes and alert you when responses no longer match expected formats.
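One way to detect such changes is to compare each response against the fields your parser depends on. The sketch below is an assumption-laden example: the `EXPECTED_LISTING_FIELDS` table and the function name are hypothetical, and the field names are taken from the illustrative parsers in this guide rather than from a documented API.

```python
# Fields the parser relies on, with the types it expects.
# The names come from the illustrative parsers above, not a published schema.
EXPECTED_LISTING_FIELDS = {
    "id": (int, str),
    "make": str,
    "model": str,
    "year": int,
    "price": (int, float),
}


def find_schema_drift(payload, expected=EXPECTED_LISTING_FIELDS):
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    for field, types in expected.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], types):
            actual = type(payload[field]).__name__
            problems.append(f"unexpected type for {field}: {actual}")
    return problems


good = {"id": 101, "make": "Honda", "model": "Civic", "year": 2019, "price": 78000}
changed = {"id": 102, "make": "Honda", "model": "City", "price": "78,000"}

print(find_schema_drift(good))
print(find_schema_drift(changed))
```

A non-empty result is a reasonable trigger for an alert, since it usually means the app was updated and the scraper needs attention.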
Building a Cross-Platform Data Pipeline
Combine data from multiple car marketplace apps into a unified pipeline:
class CarMarketplacePipeline:
    # collect_from_carro, collect_from_carsome, and normalize_prices are not
    # shown here. Each collector should return a list of dicts that share
    # the keys used in deduplicate(): make, model, year, and mileage_km.

    def __init__(self, proxy_config):
        self.carro = CarroScraper(proxy_config)
        self.carsome = CarsomeScraper(proxy_config)

    def collect_all_listings(self, country):
        all_listings = []
        # Collect from Carro
        carro_data = self.collect_from_carro(country)
        all_listings.extend(carro_data)
        # Collect from Carsome
        carsome_data = self.collect_from_carsome(country)
        all_listings.extend(carsome_data)
        # Deduplicate based on vehicle characteristics
        deduplicated = self.deduplicate(all_listings)
        # Normalize pricing
        normalized = self.normalize_prices(deduplicated, country)
        return normalized

    def deduplicate(self, listings):
        # Two listings count as the same vehicle when make, model, year,
        # and mileage all match; the first one seen is kept.
        seen = set()
        unique = []
        for listing in listings:
            key = f"{listing['make']}_{listing['model']}_{listing['year']}_{listing.get('mileage_km', 0)}"
            if key not in seen:
                seen.add(key)
                unique.append(listing)
        return unique

Scaling Your Mobile App Scraping
Parallel Collection
Run scrapers for different countries and platforms concurrently:
from concurrent.futures import ThreadPoolExecutor


def scrape_all_markets(proxy_config):
    countries = ["SG", "MY", "TH", "ID"]
    pipeline = CarMarketplacePipeline(proxy_config)
    with ThreadPoolExecutor(max_workers=len(countries)) as executor:
        futures = {
            executor.submit(pipeline.collect_all_listings, country): country
            for country in countries
        }
        results = {}
        for future in futures:
            country = futures[future]
            results[country] = future.result()
    return results

Data Quality Monitoring
Implement checks to ensure your scraped data remains accurate:
- Validate price ranges against historical norms
- Check for missing required fields
- Monitor scraper success rates by platform and country
- Alert when data volumes drop unexpectedly
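Several of these checks fit in one function that runs on every scraped batch. The following is a minimal sketch, assuming listings are dicts with the keys used elsewhere in this guide. The `check_batch` name, the `REQUIRED_FIELDS` tuple, the price bounds, and the 50% drop threshold are placeholders to be replaced with values from your own historical data.

```python
REQUIRED_FIELDS = ("make", "model", "year", "price")


def check_batch(listings, price_range, previous_count, max_drop=0.5):
    """Summarize quality issues for one scraped batch of listings."""
    low, high = price_range
    # Listings where any required field is absent or empty.
    missing = [
        item for item in listings
        if any(item.get(field) in (None, "") for field in REQUIRED_FIELDS)
    ]
    # Listings whose numeric price falls outside the expected range.
    out_of_range = [
        item for item in listings
        if isinstance(item.get("price"), (int, float))
        and not low <= item["price"] <= high
    ]
    # Flag the batch if it shrank by more than max_drop versus the last run.
    volume_dropped = (
        previous_count > 0
        and len(listings) < previous_count * (1 - max_drop)
    )
    return {
        "total": len(listings),
        "missing_fields": len(missing),
        "price_out_of_range": len(out_of_range),
        "volume_dropped": volume_dropped,
    }


batch = [
    {"make": "Honda", "model": "Civic", "year": 2019, "price": 78000},
    {"make": "Honda", "model": "Jazz", "year": 2018, "price": 5},
    {"make": "", "model": "City", "year": 2020, "price": 60000},
]
report = check_batch(batch, price_range=(1000, 500000), previous_count=10)
print(report)
```

Running the function per platform and per country gives the breakdown needed to see which scraper is degrading.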
Conclusion
Scraping car marketplace apps like Carro and Carsome works best with mobile proxies that match the platforms’ expected traffic patterns. Datacenter proxies, and in some cases residential proxies, are likely to be flagged wherever these apps validate that requests originate from genuine mobile network connections.
DataResearchTools mobile proxies provide the carrier-level IPs needed to access these APIs reliably. With coverage across all major Southeast Asian carriers in Singapore, Malaysia, Thailand, and Indonesia, DataResearchTools gives you the infrastructure to build comprehensive automotive data pipelines from the region’s leading car marketplace apps.
Whether you are building a price comparison tool, conducting market research, or powering a competing platform, mobile proxy access to Carro and Carsome data provides a foundation for data-driven decision-making in Southeast Asia’s rapidly evolving automotive market.
Related Reading
- Automotive Inventory Tracking Across Multiple Dealer Websites
- Automotive Review Aggregation Using Proxy Networks
- aiohttp + BeautifulSoup: Async Python Scraping
- How to Scrape AliExpress Product Data Without Getting Blocked
- Amazon Buy Box Monitoring: Proxy Setup for Continuous Tracking
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)