Automotive Inventory Tracking Across Multiple Dealer Websites
Tracking automotive inventory across multiple dealer websites provides a powerful lens into market dynamics. By monitoring which vehicles enter and leave dealer stock, how long they sit on lots, and how pricing changes over their listing lifecycle, businesses can extract insights that drive better purchasing, pricing, and strategic decisions.
This article covers the technical and strategic aspects of building an automotive inventory tracking system that monitors dealer websites at scale using proxy infrastructure.
The Value of Inventory Tracking Data
For Dealerships
Understanding competitor inventory helps dealerships:
- Identify gaps: Spot vehicle segments where competitors are understocked
- Predict behavior: Recognize when competitors are likely to discount aging inventory
- Plan acquisitions: Buy vehicles that fill market gaps before competitors
- Benchmark performance: Compare your inventory turn rate against the market
For Automotive Platforms
Marketplaces and aggregator platforms use inventory tracking to:
- Measure platform health: Track total active listings and new listing velocity
- Detect fraud: Identify listings that reappear with manipulated details
- Improve recommendations: Understand which vehicles sell fastest in each region
- Support advertising: Sell data-driven insights to dealer advertisers
For Analysts and Investors
Automotive inventory data reveals:
- Market supply trends: Rising or falling inventory levels by segment
- Demand signals: Fast-selling vehicles indicate high demand
- Regional variations: How inventory composition differs by geography
- Seasonal patterns: Predictable inventory cycles throughout the year
Architecture of an Inventory Tracking System
Component Overview
[Scheduler] --> [Scraper Fleet] --> [Proxy Layer] --> [Target Dealer Sites]
                     |                                        |
                     v                                        v
              [Raw Data Store] --> [Change Detector] --> [Analytics Engine]
                                                              |
                                                              v
                                          [Alert System] --> [Dashboard]

The Scraper Fleet
Your scraper fleet consists of multiple specialized scrapers, each designed for a specific type of dealer website:
- Template scrapers: For dealers using common website platforms like DealerSocket, VinSolutions, or CDK
- Custom scrapers: For dealers with bespoke websites
- Platform scrapers: For listings on Carousell, Mudah, SGCarMart, and other marketplaces
- API scrapers: For dealers that expose inventory through APIs or data feeds
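One way to route each dealer to the right scraper is a small registry keyed by site type. The sketch below is illustrative; the class names, the `site_type` field, and the decorator pattern are assumptions, not part of any platform's API:

```python
# Illustrative dispatcher: routes a dealer record to a scraper class
# based on its site type. Names here are assumptions for the sketch.
SCRAPER_REGISTRY = {}

def register_scraper(site_type):
    """Decorator that registers a scraper class for a given site type."""
    def wrapper(cls):
        SCRAPER_REGISTRY[site_type] = cls
        return cls
    return wrapper

@register_scraper("template")
class TemplateScraper:
    def scrape(self, dealer):
        return f"template scrape of {dealer['domain']}"

@register_scraper("custom")
class CustomScraper:
    def scrape(self, dealer):
        return f"custom scrape of {dealer['domain']}"

def scrape_dealer(dealer):
    # Fall back to the custom scraper for unrecognized site types
    cls = SCRAPER_REGISTRY.get(dealer["site_type"], CustomScraper)
    return cls().scrape(dealer)
```

Registering scrapers this way keeps the fleet extensible: adding support for a new dealer platform means adding one class, with no changes to the dispatch logic.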
The Proxy Layer
The proxy layer is critical because dealer websites will block your scrapers if all requests come from the same IP addresses. DataResearchTools mobile proxies provide the foundation for reliable inventory tracking because:
- Mobile IPs blend with the majority of legitimate traffic these sites receive
- Geographic targeting ensures you access location-appropriate inventory
- Session management supports the multi-page navigation required to capture full inventory
- High IP rotation keeps your scraping fingerprint distributed

A minimal proxy manager that builds per-country mobile proxy URLs looks like this:
class InventoryProxyManager:
def __init__(self, api_key):
self.api_key = api_key
self.endpoint = "proxy.dataresearchtools.com"
def get_proxy_for_dealer(self, dealer_country):
return {
"http": f"http://{self.api_key}:country-{dealer_country}-type-mobile@{self.endpoint}:8080",
"https": f"http://{self.api_key}:country-{dealer_country}-type-mobile@{self.endpoint}:8080"
        }

Building the Inventory Scraper
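The scrapers in this section call a `get_random_mobile_ua()` helper that is never defined. A minimal illustrative version follows; the User-Agent strings are examples only, and in production you would maintain a larger, regularly refreshed pool:

```python
import random

# Illustrative helper assumed by the extractors below: returns a random
# mobile browser User-Agent string. These two strings are examples only.
MOBILE_USER_AGENTS = [
    "Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_1 like Mac OS X) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Mobile/15E148 Safari/604.1",
]

def get_random_mobile_ua():
    """Pick a random mobile User-Agent for the next session."""
    return random.choice(MOBILE_USER_AGENTS)
```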
Discovering Dealer Inventory Pages
Most dealer websites follow predictable URL patterns for their inventory:
INVENTORY_URL_PATTERNS = [
"{domain}/inventory",
"{domain}/used-cars",
"{domain}/new-cars",
"{domain}/vehicles",
"{domain}/stock",
"{domain}/cars-for-sale",
"{domain}/pre-owned",
"{domain}/certified-pre-owned",
]
def discover_inventory_page(dealer_domain, proxy):
    for pattern in INVENTORY_URL_PATTERNS:
        url = pattern.format(domain=dealer_domain)
        try:
            response = requests.head(url, proxies=proxy, timeout=10, allow_redirects=True)
            if response.status_code == 200:
                return url
        except requests.RequestException:
            continue
    return None

Extracting Inventory Data
A generic inventory extractor that handles common dealer website structures:
class DealerInventoryExtractor:
def __init__(self, proxy_manager):
self.proxy_manager = proxy_manager
def extract_inventory(self, dealer):
proxy = self.proxy_manager.get_proxy_for_dealer(dealer["country"])
session = requests.Session()
session.proxies.update(proxy)
session.headers.update({
"User-Agent": get_random_mobile_ua(),
"Accept-Language": "en-US,en;q=0.9"
})
inventory = []
page = 1
while True:
url = self.build_page_url(dealer["inventory_url"], page)
response = session.get(url, timeout=30)
if response.status_code != 200:
break
page_vehicles = self.parse_inventory_page(response.text, dealer["parser_type"])
if not page_vehicles:
break
inventory.extend(page_vehicles)
page += 1
time.sleep(random.uniform(2, 5))
return inventory
def parse_inventory_page(self, html, parser_type):
soup = BeautifulSoup(html, 'html.parser')
if parser_type == "dealersocket":
return self.parse_dealersocket(soup)
elif parser_type == "generic":
return self.parse_generic(soup)
elif parser_type == "structured_data":
return self.parse_structured_data(soup)
return []
def parse_generic(self, soup):
vehicles = []
# Try common listing card patterns
selectors = [
'.vehicle-card', '.inventory-item', '.car-listing',
'[class*="vehicle"]', '[class*="inventory"]', '[class*="listing"]'
]
for selector in selectors:
items = soup.select(selector)
if items:
for item in items:
vehicle = self.extract_vehicle_data(item)
if vehicle.get("title"):
vehicles.append(vehicle)
break
return vehicles
def extract_vehicle_data(self, element):
return {
"title": self.find_text(element, ['h2', 'h3', 'h4', '.title', '.name']),
"price": self.find_text(element, ['.price', '[class*="price"]', '.cost']),
"year": self.extract_year(element),
"mileage": self.find_text(element, ['.mileage', '[class*="mileage"]', '.odometer']),
"stock_number": self.find_text(element, ['.stock', '[class*="stock"]']),
"vin": self.find_text(element, ['.vin', '[class*="vin"]']),
"url": self.find_link(element),
"image_url": self.find_image(element),
        }

Handling JavaScript-Rendered Inventory Pages
Many modern dealer websites render inventory dynamically. Use a headless browser for these:
from urllib.parse import urlparse
from playwright.sync_api import sync_playwright

class JSInventoryExtractor:
    def extract_with_browser(self, dealer, proxy):
        with sync_playwright() as p:
            # Playwright expects proxy credentials separately, not embedded in the URL
            creds = urlparse(proxy["http"])
            browser = p.chromium.launch(
                proxy={"server": f"{creds.scheme}://{creds.hostname}:{creds.port}",
                       "username": creds.username, "password": creds.password}
            )
context = browser.new_context(
user_agent=get_random_mobile_ua(),
viewport={"width": 412, "height": 915}
)
page = context.new_page()
page.goto(dealer["inventory_url"], wait_until="networkidle")
# Handle infinite scroll
vehicles = []
last_count = 0
while True:
page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
page.wait_for_timeout(2000)
current_items = page.query_selector_all('[class*="vehicle"], [class*="inventory"]')
current_count = len(current_items)
if current_count == last_count:
break
last_count = current_count
# Extract data from all loaded items
for item in current_items:
vehicle = self.extract_from_element(page, item)
if vehicle:
vehicles.append(vehicle)
browser.close()
            return vehicles

Change Detection Engine
The change detection engine is what transforms raw inventory snapshots into actionable intelligence.
Detecting New Arrivals
class InventoryChangeDetector:
def __init__(self, db):
self.db = db
def detect_new_arrivals(self, dealer_id, current_inventory):
previous_ids = set(self.db.get_active_vehicle_ids(dealer_id))
current_ids = set(v["vin"] or v["stock_number"] for v in current_inventory)
new_ids = current_ids - previous_ids
new_arrivals = [v for v in current_inventory
if (v["vin"] or v["stock_number"]) in new_ids]
return new_arrivals
def detect_sold_vehicles(self, dealer_id, current_inventory):
previous_ids = set(self.db.get_active_vehicle_ids(dealer_id))
current_ids = set(v["vin"] or v["stock_number"] for v in current_inventory)
sold_ids = previous_ids - current_ids
sold_vehicles = self.db.get_vehicles_by_ids(dealer_id, sold_ids)
return sold_vehicles
    def detect_price_changes(self, dealer_id, current_inventory):
        changes = []
        for vehicle in current_inventory:
            vehicle_id = vehicle["vin"] or vehicle["stock_number"]
            previous_price = self.db.get_latest_price(dealer_id, vehicle_id)
            current_price = parse_price(vehicle["price"])  # scraped prices arrive as text
            if previous_price and current_price and previous_price != current_price:
                changes.append({
                    "vehicle": vehicle,
                    "previous_price": previous_price,
                    "new_price": current_price,
                    "change_amount": current_price - previous_price,
                    "change_percent": ((current_price - previous_price) / previous_price) * 100
                })
        return changes

Tracking Days on Market
def calculate_days_on_market(self, dealer_id):
active_vehicles = self.db.get_active_vehicles(dealer_id)
dom_data = []
for vehicle in active_vehicles:
first_seen = vehicle["first_seen_date"]
days = (datetime.now() - first_seen).days
dom_data.append({
"vehicle": vehicle,
"days_on_market": days,
"category": self.categorize_dom(days)
})
return dom_data
def categorize_dom(self, days):
if days <= 14:
return "fresh"
elif days <= 30:
return "normal"
elif days <= 60:
return "aging"
elif days <= 90:
return "stale"
else:
            return "problem"

Data Storage Design
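The schema below stores both a VIN and a stock number per vehicle, mirroring the `v["vin"] or v["stock_number"]` identity pattern used in change detection. A small normalization helper makes that fallback explicit; this is an illustrative sketch, not part of the schema itself:

```python
def vehicle_identity(vehicle):
    """Build a stable identity key for a scraped vehicle record.

    Prefers the 17-character VIN; falls back to the dealer stock number.
    Returns None when neither is usable, so callers can skip the record.
    """
    vin = (vehicle.get("vin") or "").strip().upper()
    if len(vin) == 17:
        return ("vin", vin)
    stock = (vehicle.get("stock_number") or "").strip()
    if stock:
        return ("stock", stock)
    return None
```

Keying on VIN first matters because stock numbers are dealer-assigned and can be reused, while a VIN follows the vehicle even if it moves between dealers.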
Schema for Inventory Tracking
CREATE TABLE dealers (
dealer_id SERIAL PRIMARY KEY,
name VARCHAR(200),
website VARCHAR(500),
inventory_url VARCHAR(500),
country VARCHAR(5),
region VARCHAR(100),
parser_type VARCHAR(50),
last_scraped TIMESTAMP,
is_active BOOLEAN DEFAULT true
);
CREATE TABLE vehicles (
vehicle_id SERIAL PRIMARY KEY,
dealer_id INTEGER REFERENCES dealers(dealer_id),
vin VARCHAR(17),
stock_number VARCHAR(50),
make VARCHAR(100),
model VARCHAR(200),
year INTEGER,
trim VARCHAR(200),
mileage_km INTEGER,
color VARCHAR(50),
transmission VARCHAR(20),
fuel_type VARCHAR(30),
first_seen TIMESTAMP,
last_seen TIMESTAMP,
is_active BOOLEAN DEFAULT true,
listing_url VARCHAR(500),
image_url VARCHAR(500)
);
CREATE TABLE vehicle_price_history (
id SERIAL PRIMARY KEY,
vehicle_id INTEGER REFERENCES vehicles(vehicle_id),
price DECIMAL(15, 2),
currency VARCHAR(5),
recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE inventory_snapshots (
snapshot_id SERIAL PRIMARY KEY,
dealer_id INTEGER REFERENCES dealers(dealer_id),
snapshot_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
total_vehicles INTEGER,
new_arrivals INTEGER,
sold_vehicles INTEGER,
price_changes INTEGER,
avg_price DECIMAL(15, 2),
avg_days_on_market DECIMAL(5, 1)
);

Analytics and Reporting
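The reports in this section rely on a `parse_price` helper, since the extractor captures prices as raw text like "$23,995" or "RM 89,800". A minimal sketch follows; the formats handled are assumptions about typical listing pages, and real deployments need per-market rules for decimal separators:

```python
import re

def parse_price(raw):
    """Convert a scraped price string like '$23,995' to a float.

    Returns None for missing or unparseable values ('Call for price', etc.).
    """
    if raw is None:
        return None
    if isinstance(raw, (int, float)):
        return float(raw)
    # Strip currency symbols, letters, spaces, and thousands separators
    digits = re.sub(r"[^\d.]", "", raw)
    if not digits:
        return None
    try:
        return float(digits)
    except ValueError:
        # e.g. multiple stray dots after stripping
        return None
```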
Inventory Composition Analysis
def analyze_inventory_composition(dealer_id):
vehicles = db.get_active_vehicles(dealer_id)
composition = {
"by_make": Counter(v["make"] for v in vehicles),
"by_year_range": {
"current_year": len([v for v in vehicles if v["year"] >= 2025]),
"1_3_years": len([v for v in vehicles if 2022 <= v["year"] < 2025]),
"4_6_years": len([v for v in vehicles if 2019 <= v["year"] < 2022]),
"older": len([v for v in vehicles if v["year"] < 2019]),
},
"by_price_range": {
"under_20k": len([v for v in vehicles if parse_price(v["price"]) < 20000]),
"20k_40k": len([v for v in vehicles if 20000 <= parse_price(v["price"]) < 40000]),
"40k_70k": len([v for v in vehicles if 40000 <= parse_price(v["price"]) < 70000]),
"over_70k": len([v for v in vehicles if parse_price(v["price"]) >= 70000]),
},
"total_vehicles": len(vehicles),
"total_value": sum(parse_price(v["price"]) for v in vehicles),
}
    return composition

Turn Rate Analysis
def calculate_turn_rates(dealer_id, period_days=30):
start_date = datetime.now() - timedelta(days=period_days)
starting_inventory = db.count_active_vehicles(dealer_id, at_date=start_date)
ending_inventory = db.count_active_vehicles(dealer_id)
vehicles_sold = db.count_sold_vehicles(dealer_id, since=start_date)
vehicles_added = db.count_new_arrivals(dealer_id, since=start_date)
avg_inventory = (starting_inventory + ending_inventory) / 2
    return {
        "turn_rate": vehicles_sold / avg_inventory if avg_inventory > 0 else 0,
        "sell_through_rate": (vehicles_sold / (starting_inventory + vehicles_added) * 100
                              if starting_inventory + vehicles_added > 0 else 0),
        "avg_days_to_sell": db.get_avg_days_to_sell(dealer_id, since=start_date),
        "inventory_growth": ending_inventory - starting_inventory,
    }

Scaling to Hundreds of Dealers
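At scale, not every dealer warrants the same scrape frequency, and proxy bandwidth is the main cost driver. One way to keep spend proportional to value is a tiered schedule; the tier names and intervals below are illustrative, not a recommendation:

```python
# Illustrative tiered scheduling: high-priority dealers are scraped more
# often than the long tail. Tier names and intervals are assumptions.
SCRAPE_INTERVALS_HOURS = {
    "key_competitor": 4,    # direct competitors: several times a day
    "regional": 12,         # same-region dealers: twice daily
    "long_tail": 48,        # everyone else: every other day
}

def dealers_due_for_scrape(dealers, now):
    """Return names of dealers whose last scrape is older than their tier interval."""
    due = []
    for dealer in dealers:
        interval = SCRAPE_INTERVALS_HOURS.get(dealer["tier"], 24)
        age_hours = (now - dealer["last_scraped"]).total_seconds() / 3600
        if age_hours >= interval:
            due.append(dealer["name"])
    return due
```

A scheduler loop can call this periodically and enqueue one task per due dealer, which pairs naturally with the Celery worker pattern shown below.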
Distributed Scraping
When tracking hundreds of dealers, distribute scraping across multiple workers:
from celery import Celery
app = Celery('inventory_tracker')
@app.task
def scrape_dealer_inventory(dealer_id):
dealer = db.get_dealer(dealer_id)
proxy_manager = InventoryProxyManager(API_KEY)
extractor = DealerInventoryExtractor(proxy_manager)
inventory = extractor.extract_inventory(dealer)
change_detector = InventoryChangeDetector(db)
new_arrivals = change_detector.detect_new_arrivals(dealer_id, inventory)
sold = change_detector.detect_sold_vehicles(dealer_id, inventory)
price_changes = change_detector.detect_price_changes(dealer_id, inventory)
db.save_inventory_snapshot(dealer_id, inventory, new_arrivals, sold, price_changes)
return {
"dealer": dealer["name"],
"total": len(inventory),
"new": len(new_arrivals),
"sold": len(sold),
"price_changes": len(price_changes)
    }

Proxy Bandwidth Management
Track proxy usage by dealer to optimize costs:
class BandwidthTracker:
def __init__(self):
self.usage = {}
def track_request(self, dealer_id, bytes_transferred):
if dealer_id not in self.usage:
self.usage[dealer_id] = {"requests": 0, "bytes": 0}
self.usage[dealer_id]["requests"] += 1
self.usage[dealer_id]["bytes"] += bytes_transferred
def get_cost_per_dealer(self, price_per_gb):
costs = {}
for dealer_id, usage in self.usage.items():
gb_used = usage["bytes"] / (1024 ** 3)
costs[dealer_id] = gb_used * price_per_gb
        return costs

Conclusion
Automotive inventory tracking across multiple dealer websites is a data-intensive operation that requires reliable proxy infrastructure, flexible scraping tools, and sophisticated change detection. The insights generated, including new arrival alerts, price change tracking, days-on-market analysis, and competitive inventory composition, are valuable across the automotive industry.
DataResearchTools provides the mobile proxy infrastructure needed to sustain this type of continuous monitoring. With carrier-level IPs across Southeast Asian countries, geographic targeting, and high-concurrency support, DataResearchTools ensures your inventory trackers can access dealer websites reliably without triggering anti-bot protections.
Start by tracking a small set of key competitors, validate your data quality, and then expand systematically as your system matures. The compound value of inventory tracking data grows with every scrape cycle, building a historical dataset that reveals market patterns invisible to those without it.
Related Reading
- Automotive Review Aggregation Using Proxy Networks
- Best Proxies for Automotive Data Scraping in 2026
- aiohttp + BeautifulSoup: Async Python Scraping
- How to Scrape AliExpress Product Data Without Getting Blocked
- Amazon Buy Box Monitoring: Proxy Setup for Continuous Tracking
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)