Automotive Inventory Tracking Across Multiple Dealer Websites

Tracking automotive inventory across multiple dealer websites provides a powerful lens into market dynamics. By monitoring which vehicles enter and leave dealer stock, how long they sit on lots, and how pricing changes over their listing lifecycle, businesses can extract insights that drive better purchasing, pricing, and strategic decisions.

This article covers the technical and strategic aspects of building an automotive inventory tracking system that monitors dealer websites at scale using proxy infrastructure.

The Value of Inventory Tracking Data

For Dealerships

Understanding competitor inventory helps dealerships:

  • Identify gaps: Spot vehicle segments where competitors are understocked
  • Predict behavior: Recognize when competitors are likely to discount aging inventory
  • Plan acquisitions: Buy vehicles that fill market gaps before competitors
  • Benchmark performance: Compare your inventory turn rate against the market

For Automotive Platforms

Marketplaces and aggregator platforms use inventory tracking to:

  • Measure platform health: Track total active listings and new listing velocity
  • Detect fraud: Identify listings that reappear with manipulated details
  • Improve recommendations: Understand which vehicles sell fastest in each region
  • Support advertising: Sell data-driven insights to dealer advertisers

For Analysts and Investors

Automotive inventory data reveals:

  • Market supply trends: Rising or falling inventory levels by segment
  • Demand signals: Fast-selling vehicles indicate high demand
  • Regional variations: How inventory composition differs by geography
  • Seasonal patterns: Predictable inventory cycles throughout the year

Architecture of an Inventory Tracking System

Component Overview

[Scheduler] --> [Scraper Fleet] --> [Proxy Layer] --> [Target Dealer Sites]
                     |                                        |
                     v                                        v
              [Raw Data Store] --> [Change Detector] --> [Analytics Engine]
                                        |
                                        v
                                 [Alert System] --> [Dashboard]

The Scraper Fleet

Your scraper fleet consists of multiple specialized scrapers, each designed for a specific type of dealer website:

  • Template scrapers: For dealers using common website platforms like DealerSocket, VinSolutions, or CDK
  • Custom scrapers: For dealers with bespoke websites
  • Platform scrapers: For listings on Carousell, Mudah, SGCarMart, and other marketplaces
  • API scrapers: For dealers that expose inventory through APIs or data feeds

The Proxy Layer

The proxy layer is critical because dealer websites often rate-limit or block scrapers whose requests all come from a small set of IP addresses. DataResearchTools mobile proxies provide the foundation for reliable inventory tracking because:

  • Mobile IPs blend with the majority of legitimate traffic these sites receive
  • Geographic targeting ensures you access location-appropriate inventory
  • Session management supports the multi-page navigation required to capture full inventory
  • High IP rotation keeps your scraping fingerprint distributed

class InventoryProxyManager:
    def __init__(self, api_key):
        self.api_key = api_key
        self.endpoint = "proxy.dataresearchtools.com"

    def get_proxy_for_dealer(self, dealer_country):
        return {
            "http": f"http://{self.api_key}:country-{dealer_country}-type-mobile@{self.endpoint}:8080",
            "https": f"http://{self.api_key}:country-{dealer_country}-type-mobile@{self.endpoint}:8080"
        }

Building the Inventory Scraper

Discovering Dealer Inventory Pages

Most dealer websites follow predictable URL patterns for their inventory:

INVENTORY_URL_PATTERNS = [
    "{domain}/inventory",
    "{domain}/used-cars",
    "{domain}/new-cars",
    "{domain}/vehicles",
    "{domain}/stock",
    "{domain}/cars-for-sale",
    "{domain}/pre-owned",
    "{domain}/certified-pre-owned",
]

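import requests

def discover_inventory_page(dealer_domain, proxy):
    # The patterns carry no scheme, and requests rejects scheme-less URLs,
    # so default to https:// when the domain does not already include one.
    base = dealer_domain.rstrip("/")
    if not base.startswith(("http://", "https://")):
        base = f"https://{base}"

    for pattern in INVENTORY_URL_PATTERNS:
        url = pattern.format(domain=base)
        try:
            # Note: some servers answer HEAD with 405; a GET fallback may be needed.
            response = requests.head(url, proxies=proxy, timeout=10, allow_redirects=True)
            if response.status_code == 200:
                return url
        except requests.RequestException:
            # Network or proxy error: try the next pattern
            continue
    return None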

Extracting Inventory Data

A generic inventory extractor that handles common dealer website structures:

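import random
import time

import requests
from bs4 import BeautifulSoup

# Helpers such as get_random_mobile_ua(), build_page_url(), find_text() and the
# platform-specific parsers are left for you to implement.
class DealerInventoryExtractor: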
    def __init__(self, proxy_manager):
        self.proxy_manager = proxy_manager

    def extract_inventory(self, dealer):
        proxy = self.proxy_manager.get_proxy_for_dealer(dealer["country"])

        session = requests.Session()
        session.proxies.update(proxy)
        session.headers.update({
            "User-Agent": get_random_mobile_ua(),
            "Accept-Language": "en-US,en;q=0.9"
        })

        inventory = []
        page = 1
        max_pages = 200  # arbitrary safety cap in case a site keeps repeating its last page

        while page <= max_pages:
            url = self.build_page_url(dealer["inventory_url"], page)
            response = session.get(url, timeout=30)

            if response.status_code != 200:
                break

            page_vehicles = self.parse_inventory_page(response.text, dealer["parser_type"])

            if not page_vehicles:
                break

            inventory.extend(page_vehicles)
            page += 1
            time.sleep(random.uniform(2, 5))

        return inventory

    def parse_inventory_page(self, html, parser_type):
        soup = BeautifulSoup(html, 'html.parser')

        if parser_type == "dealersocket":
            return self.parse_dealersocket(soup)
        elif parser_type == "generic":
            return self.parse_generic(soup)
        elif parser_type == "structured_data":
            return self.parse_structured_data(soup)
        return []

    def parse_generic(self, soup):
        vehicles = []

        # Try common listing card patterns
        selectors = [
            '.vehicle-card', '.inventory-item', '.car-listing',
            '[class*="vehicle"]', '[class*="inventory"]', '[class*="listing"]'
        ]

        for selector in selectors:
            items = soup.select(selector)
            if items:
                for item in items:
                    vehicle = self.extract_vehicle_data(item)
                    if vehicle.get("title"):
                        vehicles.append(vehicle)
                break

        return vehicles

    def extract_vehicle_data(self, element):
        return {
            "title": self.find_text(element, ['h2', 'h3', 'h4', '.title', '.name']),
            "price": self.find_text(element, ['.price', '[class*="price"]', '.cost']),
            "year": self.extract_year(element),
            "mileage": self.find_text(element, ['.mileage', '[class*="mileage"]', '.odometer']),
            "stock_number": self.find_text(element, ['.stock', '[class*="stock"]']),
            "vin": self.find_text(element, ['.vin', '[class*="vin"]']),
            "url": self.find_link(element),
            "image_url": self.find_image(element),
        }
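
The extractor above dispatches to a parse_structured_data method that the article does not show. One way it might work, offered as a sketch rather than the author's implementation: many listing pages embed schema.org JSON-LD, which can be read without guessing CSS selectors. The function name parse_structured_data_html, the regex-based script extraction, and the choice of fields are all assumptions made for illustration. It uses only the standard library so it can be tried in isolation.

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> blocks.
# A regex is enough for a sketch; an HTML parser such as BeautifulSoup is more robust.
JSON_LD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)

VEHICLE_TYPES = {"Car", "Vehicle"}  # schema.org types treated as listings here


def _iter_nodes(data):
    """Yield every dict in a decoded JSON-LD payload (handles lists and @graph)."""
    if isinstance(data, list):
        for entry in data:
            yield from _iter_nodes(entry)
    elif isinstance(data, dict):
        yield data
        yield from _iter_nodes(data.get("@graph", []))


def parse_structured_data_html(html):
    """Hypothetical helper: extract vehicle records from JSON-LD in raw HTML."""
    vehicles = []
    for match in JSON_LD_RE.finditer(html):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        for node in _iter_nodes(payload):
            node_type = node.get("@type")
            types = set(node_type) if isinstance(node_type, list) else {node_type}
            if not types & VEHICLE_TYPES:
                continue
            offers = node.get("offers") or {}
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            if not isinstance(offers, dict):
                offers = {}
            vehicles.append({
                "title": node.get("name"),
                "vin": node.get("vehicleIdentificationNumber"),
                "price": offers.get("price"),
                "url": node.get("url"),
            })
    return vehicles
```

Inside the class, the same logic could be fed from str(soup) or from the script tags BeautifulSoup finds. Fields missing from the markup come back as None, so downstream code should still fall back to the selector-based parsers when needed.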

Handling JavaScript-Rendered Inventory Pages

Many modern dealer websites render inventory dynamically. Use a headless browser for these:

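from urllib.parse import urlsplit

from playwright.sync_api import sync_playwright

class JSInventoryExtractor:
    def extract_with_browser(self, dealer, proxy):
        # Playwright takes proxy credentials as separate fields rather than
        # embedded in the server URL, so split them out of the proxy string.
        parts = urlsplit(proxy["http"])
        with sync_playwright() as p:
            browser = p.chromium.launch(
                proxy={
                    "server": f"{parts.scheme}://{parts.hostname}:{parts.port}",
                    "username": parts.username,
                    "password": parts.password,
                }
            )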
            context = browser.new_context(
                user_agent=get_random_mobile_ua(),
                viewport={"width": 412, "height": 915}
            )

            page = context.new_page()
            page.goto(dealer["inventory_url"], wait_until="networkidle")

            # Handle infinite scroll
            vehicles = []
            last_count = 0

            while True:
                page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
                page.wait_for_timeout(2000)

                current_items = page.query_selector_all('[class*="vehicle"], [class*="inventory"]')
                current_count = len(current_items)

                if current_count == last_count:
                    break
                last_count = current_count

            # Extract data from all loaded items
            for item in current_items:
                vehicle = self.extract_from_element(page, item)
                if vehicle:
                    vehicles.append(vehicle)

            browser.close()
            return vehicles

Change Detection Engine

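The change detection engine is what transforms raw inventory snapshots into actionable intelligence. Because it infers sales from disappearances, run it only on complete snapshots: a scrape that stopped early on an error or a block would make every vehicle it missed look sold.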

Detecting New Arrivals

class InventoryChangeDetector:
    def __init__(self, db):
        self.db = db

    def detect_new_arrivals(self, dealer_id, current_inventory):
        previous_ids = set(self.db.get_active_vehicle_ids(dealer_id))
        current_ids = set(v["vin"] or v["stock_number"] for v in current_inventory)

        new_ids = current_ids - previous_ids
        new_arrivals = [v for v in current_inventory
                       if (v["vin"] or v["stock_number"]) in new_ids]

        return new_arrivals

    def detect_sold_vehicles(self, dealer_id, current_inventory):
        previous_ids = set(self.db.get_active_vehicle_ids(dealer_id))
        current_ids = set(v["vin"] or v["stock_number"] for v in current_inventory)

        sold_ids = previous_ids - current_ids
        sold_vehicles = self.db.get_vehicles_by_ids(dealer_id, sold_ids)

        return sold_vehicles

    def detect_price_changes(self, dealer_id, current_inventory):
        changes = []
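        for vehicle in current_inventory:
            vehicle_id = vehicle["vin"] or vehicle["stock_number"]
            previous_price = self.db.get_latest_price(dealer_id, vehicle_id)
            # Scraped prices are text (e.g. "$18,500"), so convert before comparing.
            # parse_price is the same helper the analytics section relies on; this
            # assumes it returns the same numeric type as get_latest_price.
            new_price = parse_price(vehicle["price"])

            if previous_price and new_price and previous_price != new_price:
                changes.append({
                    "vehicle": vehicle,
                    "previous_price": previous_price,
                    "new_price": new_price,
                    "change_amount": new_price - previous_price,
                    "change_percent": ((new_price - previous_price) / previous_price) * 100
                })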

        return changes
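
To see the three detectors working together without a database, here is a small in-memory sketch. The function name diff_snapshots, the dict-based snapshots, the sample vehicles, and the already-numeric prices are assumptions for illustration. Listings with neither a VIN nor a stock number are skipped, because they cannot be matched across scrapes.

```python
def vehicle_key(vehicle):
    """Prefer the VIN, fall back to the stock number; None if neither exists."""
    return vehicle.get("vin") or vehicle.get("stock_number")


def diff_snapshots(previous, current):
    """Compare two scrapes of one dealer and classify the differences."""
    prev_by_key = {vehicle_key(v): v for v in previous if vehicle_key(v)}
    curr_by_key = {vehicle_key(v): v for v in current if vehicle_key(v)}

    new_keys = curr_by_key.keys() - prev_by_key.keys()
    gone_keys = prev_by_key.keys() - curr_by_key.keys()

    price_changes = []
    for key in sorted(curr_by_key.keys() & prev_by_key.keys()):
        old, new = prev_by_key[key]["price"], curr_by_key[key]["price"]
        if old != new:
            price_changes.append({"key": key, "previous_price": old, "new_price": new})

    return {
        "new_arrivals": sorted(new_keys),
        "removed": sorted(gone_keys),  # likely sold, but could be a transfer or delisting
        "price_changes": price_changes,
    }


# Made-up sample data for two consecutive scrapes of the same dealer
yesterday = [
    {"vin": "VIN001", "stock_number": "A1", "price": 18500},
    {"vin": "VIN002", "stock_number": "A2", "price": 32000},
    {"vin": None, "stock_number": "A3", "price": 9900},
]
today = [
    {"vin": "VIN001", "stock_number": "A1", "price": 17900},  # price cut
    {"vin": None, "stock_number": "A3", "price": 9900},       # unchanged
    {"vin": "VIN004", "stock_number": "A4", "price": 45000},  # new arrival
]

changes = diff_snapshots(yesterday, today)
print(changes)
```

Keying both snapshots by the same identifier keeps the three comparisons consistent. A vehicle that switches from stock number to VIN between scrapes would show up as one removal plus one arrival, which is worth checking for in real data.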

Tracking Days on Market

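from datetime import datetime

# The two functions below are methods of InventoryChangeDetector.
def calculate_days_on_market(self, dealer_id):
    active_vehicles = self.db.get_active_vehicles(dealer_id)
    dom_data = []

    for vehicle in active_vehicles:
        first_seen = vehicle["first_seen"]  # column name used in the vehicles table
        days = (datetime.now() - first_seen).days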

        dom_data.append({
            "vehicle": vehicle,
            "days_on_market": days,
            "category": self.categorize_dom(days)
        })

    return dom_data

def categorize_dom(self, days):
    if days <= 14:
        return "fresh"
    elif days <= 30:
        return "normal"
    elif days <= 60:
        return "aging"
    elif days <= 90:
        return "stale"
    else:
        return "problem"

Data Storage Design

Schema for Inventory Tracking

CREATE TABLE dealers (
    dealer_id SERIAL PRIMARY KEY,
    name VARCHAR(200),
    website VARCHAR(500),
    inventory_url VARCHAR(500),
    country VARCHAR(5),
    region VARCHAR(100),
    parser_type VARCHAR(50),
    last_scraped TIMESTAMP,
    is_active BOOLEAN DEFAULT true
);

CREATE TABLE vehicles (
    vehicle_id SERIAL PRIMARY KEY,
    dealer_id INTEGER REFERENCES dealers(dealer_id),
    vin VARCHAR(17),
    stock_number VARCHAR(50),
    make VARCHAR(100),
    model VARCHAR(200),
    year INTEGER,
    trim VARCHAR(200),
    mileage_km INTEGER,
    color VARCHAR(50),
    transmission VARCHAR(20),
    fuel_type VARCHAR(30),
    first_seen TIMESTAMP,
    last_seen TIMESTAMP,
    is_active BOOLEAN DEFAULT true,
    listing_url VARCHAR(500),
    image_url VARCHAR(500)
);

CREATE TABLE vehicle_price_history (
    id SERIAL PRIMARY KEY,
    vehicle_id INTEGER REFERENCES vehicles(vehicle_id),
    price DECIMAL(15, 2),
    currency VARCHAR(5),
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE inventory_snapshots (
    snapshot_id SERIAL PRIMARY KEY,
    dealer_id INTEGER REFERENCES dealers(dealer_id),
    snapshot_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    total_vehicles INTEGER,
    new_arrivals INTEGER,
    sold_vehicles INTEGER,
    price_changes INTEGER,
    avg_price DECIMAL(15, 2),
    avg_days_on_market DECIMAL(5, 1)
);
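
As an illustration of how this schema can answer the "latest price" lookups the change detector needs, the sketch below runs a query against an in-memory SQLite database. It is an assumption-laden miniature: the tables are trimmed to a few columns, SQLite types replace SERIAL and DECIMAL, and the rows are made-up sample data. The query itself uses only standard SQL and should carry over to PostgreSQL with little change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicles (
        vehicle_id INTEGER PRIMARY KEY,
        dealer_id INTEGER,
        vin TEXT,
        is_active INTEGER DEFAULT 1
    );
    CREATE TABLE vehicle_price_history (
        id INTEGER PRIMARY KEY,
        vehicle_id INTEGER REFERENCES vehicles(vehicle_id),
        price REAL,
        recorded_at TEXT
    );
""")
conn.executemany(
    "INSERT INTO vehicles (vehicle_id, dealer_id, vin) VALUES (?, ?, ?)",
    [(1, 10, "VIN001"), (2, 10, "VIN002")],
)
conn.executemany(
    "INSERT INTO vehicle_price_history (vehicle_id, price, recorded_at) VALUES (?, ?, ?)",
    [
        (1, 18500, "2024-03-01"),
        (1, 17900, "2024-03-15"),  # later record for the same vehicle
        (2, 32000, "2024-03-02"),
    ],
)

# For each active vehicle of a dealer, keep only its most recent price record.
LATEST_PRICE_SQL = """
    SELECT v.vin, h.price, h.recorded_at
    FROM vehicles AS v
    JOIN vehicle_price_history AS h ON h.vehicle_id = v.vehicle_id
    WHERE v.dealer_id = ?
      AND v.is_active = 1
      AND h.recorded_at = (
          SELECT MAX(recorded_at)
          FROM vehicle_price_history
          WHERE vehicle_id = v.vehicle_id
      )
    ORDER BY v.vin
"""

latest_prices = conn.execute(LATEST_PRICE_SQL, (10,)).fetchall()
print(latest_prices)
```

Storing every price observation and selecting the newest at read time keeps the full history available for price-trajectory analysis. At larger volumes, an index on (vehicle_id, recorded_at) would likely be needed to keep this lookup fast.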

Analytics and Reporting

Inventory Composition Analysis

from collections import Counter
from datetime import datetime

def analyze_inventory_composition(dealer_id):
    vehicles = db.get_active_vehicles(dealer_id)
    this_year = datetime.now().year  # avoids hardcoding model-year cutoffs
    years = [v["year"] for v in vehicles if v["year"]]
    # Parse each price once and skip listings without a usable price
    prices = [p for p in (parse_price(v["price"]) for v in vehicles) if p is not None]

    composition = {
        "by_make": Counter(v["make"] for v in vehicles),
        "by_year_range": {
            "current_year": sum(y >= this_year for y in years),
            "1_3_years": sum(this_year - 3 <= y < this_year for y in years),
            "4_6_years": sum(this_year - 6 <= y < this_year - 3 for y in years),
            "older": sum(y < this_year - 6 for y in years),
        },
        "by_price_range": {
            "under_20k": sum(p < 20000 for p in prices),
            "20k_40k": sum(20000 <= p < 40000 for p in prices),
            "40k_70k": sum(40000 <= p < 70000 for p in prices),
            "over_70k": sum(p >= 70000 for p in prices),
        },
        "total_vehicles": len(vehicles),
        "total_value": sum(prices),
    }

    return composition
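
The analysis above calls parse_price, which the article never defines. A minimal sketch of what such a helper might look like follows; it is a guess at the intent, not the author's code. It assumes prices use a comma as the thousands separator and a dot as the decimal mark, and it returns None when no number can be recovered, so callers need to skip those listings.

```python
import re
from decimal import Decimal, InvalidOperation


def parse_price(raw):
    """Hypothetical helper: turn scraped price text into a Decimal, or None."""
    if raw is None:
        return None
    if isinstance(raw, (int, float, Decimal)):
        return Decimal(str(raw))
    # Keep digits and the decimal point; drop currency symbols, commas, spaces.
    cleaned = re.sub(r"[^\d.]", "", str(raw))
    if not cleaned:
        return None  # e.g. "Call for price"
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None  # e.g. stray dots left over from unusual formatting


for sample in ["$18,500", "RM 45,800.00", "Call for price"]:
    print(sample, "->", parse_price(sample))
```

Decimal is used instead of float so values line up with the DECIMAL(15, 2) price column. Two known limits of this sketch: text containing two prices (for example a "was/now" pair) would be merged into one number, and formats that use a dot as the thousands separator would be misread.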

Turn Rate Analysis

from datetime import datetime, timedelta

def calculate_turn_rates(dealer_id, period_days=30):
    start_date = datetime.now() - timedelta(days=period_days)

    starting_inventory = db.count_active_vehicles(dealer_id, at_date=start_date)
    ending_inventory = db.count_active_vehicles(dealer_id)
    vehicles_sold = db.count_sold_vehicles(dealer_id, since=start_date)
    vehicles_added = db.count_new_arrivals(dealer_id, since=start_date)

    avg_inventory = (starting_inventory + ending_inventory) / 2
    # Everything that could have sold in the period; guards the division below
    available = starting_inventory + vehicles_added

    return {
        "turn_rate": vehicles_sold / avg_inventory if avg_inventory > 0 else 0,
        "sell_through_rate": vehicles_sold / available * 100 if available > 0 else 0,
        "avg_days_to_sell": db.get_avg_days_to_sell(dealer_id, since=start_date),
        "inventory_growth": ending_inventory - starting_inventory,
    }
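
A worked example can make the two ratios easier to tell apart. The sketch below applies the same formulas as calculate_turn_rates to plain counts; the function name turn_metrics and the figures (120 vehicles at the start, 30 added, 40 sold, 110 at the end) are made up for illustration.

```python
def turn_metrics(starting_inventory, ending_inventory, vehicles_sold, vehicles_added):
    """Same formulas as calculate_turn_rates, applied to plain counts."""
    avg_inventory = (starting_inventory + ending_inventory) / 2
    available = starting_inventory + vehicles_added
    return {
        "turn_rate": vehicles_sold / avg_inventory if avg_inventory > 0 else 0,
        "sell_through_rate": vehicles_sold / available * 100 if available > 0 else 0,
        "inventory_growth": ending_inventory - starting_inventory,
    }


# 120 on the lot at the start, 30 added, 40 sold, so 110 remain at the end
metrics = turn_metrics(
    starting_inventory=120, ending_inventory=110, vehicles_sold=40, vehicles_added=30
)
print({name: round(value, 2) for name, value in metrics.items()})
```

Here the turn rate is 40 / 115, or about 0.35 turns in the period, while the sell-through rate is 40 / 150, or about 26.7%. The first measures sales against average stock; the second measures sales against everything that was available to sell.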

Scaling to Hundreds of Dealers

Distributed Scraping

When tracking hundreds of dealers, distribute scraping across multiple workers:

from celery import Celery

app = Celery('inventory_tracker')

@app.task
def scrape_dealer_inventory(dealer_id):
    dealer = db.get_dealer(dealer_id)
    proxy_manager = InventoryProxyManager(API_KEY)

    extractor = DealerInventoryExtractor(proxy_manager)
    inventory = extractor.extract_inventory(dealer)

    change_detector = InventoryChangeDetector(db)
    new_arrivals = change_detector.detect_new_arrivals(dealer_id, inventory)
    sold = change_detector.detect_sold_vehicles(dealer_id, inventory)
    price_changes = change_detector.detect_price_changes(dealer_id, inventory)

    db.save_inventory_snapshot(dealer_id, inventory, new_arrivals, sold, price_changes)

    return {
        "dealer": dealer["name"],
        "total": len(inventory),
        "new": len(new_arrivals),
        "sold": len(sold),
        "price_changes": len(price_changes)
    }

Proxy Bandwidth Management

Track proxy usage by dealer to optimize costs:

class BandwidthTracker:
    def __init__(self):
        self.usage = {}

    def track_request(self, dealer_id, bytes_transferred):
        if dealer_id not in self.usage:
            self.usage[dealer_id] = {"requests": 0, "bytes": 0}
        self.usage[dealer_id]["requests"] += 1
        self.usage[dealer_id]["bytes"] += bytes_transferred

    def get_cost_per_dealer(self, price_per_gb):
        costs = {}
        for dealer_id, usage in self.usage.items():
            gb_used = usage["bytes"] / (1024 ** 3)
            costs[dealer_id] = gb_used * price_per_gb
        return costs
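
Before the tracker has real measurements, a back-of-the-envelope estimate can help size a proxy plan. The sketch below is a planning aid under stated assumptions: the function name and every input (dealer count, pages per dealer, average page size, scrape frequency, price per GB) are hypothetical values you would replace with your own.

```python
def estimate_monthly_proxy_cost(dealers, pages_per_dealer, avg_page_bytes,
                                scrapes_per_day, price_per_gb, days=30):
    """Rough planning estimate; every input is an assumption supplied by the caller."""
    bytes_per_cycle = dealers * pages_per_dealer * avg_page_bytes
    total_bytes = bytes_per_cycle * scrapes_per_day * days
    gb_used = total_bytes / (1024 ** 3)  # same GB definition as BandwidthTracker
    return {"gb_used": gb_used, "cost": gb_used * price_per_gb}


# Hypothetical workload: 200 dealers, 10 pages each, 512 KiB per page, 4 scrapes a day
estimate = estimate_monthly_proxy_cost(
    dealers=200, pages_per_dealer=10, avg_page_bytes=512 * 1024,
    scrapes_per_day=4, price_per_gb=5.0,
)
print(f"{estimate['gb_used']:.1f} GB, cost {estimate['cost']:.2f}")
```

With these inputs the estimate comes to roughly 117 GB per month. Headless-browser scrapes usually transfer far more per page than plain HTTP requests, so dealers that need JavaScript rendering deserve their own, larger page-size figure.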

Conclusion

Automotive inventory tracking across multiple dealer websites is a data-intensive operation that requires reliable proxy infrastructure, flexible scraping tools, and sophisticated change detection. The insights generated, including new arrival alerts, price change tracking, days-on-market analysis, and competitive inventory composition, are valuable across the automotive industry.

DataResearchTools provides the mobile proxy infrastructure needed to sustain this type of continuous monitoring. With carrier-level IPs across Southeast Asian countries, geographic targeting, and high-concurrency support, DataResearchTools helps your inventory trackers reach dealer websites reliably and reduces the chance of triggering anti-bot protections.

Start by tracking a small set of key competitors, validate your data quality, and then expand systematically as your system matures. The compound value of inventory tracking data grows with every scrape cycle, building a historical dataset that reveals market patterns invisible to those without it.

