How to Scrape Movie Theater Showtimes and Seat Availability

Movie theater data, including showtimes, seat availability, and pricing, is valuable for a range of applications from building aggregator services to analyzing the cinema industry. Scraping this data across multiple cinema chains and markets requires proxy infrastructure that can handle geographic restrictions, anti-bot measures, and high-volume data collection. This guide covers the technical and practical aspects of movie theater data scraping.

The Value of Movie Theater Data

For App Developers and Aggregators

Movie data aggregation is a significant business:

  • Showtime aggregator apps: Compile showtimes from multiple cinema chains
  • Price comparison tools: Help consumers find the cheapest screenings
  • Seat selection services: Show real-time seat availability across theaters
  • Social planning apps: Enable groups to find mutually convenient showtimes

For Market Researchers

Cinema industry analysis relies on theater data:

  • Box office tracking: Monitor screening frequency as a demand indicator
  • Market sizing: Count screens, showtimes, and capacity across markets
  • Competitive analysis: Compare cinema chains’ offerings and pricing
  • Distribution patterns: Track how films roll out across theaters and regions

For Film Distributors

Studios and distributors use theater data to:

  • Monitor screen allocation: Track how many screens each film receives
  • Verify agreements: Ensure theaters honor contractual screening commitments
  • Analyze geographic performance: Compare screening volumes across markets
  • Plan marketing: Target promotions to markets with available capacity

For Cinema Operators

Theater operators use competitive data to:

  • Benchmark pricing: Compare their ticket prices with competitors
  • Optimize scheduling: Identify underserved time slots or genres
  • Track market share: Monitor their share of total screen time in each market
  • Plan expansion: Identify markets with high demand and limited supply

Southeast Asia (SEA) Cinema Landscape

Major Cinema Chains by Country

Singapore:

  • Golden Village (GV): Largest chain with multiple locations
  • Cathay Cineplexes: Premium cinema operator
  • Shaw Theatres: Historic chain with modern venues
  • FilmGarde: Smaller chain with select locations
  • The Projector: Independent art-house cinema

Thailand:

  • Major Cineplex: Largest chain with 800+ screens
  • SF Cinema: Second largest with premium venues
  • Icon Cineconic: Boutique cinema experience

Malaysia:

  • Golden Screen Cinemas (GSC): Largest with 500+ screens
  • TGV Cinemas: Major competitor
  • MBO Cinemas: Regional chain
  • mmCineplexes: Growing chain

Indonesia:

  • Cinema XXI / CGV: Dominant chains with hundreds of screens
  • Cinepolis: International chain with Indonesian presence
  • Platinum Cineplex: Budget-friendly option

Philippines:

  • SM Cinema: Largest chain tied to SM malls
  • Robinsons Movieworld: Major chain
  • Ayala Malls Cinemas: Premium locations

Vietnam:

  • CGV: South Korean chain dominating the market
  • Lotte Cinema: Second major chain
  • Galaxy Cinema: Local chain

Online Booking Platforms

In addition to cinema websites, several aggregator platforms provide showtime data:

  • Fandango: International presence
  • BookMyShow: Growing in SEA markets
  • Google Movies: Aggregated showtime listings
  • Cinema chain apps: Most major chains have their own booking apps

Technical Approach to Movie Data Scraping

Data Sources and Access Methods

Cinema chain websites: Most cinema chains display showtimes and seat maps on their websites. These are typically JavaScript-rendered pages requiring headless browser scraping.

Cinema chain APIs: Some chains expose APIs for their mobile apps. These APIs often return structured data that is easier to parse than web pages.

Aggregator platforms: Platforms like Google Movies and Fandango compile data from multiple sources, providing a consolidated view.

Proxy Requirements

Movie theater websites have varying levels of anti-bot protection:

Lower protection (most regional cinema sites):

  • Rate limiting based on request frequency
  • Basic bot detection
  • Geographic access restrictions

Higher protection (major chains and aggregators):

  • JavaScript-based bot detection
  • CAPTCHA challenges
  • Sophisticated fingerprinting
  • Aggressive rate limiting

DataResearchTools mobile proxies provide the trust level needed to scrape both types effectively, with country-specific IPs for accessing regional cinema platforms.

Scraping Architecture

Data Sources
    |-- Cinema chain websites (GV, Cathay, Major Cineplex, etc.)
    |-- Cinema chain mobile APIs (reverse-engineered endpoints)
    |-- Aggregator platforms (Google Movies, Fandango)
    |
Proxy Layer (DataResearchTools)
    |-- Singapore proxy for Singapore cinemas
    |-- Thai proxy for Thai cinemas
    |-- Malaysian proxy for Malaysian cinemas
    |-- Indonesian proxy for Indonesian cinemas
    |-- Philippine proxy for Philippine cinemas
    |-- Vietnamese proxy for Vietnamese cinemas
    |
Scraping Engine
    |-- Playwright/Puppeteer for web scraping
    |-- HTTP client for API scraping
    |-- Request scheduling and rate limiting
    |
Data Processing
    |-- HTML/JSON parsing
    |-- Data normalization
    |-- Deduplication
    |-- Validation
    |
Storage
    |-- Showtime database
    |-- Seat availability snapshots
    |-- Price history records
    |-- Movie metadata

Scraping Showtimes

Step 1: Identify Target Cinemas

Create a list of cinemas to monitor:

Cinema Chain   | Country     | Website                 | API Available | Proxy Needed
Golden Village | Singapore   | gv.com.sg               | Yes (mobile)  | Singapore
Cathay         | Singapore   | cathaycineplexes.com.sg | Limited       | Singapore
Major Cineplex | Thailand    | majorcineplex.com       | Yes (mobile)  | Thailand
GSC            | Malaysia    | gsc.com.my              | Yes (mobile)  | Malaysia
Cinema XXI     | Indonesia   | 21cineplex.com          | Yes           | Indonesia
SM Cinema      | Philippines | smcinema.com            | Limited       | Philippines
CGV Vietnam    | Vietnam     | cgv.vn                  | Yes (mobile)  | Vietnam
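
The target list above can also be kept as structured data, so the scraper can look up which proxy country each chain needs. This is a minimal sketch: the `CinemaTarget` class, its field names, and the `targets_by_proxy_country` helper are illustrative names chosen here, and the rows simply mirror the table.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CinemaTarget:
    chain: str
    country: str
    website: str
    api_access: str     # "mobile", "yes", or "limited", as listed in the table
    proxy_country: str  # ISO 3166-1 alpha-2 code of the proxy to route through


TARGETS = [
    CinemaTarget("Golden Village", "Singapore", "gv.com.sg", "mobile", "SG"),
    CinemaTarget("Cathay", "Singapore", "cathaycineplexes.com.sg", "limited", "SG"),
    CinemaTarget("Major Cineplex", "Thailand", "majorcineplex.com", "mobile", "TH"),
    CinemaTarget("GSC", "Malaysia", "gsc.com.my", "mobile", "MY"),
    CinemaTarget("Cinema XXI", "Indonesia", "21cineplex.com", "yes", "ID"),
    CinemaTarget("SM Cinema", "Philippines", "smcinema.com", "limited", "PH"),
    CinemaTarget("CGV Vietnam", "Vietnam", "cgv.vn", "mobile", "VN"),
]


def targets_by_proxy_country(targets):
    """Group chain names by the proxy country they should be scraped through."""
    grouped = defaultdict(list)
    for target in targets:
        grouped[target.proxy_country].append(target.chain)
    return dict(grouped)
```

Grouping by proxy country makes it easy to run one batch per country-specific proxy instead of switching proxies between requests.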

Step 2: Build Platform-Specific Scrapers

Each cinema chain has a different website structure. Build scrapers for each:

For JavaScript-rendered sites (most cinema websites):

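# Showtime scraping sketch using Playwright's async Python API.
# The URL pattern and CSS selectors are placeholders; adapt them to each chain.
from playwright.async_api import async_playwright

async def scrape_showtimes(cinema_chain, date, proxy):
    # proxy is a Playwright proxy dict, e.g. {"server": "http://host:port"}
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True, proxy=proxy)
        try:
            page = await browser.new_page()

            # Navigate to showtime page
            await page.goto(f"{cinema_chain.url}/showtimes?date={date}")

            # Wait for showtime data to load
            await page.wait_for_selector(".showtime-listing")

            # Extract showtime data
            movies = await page.query_selector_all(".movie-card")
            showtimes = []

            for movie in movies:
                title_el = await movie.query_selector(".movie-title")
                title = (await title_el.inner_text()).strip() if title_el else None
                buttons = await movie.query_selector_all(".showtime-button")
                for button in buttons:
                    showtimes.append({
                        "movie": title,
                        "time": (await button.inner_text()).strip(),
                        "cinema": cinema_chain.name,
                        "date": date,
                        "format": await button.get_attribute("data-format"),  # 2D, 3D, IMAX
                    })
            return showtimes
        finally:
            # Close the browser even if a selector wait times out
            await browser.close()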

For API-based scraping:

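# API-based scraping sketch; the endpoint path and parameters are placeholders.
import requests

def scrape_api_showtimes(cinema_chain, date, proxy):
    response = requests.get(
        f"{cinema_chain.api_url}/showtimes",
        params={"date": date, "cinema_id": cinema_chain.id},
        proxies={"http": proxy.url, "https": proxy.url},
        headers=cinema_chain.api_headers,
        timeout=30,  # avoid hanging indefinitely on a stalled proxy
    )
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error body
    return response.json()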

Step 3: Schedule Regular Data Collection

Set up automated scraping schedules:

  • Showtime updates: Scrape once daily for the next 7-14 days of showtimes
  • New movie detection: Check for newly added films twice daily
  • Price checks: Monitor pricing changes weekly
  • Seat availability: Check every 30-60 minutes for target screenings
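
One way to drive these schedules is to record when each task last ran and check which ones are due. The sketch below is illustrative only: the task names and the `due_tasks` helper are made up here, and where the schedule gives a range (30-60 minutes) the shorter end is assumed.

```python
from datetime import datetime, timedelta

# Intervals from the schedule above; task names are illustrative.
INTERVALS = {
    "showtimes": timedelta(days=1),
    "new_movies": timedelta(hours=12),   # twice daily
    "prices": timedelta(weeks=1),
    "seat_availability": timedelta(minutes=30),
}


def due_tasks(last_run, now):
    """Return the tasks whose interval has elapsed since their last run.

    last_run maps task name -> datetime of the previous run.
    Tasks that have never run are always due.
    """
    due = []
    for task, interval in INTERVALS.items():
        previous = last_run.get(task)
        if previous is None or now - previous >= interval:
            due.append(task)
    return due
```

A cron job or a simple loop can call `due_tasks` every few minutes and dispatch whatever it returns, which keeps the schedule in one place rather than spread across several cron entries.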

Scraping Seat Availability

Understanding Seat Map Data

Cinema seat maps typically include:

  • Seat layout: Grid of seats organized by row and column
  • Seat status: Available, sold, reserved, blocked, or maintenance
  • Seat type: Standard, premium, couple, wheelchair accessible
  • Pricing: Different prices for different seat types and positions

Technical Challenges

Seat maps present unique scraping challenges:

Dynamic rendering: Seat maps are often drawn with Canvas or SVG rather than ordinary HTML elements, so traditional HTML parsing alone is often insufficient.

Real-time updates: Seat availability changes constantly as tickets are sold.

Session-dependent: Seat maps often require an active booking session to display.

Anti-scraping measures: Cinema chains protect seat data more aggressively than showtime data.

Scraping Approach

  1. Initiate a booking session through a DataResearchTools mobile proxy
  2. Select the target showtime to load the seat map
  3. Extract seat data from the page (API calls or DOM parsing)
  4. Parse the seat grid to determine available vs. sold seats
  5. Calculate occupancy as a percentage of total capacity
  6. Store the snapshot with timestamp for historical tracking
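
Steps 4 to 6 can be sketched as follows, assuming the seat data has already been extracted into a list of dictionaries. The `status` values and the `snapshot_seat_map` helper are hypothetical; real seat-map payloads differ per chain and need their own mapping.

```python
from collections import Counter
from datetime import datetime, timezone

# Statuses counted as occupied. Blocked and maintenance seats count toward
# capacity but are neither available nor taken.
TAKEN_STATUSES = {"sold", "reserved"}


def snapshot_seat_map(seats, scraped_at=None):
    """Summarize one seat map into a timestamped occupancy snapshot.

    seats: list of dicts, each with at least a "status" key.
    """
    counts = Counter(seat["status"] for seat in seats)
    total = len(seats)
    taken = sum(counts[status] for status in TAKEN_STATUSES)
    return {
        "total_seats": total,
        "available_seats": counts["available"],
        "taken_seats": taken,
        # Occupancy as a percentage of total capacity, as described in step 5
        "occupancy_pct": round(100 * taken / total, 1) if total else 0.0,
        "scraped_at": (scraped_at or datetime.now(timezone.utc)).isoformat(),
    }
```

Storing the raw counts alongside the percentage lets later analysis switch to a different denominator, for example excluding blocked seats, without rescraping.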

Occupancy Analysis

Seat availability data enables occupancy analysis:

  • Fill rate by showtime: Which time slots have highest occupancy
  • Fill rate by movie: Which films sell the most seats
  • Fill rate by day: Which days of the week are busiest
  • Fill progression: How quickly seats sell after showtimes are published
  • Premium vs. standard: Relative demand for premium seating
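
Most of these breakdowns are the same aggregation with a different grouping key. A minimal sketch, assuming each snapshot is a dictionary with `movie_title`, `showtime` (a `datetime`), `total_seats`, and `available_seats`; the helper names are illustrative.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def fill_rate_by(snapshots, key):
    """Return seats sold divided by capacity for each group produced by key()."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sold, capacity]
    for snap in snapshots:
        bucket = totals[key(snap)]
        bucket[0] += snap["total_seats"] - snap["available_seats"]
        bucket[1] += snap["total_seats"]
    return {group: round(sold / capacity, 3)
            for group, (sold, capacity) in totals.items() if capacity}


def by_day(snap):
    return DAY_NAMES[snap["showtime"].weekday()]


def by_movie(snap):
    return snap["movie_title"]


def by_slot(snap):
    return snap["showtime"].strftime("%H:%M")
```

For example, `fill_rate_by(snapshots, by_day)` gives the fill rate per weekday. Weighting by capacity, rather than averaging per-screening percentages, keeps small halls from dominating the result.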

Data Processing and Storage

Showtime Data Model

Field           | Type     | Description
movie_title     | string   | Film title
cinema_chain    | string   | Cinema operator
cinema_location | string   | Specific theater location
screen_number   | integer  | Screen/hall number
showtime        | datetime | Date and time of screening
format          | string   | 2D, 3D, IMAX, 4DX, etc.
language        | string   | Audio language
subtitles       | string   | Subtitle language
price_standard  | decimal  | Standard seat price
price_premium   | decimal  | Premium seat price
currency        | string   | Currency code
total_seats     | integer  | Total capacity
available_seats | integer  | Seats currently available
proxy_country   | string   | Country of proxy used
scraped_at      | datetime | Timestamp of data collection
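
In Python, this model could be expressed as a dataclass. The sketch below follows the field list above; making `subtitles` and `price_premium` optional and adding the `occupancy_pct` property are assumptions made here, not part of the model itself.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass
class ShowtimeRecord:
    movie_title: str
    cinema_chain: str
    cinema_location: str
    screen_number: int
    showtime: datetime
    format: str                      # 2D, 3D, IMAX, 4DX, etc.
    language: str
    subtitles: Optional[str]         # assumed optional: not every screening has subtitles
    price_standard: Decimal
    price_premium: Optional[Decimal] # assumed optional: not every hall has premium seats
    currency: str
    total_seats: int
    available_seats: int
    proxy_country: str
    scraped_at: datetime

    @property
    def occupancy_pct(self) -> float:
        """Share of seats no longer available, as a percentage of capacity."""
        if self.total_seats == 0:
            return 0.0
        taken = self.total_seats - self.available_seats
        return round(100 * taken / self.total_seats, 1)
```

Using `Decimal` for prices avoids floating-point rounding when prices are later summed or compared across snapshots.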

Data Quality Checks

Implement validation to ensure data quality:

  • Verify showtime dates are in the future
  • Check that prices are within reasonable ranges
  • Validate seat counts against known theater capacities
  • Detect and handle duplicate entries
  • Flag anomalies for manual review
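
Several of these checks can be combined into one function that returns a list of problems per record, so anomalies can be flagged rather than silently dropped. This is a sketch under assumptions: the record is a plain dictionary using the field names from the data model, and the acceptable price range and known capacity are supplied by the caller because they vary by market and theater.

```python
from datetime import datetime


def validate_record(record, now, price_range=None, known_capacity=None):
    """Return a list of problem labels; an empty list means the record passed.

    price_range: optional (low, high) tuple in the record's currency.
    known_capacity: optional seat count for the hall from a trusted source.
    """
    problems = []
    if record["showtime"] <= now:
        problems.append("showtime_not_in_future")
    if price_range is not None:
        low, high = price_range
        if not low <= record["price_standard"] <= high:
            problems.append("price_out_of_range")
    if record["available_seats"] > record["total_seats"]:
        problems.append("available_exceeds_total")
    if known_capacity is not None and record["total_seats"] != known_capacity:
        problems.append("capacity_mismatch")
    return problems
```

Records with a non-empty result can be routed to a review queue together with the labels, which makes recurring scraper faults easier to spot.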

Applications of Movie Theater Data

Building a Showtime Aggregator

Compile showtimes from all cinema chains in a market:

  1. Scrape all major chains using DataResearchTools proxies for each country
  2. Normalize data into a consistent format
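  3. Deduplicate records that describe the same screening (same movie, cinema, location, and start time), which can occur when a showing appears in more than one source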
  4. Present data through an app or website
  5. Update regularly to reflect changes and cancellations
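
Steps 2 and 3 might look like the sketch below. The raw input shape, the `FORMAT_ALIASES` mapping, and the `"%Y-%m-%d %H:%M"` timestamp format are assumptions for illustration; each source needs its own mapping in practice. The same movie at the same time in a different cinema is a separate screening, so the deduplication key includes the chain and location.

```python
from datetime import datetime

# Assumed spellings seen across sources, mapped to one canonical label.
FORMAT_ALIASES = {"2d": "2D", "digital": "2D", "3d": "3D",
                  "imax": "IMAX", "imax 2d": "IMAX"}


def normalize(raw):
    """Convert one raw scraped row into a consistent record."""
    fmt = raw["format"].strip()
    return {
        "movie_title": " ".join(raw["movie_title"].split()),  # collapse whitespace
        "cinema_chain": raw["cinema_chain"].strip(),
        "cinema_location": raw["cinema_location"].strip(),
        "showtime": datetime.strptime(raw["showtime"], "%Y-%m-%d %H:%M"),
        "format": FORMAT_ALIASES.get(fmt.lower(), fmt.upper()),
    }


def dedupe(records):
    """Keep the first record for each distinct screening."""
    seen = set()
    unique = []
    for record in records:
        key = (record["movie_title"].casefold(), record["cinema_chain"],
               record["cinema_location"], record["showtime"], record["format"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Comparing titles with `casefold()` only inside the key leaves the displayed title untouched while still matching rows that differ in capitalization.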

Cinema Industry Analytics

Produce industry reports using scraped data:

  • Screen utilization: How efficiently cinemas use their screens
  • Genre analysis: What types of films dominate each market
  • Format adoption: Growth of IMAX, 4DX, and premium formats
  • Pricing trends: How ticket prices change over time
  • Market competition: Cinema chain market share analysis

Price Optimization Research

Analyze pricing strategies across markets:

  • Compare ticket prices across cinema chains in the same city
  • Track price differences between countries for the same films
  • Analyze the premium charged for IMAX, 3D, and other formats
  • Study dynamic pricing adoption among cinema operators
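
As one example, the format premium can be estimated by comparing median prices per format against a 2D baseline. This is a rough sketch that assumes all records share one currency and market; the `format_premiums` helper is an illustrative name.

```python
from collections import defaultdict
from statistics import median


def format_premiums(records, baseline="2D"):
    """Return each format's median price premium over the baseline, in percent."""
    prices = defaultdict(list)
    for record in records:
        prices[record["format"]].append(record["price_standard"])
    medians = {fmt: median(values) for fmt, values in prices.items()}
    base = medians[baseline]  # raises KeyError if no baseline records exist
    return {fmt: round((value / base - 1) * 100, 1)
            for fmt, value in medians.items() if fmt != baseline}
```

The median is used instead of the mean so that a few unusually priced screenings, such as special events, do not skew the comparison.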

Handling Anti-Scraping Measures

Rate Limiting

Cinema websites limit request frequency:

  • Use DataResearchTools rotating proxies to distribute requests
  • Implement delays of 3-5 seconds between requests per IP
  • Schedule scraping during off-peak hours
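
The per-IP delay and the rotation can be handled together by tracking when each proxy is next allowed to send a request. The `ProxyPacer` class below is a hypothetical helper, not part of any proxy provider's SDK; it takes the current time as an argument so it can be driven by `time.monotonic()` in production and by fixed values in tests.

```python
import random


class ProxyPacer:
    """Pick the proxy that frees up soonest and enforce a per-proxy delay."""

    def __init__(self, proxies, min_delay=3.0, max_delay=5.0):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.next_free = {proxy: 0.0 for proxy in proxies}

    def acquire(self, now):
        """Return (proxy, wait_seconds) for the next request.

        The caller should sleep for wait_seconds before using the proxy.
        """
        proxy = min(self.next_free, key=self.next_free.get)
        wait = max(0.0, self.next_free[proxy] - now)
        # Randomize the gap within the 3-5 second window so requests are not evenly spaced
        delay = random.uniform(self.min_delay, self.max_delay)
        self.next_free[proxy] = now + wait + delay
        return proxy, wait
```

A typical loop would call `proxy, wait = pacer.acquire(time.monotonic())`, then `time.sleep(wait)` before sending the request through `proxy`. With more proxies in the pool the wait approaches zero, since another proxy is usually already free.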

Geographic Restrictions

Cinema websites often restrict access by country:

  • Use country-specific DataResearchTools mobile proxies
  • Match browser settings to proxy location
  • Handle region-specific cookie requirements
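
Matching browser settings to the proxy location mostly means aligning the locale and timezone with the exit country. The sketch below builds keyword arguments for Playwright's `browser.new_context()`, which accepts `locale`, `timezone_id`, and `proxy`. The specific locale per country is an assumption (several of these countries are multilingual), as is using Jakarta time for all of Indonesia.

```python
# Assumed locale and IANA timezone per proxy country code.
GEO_PROFILES = {
    "SG": {"locale": "en-SG", "timezone_id": "Asia/Singapore"},
    "TH": {"locale": "th-TH", "timezone_id": "Asia/Bangkok"},
    "MY": {"locale": "ms-MY", "timezone_id": "Asia/Kuala_Lumpur"},
    "ID": {"locale": "id-ID", "timezone_id": "Asia/Jakarta"},
    "PH": {"locale": "en-PH", "timezone_id": "Asia/Manila"},
    "VN": {"locale": "vi-VN", "timezone_id": "Asia/Ho_Chi_Minh"},
}


def context_options(country_code, proxy_server):
    """Build browser-context options that agree with the proxy's country."""
    profile = GEO_PROFILES[country_code]
    return {
        "locale": profile["locale"],
        "timezone_id": profile["timezone_id"],
        "proxy": {"server": proxy_server},
    }
```

With Playwright this would be used as `context = await browser.new_context(**context_options("TH", "http://proxy.example:8000"))`, where the proxy address is a placeholder. A Thai IP paired with a mismatched timezone and language is an easy inconsistency for fingerprinting scripts to detect.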

CAPTCHA and Bot Detection

For cinemas with stronger protections:

  • Mobile proxies can reduce how often CAPTCHAs are triggered, since mobile carrier IPs are shared by many real users
  • Implement CAPTCHA-solving services as a fallback
  • Use browser automation with realistic fingerprints

Conclusion

Movie theater showtime and seat availability data is a valuable resource for app developers, market researchers, distributors, and cinema operators. Scraping this data across the fragmented SEA cinema landscape requires reliable proxy infrastructure with geographic coverage in each target country.

DataResearchTools mobile proxies provide the country-specific access and high trust scores needed to scrape cinema platforms across Singapore, Thailand, Malaysia, Indonesia, the Philippines, and Vietnam. By building structured scraping pipelines with proper proxy rotation, you can compile comprehensive cinema datasets for analysis, product development, and market intelligence applications.

