How to Scrape Google Maps Business Listings Using Residential Proxies

Google Maps contains an extraordinary wealth of business data — names, addresses, phone numbers, operating hours, reviews, and ratings for millions of businesses worldwide. This data powers lead generation, market research, competitive analysis, and location intelligence for companies across every industry.

Extracting this data at scale, however, presents significant technical challenges. Google Maps relies heavily on JavaScript rendering, dynamic loading, and aggressive anti-bot protections. In this guide, we build a complete Python scraper using Selenium and residential proxies to extract business listings reliably.

Why Google Maps Scraping Requires Proxies

Google is arguably the most sophisticated anti-bot operator on the internet. Its detection systems examine:

  • Request frequency and patterns: Rapid successive searches trigger immediate blocks.
  • IP reputation scoring: Known datacenter IP ranges are flagged instantly.
  • Browser fingerprinting: JavaScript-based checks verify browser authenticity.
  • Behavioral analysis: Mouse movements, scroll patterns, and click timing are analyzed.

Without proxies, your scraper will encounter CAPTCHAs, rate limits, or outright IP bans within a handful of requests. Residential proxies are effective because blocking them wholesale would also block the real households behind those ISP-assigned addresses; mobile proxies go a step further, since carrier-grade NAT puts thousands of legitimate users behind each IP.

Setting Up Your Environment

Install the required packages:

pip install selenium webdriver-manager pandas

You also need Chrome or Chromium installed on your system. The webdriver-manager package handles ChromeDriver installation automatically.

Building the Google Maps Scraper

Step 1: Configure Selenium with Proxy Support

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager
import time
import random
import json
import pandas as pd

# Proxy configuration
PROXY_HOST = "proxy.dataresearchtools.com"
PROXY_PORT = "8080"
PROXY_USER = "your_username"
PROXY_PASS = "your_password"

def create_driver(proxy=True):
    """Create a Selenium WebDriver with proxy and stealth settings."""
    options = Options()
    options.add_argument("--headless=new")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--disable-blink-features=AutomationControlled")
    options.add_argument("--window-size=1920,1080")
    options.add_argument(
        "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    )

    if proxy:
        # NOTE: --proxy-server does not accept credentials. For an
        # authenticated proxy, use a credential-injecting Chrome extension
        # or a tool such as selenium-wire, or whitelist your IP address
        # with the provider so no username/password is needed.
        options.add_argument(f"--proxy-server=http://{PROXY_HOST}:{PROXY_PORT}")

    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option("useAutomationExtension", False)

    service = Service(ChromeDriverManager().install())
    driver = webdriver.Chrome(service=service, options=options)

    # Remove webdriver detection flags
    driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument",
        {"source": "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"},
    )

    return driver

Step 2: Search for Businesses

from urllib.parse import quote_plus

def search_google_maps(driver, query, location=""):
    """Perform a search on Google Maps and wait for results to load."""
    search_term = f"{query} in {location}" if location else query
    # Percent-encode the term so commas and special characters survive the URL
    search_url = f"https://www.google.com/maps/search/{quote_plus(search_term)}"

    driver.get(search_url)
    time.sleep(random.uniform(3, 6))

    # Wait for results panel to load
    try:
        WebDriverWait(driver, 15).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div[role='feed']"))
        )
    except Exception:
        print("Results panel did not load in time")
        return False
    return True

Step 3: Scroll and Collect All Results

def scroll_results(driver, max_results=50):
    """Scroll through the results panel to load more listings."""
    results_panel = driver.find_element(By.CSS_SELECTOR, "div[role='feed']")
    last_height = 0
    loaded_count = 0

    while loaded_count < max_results:
        # Scroll the results panel
        driver.execute_script(
            "arguments[0].scrollTop = arguments[0].scrollHeight", results_panel
        )
        time.sleep(random.uniform(1.5, 3))

        # Check if we reached the end
        new_height = driver.execute_script(
            "return arguments[0].scrollHeight", results_panel
        )

        # Count current results
        items = driver.find_elements(By.CSS_SELECTOR, "div[role='feed'] > div > div[jsaction]")
        loaded_count = len(items)
        print(f"Loaded {loaded_count} results...")

        if new_height == last_height:
            # Check for "end of results" indicator
            end_text = driver.find_elements(By.XPATH, "//*[contains(text(), 'reached the end')]")
            if end_text:
                print("Reached end of results")
                break
            # Try one more scroll
            time.sleep(2)
            final_height = driver.execute_script(
                "return arguments[0].scrollHeight", results_panel
            )
            if final_height == new_height:
                break

        last_height = new_height

    return loaded_count

Step 4: Extract Business Details

def extract_business_details(driver, listing_element):
    """Click on a listing and extract detailed business information."""
    try:
        listing_element.click()
        time.sleep(random.uniform(2, 4))

        # Wait for details panel
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "h1.DUwDvf"))
        )

        business = {}

        # Business name (NOTE: Google's obfuscated class names such as
        # DUwDvf and F7nice change periodically; update selectors when they break)
        name_el = driver.find_elements(By.CSS_SELECTOR, "h1.DUwDvf")
        business["name"] = name_el[0].text if name_el else None

        # Rating
        rating_el = driver.find_elements(By.CSS_SELECTOR, "div.F7nice span[aria-hidden='true']")
        business["rating"] = rating_el[0].text if rating_el else None

        # Review count
        review_el = driver.find_elements(By.CSS_SELECTOR, "div.F7nice span[aria-label*='reviews']")
        if review_el:
            review_text = review_el[0].get_attribute("aria-label")
            business["review_count"] = review_text.split()[0] if review_text else None
        else:
            business["review_count"] = None

        # Address
        address_el = driver.find_elements(
            By.CSS_SELECTOR, "button[data-item-id='address'] div.fontBodyMedium"
        )
        business["address"] = address_el[0].text if address_el else None

        # Phone number
        phone_el = driver.find_elements(
            By.CSS_SELECTOR, "button[data-item-id*='phone'] div.fontBodyMedium"
        )
        business["phone"] = phone_el[0].text if phone_el else None

        # Website
        website_el = driver.find_elements(
            By.CSS_SELECTOR, "a[data-item-id='authority'] div.fontBodyMedium"
        )
        business["website"] = website_el[0].text if website_el else None

        # Category
        category_el = driver.find_elements(By.CSS_SELECTOR, "button.DkEaL")
        business["category"] = category_el[0].text if category_el else None

        # Hours (if visible)
        hours_el = driver.find_elements(
            By.CSS_SELECTOR, "div.t39EBf.GUrTXd[aria-label*='hours']"
        )
        if hours_el:
            business["hours_status"] = hours_el[0].get_attribute("aria-label")
        else:
            business["hours_status"] = None

        # Coordinates from the URL (the map center, which approximates the business location)
        current_url = driver.current_url
        if "@" in current_url:
            coords_part = current_url.split("@")[1].split(",")
            if len(coords_part) >= 2:
                business["latitude"] = coords_part[0]
                business["longitude"] = coords_part[1]

        return business

    except Exception as e:
        print(f"Error extracting details: {e}")
        return None
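
The extractor above returns every field as a raw string ("4.5", "1,234"). A small post-processing step converts them to numeric types before saving. The normalize_business helper below is a suggested addition, not part of the scraper itself, and the locale handling is a simple heuristic:

```python
def normalize_business(business):
    """Convert raw string fields from the scraper to numeric types in place."""
    if business.get("rating"):
        try:
            # Some locales use a comma as the decimal separator
            business["rating"] = float(business["rating"].replace(",", "."))
        except ValueError:
            business["rating"] = None
    if business.get("review_count"):
        # Review counts arrive as "1,234" or "(1,234)"; keep only the digits
        digits = "".join(ch for ch in business["review_count"] if ch.isdigit())
        business["review_count"] = int(digits) if digits else None
    return business
```

Calling this on each extracted record keeps the JSON and CSV outputs consistently typed.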

Step 5: Run the Complete Scraper

def scrape_google_maps_listings(query, location, max_results=30):
    """Main function to scrape Google Maps listings."""
    driver = create_driver(proxy=True)
    all_businesses = []

    try:
        if not search_google_maps(driver, query, location):
            print("Search failed")
            return []

        scroll_results(driver, max_results)

        # Get all listing elements
        listings = driver.find_elements(
            By.CSS_SELECTOR, "div[role='feed'] > div > div[jsaction]"
        )
        print(f"Found {len(listings)} listings to process")

        total = min(len(listings), max_results)
        for i in range(total):
            print(f"Processing listing {i + 1}/{total}")

            # Re-fetch listings on every pass: navigating back invalidates
            # the element references collected earlier (StaleElementReference)
            listings = driver.find_elements(
                By.CSS_SELECTOR, "div[role='feed'] > div > div[jsaction]"
            )
            if i >= len(listings):
                break

            business = extract_business_details(driver, listings[i])
            if business and business.get("name"):
                all_businesses.append(business)
                print(f"  Extracted: {business['name']}")

            # Navigate back to the results list
            driver.back()
            time.sleep(random.uniform(1, 2.5))

    finally:
        driver.quit()

    return all_businesses


def main():
    results = scrape_google_maps_listings(
        query="restaurants",
        location="San Francisco, CA",
        max_results=20,
    )

    # Save as JSON
    with open("google_maps_businesses.json", "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2, ensure_ascii=False)

    # Save as CSV
    df = pd.DataFrame(results)
    df.to_csv("google_maps_businesses.csv", index=False)
    print(f"Saved {len(results)} businesses")


if __name__ == "__main__":
    main()

Handling Google’s Anti-Bot Measures

Google Maps scraping demands more careful anti-detection strategies than most websites. Here are the essential techniques:

Browser Fingerprint Consistency

Google analyzes your browser’s JavaScript environment extensively. Make sure your headless browser has consistent properties — screen resolution, timezone, language settings, and WebGL renderer should all align with what a real user in your proxy’s location would have.
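
A small lookup table can keep locale settings aligned with the proxy's exit country. The mapping below is illustrative only (the country codes and values are assumptions, not a complete list):

```python
# Illustrative mapping from proxy exit country to consistent browser settings
LOCALE_PROFILES = {
    "US": {"timezone": "America/New_York", "lang": "en-US"},
    "GB": {"timezone": "Europe/London", "lang": "en-GB"},
    "DE": {"timezone": "Europe/Berlin", "lang": "de-DE"},
}

def locale_for(country_code):
    """Return browser locale settings matching the proxy's country (default US)."""
    return LOCALE_PROFILES.get(country_code.upper(), LOCALE_PROFILES["US"])
```

The language can be applied with options.add_argument(f"--lang={profile['lang']}") before launching Chrome, and the timezone with driver.execute_cdp_cmd("Emulation.setTimezoneOverride", {"timezoneId": profile["timezone"]}) after.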

Human-Like Interaction Patterns

Programmatic scrolling at perfectly regular intervals is a dead giveaway. Add random variations to your scroll speeds, click positions, and wait times. Occasional pauses mimicking a user reading results improve your success rate significantly.
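
One way to implement this is a jittered delay helper that occasionally lingers as if reading. human_pause is a hypothetical helper and the timing values are illustrative, not tuned thresholds:

```python
import random
import time

def human_pause(base=2.0, jitter=0.75, read_chance=0.1, read_pause=(4.0, 10.0)):
    """Sleep for a randomized interval; occasionally linger as if reading."""
    delay = random.uniform(base - jitter, base + jitter)
    if random.random() < read_chance:
        # Simulate a user pausing to read the results
        delay += random.uniform(*read_pause)
    time.sleep(delay)
    return delay
```

Calling human_pause() between clicks and scrolls replaces the fixed-range time.sleep() intervals used in the examples above.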

Proxy Rotation Strategy

For Google Maps, rotating your proxy IP every 15-20 requests is a good baseline. If you encounter CAPTCHAs, reduce this number. Using residential proxies with geographic targeting lets you appear as a local user, which increases trust scores.
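
A minimal rotation helper might look like the sketch below. ProxyRotator and the endpoint URLs are illustrative; many providers also rotate server-side via session parameters embedded in the proxy username:

```python
import itertools

# Hypothetical proxy endpoints -- substitute your provider's gateway addresses
PROXIES = [
    "http://user:pass@gw1.example-proxy.com:8080",
    "http://user:pass@gw2.example-proxy.com:8080",
    "http://user:pass@gw3.example-proxy.com:8080",
]

class ProxyRotator:
    """Hand out the same proxy for a fixed number of requests, then rotate."""

    def __init__(self, proxies, requests_per_proxy=15):
        self._cycle = itertools.cycle(proxies)
        self.requests_per_proxy = requests_per_proxy
        self._count = 0
        self._current = next(self._cycle)

    def get(self):
        if self._count >= self.requests_per_proxy:
            self._current = next(self._cycle)
            self._count = 0
        self._count += 1
        return self._current
```

Each time the rotator switches endpoints, create a fresh driver with the new proxy rather than reusing the old browser session.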

Session Isolation

Each scraping session should use a fresh browser profile. Cookies, local storage, and cached data from previous sessions can create detectable patterns.
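
With Chrome, one way to guarantee this is a throwaway user-data directory per session. fresh_profile is a hypothetical helper, not part of the scraper above:

```python
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def fresh_profile():
    """Yield a throwaway Chrome profile directory, removed when the session ends."""
    profile_dir = tempfile.mkdtemp(prefix="gmaps_profile_")
    try:
        yield profile_dir
    finally:
        shutil.rmtree(profile_dir, ignore_errors=True)
```

Inside the with block, pass the directory to Chrome with options.add_argument(f"--user-data-dir={profile_dir}") before creating the driver, so cookies and local storage never survive between sessions.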

Structuring and Storing the Data

Business listing data from Google Maps fits naturally into a relational structure. For ongoing data collection, consider using PostgreSQL:

import psycopg2
from datetime import datetime

def save_to_database(businesses, search_query, search_location):
    """Save extracted businesses to PostgreSQL."""
    conn = psycopg2.connect(
        host="localhost",
        database="maps_data",
        user="scraper",
        password="your_db_password",
    )
    cursor = conn.cursor()

    for biz in businesses:
        cursor.execute(
            """
            INSERT INTO businesses (name, address, phone, website, rating,
                review_count, category, latitude, longitude,
                search_query, search_location, scraped_at)
            VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
            ON CONFLICT (name, address) DO UPDATE SET
                phone = EXCLUDED.phone,
                rating = EXCLUDED.rating,
                review_count = EXCLUDED.review_count,
                scraped_at = EXCLUDED.scraped_at
            """,
            (
                biz.get("name"), biz.get("address"), biz.get("phone"),
                biz.get("website"), biz.get("rating"), biz.get("review_count"),
                biz.get("category"), biz.get("latitude"), biz.get("longitude"),
                search_query, search_location, datetime.utcnow(),
            ),
        )

    conn.commit()
    cursor.close()
    conn.close()
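
The ON CONFLICT (name, address) clause assumes a unique constraint on those columns. A matching schema might look like this (column types are a suggestion; tighten them to your needs, e.g. NUMERIC for rating and coordinates):

```sql
CREATE TABLE IF NOT EXISTS businesses (
    id              SERIAL PRIMARY KEY,
    name            TEXT NOT NULL,
    address         TEXT,
    phone           TEXT,
    website         TEXT,
    rating          TEXT,
    review_count    TEXT,
    category        TEXT,
    latitude        TEXT,
    longitude       TEXT,
    search_query    TEXT,
    search_location TEXT,
    scraped_at      TIMESTAMPTZ,
    UNIQUE (name, address)
);
```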

Use Cases for Google Maps Data

The business data you extract from Google Maps serves numerous applications:

  • Lead generation: Build targeted prospect lists for sales outreach by scraping businesses in specific industries and locations.
  • Market research: Analyze competitor density, pricing patterns, and customer sentiment across geographic regions.
  • Local SEO auditing: Monitor how businesses appear in local search results, track rating changes, and identify SEO optimization opportunities.
  • Location intelligence: Map business distributions to identify underserved markets or ideal locations for new ventures.
  • Reputation monitoring: Track review scores and counts over time for competitive benchmarking.

Performance Optimization Tips

When scaling your Google Maps scraper to collect thousands of listings:

  1. Parallel browser instances: Run multiple Selenium instances, each with its own proxy, to scrape different queries simultaneously.
  2. Selective detail extraction: Not every listing needs full details. Scrape the overview first, then fetch details only for listings that match your criteria.
  3. Geographic partitioning: Break large areas into smaller grid cells and search each cell separately to ensure complete coverage.
  4. Incremental scraping: Store previously scraped businesses and skip them in subsequent runs to avoid redundant work.
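
The geographic-partitioning idea (tip 3) can be sketched as a generator of cell center points. grid_cells is a hypothetical helper, and the 0.05° default is an arbitrary cell size (roughly 5 km of latitude):

```python
def grid_cells(lat_min, lat_max, lng_min, lng_max, step=0.05):
    """Split a bounding box into the center points of step-sized grid cells."""
    cells = []
    lat = lat_min + step / 2
    while lat < lat_max:
        lng = lng_min + step / 2
        while lng < lng_max:
            cells.append((round(lat, 6), round(lng, 6)))
            lng += step
        lat += step
    return cells
```

Each center point can then seed its own Maps search, for example by appending a viewport of the form @{lat},{lng},15z to the search URL so results cluster around that cell.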

Legal and Ethical Boundaries

Google Maps data scraping operates in a legal gray area. Important considerations include:

  • Google’s Terms of Service prohibit automated access to their services.
  • The data itself (business names, addresses, phone numbers) is factual and generally not copyrightable.
  • Some jurisdictions have specific regulations about collecting and storing business contact information.
  • Always use the data responsibly and avoid overwhelming Google’s infrastructure.

Conclusion

Scraping Google Maps business listings is a powerful technique for lead generation, market research, and competitive analysis. The combination of Selenium for JavaScript rendering and rotating residential proxies for anti-detection makes this approach reliable and scalable.

For more information on proxy-powered web scraping techniques, explore our other tutorials. If you are new to proxy terminology, our proxy glossary covers all the technical concepts referenced in this guide.

