Best Proxies for Google Maps Business Lead Extraction

Google Maps is one of the richest sources of local business data available. Every business listing contains a company name, address, phone number, website, operating hours, reviews, and often category information. For B2B sales teams targeting local businesses — restaurants, dental offices, HVAC companies, law firms, real estate agencies — Google Maps provides a near-complete directory of potential clients.

The challenge is extracting this data at scale. Google aggressively protects Maps from automated scraping, employing CAPTCHA challenges, IP blocking, and request throttling. Mobile proxies are essential for maintaining reliable access because Google treats mobile carrier IPs as legitimate user traffic.

Why Google Maps Is Valuable for Lead Generation

Google Maps contains data that no other single source matches:

  • 150+ million business listings across virtually every industry and geography
  • Verified contact information — phone numbers and websites are typically accurate
  • Real-time data — listings are updated continuously by business owners
  • Review data — review counts and ratings indicate business maturity and reputation
  • Category taxonomy — precise business classification for targeting

A single Google Maps scraping session can yield thousands of local business leads that would take weeks to compile manually.

Proxy Requirements for Google Maps

Google’s anti-bot system for Maps is among the most sophisticated on the web. Here is what you need:

Mobile Proxies Over Residential

Residential proxies work for some Google services, but Maps specifically tracks proxy provider IP ranges and blocks them at higher rates. Mobile proxies from real carrier networks consistently outperform residential alternatives.

Geo-Targeted IPs

Google Maps results are localized. When scraping businesses in Houston, Texas, use a US mobile proxy — ideally one from the Texas region. This ensures search results match what a real local user would see.
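One way to keep proxy geography and browser fingerprint consistent is a per-city profile that pairs the proxy's region with matching browser context settings. A minimal sketch — `GEO_PROFILES`, `context_options`, and the region codes are illustrative names, not a specific provider's API:

```python
# Hypothetical geo profiles pairing a proxy region with matching browser settings
GEO_PROFILES = {
    "houston": {
        "proxy_region": "US-TX",
        "timezone_id": "America/Chicago",
        "geolocation": {"latitude": 29.7604, "longitude": -95.3698},
    },
    "new_york": {
        "proxy_region": "US-NY",
        "timezone_id": "America/New_York",
        "geolocation": {"latitude": 40.7128, "longitude": -74.0060},
    },
}

def context_options(city):
    """Build Playwright browser-context kwargs that match the proxy's region."""
    profile = GEO_PROFILES[city]
    return {
        "locale": "en-US",
        "timezone_id": profile["timezone_id"],
        "geolocation": profile["geolocation"],
        "permissions": ["geolocation"],
    }
```

Keeping timezone, geolocation, and proxy region in one place avoids the mismatch (Texas IP, New York timezone) that anti-bot systems check for.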

Low Concurrency

Google Maps is particularly sensitive to high request volumes. Limit concurrent requests to 3-5 per proxy IP, with 5-10 second delays between requests.
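These limits can be enforced in code with a per-proxy semaphore. A minimal sketch — the wrapped coroutine is whatever function performs the actual request:

```python
import asyncio
import random

class ProxyThrottle:
    """Cap concurrent requests per proxy IP and pause between them."""

    def __init__(self, max_concurrent=4, min_delay=5, max_delay=10):
        self.semaphore = asyncio.Semaphore(max_concurrent)
        self.min_delay = min_delay
        self.max_delay = max_delay

    async def run(self, coro_fn, *args):
        async with self.semaphore:
            result = await coro_fn(*args)
            # Hold the slot for 5-10 seconds before releasing it
            await asyncio.sleep(random.uniform(self.min_delay, self.max_delay))
            return result
```

Create one `ProxyThrottle` per proxy IP and route every request for that IP through `run()`; the semaphore bounds concurrency while the sleep enforces the inter-request delay.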

Technical Implementation

Google Maps API Alternative

Before scraping the Maps web interface, consider the official Google Places API. It has legitimate rate limits and costs ($17 per 1,000 requests for Place Details), which may be acceptable for small-scale needs:

import requests

def google_places_search(query, api_key, location, radius=5000):
    """Search Google Places API (official method)"""
    url = "https://maps.googleapis.com/maps/api/place/textsearch/json"
    params = {
        "query": query,
        "location": location,  # "29.7604,-95.3698" for Houston
        "radius": radius,
        "key": api_key,
    }
    response = requests.get(url, params=params)
    return response.json().get("results", [])

For large-scale extraction where API costs become prohibitive, web scraping with proxies is the practical alternative.
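To gauge where the API stops being economical, a quick cost helper (my own arithmetic, using the Place Details rate quoted above; Google's pricing may change):

```python
def places_api_cost(num_details, rate_per_1000=17.00):
    """Estimate Google Places Details cost at a given per-1,000-request rate."""
    return num_details / 1000 * rate_per_1000

# 50,000 detail lookups at $17 per 1,000 comes to $850
```

At tens of thousands of lookups per campaign, the API bill quickly exceeds the cost of a proxy-based scraping setup.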

Scraping Google Maps Search Results

Google Maps renders results via JavaScript, so you need browser automation:

from playwright.async_api import async_playwright
import asyncio
import random

async def scrape_google_maps(query, proxy_config, max_results=100):
    """Scrape Google Maps search results"""
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=False,  # headless mode is more readily flagged by Google
            proxy={
                "server": proxy_config["server"],
                "username": proxy_config["username"],
                "password": proxy_config["password"],
            }
        )

        context = await browser.new_context(
            viewport={"width": 1920, "height": 1080},
            locale="en-US",
            timezone_id="America/Chicago",
            geolocation={"latitude": 29.7604, "longitude": -95.3698},
            permissions=["geolocation"],
        )

        page = await context.new_page()
        search_url = f"https://www.google.com/maps/search/{query.replace(' ', '+')}"
        await page.goto(search_url)
        await page.wait_for_timeout(random.randint(3000, 6000))

        businesses = []
        scroll_container = await page.query_selector('[role="feed"]')
        stalled_rounds = 0

        while len(businesses) < max_results:
            # Scroll to load more results
            if scroll_container:
                await scroll_container.evaluate(
                    'el => el.scrollTop = el.scrollHeight'
                )
            await page.wait_for_timeout(random.randint(2000, 5000))

            # Check for end of results
            end_marker = await page.query_selector('text="You\'ve reached the end of the list"')
            if end_marker:
                break

            # Extract only the newly loaded results
            previous_count = len(businesses)
            results = await page.query_selector_all('[data-result-index]')
            for result in results[previous_count:]:
                business = await extract_business_card(result)
                if business and business not in businesses:
                    businesses.append(business)

            # Bail out if scrolling has stopped yielding new results
            if len(businesses) == previous_count:
                stalled_rounds += 1
                if stalled_rounds >= 3:
                    break
            else:
                stalled_rounds = 0

        await browser.close()
        return businesses


async def extract_business_card(element):
    """Extract business data from a Maps result card"""
    try:
        business = {}

        # Business name
        name_el = await element.query_selector('[class*="fontHeadlineSmall"]')
        business['name'] = await name_el.inner_text() if name_el else None

        # Rating and review count
        rating_el = await element.query_selector('[class*="fontBodyMedium"] span[role="img"]')
        if rating_el:
            aria_label = await rating_el.get_attribute('aria-label')
            if aria_label:
                parts = aria_label.split()
                business['rating'] = float(parts[0]) if parts else None

        # Business type and address
        info_els = await element.query_selector_all('[class*="fontBodyMedium"] > span')
        texts = []
        for el in info_els:
            text = await el.inner_text()
            texts.append(text.strip())
        if texts:
            business['category'] = texts[0] if len(texts) > 0 else None
            business['address'] = texts[-1] if len(texts) > 1 else None

        return business if business.get('name') else None
    except Exception:
        return None

Extracting Detailed Business Information

For each business in your search results, click through to get full details:

import random
import re

async def get_business_details(page, business_element, proxy_config):
    """Click a business result and extract full details"""
    await business_element.click()
    await page.wait_for_timeout(random.randint(3000, 6000))

    details = {}

    # Phone number
    phone_el = await page.query_selector('[data-item-id*="phone"]')
    if phone_el:
        details['phone'] = await phone_el.inner_text()

    # Website
    website_el = await page.query_selector('[data-item-id="authority"]')
    if website_el:
        details['website'] = await website_el.get_attribute('href')

    # Address
    address_el = await page.query_selector('[data-item-id*="address"]')
    if address_el:
        details['address'] = await address_el.inner_text()

    # Hours
    hours_el = await page.query_selector('[data-item-id*="oh"]')
    if hours_el:
        details['hours'] = await hours_el.inner_text()

    # Review count
    review_el = await page.query_selector('button[jsaction*="review"]')
    if review_el:
        review_text = await review_el.inner_text()
        numbers = re.findall(r'[\d,]+', review_text)
        if numbers:
            details['review_count'] = int(numbers[0].replace(',', ''))

    return details

Anti-Detection Strategies

Google Maps scraping requires careful anti-detection measures. Understanding concepts like IP rotation and fingerprinting is essential for long-term success.
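IP rotation can be as simple as round-robin over a pool, retiring IPs that start drawing CAPTCHAs. A minimal sketch — the proxy dicts follow the same `server`/credentials shape used in the Playwright examples above:

```python
import itertools

class ProxyRotator:
    """Round-robin over a pool of proxy configs, skipping burned IPs."""

    def __init__(self, proxies):
        self.proxies = list(proxies)
        self.burned = set()
        self._cycle = itertools.cycle(self.proxies)

    def next_proxy(self):
        # Try at most one full pass over the pool
        for _ in range(len(self.proxies)):
            proxy = next(self._cycle)
            if proxy["server"] not in self.burned:
                return proxy
        raise RuntimeError("All proxies in the pool are burned")

    def mark_burned(self, proxy):
        """Retire a proxy that triggered a CAPTCHA or block."""
        self.burned.add(proxy["server"])
```

Call `mark_burned()` whenever CAPTCHA detection fires, and `next_proxy()` when launching a fresh browser context.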

Request Pacing

import random
import time

class GoogleMapsPacer:
    """Control request pacing for Google Maps scraping"""

    def __init__(self):
        self.request_count = 0
        self.session_start = time.time()

    async def pace(self, page):
        """Apply human-like pacing between actions"""
        self.request_count += 1

        # Base delay
        delay = random.uniform(5, 10)

        # Longer pauses every 10-15 actions (simulating reading)
        if self.request_count % random.randint(10, 15) == 0:
            delay = random.uniform(30, 60)

        # Session break every 50-80 actions
        if self.request_count % random.randint(50, 80) == 0:
            delay = random.uniform(120, 300)  # 2-5 minute break

        await page.wait_for_timeout(int(delay * 1000))

        # Occasionally perform non-scraping actions
        if random.random() < 0.1:
            await self.random_map_interaction(page)

    async def random_map_interaction(self, page):
        """Simulate natural map browsing behavior"""
        actions = [
            lambda: page.mouse.wheel(0, random.randint(-300, 300)),
            lambda: page.mouse.click(
                random.randint(400, 1200),
                random.randint(200, 800)
            ),
        ]
        action = random.choice(actions)
        await action()
        await page.wait_for_timeout(random.randint(1000, 3000))

CAPTCHA Handling

Google Maps will present CAPTCHAs after sustained scraping. Implement detection and response:

async def check_captcha(page):
    """Detect if Google is showing a CAPTCHA"""
    captcha_indicators = [
        'text="I\'m not a robot"',
        '[class*="captcha"]',
        'iframe[src*="recaptcha"]',
    ]

    for selector in captcha_indicators:
        element = await page.query_selector(selector)
        if element:
            return True
    return False

async def handle_captcha(page, strategy="pause"):
    """Handle CAPTCHA detection"""
    if strategy == "pause":
        # Pause and alert for manual solving
        print("CAPTCHA detected - pausing for 5 minutes")
        await page.wait_for_timeout(300000)
    elif strategy == "rotate":
        # Rotate to new proxy IP and retry
        return "rotate_proxy"
    elif strategy == "service":
        # Send to CAPTCHA solving service
        return "solve_captcha"

Structuring Extracted Data

Clean and structure your Google Maps data for CRM import:

import csv
import re

def clean_phone(phone_str):
    """Normalize phone number to E.164-style format"""
    if not phone_str:
        return None
    digits = re.sub(r'[^\d+]', '', phone_str)
    if not digits:
        return None
    if len(digits) == 10:
        return f"+1{digits}"
    return digits if digits.startswith('+') else f"+{digits}"

def clean_address(address_str):
    """Parse address into components"""
    if not address_str:
        return {}
    parts = address_str.split(',')
    result = {"full_address": address_str}
    if len(parts) >= 3:
        result["street"] = parts[0].strip()
        result["city"] = parts[1].strip()
        state_zip = parts[2].strip().split()
        if len(state_zip) >= 2:
            result["state"] = state_zip[0]
            result["zip"] = state_zip[1]
    return result

def export_leads(businesses, filename="google_maps_leads.csv"):
    """Export cleaned leads to CSV"""
    fieldnames = [
        'name', 'category', 'phone', 'website', 'email',
        'address', 'city', 'state', 'zip',
        'rating', 'review_count'
    ]

    with open(filename, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()

        for biz in businesses:
            address = clean_address(biz.get('address'))
            row = {
                'name': biz.get('name'),
                'category': biz.get('category'),
                'phone': clean_phone(biz.get('phone')),
                'website': biz.get('website'),
                'address': address.get('street'),
                'city': address.get('city'),
                'state': address.get('state'),
                'zip': address.get('zip'),
                'rating': biz.get('rating'),
                'review_count': biz.get('review_count'),
            }
            writer.writerow(row)

Scaling Across Multiple Cities

For national lead generation campaigns, run parallel scraping sessions across multiple geographic targets:

async def multi_city_scrape(search_query, cities, proxy_pool):
    """Scrape Google Maps across multiple cities"""
    tasks = []

    for city in cities:
        proxy = proxy_pool.get_proxy(geo=city["state"])
        query = f"{search_query} in {city['name']}, {city['state']}"
        tasks.append(scrape_google_maps(query, proxy, max_results=200))

    results = await asyncio.gather(*tasks)

    # Flatten and deduplicate by phone number
    all_businesses = []
    seen_phones = set()

    for city_results in results:
        for biz in city_results:
            phone = clean_phone(biz.get('phone'))
            if phone and phone not in seen_phones:
                seen_phones.add(phone)
                all_businesses.append(biz)

    return all_businesses

Email Discovery from Google Maps Data

Google Maps listings rarely include email addresses directly. After extracting phone numbers and websites, enrich with email data by visiting company websites through your web scraping proxy infrastructure:

import re
import requests
from urllib.parse import urlparse

def enrich_with_email(business, proxy_url):
    """Visit business website to find email addresses"""
    if not business.get('website'):
        return business

    try:
        response = requests.get(
            business['website'],
            proxies={"http": proxy_url, "https": proxy_url},
            timeout=15,
            headers={"User-Agent": "Mozilla/5.0"}
        )
        emails = re.findall(
            r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}',
            response.text
        )
        # Keep only emails whose domain matches the business website
        domain = urlparse(business['website']).netloc.replace('www.', '')
        business['emails'] = [e for e in set(emails) if domain in e]
    except requests.RequestException:
        business['emails'] = []

    return business
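Website enrichment is I/O-bound, so a small thread pool speeds it up considerably. A sketch — the `enrich_fn` parameter stands in for the enrichment function above, passed in to keep this standalone:

```python
from concurrent.futures import ThreadPoolExecutor

def enrich_all(businesses, proxy_url, enrich_fn, max_workers=5):
    """Run website enrichment for many leads concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with leads
        return list(pool.map(lambda b: enrich_fn(b, proxy_url), businesses))
```

Keep `max_workers` modest — each worker consumes bandwidth through the same proxy, and aggressive parallelism defeats the pacing applied elsewhere.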

Performance Expectations

With a properly configured mobile proxy setup:

Metric                              Expected Value
Businesses per hour                 200-400
Detail pages per hour               100-200
CAPTCHA frequency                   Every 300-500 requests
Data completeness (name + phone)    95%+
Website availability                70-80%
Email discovery rate                40-60% (after website enrichment)
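As a sanity check on these figures (my own back-of-envelope, not a measurement): at the 5-10 second pacing recommended earlier, a session averages around 480 actions per hour, which is consistent with 200-400 businesses per hour once each scroll of the results feed surfaces several listings:

```python
def actions_per_hour(min_delay=5, max_delay=10):
    """Rough actions-per-hour bound at a given inter-request delay range."""
    avg_delay = (min_delay + max_delay) / 2  # 7.5 s at the recommended pacing
    return 3600 / avg_delay
```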

Conclusion

Google Maps is an unmatched source of local business leads, and mobile proxies make large-scale extraction practical. The combination of browser automation, careful pacing, geo-targeted proxies, and post-extraction enrichment creates a pipeline capable of generating thousands of qualified local business leads per day. Start with a single city and business category, validate your data quality, and then expand systematically across geographies and industries.

