How to Scrape Trade Show and Conference Attendee Lists
Trade shows and industry conferences concentrate your ideal prospects in one place. Exhibitor lists, speaker profiles, and attendee directories contain pre-qualified leads — these are companies actively investing in your industry. Manually collecting this data from dozens of event websites is time-consuming, but automated scraping with mobile proxies makes it possible to build comprehensive event-based lead databases.
This guide covers how to extract exhibitor data, speaker information, and attendee details from major event platforms and individual conference websites.
Why Event Data Is High-Quality Lead Data
Event attendees and exhibitors are among the most qualified B2B leads available:
- Active buyers — Companies attending trade shows are actively evaluating solutions.
- Budget confirmed — Exhibitor booth fees range from $5,000 to $100,000+, indicating real purchasing power.
- Decision makers — Conference attendees are typically senior professionals with buying authority.
- Industry verified — Attendance confirms the company operates in your target vertical.
- Timing signal — Companies preparing for shows are often in active procurement cycles.
Event Data Sources
Major Event Platforms
Most trade shows host their exhibitor and session data on standardized event platforms:
| Platform | Common Events | Data Available |
|---|---|---|
| Map Your Show | Large trade shows | Exhibitor list, booth locations, categories |
| Swapcard | Tech conferences | Attendees, speakers, sponsors |
| Whova | Professional conferences | Agenda, speakers, exhibitors |
| Cvent | Corporate events | Exhibitors, sessions |
| Eventbrite | Smaller events | Organizer info, event details |
| a2z/Personify | Industry trade shows | Exhibitors, products, floor plans |
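When you track many events, it helps to know which platform a directory runs on before picking a scraping strategy. A minimal detection sketch, assuming the hostname markers below (they are illustrative guesses, not an exhaustive fingerprint list):

```python
# Hypothetical helper: guess the hosting platform from telltale
# substrings in the directory URL or page HTML. The marker strings
# are assumptions for illustration, not verified fingerprints.
PLATFORM_MARKERS = {
    "Map Your Show": ["mapyourshow.com"],
    "Swapcard": ["swapcard"],
    "Whova": ["whova"],
    "Cvent": ["cvent.com"],
    "Eventbrite": ["eventbrite.com"],
    "a2z/Personify": ["a2zinc.net"],
}

def detect_platform(url, html=""):
    """Return the first platform whose marker appears in the URL or HTML."""
    haystack = (url + " " + html).lower()
    for platform, markers in PLATFORM_MARKERS.items():
        if any(m in haystack for m in markers):
            return platform
    return None
```

Knowing the platform up front lets you reuse one scraper per platform across every show hosted on it, rather than writing a bespoke scraper per event.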
Individual Event Websites
Many major conferences host their own exhibitor directories. Examples include CES, SXSW, Web Summit, Dreamforce, and hundreds of industry-specific shows.
Scraping Exhibitor Directories
Map Your Show / a2z Platform
Many large trade shows use these platforms for their exhibitor directories. The data is typically loaded via JavaScript:
```python
from playwright.async_api import async_playwright
import asyncio
import random
import json

async def scrape_exhibitor_directory(event_url, proxy_config):
    """Scrape exhibitor directory from event website"""
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            proxy=proxy_config,
            headless=False,
        )
        page = await browser.new_page()
        await page.goto(event_url, wait_until="networkidle")
        await page.wait_for_timeout(random.randint(3000, 6000))

        exhibitors = []

        # Many exhibitor directories load via AJAX — intercept API calls
        async def handle_response(response):
            if "exhibitor" in response.url.lower() and response.status == 200:
                try:
                    data = await response.json()
                    if isinstance(data, list):
                        exhibitors.extend(data)
                    elif isinstance(data, dict) and "exhibitors" in data:
                        exhibitors.extend(data["exhibitors"])
                except Exception:
                    pass

        page.on("response", handle_response)

        # Scroll through the directory to trigger all data loads
        await auto_scroll(page)

        # If no API data captured, parse the HTML directly
        if not exhibitors:
            exhibitors = await parse_exhibitor_html(page)

        await browser.close()
        return exhibitors

async def auto_scroll(page, max_scrolls=50):
    """Scroll through a page to trigger lazy loading"""
    for i in range(max_scrolls):
        await page.evaluate("window.scrollBy(0, 500)")
        await page.wait_for_timeout(random.randint(500, 1500))
        # Check if we've reached the bottom
        at_bottom = await page.evaluate(
            "(window.innerHeight + window.scrollY) >= document.body.scrollHeight"
        )
        if at_bottom:
            break

async def parse_exhibitor_html(page):
    """Parse exhibitor data from HTML when API interception fails"""
    exhibitors = []
    cards = await page.query_selector_all('[class*="exhibitor"], [class*="company-card"]')
    for card in cards:
        exhibitor = {}
        name_el = await card.query_selector('h2, h3, [class*="name"]')
        if name_el:
            exhibitor['company_name'] = (await name_el.inner_text()).strip()
        booth_el = await card.query_selector('[class*="booth"]')
        if booth_el:
            exhibitor['booth_number'] = (await booth_el.inner_text()).strip()
        category_els = await card.query_selector_all('[class*="category"], [class*="tag"]')
        exhibitor['categories'] = []
        for cat in category_els:
            exhibitor['categories'].append((await cat.inner_text()).strip())
        desc_el = await card.query_selector('[class*="description"], p')
        if desc_el:
            exhibitor['description'] = (await desc_el.inner_text()).strip()[:500]
        link_el = await card.query_selector('a[href*="http"]')
        if link_el:
            exhibitor['website'] = await link_el.get_attribute('href')
        if exhibitor.get('company_name'):
            exhibitors.append(exhibitor)
    return exhibitors
```

Extracting Exhibitor Detail Pages
Each exhibitor listing often links to a detail page with contact information:
```python
import random
import re

async def scrape_exhibitor_detail(page, detail_url):
    """Scrape full details from an exhibitor's profile page"""
    await page.goto(detail_url, wait_until="networkidle")
    await page.wait_for_timeout(random.randint(2000, 5000))
    details = {}
    # Company description
    desc_el = await page.query_selector('[class*="description"], [class*="about"]')
    if desc_el:
        details['description'] = (await desc_el.inner_text()).strip()
    # Contact info
    contact_el = await page.query_selector('[class*="contact"]')
    if contact_el:
        contact_html = await contact_el.inner_html()
        # Extract email addresses
        emails = re.findall(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', contact_html)
        details['emails'] = list(set(emails))
        # Extract phone numbers
        phones = re.findall(r'[\+]?[(]?[0-9]{1,3}[)]?[-\s\.]?[0-9]{3}[-\s\.]?[0-9]{4,6}', contact_html)
        details['phones'] = list(set(phones))
    # Website
    website_el = await page.query_selector('a[href*="http"][class*="website"]')
    if website_el:
        details['website'] = await website_el.get_attribute('href')
    # Social links
    social_links = await page.query_selector_all('a[href*="linkedin"], a[href*="twitter"]')
    details['social'] = {}
    for link in social_links:
        href = await link.get_attribute('href')
        if 'linkedin' in href:
            details['social']['linkedin'] = href
        elif 'twitter' in href:
            details['social']['twitter'] = href
    # Products/services
    product_els = await page.query_selector_all('[class*="product"]')
    details['products'] = []
    for prod in product_els:
        details['products'].append((await prod.inner_text()).strip())
    return details
```

Scraping Speaker and Session Data
Conference speaker lists contain decision-makers and thought leaders:
```python
async def scrape_speakers(agenda_url, proxy_config):
    """Scrape speaker data from conference agenda"""
    async with async_playwright() as p:
        browser = await p.chromium.launch(proxy=proxy_config)
        page = await browser.new_page()
        await page.goto(agenda_url, wait_until="networkidle")
        await page.wait_for_timeout(random.randint(3000, 6000))
        speakers = []
        speaker_cards = await page.query_selector_all(
            '[class*="speaker"], [class*="presenter"]'
        )
        for card in speaker_cards:
            speaker = {}
            name_el = await card.query_selector('[class*="name"], h3, h4')
            if name_el:
                speaker['name'] = (await name_el.inner_text()).strip()
            title_el = await card.query_selector('[class*="title"], [class*="role"]')
            if title_el:
                speaker['title'] = (await title_el.inner_text()).strip()
            company_el = await card.query_selector('[class*="company"], [class*="org"]')
            if company_el:
                speaker['company'] = (await company_el.inner_text()).strip()
            bio_el = await card.query_selector('[class*="bio"], p')
            if bio_el:
                speaker['bio'] = (await bio_el.inner_text()).strip()[:300]
            img_el = await card.query_selector('img')
            if img_el:
                speaker['photo_url'] = await img_el.get_attribute('src')
            if speaker.get('name'):
                speakers.append(speaker)
        await browser.close()
        return speakers
```

Building an Event Calendar Scraping System
Automate discovery and scraping across multiple events throughout the year. For foundational concepts on proxy rotation used in this system, check our proxy glossary.
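The calendar scraper below calls `proxy_pool.get_next()` without defining the pool. A minimal round-robin sketch that satisfies that interface (the proxy URLs are placeholders, not real endpoints):

```python
import itertools

# Minimal round-robin proxy pool sketch; satisfies the get_next()
# interface used by the calendar scraper. Proxy URLs are placeholders.
class ProxyPool:
    def __init__(self, proxies):
        if not proxies:
            raise ValueError("ProxyPool needs at least one proxy")
        self._cycle = itertools.cycle(proxies)

    def get_next(self):
        """Return the next proxy URL in rotation."""
        return next(self._cycle)

pool = ProxyPool([
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
])
```

Round-robin is the simplest rotation strategy; a production pool would also track per-proxy failures and cool-down periods.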
```python
from datetime import datetime, timedelta

class EventCalendarScraper:
    """Automated scraping across multiple trade shows and conferences"""
    def __init__(self, proxy_pool):
        self.proxy_pool = proxy_pool
        self.events = []
        self.all_leads = []

    def add_event(self, event):
        """Register an event for scraping"""
        event_date = event.get("date")
        self.events.append({
            "name": event["name"],
            "exhibitor_url": event.get("exhibitor_url"),
            "speaker_url": event.get("speaker_url"),
            "event_date": event_date,
            "industry": event.get("industry"),
            "scrape_start": event_date - timedelta(days=30) if event_date else None,
        })

    async def scrape_upcoming_events(self):
        """Scrape all events happening in the next 60 days"""
        today = datetime.now()
        upcoming = [
            e for e in self.events
            if e["event_date"] and today <= e["event_date"] <= today + timedelta(days=60)
        ]
        for event in upcoming:
            proxy = self.proxy_pool.get_next()
            proxy_config = {"server": proxy}
            if event.get("exhibitor_url"):
                exhibitors = await scrape_exhibitor_directory(
                    event["exhibitor_url"],
                    proxy_config
                )
                for ex in exhibitors:
                    ex["event_name"] = event["name"]
                    ex["event_date"] = event["event_date"].isoformat()
                    ex["industry"] = event["industry"]
                    ex["lead_type"] = "exhibitor"
                self.all_leads.extend(exhibitors)
            if event.get("speaker_url"):
                speakers = await scrape_speakers(
                    event["speaker_url"],
                    proxy_config
                )
                for sp in speakers:
                    sp["event_name"] = event["name"]
                    sp["event_date"] = event["event_date"].isoformat()
                    sp["lead_type"] = "speaker"
                self.all_leads.extend(speakers)
        return self.all_leads
```

Enriching Event Leads
Event data provides company names but often lacks direct contact information. Enrich with email and phone data using your web scraping infrastructure:
```python
import re
import requests

async def enrich_exhibitor(exhibitor, proxy_url):
    """Enrich exhibitor data with contact information"""
    enriched = exhibitor.copy()
    if exhibitor.get('website'):
        try:
            response = requests.get(
                exhibitor['website'],
                proxies={"https": proxy_url},
                timeout=15,
                headers={"User-Agent": "Mozilla/5.0"}
            )
            # Extract emails
            emails = re.findall(
                r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}',
                response.text
            )
            enriched['emails'] = list(set(e.lower() for e in emails))
            # Extract phone numbers
            phones = re.findall(
                r'[\+]?[(]?[0-9]{1,3}[)]?[-\s\.]?[0-9]{3}[-\s\.]?[0-9]{4,6}',
                response.text
            )
            enriched['phones'] = list(set(phones))[:3]
        except Exception:
            pass
    return enriched
```

Competitive Intelligence from Event Data
Track which competitors are exhibiting where and what they are promoting:
```python
class CompetitorEventTracker:
    """Track competitor presence across trade shows"""
    def __init__(self):
        self.competitor_data = {}

    def track_competitor(self, competitor_name, event_data):
        """Record competitor's event participation"""
        if competitor_name not in self.competitor_data:
            self.competitor_data[competitor_name] = []
        self.competitor_data[competitor_name].append({
            "event": event_data["event_name"],
            "date": event_data["event_date"],
            "booth_number": event_data.get("booth_number"),
            "products_showcased": event_data.get("products", []),
            "description": event_data.get("description"),
        })

    def get_competitor_report(self, competitor_name):
        """Generate report on competitor's event strategy"""
        events = self.competitor_data.get(competitor_name, [])
        return {
            "competitor": competitor_name,
            "total_events": len(events),
            "events": sorted(events, key=lambda x: x.get("date", ""), reverse=True),
            "products_promoted": list(set(
                p for e in events for p in e.get("products_showcased", [])
            )),
        }
```

Output Formatting
Structure event leads for import into your CRM or outreach tool:
```python
import csv

def export_event_leads(leads, output_file="event_leads.csv"):
    """Export event leads to CSV"""
    fieldnames = [
        'company_name', 'booth_number', 'categories', 'website',
        'emails', 'phones', 'event_name', 'event_date',
        'industry', 'lead_type', 'description'
    ]
    with open(output_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
        writer.writeheader()
        for lead in leads:
            row = lead.copy()
            row['categories'] = '; '.join(lead.get('categories', []))
            row['emails'] = '; '.join(lead.get('emails', []))
            row['phones'] = '; '.join(lead.get('phones', []))
            writer.writerow(row)
    print(f"Exported {len(leads)} event leads to {output_file}")
```

Timing Your Outreach
Event-based leads have a natural outreach timeline:
- 30 days before event — Initial outreach: “We noticed you’re exhibiting at [Event]. Let’s connect beforehand.”
- During event — Real-time engagement if you are also attending.
- 1-7 days after event — Follow-up: “Great seeing [Company] at [Event]. Here’s how we can help…”
- 30 days after event — Value-based follow-up with industry insights.
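The timeline above can be computed directly from each event's date so that outreach tasks land on the calendar automatically. A small sketch (the field names and the 3-day follow-up offset are illustrative choices):

```python
from datetime import date, timedelta

# Sketch: derive the outreach milestones described above from an
# event date. Field names and the 3-day follow-up are assumptions.
def outreach_schedule(event_date):
    return {
        "pre_event": event_date - timedelta(days=30),   # initial outreach
        "event_day": event_date,                        # real-time engagement
        "follow_up": event_date + timedelta(days=3),    # post-event follow-up
        "value_touch": event_date + timedelta(days=30), # value-based touch
    }
```

Feeding these dates into your CRM as tasks keeps the cadence consistent across every event in your calendar.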
Conclusion
Trade show and conference data provides some of the highest-quality B2B leads available — pre-qualified by industry, budget, and buying intent. Automated scraping with mobile proxies lets you build comprehensive databases across dozens of events per year, capturing exhibitor details, speaker profiles, and session data. The key is systematic enrichment of raw event data with contact information from company websites and professional networks. Start with the five largest events in your industry, validate the lead quality through outreach, and expand your event calendar from there.
Related Reading
- How to Build an Automated Lead Scraping Pipeline with Proxies
- Building a B2B Contact Enrichment Pipeline with Mobile Proxies
- How to Scrape Job Listings at Scale with Rotating Proxies
- Proxies for HR Tech: Salary Benchmarking & Talent Intelligence
- aiohttp + BeautifulSoup: Async Python Scraping
- How to Scrape AliExpress Product Data Without Getting Blocked