Pinnacle is the sharpest book on the planet, and its lines are the closest thing to a consensus market price in sports betting. If you’re building a sharp model, closing line value (CLV) analysis, or an arbitrage detector, scraping Pinnacle Sports lines is the starting point — not an afterthought.
Why Pinnacle Lines Are Different
Most sportsbooks shade their lines to protect against sharp action. Pinnacle doesn’t. It accepts high limits from winning players, which means its odds reflect real market consensus faster than any retail book. A line move at Pinnacle is signal; a line move at a square book might just be liability management.
For quants, this means two things. First, Pinnacle closing lines are the standard benchmark for CLV. Second, Pinnacle’s API-like structure makes it one of the more straightforward books to scrape compared to heavily obfuscated platforms like FanDuel or BetMGM, both of which wrap their odds in React SPAs with session token requirements.
Pinnacle’s Actual Data Structure
Pinnacle exposes data through a semi-public REST API at https://guest.api.pinnacle.com. No auth is required for basic odds — you just need to know the endpoint structure and respect rate limits.
Key endpoints:
- `GET /v1/leagues?sportId={id}` — returns all active leagues for a sport
- `GET /v1/fixtures?sportId={id}&leagueId={id}` — returns upcoming events
- `GET /v1/odds?sportId={id}&leagueId={id}&oddsFormat=American` — returns current odds
- `GET /v1/special/odds` — props and specials
Sport IDs you’ll use constantly: soccer = 29, basketball = 4, tennis = 33, baseball = 3, American football = 15.
The odds response is clean JSON. Each event has a matchupId, and the odds object contains moneyline, spread, and total keys with home, away, and draw sub-fields. Crucially, it also returns cutoff (the scheduled start time in ISO 8601) and a limits field that reflects Pinnacle’s maximum accepted bet — a useful proxy for market confidence.
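As a sketch of what consuming one event looks like, assuming the fields described above (`matchupId`, `cutoff`, `limits`, and a `moneyline` object with `home`/`away`/`draw`) — the sample payload and the exact nesting here are illustrative, not a verbatim API response:

```python
from datetime import datetime

# Hypothetical event shaped like the fields described above; in the real
# response these sit nested under leagues -> events.
event = {
    "matchupId": 159,
    "cutoff": "2025-11-02T18:00:00Z",
    "limits": 2000,
    "moneyline": {"home": -135, "away": 150, "draw": None},
}

def parse_event(ev: dict) -> dict:
    """Pull out the fields a model pipeline actually needs."""
    return {
        "matchup_id": ev["matchupId"],
        # fromisoformat on older Pythons needs the Z rewritten as an offset
        "start": datetime.fromisoformat(ev["cutoff"].replace("Z", "+00:00")),
        "max_bet": ev["limits"],  # proxy for market confidence
        "ml_home": ev["moneyline"]["home"],
        "ml_away": ev["moneyline"]["away"],
    }

row = parse_event(event)
```

The `limits` field is worth persisting alongside the prices — a limit increase often precedes the sharpest part of the line's life.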
Scraping Setup and Rate Limits
The API is accessible without cookies or browser fingerprinting, which makes it faster to get running than scraping Bet365 or other geo-restricted platforms. That said, Pinnacle does throttle aggressively at the IP level if you poll too frequently.
Recommended polling intervals:
| Scenario | Poll interval | Notes |
|---|---|---|
| Pre-game model input | Every 5 minutes | Stable, low IP risk |
| Line movement tracking | Every 60–90 seconds | Use rotating proxies |
| Live/in-play odds | Every 10–15 seconds | High risk of 429s without proxy rotation |
| CLV snapshot (closing) | Once at event start | Single request per event |
For rotating proxies, residential or mobile IPs from Singapore or the US work well. Datacenter IPs get rate-limited within minutes at sub-minute polling intervals.
A minimal Python scraper:
```python
import httpx
import time

BASE = "https://guest.api.pinnacle.com/v1"
HEADERS = {
    "Accept": "application/json",
    "Referer": "https://www.pinnacle.com/",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
}

def fetch_odds(sport_id: int, league_id: int) -> dict:
    url = f"{BASE}/odds"
    params = {
        "sportId": sport_id,
        "leagueId": league_id,
        "oddsFormat": "American",
        "since": 0,
    }
    r = httpx.get(url, headers=HEADERS, params=params, timeout=10)
    r.raise_for_status()
    return r.json()

def poll(sport_id: int, league_id: int, interval: int = 300):
    while True:
        try:
            data = fetch_odds(sport_id, league_id)
            # process data["leagues"][0]["events"]
            print(f"Fetched {len(data.get('leagues', []))} leagues")
        except httpx.HTTPStatusError as e:
            print(f"HTTP {e.response.status_code}: backing off")
            time.sleep(interval * 3)
        time.sleep(interval)
```

The `since` parameter is key for incremental updates — Pinnacle returns a `last` timestamp in the response, and you can pass it back as `since` to receive only changed records. This cuts payload size dramatically for high-frequency polling.
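The incremental pattern can be sketched as follows — the `fake_fetch` stub stands in for a real request function so the cursor handling is demonstrable offline, and the sport/league IDs passed to it are placeholders:

```python
def poll_incremental(fetch, sport_id: int, league_id: int, n_polls: int = 3):
    """Feed each response's `last` timestamp back as `since` so only
    changed records come back on subsequent polls."""
    since = 0
    snapshots = []
    for _ in range(n_polls):
        data = fetch(sport_id, league_id, since)
        snapshots.append(data)
        since = data.get("last", since)  # carry the cursor forward
    return snapshots, since

# Stub fetch: advances the cursor by 100 per call. A real implementation
# would send `since` as a query parameter on the /odds request.
def fake_fetch(sport_id, league_id, since):
    return {"last": since + 100, "leagues": []}

snaps, cursor = poll_incremental(fake_fetch, 29, 1980, n_polls=3)
```

Persist the cursor between process restarts; starting over from `since=0` forces a full payload and burns rate-limit headroom for nothing.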
Handling the Response for Sharp Models
Raw odds aren’t enough for a sharp model. You need to compute implied probability, remove the vig, and track line movement over time. Here’s the standard workflow:
- Parse the `price` fields from the odds response (American format)
- Convert to decimal: `decimal = (price / 100) + 1` for positives, `decimal = (-100 / price) + 1` for negatives
- Compute raw implied prob: `1 / decimal`
- Sum the raw probs for both sides to get the overround
- Divide each raw prob by the overround to get the fair (no-vig) probability
- Store timestamped snapshots in a time-series table (Postgres with TimescaleDB works well for this)
- Compare your model’s implied prob to the no-vig fair prob to generate edge estimates
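The conversion steps above condense to a few lines — this is the standard two-way no-vig normalization, shown here for a moneyline without a draw:

```python
def american_to_decimal(price: int) -> float:
    # decimal = price/100 + 1 for positive American odds,
    # decimal = -100/price + 1 for negative
    return price / 100 + 1 if price > 0 else -100 / price + 1

def no_vig_probs(home_price: int, away_price: int) -> tuple[float, float]:
    """Fair (vig-removed) probabilities for a two-way market."""
    raw_home = 1 / american_to_decimal(home_price)
    raw_away = 1 / american_to_decimal(away_price)
    overround = raw_home + raw_away  # > 1.0; the excess is the vig
    return raw_home / overround, raw_away / overround

# A symmetric -110/-110 line should normalize to 50/50
home, away = no_vig_probs(-110, -110)
```

For three-way soccer markets, extend the same normalization across home, draw, and away. Note that proportional normalization is the simplest de-vig method; alternatives like the Shin method exist if you want to model favorite–longshot bias.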
For CLV, your closing snapshot must be captured within 5 minutes of cutoff. Build a job that checks cutoff on every event and schedules a final scrape accordingly — don’t rely on a fixed cron.
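A minimal sketch of that per-event scheduling, assuming the `cutoff` field described earlier (the buffer of two minutes before cutoff is a chosen parameter, not a Pinnacle requirement):

```python
from datetime import datetime, timedelta

def closing_scrape_time(cutoff_iso: str, buffer_minutes: int = 2) -> datetime:
    """UTC time to fire the closing-line scrape: shortly before the
    event's cutoff, safely inside the 5-minute CLV window."""
    # Rewrite the trailing Z so fromisoformat works on older Pythons too
    cutoff = datetime.fromisoformat(cutoff_iso.replace("Z", "+00:00"))
    return cutoff - timedelta(minutes=buffer_minutes)

t = closing_scrape_time("2025-11-02T18:00:00Z")
```

Feed these timestamps into whatever scheduler you run (APScheduler, a delayed-task queue, or a simple sorted heap of pending scrapes) rather than a fixed-interval cron.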
Pinnacle also returns opening lines if you use the historical odds endpoint: `GET /v1/odds/history?matchupId={id}`. This is invaluable for building opening-to-closing line movement features. The DraftKings scraping guide covers how to cross-reference retail book movement against a sharp benchmark like Pinnacle — the same pattern applies whether you’re flagging steam moves or quantifying CLV distribution.
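A simple opening-to-closing feature is the movement in implied probability between the two snapshots — a sketch (the example prices are illustrative):

```python
def implied_prob(price: int) -> float:
    # American odds -> raw implied probability (vig still included)
    return 100 / (price + 100) if price > 0 else -price / (-price + 100)

def open_to_close_move(open_price: int, close_price: int) -> float:
    """Line movement in implied-probability points; positive means the
    market moved toward this side between open and close."""
    return implied_prob(close_price) - implied_prob(open_price)

# Home side opened +120 and closed -105: market steamed toward home
move = open_to_close_move(120, -105)
```

Raw implied probability is fine for a movement *delta* on one side; for cross-book comparisons, compute the move on no-vig probabilities instead so differing margins don't pollute the feature.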
Geo-Restrictions and IP Strategy
Pinnacle’s main site is blocked in the US. The guest API, however, responds without geo-checks in most regions. If you’re running infrastructure in a US data center, expect inconsistent access — use a proxy layer based in Canada, the UK, or Southeast Asia.
Key points on IP management:
- Rotate IPs per session, not per request, to mimic normal browser behavior
- Mobile residential proxies perform better than static datacenter IPs for sustained polling
- Respect the `Retry-After` header on 429 responses — ignoring it gets your IP range flagged faster
- For in-play scraping, a pool of 5–10 IPs with round-robin rotation at 30-second intervals is the minimum viable setup
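The rotation and backoff logic above can be sketched as a small pool class — the proxy URLs are placeholders, and this handles only the delta-seconds form of `Retry-After` (the header can also carry an HTTP-date):

```python
import itertools

class ProxyPool:
    """Round-robin rotation over a small IP pool, plus 429 backoff."""

    def __init__(self, proxies: list[str]):
        self._cycle = itertools.cycle(proxies)

    def next_proxy(self) -> str:
        # Rotate per session/interval, not per request
        return next(self._cycle)

    def backoff(self, retry_after_header, default: float = 60.0) -> float:
        """Seconds to sleep before retrying, honoring Retry-After."""
        try:
            return float(retry_after_header)
        except (TypeError, ValueError):
            return default  # header missing or HTTP-date form

pool = ProxyPool(["http://p1:8080", "http://p2:8080", "http://p3:8080"])
first = pool.next_proxy()
```

Wire `next_proxy()` into your HTTP client's proxy setting at each rotation interval, and call `backoff()` with `response.headers.get("Retry-After")` whenever you catch a 429 before retrying.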
If you’re also pulling from closed or geo-locked platforms, the same proxy infrastructure handles both. The session management complexity is lower with Pinnacle than with most US-facing books.
Bottom Line
Pinnacle’s semi-open API makes it the easiest sharp book to integrate into a quant pipeline, but IP management and incremental polling via the since parameter are non-negotiable if you’re running at useful frequency. Use fair-value conversion from the start — raw American odds are not model inputs. DRT covers the full stack of sportsbook data pipelines, from retail books to sharp feeds, so check back as the competitive landscape shifts heading into 2026 season cycles.