How to Scrape Yahoo Finance Stock Data in 2026
Yahoo Finance is one of the most widely used financial data platforms, providing free access to stock prices, historical data, financial statements, analyst estimates, and market news for thousands of publicly traded companies worldwide. For quantitative traders, financial analysts, investment researchers, and fintech developers, it is a comprehensive source of market data at no cost.
This guide covers how to extract Yahoo Finance data using Python with the yfinance library and custom scraping approaches.
What Data Can You Extract?
Yahoo Finance provides extensive financial data:
- Stock prices (real-time quotes, historical OHLCV data)
- Financial statements (income statement, balance sheet, cash flow)
- Company information (sector, industry, employees, description)
- Analyst recommendations and price targets
- Earnings data (EPS, revenue, earnings dates)
- Dividend history and yield
- Options chain data
- Market indices and ETF data
- Financial news and articles
Example JSON Output

```json
{
  "ticker": "AAPL",
  "company_name": "Apple Inc.",
  "current_price": 245.67,
  "market_cap": 3890000000000,
  "pe_ratio": 32.5,
  "dividend_yield": 0.0044,
  "52_week_high": 260.10,
  "52_week_low": 164.08,
  "earnings_date": "2026-04-28",
  "analyst_target_price": 270.00,
  "recommendation": "Buy"
}
```

Prerequisites
```bash
pip install yfinance requests beautifulsoup4 pandas lxml fake-useragent
```

Method 1: Using yfinance (Recommended)
The yfinance library is the most popular and reliable way to access Yahoo Finance data.
```python
import yfinance as yf
import json


class YahooFinanceScraper:
    def get_stock_info(self, ticker):
        """Get comprehensive stock information."""
        stock = yf.Ticker(ticker)
        info = stock.info
        return {
            "ticker": ticker,
            "name": info.get("longName"),
            "sector": info.get("sector"),
            "industry": info.get("industry"),
            "current_price": info.get("currentPrice"),
            "market_cap": info.get("marketCap"),
            "pe_ratio": info.get("trailingPE"),
            "forward_pe": info.get("forwardPE"),
            "dividend_yield": info.get("dividendYield"),
            "52_week_high": info.get("fiftyTwoWeekHigh"),
            "52_week_low": info.get("fiftyTwoWeekLow"),
            "volume": info.get("volume"),
            "avg_volume": info.get("averageVolume"),
            "beta": info.get("beta"),
            "earnings_date": str(info.get("earningsDate")),
            "target_mean_price": info.get("targetMeanPrice"),
            "recommendation": info.get("recommendationKey"),
            "total_revenue": info.get("totalRevenue"),
            "net_income": info.get("netIncomeToCommon"),
            "employees": info.get("fullTimeEmployees"),
        }

    def get_historical_data(self, ticker, period="1y", interval="1d"):
        """Get historical price data."""
        stock = yf.Ticker(ticker)
        hist = stock.history(period=period, interval=interval)
        return hist.reset_index().to_dict(orient="records")

    def get_financials(self, ticker):
        """Get financial statements."""
        stock = yf.Ticker(ticker)
        return {
            "income_statement": stock.financials.to_dict() if not stock.financials.empty else {},
            "balance_sheet": stock.balance_sheet.to_dict() if not stock.balance_sheet.empty else {},
            "cash_flow": stock.cashflow.to_dict() if not stock.cashflow.empty else {},
        }

    def get_analyst_recommendations(self, ticker):
        """Get analyst recommendations."""
        stock = yf.Ticker(ticker)
        recs = stock.recommendations
        if recs is not None and not recs.empty:
            return recs.tail(20).reset_index().to_dict(orient="records")
        return []

    def get_options_chain(self, ticker, expiration_date=None):
        """Get options chain data."""
        stock = yf.Ticker(ticker)
        if expiration_date:
            opts = stock.option_chain(expiration_date)
        else:
            expirations = stock.options
            if expirations:
                opts = stock.option_chain(expirations[0])
            else:
                return None
        return {
            "calls": opts.calls.to_dict(orient="records"),
            "puts": opts.puts.to_dict(orient="records"),
        }

    def get_multiple_stocks(self, tickers, period="1mo"):
        """Get data for multiple stocks at once."""
        return yf.download(tickers, period=period, group_by="ticker")

    def get_earnings_history(self, ticker):
        """Get historical earnings data."""
        stock = yf.Ticker(ticker)
        earnings = stock.earnings_history
        if earnings is not None and not earnings.empty:
            return earnings.to_dict(orient="records")
        return []

    def screen_stocks(self, tickers, min_market_cap=None, max_pe=None, min_dividend=None):
        """Simple stock screener."""
        results = []
        for ticker in tickers:
            try:
                info = self.get_stock_info(ticker)
                passed = True
                if min_market_cap and (info.get("market_cap") or 0) < min_market_cap:
                    passed = False
                if max_pe and (info.get("pe_ratio") or float("inf")) > max_pe:
                    passed = False
                if min_dividend and (info.get("dividend_yield") or 0) < min_dividend:
                    passed = False
                if passed:
                    results.append(info)
            except Exception as e:
                print(f"Error processing {ticker}: {e}")
        return results


# Usage
scraper = YahooFinanceScraper()

# Get stock info
aapl = scraper.get_stock_info("AAPL")
print(json.dumps(aapl, indent=2, default=str))

# Get historical data
hist = scraper.get_historical_data("AAPL", period="6mo")
print(f"Historical data points: {len(hist)}")

# Get financials
financials = scraper.get_financials("AAPL")
print(f"Income statement columns: {len(financials['income_statement'])}")

# Get analyst recommendations
recs = scraper.get_analyst_recommendations("AAPL")
print(f"Analyst recommendations: {len(recs)}")

# Simple screen
tech_stocks = ["AAPL", "MSFT", "GOOGL", "META", "NVDA"]
screened = scraper.screen_stocks(tech_stocks, min_market_cap=1e12)
print(f"Stocks passing screen: {len(screened)}")
```

Method 2: Direct Web Scraping
For data not available through yfinance:
```python
import json

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent


class YahooFinanceWebScraper:
    def __init__(self, proxy_url=None):
        self.session = requests.Session()
        self.ua = UserAgent()
        self.proxy_url = proxy_url

    def _get_headers(self):
        return {
            "User-Agent": self.ua.random,
            "Accept": "text/html,application/xhtml+xml",
        }

    def _get_proxies(self):
        if self.proxy_url:
            return {"http": self.proxy_url, "https": self.proxy_url}
        return None

    def get_trending_tickers(self):
        """Scrape trending tickers from Yahoo Finance."""
        url = "https://finance.yahoo.com/trending-tickers"
        try:
            response = self.session.get(
                url, headers=self._get_headers(), proxies=self._get_proxies(), timeout=30
            )
            response.raise_for_status()
            soup = BeautifulSoup(response.text, "lxml")
            tickers = []
            rows = soup.select("table tbody tr")
            for row in rows:
                cells = row.select("td")
                if len(cells) >= 4:
                    tickers.append({
                        "symbol": cells[0].get_text(strip=True),
                        "name": cells[1].get_text(strip=True),
                        "price": cells[2].get_text(strip=True),
                        "change": cells[3].get_text(strip=True),
                    })
            return tickers
        except Exception as e:
            print(f"Error: {e}")
            return []

    def get_news(self, ticker):
        """Scrape news articles for a ticker."""
        url = f"https://finance.yahoo.com/quote/{ticker}/news"
        try:
            response = self.session.get(
                url, headers=self._get_headers(), proxies=self._get_proxies(), timeout=30
            )
            response.raise_for_status()
            soup = BeautifulSoup(response.text, "lxml")
            articles = []
            news_items = soup.select("li[class*='stream-item'], div[class*='news-stream'] li")
            for item in news_items:
                title = item.select_one("h3, a")
                link = item.select_one("a[href]")
                articles.append({
                    "title": title.get_text(strip=True) if title else None,
                    "url": link["href"] if link else None,
                })
            return articles[:20]
        except Exception as e:
            print(f"Error: {e}")
            return []


# Usage
web_scraper = YahooFinanceWebScraper(proxy_url="http://user:pass@proxy:port")
trending = web_scraper.get_trending_tickers()
print(json.dumps(trending[:5], indent=2))
```

Proxy Recommendations
| Proxy Type | Necessity | Best For |
|---|---|---|
| None | Sufficient for yfinance | Standard data access |
| Datacenter | Optional | High-frequency data pulls |
| Residential | Optional | Web scraping at scale |
The yfinance library typically doesn’t require proxies. For high-frequency data access or web scraping, residential proxies can help avoid rate limits.
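If you do route traffic through a proxy, the same proxies mapping works for the requests session in Method 2 (and recent yfinance versions accept a `session` argument). Below is a minimal sketch of building that mapping; the host, port, and credentials are placeholders, not a real provider:

```python
def build_proxy_map(host: str, port: int, user: str = "", password: str = "") -> dict:
    """Build a requests-style proxies mapping (one entry per scheme to proxy).

    Credentials are optional; placeholder values only, substitute your provider's.
    """
    auth = f"{user}:{password}@" if user and password else ""
    url = f"http://{auth}{host}:{port}"
    return {"http": url, "https": url}


# Example with placeholder values; assign to session.proxies or pass per-request
proxies = build_proxy_map("proxy.example.com", 8080, "user", "pass")
print(proxies["https"])  # http://user:pass@proxy.example.com:8080
```

The mapping keys tell requests which URL schemes to tunnel through the proxy; most providers expose HTTP(S) proxies on a single endpoint, which is why both keys point at the same URL.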
Legal Considerations
- Terms of Service: Yahoo Finance’s ToS restrict automated data collection beyond their API.
- Data Redistribution: Redistribution of financial data may violate exchange agreements.
- Real-Time Data: Real-time quotes may have licensing requirements.
- Commercial Use: Consult legal counsel for commercial financial data products.
See our web scraping compliance guide for details.
Frequently Asked Questions
Is the yfinance library official?
No. yfinance is an unofficial library that accesses Yahoo Finance data. It’s the most widely used method for accessing Yahoo Finance data programmatically but is not endorsed by Yahoo.
How often can I pull data with yfinance?
yfinance imposes no rate limits of its own, but Yahoo Finance may temporarily block IPs that make excessive requests. For frequently refreshed quotes, limit pulls to roughly once per minute; for historical data, batch your requests.
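The "once per minute" guidance can be enforced in code with a small throttle. This is a generic sketch, not part of yfinance:

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls (e.g. data pulls)."""

    def __init__(self, min_interval_seconds: float):
        self.min_interval = min_interval_seconds
        self._last_call = 0.0  # monotonic timestamp of the previous call

    def wait(self):
        """Sleep just long enough to respect the interval, then record the call."""
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last_call = time.monotonic()


# Usage: call throttle.wait() before each pull
throttle = Throttle(60.0)
```

Calling `throttle.wait()` before each `stock.history(...)` or `stock.info` access spaces requests out without any bookkeeping at the call sites.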
Can I get real-time stock prices?
yfinance provides delayed quotes (typically a 15-20 minute delay, depending on the exchange). For true real-time data, consider paid data providers or broker APIs.
What are alternatives to Yahoo Finance for financial data?
Alpha Vantage (free API), IEX Cloud, Polygon.io, and Finnhub are popular alternatives with their own APIs and pricing tiers.
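To illustrate how these API-based alternatives work, here is a sketch of building an Alpha Vantage request URL; the `query` endpoint and the `function`/`symbol`/`apikey` parameters come from their public documentation, and the API key shown is a placeholder:

```python
from urllib.parse import urlencode


def alpha_vantage_url(symbol: str, api_key: str, function: str = "TIME_SERIES_DAILY") -> str:
    """Build a request URL for the Alpha Vantage query endpoint."""
    params = {"function": function, "symbol": symbol, "apikey": api_key}
    return f"https://www.alphavantage.co/query?{urlencode(params)}"


# Fetch the URL with requests.get(url).json() or urllib; the key is a placeholder
url = alpha_vantage_url("AAPL", "YOUR_API_KEY")
print(url)
```

Unlike scraping, these providers return structured JSON tied to a documented schema, which is why they are the usual choice once redistribution or uptime guarantees matter.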
Conclusion
Yahoo Finance is one of the most accessible sources for financial market data. The yfinance library handles most data needs without proxies or complex scraping. For supplementary data like news and trending tickers, web scraping with proxies provides additional coverage.
For more financial data guides, visit our web scraping proxy guide and proxy provider comparisons.
Related Reading
- How to Scrape AliExpress Product Data
- How to Scrape Amazon Product Reviews in 2026
- aiohttp + BeautifulSoup: Async Python Scraping
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix
last updated: May 11, 2026