How to Scrape Yahoo Finance Stock Data in 2026
Yahoo Finance is one of the most widely used financial data platforms, providing free access to stock prices, historical data, financial statements, analyst estimates, and market news for thousands of publicly traded companies worldwide. For quantitative traders, financial analysts, investment researchers, and fintech developers, it is a comprehensive source of market data at no cost.
This guide covers how to extract Yahoo Finance data using Python with the yfinance library and custom scraping approaches.
What Data Can You Extract?
Yahoo Finance provides extensive financial data:
- Stock prices (real-time quotes, historical OHLCV data)
- Financial statements (income statement, balance sheet, cash flow)
- Company information (sector, industry, employees, description)
- Analyst recommendations and price targets
- Earnings data (EPS, revenue, earnings dates)
- Dividend history and yield
- Options chain data
- Market indices and ETF data
- Financial news and articles
Example JSON Output

```json
{
  "ticker": "AAPL",
  "company_name": "Apple Inc.",
  "current_price": 245.67,
  "market_cap": 3890000000000,
  "pe_ratio": 32.5,
  "dividend_yield": 0.0044,
  "52_week_high": 260.10,
  "52_week_low": 164.08,
  "earnings_date": "2026-04-28",
  "analyst_target_price": 270.00,
  "recommendation": "Buy"
}
```

Prerequisites
```bash
pip install yfinance requests beautifulsoup4 pandas lxml fake-useragent
```

Method 1: Using yfinance (Recommended)
The yfinance library is the most popular and reliable way to access Yahoo Finance data.
```python
import yfinance as yf
import json


class YahooFinanceScraper:
    def get_stock_info(self, ticker):
        """Get comprehensive stock information."""
        stock = yf.Ticker(ticker)
        info = stock.info
        return {
            "ticker": ticker,
            "name": info.get("longName"),
            "sector": info.get("sector"),
            "industry": info.get("industry"),
            "current_price": info.get("currentPrice"),
            "market_cap": info.get("marketCap"),
            "pe_ratio": info.get("trailingPE"),
            "forward_pe": info.get("forwardPE"),
            "dividend_yield": info.get("dividendYield"),
            "52_week_high": info.get("fiftyTwoWeekHigh"),
            "52_week_low": info.get("fiftyTwoWeekLow"),
            "volume": info.get("volume"),
            "avg_volume": info.get("averageVolume"),
            "beta": info.get("beta"),
            "earnings_date": str(info.get("earningsDate")),
            "target_mean_price": info.get("targetMeanPrice"),
            "recommendation": info.get("recommendationKey"),
            "total_revenue": info.get("totalRevenue"),
            "net_income": info.get("netIncomeToCommon"),
            "employees": info.get("fullTimeEmployees"),
        }

    def get_historical_data(self, ticker, period="1y", interval="1d"):
        """Get historical price data."""
        stock = yf.Ticker(ticker)
        hist = stock.history(period=period, interval=interval)
        return hist.reset_index().to_dict(orient="records")

    def get_financials(self, ticker):
        """Get financial statements."""
        stock = yf.Ticker(ticker)
        return {
            "income_statement": stock.financials.to_dict() if not stock.financials.empty else {},
            "balance_sheet": stock.balance_sheet.to_dict() if not stock.balance_sheet.empty else {},
            "cash_flow": stock.cashflow.to_dict() if not stock.cashflow.empty else {},
        }

    def get_analyst_recommendations(self, ticker):
        """Get analyst recommendations."""
        stock = yf.Ticker(ticker)
        recs = stock.recommendations
        if recs is not None and not recs.empty:
            return recs.tail(20).reset_index().to_dict(orient="records")
        return []

    def get_options_chain(self, ticker, expiration_date=None):
        """Get options chain data."""
        stock = yf.Ticker(ticker)
        if expiration_date:
            opts = stock.option_chain(expiration_date)
        else:
            expirations = stock.options
            if expirations:
                opts = stock.option_chain(expirations[0])
            else:
                return None
        return {
            "calls": opts.calls.to_dict(orient="records"),
            "puts": opts.puts.to_dict(orient="records"),
        }

    def get_multiple_stocks(self, tickers, period="1mo"):
        """Get data for multiple stocks at once."""
        return yf.download(tickers, period=period, group_by="ticker")

    def get_earnings_history(self, ticker):
        """Get historical earnings data."""
        stock = yf.Ticker(ticker)
        earnings = stock.earnings_history
        if earnings is not None and not earnings.empty:
            return earnings.to_dict(orient="records")
        return []

    def screen_stocks(self, tickers, min_market_cap=None, max_pe=None, min_dividend=None):
        """Simple stock screener."""
        results = []
        for ticker in tickers:
            try:
                info = self.get_stock_info(ticker)
                passed = True
                if min_market_cap and (info.get("market_cap") or 0) < min_market_cap:
                    passed = False
                if max_pe and (info.get("pe_ratio") or float("inf")) > max_pe:
                    passed = False
                if min_dividend and (info.get("dividend_yield") or 0) < min_dividend:
                    passed = False
                if passed:
                    results.append(info)
            except Exception as e:
                print(f"Error processing {ticker}: {e}")
        return results


# Usage
scraper = YahooFinanceScraper()

# Get stock info
aapl = scraper.get_stock_info("AAPL")
print(json.dumps(aapl, indent=2, default=str))

# Get historical data
hist = scraper.get_historical_data("AAPL", period="6mo")
print(f"Historical data points: {len(hist)}")

# Get financials
financials = scraper.get_financials("AAPL")
print(f"Income statement columns: {len(financials['income_statement'])}")

# Get analyst recommendations
recs = scraper.get_analyst_recommendations("AAPL")
print(f"Analyst recommendations: {len(recs)}")

# Simple screen
tech_stocks = ["AAPL", "MSFT", "GOOGL", "META", "NVDA"]
screened = scraper.screen_stocks(tech_stocks, min_market_cap=1e12)
print(f"Stocks passing screen: {len(screened)}")
```

Method 2: Direct Web Scraping
For data not available through yfinance:
```python
import json

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent


class YahooFinanceWebScraper:
    def __init__(self, proxy_url=None):
        self.session = requests.Session()
        self.ua = UserAgent()
        self.proxy_url = proxy_url

    def _get_headers(self):
        return {
            "User-Agent": self.ua.random,
            "Accept": "text/html,application/xhtml+xml",
        }

    def _get_proxies(self):
        if self.proxy_url:
            return {"http": self.proxy_url, "https": self.proxy_url}
        return None

    def get_trending_tickers(self):
        """Scrape trending tickers from Yahoo Finance."""
        url = "https://finance.yahoo.com/trending-tickers"
        try:
            response = self.session.get(
                url, headers=self._get_headers(), proxies=self._get_proxies(), timeout=30
            )
            response.raise_for_status()
            soup = BeautifulSoup(response.text, "lxml")
            tickers = []
            rows = soup.select("table tbody tr")
            for row in rows:
                cells = row.select("td")
                if len(cells) >= 4:
                    tickers.append({
                        "symbol": cells[0].get_text(strip=True),
                        "name": cells[1].get_text(strip=True),
                        "price": cells[2].get_text(strip=True),
                        "change": cells[3].get_text(strip=True),
                    })
            return tickers
        except Exception as e:
            print(f"Error: {e}")
            return []

    def get_news(self, ticker):
        """Scrape news articles for a ticker."""
        url = f"https://finance.yahoo.com/quote/{ticker}/news"
        try:
            response = self.session.get(
                url, headers=self._get_headers(), proxies=self._get_proxies(), timeout=30
            )
            response.raise_for_status()
            soup = BeautifulSoup(response.text, "lxml")
            articles = []
            news_items = soup.select("li[class*='stream-item'], div[class*='news-stream'] li")
            for item in news_items:
                title = item.select_one("h3, a")
                link = item.select_one("a[href]")
                articles.append({
                    "title": title.get_text(strip=True) if title else None,
                    "url": link["href"] if link else None,
                })
            return articles[:20]
        except Exception as e:
            print(f"Error: {e}")
            return []


# Usage
web_scraper = YahooFinanceWebScraper(proxy_url="http://user:pass@proxy:port")
trending = web_scraper.get_trending_tickers()
print(json.dumps(trending[:5], indent=2))
```

Proxy Recommendations
| Proxy Type | Necessity | Best For |
|---|---|---|
| None | Sufficient for yfinance | Standard data access |
| Datacenter | Optional | High-frequency data pulls |
| Residential | Optional | Web scraping at scale |
The yfinance library typically doesn’t require proxies. For high-frequency data access or web scraping, residential proxies can help avoid rate limits.
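If you do route traffic through a proxy, the same proxies mapping works for the requests session in Method 2 (and recent yfinance versions accept a `session` argument). Below is a minimal sketch of building that mapping; the host, port, and credentials are placeholders, not a real provider:

```python
def build_proxy_map(host: str, port: int, user: str = "", password: str = "") -> dict:
    """Build a requests-style proxies mapping (one entry per scheme to proxy).

    Credentials are optional; placeholder values only, substitute your provider's.
    """
    auth = f"{user}:{password}@" if user and password else ""
    url = f"http://{auth}{host}:{port}"
    return {"http": url, "https": url}


# Example with placeholder values; assign to session.proxies or pass per-request
proxies = build_proxy_map("proxy.example.com", 8080, "user", "pass")
print(proxies["https"])  # http://user:pass@proxy.example.com:8080
```

The mapping keys tell requests which URL schemes to tunnel through the proxy; most providers expose HTTP(S) proxies on a single endpoint, which is why both keys point at the same URL.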
Legal Considerations
- Terms of Service: Yahoo Finance’s ToS restrict automated data collection beyond their API.
- Data Redistribution: Redistribution of financial data may violate exchange agreements.
- Real-Time Data: Real-time quotes may have licensing requirements.
- Commercial Use: Consult legal counsel for commercial financial data products.
See our web scraping compliance guide for details.
Frequently Asked Questions
Is the yfinance library official?
No. yfinance is an unofficial library that accesses Yahoo Finance data. It’s the most widely used method for accessing Yahoo Finance data programmatically but is not endorsed by Yahoo.
How often can I pull data with yfinance?
yfinance imposes no rate limits of its own, but Yahoo Finance may temporarily block IPs that make excessive requests. For frequently refreshed quotes, limit pulls to roughly once per minute; for historical data, batch your requests.
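The "once per minute" guidance can be enforced in code with a small throttle. This is a generic sketch, not part of yfinance:

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls (e.g. data pulls)."""

    def __init__(self, min_interval_seconds: float):
        self.min_interval = min_interval_seconds
        self._last_call = 0.0  # monotonic timestamp of the previous call

    def wait(self):
        """Sleep just long enough to respect the interval, then record the call."""
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last_call = time.monotonic()


# Usage: call throttle.wait() before each pull
throttle = Throttle(60.0)
```

Calling `throttle.wait()` before each `stock.history(...)` or `stock.info` access spaces requests out without any bookkeeping at the call sites.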
Can I get real-time stock prices?
yfinance provides delayed quotes (typically a 15-20 minute delay, depending on the exchange). For true real-time data, consider paid data providers or broker APIs.
What are alternatives to Yahoo Finance for financial data?
Alpha Vantage (free API), IEX Cloud, Polygon.io, and Finnhub are popular alternatives with their own APIs and pricing tiers.
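To illustrate how these API-based alternatives work, here is a sketch of building an Alpha Vantage request URL; the `query` endpoint and the `function`/`symbol`/`apikey` parameters come from their public documentation, and the API key shown is a placeholder:

```python
from urllib.parse import urlencode


def alpha_vantage_url(symbol: str, api_key: str, function: str = "TIME_SERIES_DAILY") -> str:
    """Build a request URL for the Alpha Vantage query endpoint."""
    params = {"function": function, "symbol": symbol, "apikey": api_key}
    return f"https://www.alphavantage.co/query?{urlencode(params)}"


# Fetch the URL with requests.get(url).json() or urllib; the key is a placeholder
url = alpha_vantage_url("AAPL", "YOUR_API_KEY")
print(url)
```

Unlike scraping, these providers return structured JSON tied to a documented schema, which is why they are the usual choice once redistribution or uptime guarantees matter.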
Conclusion
Yahoo Finance is one of the most accessible sources for financial market data. The yfinance library handles most data needs without proxies or complex scraping. For supplementary data like news and trending tickers, web scraping with proxies provides additional coverage.
For more financial data guides, visit our web scraping proxy guide and proxy provider comparisons.
Related Reading
- How to Scrape AliExpress Product Data
- How to Scrape Amazon Product Reviews in 2026
- aiohttp + BeautifulSoup: Async Python Scraping
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix
last updated: May 11, 2026