Web Scraping with VBA/Excel: No-Code Data Pull

Web Scraping with VBA/Excel: No-Code Data Pull

Excel is the most accessible web scraping tool available. You do not need Python, Node.js, or any programming framework. Excel’s built-in Power Query handles many data import tasks with zero code, and VBA (Visual Basic for Applications) provides full scraping capabilities for more complex needs. If your goal is getting web data into a spreadsheet, Excel might be all you need.

This tutorial covers three approaches: Power Query (no code), Web Query (legacy), and VBA macros (full control).

Table of Contents

When to Use Excel for Scraping

Excel scraping is ideal when:

  • Your end goal is a spreadsheet (no data pipeline needed)
  • You are scraping HTML tables or structured data
  • You need a one-off data pull, not a recurring crawler
  • Your team does not have Python/Node.js skills
  • You are pulling data from a small number of pages (under 100)

Excel is NOT ideal for JavaScript-rendered pages, large-scale crawling, or sites requiring proxy rotation. For those, see our Python scraping guide.

Method 1: Power Query (No Code)

Power Query is Excel’s built-in data import tool. It handles most table-based web scraping without any code.

Steps

  1. Open Excel and go to Data > From Web
  2. Enter the URL (e.g., https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal))
  3. Excel detects tables on the page automatically
  4. Select the table you want and click Load

The data imports directly into your spreadsheet.

Power Query M Code (Advanced)

For more control, use Power Query’s M language:

let
    Source = Web.Page(
        Web.Contents("https://books.toscrape.com/")
    ),
    // Select specific table
    Data = Source{0}[Data],
    // Rename columns
    Renamed = Table.RenameColumns(Data, {
        {"Column1", "Title"},
        {"Column2", "Price"}
    }),
    // Filter rows
    Filtered = Table.SelectRows(Renamed, each [Price] <> null)
in
    Filtered

Refreshing Data

Right-click the imported table and select Refresh to pull updated data. You can also set automatic refresh intervals:

  1. Right-click the query in the Queries & Connections pane
  2. Select Properties
  3. Check Refresh every X minutes

Method 2: Web Query (Legacy)

The traditional web query approach still works in older Excel versions:

  1. Go to Data > From Web (or Data > Get External Data > From Web in older versions)
  2. Enter the URL
  3. Click the yellow arrows next to tables you want to import
  4. Click Import

This method auto-detects HTML tables and imports them directly.

Method 3: VBA Macros

VBA gives you full control over HTTP requests and HTML parsing.

Setting Up VBA

  1. Press Alt + F11 to open the VBA editor
  2. Go to Tools > References and enable:
  • Microsoft XML, v6.0 (for HTTP requests)
  • Microsoft HTML Object Library (for HTML parsing)
  1. Insert a new module: Insert > Module

Basic VBA Scraper

Sub ScrapeBooks()
    Dim http As New MSXML2.XMLHTTP60
    Dim html As New HTMLDocument
    Dim books As Object
    Dim book As Object
    Dim row As Long

    ' Send HTTP request
    http.Open "GET", "https://books.toscrape.com/", False
    http.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    http.send

    ' Parse HTML
    html.body.innerHTML = http.responseText

    ' Find all book elements
    Set books = html.querySelectorAll("article.product_pod")

    ' Write headers
    Cells(1, 1).Value = "Title"
    Cells(1, 2).Value = "Price"
    Cells(1, 3).Value = "Rating"

    ' Extract data
    row = 2
    Dim i As Long
    For i = 0 To books.Length - 1
        Set book = books.Item(i)

        Cells(row, 1).Value = book.querySelector("h3 a").getAttribute("title")
        Cells(row, 2).Value = book.querySelector(".price_color").innerText
        Cells(row, 3).Value = Replace(book.querySelector("p").className, "star-rating ", "")

        row = row + 1
    Next i

    MsgBox "Scraped " & (row - 2) & " books!"
End Sub

VBA HTTP Requests

GET Request

Function FetchPage(url As String) As String
    Dim http As New MSXML2.XMLHTTP60

    http.Open "GET", url, False
    http.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    http.setRequestHeader "Accept", "text/html"
    http.send

    If http.Status = 200 Then
        FetchPage = http.responseText
    Else
        FetchPage = ""
        Debug.Print "Error: HTTP " & http.Status & " for " & url
    End If
End Function

POST Request

Function PostRequest(url As String, postData As String) As String
    Dim http As New MSXML2.XMLHTTP60

    http.Open "POST", url, False
    http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.setRequestHeader "User-Agent", "Mozilla/5.0"
    http.send postData

    PostRequest = http.responseText
End Function

' Usage
Dim result As String
result = PostRequest("https://example.com/search", "query=laptops&page=1")

JSON API Request

Function FetchJSON(url As String) As String
    Dim http As New MSXML2.XMLHTTP60

    http.Open "GET", url, False
    http.setRequestHeader "Accept", "application/json"
    http.setRequestHeader "User-Agent", "Mozilla/5.0"
    http.send

    FetchJSON = http.responseText
End Function

' Parse JSON (requires VBA-JSON library or manual parsing)
' Download from: https://github.com/VBA-tools/VBA-JSON

VBA HTML Parsing

querySelector and querySelectorAll

Dim html As New HTMLDocument
html.body.innerHTML = httpResponseText

' Single element
Dim title As Object
Set title = html.querySelector("h1")
Debug.Print title.innerText

' Multiple elements
Dim items As Object
Set items = html.querySelectorAll(".product-card")
Debug.Print "Found " & items.Length & " items"

' Attributes
Dim link As Object
Set link = html.querySelector("a.product-link")
Debug.Print link.getAttribute("href")

' Nested selection
Dim container As Object
Set container = html.querySelector(".products")
Dim childItems As Object
Set childItems = container.querySelectorAll(".item")

Common Selectors

' By class
html.querySelectorAll(".product")

' By ID
html.querySelector("#main-content")

' By attribute
html.querySelectorAll("a[href]")
html.querySelectorAll("[data-id='123']")

' By tag
html.querySelectorAll("tr")

' Combined
html.querySelectorAll("div.product h3 a")

' Nested
html.querySelectorAll("table tbody tr td")

getElementById and getElementsByTagName

' By ID (returns single element)
Dim mainDiv As Object
Set mainDiv = html.getElementById("main-content")

' By tag name (returns collection)
Dim allLinks As Object
Set allLinks = html.getElementsByTagName("a")

Dim i As Long
For i = 0 To allLinks.Length - 1
    Debug.Print allLinks.Item(i).getAttribute("href")
Next i

' By class name
Dim products As Object
Set products = html.getElementsByClassName("product")

Scraping Multiple Pages

Sub ScrapeAllPages()
    Dim http As New MSXML2.XMLHTTP60
    Dim html As New HTMLDocument
    Dim row As Long
    Dim page As Long

    ' Headers
    Cells(1, 1).Value = "Title"
    Cells(1, 2).Value = "Price"
    Cells(1, 3).Value = "Page"

    row = 2

    For page = 1 To 50
        Dim url As String
        url = "https://books.toscrape.com/catalogue/page-" & page & ".html"

        ' Fetch page
        http.Open "GET", url, False
        http.setRequestHeader "User-Agent", "Mozilla/5.0"
        http.send

        If http.Status <> 200 Then
            Debug.Print "Error on page " & page & ": HTTP " & http.Status
            Exit For
        End If

        html.body.innerHTML = http.responseText

        ' Extract books
        Dim books As Object
        Set books = html.querySelectorAll("article.product_pod")

        If books.Length = 0 Then Exit For

        Dim i As Long
        For i = 0 To books.Length - 1
            Dim book As Object
            Set book = books.Item(i)

            Cells(row, 1).Value = book.querySelector("h3 a").getAttribute("title")
            Cells(row, 2).Value = book.querySelector(".price_color").innerText
            Cells(row, 3).Value = page

            row = row + 1
        Next i

        ' Status update
        Application.StatusBar = "Scraping page " & page & "... (" & (row - 2) & " books)"
        DoEvents

        ' Polite delay (1 second)
        Application.Wait Now + TimeValue("00:00:01")
    Next page

    Application.StatusBar = False
    MsgBox "Done! Scraped " & (row - 2) & " books from " & (page - 1) & " pages."
End Sub

Handling Tables

Automatic Table Extraction

Sub ExtractTable()
    Dim html As New HTMLDocument
    Dim http As New MSXML2.XMLHTTP60

    http.Open "GET", "https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)", False
    http.send

    html.body.innerHTML = http.responseText

    ' Find first table with class "wikitable"
    Dim table As Object
    Set table = html.querySelector("table.wikitable")

    If table Is Nothing Then
        MsgBox "No table found!"
        Exit Sub
    End If

    ' Extract rows
    Dim rows As Object
    Set rows = table.querySelectorAll("tr")

    Dim row As Long
    row = 1

    Dim r As Long
    For r = 0 To rows.Length - 1
        Dim cells As Object
        Set cells = rows.Item(r).querySelectorAll("th, td")

        Dim c As Long
        For c = 0 To cells.Length - 1
            Cells(row, c + 1).Value = CleanText(cells.Item(c).innerText)
        Next c

        row = row + 1
    Next r

    MsgBox "Extracted " & (row - 1) & " rows!"
End Sub

Function CleanText(text As String) As String
    ' Remove extra whitespace and line breaks
    CleanText = Trim(Replace(Replace(text, vbLf, " "), vbCr, " "))
    ' Remove multiple spaces
    Do While InStr(CleanText, "  ") > 0
        CleanText = Replace(CleanText, "  ", " ")
    Loop
End Function

Error Handling

Sub SafeScrape()
    On Error GoTo ErrorHandler

    Dim http As New MSXML2.XMLHTTP60
    Dim html As New HTMLDocument

    http.Open "GET", "https://books.toscrape.com/", False
    http.setRequestHeader "User-Agent", "Mozilla/5.0"
    http.send

    If http.Status <> 200 Then
        MsgBox "HTTP Error: " & http.Status
        Exit Sub
    End If

    html.body.innerHTML = http.responseText

    ' Check if element exists before accessing
    Dim title As Object
    Set title = html.querySelector("h1")

    If Not title Is Nothing Then
        Debug.Print "Title: " & title.innerText
    Else
        Debug.Print "Title element not found"
    End If

    Exit Sub

ErrorHandler:
    MsgBox "Error " & Err.Number & ": " & Err.Description
    Debug.Print "Error in SafeScrape: " & Err.Description
End Sub

Retry Logic

Function FetchWithRetry(url As String, maxRetries As Long) As String
    Dim http As New MSXML2.XMLHTTP60
    Dim attempt As Long

    For attempt = 1 To maxRetries
        On Error Resume Next

        http.Open "GET", url, False
        http.setRequestHeader "User-Agent", "Mozilla/5.0"
        http.send

        If Err.Number = 0 And http.Status = 200 Then
            FetchWithRetry = http.responseText
            Exit Function
        End If

        On Error GoTo 0
        Debug.Print "Attempt " & attempt & " failed for " & url
        Application.Wait Now + TimeValue("00:00:02")
    Next attempt

    FetchWithRetry = ""
End Function

Scheduling Automatic Updates

Windows Task Scheduler

  1. Save your workbook as .xlsm (macro-enabled)
  2. Create a VBS wrapper script:
' run_scraper.vbs — save as a .vbs file
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = False
Set objWorkbook = objExcel.Workbooks.Open("C:\path\to\scraper.xlsm")
objExcel.Run "ScrapeAllPages"
objWorkbook.Save
objWorkbook.Close
objExcel.Quit
  1. Open Windows Task Scheduler
  2. Create a new task that runs wscript.exe "C:\path\to\run_scraper.vbs"
  3. Set your desired schedule (daily, weekly, etc.)

Auto-Run on Open

' In ThisWorkbook module
Private Sub Workbook_Open()
    ' Ask before running
    If MsgBox("Run the scraper?", vbYesNo) = vbYes Then
        Call ScrapeAllPages
    End If
End Sub

Limitations and Alternatives

Excel VBA Cannot:

  • Render JavaScript (use Playwright or Selenium)
  • Rotate proxies efficiently (use Python with proxies)
  • Handle CAPTCHAs or advanced anti-bot measures
  • Scale to thousands of pages (performance degrades)
  • Run on macOS reliably (VBA support is limited)

Better Tools for Complex Scraping:

  • Power Query — Built into Excel, handles many tasks without VBA
  • Google Sheets IMPORTHTML=IMPORTHTML("url", "table", 1) for simple table imports
  • Python — For anything beyond basic table extraction. See our Python web scraping guide
  • Browser extensions — Tools like Web Scraper or Data Miner for visual scraping

FAQ

Can Excel scrape any website?

Excel can scrape static HTML websites. It cannot handle JavaScript-rendered content, CAPTCHAs, or sites with aggressive anti-bot protection. For those scenarios, use Python with Playwright or a dedicated scraping tool.

Is Power Query better than VBA for web scraping?

For table-based data, Power Query is better — it requires no code and auto-refreshes. VBA is better when you need to parse non-table HTML, handle pagination, or perform complex data extraction logic.

Can I use proxies with Excel VBA?

VBA’s XMLHTTP uses system proxy settings. You can configure a proxy in Windows Internet Options, but proxy rotation is not practical in VBA. For proxy-based scraping, use Python with rotating proxies.

How many pages can Excel VBA scrape?

Practically, Excel VBA handles up to a few hundred pages before becoming slow. The spreadsheet itself becomes unwieldy beyond 100,000 rows. For large-scale scraping, use Python with Scrapy.

Does web scraping in Excel work on Mac?

Limited. VBA on macOS does not support the MSXML2.XMLHTTP or HTMLDocument objects. Power Query works on Mac with Microsoft 365 but has fewer data source options. For Mac users, Python is the recommended alternative.


For more advanced scraping, explore Python web scraping and our proxy glossary. See our web scraping proxy guide for proxy setup.

External Resources:


Related Reading

last updated: April 3, 2026

Scroll to Top
message me on telegram

Resources

Proxy Signals Podcast
Operator-level insights on mobile proxies and access infrastructure.

Multi-Account Proxies: Setup, Types, Tools & Mistakes (2026)