Web Scraping with VBA/Excel: No-Code Data Pull
Excel is the most accessible web scraping tool available. You do not need Python, Node.js, or any programming framework. Excel’s built-in Power Query handles many data import tasks with zero code, and VBA (Visual Basic for Applications) provides full scraping capabilities for more complex needs. If your goal is getting web data into a spreadsheet, Excel might be all you need.
This tutorial covers three approaches: Power Query (no code), Web Query (legacy), and VBA macros (full control).
Table of Contents
- When to Use Excel for Scraping
- Method 1: Power Query (No Code)
- Method 2: Web Query (Legacy)
- Method 3: VBA Macros
- VBA HTTP Requests
- VBA HTML Parsing
- Scraping Multiple Pages
- Handling Tables
- Error Handling
- Scheduling Automatic Updates
- Limitations and Alternatives
- FAQ
When to Use Excel for Scraping
Excel scraping is ideal when:
- Your end goal is a spreadsheet (no data pipeline needed)
- You are scraping HTML tables or structured data
- You need a one-off data pull, not a recurring crawler
- Your team does not have Python/Node.js skills
- You are pulling data from a small number of pages (under 100)
Excel is NOT ideal for JavaScript-rendered pages, large-scale crawling, or sites requiring proxy rotation. For those, see our Python scraping guide.
Method 1: Power Query (No Code)
Power Query is Excel’s built-in data import tool. It handles most table-based web scraping without any code.
Steps
- Open Excel and go to Data > From Web
- Enter the URL (e.g.,
https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)) - Excel detects tables on the page automatically
- Select the table you want and click Load
The data imports directly into your spreadsheet.
Power Query M Code (Advanced)
For more control, use Power Query’s M language:
let
Source = Web.Page(
Web.Contents("https://books.toscrape.com/")
),
// Select specific table
Data = Source{0}[Data],
// Rename columns
Renamed = Table.RenameColumns(Data, {
{"Column1", "Title"},
{"Column2", "Price"}
}),
// Filter rows
Filtered = Table.SelectRows(Renamed, each [Price] <> null)
in
FilteredRefreshing Data
Right-click the imported table and select Refresh to pull updated data. You can also set automatic refresh intervals:
- Right-click the query in the Queries & Connections pane
- Select Properties
- Check Refresh every X minutes
Method 2: Web Query (Legacy)
The traditional web query approach still works in older Excel versions:
- Go to Data > From Web (or Data > Get External Data > From Web in older versions)
- Enter the URL
- Click the yellow arrows next to tables you want to import
- Click Import
This method auto-detects HTML tables and imports them directly.
Method 3: VBA Macros
VBA gives you full control over HTTP requests and HTML parsing.
Setting Up VBA
- Press Alt + F11 to open the VBA editor
- Go to Tools > References and enable:
- Microsoft XML, v6.0 (for HTTP requests)
- Microsoft HTML Object Library (for HTML parsing)
- Insert a new module: Insert > Module
Basic VBA Scraper
Sub ScrapeBooks()
Dim http As New MSXML2.XMLHTTP60
Dim html As New HTMLDocument
Dim books As Object
Dim book As Object
Dim row As Long
' Send HTTP request
http.Open "GET", "https://books.toscrape.com/", False
http.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
http.send
' Parse HTML
html.body.innerHTML = http.responseText
' Find all book elements
Set books = html.querySelectorAll("article.product_pod")
' Write headers
Cells(1, 1).Value = "Title"
Cells(1, 2).Value = "Price"
Cells(1, 3).Value = "Rating"
' Extract data
row = 2
Dim i As Long
For i = 0 To books.Length - 1
Set book = books.Item(i)
Cells(row, 1).Value = book.querySelector("h3 a").getAttribute("title")
Cells(row, 2).Value = book.querySelector(".price_color").innerText
Cells(row, 3).Value = Replace(book.querySelector("p").className, "star-rating ", "")
row = row + 1
Next i
MsgBox "Scraped " & (row - 2) & " books!"
End SubVBA HTTP Requests
GET Request
Function FetchPage(url As String) As String
Dim http As New MSXML2.XMLHTTP60
http.Open "GET", url, False
http.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
http.setRequestHeader "Accept", "text/html"
http.send
If http.Status = 200 Then
FetchPage = http.responseText
Else
FetchPage = ""
Debug.Print "Error: HTTP " & http.Status & " for " & url
End If
End FunctionPOST Request
Function PostRequest(url As String, postData As String) As String
Dim http As New MSXML2.XMLHTTP60
http.Open "POST", url, False
http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
http.setRequestHeader "User-Agent", "Mozilla/5.0"
http.send postData
PostRequest = http.responseText
End Function
' Usage
Dim result As String
result = PostRequest("https://example.com/search", "query=laptops&page=1")JSON API Request
Function FetchJSON(url As String) As String
Dim http As New MSXML2.XMLHTTP60
http.Open "GET", url, False
http.setRequestHeader "Accept", "application/json"
http.setRequestHeader "User-Agent", "Mozilla/5.0"
http.send
FetchJSON = http.responseText
End Function
' Parse JSON (requires VBA-JSON library or manual parsing)
' Download from: https://github.com/VBA-tools/VBA-JSONVBA HTML Parsing
querySelector and querySelectorAll
Dim html As New HTMLDocument
html.body.innerHTML = httpResponseText
' Single element
Dim title As Object
Set title = html.querySelector("h1")
Debug.Print title.innerText
' Multiple elements
Dim items As Object
Set items = html.querySelectorAll(".product-card")
Debug.Print "Found " & items.Length & " items"
' Attributes
Dim link As Object
Set link = html.querySelector("a.product-link")
Debug.Print link.getAttribute("href")
' Nested selection
Dim container As Object
Set container = html.querySelector(".products")
Dim childItems As Object
Set childItems = container.querySelectorAll(".item")Common Selectors
' By class
html.querySelectorAll(".product")
' By ID
html.querySelector("#main-content")
' By attribute
html.querySelectorAll("a[href]")
html.querySelectorAll("[data-id='123']")
' By tag
html.querySelectorAll("tr")
' Combined
html.querySelectorAll("div.product h3 a")
' Nested
html.querySelectorAll("table tbody tr td")getElementById and getElementsByTagName
' By ID (returns single element)
Dim mainDiv As Object
Set mainDiv = html.getElementById("main-content")
' By tag name (returns collection)
Dim allLinks As Object
Set allLinks = html.getElementsByTagName("a")
Dim i As Long
For i = 0 To allLinks.Length - 1
Debug.Print allLinks.Item(i).getAttribute("href")
Next i
' By class name
Dim products As Object
Set products = html.getElementsByClassName("product")Scraping Multiple Pages
Sub ScrapeAllPages()
Dim http As New MSXML2.XMLHTTP60
Dim html As New HTMLDocument
Dim row As Long
Dim page As Long
' Headers
Cells(1, 1).Value = "Title"
Cells(1, 2).Value = "Price"
Cells(1, 3).Value = "Page"
row = 2
For page = 1 To 50
Dim url As String
url = "https://books.toscrape.com/catalogue/page-" & page & ".html"
' Fetch page
http.Open "GET", url, False
http.setRequestHeader "User-Agent", "Mozilla/5.0"
http.send
If http.Status <> 200 Then
Debug.Print "Error on page " & page & ": HTTP " & http.Status
Exit For
End If
html.body.innerHTML = http.responseText
' Extract books
Dim books As Object
Set books = html.querySelectorAll("article.product_pod")
If books.Length = 0 Then Exit For
Dim i As Long
For i = 0 To books.Length - 1
Dim book As Object
Set book = books.Item(i)
Cells(row, 1).Value = book.querySelector("h3 a").getAttribute("title")
Cells(row, 2).Value = book.querySelector(".price_color").innerText
Cells(row, 3).Value = page
row = row + 1
Next i
' Status update
Application.StatusBar = "Scraping page " & page & "... (" & (row - 2) & " books)"
DoEvents
' Polite delay (1 second)
Application.Wait Now + TimeValue("00:00:01")
Next page
Application.StatusBar = False
MsgBox "Done! Scraped " & (row - 2) & " books from " & (page - 1) & " pages."
End SubHandling Tables
Automatic Table Extraction
Sub ExtractTable()
Dim html As New HTMLDocument
Dim http As New MSXML2.XMLHTTP60
http.Open "GET", "https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)", False
http.send
html.body.innerHTML = http.responseText
' Find first table with class "wikitable"
Dim table As Object
Set table = html.querySelector("table.wikitable")
If table Is Nothing Then
MsgBox "No table found!"
Exit Sub
End If
' Extract rows
Dim rows As Object
Set rows = table.querySelectorAll("tr")
Dim row As Long
row = 1
Dim r As Long
For r = 0 To rows.Length - 1
Dim cells As Object
Set cells = rows.Item(r).querySelectorAll("th, td")
Dim c As Long
For c = 0 To cells.Length - 1
Cells(row, c + 1).Value = CleanText(cells.Item(c).innerText)
Next c
row = row + 1
Next r
MsgBox "Extracted " & (row - 1) & " rows!"
End Sub
Function CleanText(text As String) As String
' Remove extra whitespace and line breaks
CleanText = Trim(Replace(Replace(text, vbLf, " "), vbCr, " "))
' Remove multiple spaces
Do While InStr(CleanText, " ") > 0
CleanText = Replace(CleanText, " ", " ")
Loop
End FunctionError Handling
Sub SafeScrape()
On Error GoTo ErrorHandler
Dim http As New MSXML2.XMLHTTP60
Dim html As New HTMLDocument
http.Open "GET", "https://books.toscrape.com/", False
http.setRequestHeader "User-Agent", "Mozilla/5.0"
http.send
If http.Status <> 200 Then
MsgBox "HTTP Error: " & http.Status
Exit Sub
End If
html.body.innerHTML = http.responseText
' Check if element exists before accessing
Dim title As Object
Set title = html.querySelector("h1")
If Not title Is Nothing Then
Debug.Print "Title: " & title.innerText
Else
Debug.Print "Title element not found"
End If
Exit Sub
ErrorHandler:
MsgBox "Error " & Err.Number & ": " & Err.Description
Debug.Print "Error in SafeScrape: " & Err.Description
End SubRetry Logic
Function FetchWithRetry(url As String, maxRetries As Long) As String
Dim http As New MSXML2.XMLHTTP60
Dim attempt As Long
For attempt = 1 To maxRetries
On Error Resume Next
http.Open "GET", url, False
http.setRequestHeader "User-Agent", "Mozilla/5.0"
http.send
If Err.Number = 0 And http.Status = 200 Then
FetchWithRetry = http.responseText
Exit Function
End If
On Error GoTo 0
Debug.Print "Attempt " & attempt & " failed for " & url
Application.Wait Now + TimeValue("00:00:02")
Next attempt
FetchWithRetry = ""
End FunctionScheduling Automatic Updates
Windows Task Scheduler
- Save your workbook as
.xlsm(macro-enabled) - Create a VBS wrapper script:
' run_scraper.vbs — save as a .vbs file
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = False
Set objWorkbook = objExcel.Workbooks.Open("C:\path\to\scraper.xlsm")
objExcel.Run "ScrapeAllPages"
objWorkbook.Save
objWorkbook.Close
objExcel.Quit- Open Windows Task Scheduler
- Create a new task that runs
wscript.exe "C:\path\to\run_scraper.vbs" - Set your desired schedule (daily, weekly, etc.)
Auto-Run on Open
' In ThisWorkbook module
Private Sub Workbook_Open()
' Ask before running
If MsgBox("Run the scraper?", vbYesNo) = vbYes Then
Call ScrapeAllPages
End If
End SubLimitations and Alternatives
Excel VBA Cannot:
- Render JavaScript (use Playwright or Selenium)
- Rotate proxies efficiently (use Python with proxies)
- Handle CAPTCHAs or advanced anti-bot measures
- Scale to thousands of pages (performance degrades)
- Run on macOS reliably (VBA support is limited)
Better Tools for Complex Scraping:
- Power Query — Built into Excel, handles many tasks without VBA
- Google Sheets IMPORTHTML —
=IMPORTHTML("url", "table", 1)for simple table imports - Python — For anything beyond basic table extraction. See our Python web scraping guide
- Browser extensions — Tools like Web Scraper or Data Miner for visual scraping
FAQ
Can Excel scrape any website?
Excel can scrape static HTML websites. It cannot handle JavaScript-rendered content, CAPTCHAs, or sites with aggressive anti-bot protection. For those scenarios, use Python with Playwright or a dedicated scraping tool.
Is Power Query better than VBA for web scraping?
For table-based data, Power Query is better — it requires no code and auto-refreshes. VBA is better when you need to parse non-table HTML, handle pagination, or perform complex data extraction logic.
Can I use proxies with Excel VBA?
VBA’s XMLHTTP uses system proxy settings. You can configure a proxy in Windows Internet Options, but proxy rotation is not practical in VBA. For proxy-based scraping, use Python with rotating proxies.
How many pages can Excel VBA scrape?
Practically, Excel VBA handles up to a few hundred pages before becoming slow. The spreadsheet itself becomes unwieldy beyond 100,000 rows. For large-scale scraping, use Python with Scrapy.
Does web scraping in Excel work on Mac?
Limited. VBA on macOS does not support the MSXML2.XMLHTTP or HTMLDocument objects. Power Query works on Mac with Microsoft 365 but has fewer data source options. For Mac users, Python is the recommended alternative.
For more advanced scraping, explore Python web scraping and our proxy glossary. See our web scraping proxy guide for proxy setup.
External Resources:
- Microsoft Power Query Documentation
- VBA MSXML Documentation)
- VBA-JSON Library
- aiohttp + BeautifulSoup: Async Python Scraping
- Axios + Cheerio: Lightweight Node.js Scraping
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix
- How to Build an Ethical Web Scraping Policy for Your Company
- aiohttp + BeautifulSoup: Async Python Scraping
- Axios + Cheerio: Lightweight Node.js Scraping
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix
- How to Build an Ethical Web Scraping Policy for Your Company
Related Reading
- aiohttp + BeautifulSoup: Async Python Scraping
- Axios + Cheerio: Lightweight Node.js Scraping
- How Anti-Bot Systems Detect Scrapers (Cloudflare, Akamai, PerimeterX)
- API vs Web Scraping: When You Need Proxies (and When You Don’t)
- ASEAN Data Protection Laws: A Web Scraping Compliance Matrix
- How to Build an Ethical Web Scraping Policy for Your Company
last updated: April 3, 2026