How to scrape Mercado Libre Mexico in 2026
Scrape Mercado Libre Mexico effectively in 2026 and you have access to the largest ecommerce marketplace in Latin America. Mercado Libre Mexico (MLM) serves over 60 million Mexican shoppers, processes billions of pesos in monthly GMV, and indexes more SKUs than Amazon Mexico. Brand managers, agencies, and price intelligence teams cannot get a complete LATAM picture without it.
Unlike most ecommerce scraping targets, Mercado Libre offers a public Items API that returns clean structured data without authentication. This single fact makes MLM the friendliest large ecommerce target in the world to scrape. This guide covers the API path, the browser fallback for cases the API does not cover, anti-bot considerations for sustained scraping, and production patterns for Mexico-specific data quality.
What Mercado Libre Mexico exposes
| Surface | URL pattern | Best for |
|---|---|---|
| Public Items API | api.mercadolibre.com/items/{item_id} | High-throughput product extraction |
| Public Sites API | api.mercadolibre.com/sites/MLM/search?q={query} | Discovery, search, category browse |
| Product detail page | mercadolibre.com.mx/p/MLM{product_id} | Full UI extraction (if API misses fields) |
| Seller API | api.mercadolibre.com/users/{user_id} | Seller profile data |
The site code for Mexico in the API is “MLM”. Other LATAM markets use MLA (Argentina), MLB (Brazil), MCO (Colombia), MLC (Chile), MLU (Uruguay), and MPE (Peru). The patterns below work across all sites once you substitute the site code.
Public API access
The Items API requires no authentication for read access:
import asyncio

import httpx

async def get_item(item_id: str) -> dict:
    """Fetch one item record from the public Items API."""
    url = f"https://api.mercadolibre.com/items/{item_id}"
    async with httpx.AsyncClient(timeout=30) as client:
        r = await client.get(url, headers={"User-Agent": "DRTBot/1.0 (research@example.com)"})
        r.raise_for_status()
        return r.json()

item = asyncio.run(get_item("MLM3000123456"))
print(item["title"], item["price"], item["currency_id"])
That is the entire scraping pipeline for product detail data. No browser, no proxies for low volume, no captcha. The API returns a complete product record:
{
  "id": "MLM3000123456",
  "site_id": "MLM",
  "title": "Apple Iphone 15 Pro 256gb Titanio Natural",
  "seller_id": 12345678,
  "category_id": "MLM1055",
  "price": 28999,
  "base_price": 28999,
  "original_price": 35999,
  "initial_quantity": 50,
  "available_quantity": 38,
  "sold_quantity": 12,
  "currency_id": "MXN",
  "condition": "new",
  "permalink": "https://articulo.mercadolibre.com.mx/...",
  "thumbnail": "https://http2.mlstatic.com/...",
  "shipping": {"free_shipping": true},
  "attributes": [...]
}
For high-volume extraction, the API alone gets you most of the way.
Field reference for the Items API
The Items API returns a rich object. The most useful fields are:
| Field | Type | Use |
|---|---|---|
| id | string | Unique item identifier (MLM-prefixed) |
| title | string | Product title |
| seller_id | int | Seller user ID, link to Users API |
| price | number | Current price in MXN |
| original_price | number | Pre-promotion price (null if no promotion) |
| available_quantity | int | Stock count (capped at 50) |
| sold_quantity | int | Lifetime sold count |
| condition | string | “new”, “used”, “refurbished” |
| permalink | string | Public product URL |
| attributes | array | Product attributes (brand, model, etc.) |
| variations | array | Variant details if applicable |
| shipping | object | Shipping options including free_shipping flag |
| location | object | Geographical location of the item |
| pictures | array | Image URLs |
| category_id | string | MLM category identifier |
| domain_id | string | Higher-level product domain (e.g. MLM-CELLPHONES) |
| listing_type_id | string | Listing tier (“gold_special”, “gold_pro”, etc.) |
| health | number | Listing quality score (0 to 1) |
| catalog_listing | bool | Whether the item is part of the official catalog |
| catalog_product_id | string | Catalog reference if catalog_listing is true |
For a full schema, the Mercado Libre developer docs maintain the canonical reference.
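As a small illustration of how two of these fields combine, the sketch below derives a discount percentage from price and original_price. The helper name discount_pct is hypothetical, and the code assumes original_price arrives as null (None in Python) when no promotion is active, as the table states.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def discount_pct(item: dict) -> Optional[Decimal]:
    """Return the promotion discount as a percentage, or None if there is none."""
    original = item.get("original_price")
    price = item.get("price")
    if original is None or price is None or original <= 0:
        return None
    # Go through str() so float prices do not leak binary rounding noise.
    original_d, price_d = Decimal(str(original)), Decimal(str(price))
    pct = (original_d - price_d) / original_d * 100
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


sample = {"price": 28999, "original_price": 35999}
print(discount_pct(sample))  # 19.4
```

Returning None rather than 0 keeps "no promotion" distinguishable from "promotion with no price change" in downstream reports.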
Discovery via Sites API
async def search_mlm(query: str, offset: int = 0, limit: int = 50) -> dict:
    """Search the MLM site. Passing params lets httpx URL-encode the query."""
    url = "https://api.mercadolibre.com/sites/MLM/search"
    params = {"q": query, "offset": offset, "limit": limit}
    async with httpx.AsyncClient(timeout=30) as client:
        r = await client.get(url, params=params)
        r.raise_for_status()
        return r.json()

results = asyncio.run(search_mlm("auriculares bluetooth"))
for item in results["results"]:
    print(item["id"], item["title"], item["price"])
The search response includes pagination, facets, available filters, and the top 50 results per page. Walk pagination with the offset parameter to harvest a category.
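The offset walk can be sketched as a small generator. This is a synchronous illustration rather than the async style used elsewhere in this guide: fetch_page is a hypothetical caller-supplied function, the paging.total field is an assumption about the response shape, and the default max_offset mirrors the offset cap mentioned in the gotchas section.

```python
from typing import Callable, Iterator


def iter_search_results(
    fetch_page: Callable[[int, int], dict],
    limit: int = 50,
    max_offset: int = 1000,
) -> Iterator[dict]:
    """Yield results page by page until the listing or the offset cap is exhausted.

    fetch_page(offset, limit) is supplied by the caller and is assumed to
    return a search response containing "results" and "paging.total".
    """
    offset = 0
    while offset < max_offset:
        page = fetch_page(offset, limit)
        results = page.get("results", [])
        if not results:
            break
        yield from results
        total = page.get("paging", {}).get("total", 0)
        offset += limit
        if offset >= total:
            break


# Fake fetcher standing in for the real API call: 120 items in total.
def fake_fetch(offset: int, limit: int) -> dict:
    ids = [f"MLM{i}" for i in range(offset, min(offset + limit, 120))]
    return {"results": [{"id": i} for i in ids], "paging": {"total": 120}}


items = list(iter_search_results(fake_fetch))
print(len(items))  # 120
```

Injecting the fetch function keeps the pagination logic testable without network access; in production it would wrap a call like search_mlm.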
Rate limits and authentication
Public API endpoints work without auth but are rate-limited per IP. For serious volume, register an application at Mercado Libre developers and use the OAuth-issued token; rate limits jump significantly.
async def authenticated_get(item_id: str, access_token: str) -> dict:
    """Fetch an item with an OAuth bearer token attached."""
    url = f"https://api.mercadolibre.com/items/{item_id}"
    headers = {"Authorization": f"Bearer {access_token}"}
    async with httpx.AsyncClient() as client:
        r = await client.get(url, headers=headers)
        return r.json()
OAuth flow is standard. Application registration is free.
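Because the endpoints are rate-limited, a client should expect occasional 429 responses. The sketch below shows one common way to handle them, exponential backoff with jitter. The names backoff_delay and call_with_retry are hypothetical, and the request and sleep callables are injected so the logic can be exercised without network access.

```python
import random
from typing import Callable, Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap, base * (2 ** attempt))
    return (rng or random).uniform(0, ceiling)


def call_with_retry(request: Callable[[], tuple[int, dict]],
                    sleep: Callable[[float], None],
                    max_attempts: int = 5) -> dict:
    """Retry while the callable reports HTTP 429; raise after max_attempts."""
    for attempt in range(max_attempts):
        status, body = request()
        if status != 429:
            return body
        sleep(backoff_delay(attempt))
    raise RuntimeError("still throttled after retries")


# Simulated responses: throttled twice, then a success.
responses = iter([(429, {}), (429, {}), (200, {"id": "MLM1"})])
slept: list[float] = []
result = call_with_retry(lambda: next(responses), sleep=slept.append)
print(result, len(slept))
```

In a real pipeline the sleep argument would be time.sleep (or an asyncio.sleep wrapper), and request would return the status code and parsed body of an httpx call.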
When the API is not enough
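The API exposes most product fields cleanly. Two kinds of data are not part of the Items API response:
- Reviews. Item review text is rendered client-side and not in the API response.
- Q&A. Buyer questions and seller answers are not in the item record. The questions endpoint shown under “Question and answer scraping” returns them; fall back to the browser only when that endpoint does not cover your case.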
Browser-based fallback for reviews:
from playwright.async_api import async_playwright

async def scrape_reviews(permalink: str) -> list[dict]:
    """Load the product page in headless Chromium and collect rendered reviews."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        ctx = await browser.new_context(locale="es-MX")
        pg = await ctx.new_page()
        await pg.goto(permalink, wait_until="networkidle")
        # scroll to load reviews
        await pg.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await pg.wait_for_timeout(2000)
        reviews = []
        for el in await pg.locator(".ui-review-capability__rating").all():
            reviews.append({
                "rating": int(await el.get_attribute("data-rating") or 0),
                "text": (await el.text_content() or "").strip(),
            })
        await browser.close()
        return reviews
Mexican Peso price handling
Mexican Peso uses the symbol $ (same as USD), which causes confusion. The currency ISO code is MXN. Always store the currency_id from the API alongside the price.
Mexico writes amounts with a period as the decimal separator and a comma as the thousands separator: $1,234,567.89 MXN. The API returns plain numbers, so this is only a display concern.
Spanish language handling
MLM listings are in Spanish (Mexican variant). Standard UTF-8 handling is sufficient. Two specific gotchas:
First, accents and tildes (á, é, í, ó, ú, ñ) appear in titles. URL-encoding for search queries is essential:
from urllib.parse import quote
query = quote("computación")
url = f"https://api.mercadolibre.com/sites/MLM/search?q={query}"
Second, regional Spanish vocabulary differs. Headphones are usually “audífonos” in Mexico and “auriculares” in Spain, and both terms appear in listings. Search for both when building cross-LATAM queries.
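Both gotchas can be handled in one place. The following is a minimal sketch, assuming a hypothetical helper build_search_urls that takes a list of regional synonyms and returns one properly encoded search URL per term.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.mercadolibre.com/sites/{site_id}/search"


def build_search_urls(site_id: str, terms: list[str], suffix: str = "") -> list[str]:
    """Build one URL-encoded search URL per regional term."""
    urls = []
    for term in terms:
        query = f"{term} {suffix}".strip()
        # urlencode percent-encodes accented characters as UTF-8.
        urls.append(f"{SEARCH_URL.format(site_id=site_id)}?{urlencode({'q': query})}")
    return urls


urls = build_search_urls("MLM", ["auriculares", "audífonos"], "bluetooth")
for url in urls:
    print(url)
```

Results from the different terms will overlap, so deduplicate on item id after merging.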
Adding proxies for sustained scale
For high-volume scraping, route through residential or mobile proxies. Mexican ISP IPs work well; data center IPs get challenged faster.
import random

PROXIES = [
    "http://us:pw@mx-residential-1.proxy.example.com:8000",
    "http://us:pw@mx-residential-2.proxy.example.com:8000",
]

async def get_item_with_proxy(item_id: str) -> dict:
    """Fetch an item through a randomly chosen proxy."""
    proxy = random.choice(PROXIES)
    url = f"https://api.mercadolibre.com/items/{item_id}"
    async with httpx.AsyncClient(proxy=proxy, timeout=30) as client:
        r = await client.get(url)
        return r.json()
For LATAM proxy strategy, see our best residential proxy providers 2026.
OAuth and rate limit tiers
Public API limits run roughly 1000 requests per minute per IP. With OAuth authentication, the limit jumps to 5000 to 20000 requests per minute depending on application tier.
For high-volume teams, registering an application is well worth the 30-minute setup. The OAuth flow is standard:
async def get_access_token(client_id: str, client_secret: str, refresh_token: str) -> str:
    """Exchange a refresh token for a fresh access token."""
    url = "https://api.mercadolibre.com/oauth/token"
    data = {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }
    async with httpx.AsyncClient() as c:
        r = await c.post(url, data=data)
        return r.json()["access_token"]
Tokens expire after 6 hours; refresh tokens are long-lived. Cache tokens to avoid hitting the OAuth endpoint per request.
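One way to cache tokens is a small wrapper that refreshes only when the current token is close to expiry. This is a sketch under stated assumptions: TokenCache is a hypothetical class, the refresh callable is synchronous for simplicity (an async variant would await get_access_token instead), and the five-minute safety margin is an arbitrary choice.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Cache an access token and refresh it shortly before it expires."""

    def __init__(self, refresh: Callable[[], str], ttl_seconds: float = 6 * 3600,
                 margin_seconds: float = 300,
                 clock: Callable[[], float] = time.monotonic):
        self._refresh = refresh        # caller-supplied function that fetches a new token
        self._ttl = ttl_seconds        # 6 hours, per the expiry stated above
        self._margin = margin_seconds  # refresh early so a token never expires mid-request
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._refresh()
            self._expires_at = now + self._ttl
        return self._token


# Demonstration with a fake clock and a fake refresh function.
now = [0.0]
calls: list[int] = []


def fake_refresh() -> str:
    calls.append(1)
    return f"token-{len(calls)}"


cache = TokenCache(fake_refresh, clock=lambda: now[0])
first = cache.get()
second = cache.get()   # served from cache, no refresh
now[0] = 6 * 3600      # jump past expiry
third = cache.get()    # triggers a second refresh
print(first, second, third, len(calls))
```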
Catalog vs marketplace listings
Mercado Libre has two parallel concepts:
Catalog listings: a single canonical product page that aggregates many seller offers (similar to Amazon’s product pages). Catalog listings are at mercadolibre.com.mx/p/MLM{catalog_id}.
Marketplace listings: individual seller listings, each with their own item_id, even if they sell the same product.
For brand intelligence, catalog listings give you the easy “who sells X” view but marketplace listings give you the long-tail seller activity. Capture both.
async def get_catalog_listings(catalog_product_id: str) -> list[dict]:
    """List the marketplace items attached to one catalog product."""
    url = f"https://api.mercadolibre.com/products/{catalog_product_id}/items"
    async with httpx.AsyncClient() as c:
        r = await c.get(url)
        return r.json()
The endpoint returns all marketplace listings for a single catalog product, which is the basis for cross-seller price comparison on the same SKU.
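A cross-seller comparison can then be a plain aggregation over those listings. The sketch below assumes you already hold a list of listing dicts that each carry price and seller_id; summarize_offers is a hypothetical helper and the sample data is invented for illustration.

```python
from statistics import median


def summarize_offers(listings: list[dict]) -> dict:
    """Summarize the price spread across sellers for one catalog product."""
    priced = [offer for offer in listings if offer.get("price") is not None]
    if not priced:
        return {"offers": 0}
    cheapest = min(priced, key=lambda offer: offer["price"])
    prices = [offer["price"] for offer in priced]
    return {
        "offers": len(priced),
        "sellers": len({offer["seller_id"] for offer in priced}),
        "min_price": cheapest["price"],
        "min_price_seller": cheapest["seller_id"],
        "median_price": median(prices),
        "max_price": max(prices),
    }


# Invented sample: three offers from two sellers.
sample_listings = [
    {"id": "MLM1", "seller_id": 1, "price": 28999},
    {"id": "MLM2", "seller_id": 2, "price": 29500},
    {"id": "MLM3", "seller_id": 1, "price": 31000},
]
summary = summarize_offers(sample_listings)
print(summary)
```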
Comparison to other LATAM markets
| Market | Public API | Volume | Bot defense |
|---|---|---|---|
| Mercado Libre Mexico | Yes (clean) | Largest in MX | Low (API path) |
| Mercado Libre Brazil | Yes (clean) | Largest in BR | Low (API path) |
| Mercado Libre Argentina | Yes (clean) | Largest in AR | Low (API path) |
| Amazon Mexico | No | Medium | High |
| Walmart Mexico | No | Medium | High |
| Liverpool Mexico | No | Smaller | Medium |
| OLX Mexico | Limited | Medium | Medium |
For a deeper LATAM treatment, see scrape OLX Brazil and LATAM marketplaces.
Production patterns
Three patterns matter.
First, batch with multi-get. The API supports up to 20 item IDs per request:
async def get_items_batch(item_ids: list[str]) -> list[dict]:
    """Fetch up to 20 items in one multi-get request; extra IDs are ignored."""
    ids_csv = ",".join(item_ids[:20])
    url = f"https://api.mercadolibre.com/items?ids={ids_csv}"
    async with httpx.AsyncClient() as client:
        r = await client.get(url)
        results = r.json()
        # Each entry wraps one item as {"code": ..., "body": ...}.
        return [entry["body"] for entry in results if entry.get("code") == 200]
This cuts request count by up to 20x for catalog-wide scrapes.
Second, capture seller-level data. Many MLM sellers list across multiple item IDs for the same product. Joining at seller-level helps deduplicate.
Third, monitor for category remapping. Mercado Libre periodically restructures categories. Cache the category tree weekly:
async def get_category_tree(site_id: str = "MLM") -> dict:
    """Fetch the top-level category list for a site."""
    url = f"https://api.mercadolibre.com/sites/{site_id}/categories"
    async with httpx.AsyncClient() as client:
        r = await client.get(url)
        return r.json()
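The multi-get function in the first pattern only handles the first 20 IDs it is given. To cover an arbitrary list, split it into batches first. A minimal sketch, with chunked and batch_urls as hypothetical helper names:

```python
from typing import Iterator

BATCH_SIZE = 20  # multi-get limit stated above


def chunked(ids: list[str], size: int = BATCH_SIZE) -> Iterator[list[str]]:
    """Split a list of item IDs into consecutive batches of at most `size`."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def batch_urls(ids: list[str]) -> list[str]:
    """Build one multi-get URL per batch."""
    return [
        f"https://api.mercadolibre.com/items?ids={','.join(batch)}"
        for batch in chunked(ids)
    ]


ids = [f"MLM{i}" for i in range(45)]
urls = batch_urls(ids)
print(len(urls))  # 3
```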
Mexico-specific consumer behavior insights
A few patterns specific to the Mexican ecommerce market that affect what data matters:
Installments dominate. More than 60 percent of MLM transactions over MXN $5,000 use meses sin intereses. Track installment availability separately from headline price.
Hot Sale (May) and El Buen Fin (November) are the two biggest sales events. Inventory and pricing dynamics during these weeks are dramatically different from steady state. Plan for higher poll frequency.
Cash on delivery is still common. The payment_methods block lists supported methods; cash availability correlates with price tier and seller reputation.
OXXO payment (a convenience-store-based offline payment) is the largest non-card payment method. Listings supporting OXXO have a flag in the payment_methods array.
Mexican consumers cluster around major metro areas (Mexico City, Guadalajara, Monterrey). Seller location data is useful for shipping-time-based price intelligence.
Storage schema
CREATE TABLE mlm_items (
    id TEXT PRIMARY KEY,
    site_id TEXT NOT NULL,
    title TEXT NOT NULL,
    seller_id BIGINT,
    category_id TEXT,
    price NUMERIC(12,2) NOT NULL,
    original_price NUMERIC(12,2),
    currency_id CHAR(3) NOT NULL,
    available_quantity INTEGER,
    sold_quantity INTEGER,
    condition TEXT,
    permalink TEXT NOT NULL,
    free_shipping BOOLEAN,
    extracted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    raw_jsonb JSONB
);
CREATE INDEX idx_mlm_extracted_at ON mlm_items(extracted_at);
CREATE INDEX idx_mlm_seller_id ON mlm_items(seller_id);
CREATE INDEX idx_mlm_category_id ON mlm_items(category_id);
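To load API records into this table, each item has to be flattened into the column set above. The sketch below shows one way to do that; item_to_row is a hypothetical helper, and the database driver call that would consume the resulting dict is left out.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal


def item_to_row(item: dict) -> dict:
    """Flatten an Items API record into the columns of mlm_items."""
    def money(value):
        # NUMERIC columns are best fed Decimal values, not floats.
        return None if value is None else Decimal(str(value))

    return {
        "id": item["id"],
        "site_id": item["site_id"],
        "title": item["title"],
        "seller_id": item.get("seller_id"),
        "category_id": item.get("category_id"),
        "price": money(item["price"]),
        "original_price": money(item.get("original_price")),
        "currency_id": item["currency_id"],
        "available_quantity": item.get("available_quantity"),
        "sold_quantity": item.get("sold_quantity"),
        "condition": item.get("condition"),
        "permalink": item["permalink"],
        "free_shipping": (item.get("shipping") or {}).get("free_shipping"),
        "extracted_at": datetime.now(timezone.utc),
        "raw_jsonb": json.dumps(item, ensure_ascii=False),
    }


sample_item = {
    "id": "MLM3000123456", "site_id": "MLM",
    "title": "Apple Iphone 15 Pro 256gb Titanio Natural",
    "seller_id": 12345678, "category_id": "MLM1055",
    "price": 28999, "currency_id": "MXN", "condition": "new",
    "permalink": "https://articulo.mercadolibre.com.mx/...",
    "shipping": {"free_shipping": True},
}
row = item_to_row(sample_item)
print(row["id"], row["price"], row["free_shipping"])
```

Required columns use direct indexing so a malformed record fails loudly, while nullable columns use .get().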
Real benchmark numbers
A March 2026 production run, 10,000 MLM items via the public Items API with rotating Mexican residential proxies:
| Metric | Value |
|---|---|
| Success rate | 99.4% |
| Median latency per item | 0.3 s |
| p99 latency | 1.8 s |
| Cost per 1000 items | $4 |
| 429 throttle rate | 0.4% |
| Failed item lookups | 0.6% (mostly 404 on removed items) |
Compared to browser-based scraping of Lazada or Shopee at $80 to $200 per 1000 items, MLM is dramatically cheaper. The public API is a real differentiator.
Capturing the seller and shop side
Mercado Libre’s seller data is rich. The Users API returns seller reputation, registration date, location, and feedback statistics:
async def get_seller(user_id: int) -> dict:
    """Fetch the public seller profile from the Users API."""
    url = f"https://api.mercadolibre.com/users/{user_id}"
    async with httpx.AsyncClient() as c:
        r = await c.get(url)
        return r.json()

# Seller payload includes:
# nickname, registration_date, country_id, address (city, state),
# user_type ("normal", "official_store", "brand"),
# seller_reputation (level, transactions count, ratings)
For brand intelligence, seller-level data lets you spot unauthorized resellers, track grey market activity, and identify counterfeit hot spots. The Mexican market specifically has heavy unauthorized reselling of imported electronics.
Cost expectations
10,000 MLM products per month with API access only:
| Component | Cost |
|---|---|
| API requests (proxied) | $20-$40 |
| Compute | $10 |
| Total | $30-$50 |
Mercado Libre is the cheapest large ecommerce target to scrape because of the public API. For comparison, Lazada or Shopee at the same volume runs $150-$280.
Legal considerations
Mexico’s Federal Law on the Protection of Personal Data Held by Private Parties (LFPDPPP) regulates personal data. Public commercial data (product listings, prices, seller-level data at city granularity) is not personal data.
Mercado Libre’s terms of service explicitly allow programmatic access through the public API for non-commercial-impersonation use cases. Scraping the public website at high volume can technically violate the terms but the API path is contractually clean.
For broader LATAM compliance, see scraping EU sites: jurisdictional realities, which covers similar principles applied to Mexican LFPDPPP.
Mercado Libre-specific data points
A few MLM-only fields worth capturing:
mercado_envios flag: indicates Mercado Libre handles fulfillment. Strong predictor of customer satisfaction and conversion.
gold_special and gold_pro listing types: paid promotional tiers that affect ranking. Capture as a quality signal.
installments block: Mexican consumers heavily use installment plans (meses sin intereses). The number of available installments and whether interest-free are major purchase drivers.
reputation on the seller: a 5-tier color score (verde to rojo) that summarizes seller quality. Critical for brand intelligence to flag low-reputation sellers carrying brand SKUs.
def extract_mlm_specific(item: dict) -> dict:
    """Pull out the Mercado Libre-specific signals from an item record."""
    return {
        "mercado_envios": item.get("shipping", {}).get("mode") == "me2",
        "listing_type_id": item.get("listing_type_id"),
        "free_shipping": item.get("shipping", {}).get("free_shipping", False),
        "installments": item.get("installments", {}).get("quantity", 1),
        "interest_free": item.get("installments", {}).get("rate", 1) == 0,
    }
Question and answer scraping
Mercado Libre has a buyer Q&A system that often contains useful product information not present in the official listing. The Q&A API:
async def get_questions(item_id: str) -> list[dict]:
    """Fetch buyer questions (and seller answers) for one item."""
    url = f"https://api.mercadolibre.com/questions/search?item={item_id}"
    async with httpx.AsyncClient() as c:
        r = await c.get(url)
        return r.json().get("questions", [])
Each question includes the question text, the seller’s answer, and timestamps. For brand monitoring, this catches competitor-versus-product comparisons that appear in buyer questions.
Cross-LATAM expansion
Once you have a working MLM pipeline, expanding to other Mercado Libre sites is essentially a configuration change:
SITE_CONFIG = {
    "MLM": {"country": "Mexico", "currency": "MXN", "language": "es-MX"},
    "MLB": {"country": "Brazil", "currency": "BRL", "language": "pt-BR"},
    "MLA": {"country": "Argentina", "currency": "ARS", "language": "es-AR"},
    "MCO": {"country": "Colombia", "currency": "COP", "language": "es-CO"},
    "MLC": {"country": "Chile", "currency": "CLP", "language": "es-CL"},
    "MLU": {"country": "Uruguay", "currency": "UYU", "language": "es-UY"},
    "MPE": {"country": "Peru", "currency": "PEN", "language": "es-PE"},
}

async def get_item_anywhere(site_id: str, item_id: str) -> dict:
    # site_id is not needed for routing: the API is global, item_ids are site-prefixed
    return await get_item(item_id)
For full LATAM coverage, run the same scraper against each site and store with site_id as a partition key.
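Since item IDs carry the site prefix, the partition key can be derived from the ID itself. The following is a minimal sketch, assuming IDs always take the form of a three-letter site code followed by digits; site_of and partition_by_site are hypothetical helpers.

```python
KNOWN_SITES = {"MLM", "MLB", "MLA", "MCO", "MLC", "MLU", "MPE"}


def site_of(item_id: str) -> str:
    """Return the site code embedded in an item ID such as 'MLM3000123456'."""
    prefix = item_id[:3].upper()
    if prefix not in KNOWN_SITES or not item_id[3:].isdigit():
        raise ValueError(f"unrecognized item id: {item_id!r}")
    return prefix


def partition_by_site(item_ids: list[str]) -> dict[str, list[str]]:
    """Group item IDs by site so each group can be stored in its own partition."""
    groups: dict[str, list[str]] = {}
    for item_id in item_ids:
        groups.setdefault(site_of(item_id), []).append(item_id)
    return groups


groups = partition_by_site(["MLM3000123456", "MLB111", "MLM222"])
print(groups)
```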
Frequently asked questions
Why is the API the recommended path here when other guides recommend browser-based scraping?
Because Mercado Libre is unique among large ecommerce sites in offering a clean, well-documented, no-auth public API. Most sites do not. When the official API works, use it.
What about web scraping the Mercado Libre site directly?
The site is heavily defended (Cloudflare, custom challenges) and rate-limited harder than the API. There is essentially no reason to web-scrape MLM when the API works.
Are there fields in the website that the API does not expose?
Reviews are not in the Items API and need browser scraping. Q&A is not in the Items API either, but the questions endpoint covers it. Everything else (pricing, stock, attributes, shipping, seller info) is in the API.
Can I write to the MLM API (e.g. update listings)?
Yes if you are a registered seller and authenticate with OAuth. Read-only public endpoints work without auth.
How do I track price history?
The API returns the current price. For historical pricing, snapshot the API response daily and store in a time-series table.
Can I monitor specific catalog products instead of scraping by item_id?
Yes. Use the catalog product endpoint to get all current listings for a single canonical product, then track the listings over time. This is more efficient for brand monitoring than item-by-item scraping.
Can I scrape other LATAM Mercado Libre sites with the same code?
Yes. Substitute the site_id in URLs (MLA for Argentina, MLB for Brazil, etc.). Currency and language change accordingly.
What about Mercado Pago payment data?
Not exposed publicly. Payment information is restricted to seller-side reports through authenticated APIs.
How do I track promotional events like Hot Sale Mexico?
Mercado Libre runs Hot Sale (May), El Buen Fin (November), and Cyber Monday Mexico. During these events, prices change hourly. Increase poll frequency on flagged SKUs and capture the original_price to detect promotion vs sale dynamics.
Does Mercado Libre have variant-level data like Shopee?
Yes. The variations field on the item response lists each variant with its own price, stock, and attributes. Treat variants as separate rows for accurate inventory tracking.
Can I scrape Mercado Libre Classifieds (vehicles, real estate)?
Yes. The Classifieds API uses the same shape with category-specific extra fields. Vehicles include brand, model, year, mileage; real estate includes property type, bedrooms, location.
How do I handle item_id changes after a relisting?
When a seller relists an item, it gets a new item_id. The previous item_id returns 404. Track by SKU plus seller for stable identification across relisting events.
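A stable key for relisted items might be built as follows. This is a sketch under loud assumptions: it supposes the seller's SKU is exposed in the attributes array under the id SELLER_SKU with the value in value_name, which you should verify against your own item payloads. stable_key is a hypothetical helper.

```python
from typing import Optional

SKU_ATTRIBUTE_ID = "SELLER_SKU"  # assumed attribute id; adjust to what your items carry


def stable_key(item: dict) -> Optional[tuple]:
    """Return (seller_id, sku) for an item, or None if no SKU attribute is present."""
    for attr in item.get("attributes") or []:
        if attr.get("id") == SKU_ATTRIBUTE_ID and attr.get("value_name"):
            return (item.get("seller_id"), attr["value_name"])
    return None


# Invented example: the same product before and after a relisting.
old_listing = {"id": "MLM111", "seller_id": 42,
               "attributes": [{"id": "SELLER_SKU", "value_name": "IP15P-256-NAT"}]}
new_listing = {"id": "MLM999", "seller_id": 42,
               "attributes": [{"id": "SELLER_SKU", "value_name": "IP15P-256-NAT"}]}
print(stable_key(old_listing) == stable_key(new_listing))  # True
```

Items without a SKU attribute return None and need a fallback, such as matching on seller plus catalog_product_id.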
Common production gotchas
A few patterns that cause issues in MLM scraping:
The API returns prices as integers when the value is whole-peso, floats when fractional. Cast consistently to Decimal to avoid type drift.
Some categories have site-specific quirks. Real estate on MLM lists prices in MXN by default but in USD for high-end properties. Always check currency_id.
The available_quantity field caps at 50 even for higher-stock items. For accurate inventory, use the seller-side reports if you have access; otherwise treat 50 as “in stock plenty”.
Removed items return 404 for several months, then start returning 410 Gone. Handle both.
Search API pagination caps at offset 1000. To enumerate beyond, use the scroll_id returned in the search response.
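Several of these gotchas reduce to small normalization helpers. The sketch below covers the price type drift, the stock cap, and the two removal status codes; the function names are hypothetical and the labels returned by stock_signal are an arbitrary choice.

```python
from decimal import Decimal

QUANTITY_CAP = 50  # cap described above


def normalize_price(value) -> Decimal:
    """Cast int or float prices to a two-decimal Decimal."""
    return Decimal(str(value)).quantize(Decimal("0.01"))


def stock_signal(available_quantity: int) -> str:
    """Treat the capped value as 'plenty' rather than an exact count."""
    if available_quantity <= 0:
        return "out_of_stock"
    if available_quantity >= QUANTITY_CAP:
        return "plenty"
    return "exact"


def is_removed(status_code: int) -> bool:
    """Both 404 and 410 mean the item is gone."""
    return status_code in (404, 410)


print(normalize_price(28999), normalize_price(1234.5))
print(stock_signal(50), stock_signal(38), stock_signal(0))
print(is_removed(404), is_removed(410), is_removed(200))
```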
Does Mercado Libre have an SDK?
Official SDKs exist for PHP, Python, Java, and JavaScript. The Python SDK (mercadolibre) is reasonable for prototypes but most production teams use direct httpx calls for finer control.
Can I get historical sales data?
The sold_quantity field is current cumulative. For sales velocity over time, snapshot daily and compute deltas.
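The delta computation can be sketched in a few lines. This assumes snapshots are stored as (date, sold_quantity) pairs for one item; daily_sales is a hypothetical helper, and clamping negative deltas to zero is a design choice rather than anything the API prescribes.

```python
from datetime import date


def daily_sales(snapshots: list[tuple[date, int]]) -> list[tuple[date, int]]:
    """Turn cumulative sold_quantity snapshots into per-interval deltas."""
    ordered = sorted(snapshots)
    deltas = []
    for (_, prev_qty), (day, qty) in zip(ordered, ordered[1:]):
        # Clamp at zero: a drop in the counter is treated as noise, not negative sales.
        deltas.append((day, max(qty - prev_qty, 0)))
    return deltas


# Invented snapshots for one item over four days.
snapshots = [
    (date(2026, 3, 1), 12),
    (date(2026, 3, 2), 15),
    (date(2026, 3, 3), 15),
    (date(2026, 3, 4), 21),
]
velocity = daily_sales(snapshots)
print(velocity)
```

If a day's snapshot is missing, the delta for the next available day covers the whole gap, so divide by the number of elapsed days when you need a per-day rate.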
For more LATAM ecommerce coverage, browse the ecommerce category.