Scraping data from EU sites: jurisdictional realities
Legal questions about scraping EU sites are rarely straightforward, because the jurisdictional rules in 2026 are layered: GDPR has extraterritorial reach, member states implement it differently, the e-Privacy Directive sits on top, the EU AI Act layers on for any AI training pipeline, and the post-Brexit UK adds a parallel UK GDPR regime. Most scraping teams operating against EU targets in 2026 are surprised by which authority claims jurisdiction over them and which national rules apply. This guide walks through the actual jurisdictional rules, the Schrems II and Schrems III data transfer realities, the Brexit divergence, and a working playbook for a scraping operator.
The audience is the technical lead or in-house counsel responsible for a pipeline that touches EU traffic and needs to know who can come after them, from where, under which rules.
Article 3 GDPR and what it actually reaches
Article 3 of the GDPR sets out the territorial scope of the regulation. It applies in two situations. First, where the processing takes place in the context of an establishment of a controller or processor in the Union (regardless of where processing occurs). Second, where the controller or processor is not established in the Union but processes personal data of data subjects in the Union, where the processing relates to the offering of goods or services or the monitoring of behaviour within the Union.
The “monitoring of behaviour” branch is what catches scraping operators. If you systematically collect personal data about EU residents, you are monitoring their behaviour, and GDPR applies to your processing regardless of where you sit. The 2024-2025 EDPB guidelines explicitly listed scraping for AI training and scraping for B2B intelligence as monitoring activities.
A scraping operator outside the EU who systematically targets EU sites is in scope. A scraping operator who incidentally hits EU residents while targeting global content is in a gray zone. The conservative reading: assume scope, document the assessment.
For the broader compliance picture, see the GDPR compliance guide and the personal vs public data framework.
The lead supervisory authority and one-stop shop
GDPR established a one-stop-shop mechanism: a controller with multiple EU establishments deals with the supervisory authority of its main establishment as the lead authority. The lead authority coordinates cross-border investigations.
For a scraper without an EU establishment, the one-stop shop does not apply. You can be investigated by any national supervisory authority where data subjects whose data you process are located. The Italian Garante, the French CNIL, the Dutch AP, the German BfDI, and the Spanish AEPD have all opened investigations of non-EU scrapers in 2024-2025.
Each authority has different enforcement priorities, fine ranges, and procedural styles. A scraper facing parallel investigations from several national authorities is in a worst-case scenario, because it cannot consolidate its defence under the one-stop-shop rules.
| Authority | Country | Notable scraping enforcement (2024-2025) |
|---|---|---|
| Garante | Italy | Multiple AI training scraping investigations; high-profile fines |
| CNIL | France | B2B contact scraping; cookies and AI training focus |
| AP | Netherlands | Cross-border scraping investigations; tight cooperation |
| BfDI / state DPAs | Germany | Fragmented (16 state authorities); strict on scraping |
| AEPD | Spain | Active on profiling; significant fines |
| ICO | UK | Post-Brexit divergence; pragmatic but firm |
| DPC | Ireland | Lead for many big tech; slow but high-stakes |
The pragmatic move: identify your most exposed jurisdictions (top three EU markets where your data subjects are most concentrated) and align compliance to the strictest of those.
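One way to make that ranking concrete is to count data subjects by country in whatever store holds the scraped records. The sketch below assumes each record already carries an ISO 3166-1 alpha-2 country code for the data subject; `top_exposures` and its input shape are hypothetical names used for illustration.

```python
from collections import Counter

def top_exposures(country_codes, n=3):
    """Rank countries by how many data-subject records they account for."""
    # Normalise case and skip records that have no country tag.
    counts = Counter(code.upper() for code in country_codes if code)
    return counts.most_common(n)

# Hypothetical sample: one country code per stored record.
sample = ["IT", "it", "FR", "DE", "IT", "FR", None]
print(top_exposures(sample))
```

The output is a list of `(country, record_count)` pairs, which is enough to decide whose supervisory authority sets your compliance baseline.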
Schrems II, Schrems III, and cross-border transfer
If you scrape EU personal data and transfer it outside the EEA (to your US-based servers, for example), the transfer must comply with Chapter V of the GDPR. The two main mechanisms in 2026:
Adequacy decisions: a finding by the European Commission that a third country provides an adequate level of protection. The list includes the UK, Switzerland, Japan, South Korea, New Zealand, Canada (commercial only), Israel, Argentina, Uruguay, the Faroe Islands, Guernsey, Jersey, the Isle of Man, and the US under the Data Privacy Framework (DPF, replacing the invalidated Privacy Shield).
Standard Contractual Clauses (SCCs): the Commission’s 2021 SCCs as updated, plus a Transfer Impact Assessment (TIA) demonstrating that the recipient country provides essentially equivalent protection.
Schrems II (2020) invalidated Privacy Shield and required TIAs for SCCs. Schrems III is the inevitable challenge to the Data Privacy Framework, expected to be heard by the CJEU in 2026 or 2027. If Schrems III invalidates the DPF, US-based scrapers will need to fall back to SCCs plus TIAs for every transfer, which is operationally heavy.
The conservative posture for a scraper: use SCCs plus TIAs even where DPF coverage exists, because the DPF could fall at any time and your operations should not depend on its survival.
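The conservative posture can be encoded as a default in the pipeline, so that no dataset leaves the EEA without a recorded transfer basis. The following is a minimal sketch, not a legal determination: the function name, the `rely_on_dpf` switch, and the country sets are assumptions for illustration, and the adequacy set is only the subset named above (Canada is left out because its adequacy is limited to commercial organisations). Verify it against the Commission's current list before use.

```python
# EEA members: the 27 EU member states plus Iceland, Liechtenstein, and Norway.
EEA = {
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR",
    "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK",
    "SI", "ES", "SE", "IS", "LI", "NO",
}

# Illustrative subset of full adequacy decisions; check the Commission's list.
ADEQUATE = {"GB", "CH", "JP", "KR", "NZ", "IL", "AR", "UY", "FO", "GG", "JE", "IM"}

def transfer_mechanism(destination, dpf_certified=False, rely_on_dpf=False):
    """Return the transfer basis to record for a destination country code."""
    dest = destination.upper()
    if dest in EEA:
        return "none: stays inside the EEA"
    if dest in ADEQUATE:
        return "adequacy"
    # Conservative default: ignore DPF coverage unless explicitly opted in.
    if dest == "US" and dpf_certified and rely_on_dpf:
        return "adequacy (DPF)"
    return "SCCs + TIA"

print(transfer_mechanism("US", dpf_certified=True))
```

With `rely_on_dpf` left at its default, a DPF-certified US recipient is still routed to SCCs plus a TIA, so the pipeline keeps working if the DPF is invalidated.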
The e-Privacy Directive layer
GDPR is not the only EU data law. The e-Privacy Directive (2002/58/EC, as amended) governs cookies, electronic communications, and tracking. The forthcoming e-Privacy Regulation has been stuck in the legislative process for years and is unlikely to land before 2027.
For scrapers, the e-Privacy relevance is narrow but real. If your scraping involves placing cookies on user devices (it should not, but some scrapers use browser automation that does), you trigger e-Privacy. If your scraping involves intercepting electronic communications (you should not), you trigger e-Privacy.
The CNIL has been the most active enforcer of e-Privacy in 2024-2025, with multiple seven-figure fines against ad-tech operators. Scrapers who run residential proxy networks that also serve consumer ad-tech are in particular danger.
For the proxy infrastructure dimension, see the self-hosted proxy infrastructure guide.
Brexit and the UK divergence
Since 1 January 2021, the UK has been a third country for EU GDPR purposes. The UK enacted UK GDPR (a slightly modified version of EU GDPR) and the Data Protection Act 2018. The European Commission granted the UK an adequacy decision in 2021, valid for four years and renewed in 2025 with conditions.
The practical implication for scrapers: a US scraper transferring data to a UK processor or storing data on UK servers is in a defensible position because of the adequacy. A UK scraper processing EU data is in scope of EU GDPR (Article 3 still applies) and must comply with both regimes.
The UK Information Commissioner’s Office (ICO) has been more pragmatic than EU counterparts, with explicit guidance favouring proportionate enforcement. UK fines have generally been smaller than EU peers. But the UK Data Protection and Digital Information Bill (DPDI), introduced in 2023-2024, proposed changes to UK GDPR that the EU explicitly flagged as risking adequacy. The Bill lapsed at the 2024 general election, and much of its reform agenda returned in the Data (Use and Access) Act 2025. The 2025 renewal kept adequacy but on conditions; a future divergence could remove it.
Decision tree for an EU-touching scrape
Q1: Does your pipeline target EU residents specifically (B2C, news, social)?
├── Yes -> EU GDPR applies. Treat as in scope. Continue to Q3.
└── No -> Q2
Q2: Does your pipeline incidentally collect EU personal data?
├── No -> EU GDPR may not apply. Document the assessment.
└── Yes -> Q3
Q3: Will the data be transferred outside the EEA?
├── Yes -> Use SCCs plus TIA, or rely on adequacy decision. Continue to Q4.
└── No -> Q4
Q4: Do you have an EU establishment?
├── Yes -> Identify the lead supervisory authority. Continue to Q5.
└── No -> Risk of multiple-authority investigation. Continue to Q5.
Q5: Is your processing for AI training?
├── Yes -> EU AI Act layers on; transparency obligations apply.
└── No -> Standard GDPR posture.
Each branch produces a documented decision in the compliance register. The register is your defence.
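To show how the tree can feed the register, here is a minimal sketch that walks the five questions and returns one entry per question answered. It reflects one reading of the tree (an out-of-scope answer at Q2 ends the walk; otherwise Q3 to Q5 are all evaluated). The `Pipeline` fields and the `assess` function are hypothetical names, not an established schema.

```python
from dataclasses import dataclass

@dataclass
class Pipeline:
    targets_eu_residents: bool         # Q1
    incidental_eu_personal_data: bool  # Q2
    transfers_outside_eea: bool        # Q3
    has_eu_establishment: bool         # Q4
    feeds_ai_training: bool            # Q5

def assess(p):
    """Walk Q1-Q5 and return one register entry per question answered."""
    entries = []
    if p.targets_eu_residents:
        entries.append("Q1: EU GDPR applies; treat as in scope")
    elif p.incidental_eu_personal_data:
        entries.append("Q2: incidental EU personal data; continue as in scope")
    else:
        # Out of scope on this reading: record the assessment and stop.
        entries.append("Q2: EU GDPR may not apply; document the assessment")
        return entries
    if p.transfers_outside_eea:
        entries.append("Q3: use SCCs plus TIA, or rely on an adequacy decision")
    else:
        entries.append("Q3: no transfer outside the EEA")
    if p.has_eu_establishment:
        entries.append("Q4: identify the lead supervisory authority")
    else:
        entries.append("Q4: risk of multiple-authority investigation")
    if p.feeds_ai_training:
        entries.append("Q5: EU AI Act layers on; transparency obligations apply")
    else:
        entries.append("Q5: standard GDPR posture")
    return entries

for line in assess(Pipeline(False, True, True, False, True)):
    print(line)
```

Storing the returned entries with a date and an owner gives each run of the assessment a traceable record.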
Compliance checklist for cross-EU operators
| Control | What it requires | Why it matters |
|---|---|---|
| Article 3 territorial assessment | Documented, dated | Defence against claims |
| Lead authority identification (if EU establishment) | Written | One-stop shop benefit |
| Article 27 representative (if no EU establishment) | Appointed in EU | Mandatory for non-EU controllers in scope |
| Privacy notice in EU languages | At least major markets | Article 12 transparency |
| Lawful basis documented per source | LIA preferred | Article 6 |
| Data Protection Impact Assessment | If high risk | Article 35 |
| SCCs plus TIA for transfers | Per recipient country | Chapter V |
| Cookie compliance (if browser-based scraping) | Consent management | e-Privacy |
| Breach response plan | 72-hour notification | Article 33 |
| Records of processing | Article 30 register | Mandatory at scale |
| EU AI Act training data summary | If training | EU AI Act |
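The conditional rows in the table ("if EU establishment", "if high risk", and so on) lend themselves to a simple gap check. The sketch below is illustrative only: the profile keys and the `missing_controls` helper are assumed names, and the conditions are a direct transcription of the table rather than legal tests.

```python
def always(profile):
    return True

# (control name, predicate deciding whether the control applies to a profile)
CONTROLS = [
    ("Article 3 territorial assessment", always),
    ("Lead authority identification", lambda p: p["eu_establishment"]),
    ("Article 27 representative", lambda p: not p["eu_establishment"]),
    ("Privacy notice in EU languages", always),
    ("Lawful basis documented per source", always),
    ("Data Protection Impact Assessment", lambda p: p["high_risk"]),
    ("SCCs plus TIA for transfers", lambda p: p["transfers_outside_eea"]),
    ("Cookie compliance", lambda p: p["browser_based"]),
    ("Breach response plan", always),
    ("Records of processing", always),
    ("EU AI Act training data summary", lambda p: p["ai_training"]),
]

def missing_controls(profile, completed):
    """List applicable controls that are not yet marked complete."""
    done = set(completed)
    return [name for name, applies in CONTROLS
            if applies(profile) and name not in done]

profile = {
    "eu_establishment": False,
    "high_risk": False,
    "transfers_outside_eea": True,
    "browser_based": False,
    "ai_training": False,
}
print(missing_controls(profile, {"Article 3 territorial assessment"}))
```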
The Article 27 representative requirement
This is the requirement most non-EU scrapers miss. Article 27 GDPR requires controllers and processors not established in the EU but in scope of GDPR to designate, in writing, a representative in the Union. The representative must be in a member state where the relevant data subjects are.
The representative is the addressee for supervisory authority and data subject inquiries. The representative is not the controller, but they are the contact point. Failing to appoint a representative is itself a violation that can attract a fine.
The market for Article 27 representation services is mature in 2026. Major providers offer turnkey representation for annual fees that run from the low four figures to the low five figures, depending on scope. There is no good reason for a serious scraping operator to be without one.
For the parallel discussion of how AI training pipelines must structure their EU operations, see fair use and copyright for AI training data.
Member state divergence in practice
Even within the GDPR framework, member states diverge on several practical questions:
Germany interprets “scientific research” exceptions narrowly and has 16 state-level data protection authorities; cross-state coordination is sometimes slow. Fines tend to be moderate but enforcement is consistent.
France has a strong cookies-and-tracking enforcement focus through the CNIL. AI training scrapers have been singled out repeatedly. Fines tend to be larger.
Italy has been the most aggressive in 2024-2025 against scraping operators. The Garante has issued multiple injunctions and provisional measures. Italian enforcement is fast.
The Netherlands runs a pragmatic enforcement style with high willingness to settle for compliance commitments. Useful for engaged operators.
Spain has been active on profiling and behavioural monitoring, with significant fines for B2B contact scrapers.
Ireland is the lead authority for most big-tech operators with EU establishments in Dublin. Investigations are slow but stakes are high.
The pragmatic move: identify which member state your most exposed customer base or data subject base sits in, and align compliance to that authority’s expectations.
External references
The European Data Protection Board’s library of guidelines is at edpb.europa.eu/our-work-tools/our-documents. The European Commission’s adequacy decisions are at commission.europa.eu/law/law-topic/data-protection/international-dimension-data-protection/adequacy-decisions_en. The standard contractual clauses for international transfers are at eur-lex.europa.eu/eli/dec_impl/2021/914/oj.
Comparison: EU GDPR vs UK GDPR vs Swiss FADP
| Dimension | EU GDPR | UK GDPR | Swiss FADP (revised 2023) |
|---|---|---|---|
| Personal data definition | Same | Same | Mirrors GDPR |
| Lawful basis required | Yes | Yes | Yes (similar set) |
| Article 27 representative | Yes (EU) | Yes (UK) | Yes (Switzerland) if in scope |
| Cross-border transfer | SCCs / adequacy | SCCs / adequacy (separate UK list) | SCCs / adequacy (separate Swiss list) |
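| One-stop shop | Yes (controllers with an EU establishment) | N/A (single authority) | N/A |
| Maximum fine | EUR 20M or 4% of worldwide annual turnover, whichever is higher | GBP 17.5M or 4% of worldwide annual turnover, whichever is higher | CHF 250K (criminal liability for individuals) |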
| AI Act layer | Yes | No (separate AI safety regime forthcoming) | No (separate consultation) |
The three regimes are operationally similar but require separate compliance artefacts (separate representatives, separate transfer mechanisms, separate notices).
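For the EU and UK rows, the ceiling is the higher of the fixed amount and the turnover percentage, so the fixed figure only bites for smaller operators. A minimal arithmetic sketch follows; `fine_cap` is a hypothetical helper, turnover is assumed to be worldwide annual turnover in whole currency units, and the Swiss fixed amount is left out because it has no turnover component.

```python
EU_FIXED_CAP_EUR = 20_000_000
UK_FIXED_CAP_GBP = 17_500_000

def fine_cap(annual_turnover, fixed_cap, pct=4):
    """Upper-tier ceiling: the higher of the fixed cap and pct% of turnover."""
    # Integer arithmetic keeps whole currency units exact.
    return max(fixed_cap, annual_turnover * pct // 100)

# A 100M-turnover operator is capped by the fixed amount;
# a 1B-turnover operator is capped by the percentage.
print(fine_cap(100_000_000, EU_FIXED_CAP_EUR))
print(fine_cap(1_000_000_000, EU_FIXED_CAP_EUR))
```

The crossover sits at 500M of turnover for the EU cap, where 4% equals the fixed EUR 20M.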
A worked example: scraping news sites across France, Germany, and Italy
A scraper operating from California pulls news headlines from major French, German, and Italian publishers for a media intelligence product sold to PR agencies. The dataset includes headline text, byline (author name), publication, timestamp, and category.
Classification: byline is personal data; everything else is potentially copyright-protected. GDPR applies under Article 3(2) (monitoring of EU data subjects). EU AI Act layers on if the dataset is fed to a model.
Authorities: CNIL (France), BfDI plus relevant state DPAs (Germany), Garante (Italy). The Italian Garante is the most aggressive on AI/scraping topics in 2026; align baseline compliance there.
Required artefacts: Article 27 representative (one is sufficient if covering all three jurisdictions, typically based in the most relevant member state), LIA per source, SCCs plus TIA for transfer to California, privacy notice in French, German, and Italian, opt-out inbox, retention schedule, classification register.
The compliance overhead is real but tractable: a fortnight of lawyer time plus an Article 27 representative subscription. Compare that to the seven-figure exposure of an Italian Garante investigation.
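The classification register for this example can start as a field-level map, which also makes it easy to route personal data to storage with its own retention and access rules. This is a sketch under the classification given above; the field names mirror the dataset description, while `FIELD_CLASS`, `split_record`, and the choice to treat unclassified fields as personal data are assumptions for illustration.

```python
FIELD_CLASS = {
    "headline": "potentially_copyright",
    "byline": "personal_data",
    "publication": "potentially_copyright",
    "timestamp": "potentially_copyright",
    "category": "potentially_copyright",
}

def split_record(record):
    """Separate personal-data fields from the rest of a scraped record."""
    personal, other = {}, {}
    for field, value in record.items():
        # Unclassified fields default to personal data until reviewed.
        cls = FIELD_CLASS.get(field, "personal_data")
        target = personal if cls == "personal_data" else other
        target[field] = value
    return personal, other

# Hypothetical record for illustration.
record = {"headline": "Example headline", "byline": "A. Author", "category": "politics"}
personal, other = split_record(record)
print(personal, other)
```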
FAQ
Does GDPR apply to me if I am US-based?
Yes, if your processing relates to offering goods or services to EU data subjects, or to monitoring their behaviour. Scraping EU sites at scale typically counts as monitoring.
Do I need an EU representative?
If you are a non-EU controller in scope of GDPR, yes. Article 27 makes the representative mandatory.
Is the UK in or out of EU GDPR?
Out since Brexit. The UK has its own UK GDPR which is similar but diverges. Adequacy was renewed in 2025 with conditions.
Can I use Standard Contractual Clauses?
Yes, with a Transfer Impact Assessment. SCCs are the workhorse of cross-border transfer in 2026.
What happens if Schrems III invalidates the Data Privacy Framework?
US-based recipients lose adequacy and must fall back to SCCs plus TIAs. Build for that fallback today; it will cost less than scrambling later.
Extended jurisdictional analysis
The jurisdictional reach of EU privacy law over scraping is governed by Article 3 of the GDPR. Article 3(1) covers any processing in the context of the activities of an EU establishment regardless of where the processing occurs. Article 3(2) extends the regulation to controllers and processors outside the EU when they offer goods or services to data subjects in the Union or monitor their behaviour within the Union.
For scrapers the Article 3(2) prong is the operative provision. The European Data Protection Board’s 2019 guidance (Guidelines 3/2018, updated 2024) clarifies that scraping EU-located public websites for personal data of EU residents typically constitutes monitoring and triggers Article 3(2). The 2025 enforcement against several US-based people-data vendors confirmed this reading.
A second jurisdictional vector is the EU AI Act, which entered into force in August 2024 with phased application through 2027. The Act applies extraterritorially to providers of general-purpose AI models that place models on the EU market or whose outputs are used in the EU. Training data provenance is a documentation obligation under Article 53. Scrapers feeding GPAI training data therefore inherit indirect AI Act obligations.
A third vector is the Digital Services Act, which imposes systemic risk obligations on very large online platforms (VLOPs) and constrains how those platforms can be scraped. The DSA’s Article 40 data access regime for vetted researchers is one approved pathway.
Implementation patterns for cross-border scraping
A scraper handling EU-touching data in 2026 should implement nine controls.
- Designate an EU representative under Article 27 if the controller is outside the EU.
- Identify the lead supervisory authority if the controller has an EU main establishment; otherwise map each national authority that could investigate.
- Maintain Article 30 records of processing activities.
- Apply transfer safeguards (SCCs or adequacy) for any data leaving the EEA.
- Conduct a Transfer Impact Assessment per the Schrems II framework.
- Honour rights requests under the lead authority’s procedure.
- Document the LIA covering EU-resident data subjects.
- Apply a dedicated retention TTL for EU-resident records.
- Map the AI Act applicability if outputs feed model training.
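Of the controls above, the retention TTL is the easiest to enforce in code. The sketch below assumes each record stores a timezone-aware fetch timestamp and a jurisdiction tag; the 90-day and 365-day values are placeholders rather than recommendations, and `expires_at` and `is_expired` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

# Placeholder TTLs; the real values belong in your retention schedule.
RETENTION = {"EU": timedelta(days=90), "OTHER": timedelta(days=365)}

def expires_at(fetched_at, jurisdiction):
    """Return when a record should be purged, given its jurisdiction tag."""
    # Unknown tags fall back to the EU TTL, the stricter of the two here.
    return fetched_at + RETENTION.get(jurisdiction, RETENTION["EU"])

def is_expired(fetched_at, jurisdiction, now=None):
    now = now or datetime.now(timezone.utc)
    return now >= expires_at(fetched_at, jurisdiction)

fetched = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(expires_at(fetched, "EU").isoformat())
```

A scheduled purge job can then filter on `is_expired` and keep no separate state.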
Code pattern: jurisdiction tagging at fetch time
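import tldextract

# Country-code TLDs of the 27 EU member states, plus the .eu TLD.
EU_TLDS = {"de", "fr", "es", "it", "nl", "be", "pl", "se", "fi", "dk", "ie", "at", "pt", "cz", "ro", "gr", "hu", "bg", "sk", "hr", "lt", "lv", "ee", "lu", "cy", "mt", "si", "eu"}
# ISO country codes for the GeoIP check, derived from the TLD set.
EU_COUNTRIES = {tld.upper() for tld in EU_TLDS if tld != "eu"}

def jurisdiction_for(url, geo_ip=None):
    """Tag a fetch as "EU" or "OTHER" from the URL's TLD, then from GeoIP."""
    parsed = tldextract.extract(url)
    # Compare the last suffix label so multi-part suffixes like "com.pl" match.
    if parsed.suffix.rsplit(".", 1)[-1] in EU_TLDS:
        return "EU"
    # geo_ip may be None; when present it is assumed to expose country_iso.
    if geo_ip and geo_ip.country_iso in EU_COUNTRIES:
        return "EU"
    return "OTHER"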
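Where adding a third-party dependency is not an option, the TLD half of the check can be approximated with the standard library alone. This variant is a swapped-in technique and not the pattern above: it inspects only the last hostname label and skips the GeoIP fallback, and `jurisdiction_from_host` is a hypothetical name. A country-code TLD is a weak signal either way, since a .com site can serve mostly EU residents.

```python
from urllib.parse import urlparse

# Country-code TLDs of the 27 EU member states, plus the .eu TLD.
EU_CCTLDS = {
    "de", "fr", "es", "it", "nl", "be", "pl", "se", "fi", "dk", "ie", "at",
    "pt", "cz", "ro", "gr", "hu", "bg", "sk", "hr", "lt", "lv", "ee", "lu",
    "cy", "mt", "si", "eu",
}

def jurisdiction_from_host(url):
    """Classify a URL as EU, OTHER, or UNKNOWN from its last hostname label."""
    host = urlparse(url).hostname  # None when the URL has no scheme or host
    if not host:
        return "UNKNOWN"
    last_label = host.rstrip(".").rsplit(".", 1)[-1]
    return "EU" if last_label in EU_CCTLDS else "OTHER"

print(jurisdiction_from_host("https://www.example.de/news"))
```

Returning `UNKNOWN` for unparseable input keeps malformed URLs from being silently tagged as non-EU.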
Comparison: jurisdictional triggers across regimes
| Regime | Territorial trigger | Targeting trigger | Monitoring trigger |
|---|---|---|---|
| EU GDPR | Establishment | Goods or services to EU | Monitor EU behaviour |
| UK GDPR | Establishment | Goods or services to UK | Monitor UK behaviour |
| California CCPA | Doing business in CA | Threshold-based | N/A |
| Singapore PDPA | Activity in Singapore | Targeted at Singapore | Indirect |
| India DPDP | Processing in India | Offer goods or services to data principals in India | Implied |
Additional FAQ
Can I avoid GDPR by hosting outside the EU?
No, if Article 3(2) applies. Hosting location is not the operative test.
What is a lead supervisory authority?
The DPA in the member state where the controller’s main establishment lies. A non-EU controller with no EU establishment has no lead authority: appointing an Article 27 representative does not trigger the one-stop shop, so the controller deals with the DPA of each member state where affected data subjects are located.
Are there exemptions for journalism or research?
Yes under Article 85 (journalism) and Article 89 (research), but the exemptions are narrow and member-state implementation varies.
Does Brexit change UK obligations?
The UK GDPR mirrors most of the EU GDPR but is enforced by the ICO under the UK Data Protection Act 2018. Cross-border transfers between the UK and the EU rely on adequacy.
The EDPB targeting and monitoring tests
The European Data Protection Board’s Guidelines 3/2018 on the territorial scope of the GDPR provide the operative tests for Article 3(2). The targeting test asks whether the controller offers goods or services to data subjects in the Union. The monitoring test asks whether the controller monitors the behaviour of data subjects within the Union.
For the targeting test the EDPB lists factors including the use of an EU language, the use of an EU currency, the targeting of EU users in marketing, the availability of EU shipping, the use of EU top-level domains, and references to EU customers. A scraper does not typically offer goods or services in the targeting sense, but a downstream application using scraped data might.
For the monitoring test the EDPB explicitly mentions tracking, profiling, and behavioural analysis as triggers. A scraper that profiles individuals is monitoring under the test. A scraper that simply collects information without profiling is in a grayer zone, but the EDPB’s 2024 update suggests that aggregation followed by behavioural analysis qualifies as monitoring.
The practical implication is that most commercial scraping that touches EU residents triggers Article 3(2). The defensible posture is to assume in-scope and design accordingly, rather than to argue for an exception.
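When documenting the targeting assessment for a downstream product, it helps to record which of the listed factors are present. The sketch below only enumerates indicators. It does not return a verdict, because the factors are weighed together and no numeric threshold is given in the text above. The signal keys and `targeting_indicators` are hypothetical names.

```python
# Keys mirror the targeting factors listed above.
TARGETING_FACTORS = (
    "eu_language",
    "eu_currency",
    "eu_marketing",
    "eu_shipping",
    "eu_tld",
    "eu_customer_references",
)

def targeting_indicators(signals):
    """Return the targeting factors that are present for a product."""
    return [factor for factor in TARGETING_FACTORS if signals.get(factor)]

# Hypothetical product profile for illustration.
signals = {"eu_language": True, "eu_currency": False, "eu_tld": True}
print(targeting_indicators(signals))
```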
EU representative requirements under Article 27
A non-EU controller in scope of Article 3(2) must designate an EU representative under Article 27 unless an exception applies. Article 27(2) provides two: the controller is a public authority or body, or the processing is occasional, does not include large-scale processing of special category or criminal data, and is unlikely to result in a risk to data subjects. The three conditions in the second exception are cumulative. Most commercial scrapers do not fit either exception.
The EU representative is the point of contact for data subjects and supervisory authorities. The representative must be located in a member state where data subjects whose data is processed are located. The representative does not assume the controller’s obligations, but it can be addressed in enforcement proceedings and is directly responsible for its own duties, such as keeping records of processing under Article 30.
The 2026 market for EU representative services is mature. Several specialised firms offer the service for annual fees that run from the low four figures to the low five figures in euros, depending on scope, plus per-incident fees for handling rights requests. The cost is modest relative to the regulatory exposure of operating without a representative.
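The Article 27 test can be written down as a short predicate for the compliance register. This sketch reads the low-risk exception in Article 27(2)(a) as cumulative (occasional, no large-scale special category or criminal data, and risk unlikely). The function and parameter names are hypothetical, and the inputs are judgment calls that still need a documented rationale.

```python
def representative_required(
    in_scope_art_3_2,
    is_public_authority=False,
    occasional=False,
    large_scale_sensitive=False,
    risk_unlikely=False,
):
    """Return True when an Article 27 representative should be designated."""
    if not in_scope_art_3_2:
        return False
    if is_public_authority:
        return False
    # All three conditions must hold for the low-risk exception to apply.
    low_risk_exception = occasional and not large_scale_sensitive and risk_unlikely
    return not low_risk_exception

# A continuous commercial scraper: in scope, not occasional.
print(representative_required(True))
```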
Cross-border data transfer mechanics
GDPR Chapter V restricts transfers of personal data outside the EEA. The permitted mechanisms are adequacy decisions, Standard Contractual Clauses, Binding Corporate Rules, codes of conduct, certifications, and the limited derogations in Article 49.
For scrapers the most common mechanism is the SCCs. The 2021 SCC update introduced four modules for different transfer scenarios. The controller-to-controller and controller-to-processor modules are the workhorses. The SCCs must be supplemented with a Transfer Impact Assessment per the Schrems II framework, which evaluates whether the destination country’s law provides essentially equivalent protection.
The 2023 EU-US Data Privacy Framework provides an adequacy basis for transfers to certified US recipients. A scraper transferring data to a DPF-certified US entity does not need additional safeguards. Other US transfers still require SCCs or a derogation.
The 2021 UK adequacy decisions (the UK recognises the EU as adequate in the other direction) and the 2021 Korean adequacy decision are the other major additions to the adequacy list in recent years. Outside the listed countries, SCCs remain the default.
Next steps
The fastest improvement this quarter is to identify your top three EU member state exposures, designate an Article 27 representative if you do not have one, and document SCCs plus TIA for any cross-border transfers. For broader compliance, head to the DRT compliance hub and pair this with the GDPR and personal-vs-public-data guides.
This guide is informational, not legal advice.