India DPDP Act for scrapers: 2026 compliance
India’s Digital Personal Data Protection (DPDP) Act 2023 is the newest major data protection regime in Asia, substantially in force from 2025, with the supporting rules issued in early 2026. India is one of the world’s largest sources of digital personal data, and any scraper touching Indian users now operates under the DPDP. The regime resembles GDPR in structure but makes consent the default basis, and it has several India-specific design choices: the data fiduciary versus data principal terminology, the Significant Data Fiduciary tier, the Data Protection Board (rather than independent regulators), and a cross-border transfer model that permits transfers by default and restricts only listed countries. This guide walks through the structure, the consent and notification rules, the operator obligations, and a practical compliance checklist for scrapers.
The audience is the technical lead or in-house counsel responsible for a scraping pipeline that touches Indian residents, or for an India-based pipeline that processes personal data from anywhere.
What the DPDP Act actually covers in scraping context
The DPDP Act applies to the processing of digital personal data in India where the data is collected in digital form, or in non-digital form and subsequently digitised. It also applies to processing outside India that is in connection with offering goods or services to data principals in India. Like GDPR Article 3, the DPDP has extraterritorial reach, and scraping operations that systematically target Indian residents are in scope.
Personal data under Section 2(t) is any data about an individual who is identifiable by or in relation to such data. The “digital” qualifier is critical: the DPDP only covers digital personal data, so paper records and other physical-only data that are never digitised fall outside the regime.
Data principal is the individual to whom the data relates. Data fiduciary is the entity that determines the purpose and means of processing. Data processor processes on behalf of the fiduciary. The terminology mirrors GDPR’s data subject and data controller but is locally distinct.
The Data Protection Board (DPB) was constituted in 2024 and operates as an adjudicatory body rather than a Western-style regulator. The DPB receives complaints, conducts inquiries, and imposes penalties. The maximum penalty under the DPDP is INR 250 crore per breach (approximately USD 30 million), making it one of the highest-penalty regimes in Asia.
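The crore-to-dollar conversion behind that figure is easy to get wrong by a factor of ten, so here is a minimal sketch of the arithmetic. One crore is 10^7; the exchange rate of 83 INR per USD is an assumption for illustration only, and `crore_to_usd` is a hypothetical helper name.

```python
CRORE = 10_000_000  # 1 crore = 10^7


def crore_to_usd(crore: float, inr_per_usd: float) -> float:
    """Convert an amount expressed in INR crore to USD at the given rate."""
    if inr_per_usd <= 0:
        raise ValueError("inr_per_usd must be positive")
    return crore * CRORE / inr_per_usd


# Assumed rate for illustration only; check the current rate before quoting a figure.
ASSUMED_INR_PER_USD = 83.0

max_penalty_usd = crore_to_usd(250, ASSUMED_INR_PER_USD)
print(f"INR 250 crore is roughly USD {max_penalty_usd / 1e6:.1f} million")
```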
For comparison with the Singapore PDPA, see the Singapore PDPA for scrapers guide. For the GDPR parallel, see the GDPR compliance guide.
The consent default and its narrow exceptions
The DPDP Act’s default rule is that personal data may be processed only with the consent of the data principal. Consent must be free, specific, informed, unconditional, unambiguous, and given through a clear affirmative action. The consent request must be presented in a manner that is clear, in plain language, and accompanied by a notice describing the personal data being collected and the purpose of processing.
This consent default makes life harder for scrapers than the corresponding GDPR Article 6(1)(f) Legitimate Interests basis. The DPDP does not include a general legitimate interests provision. It does, however, provide for “legitimate uses” under Section 7, which permits processing without consent in specific enumerated cases:
- The data principal voluntarily provided the data and has not indicated an objection.
- Processing by the State for the provision of any subsidy, benefit, service, certificate, licence, or permit.
- Compliance with a court judgment, decree, or order.
- Medical emergency, threat to life, or public health response.
- Employer-employee relationship purposes.
- Compliance with a legal obligation to disclose.
For scrapers, the operationally relevant cases are extremely narrow. The “voluntarily provided” exception is the closest analogue to a general scraping basis, but it requires the data principal to have voluntarily provided the data and applies only to the purpose for which it was provided. A user who voluntarily posts on a public forum did not voluntarily provide the data to a downstream scraper.
The publicly-available exception is also narrower than PDPA Singapore: only personal data made publicly available by the data principal themselves, or by another person obligated under law to make it available, is excluded. Publicly available data that ended up online through breach, leak, or third-party aggregation is not covered.
Compliance checklist for scrapers
| Control | What it requires | Statutory basis |
|---|---|---|
| Consent or legitimate use basis | Per-source documentation | Section 6 / Section 7 |
| Notice to data principal | At or before collection | Section 5 |
| Purpose limitation | Use only for notified purposes | Section 5(2) |
| Data minimisation | Collect only what is needed | Section 4 |
| Accuracy obligation | Reasonable steps to maintain accuracy | Section 8(3) |
| Reasonable security safeguards | Mandatory | Section 8(5) |
| Breach notification | To DPB and affected principals | Section 8(6) |
| Erasure on consent withdrawal | Right to be forgotten | Section 8(7) |
| Data Protection Officer (if SDF) | Mandatory for SDFs | Section 10(2) |
| Independent data auditor (if SDF) | Annual | Section 10(2) |
| DPIA (if SDF) | Periodic | Section 10(2) |
| Cross-border transfer | Check the restricted-country list | Section 16 |
| Children’s data extra protections | Verifiable parental consent | Section 9 |
| Grievance redressal mechanism | Easily accessible | Section 8(10) |
A scraper operating at scale in India needs every row, with extra controls if classed as a Significant Data Fiduciary.
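One way to keep the checklist honest is to hold it as data and compute the gaps. The sketch below encodes a subset of the rows above; the `Control` class, the control names, and `missing_controls` are hypothetical, and the section references are copied from the table rather than verified independently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    name: str          # hypothetical short identifier for the control
    section: str       # statutory reference as given in the checklist
    sdf_only: bool = False


CONTROLS = [
    Control("lawful_basis", "Section 6 / Section 7"),
    Control("notice", "Section 5"),
    Control("accuracy", "Section 8(3)"),
    Control("security_safeguards", "Section 8(5)"),
    Control("breach_notification", "Section 8(6)"),
    Control("erasure_on_withdrawal", "Section 8(7)"),
    Control("grievance_redressal", "Section 8(10)"),
    Control("childrens_data", "Section 9"),
    Control("cross_border_transfer", "Section 16"),
    Control("dpo", "Section 10(2)", sdf_only=True),
    Control("independent_audit", "Section 10(2)", sdf_only=True),
    Control("dpia", "Section 10(2)", sdf_only=True),
]


def missing_controls(implemented: set, is_sdf: bool) -> list:
    """Return the controls that apply to this operator but are not yet in place."""
    return [
        c for c in CONTROLS
        if (is_sdf or not c.sdf_only) and c.name not in implemented
    ]


if __name__ == "__main__":
    done = {"lawful_basis", "notice", "grievance_redressal"}
    for gap in missing_controls(done, is_sdf=False):
        print(f"MISSING {gap.name} ({gap.section})")
```

Running the gap report in CI or before each new source is onboarded makes a missing control visible early, rather than during a complaint.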
The Significant Data Fiduciary tier
Section 10 introduces the Significant Data Fiduciary (SDF) classification. The Central Government may notify any data fiduciary or class of fiduciaries as significant based on the volume and sensitivity of personal data processed, risk to electoral democracy, security of the State, public order, or other prescribed factors.
SDFs face additional obligations:
- Appoint a Data Protection Officer based in India.
- Engage an independent data auditor to conduct annual audits.
- Conduct periodic Data Protection Impact Assessments.
- Comply with such other measures as the Central Government may prescribe.
The 2025-2026 notifications classified several large-scale data processors as SDFs, including some that scrape and aggregate at industrial scale. A scraper that crosses certain volume thresholds or operates AI training pipelines on Indian personal data should expect SDF designation in the near term.
For the AI training overlay, see fair use and copyright for AI training data.
Cross-border transfer and the restricted-list model
Section 16 of the DPDP gives the Central Government the power to restrict transfer of personal data to specified countries or territories. Although it is sometimes described as a whitelist, this is a negative-list model: by default, transfer is permitted to any country, and the government notifies the countries to which transfer is restricted. Early readings left open whether the rules would instead list permitted countries.
The 2026 rules clarified that the government would publish a list of restricted destinations rather than a list of permitted destinations. This is operator-friendly: scrapers can transfer freely to any unlisted country until and unless that country is added to the restricted list.
As of mid-2026, the restricted list is short and primarily targets countries with no diplomatic relations with India or specific national security concerns. Major scraping destinations (US, EU, Singapore, UK, Canada, Australia) are all unrestricted.
This is the most operator-friendly element of the DPDP. The flexibility makes Indian-data pipelines easier to architect than EU-data pipelines.
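The default-allow rule described above reduces to a simple membership check at export time. This is a minimal sketch: the entries in `RESTRICTED_DESTINATIONS` are placeholder codes, not real entries from any government notification, and `transfer_allowed` is a hypothetical helper name.

```python
# Placeholder codes only. Populate from the current government notification.
RESTRICTED_DESTINATIONS = frozenset({"XX", "YY"})


def transfer_allowed(destination_iso: str, restricted=RESTRICTED_DESTINATIONS) -> bool:
    """Negative-list check: a transfer is allowed unless the destination is listed."""
    code = destination_iso.strip().upper()
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"expected a two-letter country code, got {destination_iso!r}")
    return code not in restricted


if __name__ == "__main__":
    for dest in ("SG", "us", "XX"):
        print(dest, "allowed" if transfer_allowed(dest) else "restricted")
```

Keeping the list in configuration rather than code means an updated notification can be applied without a deploy. The check says nothing about whether a lawful basis exists for the data in the first place.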
Decision tree: is this scrape DPDP-compliant?
Q1: Is the source data made publicly available by the data principal themselves, or by a person legally obliged to publish it?
├── Yes -> Outside DPDP scope; verify and document.
└── No -> Q2
Q2: Have you obtained consent from the data principal?
├── Yes -> Document consent; standard obligations apply.
└── No -> Q3
Q3: Does a Section 7 legitimate use apply?
├── Yes -> Document; standard obligations apply.
└── No -> Stop; restructure to obtain consent or change scope.
For scrapers, the consent default and narrow legitimate use list make the DPDP much more restrictive than PDPA Singapore. The publicly-available carve-out is also narrower than Singapore’s, covering only data published by the principal or under a legal obligation. Plan around this; do not assume a basis exists.
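The decision tree above can be sketched as a single function so that every source gets the same three questions in the same order. The `Outcome` names and the `assess_scrape` signature are hypothetical; the answers to each question still have to come from a documented human review.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    OUT_OF_SCOPE = "outside DPDP scope; verify and document"
    CONSENT = "document consent; standard obligations apply"
    LEGITIMATE_USE = "document Section 7 use; standard obligations apply"
    STOP = "stop; obtain consent or change scope"


def assess_scrape(
    published_by_principal: bool,
    has_consent: bool,
    section_7_use: Optional[str],
) -> Outcome:
    """Walk Q1 -> Q2 -> Q3 in order and return the first branch that applies."""
    if published_by_principal:      # Q1
        return Outcome.OUT_OF_SCOPE
    if has_consent:                 # Q2
        return Outcome.CONSENT
    if section_7_use:               # Q3: name of the Section 7 use relied on
        return Outcome.LEGITIMATE_USE
    return Outcome.STOP


if __name__ == "__main__":
    result = assess_scrape(False, False, None)
    print(result.name, "->", result.value)
```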
Data principal rights
Sections 11-14 grant data principals four core rights: right to information about processing, right to correction and erasure, right to grievance redressal, and right to nominate (a person who can exercise rights on the principal’s behalf).
The right to erasure is broader than GDPR’s in one respect: it is triggered by consent withdrawal and is not subject to the same balancing tests. A scraper that relies on consent must erase the data once the principal withdraws it, unless another law requires retention.
The right of grievance redressal requires the data fiduciary to maintain an easily accessible grievance mechanism. The DPB only accepts complaints after the principal has exhausted the fiduciary’s grievance mechanism. A scraper without a public, responsive grievance inbox is exposed.
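Erasure on withdrawal is easier to honour if it is a single, auditable operation. The sketch below assumes each stored record carries `record_id`, `principal_id`, and `basis` fields, which are hypothetical names; `legal_hold_ids` is likewise an assumption, standing in for records that must be retained under some other legal obligation.

```python
from datetime import datetime, timezone


def erase_on_withdrawal(records, principal_id, legal_hold_ids=frozenset()):
    """Drop a principal's consent-based records and return (kept, audit_entry).

    Records held under a different basis, or under a legal hold, are kept.
    """
    kept, erased_ids = [], []
    for rec in records:
        withdrawn = rec["principal_id"] == principal_id and rec["basis"] == "consent"
        if withdrawn and rec["record_id"] not in legal_hold_ids:
            erased_ids.append(rec["record_id"])
        else:
            kept.append(rec)
    audit_entry = {
        "principal_id": principal_id,
        "erased": erased_ids,
        "at": datetime.now(timezone.utc).isoformat(),
    }
    return kept, audit_entry


if __name__ == "__main__":
    store = [
        {"record_id": "a", "principal_id": "p1", "basis": "consent"},
        {"record_id": "b", "principal_id": "p2", "basis": "consent"},
    ]
    store, audit = erase_on_withdrawal(store, "p1")
    print(len(store), audit["erased"])
```

The audit entry records what was erased and when, which is the evidence a grievance response will need.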
For the worked operational pattern, see the personal vs public data scraping framework.
Children’s data and verifiable parental consent
Section 9 imposes additional obligations for data of children (under 18 in India). A data fiduciary must obtain verifiable consent from the parent or lawful guardian before processing children’s data. Tracking, behavioural monitoring, and targeted advertising directed at children are prohibited.
For scrapers, this is a major exposure if any source contains children’s data. Social media platforms, gaming platforms, education platforms, and parts of the public web all contain user-generated content from minors. The DPDP’s verifiable parental consent requirement is not a “best efforts” standard; it requires a specific verifiable mechanism.
The 2026 rules specified acceptable verification mechanisms including Aadhaar-based verification for parents and acceptable third-party verification services. Scrapers who cannot verify must avoid processing children’s data, which in practice means filtering aggressively at intake.
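Filtering at intake can be sketched as a fail-closed check on whatever age signals a record carries. The field names (`minor_flag`, `birth_date`, `declared_age`, `youth_heavy_source`) are hypothetical, and self-declared ages are weak evidence, so treat this as a first-pass filter rather than a verification mechanism.

```python
from datetime import date

ADULT_AGE = 18  # the DPDP treats anyone under 18 as a child


def age_on(birth_date: date, today: date) -> int:
    """Completed years of age on the given day."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def admit_record(record: dict, today: date) -> bool:
    """Drop records that indicate a minor; fail closed when the age is
    unknown and the source is flagged as youth-heavy."""
    if record.get("minor_flag"):
        return False
    birth_date = record.get("birth_date")
    if birth_date is not None:
        return age_on(birth_date, today) >= ADULT_AGE
    declared_age = record.get("declared_age")
    if declared_age is not None:
        return declared_age >= ADULT_AGE
    return not record.get("youth_heavy_source", False)


if __name__ == "__main__":
    today = date.today()
    print(admit_record({"declared_age": 15}, today))
    print(admit_record({"youth_heavy_source": True}, today))
```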
How DPDP enforcement is shaping up in 2025-2026
The DPB began operations in 2024 and has issued its first round of decisions in 2025-2026. The early pattern: focus on consent quality, breach notification timing, and grievance redressal mechanisms. Several mid-sized fines (INR 5-25 crore range) were issued for missing or non-functional grievance redressal.
Larger fines are anticipated as the DPB completes its first investigations of SDFs. The maximum INR 250 crore penalty per breach has not yet been imposed but is reserved for systematic violations affecting large numbers of data principals.
The DPB is more litigation-focused than the PDPC or the EDPB; it is an adjudicatory body. Operators who engage early and constructively in proceedings tend to settle for substantially reduced penalties.
External references
The DPDP Act 2023 full text is at meity.gov.in/dpdp-act-2023. The Digital Personal Data Protection Rules 2026 are at the same MeitY portal. Data Protection Board notices and decisions are published at dpb.gov.in (note: as of mid-2026 the official site is in active build-out).
Comparison: DPDP vs PDPA vs GDPR for scrapers
| Dimension | DPDP India | PDPA Singapore | GDPR EU |
|---|---|---|---|
| Default basis | Consent | Consent (with carve-outs) | Lawful basis (six options) |
| Public data carve-out | Narrow (data principal disclosed) | Broad (generally available) | Narrow |
| Legitimate Interests | No general basis (closed Section 7 list) | Yes (since 2020) | Yes (Article 6(1)(f)) |
| Cross-border transfer | Restricted-country list (operator-friendly) | Comparable protection | SCCs / adequacy |
| Right to erasure | Yes (on consent withdrawal) | Limited | Yes (Article 17) |
| DPO mandatory | For SDFs only | All organisations | Conditional |
| Maximum fine | INR 250 crore (~USD 30M) | SGD 1M or 10% turnover | EUR 20M or 4% turnover |
| Children’s data extra | Verifiable parental consent | Standard rules | Article 8 conditions |
| AI training friendly | Not articulated | DIP since 2024 | EU AI Act layers on |
| Adjudicatory body | DPB (litigation-style) | PDPC (regulator-style) | National DPAs (regulator-style) |
The DPDP is broadly stricter than PDPA on default basis but more flexible on cross-border transfer. It is comparable to GDPR on overall stringency but with different operational textures.
A worked example: scraping Indian ecommerce listings
A scraper collects publicly available product listings from major Indian ecommerce sites (Flipkart, Myntra, Meesho, Amazon India). The dataset includes product name, price, seller name, seller location, and seller rating.
Classification: seller name and location are personal data if the seller is an individual (many Indian ecommerce sellers are sole proprietors). Product name and price are not personal data.
Basis: the publicly-available carve-out applies if the seller voluntarily made their information public on the platform. Most platform terms require sellers to display their contact information. A defensible basis exists.
Notice: a public privacy notice on the scraper’s website describing collection, purpose, retention.
Grievance: an easily accessible grievance redressal mechanism (at minimum a monitored inbox, ideally a portal).
Cross-border: if data is transferred to non-restricted countries (US, EU, Singapore), no additional mechanism required as of mid-2026.
Outcome: a defensible posture with low ongoing overhead, provided the publicly-available basis is documented and the grievance mechanism is live.
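The classification step in this example can be sketched as a field splitter, so personal fields are stored and governed separately from product data. Field names are hypothetical. The text above classifies seller name and location; treating `seller_rating` as personal for individual sellers is a conservative assumption of this sketch, not something the Act or the example states.

```python
# seller_name and seller_location follow the classification above;
# seller_rating is included as a conservative assumption.
PERSONAL_IF_INDIVIDUAL = {"seller_name", "seller_location", "seller_rating"}


def split_listing(listing: dict, seller_is_individual: bool):
    """Split a scraped listing into (personal, non_personal) field dicts."""
    personal, non_personal = {}, {}
    for key, value in listing.items():
        if seller_is_individual and key in PERSONAL_IF_INDIVIDUAL:
            personal[key] = value
        else:
            non_personal[key] = value
    return personal, non_personal


if __name__ == "__main__":
    listing = {
        "product_name": "Steel kettle",
        "price": 999,
        "seller_name": "A. Kumar",
        "seller_location": "Pune",
        "seller_rating": 4.5,
    }
    personal, other = split_listing(listing, seller_is_individual=True)
    print(sorted(personal), sorted(other))
```

Whether a seller is an individual is the hard part and is left to the caller; when it cannot be determined, passing `True` is the safer default.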
For a parallel ecommerce-targeted scrape, see scraping Flipkart India product data.
Special cases: AI training and political data
The DPDP Act does not yet have a specific AI training carve-out or framework. The current operator posture is to obtain consent or rely on Section 7 legitimate uses where possible, and to maintain documented training data manifests in anticipation of future rules.
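A training data manifest can be as simple as one entry per source batch recording the basis relied on and a digest of the record identifiers. This is a sketch under stated assumptions: the field names, the `ALLOWED_BASES` vocabulary, and `manifest_entry` are hypothetical, and no official manifest format exists in the Act.

```python
import hashlib
from datetime import date

# Hypothetical vocabulary for the basis recorded against each batch.
ALLOWED_BASES = {"consent", "section_7", "publicly_available"}


def manifest_entry(source: str, basis: str, record_ids: list, collected_on: date) -> dict:
    """Build one manifest entry; the digest is independent of record order."""
    if basis not in ALLOWED_BASES:
        raise ValueError(f"unknown basis: {basis!r}")
    joined = "\n".join(sorted(record_ids)).encode("utf-8")
    return {
        "source": source,
        "basis": basis,
        "record_count": len(record_ids),
        "record_ids_sha256": hashlib.sha256(joined).hexdigest(),
        "collected_on": collected_on.isoformat(),
    }


if __name__ == "__main__":
    entry = manifest_entry("example-forum", "consent", ["r2", "r1"], date(2026, 1, 5))
    print(entry["record_count"], entry["record_ids_sha256"][:12])
```

Hashing the identifiers rather than storing them keeps the manifest itself free of personal data while still letting an auditor confirm which batch a model was trained on.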
Political data (party affiliation, voting behaviour, election-related profiling) is treated with extra caution by the DPB, with multiple early enforcement actions targeting election-time profiling operations. Scrapers should avoid political data entirely unless they have a defensible democratic-interest basis, which is hard to construct.
FAQ
Is publicly available data exempt from DPDP?
Only if the data was made publicly available by the data principal themselves or by a person obligated under law. Third-party publication does not qualify.
Do I need consent to scrape Indian data?
The default is yes. The Section 7 legitimate uses list is narrow and rarely applies to scraping operations.
Does DPDP apply if I am outside India?
Yes if your processing relates to offering goods or services to data principals in India. Extraterritorial reach is similar to GDPR.
Do I need a local DPO?
Only if classified as a Significant Data Fiduciary. SDFs must appoint an India-based DPO.
What is the cross-border transfer rule?
A restricted-list (negative list) model: transfer is permitted by default to any country not on the government’s restricted list. Major scraping destinations are unrestricted as of mid-2026.
Extended DPDP Act enforcement analysis
The Digital Personal Data Protection Act 2023 received presidential assent in August 2023. The Data Protection Board of India was constituted in 2024, and the Act’s substantive provisions came into force in phased fashion through 2025 and 2026. The supporting DPDP rules fleshed out consent manager registration, breach notification timelines, and cross-border transfer mechanics.
The Act’s territorial reach (Section 3) covers processing of digital personal data in India, plus processing outside India if it relates to offering goods or services to data principals in India. The latter prong directly captures scrapers harvesting India-resident data from anywhere in the world.
The lawful-basis architecture differs from GDPR. The DPDP recognises two pathways. First, consent under Section 6, which must be free, specific, informed, unconditional, and unambiguous, and given through clear affirmative action. Second, certain legitimate uses under Section 7, including performance of state functions, employment, medical emergencies, disaster response, and certain specified purposes. There is no general legitimate-interest catch-all of the GDPR Article 6(1)(f) kind.
For scrapers the practical implication is that consent is the dominant pathway, and consent for third-party scraping is rarely available. The narrow Section 7 routes do not generally cover commercial scraping. Operators must therefore first determine whether the DPDP applies to a given source at all, for example because the data falls under the publicly-available exclusion, and design the pipeline around that answer.
Implementation patterns for India-touching scraping
A 2026 DPDP-aware scraping pipeline should implement eight controls.
- Identify India-resident data principals at ingest.
- Distinguish between processing under consent and under Section 7 legitimate use, with documentation per record.
- Honour withdrawal of consent and erasure requests.
- Maintain a grievance officer and resolve grievances within the statutory time limit.
- Apply notified-country transfer rules for cross-border movement.
- Apply enhanced safeguards for children’s data (under 18) including verifiable parental consent.
- Notify the DPB of breaches within the prescribed window.
- Conduct data protection impact assessments if designated a Significant Data Fiduciary.
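Several of these controls are deadline-driven, so a small deadline helper is worth having from day one. The sketch below takes the window as a parameter; the 72-hour value in the usage example is a placeholder assumption, not a statement of what the rules prescribe, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def notification_deadline(detected_at: datetime, window: timedelta) -> datetime:
    """Latest time by which the notification must be sent."""
    if detected_at.tzinfo is None:
        raise ValueError("detected_at must be timezone-aware")
    return detected_at + window


def is_overdue(detected_at: datetime, window: timedelta, now: datetime) -> bool:
    """True once the current time has passed the deadline."""
    return now > notification_deadline(detected_at, window)


if __name__ == "__main__":
    # Placeholder window; replace with the period the rules actually prescribe.
    window = timedelta(hours=72)
    detected = datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc)
    print(notification_deadline(detected, window).isoformat())
```

The same helper works for grievance response times by passing a different window. Requiring timezone-aware timestamps avoids silent errors when the pipeline runs outside India.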
Code pattern: India identification and consent gate
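```python
import re

# Indian phone numbers: "91" country code (plus sign optional), then ten digits
# in two groups of five. The lookarounds stop a match inside a longer digit run.
IN_PHONE = re.compile(r"(?<!\d)\+?91[\s-]?\d{5}[\s-]?\d{5}(?!\d)")
# Every suffix here ends in ".in"; the longer forms are kept for readability.
IN_DOMAINS = ("in", "co.in", "org.in", "ac.in", "gov.in")
SECTION_7_BASES = {"employment", "medical_emergency", "disaster", "state_function"}


def is_india_subject(record):
    """Heuristic: does the record look like it relates to a person in India?"""
    if IN_PHONE.search(record.get("text") or ""):
        return True
    email = (record.get("email") or "").lower()
    if any(email.endswith("." + d) for d in IN_DOMAINS):
        return True
    return record.get("country_iso") == "IN"


def consent_gate(record):
    """Return True if the record may be processed, False if it must be dropped."""
    if not is_india_subject(record):
        return True  # not identified as India-related by this heuristic
    if record.get("consent_token"):
        return True  # consent on file
    return record.get("section_7_basis") in SECTION_7_BASES
```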
Comparison: DPDP vs GDPR for scrapers
| Question | India DPDP | EU GDPR |
|---|---|---|
| Lawful bases | Consent plus narrow Section 7 list | Six bases including legitimate interest |
| Territorial reach | India processing plus targeting | EU processing plus targeting plus monitoring |
| Children threshold | Under 18 | Under 16 default, member states can lower to 13 |
| Cross-border transfer | Notified country list | SCCs, adequacy, derogations |
| Max fine | INR 250 crore | EUR 20 million or 4 percent global turnover |
| Breach notification | DPB notification window | 72 hours to DPA, individuals where high risk |
| Data principal rights | Access, correction, erasure, grievance, nominate | Access, rectification, erasure, restriction, portability, object |
Additional FAQ
Does DPDP have a legitimate-interest catch-all?
No. Section 7 specifies discrete legitimate uses. Commercial scraping of public data does not fit any listed category.
What is a Significant Data Fiduciary?
A class designated by the central government based on volume, sensitivity, risk to electoral democracy or public order, sovereignty, or India’s security. Significant Data Fiduciaries face enhanced obligations.
How does DPDP treat AI training data?
The Act does not have AI-specific provisions, but consent or a Section 7 ground is required for any personal data processing including training.
What happens to non-personal data?
DPDP regulates only digital personal data. A separate non-personal data framework has been discussed but not enacted as of 2026.
The DPDP Act’s consent architecture
The DPDP Act places consent at the centre of the lawful processing analysis. Section 6 specifies that consent must be free, specific, informed, unconditional, and unambiguous, and must be given through clear affirmative action. The Act introduced the concept of a Consent Manager (Section 6(7)), a registered intermediary that can manage consent on behalf of data principals.
For scrapers the consent requirement is generally not satisfiable. Third-party scraping involves no direct interaction with the data principal, so affirmative consent cannot be obtained. The consent pathway is therefore typically unavailable for commercial scraping.
The Consent Manager mechanism is still new and may eventually create new pathways. A data principal who has consented through a Consent Manager can have that consent presented to downstream services. The infrastructure is still maturing, and broad scraper-friendly consent flows have not emerged. Scrapers should monitor the development but should not rely on it for current operations.
Section 7 legitimate uses in detail
Section 7 of the DPDP Act lists the legitimate uses for which processing may occur without consent. The list is closed, meaning it cannot be expanded by regulator interpretation. The legitimate uses include: processing for state functions; medical emergency; disaster response; employment-related processing; and a few specific public interest categories.
Critically the list does not include a general legitimate-interest catch-all comparable to GDPR Article 6(1)(f). This makes the DPDP regime more restrictive for scrapers than the GDPR. A scrape that would pass an LIA under GDPR may have no available basis under DPDP.
The 2025 DPDP Rules provided some additional clarification on Section 7 application. The rules tightened the definition of employment-related processing and specified that the disaster response basis is limited to time-bound emergencies declared by competent authority. Neither change benefits scrapers.
Significant Data Fiduciary obligations
Section 10 of the DPDP Act allows the central government to designate certain data fiduciaries as Significant Data Fiduciaries (SDFs) based on volume, sensitivity, risk to electoral democracy, public order, sovereignty, or India’s security. SDFs face enhanced obligations including DPIA, data protection officer designation, and independent audit.
The 2025 government notification of initial SDF categories included social media platforms above thresholds, e-commerce platforms above thresholds, and certain AI service providers. Future notifications may extend to data brokers and aggregators.
For scrapers the practical implication is that operating at scale in India increases the risk of SDF designation. A scraper crossing the volume threshold or feeding sensitive AI applications should plan for SDF obligations. The compliance uplift is non-trivial, including a designated DPO, mandatory DPIAs, and annual audits.
Cross-border transfer under DPDP
The DPDP Act takes a different approach to cross-border transfer than GDPR. Section 16 permits transfer to any country except those specifically restricted by the central government through notification. The starting position is permissive, with restrictions added as needed.
The 2025 DPDP Rules clarified the transfer framework. As of 2026 only a small number of countries are on the restricted list, and the criteria for addition are based on national security and public order rather than data protection adequacy.
For scrapers the implication is that exporting India-resident data to most jurisdictions is permissible under the DPDP, but the source-side consent or Section 7 basis must still be established. The DPDP transfer rules do not create a lawful basis where none otherwise exists.
Next steps
The fastest path to DPDP compliance is to publish a privacy notice and grievance redressal mechanism, document the basis (preferably the narrow publicly-available carve-out where it applies), and prepare for SDF designation if your volume warrants it. For broader Asia compliance, head to the DRT compliance hub and pair this with the PDPA Singapore guide.
This guide is informational, not legal advice.