Verifiable credentials and scraping in 2026

Verifiable credentials scraping is the access pattern that did not exist commercially three years ago and is starting to appear at production scale in 2026. Verifiable Credentials (VCs), defined by the W3C, are signed digital attestations that travel with the user and that a verifier can validate without contacting the issuer. The standardisation finished in 2022; the production deployment ramped in 2024-2025; the pivotal moment was the EU Digital Identity Wallet rollout across 2025-2026. For scraping operators, VCs reshape both the access landscape (sources will increasingly require credential presentation) and the operations landscape (some scrapers will themselves issue or hold credentials). This guide walks through what VCs actually are, the OID4VC presentation flow, the scraping-relevant credential ecosystems, the access patterns that work, and a practical operator playbook.

The audience is the technical lead, security architect, or compliance partner who needs to plan for a world where verifiable credentials gate the data they need.

What a verifiable credential actually is

A Verifiable Credential is a tamper-evident signed assertion. The data model is JSON-LD or JWT-encoded and contains three things:

  1. The issuer (a DID or an X.509 certificate identifying the authority).
  2. The subject (the entity the credential describes, typically a DID).
  3. The claims (the assertions being made, e.g., “this subject is over 18”, “this subject is a licensed driver”).

The credential is signed by the issuer’s key. Any verifier can validate the signature using the issuer’s public key. The verifier does not need to contact the issuer at validation time, which is the protocol’s central innovation. This makes VCs offline-verifiable, fast, and privacy-preserving.

A holder (typically the user via their wallet) presents the credential by signing a Verifiable Presentation, which can include the entire credential or only selected claims. Selective disclosure is supported via two main mechanisms: BBS+ signatures (cryptographically powerful, computationally heavy) and SD-JWT (lighter, widely deployed in 2026).

For the broader Web4 context, see decentralized identity and Web4 scrapers.

The OID4VC presentation flow

OpenID for Verifiable Credentials (OID4VC) is the protocol family that puts VCs into production HTTP flows. There are two main specs: OID4VCI (issuance) and OID4VP (presentation).

The OID4VP flow that a scraping operator interacts with:

  1. The verifier (the site you want to access) presents an authorisation request, typically as a QR code or a deep link, asking for specific credentials.
  2. The user’s wallet retrieves the request, identifies matching credentials, and asks the user to authorise presentation.
  3. The wallet builds a Verifiable Presentation containing the requested claims (with selective disclosure if applicable).
  4. The wallet POSTs the presentation to the verifier’s endpoint.
  5. The verifier validates the signature, checks the issuer, checks revocation, and grants access.

The flow is HTTP-native and integrates cleanly with existing OAuth 2.0/OIDC infrastructure. From the verifier’s perspective, it is a sign-in flow; from the wallet’s perspective, it is a credential presentation; from the user’s perspective, it is one approval click.
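The request in step 1 is a parameter set the wallet dereferences. A minimal sketch of constructing it, with field names following OID4VP but the verifier URLs invented for illustration (a real deployment signs the request object and registers the client_id):

```python
import json
import secrets
from urllib.parse import urlencode

def build_authorization_request(verifier_id, response_uri, presentation_definition):
    """Sketch of step 1: the authorisation request the wallet dereferences.

    The nonce must be fresh per request so the resulting presentation
    cannot be replayed against the verifier.
    """
    params = {
        "client_id": verifier_id,
        "response_type": "vp_token",
        "response_mode": "direct_post",  # wallet POSTs the VP back (step 4)
        "response_uri": response_uri,
        "nonce": secrets.token_urlsafe(16),
        "presentation_definition": json.dumps(presentation_definition),
    }
    return "openid4vp://?" + urlencode(params)

request_uri = build_authorization_request(
    "https://portal.example/verifier",
    "https://portal.example/oid4vp/verify",
    {"input_descriptors": [{"id": "age_over_18"}]},
)
```

The wallet parses the query parameters, matches credentials against the presentation definition, and responds to response_uri.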

For scraping operators, the flow has implications. A scraper that wants to access an OID4VP-gated source needs to fit into this flow: either be the wallet (programmatic credential presentation) or partner with a wallet holder (delegated access).

The 2026 credential ecosystems

Ecosystem | Issuer authority | Common credentials | Status
EU Digital Identity Wallet | Member states | National ID, driver's licence, professional qualifications, age | General availability
UK Digital Identity | UK Government Digital Service | National ID-equivalent, age, residency | Production
Singpass (Singapore) | Singapore Government | National ID, qualifications, residence | Production with VC issuance
DigiLocker (India) | Indian Government | National ID, education, employment | VC alongside document store
Apple Wallet | Apple, partner issuers | State IDs, transit, payment, loyalty | US state IDs in selected states
Google Wallet | Google, partner issuers | State IDs, transit, payment | Equivalent to Apple in coverage
OpenAttestation (Singapore) | Open standard, multi-issuer | Trade documents, qualifications | Mature for trade and education
Hyperledger AnonCreds | Open standard | Various sectoral | Self-sovereign identity

For scraping, the EU and Singapore ecosystems are the most relevant in 2026. Both have substantive VC issuance with mature verification infrastructure.

Compliance and trust framework

The trust in a VC-gated system rests on three layers:

Layer | Question | Mechanism
Issuer trust | Is the issuer who they claim to be? | Trust list (e.g., EU LOTL), DID resolution
Cryptographic integrity | Is the signature valid? | Standard cryptography
Revocation | Has the credential been revoked? | Status List 2021, OCSP-like protocols

A verifier checks all three. A scraper presenting a credential needs all three to pass. For commercial scrapers without legitimate credential issuance, the trust framework is a hard barrier; falsifying a credential requires breaking cryptography or compromising an issuer key.

The legitimate access paths for scrapers:

  1. Hold credentials in your own right (research institutions, regulated industries).
  2. Partner with credential holders (delegated access via signed authorisation).
  3. Become a credential issuer (issue credentials about your data, not about your identity).
  4. Use the public, non-credentialed surface of the source.

Decision tree: how to scrape a VC-gated source

Q1: Does the source require VC presentation?
    ├── No  -> Standard access; no VC work needed.
    └── Yes -> Q2
Q2: Which credential is required?
    ├── National ID / age -> Likely user-specific; partnership path only.
    ├── Professional qualification -> Operator may hold legitimately.
    ├── Subscription / payment -> Operator can typically obtain.
    └── Other -> Evaluate per credential type.
Q3: Can your operation hold the credential legitimately?
    ├── Yes -> Implement OID4VP client; present credential.
    └── No  -> Q4
Q4: Is partnership with a credential holder feasible?
    ├── Yes -> Negotiate; implement delegated access.
    └── No  -> Source is effectively unscrapable; explore licensed access.

Worked example: scraping an OpenAttestation-protected supply-chain document portal

A 2026 supply-chain portal hosts customs documents and bills of lading, accessible to authorised supply-chain participants. Access requires an OpenAttestation credential proving the requester is a licensed customs broker.

Web2 access path: register an account, submit business verification, wait for human approval. Often months.

VC access path: a customs broker holds a credential issued by Singapore’s Customs authority. The broker authorises the scraping operation to use the credential on their behalf. The scraping operation runs an OID4VP client that presents the credential at the portal’s verifier endpoint. Access is granted.

For the scraping operation, the credential acquisition and partnership path is real engineering work but bounded. Once in place, ongoing access is automated.
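The delegated-access arrangement needs an auditable artefact the operation can store alongside every presentation log entry. One illustrative shape (all identifiers invented; a real deployment would have the broker sign with their DID key, with an HMAC over a shared secret standing in here for brevity):

```python
import hashlib
import hmac
import json
import time

def make_delegation_record(broker_id, operator_id, credential_id, scope, secret):
    """Create a tamper-evident delegation record (HMAC stands in for a DID signature)."""
    now = int(time.time())
    record = {
        "broker": broker_id,
        "operator": operator_id,
        "credential": credential_id,
        "scope": scope,                     # e.g. which portal, which endpoints
        "issued_at": now,
        "expires_at": now + 30 * 86400,     # bounded delegation window
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["mac"] = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return record

def verify_delegation_record(record, secret):
    """Recompute the MAC over everything except the MAC itself."""
    body = {k: v for k, v in record.items() if k != "mac"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record["mac"], expected)
```

Any tampering with the scope or expiry invalidates the record, which gives both parties a verifiable boundary on the delegation.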

For the broader supply-chain scraping discussion, see scraping crypto exchange order books.

Selective disclosure and minimisation

A Verifiable Presentation can include only the claims the verifier needs. A credential carrying multiple claims (name, date of birth, nationality, address) can be presented with only “over 18” or “EU resident” disclosed.

This matters for scrapers in two ways:

  1. As a verifier (if your operation is the verifier of credentials presented by other parties), request only the claims you need. Excess collection is a GDPR/CCPA violation.
  2. As a holder (if your operation presents credentials), use selective disclosure to limit what the source learns about you.

The selective disclosure mechanisms in 2026 are SD-JWT (most common, simpler) and BBS+ (more cryptographically powerful, less common). Both are well-documented and implemented in major wallet SDKs.
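The SD-JWT mechanism works by hashing each hideable claim: the signed token carries only salted digests, and the holder releases the matching plaintext "disclosures" for the claims it chooses to reveal. A stripped-down sketch of that digest step (real SD-JWT adds a holder-binding JWT and places the digests in an _sd array):

```python
import base64
import hashlib
import json
import secrets

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_disclosure(claim_name, claim_value):
    """One SD-JWT disclosure: a base64url-encoded [salt, name, value] array.

    The issuer signs only the digest; the plaintext disclosure stays with
    the holder until they choose to reveal it.
    """
    salt = _b64url(secrets.token_bytes(16))
    disclosure = _b64url(json.dumps([salt, claim_name, claim_value]).encode())
    digest = _b64url(hashlib.sha256(disclosure.encode()).digest())
    return disclosure, digest

# Issuer embeds digests of all claims; holder later reveals only some.
disclosure, digest = make_disclosure("age_over_18", True)
```

At presentation time the verifier recomputes the digest from each revealed disclosure and checks that it appears in the signed token; undisclosed claims stay hidden behind their salted hashes.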

Comparison: VC presentation vs traditional authentication

Dimension | Username/password | OAuth 2.0 / OIDC | OID4VP / VC presentation
Account at site required | Yes | Conditional | No
Site learns user identity | Yes | Yes | Only what credential discloses
Credential portable across sites | No | Limited | Yes
Selective disclosure | No | No | Yes (SD-JWT, BBS+)
Offline verification | No | No | Yes
Phishing resistance | Low | Moderate | High (cryptographic)
Adoption (mid-2026) | Mature | Mature | Early production

The VC pattern is structurally superior on privacy and portability. Adoption is the constraint, not capability.

Operations: becoming a credential issuer

A 2026 trend that scraping operators should not miss: some operators are themselves becoming credential issuers. Instead of issuing credentials about identity, they issue credentials about data. Examples:

Issuer type | Credentials issued | Use case
Data marketplace | Provenance VCs ("this dataset was scraped from X on Y") | Buyer trust
Aggregator | Quality VCs ("this record passes 99 percent quality checks") | Downstream confidence
Compliance auditor | Compliance VCs ("this dataset was processed under GDPR") | Regulator-facing assurance
Research operator | Replication VCs ("this experiment used this dataset") | Academic integrity

Becoming an issuer requires: a DID, an issuance service, a revocation registry, and (for trust) inclusion on relevant trust lists. The setup cost is real (engineering plus legal) but the resulting product differentiation is meaningful.
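As a concrete sketch, a provenance credential from the table above could be assembled like this before signing. Field names follow the W3C data model; the credential type, DID, and status entry are invented for illustration, and signing happens in the issuance service:

```python
from datetime import datetime, timezone

def build_provenance_credential(issuer_did, dataset_id, source_url, scraped_at):
    """Assemble an unsigned provenance VC payload (signing is a separate step)."""
    return {
        "@context": ["https://www.w3.org/ns/credentials/v2"],
        "type": ["VerifiableCredential", "DatasetProvenanceCredential"],
        "issuer": issuer_did,
        "validFrom": datetime.now(timezone.utc).isoformat(),
        "credentialSubject": {
            "id": dataset_id,          # the dataset, not a person
            "sourceUrl": source_url,
            "scrapedAt": scraped_at,
        },
        # credentialStatus wires the VC into the issuer's revocation registry.
        "credentialStatus": {
            "type": "BitstringStatusListEntry",
            "statusListIndex": "0",
        },
    }
```

Note that the credential subject is the dataset rather than an identity, which is exactly the shift the table describes.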

For the parallel discussion of how this fits into RAG-over-scraped-data products, see RAG over scraped data.

Implementation: an OID4VP verifier in Python

A minimal OID4VP verifier endpoint in Python (FastAPI) for accepting a presented Verifiable Presentation:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from didkit import verify_presentation
import json

app = FastAPI()

class Presentation(BaseModel):
    vp_token: str
    presentation_submission: dict

@app.post("/verify")
async def verify(p: Presentation):
    # didkit validates the presentation proof against the holder's DID document.
    options = json.dumps({"proofPurpose": "authentication"})
    result = json.loads(await verify_presentation(p.vp_token, options))
    if result.get("errors"):
        raise HTTPException(status_code=400, detail=result["errors"])

    # JSON-LD presentations expose the holder DID directly; JWT-encoded
    # vp_tokens need separate decoding to recover it.
    vp = json.loads(p.vp_token) if p.vp_token.startswith("{") else None
    holder = vp.get("holder") if vp else None
    return {"verified": True, "holder": holder}

The verifier validates signatures, issuer trust, and revocation status. Production deployment adds replay protection (nonce checking) and audience binding (verifier-specific challenges).
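The replay protection mentioned above can be sketched as a one-time nonce store consulted before the verification call: issue a nonce per authorisation request, and accept a presentation only if it consumes a live, unused nonce. This sketch keeps the store in memory; a production verifier would use a shared store such as Redis:

```python
import secrets
import time

class NonceStore:
    """One-time nonces with TTL: issued at request time, consumed at verification."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.issued = {}  # nonce -> expiry timestamp

    def issue(self):
        nonce = secrets.token_urlsafe(16)
        self.issued[nonce] = time.time() + self.ttl
        return nonce

    def consume(self, nonce):
        # Valid only if we issued it, it has not expired, and it is unused.
        # pop() guarantees a second presentation of the same nonce fails.
        expiry = self.issued.pop(nonce, None)
        return expiry is not None and expiry > time.time()

store = NonceStore()
nonce = store.issue()
```

Audience binding is the complementary check: the presentation must name this verifier in its audience claim, so a presentation captured at one verifier cannot be replayed at another.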

External references

The W3C Verifiable Credentials data model is at w3.org/TR/vc-data-model-2.0. The OpenID for Verifiable Presentations specification is at openid.net/specs/openid-4-verifiable-presentations-1_0.html. The EU Digital Identity Wallet architecture and reference framework is at ec.europa.eu/digital-building-blocks/sites/display/EUDIGITALIDENTITYWALLET.

Comparison: SD-JWT vs BBS+ for selective disclosure

Dimension | SD-JWT | BBS+
Computational cost | Low | High
Predicate proofs ("over 18" without DOB) | No (requires explicit claim) | Yes
Wallet support (mid-2026) | Wide | Limited
Verifier support (mid-2026) | Wide | Limited
Revocation handling | Standard | Standard
Production maturity | High | Moderate

SD-JWT is the workhorse for production OID4VC in 2026. BBS+ is the path forward for use cases that need true zero-knowledge predicate proofs.

Where this is heading

Three trajectories.

First, more sectors adopt VC-gated access. Healthcare, finance, government, professional services, and supply chain are leading. Consumer sites are slower but trending.

Second, agent wallets become standard. Agentic browsers (Stagehand, browser-use, Operator) are adding wallet capabilities to programmatically present credentials. The boundary between user wallets and agent wallets blurs.

Third, the credential ecosystem fragments before it consolidates. The 2026 landscape has dozens of credential formats and trust frameworks. The next two years bring consolidation as larger ecosystems (EUDI Wallet, Singpass, US Wallets) absorb smaller ones.

For the broader trajectory of access patterns, see the agentic browser revolution.

Compliance overlay for VC-using scrapers

A scraper that uses VCs has compliance implications that traditional scraping does not. Four controls:

  1. Credential governance: written policy on which credentials the operation holds, who legitimately holds them, what use is in scope.
  2. Audit logging: every credential presentation logged with timestamp, target, claim selection, and outcome.
  3. Revocation handling: when a credential is revoked, the operation must stop using it within a defined window.
  4. Misuse safeguards: technical controls preventing inappropriate use of held credentials.

For the broader policy build, see building an ethics-first scraping policy.

FAQ

Is VC-gated access widespread in 2026?
It is in early production for several major sectors (EU public services, Singapore, regulated industries) but not yet ubiquitous. Adoption is accelerating.

Can a scraper hold a verifiable credential?
Yes if the credential is appropriate to the operation (research credential for a research scraper, broker credential for a broker-affiliated scraper). Falsifying a credential is fraud, full stop.

What is the difference between OAuth and OID4VP?
OAuth is account-based and platform-mediated. OID4VP is credential-based and portable across sites without per-site accounts.

What if I just want to bypass the VC requirement?
Bypass is generally infeasible (cryptographic) and almost certainly fraudulent (misrepresentation). The legitimate paths are partnership, delegated access, or licensed access.

Are VCs only for humans?
No. VCs can be issued to organisations, software agents, IoT devices, or any entity with a DID. Agent-targeted credentials are a growing category in 2026.

Extended verifiable credentials analysis

Verifiable credentials (VCs), as specified by the W3C VC Data Model 2.0, became a production technology between 2023 and 2026. The eIDAS 2.0 European Digital Identity Wallet, California's mobile driver's licence (mDL) programme in the US, and several Singpass pilots in Singapore all shipped VC-based credentials by mid-2026.

For scrapers VCs change three things. First, gated content moves from cookie auth to credential presentation. Second, content provenance can be verified cryptographically rather than through platform attestation. Third, scrapers themselves can present credentials to identify their operator and purpose.

The 2026 stable VC stack consists of four layers.

  1. Issuance protocol (OpenID for Verifiable Credential Issuance, OID4VCI).
  2. Presentation protocol (OpenID for Verifiable Presentations, OID4VP).
  3. Credential format (JWT-VC, SD-JWT-VC, mDoc).
  4. Status mechanism (Status List 2021, Bitstring Status List).

Implementation pattern: VC verification at fetch

from vc_lib import verify_presentation, check_status  # illustrative API; substitute your VC stack

async def verify_vc_presentation(presentation_jwt, expected_issuer, expected_claims):
    # 1. Cryptographic check: signature and proof validity.
    result = verify_presentation(presentation_jwt)
    if not result.valid:
        return False, "signature_invalid"
    # 2. Issuer trust: the credential must come from the expected authority.
    if result.issuer != expected_issuer:
        return False, "issuer_mismatch"
    # 3. Revocation: consult the status list at presentation time.
    status_ok = await check_status(result.credential_id)
    if not status_ok:
        return False, "credential_revoked"
    # 4. Claims: every expected claim must match exactly.
    for claim, expected in expected_claims.items():
        if result.claims.get(claim) != expected:
            return False, f"claim_mismatch_{claim}"
    return True, result

SD-JWT-VC pattern for selective disclosure

def request_minimum_disclosures(claims_needed):
    # Build an OID4VP presentation definition (DIF Presentation Exchange)
    # that asks the wallet to disclose only the listed claims.
    return {
        "presentation_definition": {
            "id": "scraper-disclosure-request",
            "input_descriptors": [{
                "id": "minimal_id",
                "constraints": {
                    "fields": [{"path": [f"$.{c}"]} for c in claims_needed],
                    "limit_disclosure": "required",
                },
            }],
        },
    }

Comparison: VC formats for scrapers

Format | Selective disclosure | Size | Adoption 2026
JWT-VC | No | Compact | High
SD-JWT-VC | Yes | Compact | High and growing
LDP-VC | No (yes with BBS+) | Larger | Moderate
mDoc (ISO 18013-5) | Yes | Compact | High in govt
AnonCreds | Yes (with ZK) | Larger | Moderate

C2PA content credentials for provenance

C2PA content credentials are a parallel standard for media provenance. A scraper can verify a media file's signed manifest to confirm origin and edit history. The verification flow is:

  1. Read the C2PA manifest from the file (JPEG, MP4, PDF support).
  2. Verify the signature against the trust list.
  3. Walk the assertion chain (capture, edits, transformations).
  4. Present provenance to the downstream consumer.

import json

from c2pa import Reader  # c2pa-python bindings

def read_provenance(media_path):
    # Reader parses the embedded C2PA manifest store from the file.
    reader = Reader.from_file(media_path)
    manifest = json.loads(reader.json())  # reader.json() returns a JSON string
    return {
        "issuer": manifest.get("issuer"),
        "claim_generator": manifest.get("claim_generator"),
        "assertions": manifest.get("assertions", []),
        "valid": reader.validation_status() == "valid",
    }

Additional FAQ

Are VCs required to scrape?
Not yet. They are an option that improves trust and reduces friction with gated content.

What about issuer trust?
Verifiers maintain a trust list of accepted issuers. The trust list is updated as new issuers are accredited.

Can VCs be revoked?
Yes. The Status List 2021 and Bitstring Status List mechanisms publish revocation status that verifiers check at presentation time.

How does VC presentation interact with privacy law?
Selective disclosure formats minimise the personal data shared. SD-JWT-VC and AnonCreds support disclosure of only what is needed.

Common pitfalls when integrating verifiable credentials into a scraping stack

Five failure modes consistently bite teams that ship VC integration for the first time.

The first pitfall is skipping nonce and audience binding. A presented credential without a fresh nonce and an audience claim bound to your verifier endpoint is replayable. An attacker who captures the presentation off the wire can replay it against your verifier and obtain access. Always issue a one-time nonce per authorisation request, embed your verifier’s identifier in the audience claim, and reject any presentation that does not bind to both.

The second pitfall is caching revocation status too aggressively. Status List 2021 publishes revocation as a compressed bitstring that verifiers cache for performance. Cache lifetimes longer than 15 minutes risk granting access on credentials that were revoked between the cache fetch and the request. Set cache TTL to the value the issuer publishes in the Status List response, and never extend it locally.

The third pitfall is trusting the holder claim inside the credential. The credential’s subject DID is not the same as the presenter. A holder presenting a credential that was issued to a different subject must prove control of the subject DID through a signed proof of possession in the presentation envelope. Verifiers that match on the subject DID alone accept stolen or copied credentials.

The fourth pitfall is failing to update the trust list. The set of accredited issuers changes as governments add ecosystems and revoke compromised ones. A verifier with a stale trust list either rejects valid credentials from new issuers or accepts credentials from issuers that should be excluded. Pull the trust list daily and alert on changes.

The fifth pitfall is treating selective disclosure as optional. A verifier that requests every claim in a credential when only one is needed creates GDPR exposure and erodes user trust. Use OID4VP presentation definitions to request the minimum claim set and document the necessity of each claim in your privacy notice.

The W3C verifiable credentials data model

The W3C VC Data Model 2.0, published in 2023, defines verifiable credentials as a cryptographically signed claim about a subject made by an issuer. The data model is JSON-LD-based and supports multiple proof formats (linked data proofs, JWT, SD-JWT, BBS+).

A credential has four mandatory components. The @context binds vocabulary terms to URIs. The type identifies the credential schema. The issuer identifies who made the claim. The credentialSubject contains the actual claims. Optional components include validFrom, validUntil, credentialStatus (for revocation), evidence, and termsOfUse.
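A verifier can sanity-check the four mandatory components before any cryptographic work, which gives fast, cheap rejection of malformed credentials. A minimal sketch (the check is structural only; it says nothing about signature validity):

```python
REQUIRED = ("@context", "type", "issuer", "credentialSubject")

def check_mandatory_components(credential: dict):
    """Return the list of missing or malformed components (empty if well-formed)."""
    problems = [field for field in REQUIRED if field not in credential]
    # validFrom/validUntil are optional but, when present, must be
    # datetime strings per the data model.
    for field in ("validFrom", "validUntil"):
        if field in credential and not isinstance(credential[field], str):
            problems.append(field)
    return problems
```

Rejecting on structure first keeps signature verification, the expensive step, off the hot path for garbage input.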

For scrapers consuming credentials the credentialSubject is the operational interest. The schema of the subject varies by credential type. A driving licence credential has different fields than an employment verification credential. A verifier must understand the schema to interpret the claims.

The 2026 ecosystem has converged on a small number of widely-deployed credential types. Identity proofs (driving licence, passport equivalents), age proofs (over 18, over 21), residence proofs (jurisdiction), and educational proofs (degree completion) are common. Industry-specific credentials (medical licences, professional certifications) are growing.

Issuer trust and trust list management

A verifier must decide which issuers to trust. The trust decision is the central security question for a VC-consuming system.

The 2026 patterns for trust list management include centralised trust lists (a government registry of accredited issuers), federated trust lists (a consortium of mutually recognising issuers), and reputation-based trust (issuers earn trust through track record). Each pattern has trade-offs in centralisation, governance, and operational complexity.

A scraper acting as a verifier should explicitly choose a trust list strategy. The choice affects which credentials are accepted, which issuers must be evaluated, and how trust list updates are propagated. Trust list updates should be auditable and timestamped.
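The auditable, timestamped update requirement can be sketched as a trust list wrapper that records every change alongside its source (a minimal in-memory sketch; a production list would persist the log and diff against the upstream registry on each daily pull):

```python
import time

class TrustList:
    """Issuer trust list where every change is timestamped and logged."""

    def __init__(self, issuers=()):
        self.issuers = set(issuers)
        self.audit_log = []

    def update(self, added=(), removed=(), source="manual"):
        # Record the change before applying it so the log explains the state.
        self.audit_log.append({
            "at": time.time(),
            "added": list(added),
            "removed": list(removed),
            "source": source,           # e.g. "daily-pull", "manual"
        })
        self.issuers |= set(added)
        self.issuers -= set(removed)

    def is_trusted(self, issuer_did):
        return issuer_did in self.issuers
```

Alerting on non-empty added/removed entries in the daily pull gives the operational signal the pitfalls section above calls for.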

The 2024 EU eIDAS 2.0 regulation specifies a trust framework for European Digital Identity Wallet credentials. Member states maintain national trust lists, and the European Union aggregates them into a federated trust list. A scraper accepting EUDIW credentials inherits the eIDAS trust framework.

Status mechanisms and revocation

A credential that has been issued may need to be revoked. Revocation mechanisms in 2026 include the W3C Status List 2021 specification, the Bitstring Status List, and the older Revocation List 2020.

Status List 2021 represents revocation as a bitstring published at a stable URL. Each credential has an index into the bitstring. A bit value of 1 indicates revoked, 0 indicates valid. The verifier fetches the bitstring at presentation time, indexes into it, and acts accordingly.

The bitstring approach is privacy-preserving (the verifier learns only the bit value, not the broader status of other credentials) and efficient (a single fetch covers many credentials). The trade-off is that the issuer must maintain the bitstring and the verifier must fetch it.

Bitstring Status List, a 2024 evolution, supports more than two states (valid, revoked, suspended) with a configurable bit width. The expanded vocabulary handles cases where a credential is temporarily inactive but not permanently revoked.

C2PA content credentials in production

C2PA content credentials shipped in production at scale during 2024-2026. Adobe Photoshop, Adobe Premiere, Microsoft Designer, and OpenAI’s image generation pipelines all embed C2PA manifests on output. Camera manufacturers including Leica, Nikon, and Sony shipped C2PA-capable cameras.

For scrapers the C2PA presence in production media files creates a new provenance signal. A scraper that reads the C2PA manifest at ingest can record the claimed origin, the editing history, and any AI generation flags. Downstream consumers can use the provenance for filtering, attribution, or trust scoring.

The 2026 best practice for scrapers handling media is to preserve C2PA manifests when storing the media. The manifest is typically a few kilobytes, small relative to the media file. Preservation enables future verification even if the original source becomes unavailable.

C2PA validation has cost. Verifying the signature requires fetching the issuer’s certificate chain and checking against trust lists. Scrapers operating at scale should batch verification or sample-verify rather than verify every file.
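The sample-verify approach can be made deterministic by hashing the media identifier, so repeated crawls verify the same subset and results are reproducible (the 10 percent rate here is an arbitrary illustration):

```python
import hashlib

def should_verify(media_id: str, sample_rate: float = 0.10) -> bool:
    """Deterministic hash-based sampling: the same file always gets the same decision."""
    digest = hashlib.sha256(media_id.encode()).digest()
    # Map the first 8 bytes to [0, 1) and compare against the sample rate.
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction < sample_rate

# Roughly 10 percent of a crawl gets full C2PA verification.
sampled = [f"file-{i}.jpg" for i in range(1000) if should_verify(f"file-{i}.jpg")]
```

Hash-based sampling also lets you raise the rate for high-risk sources by keying the rate on the source domain, without any shared state between workers.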

Next steps

If your operation might encounter VC-gated sources in the next 18 months, the highest-leverage move this quarter is to read one OID4VP implementation guide end-to-end and prototype a verifier endpoint. The technology is approachable; the leverage compounds. For broader emerging-tech context, head to the DRT emerging-tech hub and pair this with the decentralized identity guide.

This guide is informational, not engineering or legal advice.
