Part 2 of the Age Verification series · ← Previous · Next →
We wrote earlier this week about why every proposed age verification method makes the problem worse. Document uploads create breach targets. Biometric scanning collects the data the laws exist to protect. Third-party vendors consolidate risk. The jurisdictional patchwork is architecturally unsolvable.
The structural failure is that current laws treat age verification as an identity problem. It is not. “Is this person over 18?” is a boolean. The answer should require one API call and zero personal data in transit.
## The question is simpler than the infrastructure
Every current approach (document upload, facial age estimation, third-party identity providers, zero-knowledge credential wallets) exists because the industry frames age verification as identity verification.
Age verification is a set membership problem.
Given a population where age is already known to the government, the question “is this person over 18?” is a lookup. The federal government already holds the canonical data: name, date of birth, and government-issued identification numbers for every person in its systems. Social Security Administration. State DMVs. USCIS. The passport office. The database exists. It has existed for decades.
The problem is not that the data is missing. The problem is that every proposed solution creates a new copy of it.
## The architecture
The system has three components. A canonical hash. A government API. An official library.
The hash. The government assigns a Verification ID (VID) to each person: a high-entropy, randomly generated identifier unrelated to Social Security numbers, driver’s license numbers, or any existing credential. VIDs rotate on a regular cycle; a new VID invalidates the previous hash. If a VID is compromised, it can be revoked and reissued without affecting the internal record.
The canonical input format is a normalized representation of name and VID. Fixed field ordering. Unicode normalization to NFC. A defined character repertoire: every character in every major script maps to a single canonical codepoint sequence, so the canonical form is deterministic regardless of language.
The client library computes a deterministic hash from this input. The hash is not reversible, contains no personal information, and is unique per person. Because the VID is high-entropy and rotates, brute-force reversal is computationally infeasible even if an attacker knows the person’s name.
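The canonicalization and hashing step might look like the following sketch. The field ordering, separator byte, casefolding, and choice of SHA-256 are illustrative assumptions; the post does not pin down a concrete hash function or wire format.

```python
import hashlib
import unicodedata

def canonicalize(name: str, vid: str) -> bytes:
    """Produce the canonical input: NFC-normalized fields in a fixed order.

    The separator byte and casefolding here are assumptions for
    illustration; a real specification would define these precisely.
    """
    name_nfc = unicodedata.normalize("NFC", name.strip()).casefold()
    vid_nfc = unicodedata.normalize("NFC", vid.strip())
    # Fixed field ordering with an unambiguous separator (US, 0x1F).
    return (name_nfc + "\x1f" + vid_nfc).encode("utf-8")

def verification_hash(name: str, vid: str) -> str:
    """Deterministic, non-reversible hash of the canonical input.

    SHA-256 is assumed here. Because the VID is high-entropy and
    rotates, knowing the name alone is not enough to brute-force
    the preimage.
    """
    return hashlib.sha256(canonicalize(name, vid)).hexdigest()

# NFC normalization makes the hash identical whether the name arrives
# with pre-composed or decomposed accents.
assert verification_hash("José Núñez", "VID-123") == \
       verification_hash("Jose\u0301 Nu\u0301n\u0303ez", "VID-123")
```

The normalization step is what makes "deterministic regardless of language" hold in practice: without it, visually identical names would hash to different values.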
The API. The client library sends the hash and a platform-generated nonce to the government endpoint. No name. No date of birth. No ID number. No biometric. The API returns a signed response: the boolean, the hash, and the nonce, signed with the government’s private key. The signature proves the response is authentic. The nonce proves it is fresh and bound to this specific verification request. Rate-limited. Authenticated. Stateless.
The library. The government publishes official open-source client libraries in every major language. The library is the single source of truth for input canonicalization, hashing, and API communication. Developers integrate it the same way they integrate any SDK. The raw attributes never leave the device. The library hashes locally, calls the government API directly, and returns the signed response to the integrating application. The platform sees the hash but cannot reverse it: the hash contains no personal information, and VID rotation ensures it cannot be used as a permanent tracking identifier. The platform never sees the name, the VID, or any raw input field.
That is the entire system.
```
User Device                Government API               Platform
───────────                ──────────────               ────────
     │                           │                          │
     │                           │         0. nonce         │
     │◀──────────────────────────────────────────────────────│
     │                           │                          │
     │   1. hash(name+VID)       │                          │
     │      + nonce              │                          │
     │──────────────────────────▶│                          │
     │                           │                          │
     │   2. signed(boolean,      │                          │
     │      hash, nonce)         │                          │
     │◀──────────────────────────│                          │
     │                           │                          │
     │        3. signed(boolean, hash, nonce)               │
     │──────────────────────────────────────────────────────▶│
     │                           │                          │
```
The platform generates a nonce and passes it to the user’s device. The user enters their name and VID. The library hashes locally and sends the hash and nonce to the government API. The API returns a signed response: the boolean, the hash, and the nonce. The device forwards this signed response to the platform. The platform verifies the government’s signature, confirms the nonce matches, and reads the boolean. The platform sees the hash but not the name, VID, or any personal data. The hash is opaque and non-reversible.
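The full round trip can be sketched as below. Everything here is hypothetical: the function names, the `AGE_DB` lookup table, and the use of HMAC, which stands in for the government's digital signature so the sketch runs with only the standard library. A real deployment would use an asymmetric scheme (e.g. Ed25519) so platforms verify with nothing but the government's public key.

```python
import hashlib
import hmac
import json
import secrets

# HMAC with a demo key stands in for the government's signature.
GOV_SIGNING_KEY = secrets.token_bytes(32)

def sign(payload: dict) -> str:
    msg = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(GOV_SIGNING_KEY, msg, hashlib.sha256).hexdigest()

def government_api(vid_hash: str, nonce: str) -> dict:
    """Step 2: look up the hash, return signed (boolean, hash, nonce)."""
    over_18 = AGE_DB.get(vid_hash, False)   # canonical set-membership lookup
    payload = {"over_18": over_18, "hash": vid_hash, "nonce": nonce}
    return {"payload": payload, "sig": sign(payload)}

def platform_verify(response: dict, expected_nonce: str) -> bool:
    """Step 4: check the signature and nonce, then read the boolean."""
    payload = response["payload"]
    if not hmac.compare_digest(response["sig"], sign(payload)):
        raise ValueError("bad signature")
    if payload["nonce"] != expected_nonce:
        raise ValueError("stale or replayed response")
    return payload["over_18"]

# Demo with one hypothetical record in the government's table.
AGE_DB = {"ab12cd": True}

nonce = secrets.token_hex(16)           # step 0: platform issues a nonce
resp = government_api("ab12cd", nonce)  # steps 1-2: device hashes and queries
assert platform_verify(resp, nonce) is True   # steps 3-4: platform checks
```

Note what the platform-side function never touches: no name, no VID, no date of birth. It verifies authenticity (signature) and freshness (nonce), and reads one bit.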
The boolean has a property that simplifies everything downstream: “over 18” is monotonic. Once true, it stays true. A platform caches the verified result against its own user account: no expiration, no re-verification, no additional API calls. Unless the platform changes its age threshold, every verified user stays verified forever. If a platform raises its threshold (say, from 16 to 18), it invalidates all cached results and re-verifies every user with a single API call each. One bulk refresh, then back to steady state.
Users who receive a “false” are the only cases that change over time. They age up. The platform allows user-initiated re-verification: a user who turns 18 triggers a fresh API call, the cache updates, and access is granted. One call per status transition.
This keeps the API out of the critical path for returning users. If the government endpoint goes down, every previously verified user is unaffected. The API matters once per user at first verification, and once more for the fraction that ages across the threshold while using the platform. At scale, sustained query volume is a rounding error compared to the initial rollout.
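The caching logic described above fits in a few lines. This is an illustrative sketch, not any real SDK; the class and method names are invented.

```python
class PlatformAgeCache:
    """Caches verification results, exploiting that "over 18" is
    monotonic: once a user verifies true, no further API calls are
    ever needed for that user.
    """

    def __init__(self, threshold: int = 18):
        self.threshold = threshold
        self._verified: dict[str, bool] = {}   # account id -> cached result
        self.api_calls = 0

    def check(self, account_id: str) -> bool:
        # Verified-true results are permanent: a pure local lookup.
        return self._verified.get(account_id, False)

    def verify(self, account_id: str, api_result: bool) -> bool:
        """One API call per status transition: the first attempt, plus a
        user-initiated retry after the user ages past the threshold."""
        self.api_calls += 1
        if api_result:
            self._verified[account_id] = True
        return api_result

    def raise_threshold(self, new_threshold: int) -> None:
        # Raising the threshold invalidates every cached result;
        # each user is re-verified once (a single bulk refresh).
        self.threshold = new_threshold
        self._verified.clear()

cache = PlatformAgeCache()
cache.verify("alice", True)      # first verification: one API call
assert cache.check("alice")      # all later checks are local lookups
cache.verify("bob", False)       # minor: result not cached as permanent
cache.verify("bob", True)        # bob turns 18 and re-verifies once
assert cache.api_calls == 3      # three calls total, forever
```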
## What this eliminates
The first post described four compounding failures of current age verification approaches. This architecture addresses each.
The verification paradox. Current methods collect the exact data they exist to protect. This system transmits a non-reversible hash. No PII crosses the wire. No biometric databases. No document uploads. No facial recognition.
The jurisdictional patchwork. A federal API provides one endpoint for all states. No geo-fencing. No conflicting compliance regimes. If Congress cannot agree on a uniform age threshold, the API can support multiple queries: over_13, over_16, over_18. The patchwork becomes a parameter.
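"The patchwork becomes a parameter" can be made concrete with a small sketch of the lookup the API would perform internally. The threshold names and the exact-birthday age arithmetic are assumptions for illustration.

```python
from datetime import date

# Hypothetical query vocabulary: differing state thresholds become a
# request parameter rather than separate compliance regimes.
THRESHOLDS = {"over_13": 13, "over_16": 16, "over_18": 18}

def age_check(birth: date, query: str, today: date) -> bool:
    """Evaluate one threshold query against a known date of birth
    (the government already holds this; nothing here leaves its systems).
    """
    years = today.year - birth.year - (
        (today.month, today.day) < (birth.month, birth.day))
    return years >= THRESHOLDS[query]

today = date(2025, 6, 1)
assert age_check(date(2010, 5, 1), "over_13", today) is True   # age 15
assert age_check(date(2010, 5, 1), "over_18", today) is False  # not yet 18
```

The same record answers every threshold query, which is why supporting all three costs nothing beyond a request field.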
The attack surface. Third-party verification vendors (the consolidation point that makes the current architecture a high-value target) are removed entirely. There is no private-sector SDK processing government IDs. There is no vendor database to breach. The Persona frontend exposure reported in February does not happen if the vendor layer does not exist.
Retention contradictions. The platform receives only a signed boolean and an opaque, non-reversible hash. The hash rotates with the VID and cannot be used for deanonymization. The platform retains the signed response as cryptographic proof of verification. The compliance evidence lives on both sides: the platform holds the signed response, and the government maintains an audit log that regulators can query directly.
## The government already runs harder systems
A rate-limited API returning a boolean is not a novel engineering challenge.
| System | Function | Scale |
|---|---|---|
| NICS | Firearm background checks | ~28M checks/year |
| IRS e-file | Tax return processing | ~150M returns/year |
| E-Verify | Employment eligibility | ~100M queries/year |
| Login.gov | Federated authentication | Dozens of agencies |
NICS returns approve, deny, or delay for firearm purchases in seconds. The age verification API is simpler: one hash in, one boolean out, no adjudication.
## Beyond web platforms
The API is not limited to websites. Any software installation that requires internet connectivity can integrate the same flow.
An operating system installer already requires an internet connection for package downloads, updates, and account setup. Adding age verification to the setup questionnaire is one additional step. The installer embeds the client library. During setup, the user enters their name and VID. The library hashes locally, calls the API, and the installer receives a signed response before proceeding. If the result is false, the installer enforces age-appropriate defaults: parental controls enabled, restricted repositories, limited account permissions. If the policy requires it, the installer can refuse to proceed entirely.
This generalizes to any distribution point with an internet requirement: mobile device setup, game console initialization, app store account creation. The same library, the same API call, the same signed response. The policy (what to restrict for minors) varies by context. The verification mechanism does not.
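The split between a fixed mechanism and a per-context policy might look like the following sketch. The context names and restriction flags are invented for illustration; only the shape of the design is the point.

```python
# Hypothetical per-context defaults enforced when verification returns
# false. The verification mechanism (hash, API call, signed response)
# is identical everywhere; only this policy table differs.
POLICIES = {
    "os_installer": {"parental_controls": True, "restricted_repos": True},
    "game_console": {"mature_titles": False, "voice_chat": False},
    "app_store":    {"age_gated_apps": False},
}

def apply_setup_policy(context: str, verified_adult: bool) -> dict:
    """Return the defaults to enforce after the signed response arrives.

    Adults get an unrestricted setup; minors get the context's
    age-appropriate defaults.
    """
    if verified_adult:
        return {}                       # no restrictions
    return POLICIES[context]

assert apply_setup_policy("os_installer", False)["parental_controls"]
assert apply_setup_policy("game_console", True) == {}
```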
## Privacy tradeoffs
This architecture is not surveillance-free by default. The API model means the government observes query traffic: which hashes are queried, for which platform, and when. The platform holds an opaque hash permanently as part of the signed verification record. The hash is non-reversible and contains no personal information, but it is a stable identifier within that platform. VID rotation limits cross-platform correlation (a different platform querying later receives a different hash), but the originating platform retains the original indefinitely. These are legitimate concerns that should be addressed explicitly.
Three approaches, each with different tradeoffs.
Statutory no-log mandate. The enabling legislation prohibits retention of query metadata. The API processes the request and discards it. This is enforceable through the same oversight mechanisms that govern other federal systems handling sensitive data. Regulators can still audit platforms by requesting their signed responses (the government’s signature is verifiable against its public key without any log). But the government cannot proactively detect platforms that skip verification entirely. Maximum privacy, but enforcement is reactive.
Audit log and re-verify endpoint. The government maintains a minimal audit log: hash, platform identifier, timestamp, and result. Because verification is one-time per user, the log is small. The platform retains the signed response as cryptographic proof of verification. If a regulator questions a platform’s compliance, either side can produce evidence: the platform presents its signed response, or the government confirms the verification through its audit log. Combined with the monotonic caching described above, the government sees exactly one query per user per platform. No ongoing activity. No repeat calls. This is the approach the rest of this post assumes.
Static dataset distribution. The government publishes the full hash-to-age-flag dataset for client-side verification. No API calls at all. But every platform now needs a copy of a roughly 33-gigabyte dataset that must be kept in sync as VIDs rotate. The government loses all visibility into who is verifying and whether platforms are complying. No rate limiting. No authentication. No audit trail. Maximum privacy, but no operational oversight and a significant distribution burden.
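A back-of-envelope check on the dataset size, under assumed numbers (covered population and per-record overhead are illustrative, not from the post):

```python
# Rough sizing for the static dataset option. Assumptions: one record
# per covered person, each holding a hex-encoded 256-bit hash (64 bytes)
# plus age flags and index overhead, ~100 bytes per record in total.
population = 330_000_000        # assumed covered US population
bytes_per_record = 100          # assumed hash + flags + overhead

dataset_gb = population * bytes_per_record / 1e9
assert round(dataset_gb) == 33  # consistent with the ~33 GB estimate
```

And that copy must be redistributed to every platform each time VIDs rotate, which is the distribution burden the paragraph describes.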
All three are categorically better than the current approach: uploading a photograph of a driver’s license to a third-party vendor that stores it on an exposed server.
## What remains unsolved
This is not a complete solution. Three problems are structural, not architectural.
Government ID access. People without government-issued identification cannot generate a valid hash. Bellovin identifies this as the core bootstrapping limitation of every privacy-preserving age verification system. But this is not a limitation of the architecture. It is a failure of government identity infrastructure that should have been resolved decades ago.
The United States is one of the only developed nations without a universal, accessible government identity system. Estonia issues digital identity at birth. India’s Aadhaar system covers 1.4 billion people. The U.S. still requires in-person visits to underfunded DMV offices with inconsistent documentation requirements that vary by state, effectively locking millions of Americans (disproportionately low-income, elderly, disabled, rural, and minority populations) out of systems that increasingly depend on government-issued ID.
These same states have passed laws that require age verification to access online services. If the government mandates identity-dependent verification, it has an obligation to make identity accessible. That means funded, modernized, universally available identity issuance. Not as a prerequisite for this architecture, but as a basic function of government that has been neglected while legislators pile new mandates on top of infrastructure they never built. The blocker is not the hash or the API. The blocker is that the government keeps writing requirements against an identity system it has not finished deploying to its own citizens.
Credential sharing. If someone shares their name and VID with another person, that person can compute the hash and pass verification. Device binding or local biometric unlock mitigates this but adds complexity. No age verification system, including in-person ID checks at a bar, fully solves credential sharing.
Political will. This architecture requires the federal government to build and maintain an API. The current legislative approach (mandating that private companies figure it out) avoids government engineering effort at the cost of privacy, security, and architectural coherence. Seventeen states have passed age verification laws. None of them include a government-operated verification service. None of them fund one. The laws create obligations and externalize every cost.
If legislators believe children must be protected from certain online services, then the government has two honest options: provide the infrastructure to verify age without compromising privacy, or shut the services down until it can. What it cannot coherently do is mandate verification, refuse to build the verification system, refuse to modernize the identity infrastructure the system depends on, and then blame the private sector when the result is a surveillance apparatus. That is not child safety policy. It is liability transfer.
## The structural argument
The age verification debate has produced an extraordinary amount of complexity in service of a simple question. Zero-knowledge credential wallets. Biometric age estimation. Third-party identity providers with contractual liability chains. Device-level attestation APIs. Each adds infrastructure. Each creates new attack surface. Each collects data that would not otherwise exist.
Bellovin’s analysis of privacy-preserving age verification identifies the IDP bootstrapping problem as the core limitation of credential-based systems. Doctorow’s objection is that identity providers and certificate authorities will inevitably be forced to keep logs that destroy the privacy guarantees. Both critiques target the credential issuance infrastructure. This architecture does not have one. There are no identity providers. There are no certificate authorities. There is a hash and an API.
The government already has the data. The question is a boolean. The answer is one API call. The technical problem was solved the day hash functions and digital signatures were invented. The obstacle is a policy environment that keeps asking the private sector to build identity infrastructure the government already operates, and wondering why every solution creates a surveillance system.
Legislators who are serious about protecting children online should be asked a direct question: will you fund and build the federal age verification API, or will you continue to mandate that private companies collect the identity documents of every American who wants to use the internet? And if the government cannot modernize its own identity infrastructure fast enough to support the mandates it is passing, then it should have the honesty to halt enforcement until it can. The alternative (requiring verification without providing the means to do it safely) is not protecting anyone. It is creating a new attack surface and calling it protection.
For background on why current approaches fail, see To Protect Children, First Centralize Everything Worth Stealing. For the academic treatment of privacy-preserving age verification limitations, start with Bellovin’s paper. For the strongest counterargument to the entire enterprise, read Doctorow’s response.