The first thing almost every app does is ask you who you are. Email, name, sometimes a phone number, often more. We’ve come to treat this as the natural cost of using software.
But it isn't. It’s a habit. A series of reasonable individual choices by app builders that compounded over decades into a default that nobody actively chose. Email for password reset. Name for the welcome screen. Avatar for the navbar. Phone number for two-factor authentication. Each was justified on its own. The aggregate is a complete identity held by every app you’ve ever signed up for, sitting in a database you have to trust will never be breached or subpoenaed.
What “privacy” usually means
When a product page mentions privacy, it almost always means protection: "We encrypt your data at rest and in transit. We’re SOC 2 compliant. We will not sell your information to third parties." These are good things. They are also a category of claim that quietly accepts the premise: that the app has your data.
There’s a different category of claim I’m interested in: "We don’t have your data in the first place." You can’t leak what you never collected.
The two aren’t opposed — most credible privacy postures use both. Protection is mostly a security problem. Non-collection is an architecture problem.
The privacy ladder
Most apps live somewhere on a ladder that runs roughly:
1. Full identity. Email, name, phone, IP address, behavior. Encrypted at rest, but legible to anyone with database access.
2. Minimization. Same data, but less of it. “We only collect what we need.” The line between need and want is sometimes fuzzy.
3. Pseudonymization. Identifiers are replaced with opaque tokens; the mapping back to real identity is held separately, behind a key, or by a third party. The user can be re-identified by someone, but not by everyone.
4. Anonymity. Nobody at the operator can identify the user, even with full database access. Rarer than people think — most systems that claim it are pseudonymous if you look closely.
5. Unobservability. Even the fact that a particular user used the service at a particular time isn’t recoverable. The dark web lives here. Almost no SaaS does.
Each step up has costs. Going from 1 to 2 costs you marketing reach. Going from 2 to 3 costs you support workflows (“what’s the email on your account?”). Going from 3 to 4 costs you account recovery, security notifications — basically every flow that assumes the operator can reach the user. Going from 4 to 5 costs you most of what a typical SaaS even is.
The interesting design space, for a typical app with normal users, is level 3. The question is: how far up level 3 can you push before the cost and friction outweigh the benefits?
A case: anonymous authentication
A small example, from a project I’ve been building. botchat is a chat product that lets users talk to multiple AI models at once. Sign-in happens through the usual auth providers — Google, Apple, etc. — and from the user’s perspective the flow looks completely ordinary: click a button, log in.
The unusual part is what happens on the server side. When the auth provider returns its payload, the app takes the provider’s stable user ID and stores a salted hash of it (a fingerprint, in effect). The name, email, avatar, and all other personal information are discarded. All the app has is that fingerprint and a created-at timestamp. That’s the entire identity surface.
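To make that concrete, here is a minimal sketch of the server-side step in TypeScript. It is not botchat’s actual code; the names, the HMAC construction, and the environment variable are assumptions about one reasonable way to do it:

```ts
import { createHmac } from "node:crypto";

// Hypothetical shape of the payload an OAuth provider returns.
interface ProviderPayload {
  sub: string;      // the provider's stable user ID: the only field used
  email?: string;   // discarded, never stored
  name?: string;    // discarded, never stored
  picture?: string; // discarded, never stored
}

// Server-side secret (the "salt" discussed below). If it ever leaks
// alongside the database, the hashes become linkable to subject IDs.
const IDENTITY_SECRET = process.env.IDENTITY_SECRET!;

// A keyed hash (HMAC) rather than a bare SHA-256, so the fingerprint
// can't be brute-forced from a list of known subject IDs alone.
function fingerprint(provider: string, payload: ProviderPayload): string {
  return createHmac("sha256", IDENTITY_SECRET)
    .update(`${provider}:${payload.sub}`)
    .digest("hex");
}

// The entire identity surface: a fingerprint and a timestamp.
interface UserRecord {
  fingerprint: string;
  createdAt: Date;
}

function toUserRecord(provider: string, payload: ProviderPayload): UserRecord {
  // email, name, and picture are never read, logged, or persisted.
  return { fingerprint: fingerprint(provider, payload), createdAt: new Date() };
}
```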
Sessions live in a short-lived token inside a cookie that is used for nothing but the session: no tracking, no marketing identifiers. Conversations live in the user’s browser, in local storage, never on the server. Uploaded files are processed in memory and never written to disk. The server is, by design, a stranger to the people using it.
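The client half is similarly small. A sketch of browser-side conversation storage, with illustrative key names and shapes rather than botchat’s real ones:

```ts
// Conversations never leave the browser; the server never sees them.
interface Conversation {
  id: string;
  messages: { role: "user" | "assistant"; text: string }[];
}

const STORAGE_KEY = "conversations"; // illustrative key name

function loadConversations(): Conversation[] {
  return JSON.parse(localStorage.getItem(STORAGE_KEY) ?? "[]");
}

function saveConversation(conv: Conversation): void {
  // Replace any existing copy of this conversation, then persist locally.
  const rest = loadConversations().filter((c) => c.id !== conv.id);
  localStorage.setItem(STORAGE_KEY, JSON.stringify([...rest, conv]));
}
```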
This is level 3 — pseudonymous, not anonymous. Google still knows exactly who logged in. If (geek alert) our salt ever leaked alongside the database, the hashes would become linkable back to auth provider subject IDs and the pseudonymity would collapse. Those are the honest costs of the choice. Being precise about where you actually sit on the ladder matters more than picking the most flattering label for it.
The costs in product terms are real but tolerable. There are no password reset emails — OAuth handles that. There are no “we noticed a login from a new device” notifications. There’s no win-back campaign when a user goes quiet, because there’s no inbox to win them back through. Customer support can’t start with “what’s the email on your account?” because there isn’t one. Every one of those is a feature we chose not to have. None of them, it turns out, are load-bearing.
How much further could you go?
If pseudonymous auth is level 3, what does level 3-and-a-half look like? A few directions, in increasing order of absolutism:
- Hide the identity broker. Use a private relay or a similar abstraction, so that even the auth provider sees a per-app pseudonym rather than a stable identity. The operator knows less; the broker also knows less.
- Passkeys instead of OAuth. A keypair bound to the user’s device; there’s no provider in the middle and no email by default. The operator stores a public key (see the sketch after this list). Account recovery becomes the user’s problem — a real cost — but it removes the third party that knew everything.
- Zero-knowledge proofs. The user proves they hold a valid credential (a paid subscription, a membership, an auth login) without revealing which credential. The operator can gate access without identifying anyone. Well-studied in cryptography, rare in software.
- Network-level anonymity. Stop logging IP addresses. Treat network metadata as identifying data and discard it the same way you discarded the email. Most apps don’t — and most apps don’t realize they’re collecting it.
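For the passkey direction, the browser half is already standard. Here is a sketch of registration using the WebAuthn API; the `/auth/*` endpoints and the rp name are hypothetical, and real code would parse and verify the attestation response server-side:

```ts
// Browser-side passkey registration via the WebAuthn API.
async function registerPasskey(): Promise<void> {
  // The challenge must come from the server to prevent replay.
  // The endpoint name is an assumption.
  const res = await fetch("/auth/challenge");
  const challenge = new Uint8Array(await res.arrayBuffer());

  // The user handle is random and opaque: no email, no real name.
  const userHandle = crypto.getRandomValues(new Uint8Array(32));

  const credential = (await navigator.credentials.create({
    publicKey: {
      challenge,
      rp: { name: "example-app" }, // hypothetical app name
      user: {
        id: userHandle,
        name: "anonymous",        // label shown in the user's passkey manager
        displayName: "anonymous",
      },
      pubKeyCredParams: [{ type: "public-key", alg: -7 }], // ES256
    },
  })) as PublicKeyCredential;

  // The operator stores only the credential ID and, after parsing the
  // attestation response (elided here), the public key.
  await fetch("/auth/register", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ credentialId: credential.id }),
  });
}
```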
None of these are free. Each removes a layer of identifying information at the cost of a layer of product convenience. The right question isn’t “which is best” but “which of these costs would a particular product’s users actually feel, and which would they never notice?”
The operative question
When you sit with the problem long enough, you realize most apps have it backwards. They start from a default that collects everything and ask, for each piece of data, “is there a reason to stop collecting this?” The answer is almost always no, so the database grows.
A more honest default starts from nothing and asks, for each piece of data, “is there a reason to start collecting this?” The answer is sometimes yes — but it’s an informed yes, tied to a feature the user is actually asking for, not a future possibility the operator wants to keep open.
Risk-conscious app builders should treat users as strangers, and design the product so it can stay that way. Most of the time you can. Most of the time, when you can’t, the thing forcing you to know more is a feature you didn’t really need to build.