Design it together, then prove it

The verification loop has to prove itself every time, and it’s worth the attention.

This week I shipped a permissions change with Claude. The tests passed and it went out to staging.

That tells me the code ran. It doesn’t tell me it does what I meant. To trust it, and to actually know it works, you verify it. That is the part I want to show.

Claude, in its own words

Real data, not a fixture. I didn’t invent a clean test record. I used a document already in the staging data, uploaded earlier while checking the upload feature worked. Real data through the real flow, not a row I made up to pass my own test.

Every surface, not just the obvious one. The file was reachable two ways: one endpoint that lists it, and a separate one that serves the bytes. Both had changed in the code, but they are different paths, and a screen that hides the list while the download stays open is still a full leak. I checked each on its own.

The proof is in the delta: same call, swap the identity. I ran the same request and changed only who was asking. A stranger with nothing but the link got an empty list, and a flat “unservable” on the download. Someone tied to the account got every file. Same request, different person, different answer.

Hit the server directly, underneath the screen. I called the backend with no app in front of it, because that is how someone would actually come at it. The server is what decides who gets the file, so that is where I aimed. A green screen only tells you the happy path looks fine.

Verify the negative and the positive. I checked both directions: denied for the wrong person, and still working for the right one. A permissions change that locks everyone out passes the security test and quietly kills the feature.

Round-trip the toggle, then restore state. The change shipped with an opt-out, where an owner can make the sources public again. I flipped it on through the real screen, watched both endpoints open back up for the stranger, flipped it off, watched both close, then put the record back exactly as I found it.

Don’t let the check become the leak. To test the logged-in case I needed the active session, so I ran that call inside the browser, with the token never leaving the page, and kept only the result.

Anchor every pass to an output you can read. Nothing here was “should work.” Every line was something I could point at: an empty list, the word unservable, four filenames, an ok.

Why that walk is the job

This was a new access design, not a bug fix. What breaks a design like this is divergence: more than one path reaches the same file, and the paths stop agreeing on who’s allowed. So the test isn’t whether the feature worked once. It’s whether every path and every identity land on the same rule, every time.

Every step above is a refusal to settle. A fixture, when real data was sitting right there. One endpoint, when a second one served the same file. A single green result, when the proof only shows up in the contrast between two identities. The screen, when the server is the thing that decides.

Why I can let it work for hours

In every project, every feature, every piece of work, I make sure Claude has a way to validate itself. So once we’ve thought it through, planned, and decided what to build, I can trust the process and let it work for hours.