We run a cross-tenant probe nobody asked for, because doors get found.

"Isolation is on" is a sentence; a probe that reads another tenant's data and fails is a defense.

Apollo Space Research

Apollo Space

We wrote a test whose entire job is to steal from us. It signs in as one company, then reaches across the wall and asks for another company’s data: invoices, agents, conversations, the lot. Every time we deploy, that test runs before a single customer touches the new build. We are rooting for it to come back empty-handed.

The day it does not come back empty-handed, nobody ships.

Most teams will tell you their tenants are isolated. They mean it. They also can’t show you the test that would prove them wrong if they were lying to themselves.

This post is about the difference between believing your walls are up and owning a machine that keeps trying to walk through them. The short version: “isolation is on” is a sentence; a probe that reads another tenant’s data and fails is a defense.

The naive version: trust the WHERE clause

The obvious way to keep two customers’ data apart is to tag every row with whose it is, and then remember to filter on that tag everywhere.

You add an org_id column to every table. Every query that reads data gets a condition: only rows where org_id equals the company that’s logged in. The plan is sound. For the first fifty queries, it even holds.

Then the codebase grows, and the plan starts depending on memory.

A new endpoint ships on a Friday and the developer forgets the filter on one of three queries. A reporting job written for an internal dashboard runs without a tenant scope because, at the time, there was only one tenant. An agent gets a new tool that runs a query the original author never reviewed. None of these are exotic. They are the most ordinary thing in software: a human being, under deadline, forgetting one line in one place out of ten thousand.

And the failure is silent. The missing filter doesn’t throw an error. The query runs fine, returns rows, the feature works in the demo. It just also returns rows that belong to someone else, and you don’t find out until a customer does. The whole defense rested on every engineer remembering one clause forever, and that is not a defense. That’s a wish with good intentions.
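A minimal sketch of that silent failure, using Python's built-in sqlite3 module. The table, the tenant names, and both functions are hypothetical, invented for illustration; the point is only that the query with the forgotten filter raises nothing and still returns another tenant's row.

```python
import sqlite3

# Hypothetical schema: one shared table, every row tagged with its owner.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, org_id TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO invoices (org_id, amount) VALUES (?, ?)",
    [("acme", 100), ("acme", 250), ("globex", 999)],
)


def list_invoices(org_id):
    """The query that remembered the tenant filter."""
    return conn.execute(
        "SELECT org_id, amount FROM invoices WHERE org_id = ? ORDER BY id",
        (org_id,),
    ).fetchall()


def list_large_invoices(org_id, minimum):
    """The Friday query: org_id is accepted but never reaches the WHERE clause."""
    return conn.execute(
        "SELECT org_id, amount FROM invoices WHERE amount >= ? ORDER BY id",
        (minimum,),
    ).fetchall()


scoped = list_invoices("acme")            # only acme's rows
leaky = list_large_invoices("acme", 200)  # no error, and a globex row comes back
```

Both calls succeed. Nothing in the second one tells the caller that one of the returned rows belongs to someone else.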

A filter you have to remember is a filter you will eventually forget.

Two walls, because one wall trusts your memory

The fix is not to remember harder. It’s to make the wall something the database enforces whether or not anyone remembered the clause.

Figure: Two ways to keep tenants apart. The naive lane trusts every query to remember a tenant filter, and the one query that forgets leaks another company's rows. The enforced lane puts the rule inside the database itself, so a forgotten filter returns nothing instead of someone else's data.

Modern databases can hold the rule themselves. You tell the database, once, at the table level: a row is only visible to the company that owns it, full stop. After that, it does not matter whether a query remembered to filter. The database refuses to hand back rows that don’t belong to the caller, the same way it refuses to divide by zero. The rule lives below the application, where a forgetful Friday endpoint can’t route around it.

That’s the first wall, and it is the strong one. It turns “every engineer must remember the clause” into “the engine enforces the clause,” which is the difference between a policy and a habit.
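As one concrete illustration, assuming PostgreSQL (the post does not name a database), the rule can be declared with row-level security. The sketch below only builds the SQL statements; the policy name tenant_isolation, the session setting app.current_org, and the table list are assumptions made for the example, not a description of any real schema.

```python
# Hypothetical list of tenant-owned tables, each assumed to have a text org_id column.
TENANT_TABLES = ["invoices", "agents", "conversations"]


def rls_statements(table, setting="app.current_org"):
    """Build PostgreSQL statements that make the engine enforce the tenant rule.

    Table names are interpolated directly, so they must come from a trusted,
    hard-coded list like TENANT_TABLES, never from user input.
    """
    return [
        # Turn the rule on for this table.
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # Apply it to the table owner as well, who is otherwise exempt.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        # A row is visible only when its org_id matches the session's tenant.
        # The two-argument current_setting returns NULL when the setting is
        # unset, so a session with no tenant sees no rows.
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING (org_id = current_setting('{setting}', true));",
    ]


migration = [stmt for table in TENANT_TABLES for stmt in rls_statements(table)]
```

In this sketch the application would set app.current_org once per session or transaction (for example with PostgreSQL's set_config function) and every later query on those tables is filtered by the engine. Superusers and roles with the BYPASSRLS attribute are still exempt.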

We keep the application-level filter too. Not because we don’t trust the first wall, but because two independent walls fail independently. If a configuration mistake ever weakened the database rule, the explicit filter in the query is still standing, and vice versa. Belt and braces. The cost of a second wall is a few extra characters per query, and the cost of one wall failing alone is a headline.
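A toy model of the two walls, in plain Python with no database. Everything here is hypothetical: the rows, the fetch function, and the two switches exist only to show that a leak needs both walls to fail at the same time.

```python
# Hypothetical rows in one shared table.
ROWS = [
    {"id": 1, "org_id": "acme", "amount": 100},
    {"id": 2, "org_id": "globex", "amount": 999},
]


def fetch(caller_org, app_filter=True, db_policy=True):
    """Return the rows a caller sees, with each wall switchable for the demo."""
    rows = ROWS
    if db_policy:
        # Wall 1: the engine-level rule, applied no matter what the query says.
        rows = [r for r in rows if r["org_id"] == caller_org]
    if app_filter:
        # Wall 2: the explicit tenant condition written into the query.
        rows = [r for r in rows if r["org_id"] == caller_org]
    return rows


def leaked(caller_org, **walls):
    """Rows returned to the caller that belong to some other tenant."""
    return [r for r in fetch(caller_org, **walls) if r["org_id"] != caller_org]


both_up = leaked("acme")
filter_forgotten = leaked("acme", app_filter=False)
policy_weakened = leaked("acme", db_policy=False)
both_down = leaked("acme", app_filter=False, db_policy=False)
```

Only the last case returns a foreign row. Either wall alone is enough to keep the other tenant's data out.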

But here’s the trap that catches careful teams. You can build both walls correctly on Monday and have a hole by Friday, because walls are configuration, and configuration drifts. A database role created with one too many privileges quietly bypasses the rule it was supposed to obey. A migration alters a table and the protection doesn’t carry over. The walls are real on the day you build them. The question is whether they’re still real on the day that matters, and you cannot answer that question by reading the code, because the code looks fine. The hole is in the gap between what the code says and what the running system does.
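One way to catch the migration half of that drift, sketched under the assumption of PostgreSQL: compare the tables that should be protected against what the catalog reports. The function takes rows shaped like the result of `SELECT relname, relrowsecurity, relforcerowsecurity FROM pg_class WHERE relkind = 'r'`; the table names and the sample rows are made up for the example.

```python
def find_unprotected_tables(catalog_rows, tenant_tables):
    """List tenant tables whose row-level protection is missing or partial.

    catalog_rows: iterable of (relname, relrowsecurity, relforcerowsecurity)
    tenant_tables: names of the tables that must carry the tenant rule
    """
    seen = {name: (enabled, forced) for name, enabled, forced in catalog_rows}
    problems = []
    for table in sorted(tenant_tables):
        if table not in seen:
            problems.append(f"{table}: not found in catalog")
            continue
        enabled, forced = seen[table]
        if not enabled:
            problems.append(f"{table}: row-level security is disabled")
        elif not forced:
            problems.append(f"{table}: row-level security is not forced on the owner")
    return problems


# Hypothetical catalog snapshot after a migration recreated "conversations"
# and the protection did not carry over.
snapshot = [
    ("invoices", True, True),
    ("agents", True, False),
    ("conversations", False, False),
]
drift = find_unprotected_tables(snapshot, {"invoices", "agents", "conversations"})
```

A check like this reads what the running system reports, not what the code says. It narrows the gap described above but does not replace actually trying to read the rows.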

So the real defense isn’t the walls. It’s the thing that keeps checking the walls.

The probe: a test that tries to rob you

Here is the move that turns isolation from a belief into a property. We wrote an adversary and pointed it at ourselves.

The probe is a test, but it doesn’t test the way most tests do. A normal test confirms the happy path: log in as a company, fetch your own data, assert it’s there. Useful, and completely blind to the thing we actually fear. Our happy-path tests passed on the worst day a multi-tenant system can have, because returning your own data correctly and returning someone else’s data by accident are not mutually exclusive. A leak doesn’t break the feature. That’s what makes it a leak.

So the probe inverts the question. It authenticates as one tenant, then deliberately tries to read a second tenant’s rows: by id, by query, through the agent tools, through the API routes, anywhere a real attacker or an honest bug might reach. And it asserts the opposite of a normal test. It does not assert “I got data.” It asserts “I got nothing, and I was correctly refused.” A pass is a locked door. A returned row is a failure, loud and red, before that build is allowed anywhere near a customer.
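The inverted assertion can be sketched in a few lines. This is an illustration on an in-memory sqlite3 database, not the real probe: the schema, the two access paths, and the tenant names are all hypothetical. The shape is what matters: sign in as one tenant, reach for the other tenant's rows through each path, and treat any foreign row as a finding.

```python
import sqlite3

VICTIM_INVOICE_ID = 2  # seeded fixture row that belongs to tenant_b


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, org_id TEXT, memo TEXT)")
    conn.executemany(
        "INSERT INTO invoices (id, org_id, memo) VALUES (?, ?, ?)",
        [(1, "tenant_a", "hosting"), (VICTIM_INVOICE_ID, "tenant_b", "payroll")],
    )
    return conn


def get_invoice(conn, caller_org, invoice_id):
    """Access path 1: fetch by id, scoped to the caller."""
    return conn.execute(
        "SELECT id, org_id, memo FROM invoices WHERE id = ? AND org_id = ?",
        (invoice_id, caller_org),
    ).fetchall()


def search_invoices(conn, caller_org, text):
    """Access path 2: search by memo text, scoped to the caller."""
    return conn.execute(
        "SELECT id, org_id, memo FROM invoices WHERE memo LIKE ? AND org_id = ?",
        (f"%{text}%", caller_org),
    ).fetchall()


# Each path is one attempt to reach the victim's data as the attacker.
PATHS = {
    "get_invoice_by_id": lambda conn, org: get_invoice(conn, org, VICTIM_INVOICE_ID),
    "search_invoices": lambda conn, org: search_invoices(conn, org, "payroll"),
}


def run_probe(conn, attacker_org, paths=PATHS):
    """Return every (path, row) where a row owned by someone else came back."""
    findings = []
    for name, attempt in paths.items():
        for row in attempt(conn, attacker_org):
            if row[1] != attacker_org:  # row[1] is org_id
                findings.append((name, row))
    return findings


findings = run_probe(make_db(), "tenant_a")  # a pass is an empty list
```

Pointing the same probe at an unscoped path, for instance one that runs `SELECT id, org_id, memo FROM invoices` with no tenant condition, returns tenant_b's row as a finding.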

Figure: A probe authenticates as Tenant A, then reaches for Tenant B's rows through every path an attacker could use. The walls refuse it, the probe asserts it got nothing, and the build is allowed to ship. A returned row would have stopped it.

One detail matters more than it looks. The probe connects to the database as the same kind of restricted account the real application uses, never as an all-powerful admin role. This sounds like a footnote and is actually the whole point. A superuser account in many databases is permitted to ignore the row-level rule entirely; it sees everything by design. If you test your isolation while connected as that account, every probe passes, because the account you tested with was never subject to the wall in the first place. You’d have proven that your bypass works. We run the probe as the locked-down role the product runs as, so the test feels exactly what a customer’s session feels. Test the system you ship, not a friendlier cousin of it.
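A guard along these lines can refuse to run the probe at all when it is connected as the wrong kind of account. The sketch assumes PostgreSQL, where `SELECT rolname, rolsuper, rolbypassrls FROM pg_roles WHERE rolname = current_user` describes the connected role; the role name app_rw and the helper itself are hypothetical.

```python
class ProbeRoleError(RuntimeError):
    """Raised when the probe is about to run as a role the walls do not bind."""


def assert_restricted_role(role, expected_name):
    """Check the connected role before trusting any probe result.

    role: dict with the keys rolname, rolsuper, rolbypassrls, as read from
          pg_roles for the current connection.
    expected_name: the restricted role the product itself connects as.
    """
    problems = []
    if role["rolname"] != expected_name:
        problems.append(f"connected as {role['rolname']!r}, expected {expected_name!r}")
    if role["rolsuper"]:
        problems.append("role is a superuser")
    if role["rolbypassrls"]:
        problems.append("role has BYPASSRLS")
    if problems:
        # A probe run under this role would pass vacuously, so stop here.
        raise ProbeRoleError("; ".join(problems))
    return role["rolname"]


checked = assert_restricted_role(
    {"rolname": "app_rw", "rolsuper": False, "rolbypassrls": False},
    expected_name="app_rw",
)
```

Failing closed here is deliberate: a probe that cannot confirm it is running as the product's own role reports an error, not a green result.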

And the probe runs on every deploy. Not once during an audit, not quarterly when someone remembers. Every build, before release, the adversary takes its run at the walls. The walls don’t drift unnoticed, because something is always trying to walk through them.
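Wiring a probe into a deploy usually comes down to an exit code. A minimal sketch, assuming a pipeline that stops on a non-zero exit status; the gate function and the shape of the findings, a list of (path, row) pairs, are assumptions for illustration.

```python
import sys


def gate(findings):
    """Turn probe findings into a process exit code: 0 ships, 1 blocks."""
    if not findings:
        return 0
    for path, row in findings:
        # Loud and specific: name the path that leaked and what came back.
        print(f"ISOLATION LEAK via {path}: {row!r}", file=sys.stderr)
    return 1


clean_exit = gate([])
blocked_exit = gate([("agent_tool:export", (2, "tenant_b", "payroll"))])
```

A release script would end with `sys.exit(gate(findings))`, so a single returned row is enough to stop the build.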

The door it will find

We did not write that opening line as a metaphor. The reason you run a probe like this is that, run against a fresh corner of any growing system, it eventually finds a door.

This is the part teams don’t talk about, so we will: building the walls and building the probe are two different days, and the first time a probe runs against a corner of the system the walls were assumed to cover, it can come back with rows it should never have seen. Picture how it happens. A path exists (a particular query reachable through a particular tool) where the tenant scope was never applied, and the database role in that path has enough latitude to return the rows anyway. Both walls have a matching gap in the same spot. The probe walks straight through.

Here is the thing to sit with. That kind of gap is already in the running system, on the day it exists. The probe doesn’t create the hole; it finds one that was there, one the happy-path tests were happy about, that a code review passed, that a demo showed working. The only difference between a team that catches it and a team that ships it is whether an adversary of your own got there before a customer’s curiosity did. The door is real either way. You either own the thing that knocks on every door before strangers do, or you don’t.

You don’t get to find out your isolation works. You only get to find out where it doesn’t, and you’d rather find that yourself.

You close a found door the boring way: scope the query, tighten the role, re-run the probe until it comes back refused. Then you add that exact path to the probe’s permanent route list, so it can never quietly reopen. That’s the loop that matters. Every door the probe finds becomes a door it checks forever. The defense gets stronger each time it catches something, which is the opposite of how trust-the-filter degrades over time.
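The permanent route list can be as simple as a registry that only ever grows. In this sketch the decorator, the route name, and the export_conversations tool are hypothetical; it shows a door that was found and fixed, then registered so that every later probe run checks it again.

```python
# Hypothetical data behind an agent tool.
CONVERSATIONS = [
    {"id": 1, "org_id": "tenant_a", "text": "hello"},
    {"id": 2, "org_id": "tenant_b", "text": "secret"},
]


def export_conversations(caller_org):
    """The agent tool where a door was found; the fix scopes it to the caller."""
    return [c for c in CONVERSATIONS if c["org_id"] == caller_org]


PROBE_ROUTES = {}  # route name -> callable(attacker_org) returning rows


def probe_route(name):
    """Register a path the probe must try on every run."""
    def register(fn):
        if name in PROBE_ROUTES:
            raise ValueError(f"duplicate probe route: {name}")
        PROBE_ROUTES[name] = fn
        return fn
    return register


@probe_route("agent_tool:export_conversations")  # added after the fix, never removed
def _probe_export(attacker_org):
    return export_conversations(attacker_org)


def run_all(attacker_org):
    """For each registered route, the rows returned that belong to another tenant."""
    return {
        name: [r for r in route(attacker_org) if r["org_id"] != attacker_org]
        for name, route in PROBE_ROUTES.items()
    }


report = run_all("tenant_a")
```

If a later change removes the scope from export_conversations, the same route reports tenant_b's row on the next run without anyone having to remember the incident.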

Why this is a defense and not a checkbox

There’s a softer version of all this that looks the same on a slide and isn’t the same at all.

The soft version is a sentence in a security questionnaire: “Yes, tenant data is isolated.” Every vendor checks that box, including the ones who are about to leak. The box asks whether you intend to isolate tenants. It does not ask whether a machine verifies that intent on every deploy, against the running system, as the restricted role a customer actually uses. Those are wildly different claims, and only one of them survives contact with a careless Friday.

The difference is what happens when someone makes a mistake, and someone always makes a mistake. Under the checkbox, the mistake ships, sits quietly returning the wrong rows, and surfaces as an incident. Under the probe, the mistake meets an adversary on the same day it’s written, fails a deploy, and surfaces as a red test on a developer’s screen with nobody outside the building any the wiser. Same mistake. The presence or absence of the probe decides whether it becomes a breach or a bug.

That’s the trade we’ll defend in any room. The probe costs us deploys it stops and engineer-minutes it spends. What it buys is that the worst category of failure in multi-tenant software, one customer seeing another’s data, has to get past something built specifically to catch it, every single time, instead of past a human who was tired and remembered nine clauses out of ten.

The turn: paranoia is a service you owe people

The database roles and the route lists are recent dress on something older than software.

When a customer hands you their data, they’re not buying your features. They’re extending you trust they can’t verify themselves: they cannot read your code, cannot see your tables, cannot watch your deploys. They are taking your word that their books and their conversations and their secrets won’t end up in front of a competitor who happens to use the same product. The probe is what we do with that trust when nobody’s watching. It is us, on an ordinary Tuesday with no audit scheduled and no customer asking, paying an agent to try to rob our own customers so that no one else can.

That posture (assume the wall has a hole, go find it before anyone else does, and never stop looking) is not a feature you can install. It’s a temperament, and it has to be there on the boring days, the ones with no incident and no deadline, when it would be so easy to trust the green tests and ship. The probe is just that temperament made mechanical, so it holds even when the humans are tired. The most reassuring thing we can tell a customer is not “our isolation works.” It’s “we have a machine whose whole job is to prove it doesn’t, and it runs before you ever log in.”

“Isolation is on” is a sentence. A probe that reads another tenant’s data and fails is a defense. We would rather hand you the second thing.


That’s what we’re building at Apollo Space: an operating system that doesn’t ask you to trust its walls, but shows you the adversary it runs against them every day. If you’ve ever signed a vendor’s questionnaire and quietly wondered who actually checks the box, you already understand why we’d rather break in ourselves first.
