Dogfooding found the gap a thousand tests missed: we tried to hire into our own OS
You don't discover what's missing by testing the system, you discover it by living inside it.
Apollo Space Research
Apollo Space
The suite was green. Every endpoint answered, every record saved, every page rendered in the time it was supposed to. By every number we tracked, the system worked. Then we tried to do something a real company does on a random Tuesday: add a person to it. We walked straight off the edge of the map. There was a button to invite. There was nothing on the other side of accepting. The new person could log in and stand in a room where no work was addressed to them, because we had built a system that could do the job and forgotten to build the part where someone new arrives to do it.
A thousand tests had run that morning. Not one of them had ever tried to be the new hire.
That’s the whole lesson, and the rest of this post is why it’s not an embarrassing footnote but the most reliable way we know to find what’s actually broken. You don’t discover what’s missing by testing the system. You discover it by living inside it.
The thing tests are structurally blind to
Here’s the model every engineer trusts, and should: you write tests that assert the system does what you asked. The invite email sends. The record persists. The page loads in time. Green means the behavior you specified is the behavior you got. It’s the best lie-detector we have, and we’d never ship without it.
But look closely at what that sentence assumes. A test can only check the behaviors you thought to name. It is a perfect memory of your own imagination, and a total blank everywhere your imagination didn’t go.
The new-hire flow wasn’t failing a test. It had no test, because nobody had pictured the moment. We had tests for sending an invite and tests for storing a user and tests for rendering the dashboard, each one green, each one true. The thing that was missing lived in the seam between them: what is this person supposed to do on day one, and how does the system hand them their first real piece of work? No assertion covers a question nobody asked.
This is the structural blind spot, and it has nothing to do with how good your tests are. A million more of them, all passing, would not have caught it. They were all looking inward, confirming the parts we’d already imagined. The gap was in the part we hadn’t.
The naive fix: write more tests
The obvious response to “a test didn’t catch it” is “write more tests.” Add coverage for onboarding. Assert the new person sees a task. Close the hole and move on. It feels rigorous, and it’s the move we reached for first.
It doesn’t work, and the reason is humbling.
To write the test that would have caught this, we’d have needed to already know the flow was missing. The test is downstream of the insight, not a source of it. We can only assert “the new hire sees their first task” after we’ve felt the absence of that task, after we’ve been the confused people logging into the empty room. Writing more tests hardens the map you already drew. It does nothing about the territory you never walked. You can spend a year raising coverage from high to higher and never once stumble into the missing continent, because coverage measures how thoroughly you check what you imagined, not whether your imagination was complete.
More tests make a known system more reliable. They are powerless to reveal an unknown one. That’s not a flaw in testing; it’s the definition of what testing is.
Our way: become your own worst-onboarded user
So we stopped trying to test our way to the gap and did the only thing that actually surfaces it. We tried to live inside the system as the people we built it for: not as the engineers who knew every shortcut, but as strangers meeting it for the first time.
The key idea is simple. You don’t discover what’s missing by testing the system; you discover it by living inside it.
Living inside it means using the product to run the actual operation, with the actual friction, where the cost of a missing piece lands on you instead of on a user you’ll never meet. When we tried to onboard a new person and hit the empty room, that wasn’t a bug report filed by a tester checking a box. It was a wall we walked into while trying to get real work done, which is the only kind of wall that tells you the truth. A tester asks, “does the button work?” Someone living inside the system asks, “I clicked the button, now what am I supposed to do, and why is nothing here?”
That second question is the one tests can’t generate, because it comes from intent, not specification. The user has a goal that spans ten features the test suite checks one at a time, and the goal is where the seams show. Each feature passes. The journey across them snaps in half.
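A minimal sketch of that shape, in Python. Every name here (`CompanyOS`, `invite`, `accept`, `dashboard`, `first_day_has_work`) is invented for illustration and is not the real system. It shows three per-feature checks that all pass, next to the one journey question that doesn't. Note the order of events it implies: the journey check could only be written once someone had already felt the empty room.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    email: str
    accepted: bool = False
    tasks: list[str] = field(default_factory=list)


class CompanyOS:
    """Toy stand-in for the system; all names are hypothetical."""

    def __init__(self) -> None:
        self.users: dict[str, User] = {}
        self.outbox: list[str] = []

    def invite(self, email: str) -> None:
        # Feature 1: the invite goes out.
        self.outbox.append(email)

    def accept(self, email: str) -> User:
        # Feature 2: the user record is stored on acceptance.
        # Nothing here assigns a first task; that is the missing seam.
        user = User(email=email, accepted=True)
        self.users[email] = user
        return user

    def dashboard(self, email: str) -> str:
        # Feature 3: the dashboard renders for a stored user.
        user = self.users[email]
        return f"Welcome, {user.email}. Open tasks: {len(user.tasks)}"


def feature_checks(system: CompanyOS) -> dict[str, bool]:
    """Per-feature assertions: each is true on its own."""
    email = "new@example.com"
    system.invite(email)
    user = system.accept(email)
    page = system.dashboard(email)
    return {
        "invite_sent": email in system.outbox,
        "user_stored": system.users.get(email) is user,
        "dashboard_renders": page.startswith("Welcome"),
    }


def first_day_has_work(system: CompanyOS, email: str) -> bool:
    """The journey question: after arriving, is any work addressed to this person?"""
    system.invite(email)
    user = system.accept(email)
    return len(user.tasks) > 0


feature_results = feature_checks(CompanyOS())
journey_result = first_day_has_work(CompanyOS(), "hire@example.com")

print("feature checks:", feature_results)
print("first day has work:", journey_result)
```

In this toy version, all three feature checks come back true and the journey check comes back false. Nothing in the first function is wrong. The second function simply asks a question the first one never names.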
When you treat your own friction as the signal, the gap stops being an embarrassment and becomes the most valuable thing you found all week. The empty room wasn’t a failure to ship. It was a flow we now knew we owed the next real person, discovered for the cost of one afternoon instead of one churned customer.
Why an OS especially has to be lived in
This matters more for some software than others, and it matters most for the kind we’re building.
A single-purpose tool has a small surface. You can imagine every path through a stopwatch app, so testing-as-you-imagined gets you most of the way. But an operating system for running a company isn’t a tool you open and close. It’s the place the work lives: people, tasks, decisions, the handoffs between them, the day-one arrival and the someone-just-left departure. Its value is in the connective tissue, and connective tissue is exactly what unit tests can’t see, because every unit passes in isolation while the thing that’s broken is the space between them.
Think of the difference between testing a door and living in a house. You can verify the door opens, closes, and locks, all green. You only discover there’s no hallway behind it by trying to walk somewhere.
The hire-a-person gap was a missing hallway. The door worked perfectly.
An OS is mostly hallways. The features are the easy part; the ways they connect into a life someone can actually live are the hard part, and they are invisible to anyone who only checks the doors. The only way to find the missing hallway is to try to get from one room to another while carrying something real, which is to say, to live there. That’s why we make ourselves the first inhabitants of every floor we build, before we ask anyone else to move in.
A test asks whether the door opens. Living inside the system asks whether there’s a room on the other side.
Why this is a discipline, not a confession
It would be easy to read all this as “we shipped something incomplete.” That’s not the lesson, and reframing it matters, because the failure mode here is universal: it happens to every team that builds a system rich enough to be lived in.
Every sufficiently connected product has gaps its makers cannot see from the inside of their own assumptions. The question is never whether they exist. It’s whether you find them by walking into them yourself, on your own time, at your own cost, or whether your customer finds them first, on the day they were counting on you. Dogfooding isn’t penance for incomplete work. It’s the cheapest possible place to take the hit.
So we made it a rule rather than a reflex. Before a flow is real, someone on the team has to live the whole journey end to end: arrive as the new person, do the work, hit the seam, name the friction out loud. The friction becomes the next thing we build, not the next thing we apologize for. The empty room turned into an onboarding flow within days, not because a test went red, but because a human went looking for their first task and found nothing, and that nothing was louder than any failing assertion.
The discipline is to keep moving in deeper. Every floor we finish, we try to live one floor up. The gaps stay one step ahead of the people we’d otherwise ship them to.
The turn
We keep coming back to the moment in the empty room: not the missing flow, but the feeling of it. The small, specific confusion of being a person who’d just been handed access to a system and had no idea what they were supposed to do with it. No test would ever have felt that, because feeling it requires wanting something the software can’t yet give you.
That wanting is the part you can’t automate. A suite checks whether the system does what we declared. It cannot check whether the system is good to be inside, whether arriving feels like joining a team or like waking up in an empty office with the lights off. The only instrument that measures that is a human being trying to live a real day through the product and noticing, in their gut, where it went cold. Our best feature discoveries didn’t come from a dashboard. They came from one of us getting quietly frustrated while trying to use our own thing, and refusing to look away.
Caring enough to be your own worst-onboarded user, to seek out your own friction instead of routing around it, is not something you can install. It’s a posture. And it’s the one that finds the gap a thousand tests missed.
That’s what we’re building at Apollo Space: an operating system we have to be willing to live inside before we ask anyone else to. The day our own onboarding left us standing in an empty room, the system was technically perfect and quietly unlivable, and the only reason we know the difference is that we went looking for it ourselves. If the most honest test you can run is to try to do your own job inside your own product, we think every team should be brave enough to fail it first.