Automation Thesis

If agents are the new apps, something has to be the new OS

Apps need an OS underneath, and a model API is not an OS any more than a CPU is Windows.

Apollo Space Research

Apollo Space

Hand a developer a raw CPU and ask them to ship a word processor by Friday. No files, no windows, no memory protection, no way to talk to the disk, just an instruction set and a clock. They will spend the first six months not writing the word processor. They will spend it building the layer that should have already existed: a way to store a document, a way to draw on the screen, a way to keep one program from scribbling over another. They will, in other words, accidentally write an operating system before they get to write the app.

That is roughly where everyone building with AI agents is standing right now. We have the CPU. We are pretending it’s the computer.

The category everyone is repeating

There’s a line you hear at every conference now: agents are the new apps. It’s a good line, and we think it’s correct. A decade ago you built a company by stringing together SaaS apps: one for email, one for the CRM, one for invoicing. In the next decade you’ll build it by composing agents: one that handles inbound, one that chases collections, one that watches the numbers and flags the anomaly.

But the people repeating the line keep stopping one layer too early. They treat “agent” as the whole stack, as if an app were the only thing a computer needs. It isn’t, and it never was.

An app is not a computer. It’s the top floor of one. Everyone is shipping top floors and standing on air.

Apps don’t run on nothing. They run on an operating system that nobody talks about precisely because it works. It schedules them, remembers their state, lets them reach the hardware, and keeps them from corrupting each other. Take that layer away and an “app” is just a clever idea with nowhere to stand. So if agents really are the new apps, the more important question is the one nobody’s racing to answer: what’s the new OS?

The naive answer: the model is the OS

Here’s the answer most teams reach for first, because it’s sitting right there. The model is so capable (it reasons, it writes code, it calls tools) that it feels like the whole platform. So you wire your agent straight to a model API, give it a system prompt, hand it a few tools, and ship. The model is the brain. The brain is the computer. What more do you need?

For a demo, nothing. For a real company, almost everything.

A model API is a CPU. An extraordinary one, but a CPU. It does exactly one thing, brilliantly and statelessly: you hand it tokens, it hands you tokens back, and then it forgets you completely. It has no memory of the last conversation. It has no clock, so it can’t decide to do anything on Tuesday. It has no idea which tools exist in your company or which ones this particular agent is allowed to touch. It cannot tell whether the agent acting right now belongs to you or to the company down the street whose data must never, ever cross into yours.

Picture what happens when you build the company straight on the CPU. Every agent reinvents memory from scratch, badly, in its own prompt. Every agent re-learns which tools exist by being told again, every single time. Nobody schedules anything, so the whole system goes back to waiting for a human to poke it. And the day you have two customers, you discover the model never had a concept of whose agent this is. That boundary was always going to be your job, and you hadn’t built it.

That’s not a missing feature. That’s a missing floor of the building. The model gave you raw compute and the illusion of a platform, and the gap between the two is exactly the operating system nobody wrote.

On the left, an agent wired straight to a model API has raw reasoning but no memory, no clock, no tool registry, and no tenant boundary, four holes where an operating system should be. On the right, the same agent sits on an OS layer that fills all four.

What the OS actually has to provide

So let’s build the missing floor on purpose. The useful move is to take the four things a real operating system does for a real app and ask, plainly, what each one becomes when the app is an agent. The mapping is almost suspiciously clean.

A process model: agents are processes, not prompts

On a computer, the OS doesn’t run one program at a time and forget it the moment it closes. It runs many, isolates them, restarts the ones that crash, and keeps a record of what each one did. A process is a first-class thing with a lifecycle, not a one-shot.

The naive agent is the opposite: a single call, fired and forgotten. It runs once, returns text, and leaves no trace anyone can audit later. That’s fine for a chatbot and useless for a company, because a company needs to know what acted, when, on whose behalf, and what it touched. An agent OS treats each agent as a real process, something you can start, supervise, watch live, replay when it goes wrong, and hold accountable. The unit of work stops being a prompt and becomes a running thing with a history.
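
To make the idea concrete, here is a minimal sketch of an agent treated as a supervised process. Every name in it (`AgentProcess`, `supervise`, the `collections` agent, the `acme` tenant) is hypothetical and chosen for illustration; it assumes a single in-memory run and stands in for whatever supervisor a real agent OS would provide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Callable, List


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Event:
    at: datetime
    kind: str
    detail: str


@dataclass
class AgentProcess:
    """Hypothetical record for one supervised agent run."""
    agent_id: str
    tenant_id: str  # on whose behalf the agent acts
    state: State = State.PENDING
    history: List[Event] = field(default_factory=list)

    def log(self, kind: str, detail: str) -> None:
        # Every lifecycle change and side effect lands in the audit trail.
        self.history.append(Event(datetime.now(timezone.utc), kind, detail))


def supervise(proc: AgentProcess,
              step: Callable[[AgentProcess], str],
              max_restarts: int = 1) -> AgentProcess:
    """Run `step`, record what happened, and restart on failure."""
    for attempt in range(max_restarts + 1):
        proc.state = State.RUNNING
        proc.log("start", f"attempt {attempt + 1}")
        try:
            result = step(proc)
        except Exception as exc:
            proc.state = State.FAILED
            proc.log("error", repr(exc))
            continue
        proc.state = State.DONE
        proc.log("result", result)
        return proc
    return proc


# Usage: a step that fails once (say, a tool timeout) and then succeeds.
calls = {"n": 0}

def flaky_step(proc: AgentProcess) -> str:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("tool timeout")
    proc.log("touched", "crm:account/42")
    return "invoice reminder sent"

proc = supervise(AgentProcess("collections", "acme"), flaky_step)
kinds = [event.kind for event in proc.history]
```

After the run, `proc.history` answers the questions above: what acted, when, for which tenant, and what it touched. The failed first attempt stays in the record instead of vanishing.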

Memory management: a brain, not a bigger prompt

The naive fix for “the model forgets” is to stuff more into the prompt. More history, more context, more of yesterday pasted in at the top. It works until it doesn’t, which is soon: the window fills, the cost climbs, and the model still can’t remember anything you didn’t manually carry forward into this exact call.

That’s not memory. That’s a person re-reading their entire diary out loud before every sentence.

A real OS manages memory so a program doesn’t start from zero each time. The agent equivalent is a memory layer the OS owns, not the prompt: durable state that remembers the customer said yes to this proposal, that the renewal lands next month, that this agent already tried that approach and it failed. The model stays stateless, and that’s fine: CPUs are stateless too. The remembering moves down a floor, to the layer built to hold it.
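
One way to picture a memory layer that lives outside the prompt is a small durable store keyed by tenant and agent. The sketch below uses Python's standard `sqlite3` module. The `MemoryStore` class, the `facts` table, and `build_prompt` are assumptions made for illustration, not a description of any particular product.

```python
import json
import sqlite3


class MemoryStore:
    """Hypothetical OS-owned memory: durable facts keyed by tenant and agent."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path for state that survives restarts;
        # ":memory:" is used here only to keep the demo self-contained.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS facts ("
            "tenant TEXT, agent TEXT, key TEXT, value TEXT, "
            "PRIMARY KEY (tenant, agent, key))"
        )

    def remember(self, tenant: str, agent: str, key: str, value) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO facts VALUES (?, ?, ?, ?)",
            (tenant, agent, key, json.dumps(value)),
        )
        self.db.commit()

    def recall(self, tenant: str, agent: str) -> dict:
        rows = self.db.execute(
            "SELECT key, value FROM facts "
            "WHERE tenant = ? AND agent = ? ORDER BY key",
            (tenant, agent),
        )
        return {key: json.loads(value) for key, value in rows}


def build_prompt(store: MemoryStore, tenant: str, agent: str, task: str) -> str:
    """Assemble a prompt from stored facts instead of pasted-in history."""
    facts = store.recall(tenant, agent)
    lines = [f"- {key}: {value}" for key, value in facts.items()]
    return "Known facts:\n" + "\n".join(lines) + f"\n\nTask: {task}"


# Usage: the facts outlive any single model call.
store = MemoryStore()
store.remember("acme", "renewals", "proposal_accepted", True)
store.remember("acme", "renewals", "renewal_due", "next month")
prompt = build_prompt(store, "acme", "renewals", "Draft the renewal email.")
```

The model call itself stays stateless. What changes is that each call starts from a short list of remembered facts instead of a replayed transcript.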

Drivers: one tool registry, not a thousand integrations

A driver is the cleverest trick in an operating system: your program says “print,” and it just works, even though the program has never heard of your specific printer. The OS bridges the gap once, for everyone.

Without that layer, every agent re-solves tool access by hand. This agent gets the calendar wired in one way, that agent gets it wired another, and when the calendar’s API changes you go fix it in fourteen places. It’s the integration hell that consumed the last era of software, quietly reborn one agent at a time. The OS answer is a single registry of tools and the permissions to use them: the agent says “read the inbox” or “post to the channel,” and the layer underneath handles which tool, which credentials, which account, and whether this agent is even allowed. Write the bridge once. Every agent crosses it.
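
A rough sketch of such a registry follows. The `ToolRegistry` class, the capability name `channel.post`, and the agent names are hypothetical. The point is only that the implementation is registered once and the permission check happens below the agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class ToolRegistry:
    """Hypothetical registry: one implementation per capability, plus per-agent grants."""
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def register(self, capability: str, impl: Callable[..., str]) -> None:
        # Re-registering a capability swaps the implementation for every agent at once.
        self.tools[capability] = impl

    def grant(self, agent_id: str, capability: str) -> None:
        self.grants.setdefault(agent_id, set()).add(capability)

    def call(self, agent_id: str, capability: str, **kwargs) -> str:
        if capability not in self.tools:
            raise KeyError(f"unknown capability: {capability}")
        if capability not in self.grants.get(agent_id, set()):
            raise PermissionError(f"{agent_id} may not use {capability}")
        return self.tools[capability](**kwargs)


# Usage: the agent names a capability; the registry decides how and whether.
registry = ToolRegistry()
registry.register(
    "channel.post",
    lambda channel, text: f"posted to {channel}: {text}",
)
registry.grant("inbound", "channel.post")
result = registry.call("inbound", "channel.post", channel="#sales", text="new lead")
```

When the underlying API changes, only the function passed to `register` changes. An agent without a grant gets a `PermissionError` rather than quiet access.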

Isolation: the tenant boundary the model never had

This is the primitive people skip, and it’s the one that ends companies. A real OS keeps one process from reading another’s private memory. That boundary is the reason you can run untrusted programs side by side and sleep at night.

A model API has no such concept. It cannot tell you that the agent running this second belongs to one customer and the data it’s about to read belongs to another. If you didn’t build that wall, there is no wall. And “we’ll add multi-tenancy later” is the phrase that precedes the breach, because it’s not a feature you sprinkle on top; it’s the foundation the whole house rests on. The OS has to enforce, at the floor below every agent, that one customer’s agents can never see another customer’s world. Not by convention. By construction.
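
“By construction” can be illustrated with a handle that is bound to one tenant when it is created, so that no method an agent can call accepts a tenant ID at all. The sketch below is a toy under stated assumptions: `DataStore`, `TenantScope`, and the tenant names are hypothetical, and Python's underscore-prefixed attributes are themselves only a convention. A real system would enforce the same shape at a process, network, or database boundary.

```python
from typing import Dict, List, Optional, Tuple


class DataStore:
    """Hypothetical shared storage; agents never hold a reference to it directly."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], str] = {}

    def scope(self, tenant_id: str) -> "TenantScope":
        # Called by the OS layer when it starts an agent, never by the agent.
        return TenantScope(self, tenant_id)


class TenantScope:
    """The only handle an agent receives. The tenant is fixed at creation."""

    def __init__(self, store: DataStore, tenant_id: str) -> None:
        self._store = store
        self._tenant_id = tenant_id

    def put(self, key: str, value: str) -> None:
        self._store._rows[(self._tenant_id, key)] = value

    def get(self, key: str) -> Optional[str]:
        # The tenant is part of every lookup key, so there is no call
        # signature through which another tenant's row can be requested.
        return self._store._rows.get((self._tenant_id, key))

    def keys(self) -> List[str]:
        return sorted(k for (t, k) in self._store._rows if t == self._tenant_id)


# Usage: two customers, one store, no shared view.
shared = DataStore()
acme = shared.scope("acme")
globex = shared.scope("globex")
acme.put("pipeline", "42 open deals")
```

Here `globex.get("pipeline")` returns `None`. No check refused the request. The interface simply offers no way to make it.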

The four jobs of a computer operating system map cleanly onto an agent operating system: the process model becomes supervised agents, memory management becomes a durable company brain, drivers become one shared tool registry, and process isolation becomes the tenant boundary between customers.

Why “just use the framework” isn’t the OS either

There’s a second naive answer worth taking seriously, because it’s smarter than the first. Fine, you say, I won’t build on the bare model, I’ll grab an agent framework. That’s the OS.

It isn’t, and the reason is worth being precise about. A framework is a library you call. An operating system is a thing you run on. The difference sounds academic until you feel it: a library helps you write one agent well, in your process, on your machine. It does not schedule a hundred agents across a fleet, it does not own the durable memory they share, it does not enforce the boundary between your customers, and it does not keep running when your code isn’t. A framework is a very good set of power tools. It is not the building they’re used to construct.

You can tell the two apart by a simple test. When you close your laptop, does it stop? A library stops, because it only ever existed inside your running program. An operating system keeps going, because being always-on is the entire point. That’s the line between a tool you reach for and a platform you stand on.

A field note on the floor that wasn’t there

Here’s a failure mode every team building on agents meets, usually around the second real customer.

The first version works beautifully. One agent, one user, wired straight to the model, doing something genuinely useful. It demos so well that everyone forgets it’s standing on air. Then you add the second customer, and the questions start arriving all at once: the questions an operating system would have answered before you ever asked. Which agent is running right now, and for whom? Where does its memory live so it survives a restart? When the calendar tool changes, how many places break? What stops customer A’s agent from ever touching customer B’s data? None of those are agent questions. They’re all OS questions, and they were always going to come due.

The lesson isn’t that the early version was wrong. It’s that the operating system is the part you don’t see until it’s missing, and by the time it’s missing, you’re building it under pressure, in production, with customers watching. The disciplined move is to build the floor first, on purpose, while it’s cheap. The model is the CPU you rent. The OS is the thing you actually have to own.

The turn: the OS frees the part only you can do

Here’s what all of this is really for, and it isn’t the architecture.

When there’s no operating system underneath your agents, you are the operating system. You’re the one tracking which agent did what. You’re the memory, holding in your head the context the model keeps forgetting. You’re the integration layer, wiring each tool by hand. You’re the boundary, the last line of defense making sure nothing crosses where it shouldn’t. It’s the same trap the founder has always been in, just one level up the stack, and it’s the least leveraged thing a sharp person can spend a day on.

Build the OS properly and all four of those jobs move off you and down a floor. What’s left is the part no scheduler, no memory layer, no registry can do: deciding which agents are worth building, what they should be for, what “good” means for the people you serve. That’s the work an operating system was always meant to free you for: it runs what has to run so you get to think about what’s worth running at all. The first OS did that for your files. The next one does it for the agents that now do the work.


That’s what we’re building at Apollo Space: the operating system underneath the agents, so you get to design what the company does instead of being the thing that holds it together. If agents are the new apps, somebody has to build the floor they stand on. We’d rather it be a foundation you can see than the one you only notice when it’s gone.
