Spawn your kind-of-Hermes agent: Apollo's thesis

The gap between 'AI is a tool' and 'AI is a coworker' is the AI-OS wedge. Here is what we are building at Apollo to close it, and why we think the rest of the market is still building the wrong half.

Apollo Space Research

Apollo Space

September 9, 2025 · 11 min read

The moment it clicked

A few weeks ago I was on the other side of town, no laptop, when a teammate pinged me on Telegram: a regression in one of our internal tools, a customer-facing bug, the kind of thing that normally pulls you back to a desk for an hour.

I forwarded the message to my personal Hermes Agent, a self-hosted, single-user agent that lives on an EC2 instance I run. Two minutes later Hermes had pulled the repo, opened the failing log, traced the regression to the commit that caused it, drafted a patch, and was asking me on Telegram whether to open the PR. I said yes. It opened the PR, pinged the on-call engineer, and stayed in the thread to answer follow-up questions in the engineer’s voice while I went back to what I was doing.

I did not touch a laptop. I did not type code. I did not even open a browser. A coworker had handled it.

That is the moment a lot of us are quietly having in 2026, on Telegram or Slack or WhatsApp, with a single-user side-project agent, and then walking back into our actual product company and noticing we don’t have anything remotely like it for our team. That gap is the entire reason Apollo exists.

“AI as a tool” vs “AI as a coworker”

The dominant frame today is still AI as a tool. You open Cursor, you ask, it responds, you close it. You open Notion AI, you summarize a page, you close it. You ping a Slack bot, it answers, the thread ends. Each tool is excellent at what it does, but each one is invoked: it sits dormant until you summon it, and then it goes away.

A coworker is different. A coworker runs while you sleep. A coworker has opinions, a way of working, a tone, things they will and will not do. A coworker has memory that compounds across days and projects. A coworker has scoped permissions: your CFO does not have your engineer’s GitHub access. A coworker can be paused, reassigned, or fired, and the next person who steps into that seat inherits the same context.

The gap between those two pictures is not a model capability gap anymore. It is an operating system gap. The model is the brain. The harness is the body. What is missing is the workplace, the multi-tenant runtime that turns a brain plus a body into an employee on a payroll.

Mitchell Hashimoto’s now-canonical equation has been Agent = Model + Harness for a year. The 2026 reality is that you need a third term: Agent in production = Model + Harness + Workplace. The workplace is the part where companies actually live.

Why the current generation of tools does not close this gap

It is worth being precise about why the things on the market today, however good, do not cross from tool to coworker.

Notion AI, Linear AI, and Slack AI are large language models bolted onto a single document, board, or channel. They have no persistent identity, no scoped credentials of their own, no memory that crosses surfaces, and no way to act outside the app they live in.
Cursor, Copilot, and Claude Code are pair programmers. They are spectacular at the task they are designed for (you, at your laptop, writing code), and they are deliberately ephemeral. They are not designed to run on Sunday afternoon while you are at the beach.
Sierra, Decagon, and Harvey are vertical agents. Each owns one workflow well (customer support, legal review), and each is a closed product, not a substrate for the rest of a company’s work. They are not the OS; they are the first apps on the OS.
Hermes Agent (the open-source agent I used in the opening story) is a magnificent piece of software for one person. Its design center, per its own architecture notes, is a single-user, server-resident personal agent with profiles as separate home directories on the host. That is the right design for the personal use case. It is structurally the wrong primitive for a company with concurrent users, shared knowledge, departing employees, and credential revocation.

What is missing, and what every company already trying to deploy agents internally is rediscovering the hard way, is the layer that turns any of these into an organizational capability: per-person agents with isolated state, agents that talk to each other inside an org without leaking across orgs, ETHOS files that you can read and edit, observable memory you can audit, tool grants you can revoke when someone leaves.

What we are building at Apollo

Apollo is the AI-OS for companies. Not a vertical agent, not a chat product, not a wrapper. The OS layer that the next generation of agent-driven companies will run on. The thesis breaks down into four load-bearing pieces.

1. A multi-tenant, per-person agent runtime. Every user in Apollo spawns their own kind-of-Hermes: a personal agent isolated by row-level security in the database, billed per organization, observable end-to-end. The agent runtime delegates the commodity layer (the model loop, MCP tool-calling, compaction, sandboxed execution) to Anthropic’s Managed Agents and similar managed runtimes; see Anthropic’s “Scaling Managed Agents: Decoupling the brain from the hands” (May 6 2026) for the architectural pattern we follow. We do not fork a personal-agent codebase and pretend it is multi-tenant. We build the multi-tenant control plane on top of a managed substrate.

2. ETHOS-as-markdown. Every Apollo agent (a Chief of Staff, a CMO, a Coding Specialist) carries an editable markdown file describing its personality, its guardrails, the information it is allowed to share with sibling agents, and the way it sounds when it writes. The ETHOS is human-readable, version-controlled, and dogfoodable. If you do not like the way your CMO agent writes LinkedIn copy, you open its ETHOS and change a paragraph. The next post is in your voice.

3. Scoped tools and observable memory. Every action an Apollo agent takes flows through a connector layer (Composio for the 1,000+ third-party-tool path) with per-org, per-agent credential scoping. Every memory the agent writes lives in a single observable table: searchable, exportable, revocable. We do not believe in agents whose internal state is a black box. The trace flywheel, capturing structured trajectories of every agent run, is the moat Harrison Chase has been writing about for a year, and it only works if memory and tools are observable from day one.

4. A central kanban for humans. Apollo agents do not work in the dark. They claim tasks from a board the humans can see, they open PRs, they post in a group chat, and the humans review what they ship. The agent operates the way a junior employee operates, in front of a manager, with structured deliverables, with a feedback loop. Founder-grade autonomy is the long-term destination; reviewed-junior is the credible starting point.
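
To make pieces 1 and 3 concrete, here is a minimal sketch of what per-org, per-agent scoping can look like. It is an illustration under stated assumptions, not Apollo's schema: the Workplace class, the AgentContext pair, and both table names are hypothetical. SQLite stands in for a production database, so isolation here is enforced by filtering every query on (org_id, agent_id) in Python rather than by row-level security policies in the database itself.

```python
import sqlite3
from dataclasses import dataclass


class GrantMissing(Exception):
    """Raised when an agent calls a tool it holds no active grant for."""


@dataclass(frozen=True)
class AgentContext:
    # Hypothetical identity pair; every query below filters on it.
    org_id: str
    agent_id: str


class Workplace:
    """Toy control plane: one memory table and one tool-grant table."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.executescript(
            """
            CREATE TABLE memory (
                org_id TEXT NOT NULL, agent_id TEXT NOT NULL, note TEXT NOT NULL
            );
            CREATE TABLE tool_grants (
                org_id TEXT NOT NULL, agent_id TEXT NOT NULL, tool TEXT NOT NULL,
                PRIMARY KEY (org_id, agent_id, tool)
            );
            """
        )

    def grant(self, ctx: AgentContext, tool: str) -> None:
        self.db.execute(
            "INSERT OR IGNORE INTO tool_grants VALUES (?, ?, ?)",
            (ctx.org_id, ctx.agent_id, tool),
        )

    def revoke_all(self, ctx: AgentContext) -> int:
        """Drop every grant the agent holds; returns how many were removed."""
        cur = self.db.execute(
            "DELETE FROM tool_grants WHERE org_id = ? AND agent_id = ?",
            (ctx.org_id, ctx.agent_id),
        )
        return cur.rowcount

    def call_tool(self, ctx: AgentContext, tool: str) -> str:
        row = self.db.execute(
            "SELECT 1 FROM tool_grants"
            " WHERE org_id = ? AND agent_id = ? AND tool = ?",
            (ctx.org_id, ctx.agent_id, tool),
        ).fetchone()
        if row is None:
            raise GrantMissing(f"{ctx.agent_id} has no grant for {tool}")
        # A real connector call would happen here; the sketch only reports it.
        return f"{tool} called by {ctx.agent_id}"

    def remember(self, ctx: AgentContext, note: str) -> None:
        self.db.execute(
            "INSERT INTO memory VALUES (?, ?, ?)", (ctx.org_id, ctx.agent_id, note)
        )

    def recall(self, ctx: AgentContext, query: str = "") -> list[str]:
        """Return this agent's notes containing `query`, oldest first."""
        rows = self.db.execute(
            "SELECT note FROM memory"
            " WHERE org_id = ? AND agent_id = ? AND note LIKE ? ORDER BY rowid",
            (ctx.org_id, ctx.agent_id, f"%{query}%"),
        ).fetchall()
        return [r[0] for r in rows]
```

In this sketch, revoking a departing employee's tool access is a single call to revoke_all, while the notes the agent wrote stay in the memory table under the same keys, where they can still be searched or exported.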

We are explicit about what we do not build. We do not build the model. We do not build the inference router. We do not build the agent loop, the MCP protocol, the basic compaction, or the code-execution sandbox. Those are commodity plumbing; see a16z’s “From System of Record to System of Intelligence” (2024–2025) for the strategic frame the industry has converged on. We build the workplace.
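
Returning to the ETHOS-as-markdown piece above: because the file is plain markdown, reading it into something a runtime can act on takes very little code. The sketch below is illustrative only. The section names (Voice, Guardrails, Shareable with siblings) and the helper functions are assumptions made for the example, not Apollo's actual ETHOS format.

```python
import re

# Hypothetical ETHOS file; the section names are illustrative, not Apollo's schema.
ETHOS_MD = """\
# CMO

## Voice
Plain, specific, no hype words.

## Guardrails
- Never publish without human approval.
- Never quote revenue figures.

## Shareable with siblings
- Campaign calendar
"""


def parse_ethos(text: str) -> dict[str, str]:
    """Split an ETHOS document into {section heading: body} on '## ' headings."""
    sections: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"^##\s+(.*)$", line)
        if match:
            current = match.group(1).strip()
            sections[current] = ""
        elif current is not None:
            # Lines before the first '## ' heading (such as the title) are skipped.
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}


def bullet_items(body: str) -> list[str]:
    """Extract '- ' bullet lines from a section body."""
    return [line[2:].strip() for line in body.splitlines() if line.startswith("- ")]


ethos = parse_ethos(ETHOS_MD)
guardrails = bullet_items(ethos["Guardrails"])
```

Editing a paragraph under Voice and re-parsing is all it would take for the next draft to pick up the change, which is the property the ETHOS idea depends on.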

Three concrete moments

Abstraction is cheap. Here is what the workplace looks like when you actually use it.

The Chief-of-Staff morning briefing. It is 7:00 AM. Your Apollo Chief of Staff has already pulled your calendar, scanned yesterday’s traces from every other agent on your team, read the overnight Slack channels, and posted a 200-word brief in your group chat: what shipped, what slipped, what you need to decide today, what is blocked on you. By the time you finish your coffee, you know more than you did yesterday and you have not opened a tab.

The CMO agent running LinkedIn. You give the CMO agent a Mission (“WSR-2026 pre-launch buzz, June 7–9”). The CMO opens a group conversation with three sibling agents (a Brand Guardian, a Composer, a Publisher), and they spend the next ten exchanges fighting about the post. The Brand Guardian rejects two drafts for being off-voice. The Composer pulls visual assets. The Publisher schedules the final cut. You see the whole conversation in Apollo. You hit approve. The post goes live with no human typing involved.

The Coding Specialist claiming a task. A Coding Specialist agent picks up a Foundational Structure task with fixed limits (cost_cap_usd: 5; wall_clock_max_minutes: 60; write-boundary: one repo, one branch). It spawns a sandboxed runner (Vercel Sandbox or Cloudflare Sandboxes, both shipped in 2026 as the “sandbox-as-tool” tier of the agent-outside / sandbox-as-tool pattern Anthropic ratified), writes the code, runs the tests, opens a PR, and waits. A five-lens reviewer fleet (security, dev, business, adversarial, cost) runs against the PR in parallel. The human merges or rejects. The agent learns from the verdict on the next task.
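
The task limits in the Coding Specialist example can be sketched as a small guard object that the runner consults before each spend or write. This is a minimal illustration, not Apollo's implementation: cost_cap_usd and wall_clock_max_minutes come from the text above, while the TaskBudget class, its methods, and the repo and branch names are hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


class GuardrailViolation(Exception):
    """Raised when a task would exceed its cost, time, or write boundary."""


@dataclass
class TaskBudget:
    cost_cap_usd: float = 5.0
    wall_clock_max_minutes: float = 60.0
    allowed_repo: str = "example-org/example-repo"  # hypothetical repo
    allowed_branch: str = "agent/task-123"          # hypothetical branch
    spent_usd: float = 0.0
    started_at: float = field(default_factory=time.monotonic)

    def charge(self, usd: float) -> None:
        """Record a spend, refusing any charge that would pass the cap."""
        if self.spent_usd + usd > self.cost_cap_usd:
            raise GuardrailViolation(f"cost cap of ${self.cost_cap_usd} reached")
        self.spent_usd += usd

    def check_clock(self, now: Optional[float] = None) -> None:
        """Fail once more than wall_clock_max_minutes have elapsed."""
        now = time.monotonic() if now is None else now
        elapsed_minutes = (now - self.started_at) / 60
        if elapsed_minutes > self.wall_clock_max_minutes:
            raise GuardrailViolation("wall-clock limit reached")

    def check_write(self, repo: str, branch: str) -> None:
        """Allow writes only to the single repo and branch the task owns."""
        if (repo, branch) != (self.allowed_repo, self.allowed_branch):
            raise GuardrailViolation(f"write outside boundary: {repo}@{branch}")
```

A runner would call charge after each model or sandbox invoice, check_clock at the top of each loop iteration, and check_write before every push, so that any violation stops the task before the PR is opened rather than after.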

None of these moments require AGI. They require an operating system that lets agents and humans share work, share context, and share accountability.

Why now

The market has decided agents matter. The question is no longer whether; it is which layer wins.

April 8 2026: Anthropic shipped Managed Agents, a hosted runtime with sandboxed execution, durable sessions, MCP tool routing, OAuth credential vault, and per-session billing. Building your own harness now competes directly with Anthropic’s infrastructure. That is the wrong fight.
May 6 2026: Anthropic followed up with Dreaming, Outcomes, and Multiagent orchestration: scheduled memory pruning, separate grader agents with rubrics, and lead-and-specialist subagent fan-out, respectively. Harvey reportedly saw a ~6× completion-rate lift attributable specifically to Dreaming. The “managed substrate” is now genuinely good.
March 2026: Cognition acquired Windsurf for $10.2B, consolidating the IDE-resident coding-agent layer.
2025–2026: Sierra raised at $15.8B, Harvey at $11B, Decagon at $4.5B. These are three vertical agent companies, each owning one workflow, each commanding the kind of multiple that previously belonged to platforms.
The investor frame is sharpening too; see Tobi Lütke’s public X thread on agentic commerce and LangChain’s DeepAgents pattern for the architectural posture the open-source side has converged on.

Every one of those signals points at the same thing. The agent loop is solved. The vertical agents are getting bought or richly priced. The substrate is consolidating. The layer that is still wide open (the workplace, the AI-OS, the thing that turns a swarm of agents into a company) is the next wedge. And it is the layer Apollo is built for.

What’s next

Apollo is being built right now, in public, by a small team using Apollo to build Apollo. Every week the system runs more of our own work, and every week we ship the parts of it that have earned their place. Customer zero is us: there is no other way to know if you are actually building a coworker or just a fancier tool.

The first public moment is at Web Summit Rio in early June 2026. If you are an AI-native founder, a function lead in a fast company, or anyone who has had the same “I just sent one Telegram message and a coworker did the rest” moment we did, come find us. The early-access list lives on apollospace.ai.

The next post in this series goes deeper on the trace flywheel: how observable memory and structured agent traces compound into the kind of system-of-intelligence advantage that does not get arbitraged away by a model release.

Sources and further reading (all accessed 2026-05-20):

Anthropic, Managed Agents launch, https://www.anthropic.com/news/managed-agents
Anthropic, Scaling Managed Agents: Decoupling the brain from the hands (May 2026), https://www.anthropic.com/research/managed-agents
a16z, From System of Record to System of Intelligence, https://a16z.com/system-of-intelligence-ai/
Tobi Lütke on agentic commerce (X), https://x.com/tobi/status/1857893987856846896
LangChain, Deep Agents, https://blog.langchain.com/deep-agents/
Composio toolkits (1,000+ tool MCP layer), https://composio.dev/
Vercel Sandbox, https://vercel.com/docs/vercel-sandbox
Cloudflare Sandboxes (GA April 2026), https://blog.cloudflare.com/cloudflare-sandboxes-ga/
Hermes Agent (Nous Research, open source), https://github.com/NousResearch/hermes-agent
Sierra, Harvey, Decagon valuations, public reporting via The Information and Bloomberg, 2025–2026
