Your agents have a credential-sprawl problem

Bolting a dozen tool servers onto a chatbot does not give you an AI company. It gives you a dozen credential leaks and a context tax. An OS owns the connections so the agent does not have to.

Apollo Space Research

Apollo Space

September 16, 2025 · 9 min read

Connect an agent to your calendar, your inbox, your CRM, and your billing tool, and you have just handed it four sets of keys to hold. Connect three agents to the same four tools, and you are holding twelve. Each one is a token that can be stolen, a scope that’s wider than it needs to be, and a schema the model has to read before it does a single useful thing. The demo worked. The arithmetic didn’t.

This is the part nobody screenshots. The agent that books the meeting gets the applause. The pile of credentials it had to carry to get there is the bill that arrives later, and it arrives as a leak, a wrong tool call, or a context window with no room left to think.

Here’s the thesis, and the rest of this post is the mechanism. Bolting a dozen tool servers onto a chatbot does not give you an AI company; it gives you a dozen credential leaks and a context tax. An OS owns the connections so the agent does not have to.

The naive version: give every agent its own keys

The first thing everyone builds is the obvious thing. You want your agent to read the calendar, so you give it a calendar token. You want it to send mail, so you give it a mail token. CRM, billing, docs, chat: each one gets wired in, and each one is a connection the agent holds directly. It feels like progress, because each new tool makes the agent visibly more capable.

Then you add a second agent. And a third. Sales, ops, finance: each needs its own slice of the same tools. Nobody decided to build a mesh; it just grew, one reasonable wiring at a time.

On the bolt-on side, three agents each hold their own tokens to the calendar, inbox, CRM, and billing, forming a tangle of twelve direct connections. On the hub side, the agents ask one OS permission layer, which lends each tool under a single scoped grant.

The picture is the problem. Three agents and four tools is not three plus four relationships. It’s a grid: every agent times every tool it touches, twelve relationships rather than seven, and the grid is made of secrets. The tangle is the architecture, and the architecture leaks.
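The grid arithmetic can be written out in a few lines. This is a minimal sketch, assuming the worst case in which every agent touches every tool; the agent and tool names come from the example above, and `secret_count` is a hypothetical helper, not part of any product.

```python
from itertools import product

agents = ["sales", "ops", "finance"]
tools = ["calendar", "inbox", "crm", "billing"]

# Bolt-on: every agent holds its own copy of every tool's token.
bolt_on_secrets = list(product(agents, tools))

# Hub: one credential per tool, held only by the OS layer.
hub_secrets = list(tools)


def secret_count(n_agents: int, n_tools: int, hub: bool) -> int:
    """Stored credentials under each model (assumes every agent uses every tool)."""
    return n_tools if hub else n_agents * n_tools


print(len(bolt_on_secrets), len(hub_secrets))  # 12 4
```

In this sketch, adding an agent adds a full row of secrets to the bolt-on grid and nothing to the hub count.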

Why it fails, part one: the credentials get away from you

A token sitting in one place is a thing you can guard. A token copied into every agent that needs it is a thing you have lost track of.

This is not a hypothetical failure mode. GitGuardian’s State of Secrets Sprawl report counted 28,649,024 new secrets exposed in public GitHub commits across 2025, a 34% jump over the year before, the largest in the report’s history (Help Net Security, Snyk). The agents we’re building don’t reduce that number; they multiply it. By one industry count, the average enterprise now runs somewhere between 82 and 144 non-human identities for every single human one (Token Security). Every one of those machine identities is a credential to issue, scope, rotate, and, when it leaks, revoke.

Now play the rotation forward. A calendar token expires. In the bolt-on model, you don’t rotate one credential; you chase it through every agent that copied it, and you pray you found them all. Miss one, and an agent quietly breaks in production. Miss one the other way, by leaving a token that should be dead still live, and you’ve left a door open.

The tangle is the architecture, and the architecture leaks. The mechanism is simple: a secret duplicated N times has N places to escape from and N places to forget.
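A toy version of the rotation manhunt makes the "N places to forget" point concrete. This is an illustrative sketch only: the config layout, the token strings, and both helper functions are assumptions made up for the example.

```python
# Bolt-on model: each agent keeps its own copy of the calendar token.
agent_configs = {
    "sales": {"calendar_token": "tok_v1"},
    "ops": {"calendar_token": "tok_v1"},
    "finance": {"calendar_token": "tok_v1"},
}


def rotate_everywhere(configs, known_agents, new_token):
    """Replace the token in every agent we remember to update."""
    for name in known_agents:
        configs[name]["calendar_token"] = new_token


def stale_agents(configs, current_token):
    """Agents still holding a token other than the current one."""
    return sorted(
        name for name, cfg in configs.items()
        if cfg["calendar_token"] != current_token
    )


# The rotation runbook lists only two of the three agents.
rotate_everywhere(agent_configs, ["sales", "ops"], "tok_v2")
missed = stale_agents(agent_configs, "tok_v2")
print(missed)  # ['finance']
```

The forgotten copy is either a broken agent (if the old token was revoked) or an open door (if it was not), which is the two-sided failure described above.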

Why it fails, part two: the context tax

The second cost is quieter, and it shows up inside the model itself.

Every tool an agent can call has to be described to it first: the tool’s name, its parameters, what it returns. Bolt on enough tools and those descriptions alone can swallow the conversation before it starts. GitHub’s own tool server consumes roughly 17,600 tokens of definitions per request, and connecting just a few servers pushes you past 30,000 tokens of metadata before the agent does any work (StackOne). Keep wiring and you can watch three servers eat 143,000 tokens of a 200,000-token window, about 72% of the agent’s working memory spent before it has read a single word from you.
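The budget arithmetic behind that last figure is simple enough to check by hand. A small sketch, using only the window size and definition count quoted above; `context_left` is a hypothetical helper name.

```python
CONTEXT_WINDOW = 200_000  # tokens, the window size quoted above


def context_left(window: int, definition_tokens: int) -> tuple[int, float]:
    """Return (tokens remaining, fraction of window consumed) once tool definitions load."""
    return window - definition_tokens, definition_tokens / window


remaining, used = context_left(CONTEXT_WINDOW, 143_000)
print(remaining, f"{used:.1%}")  # 57000 71.5%
```

The 71.5% rounds to the 72% cited above, leaving 57,000 tokens for the actual conversation.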

That’s not just waste. It’s a measurable drop in the agent getting things right.

Researchers behind the RAG-MCP project measured what happens to tool selection as the menu grows: faced with a bloated tool set, baseline accuracy sat at just 13.62%, under one in seven, and fetching only the relevant tools on demand more than tripled it, to 43.13% (RAG-MCP, arXiv:2505.03275). Bury the right answer in a longer list and the agent picks the wrong tool roughly six times out of seven, not because it got dumber, but because you buried it. The working rule of thumb that’s settled out of this is blunt: keep an agent to no more than 10 to 15 tools at a time.
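To make "fetching only the relevant tools on demand" concrete, here is a deliberately crude sketch. It ranks tools by word overlap between the request and each description, which is a stand-in chosen for brevity and not the retrieval method the RAG-MCP paper uses. The catalog entries, the stopword list, and `select_tools` are all hypothetical.

```python
import re

# Hypothetical tool catalog: name -> short description.
CATALOG = {
    "calendar.create_event": "create a calendar event or meeting with attendees",
    "mail.send": "send an email message to a recipient",
    "crm.update_contact": "update a contact record in the crm",
    "billing.create_invoice": "create an invoice for a customer",
}

STOPWORDS = {"a", "an", "the", "to", "or", "in", "for", "with"}


def tokenize(text: str) -> set[str]:
    """Lowercase words in the text, minus a few common stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def select_tools(request: str, catalog: dict[str, str], k: int = 2) -> list[str]:
    """Return up to k tool names whose descriptions share the most words with the request."""
    words = tokenize(request)
    scored = [(len(words & tokenize(desc)), name) for name, desc in catalog.items()]
    # Highest overlap first, ties broken by name; tools with no overlap are dropped.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked[:k]]


print(select_tools("book a meeting with the sales team", CATALOG))
# ['calendar.create_event']
```

Only the selected tool's schema would then be placed in the model's context; the other three definitions stay out of the window.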

So the bolt-on model fails twice. It leaks the keys, and it taxes the thinking. Add a tool to make the agent smarter, and past a point you make it both leakier and less accurate at once.

The OS version: own the connection once, lend it under scope

Here’s the move. The agent should not hold the keys. The operating system should, and lend each tool, scoped, only for as long as the work takes.

On the bolt-on side, the agent carries three burdens: secrets to rotate in every agent, tool schemas flooding context, and audit gaps with no single ledger. On the OS side, each burden becomes its OS-owned counterpart: connections rotated once, tools fetched on demand, and one audit trail of who touched what.

Think about how your laptop already does this. Your code doesn’t carry the driver for your specific printer; the OS owns that connection and your program just says “print.” The application never holds the hardware’s secrets, and it never needs to: the permission layer sits in between, granting access per request and revoking it the moment the job is done. A company OS treats every integration the same way. The connection to the calendar, the inbox, the CRM, the billing tool is owned in one place. An agent doesn’t have the CRM token; it asks the OS to do the CRM thing, and the OS decides, under that agent’s scope, for that one action, whether to lend the access.
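One way to picture that permission layer is a small broker object. This is a sketch under stated assumptions, not a description of how Apollo Space or any specific product implements it: `ConnectionBroker`, `ScopeError`, the `"tool:action"` scope strings, and the `fake_crm` connector are all invented for illustration.

```python
class ScopeError(PermissionError):
    """Raised when an agent asks for an action outside its scope."""


class ConnectionBroker:
    """Hypothetical permission layer: the only place tool credentials live."""

    def __init__(self, connectors, tokens, scopes):
        self._connectors = connectors  # tool -> callable(token, action, payload)
        self._tokens = dict(tokens)    # tool -> credential, never handed out
        self._scopes = {agent: set(allowed) for agent, allowed in scopes.items()}

    def perform(self, agent, tool, action, payload):
        """Do one action on the agent's behalf if "tool:action" is in its scope."""
        if f"{tool}:{action}" not in self._scopes.get(agent, set()):
            raise ScopeError(f"{agent} is not scoped for {tool}:{action}")
        return self._connectors[tool](self._tokens[tool], action, payload)


def fake_crm(token, action, payload):
    # Stand-in for a real CRM client; only the broker ever passes it the token.
    return {"tool": "crm", "action": action, "contact": payload["contact"]}


broker = ConnectionBroker(
    connectors={"crm": fake_crm},
    tokens={"crm": "crm-secret"},
    scopes={"sales": ["crm:read"]},
)

result = broker.perform("sales", "crm", "read", {"contact": "Ada"})
```

The agent-facing surface is a request and a result. A `crm:write` request from the same agent raises `ScopeError`, and at no point does the token appear in anything the agent receives.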

Watch what each failure becomes when you move the connection off the agent.

Rotation stops being a manhunt. The credential lives in one place, so it’s rotated in one place. Expire it, replace it, and every agent keeps working through the same front door, none the wiser. There is no copy to chase because there was never a copy.

The blast radius shrinks to a scope. A leaked agent is no longer a leaked vault. It can do what it was scoped to do and nothing more, because it never held the underlying key; it held a narrow, revocable grant. This is the direction the whole field is moving: scoped, delegated access as the default, vault-backed, with a distinct identity per agent rather than a shared secret passed around (1Password, GitGuardian).
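A narrow, revocable grant can be sketched as a short-lived record checked on every use. The example below is an in-memory illustration with hypothetical names (`Grant`, `GrantIssuer`, the `"crm:read"` scope string); a real system would more likely issue signed tokens that cannot be forged, which this sketch does not attempt.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    grant_id: str
    agent: str
    scope: str         # e.g. "crm:read"
    expires_at: float  # on the same clock the issuer uses


class GrantIssuer:
    """Hypothetical issuer of short-lived, single-scope grants."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._revoked: set[str] = set()

    def issue(self, agent: str, scope: str, ttl_seconds: float) -> Grant:
        return Grant(secrets.token_hex(8), agent, scope, self._clock() + ttl_seconds)

    def revoke(self, grant: Grant) -> None:
        self._revoked.add(grant.grant_id)

    def allows(self, grant: Grant, scope: str) -> bool:
        """True only if the grant is unrevoked, unexpired, and for exactly this scope."""
        return (
            grant.grant_id not in self._revoked
            and grant.scope == scope
            and self._clock() < grant.expires_at
        )
```

If an agent holding such a grant is compromised, the attacker gets one scope for a limited time, and `revoke` cuts even that short without touching the underlying credential.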

The context tax gets paid down. When the OS owns the tools, the agent doesn’t need every schema loaded just in case. It asks for the tool it needs, when it needs it, and the rest stays out of the window, leaving room for the model to actually reason. The fix to bloat the industry keeps landing on is the same shape: don’t pour every definition into context up front; fetch on demand (Anthropic, The New Stack).

The audit trail becomes one ledger. When every tool call routes through one permission layer, you have a single place that knows which agent touched which system, when, and under what grant. In the tangle, that record was scattered across a dozen servers, if it existed at all. An OS that owns the connections owns the answer to “who did what” by construction.
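Because every call passes through one layer, the ledger can be as simple as an append-only list written at that chokepoint. A minimal sketch, with `AuditEntry`, `AuditLedger`, and `who_touched` as hypothetical names; durable storage and tamper resistance are out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    at: datetime
    agent: str
    tool: str
    action: str
    allowed: bool


class AuditLedger:
    """Hypothetical single ledger, written by the permission layer on every call."""

    def __init__(self):
        self._entries: list[AuditEntry] = []

    def record(self, agent: str, tool: str, action: str, allowed: bool) -> AuditEntry:
        entry = AuditEntry(datetime.now(timezone.utc), agent, tool, action, allowed)
        self._entries.append(entry)
        return entry

    def who_touched(self, tool: str) -> list[str]:
        """Agents with at least one allowed call to the tool, in first-seen order."""
        seen: list[str] = []
        for entry in self._entries:
            if entry.tool == tool and entry.allowed and entry.agent not in seen:
                seen.append(entry.agent)
        return seen


ledger = AuditLedger()
ledger.record("sales", "crm", "read", allowed=True)
ledger.record("ops", "crm", "write", allowed=False)
ledger.record("finance", "billing", "read", allowed=True)
print(ledger.who_touched("crm"))  # ['sales']
```

Denied attempts are recorded too, so the same ledger answers both "who did what" and "who tried".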

Same agents. Same tools. Opposite posture. The capability didn’t change; the place the keys live did. In the bolt-on model the tangle is the architecture, and the architecture leaks; here the connection is the architecture, and the architecture holds.

The turn: integrations are infrastructure, not features

It’s tempting to treat “connect a tool” as a feature you add to an agent. It isn’t. It’s infrastructure you build once, underneath all of them, and the difference between those two framings is the difference between a demo and a company you can run.

The teams that win the next few years won’t be the ones with the most tool integrations bolted onto the most agents. Counting integrations is counting credentials you have to guard and schemas you have to pay for. The teams that win will be the ones where an agent can reach for any tool in the company and never has to hold the key to it, where adding the tenth agent doesn’t mean issuing the fortieth secret, and adding the twentieth tool doesn’t crowd out the agent’s ability to think.

That’s not a security checkbox. It’s what lets you actually scale the number of agents doing real work, because the cost of each new one stops compounding. You stop being the person who tracks where all the keys went. The operating system already knows, because it was the one holding them the whole time.

The tangle is the architecture, and the architecture leaks, until the connection moves off the agent and into the platform underneath it. We’re building this at Apollo Space, an AI-native operating system for companies, where the connections live in the platform and the agents just do the work, because an OS owns the connections so the agent does not have to. If your AI roadmap has started to look like a spreadsheet of tokens to rotate, that’s not an integration problem. It’s a missing operating system.
