The confused deputy comes for your AI agents
When an orchestrator agent delegates to a sub-agent, it can quietly hand over a token that does far more than the sub-agent should. Here's the token-chain risk, IBM and Red Hat's Kagenti blueprint, and a zero-trust pattern you can run without re-platforming onto Kubernetes.
I run a small fleet of AI agents over the homelab — one each for systems, network, security, identity, and endpoints. They share a framework, and increasingly a coordinator agent hands work down to the specialists. Convenient. Right up until I asked the boring auditor’s question:
When the coordinator delegates “check the firing alerts” to the security agent, what credential travels with that request — and what else could that credential do?
In the naïve design the answer is uncomfortable. The coordinator forwards the same token it was already holding. The security agent now possesses a credential that can also reach the network agent’s UniFi controller, the Proxmox API, the secrets store — everything. Nobody intended that. The token just came along for the ride.
That is the confused deputy: a component that holds legitimate authority is induced to use it on behalf of a caller that should never have had that authority. The OAuth working group’s framing of the agent version is sharp — the token that checks flight availability also authorises completing the purchase and charging the corporate card. The Model Context Protocol spec is blunt about the mechanism behind it and forbids token passthrough outright. Here’s the shape of the fix on one page, then the walk-through.
The blueprint everyone points to: Kagenti
If you go looking for the canonical open-source answer, you land on Kagenti (rebranding to “Rosso”) — an IBM Research and Red Hat project. It does the right things. Every agent gets a cryptographic workload identity through SPIFFE/SPIRE, on short-lived, auto-rotated certificates, so “which agent is really calling?” is answered by maths rather than a shared API key. Broad tokens are swapped, at every hop, for short-lived, scoped, audience-bound tokens via OAuth 2.0 token exchange (RFC 8693) in Keycloak — a sub-agent literally cannot receive permissions the orchestrator doesn’t already hold. A gateway performs that exchange on every tool call, with service-mesh mTLS between agents. It is the most complete embodiment of agentic zero-trust in the open right now. Study it.
The reality check (the auditor’s footnote)
Here’s the part the launch posts skip: Kagenti is a Kubernetes control plane, and it’s early. It assumes Istio, Keycloak, SPIRE, sidecar injection, and the A2A protocol; the builds people are road-testing are alpha. If you’re Red Hat shipping a platform, wonderful. If you’re running Python CLI agents and local MCP servers — like most people experimenting today — adopting Kagenti isn’t “adding a tool,” it’s re-platforming the whole fleet onto Kubernetes. And pinning a security boundary to alpha software is exactly what a sane dependency policy tells you not to do.
So don’t adopt the runtime. Adopt the pattern. (And mind the name collision: there’s a separate, similarly-named CNCF project called kagent, from Solo.io — different team, different codebase. Easy to conflate when you go searching.)
The pattern you actually need: two control points
Strip the problem to the bone and there are only two jobs — the two boxes outlined above.
Control point 1 — issue the right token, at the source and at every hop. Instead of forwarding whatever it’s holding, an agent asks the identity provider to exchange its token for a new one that is scoped down to just this task and audience-bound to exactly the next service. That’s RFC 8693, and you almost certainly already run an IdP that can do it — I use Authentik. No Kubernetes required.
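To make control point 1 concrete, here is a minimal sketch of the form body an agent would POST to its IdP's token endpoint. The grant type, token-type URNs, and parameter names come from RFC 8693; the audience name, scope string, and placeholder tokens are assumptions for illustration, and whether your IdP accepts this grant (and how the client authenticates) is something to confirm in its documentation.

```python
from urllib.parse import urlencode

# URNs defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_request(subject_token, audience, scope, actor_token=None):
    """Return the form fields for an RFC 8693 token-exchange request."""
    form = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,            # the token being traded in
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,                      # exactly the next service
        "scope": " ".join(sorted(scope)),          # just this task
    }
    if actor_token is not None:
        # Identifies the party acting on the subject's behalf (delegation).
        form["actor_token"] = actor_token
        form["actor_token_type"] = ACCESS_TOKEN_TYPE
    return form


# Hypothetical values: the coordinator asks for a token the security agent can accept.
form = build_exchange_request(
    subject_token="<token the coordinator received>",
    audience="security-agent",
    scope={"read:alerts"},
    actor_token="<the coordinator's own token>",
)
body = urlencode(form)  # send as application/x-www-form-urlencoded to the token endpoint
```

The request only asks for a narrower token. The IdP decides whether to grant it, which is why the exchange policy there matters as much as the client code.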
Control point 2 — validate the chain and authorise, at the point of use. Before doing anything, every agent and tool runs the same check: verify the token’s signature, confirm that this service is the token’s intended audience, and confirm the token hasn’t expired. Then walk the delegation chain (the act claim: you → coordinator → security agent) and confirm that scope only ever narrows. Then ask a policy engine — OpenFGA, OPA, or Cedar — the real question: may the coordinator, acting for me, read alerts on the security agent?
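The checks in control point 2 can be sketched over an already-decoded claims dictionary. This assumes the signature was verified by whatever JWT library you use before the claims reach this code, and that scopes arrive as a space-separated `scope` string with actors nested under `act`, as RFC 8693 describes. The function names, the claim values, and the idea of having each hop's scope set available for comparison (from the IdP or an audit log, say) are my assumptions for illustration.

```python
import time


def actor_chain(claims):
    """Return actor subjects from the nested `act` claim, current actor first."""
    chain = []
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(act.get("sub"))
        act = act.get("act")  # deeper nesting means an earlier actor
    return chain


def validate_claims(claims, expected_audience, required_scope, now=None):
    """Check audience, expiry, and scope on claims whose signature is already verified."""
    now = time.time() if now is None else now
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if expected_audience not in audiences:
        raise PermissionError("token was not issued for this service")
    if claims.get("exp", 0) <= now:
        raise PermissionError("token has expired")
    granted = set(claims.get("scope", "").split())
    if required_scope not in granted:
        raise PermissionError("token does not carry the required scope")
    return granted


def scopes_only_narrow(hop_scopes):
    """True if each hop's scope set is a subset of the hop before it."""
    return all(later <= earlier for earlier, later in zip(hop_scopes, hop_scopes[1:]))


# Hypothetical claims as the last service in the chain might see them.
claims = {
    "sub": "user:me",
    "aud": "wazuh",
    "exp": 2_000_000_000,
    "scope": "read:alerts",
    "act": {"sub": "agent:security", "act": {"sub": "agent:coordinator"}},
}

granted = validate_claims(claims, "wazuh", "read:alerts", now=1_700_000_000)
# Reverse the actors and prepend the subject to read the chain in delegation order.
delegation = [claims["sub"]] + list(reversed(actor_chain(claims)))
```

Here `delegation` reads you, then coordinator, then security agent, which is the chain to hand to the policy engine along with the requested action.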
Put that validation in the shared agent-core, once, as middleware on the BaseSkill base class — so every agent inherits it instead of each repo re-implementing it. The chain becomes verifiable end to end, and over-broad tokens become structurally impossible rather than merely discouraged.
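As a rough illustration of "once, as middleware", here is one way a base class could force every skill through the same gate. This `BaseSkill` is a hypothetical stand-in, not the real class from the shared core: the `run`/`execute` split, the constructor arguments, and the tuple-set policy (standing in for a call to OpenFGA, OPA, or Cedar) are all assumptions, and the claim check is trimmed to audience and scope to keep the sketch short.

```python
class BaseSkill:
    """Hypothetical stand-in for the shared base class; the real one will differ."""

    audience = ""        # identifier this skill expects in the token's `aud`
    required_scope = ""  # scope this skill needs

    def __init__(self, validate, authorize):
        self._validate = validate    # callable(claims, audience, scope); raises on failure
        self._authorize = authorize  # callable(subject, actor, action, resource) -> bool

    def run(self, claims, *args, **kwargs):
        """Validate and authorise first, then call the skill body."""
        self._validate(claims, self.audience, self.required_scope)
        actor = (claims.get("act") or {}).get("sub")  # current actor, if delegated
        if not self._authorize(claims["sub"], actor, self.required_scope, self.audience):
            raise PermissionError("policy denied the delegated request")
        return self.execute(claims, *args, **kwargs)

    def execute(self, claims, *args, **kwargs):
        raise NotImplementedError


class ReadAlerts(BaseSkill):
    audience = "security-agent"
    required_scope = "read:alerts"

    def execute(self, claims):
        return f"alerts for {claims['sub']}"


def check_claims(claims, audience, scope):
    # Trimmed check for the sketch: real code also verifies signature and expiry.
    if claims.get("aud") != audience or scope not in claims.get("scope", "").split():
        raise PermissionError("token not valid for this skill")


# Stand-in for a policy engine: (subject, actor, action, resource) tuples that are allowed.
ALLOWED = {("user:me", "agent:coordinator", "read:alerts", "security-agent")}


def policy(subject, actor, action, resource):
    return (subject, actor, action, resource) in ALLOWED


skill = ReadAlerts(check_claims, policy)
token_claims = {
    "sub": "user:me",
    "aud": "security-agent",
    "scope": "read:alerts",
    "act": {"sub": "agent:coordinator"},
}
result = skill.run(token_claims)
```

Because callers go through `run` and subclasses only override `execute`, a new skill gets the checks without writing any of them.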
Here’s the whole point in two lines:
ANTI-PATTERN (token passthrough = confused deputy)
you ─[broad]→ core ─[SAME]→ sub-agent ─[SAME]→ Wazuh + UniFi + Proxmox + secrets …
forwarded as-is · everything inherits everything
TARGET (scoped exchange at every hop)
you ─[A: triage]→ core ─[B: read:alerts@sec-eng]→ sub-agent ─[C: read:alerts@wazuh]→ Wazuh only
scope narrows ↓ at every hop · each token audience-bound · the chain is verifiable from act
The one thing to do today (no new infrastructure)
Before any of that machinery, adopt a single design rule: every agent holds its own narrowly-scoped credential and never receives a broad upstream token. That kills the passthrough risk by construction, today, with nothing new installed. It’s the natural next step from where the fleet already keeps its secrets — see “Skills, MCP, and where the credentials belong”. The token-exchange and policy layer is the hardening you build toward; “stop forwarding tokens” is the fix you ship this afternoon.
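In code, the rule can be as small as this. The sketch assumes each agent reads its own credential from an environment variable; the `Agent` class, the variable name, and the token values are hypothetical, and your secrets store will look different.

```python
import os


class Agent:
    """Hypothetical agent wrapper that only ever sends its own credential downstream."""

    def __init__(self, name, credential_env):
        self.name = name
        self._credential_env = credential_env  # env var holding this agent's own scoped token

    def outbound_headers(self, inbound_token=None):
        # The inbound token authenticated the caller to this agent.
        # It is accepted here only to make the point that it is never forwarded.
        own = os.environ.get(self._credential_env)
        if not own:
            raise RuntimeError(f"{self.name}: no credential in {self._credential_env}")
        return {"Authorization": f"Bearer {own}"}


# Demo only: in practice the secrets store injects this, not the code.
os.environ["SECURITY_AGENT_WAZUH_TOKEN"] = "wazuh-read-only-token"

security_agent = Agent("security", "SECURITY_AGENT_WAZUH_TOKEN")
headers = security_agent.outbound_headers(inbound_token="broad-coordinator-token")
```

Whatever the coordinator was holding, the downstream call carries only the security agent's own read-only credential.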
Where this is heading (and why I’m not betting on it yet)
There’s promising standards work — IETF drafts on attenuating agent tokens that narrow cryptographically and offline at each hop, verifiable from the root key alone. Conceptually perfect. But they’re early individual Internet-Drafts, not adopted standards, and building a production boundary on a three-month-old draft is the bleeding edge I rent, not the baseline I own. Track them; don’t depend on them. The boring, available combination — RFC 8693 exchange, audience binding, an act-chain validator, and a policy engine — already gets you most of the way, on a stack you already run.
The interesting question with AI agents was never “what can they do?” It’s the auditor’s question: what are they allowed to do, on whose behalf, and can you prove it? Get the token chain right and the rest of the agentic-security conversation gets a lot calmer.
(This sits one layer above how the agent fleet is wired — same fleet, now looking at what travels between the agents rather than how they’re stacked.)