Scale a fleet to a thousand agents and you have not built one large program — you have built a thousand small principals, each capable of acting on its own. The security question stops being "is the model safe" and becomes "what can this specific agent do, and can it prove it was allowed to."
Identity first
In the systems we build, every agent gets a cryptographic identity at provisioning. No identity, no actions. Permissions are scoped to that identity and evaluated on every operation, so a stolen prompt cannot borrow another agent's reach.
$ bytevon agent inspect --id agent:resume-analyzer
identity: arn:bytevon:agents:us-east:a-4f21
permissions: READ resume_store WRITE calendar
sandbox: network:none fs:read-only
✓ SECURITY POSTURE — compliant
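As a rough illustration of "no identity, no actions", here is a minimal Python sketch of a per-operation check. It is not the actual implementation: `PROVISIONING_KEY`, `sign_identity`, `Agent`, and `authorize` are hypothetical names, and an HMAC over the agent ID stands in for whatever credential a real system would issue (more likely an asymmetric key pair or certificate). The grants mirror the inspect output above.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical provisioning secret, for illustration only.
# A real deployment would keep key material in a KMS or HSM.
PROVISIONING_KEY = b"demo-only-secret"


def sign_identity(agent_id: str) -> str:
    """Issue a credential that binds the agent ID to the provisioning key."""
    return hmac.new(PROVISIONING_KEY, agent_id.encode(), hashlib.sha256).hexdigest()


@dataclass(frozen=True)
class Agent:
    agent_id: str
    credential: str
    # Explicit (action, resource) grants; anything not listed is denied.
    grants: frozenset = frozenset()


def authorize(agent: Agent, action: str, resource: str) -> bool:
    """Verify the identity first, then the grant. Meant to run on every operation."""
    expected = sign_identity(agent.agent_id)
    if not hmac.compare_digest(expected, agent.credential):
        return False  # no valid identity, no actions
    return (action, resource) in agent.grants  # deny by default


analyzer = Agent(
    agent_id="a-4f21",
    credential=sign_identity("a-4f21"),
    grants=frozenset({("READ", "resume_store"), ("WRITE", "calendar")}),
)
read_ok = authorize(analyzer, "READ", "resume_store")    # True
write_ok = authorize(analyzer, "WRITE", "resume_store")  # False: never granted
```

Because the check is keyed to the credential rather than to anything in the prompt, text copied from one agent carries none of that agent's grants with it.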
Sandbox the untrusted, log the rest
Untrusted operations run in a read-only sandbox with no network egress unless explicitly granted. Everything an agent does lands in an append-only audit log. The log is not a debugging convenience — it is the system of record an incident review depends on.
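The sandbox default described above can be sketched as a small policy decision. This is an assumption-laden illustration: `SandboxPolicy` and `check_operation` are hypothetical names, and the function only decides whether an operation should be allowed. Actual enforcement would have to come from the OS or container runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxPolicy:
    # Hosts the sandbox may reach; empty means no network egress at all.
    network_allowlist: frozenset = frozenset()
    # Read-only filesystem unless write access is explicitly turned on.
    fs_writable: bool = False


def check_operation(policy: SandboxPolicy, kind: str, target: str = "") -> bool:
    """Return True only if the policy explicitly permits the operation."""
    if kind == "fs_read":
        return True
    if kind == "fs_write":
        return policy.fs_writable
    if kind == "network":
        return target in policy.network_allowlist
    return False  # unknown operation kinds are denied


default_policy = SandboxPolicy()  # corresponds to network:none fs:read-only
granted_policy = SandboxPolicy(network_allowlist=frozenset({"calendar.internal"}))
```

With `default_policy`, reads pass while writes and all egress are refused. `granted_policy` opens exactly one (hypothetical) host and nothing else.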
- Cryptographic identity per agent, checked on every action.
- Least-privilege permissions, denied by default.
- Read-only sandboxing for untrusted tool use.
- Append-only, queryable audit trail that agents cannot rewrite.
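One common way to make an audit trail tamper-evident is to chain each record to the hash of the one before it. The sketch below assumes that technique; the source does not say how its log is built, and `AuditLog` and its methods are hypothetical names. Timestamps are left out to keep the example deterministic.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(record: dict) -> str:
    """Stable SHA-256 over the record's canonical JSON form."""
    payload = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self._entries = []

    def append(self, agent_id: str, action: str, resource: str, allowed: bool) -> dict:
        """Add a record linked to the previous record's hash."""
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        record = {
            "seq": len(self._entries),
            "agent_id": agent_id,
            "action": action,
            "resource": resource,
            "allowed": allowed,
            "prev_hash": prev_hash,
        }
        record["hash"] = _digest(record)  # hash covers every field above
        self._entries.append(record)
        return dict(record)

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed record breaks it."""
        prev = GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True

    def query(self, agent_id: str) -> list:
        """Return copies of all records for one agent."""
        return [dict(e) for e in self._entries if e["agent_id"] == agent_id]
```

Hash chaining makes tampering detectable, not impossible. Keeping the log on storage the agents cannot write to is still a separate requirement.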
The threat model for agents is not the model misbehaving. It is a legitimate-looking agent doing exactly what it was tricked into doing, inside the permissions it was legitimately granted.
Get identity and audit right and most of the scary failure modes become detectable, scoped, and reversible. That is the whole game.