AI Agent Security Starts With Permissions, Not Prompts
AI Agent Security Starts With Permissions, Not Prompts#
You cannot prompt your way out of bad permissions.
A careful system prompt is useful. It is not a security boundary.
If an agent has broad access to tools, secrets, customer data, production systems, or billing actions, the model is not the only thing that needs guardrails. The architecture does.
The first rule#
Give the agent the least power that still makes the workflow useful.
That usually means:
- Read before write.
- Preview before execute.
- Scoped access before global access.
- Human approval before risky action.
- Audit everything the agent can touch.
This is not anti-agent. This is how agents become usable.
Scopes should match workflow stages#
Do not give one token to the whole agent.
Split scopes by stage:
- Intake: read message and metadata.
- Research: read docs, tickets, repos, logs.
- Drafting: no external writes.
- Approval: request human decision.
- Execution: narrow write permission for approved action.
- Audit: record what happened.
If the workflow stage cannot do the action, the model cannot accidentally do it.
Tool isolation#
Tools need risk levels.
Example:
- Low risk: search docs, read public runbook.
- Medium risk: read customer account, fetch logs, inspect private repo.
- High risk: post customer message, create refund, change config.
- Critical risk: deploy, rollback, delete data, rotate secrets.
Each tier should have different auth, logging, and approval behavior.
Prompt injection is a permissions problem#
Prompt injection is worse when tools are too powerful.
If a malicious support ticket tells the agent to ignore instructions and export customer data, the right defense is not only "do not follow malicious instructions."
The right defense is:
- The support ticket is untrusted input.
- The tool layer enforces data boundaries.
- Sensitive actions require approval.
- Retrieved content is labeled by trust level.
- The agent cannot exfiltrate secrets it cannot access.
The safest secret is the one the model never sees.
Secrets#
Do not place secrets in prompts, Skills, or agent memory.
Use a vault. Give tools server-side access. Return only the minimum data the model needs.
If the model needs to know whether a payment failed, return that fact. Do not hand it the raw payment provider payload unless the workflow truly requires it.
Audit logs#
For every meaningful run, log:
- User.
- Workspace.
- Agent.
- Skill version.
- Model route.
- Tool calls.
- Data scopes.
- Approval decisions.
- Output.
- Error or retry state.
When something goes wrong, you need a trace. Not a shrug.
Build it in Codelit#
Try this:
Design an AI agent security architecture for a SaaS workflow. Include scoped permissions, tool risk tiers, MCP servers, secret vaulting, prompt injection defense, approval gates, audit logs, evals, and incident rollback.
Map the agent security architecture
Prompts tell the agent what to do. Permissions decide what it can do.
Try it on Codelit
Agent Workflow Builder
Map agents, tools, model routing, approvals, evals, and deployment before wiring connectors
Chaos Mode
Simulate node failures and watch cascading impact across your architecture
Related articles
Try these templates
Netflix Video Streaming Architecture
Global video streaming platform with adaptive bitrate, CDN distribution, and recommendation engine.
10 componentsSearch Engine Architecture
Web-scale search with crawling, indexing, ranking, and sub-second query serving.
8 componentsGoogle Search Engine Architecture
Web-scale search with crawling, indexing, PageRank, query processing, ads, and knowledge graph.
10 componentsBuild this agent workflow
Generate a production workflow for AI Agent Security Starts With Permissions, Not Prompts in seconds.
Try it in Codelit →
Comments