AI Agent Architecture, Explained for Decision-Makers

A vendor sends you a proposal. The deck shows a diagram with boxes and arrows: an LLM in the middle, some databases on the side, a few API icons. It looks complete. But the diagram doesn’t tell you whether the system will behave reliably under load, what happens when an API goes down, how much each conversation costs to run, or whether you’ll be able to switch providers in two years.

Those answers live in the architecture. And you don’t need to be an engineer to ask the right questions about it.

This article translates AI agent architecture into the terms that matter for procurement: reliability, operating cost, and vendor dependency. Think of it as a checklist for auditing a vendor’s design before you sign.

The Four Pillars Every AI Agent Architecture Has (Whether the Vendor Labels Them or Not)

A production-grade AI agent is built from four logical components. Vendors name them differently, combine them in various ways, and sometimes bundle them inside a single platform you can’t see inside. But they’re always there.

1. The Planner (the “thinking” layer)

The planner is the large language model (LLM) at the core of the agent. It receives a goal or a user message, reasons about what to do next, and decides which tools to call. Some architectures use a single LLM call per step; others run multiple rounds of reasoning before acting.

What to ask a vendor:

Which LLM powers the planner, and how is it updated? (A model upgrade that changes behavior without notice is a reliability risk.)
Is the planner deterministic or probabilistic? Can you set confidence thresholds or fallback conditions?
How is the planner prompted? Do you have any visibility into or control over the system prompt?

2. Tools (the “hands” of the agent)

Tools are the connections to the outside world: database queries, API calls, form submissions, file reads, email sends. An agent without tools is just a chatbot. The tools layer is where agents actually do things.

The number, reliability, and scope of tools determine what the agent can accomplish — and what it might do by accident. A tool that can send emails is useful; one that can send emails to anyone without an approval step is a liability.

What to ask a vendor:

Which tools does the agent have access to, and can that set be restricted?
Are tool calls logged in an auditable way?
What happens when a tool call fails — does the agent retry, escalate to a human, or silently fail?
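To make these questions concrete, here is a minimal Python sketch of a tools layer that restricts the agent to an allowlist, records every call, and flags failures for human escalation instead of failing silently. The names ToolRegistry, register, call, and audit_log are hypothetical and not taken from any vendor’s SDK; a real platform will look different, but it should be able to show you the equivalent of each piece.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolRegistry:
    """Hypothetical tools layer: allowlisted tools only, every call logged."""

    tools: dict[str, Callable[..., Any]] = field(default_factory=dict)
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        # Only tools registered here can ever be invoked by the agent.
        self.tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> dict[str, Any]:
        if name not in self.tools:
            # Out-of-scope tool: refuse, record the attempt, ask for a human.
            self.audit_log.append({"tool": name, "args": kwargs, "status": "denied"})
            return {"status": "denied", "escalate": True}
        try:
            result = self.tools[name](**kwargs)
        except Exception as exc:
            # A failed call is recorded and escalated, never swallowed.
            self.audit_log.append(
                {"tool": name, "args": kwargs, "status": "error", "error": str(exc)}
            )
            return {"status": "error", "escalate": True}
        self.audit_log.append({"tool": name, "args": kwargs, "status": "ok"})
        return {"status": "ok", "result": result}
```

In this sketch, a request for a tool that was never registered (for example, sending email) is denied and logged rather than attempted, which is the behaviour the first and third questions above are probing for.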

3. Memory (what the agent remembers)

Memory in AI agent architecture is more nuanced than it sounds. There are at least three distinct types:

Short-term memory: the current conversation context, held in the LLM’s context window. It resets when the session ends.
Long-term memory: facts stored in a database and retrieved when relevant — customer history, product knowledge, prior interactions.
Procedural memory: rules, workflows, and patterns baked into the agent’s instructions or fine-tuning.

Memory design has a direct impact on user experience and cost. Agents with well-designed long-term memory feel coherent and context-aware. Agents relying purely on short-term memory re-ask questions users already answered.

What to ask a vendor:

What memory types does the architecture include?
Where is customer data stored, and under which jurisdiction? (For Swiss businesses, this matters for nFADP compliance — see our article on AI Agents and Swiss Data Protection.)
Can long-term memory be cleared or corrected if it stores incorrect information?

4. Guardrails (the safety and control layer)

Guardrails are the constraints that keep the agent on-task and within acceptable behaviour. They operate at several levels: input filtering (blocking harmful or out-of-scope requests), output validation (checking responses before they’re sent), scope limits (preventing the agent from taking actions outside its defined role), and escalation logic (knowing when to hand off to a human).

This is the component most often underspecified in vendor demos. A demo works because the vendor controls the inputs. Production works because the guardrails hold when inputs are unpredictable.
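As an illustration, the scope-limit and escalation levels can be sketched in a few lines of Python. This is a simplified example under stated assumptions: the tool names, the 500 CHF approval threshold, and the function check_action are all hypothetical, and real guardrails would also cover input filtering and output validation.

```python
from dataclasses import dataclass

# Hypothetical policy values, chosen for illustration only.
ALLOWED_TOOLS = {"lookup_order", "issue_refund"}
APPROVAL_THRESHOLD_CHF = 500.0


@dataclass(frozen=True)
class ProposedAction:
    """An action the planner wants to take, before it is executed."""

    tool: str
    amount_chf: float = 0.0


def check_action(action: ProposedAction) -> str:
    """Return 'allow', 'escalate', or 'block' for a proposed action."""
    if action.tool not in ALLOWED_TOOLS:
        return "block"  # scope limit: outside the agent's defined role
    if action.amount_chf > APPROVAL_THRESHOLD_CHF:
        return "escalate"  # high-stakes: hand off to a human approver
    return "allow"
```

The point of the sketch is that the rules live outside the LLM: they are plain, inspectable policy that runs on every proposed action, regardless of what the user typed.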

What to ask a vendor:

What happens when a user tries to make the agent do something outside its scope?
Is there a human-in-the-loop option for high-stakes actions (large transactions, sensitive data access)?
How are guardrail rules updated as the business evolves?

How Architecture Choices Drive Your Operating Costs

AI agent costs aren’t just a licence fee. The architecture determines a significant portion of the per-interaction cost, which compounds at scale.

The main cost driver is LLM token consumption. Every message in the context window costs money. An agent that passes the full conversation history on every step, rather than using structured long-term memory, sends more tokens with each step: the per-step cost grows linearly with conversation length, so the total cost of a conversation grows roughly quadratically. For a company handling hundreds of interactions per day, the architectural choice locks in the cost curve.
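A short worked example shows why this matters. The Python sketch below compares total input tokens for a conversation under two designs: resending the full history on every step, and sending only a bounded window of recent turns. The figures (200 tokens per turn, a 4-turn window) are illustrative assumptions, not measurements, and the bounded case ignores the tokens spent on any retrieved memory or summary.

```python
def full_history_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every step resends all prior turns."""
    # Step k sends k turns' worth of tokens: 1 + 2 + ... + turns.
    return tokens_per_turn * turns * (turns + 1) // 2


def bounded_context_tokens(turns: int, tokens_per_turn: int, window_turns: int) -> int:
    """Total input tokens when each step sends at most `window_turns` recent turns."""
    return sum(tokens_per_turn * min(k, window_turns) for k in range(1, turns + 1))


if __name__ == "__main__":
    for turns in (20, 40):
        full = full_history_tokens(turns, 200)
        bounded = bounded_context_tokens(turns, 200, window_turns=4)
        print(f"{turns} turns: full history {full:,} tokens, bounded {bounded:,} tokens")
    # 20 turns: full history 42,000 tokens, bounded 14,800 tokens
    # 40 turns: full history 164,000 tokens, bounded 30,800 tokens
```

Doubling the conversation length roughly quadruples the cost of the full-history design but only about doubles the cost of the bounded one. Asking a vendor for this curve, with their real numbers, tells you more than a single per-interaction price.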

Other cost drivers worth examining:

Tool call frequency: more tool calls per interaction mean more latency and sometimes additional third-party API costs.
Model tier: some architectures use expensive frontier models for every step; others reserve those for complex reasoning and use lighter models for simpler subtasks (a pattern sometimes called a “model cascade”).
Retry logic: an agent that retries failed tool calls without circuit-breaker logic can multiply costs during outages.
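For the retry point, a circuit breaker is the standard pattern: after a set number of consecutive failures, the agent stops calling the failing tool for a cooldown period instead of retrying (and paying) indefinitely. The following is a minimal Python sketch, not a production implementation. The class name, the default thresholds, and the injectable clock parameter are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when calls are refused because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        max_failures: int = 3,
        cooldown_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable so the behaviour can be tested
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                # Still cooling down: refuse without touching the tool.
                raise CircuitOpenError("tool temporarily disabled")
            # Cooldown elapsed: allow one trial call; a failure reopens at once.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

During an outage, this caps the number of paid calls per cooldown window at a small, known number. An architecture without something equivalent has no such ceiling.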

Vendor Lock-In Lives in the Architecture, Not the Contract

The most overlooked procurement risk in AI agent projects is architectural lock-in. A contract can include exit clauses; an architecture cannot always be migrated cheaply.

Lock-in typically accumulates in three places:

Proprietary memory stores: if the agent’s long-term memory is held in a vendor-specific database format, migration means re-importing and re-validating all historical context — often months of accumulated data.

Hard-coded tool integrations: agents built directly on a vendor’s SDK using proprietary tool-calling APIs require rewriting when you move platforms. Compare this to agents built on open standards like the Model Context Protocol (MCP), where tools are more portable. We cover this in more depth in AI Agent Platform Lock-In: The Risks Nobody Prices In.

Model dependency: if the planner logic was heavily prompt-engineered around one specific model’s quirks (say, a model that’s since been deprecated or significantly updated), migrating to a different LLM may require re-tuning the entire system.

The safest architectures keep these layers loosely coupled: the planner can swap models; the memory layer uses standard retrieval patterns; the tools connect via documented APIs or open protocols. This isn’t always achievable on a budget or timeline, but it’s the question to ask.
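In code, loose coupling usually means the rest of the system depends on a small interface rather than on one vendor’s client library. The Python sketch below illustrates the idea for the planner only. The Planner interface, the two stand-in planner classes, and run_agent are hypothetical; they make no real API calls and stand in for whatever adapters a real system would use.

```python
from typing import Protocol


class Planner(Protocol):
    """The only thing the rest of the agent knows about the planner."""

    def next_step(self, goal: str) -> str: ...


class VendorAPlanner:
    """Stand-in for an adapter around one provider's model (hypothetical)."""

    def next_step(self, goal: str) -> str:
        return f"[vendor-a] plan for: {goal}"


class VendorBPlanner:
    """Stand-in for an adapter around a different provider (hypothetical)."""

    def next_step(self, goal: str) -> str:
        return f"[vendor-b] plan for: {goal}"


def run_agent(planner: Planner, goal: str) -> str:
    # Agent logic depends on the interface, not on a specific vendor.
    return planner.next_step(goal)
```

With this shape, switching providers means writing one new adapter class. If a vendor’s answer to “what would need to be rewritten?” is much longer than that, the coupling is tighter than the diagram suggests.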

A Practical Audit Checklist Before You Commit

When evaluating an AI agent proposal, these questions surface design quality faster than any demo:

Reliability

What is the failover behaviour when the LLM API is unavailable?
How does the agent handle ambiguous or conflicting inputs?
Is there a mechanism to detect and break infinite loops or runaway tool chains?

Cost

What is the estimated per-interaction token cost, and how does it scale with conversation length?
Does the architecture use a single model for all tasks, or a cost-tiered approach?

Lock-in

Which components are proprietary, and which use open standards?
Where is data stored, and in what format is it exportable?
If you switched the LLM provider, what would need to be rewritten?

Control

What guardrails are in place, and who can modify them?
Is there a full audit log of tool calls and agent decisions?

When Architecture Reviews Matter Most

Not every AI agent project warrants a deep architectural review at procurement. A simple FAQ chatbot with no tool access and no data persistence is low-risk regardless of how it’s built.

The architectural stakes rise with:

Tool access to sensitive systems (CRM, ERP, financial data, customer records)
High interaction volume where cost-per-interaction compounds
Multi-step autonomous workflows where a bad early decision propagates
Regulatory exposure: anything touching personal data under GDPR, the Swiss nFADP, or both (many Swiss businesses are subject to both regimes simultaneously)

If your use case sits in any of these categories, understand the architecture before the contract, not after. For a broader look at good agent design, see AI Agent Orchestration: Making Agents Work as a System and What Are AI Agents? A No-Hype Guide for Business Leaders.

The build-vs-buy decision is also shaped by architecture: platforms trade configurability for speed; custom builds trade speed for control. Build vs Buy: A Decision Framework for AI Agents examines that trade-off directly.

What a Good Architecture Review Looks Like

At Orange ITS, when we assess an AI agent project — whether we’re building it or reviewing an existing system — we map the four components explicitly before writing a line of code. That means defining the planner’s model and fallback behaviour, specifying which tools the agent can invoke and under what conditions, designing the memory model for both performance and data residency, and writing the guardrail logic before the happy path.

It’s not glamorous. It’s also the reason that systems built this way don’t surprise their owners six months in with spiralling API bills, compliance questions, or a vendor saying migration will cost more than the original build.

If you’re evaluating an AI agent project and want a second opinion on a vendor proposal — or a clear-eyed view of what architecture fits your use case — a 30-minute call with our team is the fastest way to get there. We’ll map the four components against your requirements and flag where the design holds and where it carries hidden risk.

Frequently asked questions

What are the main components of an AI agent architecture?

Every production AI agent has four logical components: a planner (the LLM that reasons and decides), tools (connections to databases, APIs, and email), memory (short-term, long-term, and procedural), and guardrails (input filtering, output validation, scope limits, and escalation logic). Vendors label and bundle them differently, but they are always present.

How does AI agent architecture affect operating costs?

The main cost driver is LLM token consumption: an agent that passes the full conversation history on every step sends more tokens with each step, so the total cost of a conversation grows roughly quadratically with its length, and that compounds at scale. Tool call frequency, model tier choices, and retry logic without circuit breakers also drive costs up.

Where does vendor lock-in hide in an AI agent system?

Lock-in accumulates in proprietary memory stores, hard-coded tool integrations built on a vendor's SDK, and planner logic prompt-engineered around one specific model. Contracts can include exit clauses, but a tightly coupled architecture cannot always be migrated cheaply.

What questions should I ask a vendor about their agent's memory design?

Ask which memory types the architecture includes, where customer data is stored and under which jurisdiction, and whether long-term memory can be cleared or corrected if it stores wrong information. For Swiss businesses, storage location matters for nFADP compliance.

When does an AI agent project need a deep architecture review?

The stakes rise when the agent has tool access to sensitive systems like CRM or financial data, runs at high interaction volume, executes multi-step autonomous workflows, or touches personal data under GDPR or the Swiss nFADP. A simple FAQ chatbot with no tool access is low-risk regardless of how it is built.
