How Do AI Engineers Build Better Agents with the Four-Pattern Framework?

For AI engineers and agent developers · Based on Swanepoel's Best Agents Four-Pattern Framework

// TL;DR

AI engineers use Swanepoel's Four-Pattern Framework to architect agents that go beyond basic prompt-and-tool setups. By implementing Focus Modes (constrained task-specific modes), Transparent Execution (visible process), Personalization (user-specific Playbooks and Memory), and Reversibility (undo at every level), engineers build agents that earn user trust, produce outputs matching user expectations, and handle high-value tasks. Apply it during initial design or when iterating on agents that underperform on trust, quality, or adoption metrics.

Why Do Most AI Agents Feel Untrustworthy Despite Working Correctly?

The root cause is structural, not model-related. Most agents return a final result without showing how they got there. Users receive an answer but have no way to verify the process, catch wrong assumptions, or intervene before work is wasted. Swanepoel's framework calls this the delegation trap: pure hand-off without collaboration.

The fix is Transparent Execution. Surface the agent's live to-do list, the tools it called, the inputs and outputs of each call, and any assumptions it's making. Users don't need to read every detail, but they need the ability to check. Progressive disclosure works well: summary by default, expandable details on demand.
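One way to picture this is a trace object that records each step and renders either a summary or the full detail. The sketch below is a minimal illustration, not part of the framework itself: the `Step` and `ExecutionTrace` names, their fields, and the `search_docs` tool are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Step:
    """One visible unit of agent work (hypothetical structure)."""
    title: str
    status: str = "pending"        # "pending", "running", or "done"
    tool: Optional[str] = None     # tool called during this step, if any
    tool_input: Any = None
    tool_output: Any = None
    assumptions: List[str] = field(default_factory=list)


@dataclass
class ExecutionTrace:
    """The agent's live to-do list plus the details behind each item."""
    steps: List[Step] = field(default_factory=list)

    def add(self, step: Step) -> Step:
        self.steps.append(step)
        return step

    def render(self, expanded: bool = False) -> str:
        """Return one summary line per step; add details when expanded."""
        marks = {"pending": "[ ]", "running": "[~]", "done": "[x]"}
        lines = []
        for step in self.steps:
            lines.append(f"{marks[step.status]} {step.title}")
            if not expanded:
                continue
            if step.tool is not None:
                lines.append(f"    tool: {step.tool}")
                lines.append(f"    input: {step.tool_input!r}")
                lines.append(f"    output: {step.tool_output!r}")
            for note in step.assumptions:
                lines.append(f"    assumes: {note}")
        return "\n".join(lines)


trace = ExecutionTrace()
trace.add(Step("Search pricing docs", status="done", tool="search_docs",
               tool_input={"query": "pricing"}, tool_output=["doc-12"],
               assumptions=["User means the current price list"]))
trace.add(Step("Draft summary"))

print(trace.render())               # collapsed view shown by default
print(trace.render(expanded=True))  # details the user can open on demand
```

The collapsed view is what most users would see; the expanded view exists so that a skeptical user can check a tool call or challenge an assumption before the agent continues.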

Making the process visible is a single pattern, yet it often produces the largest trust improvement for the least engineering effort.

How Should I Structure My Agent's Modes for Maximum Quality?

Stop building one undifferentiated agent that handles every request the same way. Identify the 2–5 distinct task types your agent performs. For a coding agent, these might be Planning Mode, Implementation Mode, Code Review Mode, and Debug Mode.

For each mode:

- Drop irrelevant tools. A planning mode doesn't need code execution.

- Refine the system prompt. Tailor instructions, tone, and output format to the mode.

- Set user expectations. Tell users what this mode does, what inputs it needs, and what outputs to expect.

The engineering benefit is concrete: you can write targeted evaluations (evals) per mode, catch regressions in specific task types, and ship improvements to one mode without risking the others. Holistic evaluation of a do-anything agent is much harder to act on, because a failing score does not point to a specific task type.
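A mode can be represented as plain configuration: a prompt, an allow-list of tools, and a short hint shown to the user. The following is a rough sketch under assumed names; `FocusMode`, `MODES`, `tools_for`, and the four tool names are hypothetical and only two of the example modes are filled in.

```python
from dataclasses import dataclass

# Hypothetical tool registry for a coding agent.
ALL_TOOLS = frozenset({"read_file", "write_file", "run_code", "search_docs"})


@dataclass(frozen=True)
class FocusMode:
    name: str
    system_prompt: str   # instructions, tone, and output format for this mode
    tools: frozenset     # only the tools this mode is allowed to call
    user_hint: str       # what the mode does, what it needs, what it returns


MODES = {
    "planning": FocusMode(
        name="Planning Mode",
        system_prompt="Produce a numbered implementation plan. Do not write code.",
        tools=frozenset({"read_file", "search_docs"}),  # no code execution
        user_hint="Describe the feature; you get a step-by-step plan.",
    ),
    "debug": FocusMode(
        name="Debug Mode",
        system_prompt="Reproduce the failure, then propose the smallest fix.",
        tools=frozenset({"read_file", "run_code"}),
        user_hint="Paste the error and steps to reproduce; you get a diagnosis.",
    ),
}


def tools_for(mode_key: str) -> list:
    """Return the sorted tool names a mode may call, rejecting unknown tools."""
    mode = MODES[mode_key]
    unknown = mode.tools - ALL_TOOLS
    if unknown:
        raise ValueError(f"{mode.name} references unknown tools: {sorted(unknown)}")
    return sorted(mode.tools)


print(tools_for("planning"))
```

Keeping modes as data rather than branching logic means a new mode is a new entry, and each entry is a natural unit to evaluate on its own.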

How Do I Make My Agent's Output Feel Like the User Wrote It?

This is the Personalization pattern, and it's where most agents fall flat. Generic outputs that could come from anyone are a signal that the agent lacks the user's context.

Implement three layers:

1. Playbooks: Document how the target user or organization actually does the task, including decision criteria, formatting preferences, source hierarchies, and analytical frameworks. Feed this to the agent as persistent context.

2. Memory: After each session, extract and store learnings such as corrections the user made, preferences they expressed, and terminology they used. Retrieve relevant memories in future sessions.

3. Connected Systems: Integrate the user's knowledge bases, tools, and data sources so the agent works with the user's actual information, not generic training data.
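The first two layers can be sketched as a small context builder. This is an illustration only: `Memory`, `MemoryStore`, and `build_context` are hypothetical names, and the word-overlap ranking is a deliberately crude stand-in for whatever retrieval method a real agent would use. Connected Systems are left out because they depend on the specific integrations involved.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memory:
    kind: str   # "correction", "preference", or "terminology"
    text: str


@dataclass
class MemoryStore:
    items: List[Memory] = field(default_factory=list)

    def remember(self, kind: str, text: str) -> None:
        """Store one learning extracted at the end of a session."""
        self.items.append(Memory(kind, text))

    def recall(self, task: str, limit: int = 3) -> List[Memory]:
        """Rank memories by words shared with the task (assumed heuristic)."""
        task_words = set(task.lower().split())
        scored = [(len(task_words & set(m.text.lower().split())), m)
                  for m in self.items]
        hits = [pair for pair in scored if pair[0] > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [m for _, m in hits[:limit]]


def build_context(playbook: str, store: MemoryStore, task: str) -> str:
    """Combine the persistent Playbook with memories relevant to this task."""
    lines = ["# Playbook", playbook, "# Relevant memories"]
    lines += [f"- ({m.kind}) {m.text}" for m in store.recall(task)]
    lines += ["# Task", task]
    return "\n".join(lines)


store = MemoryStore()
store.remember("preference", "Use bullet summaries in every report")
store.remember("terminology", "Call customers members")

print(build_context("Cite internal sources first.", store,
                    "Write the quarterly report"))
```

Only the first memory shares a word with the task, so only that one is injected. Filtering like this keeps the prompt focused as the store grows.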

The quality test: show the output to someone who knows the user. Would they think the user did it, or a random AI did it?

How Do I Handle the Risk of Irreversible Agent Actions?

Engineer Reversibility before something goes wrong, not after. Identify every action your agent takes with meaningful downside — API calls, file modifications, account changes, sent messages.

For each:

- Stage the action as pending-confirmation, showing exactly what will happen.

- Log a rollback path: the compensating transaction or prior state needed to undo it.

- Where possible, generate parallel outputs so the user can pick the best and discard the rest.

- Integrate with platform-native change tracking (Git, Word track-changes, database transaction logs).
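The first two steps above can be sketched as a log of staged actions, each carrying its own rollback. The names here (`StagedAction`, `ActionLog`, the `settings` dictionary) are hypothetical, and the example assumes the rollback can be expressed as a simple function that restores the prior state.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StagedAction:
    description: str                # shown to the user before confirmation
    apply: Callable[[], None]       # performs the action
    rollback: Callable[[], None]    # compensating step that undoes it
    status: str = "pending"         # "pending", "applied", or "rolled_back"


class ActionLog:
    def __init__(self) -> None:
        self.actions: List[StagedAction] = []

    def stage(self, description: str, apply: Callable[[], None],
              rollback: Callable[[], None]) -> StagedAction:
        """Record the action without running it."""
        action = StagedAction(description, apply, rollback)
        self.actions.append(action)
        return action

    def confirm(self, action: StagedAction) -> None:
        """Run a pending action once the user has approved it."""
        if action.status != "pending":
            raise ValueError(f"cannot confirm an action that is {action.status}")
        action.apply()
        action.status = "applied"

    def undo_last(self) -> Optional[StagedAction]:
        """Roll back the most recently applied action, if there is one."""
        for action in reversed(self.actions):
            if action.status == "applied":
                action.rollback()
                action.status = "rolled_back"
                return action
        return None


settings = {"plan": "free"}
log = ActionLog()
prior = settings["plan"]  # capture prior state before staging the change
change = log.stage("Set plan to 'pro'",
                   apply=lambda: settings.update(plan="pro"),
                   rollback=lambda: settings.update(plan=prior))

print(change.description, change.status)  # nothing has run yet
log.confirm(change)
log.undo_last()
print(settings)
```

Actions with no true inverse, such as a sent message, would need a different compensating step or a stricter confirmation gate; this sketch does not cover that case.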

Reversibility unlocks high-value use cases. Without it, users restrict agents to low-stakes tasks. With it, users authorize bold experiments knowing the worst case is just an undo.

What's the Recommended Implementation Order?

Run the audit first: score your agent as absent, partial, or present on each of the four patterns. Start with the pattern that addresses your biggest user complaint. For most agents, Focus Modes and Transparent Execution deliver the fastest improvements. Personalization and Reversibility layer on top once the foundation is stable.
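The audit itself is small enough to keep as a script. This sketch assumes a numeric mapping for the three scores and breaks ties by listing Focus Modes and Transparent Execution first; both choices, along with the `weakest_patterns` name, are assumptions for illustration. It ranks by score only, so you would still weigh the result against your biggest user complaint.

```python
PATTERNS = ("Focus Modes", "Transparent Execution",
            "Personalization", "Reversibility")
SCORES = {"absent": 0, "partial": 1, "present": 2}  # assumed numeric mapping


def weakest_patterns(audit: dict) -> list:
    """Order the four patterns from weakest to strongest score."""
    missing = [p for p in PATTERNS if p not in audit]
    if missing:
        raise ValueError(f"audit is missing patterns: {missing}")
    # sorted() is stable, so ties keep the order given in PATTERNS.
    return sorted(PATTERNS, key=lambda p: SCORES[audit[p]])


audit = {
    "Focus Modes": "partial",
    "Transparent Execution": "absent",
    "Personalization": "absent",
    "Reversibility": "present",
}
print(weakest_patterns(audit))
```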

Evaluate and iterate on each pattern independently. Don't try to improve everything at once; the constrained surfaces created by Focus Modes make targeted improvement practical and measurable.

Next step: Audit your current agent against all four patterns. Score each as absent, partial, or present, then start with the weak pattern that maps to your biggest user complaint.

// FREQUENTLY ASKED QUESTIONS

What's the minimum viable implementation of the Four-Pattern Framework?

Start with Focus Modes and Transparent Execution. Split your agent into 2–3 modes with constrained tool sets and add a visible task list showing the agent's steps. These two patterns alone address the most common agent failures: confused behavior and low trust. Layer in Personalization and Reversibility as you iterate.

How do I write evaluations for individual Focus Modes?

Create test cases specific to each mode's constrained task type. For a Research Mode, test whether it finds the right sources and surfaces uncertainties. For a Drafting Mode, test output quality and format adherence. Evaluate modes independently so that a regression shows up in the mode that caused it and a change to one mode can be checked without re-testing the others. This is far more tractable than evaluating a general-purpose agent.
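A per-mode harness can be very small. In the sketch below, `EvalCase`, `run_evals`, and `stub_agent` are hypothetical names, the agent is a stub that returns canned text, and the string checks are placeholders for whatever grading a real suite would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    mode: str                      # which Focus Mode this case targets
    prompt: str
    check: Callable[[str], bool]   # returns True when the output passes


def run_evals(agent: Callable[[str, str], str],
              cases: List[EvalCase]) -> Dict[str, float]:
    """Return the pass rate for each mode, computed independently."""
    passed: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for case in cases:
        output = agent(case.mode, case.prompt)
        total[case.mode] = total.get(case.mode, 0) + 1
        passed[case.mode] = passed.get(case.mode, 0) + int(case.check(output))
    return {mode: passed[mode] / total[mode] for mode in total}


def stub_agent(mode: str, prompt: str) -> str:
    """Stand-in for a real agent call; returns canned text per mode."""
    if mode == "research":
        return "Sources: [1] vendor docs. Uncertain: pricing may be outdated."
    return "draft without a heading"


cases = [
    EvalCase("research", "Summarize vendor pricing",
             lambda out: "Sources:" in out),
    EvalCase("research", "Summarize vendor pricing",
             lambda out: "Uncertain:" in out),
    EvalCase("drafting", "Write the intro",
             lambda out: out.startswith("# ")),
]

print(run_evals(stub_agent, cases))
```

Because results are keyed by mode, a drop in the drafting score points at the drafting prompt or tools, while the research score stays as a separate signal.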

Should I store Playbooks in the system prompt or in a retrieval system?

For short Playbooks (under 2,000 tokens), the system prompt works well and is simpler. For longer Playbooks or multiple Playbooks per mode, use retrieval (RAG) to pull relevant sections based on the current task context. Either way, the Playbook content should be versioned and editable by the user or organization.
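The decision can be sketched as a single function that inlines a short Playbook and falls back to section-level retrieval for a long one. Everything here is an assumption for illustration: the `Playbook` and `playbook_context` names, the rough four-characters-per-token estimate (a real tokenizer would be more accurate), and the word-overlap ranking that stands in for a proper retrieval system.

```python
from dataclasses import dataclass
from typing import Dict


def estimate_tokens(text: str) -> int:
    """Rough heuristic of about 4 characters per token (assumption)."""
    return max(1, len(text) // 4)


@dataclass
class Playbook:
    name: str
    version: int                 # bumped whenever the user edits the Playbook
    sections: Dict[str, str]     # section title -> section body

    def full_text(self) -> str:
        return "\n\n".join(f"{title}\n{body}"
                           for title, body in self.sections.items())


def playbook_context(playbook: Playbook, task: str,
                     budget: int = 2000, top_k: int = 2) -> str:
    """Inline a short Playbook; otherwise return the most relevant sections."""
    text = playbook.full_text()
    if estimate_tokens(text) < budget:
        return text  # short enough to live in the system prompt
    task_words = set(task.lower().split())

    def overlap(item) -> int:
        title, body = item
        return len(task_words & set(f"{title} {body}".lower().split()))

    ranked = sorted(playbook.sections.items(), key=overlap, reverse=True)
    return "\n\n".join(f"{title}\n{body}" for title, body in ranked[:top_k])


pb = Playbook("sales", version=1, sections={
    "Tone": "Keep emails short.",
    "Pricing": "Quote pricing from the 2024 sheet.",
})

print(playbook_context(pb, "reply about pricing"))                     # inlined
print(playbook_context(pb, "reply about pricing", budget=1, top_k=1))  # retrieved
```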