Durable Sessions vs Product Skill Architecture: Which to Use?
// TL;DR
Choose the Durable Sessions Framework if your AI product suffers from dropped streams, broken multi-device experiences, or no user control during generation — it fixes the real-time delivery layer. Choose the Product Skill Architecture Method if agents produce stale, unsafe, or incorrect outputs because they lack your product's specific context. These frameworks solve completely different layers of the AI product stack, and most mature AI products will need both.
// HOW DO THEY COMPARE?
| Dimension | Christensen Durable Sessions AI UX Framework | Rodrigues Product Skill Architecture Method |
|---|---|---|
| Best for | Fixing broken real-time streaming, disconnection handling, and multi-device AI chat UX | Closing the knowledge gap between AI agents and your specific product's APIs, security rules, and workflows |
| Problem layer | Infrastructure and delivery — how agent output reaches the client | Context and correctness — what the agent knows about your product |
| Complexity | High — requires architectural redesign of streaming transport, pub/sub infrastructure, and client subscription model | Low to moderate — involves writing and iterating on a markdown skill document plus building an eval suite |
| Time to apply | Weeks to months depending on existing architecture; significant backend refactoring | Days to a week for v1; ongoing iteration via eval cycles |
| Prerequisites | Existing AI chat or agent-streaming product; understanding of SSE, WebSockets, and pub/sub patterns | Existing product documentation; known agent failure modes; ideally an MCP server or tool integration already in place |
| Output type | Redesigned streaming architecture with a persistent Durable Sessions layer between agents and clients | A versioned skill.md file (with optional reference files) bundled in a repo, plus an eval suite |
| Creator background | Mike Christensen, Ably — specialist in real-time infrastructure and messaging | Pedro Rodrigues, Supabase — specialist in developer platforms and agent tooling |
| Multi-agent support | Excellent — directly solves the orchestrator relay bottleneck by letting sub-agents write to a shared session | Not directly addressed — focused on single-agent context quality rather than multi-agent coordination |
| Eval / testing approach | Manual integration testing: drop connections, open second tabs, send cancel signals | Structured eval suite with graded completeness scores across baseline, MCP-only, and MCP+skill conditions |
| Ongoing maintenance | Low once implemented — the session layer is stable infrastructure | Ongoing — skill.md must evolve as your product's APIs, security rules, and workflows change |
What does the Christensen Durable Sessions AI UX Framework do?
The Durable Sessions Framework, introduced by Mike Christensen of Ably, diagnoses and fixes the real-time delivery layer of AI chat and agent products. It starts from a core observation: most AI products use direct HTTP streaming (typically SSE via tools like the Vercel AI SDK), which couples the health of the response stream to a single client connection. If that connection drops, the stream is gone. If a user switches devices, the new device cannot see the in-progress response. If they press "stop," the server cannot tell that apart from a network disconnect.
The framework introduces the concept of a Durable Session: a persistent, stateful, shared resource sitting between the agent layer and the client layer. Agents write events to the session; clients subscribe to the session. This architectural inversion unlocks three foundational capabilities: Resilient Delivery (streams survive disconnections), Continuity Across Surfaces (sessions follow users across tabs and devices), and Live Control (clients can steer, interrupt, or cancel agents mid-generation). The natural implementation substrate is a pub/sub channel model. Full functionality, Live Control in particular, requires replacing SSE with a bidirectional transport like WebSockets, because the client must be able to send signals back to the agent.
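The write/subscribe inversion can be illustrated with a toy model. The sketch below is not the framework's implementation: `DurableSession`, `append`, and `read_from` are hypothetical names, and an in-memory list stands in for a real pub/sub channel with history. It only shows why a session that outlives the connection makes resume and multi-device replay possible.

```python
from dataclasses import dataclass, field


@dataclass
class DurableSession:
    """Hypothetical in-memory stand-in for a durable session log."""
    events: list[str] = field(default_factory=list)

    def append(self, event: str) -> int:
        """Agent side: record an event and return its offset."""
        self.events.append(event)
        return len(self.events) - 1

    def read_from(self, offset: int) -> list[str]:
        """Client side: fetch every event at or after `offset`."""
        return self.events[offset:]


session = DurableSession()
for token in ["Hello", ", ", "world"]:
    session.append(token)

# A client received only the first event, then lost its connection.
seen = session.read_from(0)[:1]
# The agent keeps writing while the client is away.
session.append("!")
# On reconnect, the client resumes from the offset it had reached.
seen += session.read_from(len(seen))
print("".join(seen))  # Hello, world!

# A second device joins later and replays the session from the start.
second_device = session.read_from(0)
```

Because the agent writes to the session rather than to a socket, neither the reconnecting client nor the second device depends on the original connection still being alive.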
This is a deep infrastructure redesign. It is best suited for teams whose AI product already works at the model/agent level but breaks under real-world conditions like flaky mobile networks, multi-device usage, or concurrent multi-agent activity.
What does the Rodrigues Product Skill Architecture Method do?
The Product Skill Architecture Method, introduced by Pedro Rodrigues of Supabase, solves a fundamentally different problem: agents don't know enough about your specific product to use it correctly. There is a context gap between what a model learned in training and what it needs to know about your product's current APIs, security requirements, and optimal workflows.
The method provides a structured process for building a skill.md — a markdown instruction file that gives agents the product-specific guidance they lack. The core principles are ruthlessly practical: don't duplicate your docs (point agents to your single source of truth instead), put all non-negotiable rules directly in skill.md (because agents will skip reference files), be opinionated about workflows (encode the exact step sequences you know work best), and start minimal then iterate based on evals.
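One way to keep the "non-negotiable rules live in skill.md" principle from eroding is a small automated check. This is a minimal sketch under assumptions, not part of the method as described: the `missing_rules` helper, the sample skill text, and the rule phrases are all hypothetical, and matching by phrase is a deliberately crude stand-in for however a team chooses to identify its rules.

```python
def missing_rules(skill_text: str, required_phrases: list[str]) -> list[str]:
    """Return the required phrases that do not appear in the skill text."""
    lowered = skill_text.lower()
    return [p for p in required_phrases if p.lower() not in lowered]


# Hypothetical skill.md contents and rule phrases, for illustration only.
skill_md = """\
# Example product skill
## Non-negotiable rules
- Always set the security flag when creating SQL views.
## Docs
- Fetch the live API reference before calling any endpoint.
"""
required = ["security flag", "live API reference", "never use deprecated endpoints"]

print(missing_rules(skill_md, required))  # ['never use deprecated endpoints']
```

A rule reported as missing is a candidate for being stranded in a reference file, which is where the method says agents are likely to skip it.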
The framework's eval methodology is its backbone. You test the skill document like code: build scenario-based tests covering known failure modes, run them across baseline, MCP-only, and MCP+skill conditions, grade with a completeness score, and iterate. The entire output — a skill.md file, optional reference files, and an eval suite — is bundled into the relevant repo and versioned like software.
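To make the grading step concrete, here is a hedged sketch of one possible completeness score. The source does not define the scoring rule, so the formula (fraction of required checks passed), the check names, and the sample results are all assumptions made for illustration.

```python
CONDITIONS = ("baseline", "mcp_only", "mcp_plus_skill")


def completeness(passed: set[str], required: set[str]) -> float:
    """Fraction of required checks satisfied (an assumed scoring rule)."""
    return len(passed & required) / len(required)


# Hypothetical checks that one scenario's output must satisfy.
required = {"uses_current_api", "sets_security_flag", "correct_step_order"}

# Hypothetical graded results: which checks passed under each condition.
results = {
    "baseline": {"correct_step_order"},
    "mcp_only": {"uses_current_api", "correct_step_order"},
    "mcp_plus_skill": {"uses_current_api", "sets_security_flag", "correct_step_order"},
}

scores = {c: completeness(results[c], required) for c in CONDITIONS}
for condition, score in scores.items():
    print(f"{condition}: {score:.2f}")
```

Comparing the three scores for the same scenario is what shows whether the skill document, rather than the tool integration alone, is closing the gap.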
This method is best for platform and product teams whose AI integrations produce stale, unsafe, or incorrectly sequenced outputs.
How do they compare?
These two frameworks operate at entirely different layers of the AI product stack, and comparing them head-to-head on the same dimension is mostly a category error. Durable Sessions fixes how agent output reaches the user. Product Skill Architecture fixes what the agent knows when producing that output.
Durable Sessions is the harder, more expensive investment. It requires backend infrastructure changes — introducing a pub/sub session layer, migrating from SSE to WebSockets, redesigning client connection models. The payoff is a fundamentally more resilient product that handles real-world conditions gracefully. If your AI product feels broken on mobile, can't support multi-device sessions, or has a stop button that doesn't work reliably, this is the framework to apply.
Product Skill Architecture is faster to implement and easier to iterate on. The output is a markdown file and an eval suite, not an infrastructure overhaul. But it addresses the equally critical problem of agent correctness. If your agents are creating SQL views without security flags, using deprecated API endpoints, or generating migration files at the wrong time, no amount of streaming resilience will fix that. You need to close the context gap.
On multi-agent architectures, Durable Sessions is clearly stronger — it directly solves the orchestrator relay bottleneck that plagues multi-agent systems. Product Skill Architecture doesn't address multi-agent coordination; it focuses on making individual agents smarter about your product.
On eval rigor, the Skill Architecture Method is stronger. It provides a structured, graded, multi-model eval methodology. Durable Sessions validation is more manual — testing disconnection recovery, multi-tab visibility, and cancel signal propagation through integration tests.
Which should you choose?
If your agents produce correct outputs but users experience broken streams, lost responses, or no multi-device support, use the Durable Sessions Framework. Your problem is delivery, not intelligence.
If your agents have reliable delivery but produce stale, unsafe, or incorrectly sequenced outputs for your specific product, use the Product Skill Architecture Method. Your problem is context, not infrastructure.
If you're building a mature AI product, you likely need both — Skill Architecture to make agents correct, and Durable Sessions to make the experience resilient. Start with whichever layer is currently causing the most user-visible failures. For most early-stage products, that's agent correctness (Skill Architecture first). For products already deployed at scale on mobile or multi-device, it's more likely delivery resilience (Durable Sessions first).
// FREQUENTLY ASKED QUESTIONS
Can I use Durable Sessions and Product Skill Architecture together?
Yes, and most mature AI products should. They solve completely different layers — Durable Sessions fixes how agent output reaches users (delivery infrastructure), while Product Skill Architecture fixes what agents know about your product (context and correctness). They are complementary, not competing. Implement whichever addresses your most pressing user-visible failures first.
Which framework is easier to implement?
Product Skill Architecture is significantly easier. It involves writing a markdown file and building an eval suite — achievable in days. Durable Sessions requires redesigning your streaming infrastructure, introducing a pub/sub session layer, and potentially replacing SSE with WebSockets. That's weeks to months of backend engineering work depending on your current architecture.
Do I need Durable Sessions if I'm using the Vercel AI SDK?
If your product works fine on stable desktop connections and doesn't need multi-device support or mid-generation user control, the Vercel AI SDK's SSE streaming may be sufficient. But if users experience dropped responses on mobile, can't continue sessions across devices, or your stop button behaves unreliably, you've hit the Single-Connection Trap, where the response stream lives and dies with one client connection. That is the problem Durable Sessions solves.
What is a skill.md file and why does it matter for AI agents?
A skill.md is a structured markdown instruction file that gives AI agents product-specific knowledge their training data lacks. It contains non-negotiable rules (like security checks), opinionated workflows, and persistent directives to fetch live documentation. It matters because without it, agents default to stale or generic training data, producing unsafe or incorrect outputs for your specific product.
Does the Durable Sessions framework work with single-agent architectures?
Yes. While Durable Sessions particularly shines in multi-agent architectures by eliminating the orchestrator relay bottleneck, it provides critical value for single-agent setups too — specifically resilient delivery on flaky networks, cross-device session continuity, and live control (stop, steer, cancel) during generation.
How do I test whether a skill.md is actually improving agent behavior?
The Rodrigues method prescribes a structured eval approach: create at least six realistic task scenarios covering known failure modes, run them in three conditions (baseline, MCP-only, MCP+skill), and grade each with a completeness score. Run these across at least two model families. If a non-negotiable rule is still being skipped, move it from reference files into skill.md and retest.
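The minimums stated above imply a concrete run count. The sketch below enumerates the smallest eval matrix that satisfies them; the scenario and model-family names are placeholders, not names from the method.

```python
from itertools import product

# Placeholder names; the minimum counts come from the text above.
scenarios = [f"scenario_{i}" for i in range(1, 7)]          # at least six
conditions = ["baseline", "mcp_only", "mcp_plus_skill"]     # three conditions
model_families = ["family_a", "family_b"]                   # at least two

runs = list(product(scenarios, conditions, model_families))
print(len(runs))  # 36
```

At 6 × 3 × 2 = 36 graded runs per iteration, the minimum suite is small enough to rerun every time skill.md changes.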
Why can't SSE support a stop button properly?
SSE is one-way: the server pushes events to the client, and the client cannot send messages back over the same stream. On that stream, the only client action available is closing the connection. This creates an ambiguity the server cannot resolve from the stream alone: closing the connection could mean "I disconnected accidentally, please let me resume" or "I pressed stop, please cancel." Resume and cancel become mutually exclusive. A bidirectional transport like WebSockets solves this by letting the client send an explicit cancel message.
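The difference an explicit signal makes can be sketched as a tiny server-side decision function. The message shapes (`{"type": "cancel"}`, `{"type": "disconnect"}`) and the names `Outcome` and `on_client_event` are hypothetical, chosen only to illustrate the idea; a real system would define its own protocol.

```python
from enum import Enum


class Outcome(Enum):
    KEEP_GENERATING = "keep_generating"  # client may come back and resume
    CANCEL = "cancel"                    # user explicitly asked to stop


def on_client_event(event: dict) -> Outcome:
    """Decide what the agent should do when a client event arrives.

    Assumed message shapes: {"type": "cancel"} is sent by the client over a
    bidirectional channel; {"type": "disconnect"} is synthesized by the
    server when the socket closes.
    """
    if event.get("type") == "cancel":
        return Outcome.CANCEL
    # A bare disconnect carries no intent, so generation continues and the
    # session stays available for the client to resume.
    return Outcome.KEEP_GENERATING


print(on_client_event({"type": "cancel"}))      # Outcome.CANCEL
print(on_client_event({"type": "disconnect"}))  # Outcome.KEEP_GENERATING
```

With only a one-way stream, both cases would arrive as the same closed connection, so the function would have no input to branch on.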
Is the Product Skill Architecture method specific to Claude or does it work with other AI models?
It is designed to be model-agnostic. The method explicitly requires testing skills across at least two model families to confirm agent-agnostic performance. A skill.md that only works for one model is considered fragile. If a model fails where others pass, the skill language should be strengthened for that model's specific tendencies.