Durable Sessions vs AI Systems Engineering: Which Framework?
// TL;DR
These two frameworks solve completely different problems and rarely compete. If you are building or fixing an AI chat/agent product experience that breaks under real-world conditions — disconnections, multi-device, live control — choose Christensen's Durable Sessions. If you need coding agents to tackle low-level AI/ML engineering like CUDA kernels, LLM fine-tuning, or autonomous research pipelines, choose Burtenshaw's AI Systems Engineering. Pick based on whether your bottleneck is delivery infrastructure or ML engineering automation.
// HOW DO THEY COMPARE?
| Dimension | Christensen Durable Sessions AI UX Framework | Burtenshaw AI Systems Engineering via Coding Agents |
|---|---|---|
| Best For | Fixing broken AI chat/agent UX: streaming resilience, multi-device continuity, live agent control | Automating hard AI/ML systems work: CUDA kernels, LLM fine-tuning, autonomous research loops |
| Problem Domain | Real-time delivery infrastructure between agents and clients | AI/ML engineering automation closer to hardware and model training |
| Complexity | Medium — requires rearchitecting streaming layer and adopting pub/sub, but concepts are well-defined | High — spans CUDA programming, distributed multi-agent orchestration, and ML experiment design |
| Time to Apply | Days to weeks for audit and migration; incremental adoption possible | Hours for Boss 1/2 (single kernel or fine-tune); days to weeks for full AutoLab setup |
| Prerequisites | Existing AI chat or agent product with a streaming architecture (SSE, WebSocket, etc.) | GPU hardware access, CUDA familiarity, Hugging Face Hub account, coding agent tooling |
| Output Type | Architectural redesign: a durable session layer, transport migration plan, and validated UX capabilities | Concrete ML artifacts: optimized CUDA kernels, fine-tuned models, ranked experiment results |
| Key Architectural Insight | Decouple agents from clients via a persistent shared session; never let stream health depend on one connection | Expose open primitives to agents; memory bandwidth — not compute — is usually the bottleneck |
| Multi-Agent Relevance | Solves the orchestrator relay bottleneck by letting all sub-agents write directly to a shared session | Defines a full multi-agent AutoLab team (Researcher, Planner, Workers, Reporter) for parallel experiments |
| Creator Background | Mike Christensen, Ably — real-time infrastructure and pub/sub expertise | Ben Burtenshaw, Hugging Face — ML tooling, open-source models, and coding agent workflows |
| Primary Audience | Product engineers and architects building AI-powered user-facing applications | ML engineers and AI systems engineers pushing coding agents into low-level optimization |
What does the Christensen Durable Sessions AI UX Framework do?
Mike Christensen's framework diagnoses a specific, widespread failure in AI products: the streaming connection between your agent and your user is fragile. Most AI chat products use direct HTTP streaming (typically SSE via the Vercel AI SDK or similar), which creates what Christensen calls the Single-Connection Trap. If the user's connection drops, the stream is gone. A second device cannot see the live response. And because SSE is one-way, the server cannot tell a user pressing 'stop' apart from a network disconnect.
The framework introduces Durable Sessions — a persistent, shared session layer that sits between agents and clients. Agents publish events to the session; clients subscribe to it. Neither holds a direct pipe to the other. This single architectural inversion unlocks three foundational capabilities: Resilient Delivery (streams survive disconnections), Continuity Across Surfaces (sessions follow users across tabs and devices), and Live Control (clients can steer or cancel agents mid-generation).
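To make the inversion concrete, here is a minimal in-memory sketch of the idea, not Christensen's or Ably's implementation. The `DurableSession` class, its `publish` and `read_from` methods, and the offset-based resume are all hypothetical names chosen for illustration; a real session layer would persist events and push them to subscribers over the network.

```python
from dataclasses import dataclass, field


@dataclass
class DurableSession:
    """Hypothetical in-memory stand-in for a durable session layer."""
    events: list[str] = field(default_factory=list)

    def publish(self, event: str) -> int:
        """Agent side: append an event and return its offset in the log."""
        self.events.append(event)
        return len(self.events) - 1

    def read_from(self, offset: int) -> list[str]:
        """Client side: return every event at or after `offset`."""
        return self.events[offset:]


session = DurableSession()

# The agent streams tokens into the session, not into a client connection.
for token in ["Hello", ", ", "world"]:
    session.publish(token)

# A phone client read the first token, then lost its connection.
phone_seen = session.read_from(0)[:1]
phone_next_offset = len(phone_seen)

# The agent keeps generating while the phone is offline.
session.publish("!")

# On reconnect the phone resumes from its own offset; nothing is lost.
phone_seen += session.read_from(phone_next_offset)

# A laptop that joins late replays the same session from the start.
laptop_seen = session.read_from(0)
```

Because the log outlives any single connection, the reconnecting phone and the late-joining laptop end up with the same content, which is the essence of Resilient Delivery and Continuity Across Surfaces.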
The workflow is a 10-step process. In outline: audit your current streaming model, score it against the three capabilities, identify failure modes, design the durable session layer, redirect agents and clients to use it, replace SSE with bidirectional transport if needed, flatten multi-agent relay bottlenecks, and validate. It is a focused infrastructure migration, not a model or prompt change.
What does Burtenshaw's AI Systems Engineering via Coding Agents do?
Ben Burtenshaw's framework pushes coding agents into genuinely hard ML engineering problems that go far beyond writing application code. It is structured around three progressive tiers he calls "Bosses":
- Boss 1: CUDA Kernel Writing — Use a coding agent to generate and optimize custom GPU kernels, guided by file-based "Skills" that convert zero-shot tasks into few-shot tasks with examples and benchmarking scripts.
- Boss 2: Zero-Shot LLM Fine-Tuning — Issue a plain-language instruction to fine-tune a model on a dataset; the agent handles script generation, Hub job submission, and GPU provisioning.
- Boss 3: AutoLab — A full multi-agent autonomous research pipeline with four specialized roles (Researcher, Planner, Workers, Reporter) that propose hypotheses, run parallel experiments, and rank results by a verifiable metric.
The guiding philosophy is "go closer to the silicon" — routine coding is commoditized, so the high-value frontier is AI systems engineering. Key principles include preferring open primitives over opaque APIs (so agents can inspect everything), recognizing that memory bandwidth is usually the bottleneck (not compute), and anchoring all autonomous work in verifiable experiments with measurable outcomes.
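The memory-bandwidth principle can be checked with a roofline-style back-of-envelope calculation. The sketch below is illustrative only: the peak compute and bandwidth figures describe a hypothetical accelerator, not a specific GPU, and the function names are assumptions introduced here.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte read from or written to device memory."""
    return flops / bytes_moved


def is_memory_bound(intensity: float, peak_flops: float, peak_bandwidth: float) -> bool:
    """A kernel is memory-bound when its intensity is below the machine balance."""
    machine_balance = peak_flops / peak_bandwidth  # FLOPs the chip can do per byte moved
    return intensity < machine_balance


# Hypothetical accelerator: 100 TFLOP/s of compute, 2 TB/s of memory bandwidth.
PEAK_FLOPS = 100e12
PEAK_BANDWIDTH = 2e12

# Elementwise add of two float32 vectors: c[i] = a[i] + b[i].
n = 1_000_000
flops = n                 # one add per element
bytes_moved = 3 * 4 * n   # read a, read b, write c, 4 bytes each

intensity = arithmetic_intensity(flops, bytes_moved)
memory_bound = is_memory_bound(intensity, PEAK_FLOPS, PEAK_BANDWIDTH)
print(f"intensity={intensity:.3f} FLOP/byte, memory_bound={memory_bound}")
```

With these assumed numbers the kernel performs about 0.083 FLOPs per byte, while the hardware could sustain 50, so the add spends nearly all of its time waiting on memory. That gap is why kernel optimization work tends to focus on data movement (fusing operations, reusing loaded values) before raw arithmetic.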
How do they compare?
These frameworks operate in almost entirely different domains and are complementary, not competing.
Christensen's Durable Sessions is about the delivery layer — how agent-generated content reaches users reliably across real-world conditions. It does not touch what the agent does or how it reasons; it fixes the pipe between the agent and the human.
Burtenshaw's AI Systems Engineering is about the agent's work itself — what a coding agent can accomplish when pointed at hard ML problems. It does not address how results are streamed to end users; it assumes the agent has a capable execution environment.
The one area of conceptual overlap is multi-agent architecture. Both frameworks address the complexity of coordinating multiple agents, but from opposite angles. Christensen solves the output delivery problem (sub-agents writing to a shared session so the orchestrator does not relay updates). Burtenshaw solves the task distribution problem (specialized agent roles running parallel experiments). A team building a multi-agent AI product could apply both — Burtenshaw's pattern to structure the agent team's work, and Christensen's pattern to deliver that work to users.
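The delivery half of that overlap can be sketched as follows. This is a toy illustration under stated assumptions, not code from either framework: `SharedSession`, `run_sub_agent`, and the agent names are hypothetical, and threads stand in for independently running sub-agents.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedSession:
    """Hypothetical thread-safe event log that every agent can write to."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.events: list[tuple[str, str]] = []

    def publish(self, agent: str, message: str) -> None:
        # The lock serializes concurrent appends from sub-agents.
        with self._lock:
            self.events.append((agent, message))


def run_sub_agent(name: str, session: SharedSession, steps: int) -> None:
    """Each sub-agent reports its own progress; no orchestrator relay."""
    for step in range(steps):
        session.publish(name, f"step {step}")


session = SharedSession()
agents = ["search", "summarize", "verify"]

# The orchestrator only starts the sub-agents; it never forwards their output.
with ThreadPoolExecutor(max_workers=len(agents)) as pool:
    for name in agents:
        pool.submit(run_sub_agent, name, session, 3)

# A client view can filter the interleaved log by agent.
by_agent = {name: [m for a, m in session.events if a == name] for name in agents}
```

The interleaving across agents is nondeterministic, but each agent's own updates stay in order, and the orchestrator never sits on the path between a sub-agent and the user.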
On complexity, Durable Sessions is a more bounded migration — you know your current architecture and you are moving to a defined target state. AI Systems Engineering spans a much wider surface area, from CUDA programming to multi-agent experiment orchestration, and demands ML domain expertise.
Which should you choose?
Choose Christensen's Durable Sessions if your AI product works in demos but breaks in production — users lose responses on mobile, second tabs cannot see live activity, or your stop button is unreliable. Your problem is infrastructure, not intelligence. This framework directly addresses the gap between a fragile demo and a production-grade AI UX.
Choose Burtenshaw's AI Systems Engineering if you want to use coding agents to automate ML engineering tasks that are currently manual and expensive — writing custom GPU kernels, fine-tuning models without writing training code, or running autonomous research loops overnight. Your problem is engineering automation, not user experience.
Choose both if you are building a multi-agent AI product where agents perform complex ML work and users need a resilient, real-time view of that work. Use Burtenshaw to structure what agents do; use Christensen to ensure users see it reliably.
If you are unsure, ask yourself: Is my bottleneck that users cannot reliably receive and control agent output, or that agents cannot do the hard engineering work I need? The answer determines your framework.
// FREQUENTLY ASKED QUESTIONS
Can I use Durable Sessions and AI Systems Engineering together?
Yes, and they are complementary. Use Burtenshaw's framework to structure how coding agents tackle ML engineering tasks (kernel optimization, fine-tuning, research). Use Christensen's Durable Sessions to deliver agent progress and results to users reliably across devices. They solve different layers of the same AI product stack.
Which framework helps fix AI chat apps that break on mobile?
Christensen's Durable Sessions framework. It directly addresses the Single-Connection Trap, where mobile network switches or disconnections destroy the response stream. With a persistent session layer and automatic resume, mobile clients reconnect and pick up exactly where they left off without any agent-side logic.
Do I need to know CUDA to use Burtenshaw's framework?
You need CUDA familiarity for Boss 1 (kernel writing), but Boss 2 (fine-tuning) requires no CUDA knowledge — the agent handles it via plain-language instructions. Boss 3 (AutoLab) requires ML experiment design knowledge but not necessarily CUDA. The framework's Skill files help bridge knowledge gaps by providing structured examples.
What is a Durable Session in AI UX?
A Durable Session is a persistent, stateful, shared resource that sits between the agent layer and the client layer. Agents write events to it; clients subscribe to it. Messages outlive any individual connection. This decoupling enables stream resilience, multi-device continuity, and live client-to-agent control — the three capabilities that separate a production AI UX from a fragile demo.
What is AutoLab in the Burtenshaw framework?
AutoLab is a multi-agent autonomous research pipeline with four roles: Researcher (scans papers, proposes hypotheses), Planner (queues experiments), Workers (implement and run experiments as Hub jobs), and Reporter (tracks metrics via open data layers). It runs parallel experiments overnight and ranks results by verifiable metrics like validation loss.
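The four roles can be pictured as a small pipeline. The sketch below is a toy under loud assumptions: every function and class name is hypothetical, the hypotheses are hard-coded, and the "validation loss" is a made-up formula rather than the result of a training run or a Hub job.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    learning_rate: float


def researcher() -> list[Experiment]:
    """Propose hypotheses. Hard-coded here; the real role scans papers."""
    return [Experiment(f"lr={lr}", lr) for lr in (1e-2, 1e-3, 1e-4)]


def planner(proposals: list[Experiment]) -> list[Experiment]:
    """Queue experiments. This sketch just drops duplicates by name."""
    return list({e.name: e for e in proposals}.values())


def worker(exp: Experiment) -> tuple[Experiment, float]:
    """Run one experiment. The loss is a toy formula, not a training run."""
    val_loss = abs(exp.learning_rate - 1e-3) * 100 + 0.5
    return exp, val_loss


def reporter(results: list[tuple[Experiment, float]]) -> list[tuple[Experiment, float]]:
    """Rank results by the verifiable metric: lower validation loss is better."""
    return sorted(results, key=lambda r: r[1])


queue = planner(researcher())
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(worker, queue))  # Workers run in parallel

leaderboard = reporter(results)
best_experiment, best_loss = leaderboard[0]
```

The point of the structure is that ranking depends only on a number every run reports, so the leaderboard can be audited without trusting any single agent's narrative.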
Why is SSE not enough for AI streaming?
SSE is one-way: the server pushes events to the client, and the client has no channel back on the same connection. When a client closes an SSE connection, the server cannot distinguish between a network disconnect (resume needed) and a user-initiated cancel (stop needed). This SSE Resume-Cancel Conflict makes resilient delivery and live control mutually exclusive. Christensen recommends bidirectional transport like WebSockets plus a Durable Sessions layer.
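The difference an upstream channel makes can be shown with a small simulation. This is a hedged sketch, not a WebSocket or Ably API: the `Session` class, its `client_disconnected` and `client_cancel` methods, and the `generate` loop are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical session with an output log and a client-to-agent control flag."""
    output: list[str] = field(default_factory=list)
    cancelled: bool = False
    connected_clients: int = 0

    def client_disconnected(self) -> None:
        # A dropped connection changes who is listening, nothing more.
        self.connected_clients = max(0, self.connected_clients - 1)

    def client_cancel(self) -> None:
        # An explicit message on the upstream channel expresses intent.
        self.cancelled = True


def generate(session: Session, tokens: list[str], on_token=None) -> None:
    """Agent loop: keeps publishing until the session is cancelled."""
    for i, token in enumerate(tokens):
        if session.cancelled:
            break
        session.output.append(token)
        if on_token:
            on_token(i)


tokens = ["a", "b", "c", "d"]

# Case 1: the network drops after the second token; generation continues.
dropped = Session(connected_clients=1)
generate(dropped, tokens, on_token=lambda i: dropped.client_disconnected() if i == 1 else None)

# Case 2: the user presses stop after the second token; generation halts.
stopped = Session(connected_clients=1)
generate(stopped, tokens, on_token=lambda i: stopped.client_cancel() if i == 1 else None)
```

In the first case the full response is still in the session for the client to resume; in the second the agent stops after two tokens. Over plain SSE both cases would look like the same closed connection.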
Which framework is better for multi-agent AI architectures?
It depends on your problem. Christensen solves the multi-agent *delivery* problem — sub-agents writing directly to a shared session so the orchestrator does not become a relay bottleneck. Burtenshaw solves the multi-agent *task distribution* problem — specialized agent roles running parallel ML experiments. For a complete multi-agent product, you may need both.
How long does it take to implement Durable Sessions?
The audit and gap analysis can be done in a day. Full migration — redirecting agents and clients to a session layer, replacing SSE with bidirectional transport, and validating the three capabilities — typically takes days to a few weeks depending on architectural complexity. The framework supports incremental adoption, so you can start with resilient delivery alone.