Durable Sessions vs Dark Factory: Which Should You Use?

// TL;DR

Choose the Christensen Durable Sessions Framework if you are building or fixing a user-facing AI chat or agent product that streams responses to end users — it solves disconnection, multi-device, and live control problems. Choose the Koc Dark Factory Method if you are a developer managing multiple AI coding agents in parallel to ship code faster. These frameworks solve entirely different problems: one is about delivering AI output to users reliably, the other is about orchestrating AI agents that write your code. Most teams building AI products need Durable Sessions first, since broken delivery undermines everything else.

// HOW DO THEY COMPARE?

| Dimension | Christensen Durable Sessions AI UX Framework | Koc Dark Factory Agent Orchestration Method |
| --- | --- | --- |
| Best for | Teams building user-facing AI chat/agent products that need resilient streaming, multi-device continuity, and live agent control | Solo developers or small teams using AI coding agents (e.g. Codex) to ship code at high velocity across parallel sessions |
| Problem domain | AI UX delivery infrastructure: how AI responses reach the end user | AI-assisted software development workflow: how engineers manage coding agents |
| Complexity | High: requires architectural changes to streaming layer, introducing a pub/sub session substrate, and replacing SSE with bidirectional transport | Medium: requires disciplined process changes and repo setup (clones, dot-skills), but no novel infrastructure |
| Time to apply | Days to weeks for a full implementation; hours for the audit and gap analysis | Hours to get started; ongoing refinement as you iterate swim lanes and dot-skills files |
| Prerequisites | An existing AI product with a streaming architecture (SSE, WebSockets, or polling); understanding of pub/sub concepts | An AI coding agent (e.g. Codex, Cursor, Claude Code); a codebase with a test harness; enough compute for parallel sessions |
| Output type | Architectural redesign: a durable session layer, transport migration plan, and validated multi-surface AI UX | Process and workflow: swim lane assignments, dot-skills files, merge gates, and a repeatable factory cadence |
| Key principle | Decouple agents from clients via a persistent shared session; neither holds a direct pipe to the other | Engineer as factory manager: apply taste and architectural judgement, let agents do the coding |
| Failure mode addressed | Streams lost on disconnect, stop-button ambiguity, no multi-device sync, orchestrator relay bottleneck | Commit maxing without quality gates, waffling agent sessions burning tokens, codebase bloat from unreviewed merges |
| Creator background | Mike Christensen, Ably (real-time infrastructure platform) | Vincent Koc, OpenClaw (open-source AI project) |
| Scalability concern | Scales to many concurrent users and devices per session; multi-agent writes to one session | Scales to many parallel agent sessions per engineer; limited by human brain-space, not compute |

What does the Christensen Durable Sessions AI UX Framework do?

The Christensen Durable Sessions Framework diagnoses and fixes the fragile streaming architecture behind most AI chat and agent products. If your AI app uses SSE (like Vercel AI SDK or LangChain streaming) to pipe LLM responses directly to a single client connection, you are in what Christensen calls the Single-Connection Trap: drop the connection, lose the stream.

The framework introduces a Durable Session — a persistent, shared, independently addressable layer that sits between your agent backend and your client frontend. Agents write events to the session; clients subscribe to the session. This architectural inversion unlocks three foundational capabilities:

1. Resilient Delivery — streams survive disconnections and clients resume exactly where they left off.

2. Continuity Across Surfaces — the same session is visible on multiple tabs, devices, or notification surfaces simultaneously.

3. Live Control — users can send stop signals, steering messages, or follow-up prompts while an agent is mid-generation.
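
To make the inversion concrete, here is a minimal in-memory sketch of the idea, not the framework's actual implementation. It assumes the session is an append-only event log and that each client remembers the offset it has read up to; the names `DurableSession`, `Client`, `append`, `read_from`, and `sync` are hypothetical and chosen only for illustration. A production system would back this with a pub/sub service rather than a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class DurableSession:
    """Append-only event log that outlives any single client connection."""
    session_id: str
    events: list[str] = field(default_factory=list)

    def append(self, event: str) -> int:
        """Agent side: write an event and return its offset in the log."""
        self.events.append(event)
        return len(self.events) - 1

    def read_from(self, offset: int) -> list[str]:
        """Client side: fetch every event at or after `offset`."""
        return self.events[offset:]


@dataclass
class Client:
    """Hypothetical client that remembers how far it has read."""
    name: str
    next_offset: int = 0
    received: list[str] = field(default_factory=list)

    def sync(self, session: DurableSession) -> None:
        new_events = session.read_from(self.next_offset)
        self.received.extend(new_events)
        self.next_offset += len(new_events)


session = DurableSession("chat-123")
phone = Client("phone")
laptop = Client("laptop")

session.append("Hello")
phone.sync(session)        # the phone reads the first event, then drops offline

session.append(", ")
session.append("world")    # the agent keeps writing while the phone is away

phone.sync(session)        # resilient delivery: resume from the saved offset
laptop.sync(session)       # continuity: a second surface reads the same session
```

Because the agent only ever writes to the session and never to a client, the phone going offline has no effect on what gets generated or stored.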

The framework also addresses the SSE Resume-Cancel Conflict (closing a connection is ambiguous — is it a disconnect or a cancel?) by requiring bidirectional transport like WebSockets. For multi-agent architectures, it eliminates the Orchestrator Dual-Purpose Problem by letting every sub-agent write directly to the session instead of proxying updates through a central orchestrator.
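
The disconnect-versus-cancel ambiguity can be illustrated with a small simulation. This is a sketch under stated assumptions: the `Session` class, the `"stop"` control message, and the `client_actions` script are all hypothetical stand-ins for a real bidirectional transport. The point it shows is that once a stop is an explicit upstream message, a dropped connection no longer has to be interpreted as a cancel.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Session:
    """Hypothetical session holding agent output and client control messages."""
    output: list[str] = field(default_factory=list)
    control: list[str] = field(default_factory=list)
    connected_clients: set[str] = field(default_factory=set)

    def disconnect(self, client_id: str) -> None:
        # A dropped connection only changes who is listening; it is not a cancel.
        self.connected_clients.discard(client_id)

    def send_control(self, message: str) -> None:
        # An explicit message sent upstream over the bidirectional transport.
        self.control.append(message)


def run_agent(
    session: Session,
    tokens: list[str],
    client_actions: dict[int, Callable[[Session], None]],
) -> str:
    """Generate tokens, checking for an explicit stop before each one."""
    for step, token in enumerate(tokens):
        action = client_actions.get(step)
        if action:
            action(session)  # simulate something the client does at this step
        if "stop" in session.control:
            return "cancelled"
        session.output.append(token)
    return "completed"


tokens = ["a", "b", "c", "d"]

# Case 1: the client's connection drops mid-stream. Generation continues.
dropped = Session(connected_clients={"phone"})
status_dropped = run_agent(dropped, tokens, {1: lambda s: s.disconnect("phone")})

# Case 2: the user presses stop. Generation halts at the next check.
stopped = Session(connected_clients={"phone"})
status_stopped = run_agent(stopped, tokens, {2: lambda s: s.send_control("stop")})
```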

What does the Koc Dark Factory Agent Orchestration Method do?

The Koc Dark Factory Method is a workflow framework for engineers who use AI coding agents to write software. Instead of treating AI as an autocomplete tool, it reframes the engineer as a factory manager running a production line of parallel agent sessions.

The core organizing concept is swim lanes: isolated, parallel agent sessions each scoped to a single mandate — CI health, feature work, bug fixes, or horizon-scanning for emerging issues. The engineer decides how many lanes to run, how much oversight each needs, and when to nuke a session that has gone off the rails.
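
One way to picture swim lanes is as a small data structure the engineer maintains. The sketch below is illustrative only: the `SwimLane` and `Oversight` names, the `clones/...` paths, and the oversight level given to the horizon-scanning lane are assumptions, not something Koc specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Oversight(Enum):
    LOW = "low"        # check in occasionally
    ACTIVE = "active"  # ongoing conversation with the agent


@dataclass
class SwimLane:
    mandate: str           # the single job this session is scoped to
    oversight: Oversight
    workdir: str           # hypothetical path to an isolated clone of the repo
    alive: bool = True

    def nuke(self) -> None:
        """Discard a session that has gone off the rails."""
        self.alive = False


lanes = [
    SwimLane("CI health", Oversight.LOW, "clones/ci"),
    SwimLane("feature work", Oversight.ACTIVE, "clones/feature"),
    SwimLane("bug fixes", Oversight.ACTIVE, "clones/bugs"),
    SwimLane("horizon-scanning", Oversight.LOW, "clones/horizon"),
]

lanes[1].nuke()  # the engineer decides this session is no longer worth its tokens
active_mandates = [lane.mandate for lane in lanes if lane.alive]
```

Giving each lane its own working directory keeps parallel sessions from stepping on each other's changes, which is what makes killing one lane cheap.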

Key operating principles include:

- In Harness We Trust — the automated test suite, even over-fitted AI-generated tests, is the primary merge gate. Line-by-line code review does not scale at high agent velocity.

- The Waffling Signal — if an agent starts explaining itself in circles, kill the session immediately rather than burning more tokens.

- Token Efficiency — the mature posture is deliberate, structured agent loops with reward mechanisms, not blind commit-maxing.

- Dot-Skills as Engineering Artefacts — reusable, versioned prompt/context files that improve through use and are deployed into every new agent session.
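
The "In Harness We Trust" principle can be sketched as a merge gate that looks only at the test harness's exit code. This is a minimal illustration, not Koc's tooling: `harness_passes`, `merge_gate`, and the branch names are hypothetical, and the two "suites" are stand-in commands so the example runs without a real project. In practice `test_command` would be whatever runs your own test suite.

```python
import subprocess
import sys


def harness_passes(test_command: list[str]) -> bool:
    """Run the test harness and report whether it exited cleanly."""
    result = subprocess.run(test_command, capture_output=True)
    return result.returncode == 0


def merge_gate(branch: str, test_command: list[str]) -> str:
    """Decide what to do with an agent's branch based only on the harness."""
    if harness_passes(test_command):
        return f"merge {branch}"
    return f"reject {branch}"


# Stand-ins for a real test runner so the sketch runs anywhere.
passing_suite = [sys.executable, "-c", "raise SystemExit(0)"]
failing_suite = [sys.executable, "-c", "raise SystemExit(1)"]

decision_ok = merge_gate("agent/feature-x", passing_suite)
decision_bad = merge_gate("agent/bugfix-y", failing_suite)
```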

The method also advocates a plugin architecture as a scope boundary: instead of merging every contributor's feature PR into core, hand them an isolated plugin surface they control.

How do they compare?

These two frameworks operate at entirely different layers of the AI engineering stack and are not competitors.

The Durable Sessions Framework is an infrastructure architecture pattern. It answers: How do AI-generated responses reach the end user reliably across devices, networks, and agent topologies? It requires real engineering investment — redesigning your streaming layer, introducing a pub/sub session substrate, and migrating transports.

The Dark Factory Method is a development workflow pattern. It answers: How do I, as an engineer, manage multiple AI coding agents to ship code quickly without destroying my codebase? It requires process discipline — structuring swim lanes, maintaining test harnesses, iterating dot-skills files — but no novel infrastructure.

A team could easily need both. If you are building an AI product, you might use the Dark Factory Method to manage your coding agents during development, and the Durable Sessions Framework to architect how your product delivers AI responses to users. They address orthogonal concerns.

Where one is clearly stronger:

- For fixing broken AI chat UX — Durable Sessions is the only relevant framework. Dark Factory has nothing to say about user-facing streaming.

- For shipping code faster with AI agents — Dark Factory is the only relevant framework. Durable Sessions does not address development workflow.

- For multi-agent architectures — both frameworks touch multi-agent concerns, but from opposite angles. Durable Sessions solves how multiple agents deliver output to a shared session for users. Dark Factory solves how one engineer runs multiple agent sessions in parallel for development.

Which should you choose?

If your AI product has users complaining about lost responses, broken mobile experiences, no multi-device sync, or a stop button that does not work reliably — choose the Durable Sessions Framework. It directly addresses these infrastructure failures and provides a concrete architectural migration path.

If you are a developer or small team trying to ship faster using AI coding agents and you are struggling with agent session management, codebase bloat, or wasted tokens — choose the Dark Factory Method. It gives you the process discipline to run a high-velocity agent-driven development workflow.

If you are building an AI product and also using AI agents to build it, you likely need both — Durable Sessions for your product architecture and Dark Factory for your development process. Start with whichever pain point is more acute. For most teams, the user-facing delivery layer (Durable Sessions) should come first, because no amount of development velocity matters if your product's AI UX is broken.

// FREQUENTLY ASKED QUESTIONS

Can I use Durable Sessions and Dark Factory together?

Yes, and many teams should. They solve completely different problems. Use the Dark Factory Method to manage your AI coding agents during development. Use the Durable Sessions Framework to architect how your AI product delivers responses to end users. They operate at different layers of the stack and complement each other naturally.

Do I need Durable Sessions if I'm using Vercel AI SDK?

Most likely yes. The Vercel AI SDK uses SSE by default, which creates the Single-Connection Trap. If a user's connection drops, the stream is lost. Durable Sessions introduces a persistent layer between your agent and client that survives disconnections, enables multi-device sync, and resolves the SSE Resume-Cancel Conflict that makes stop buttons unreliable.

Is the Dark Factory Method only for open-source projects?

No. Vincent Koc developed it in an open-source context (OpenClaw), but the swim lane model, test harness gating, waffling signal detection, and dot-skills practices apply equally to proprietary codebases. Any engineer running multiple parallel AI coding agent sessions benefits from the structured factory management approach.

What is a Durable Session vs a regular WebSocket connection?

A WebSocket is a transport — it provides bidirectional communication but is still a point-to-point connection. A Durable Session is a persistent, shared resource that outlives any individual connection. Multiple clients and agents connect to the same session. If a WebSocket drops, the session persists and the client reconnects without data loss. WebSockets alone do not solve multi-device or resilience problems.

How many swim lanes should I run in the Dark Factory Method?

It depends on your brain-space, not your compute budget. Koc suggests starting with 4-5 lanes: 1-2 for CI/test stability (low oversight), 2-3 for features or bugs (active conversation), and optionally one for horizon-scanning issues. Scale up only if you can maintain quality oversight across all lanes. The constraint is your cognitive capacity to monitor reasoning quality, not token cost.

Does the Durable Sessions Framework work for multi-agent architectures?

Yes, and it is particularly valuable there. It solves the Orchestrator Dual-Purpose Problem where the orchestrator is forced to relay sub-agent progress updates. With Durable Sessions, each sub-agent writes directly to the shared session channel. The orchestrator focuses only on coordination. Clients subscribe once and see all agent activity without any additional relay code.
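
The difference from a relay design can be shown with a small publish/subscribe sketch. Everything here is an assumption made for illustration: `SharedSession`, `sub_agent`, `orchestrator`, and the agent and task names are hypothetical, and the sub-agents run sequentially rather than concurrently to keep the example short.

```python
from typing import Callable

Event = dict[str, str]


class SharedSession:
    """Hypothetical in-memory session channel with publish/subscribe."""

    def __init__(self) -> None:
        self.log: list[Event] = []
        self.subscribers: list[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self.subscribers.append(handler)

    def publish(self, agent: str, text: str) -> None:
        event = {"agent": agent, "text": text}
        self.log.append(event)
        for handler in self.subscribers:
            handler(event)


def sub_agent(session: SharedSession, name: str, task: str) -> None:
    """Each sub-agent writes its own progress straight to the session."""
    session.publish(name, f"started {task}")
    session.publish(name, f"finished {task}")


def orchestrator(session: SharedSession, tasks: dict[str, str]) -> None:
    """Coordinates only: assigns work and never relays sub-agent output."""
    for agent, task in tasks.items():
        sub_agent(session, agent, task)


seen: list[Event] = []
session = SharedSession()
session.subscribe(seen.append)   # the client subscribes once
orchestrator(session, {"researcher": "find sources", "writer": "draft summary"})
```

Note that `orchestrator` contains no code for forwarding progress to the client; the single subscription is enough to see every agent's events.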

What is the waffling signal in AI coding agents?

The waffling signal is when an AI coding agent produces verbose, circular, or incoherent explanations without making meaningful progress. It is the agent equivalent of a team member who is bullshitting. The correct response in the Dark Factory Method is to immediately kill that session and either restart with a tighter mandate or reassign the task, rather than burning more tokens hoping it recovers.
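
In the method as described, spotting waffling is a human judgement call. If you wanted a rough automated tripwire, one possible heuristic (an assumption of this sketch, not part of Koc's method) is to flag output that repeats its own sentences while producing no code changes. The function names and the 0.6 threshold below are hypothetical and would need tuning against real sessions.

```python
import re


def novelty_ratio(text: str) -> float:
    """Share of sentences that are not repeats of an earlier sentence."""
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 1.0
    return len(set(sentences)) / len(sentences)


def is_waffling(text: str, made_changes: bool, threshold: float = 0.6) -> bool:
    """Flag output that repeats itself while producing no code changes."""
    return not made_changes and novelty_ratio(text) < threshold


circular = (
    "Let me reconsider the approach. The issue may be in the parser. "
    "Let me reconsider the approach. The issue may be in the parser. "
    "Let me reconsider the approach."
)
focused = "Found the bug in the parser. Added a bounds check. Tests pass."

kill_circular = is_waffling(circular, made_changes=False)
kill_focused = is_waffling(focused, made_changes=True)
```

A check like this can only prompt a closer look; it cannot judge reasoning quality, so the decision to kill a session stays with the engineer.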

Which framework helps with AI chatbot disconnection issues?

The Christensen Durable Sessions Framework directly solves this. It introduces a persistent session layer between your agent and client so that streams survive disconnections. When a client reconnects, it resumes from exactly where it left off with no data loss and no agent-side replay logic. The Dark Factory Method does not address user-facing connection issues.