Frequently Asked Questions About the Solmaz On-Demand Disposable Agent Orchestration Framework
21 answers covering everything from basics to advanced usage.
// Basics
What is an agent harness and how is it different from a model?
A harness is the full coding agent environment wrapping an AI model — including context window management, tooling integrations, file system access, and the interaction layer. Codex, Claude Code, and OpenClaw are harnesses. The model (GPT-4, Claude, etc.) is just one component inside. The Solmaz framework orchestrates harnesses, not raw models, because agents need the surrounding infrastructure to do useful work like editing files and running tests.
What is the difference between a shallow bug loop and a fundamental refactor?
A shallow bug loop is when an agent iteratively finds and fixes minor, surface-level bugs — typos, linting errors, simple logic fixes, CI failures. This is agent-safe and produces quality output. A fundamental refactor involves architectural decisions — changing data models, restructuring modules, altering API contracts. Agents looping on fundamental refactors produce slop. The framework mandates that agents escalate fundamental refactors to humans and only autonomously loop on shallow bugs.
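The routing rule above can be sketched as a small classifier. This is a minimal illustration, not the framework's implementation: the change labels and the `route_change` function are hypothetical names, and a real SOP would define its own taxonomy.

```python
from enum import Enum


class Route(Enum):
    LOOP = "loop"          # the agent may keep iterating on its own
    ESCALATE = "escalate"  # the decision is handed to a human


# Hypothetical labels for surface-level changes an agent may loop on.
SHALLOW_KINDS = {"typo", "lint_error", "simple_logic_fix", "ci_failure"}


def route_change(kind: str) -> Route:
    """Return LOOP for shallow bugs and ESCALATE for everything else."""
    if kind in SHALLOW_KINDS:
        return Route.LOOP
    # Fundamental refactors (data models, module structure, API contracts)
    # and any unrecognised kind go to a human by default.
    return Route.ESCALATE
```

Defaulting unknown kinds to escalation is a design choice in this sketch: it errs on the side of human review whenever the classification is unclear.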
What is Telegram Driven Development and does it really work?
Telegram Driven Development (TDD, not to be confused with test-driven development) is a workflow pattern in which agent tasks are dispatched, monitored, and iterated on through messaging-platform channels rather than a traditional IDE. It works for tasks that are well defined enough to encode as standard operating procedures (SOPs): you send a task via Telegram, Discord, or Slack; an agent pod spins up, executes the workflow, and returns results to the channel. Treating each channel as a lightweight IDE session lets you code on the go, for example during a commute.
What is the Ship of Theseus principle in this framework?
The Ship of Theseus principle means a harness does not need to be rebuilt from scratch as requirements evolve — it can be iteratively ripped apart and reassembled. The identity of your agent system is maintained through continuity of use, not continuity of implementation. You can swap out the underlying model, change the tooling layer, update the ACP adapter, and modify SOPs incrementally. This prevents the common trap of rewriting agent infrastructure from scratch every time a new model drops.
// How To
How do I decide which tasks to automate with agents vs. keep manual?
Classify inbound work into three categories: (a) fully automatable mechanical work like CI-fix refactors, (b) agent-assisted work requiring human sign-off like PR intent evaluation, and (c) work requiring human design judgment like architecture decisions. Only categories (a) and (b) enter the agent workflow. The key signal is repetition — if you notice yourself repeating the same mechanical judgment steps, that pattern should be encoded as an SOP and handed to an agent.
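As a rough sketch of this triage, the three categories and the repetition signal could be modelled as follows. The names `WorkCategory`, `enters_agent_workflow`, and `sop_candidates`, and the threshold of three repeats, are assumptions made for illustration.

```python
from collections import Counter
from enum import Enum


class WorkCategory(Enum):
    AUTOMATABLE = "a"     # mechanical work, e.g. CI-fix refactors
    AGENT_ASSISTED = "b"  # agent does the work, a human signs off
    HUMAN_DESIGN = "c"    # architecture and other design judgment


def enters_agent_workflow(category: WorkCategory) -> bool:
    """Only categories (a) and (b) are handed to agents."""
    return category in (WorkCategory.AUTOMATABLE, WorkCategory.AGENT_ASSISTED)


def sop_candidates(task_log: list[str], min_repeats: int = 3) -> list[str]:
    """Return task kinds that recur often enough to be worth encoding as an SOP."""
    counts = Counter(task_log)
    return sorted(kind for kind, n in counts.items() if n >= min_repeats)
```

For example, a log of `["fix-ci", "fix-ci", "fix-ci", "design-api"]` would flag only `fix-ci` as an SOP candidate under the default threshold.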
How do I set up ACPX as a workflow engine?
Install the ACPX CLI and bind it to your communication platform channels and harnesses via ACP. Define your SOP workflows as sequences of programmatic nodes — each node is a step like 'determine intent' or 'check CI status' that emits structured JSON output feeding the next node. ACPX functions like Argo Workflows but drives harness sessions instead of raw containers. Wire in escalation breakpoints where the workflow routes to a human for design decisions instead of continuing the loop.
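The node-chaining idea can be illustrated in plain Python. This sketch does not use or represent the ACPX API: the node functions, their output fields, and `run_workflow` are hypothetical, and a real node would drive a harness session instead of returning canned values.

```python
import json
from typing import Callable

# A node receives the previous node's JSON-compatible output and returns its own.
Node = Callable[[dict], dict]


def determine_intent(prev: dict) -> dict:
    # Placeholder logic standing in for a harness-driven step.
    return {"step": "determine_intent", "intent": "fix failing CI", "route": "continue"}


def check_ci_status(prev: dict) -> dict:
    # Hypothetical rule: a design-level intent triggers an escalation breakpoint.
    route = "escalate" if prev.get("intent") == "redesign API" else "continue"
    return {"step": "check_ci_status", "ci": "failing", "route": route}


def run_workflow(nodes: list[Node], task: dict) -> list[dict]:
    """Run nodes in order, stopping at the first escalation breakpoint."""
    trail = []
    current = task
    for node in nodes:
        current = node(current)
        # Round-trip through JSON so every output is known to be serialisable.
        current = json.loads(json.dumps(current))
        trail.append(current)
        if current["route"] == "escalate":
            break  # hand over to a human instead of continuing the loop
    return trail
```

Calling `run_workflow([determine_intent, check_ci_status], {"task": "example"})` returns the JSON trail of both steps; a node that routes to `"escalate"` ends the run early.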
How do I handle scaling from 10 to 1000 concurrent agent tasks?
The Goal Operator and Kubernetes handle horizontal scaling natively — each task is an independent pod, so scaling is a cluster capacity question, not an architectural one. Ensure your state synchronisation layer scales (e.g. GitHub API rate limits, rsync bandwidth). Use Helm charts for repeatable pod templates. The concierge pattern naturally load-balances by dispatching to available pods. Monitor cluster resources and set pod resource limits to prevent noisy-neighbor problems.
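To make the resource-limit advice concrete, here is a sketch of the kind of per-task pod manifest a Helm template might render, built as a Python dictionary. The field names follow the standard Kubernetes Pod schema; the function name, label, image, and default sizes are assumptions, and `task_id` is assumed to be a valid lowercase DNS label.

```python
import json


def agent_pod_manifest(task_id: str, image: str, cpu: str = "2", memory: str = "4Gi") -> dict:
    """Build a one-shot pod manifest with explicit resource requests and limits."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"agent-{task_id}", "labels": {"app": "disposable-agent"}},
        "spec": {
            # Disposable: the pod is not restarted in place after it exits.
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "harness",
                    "image": image,  # hypothetical harness image
                    "resources": {
                        "requests": {"cpu": cpu, "memory": memory},
                        "limits": {"cpu": cpu, "memory": memory},
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(agent_pod_manifest("42", "example.invalid/harness:latest"), indent=2))
```

Setting requests equal to limits is one way to reduce noisy-neighbor effects, since the scheduler then reserves exactly what the pod is allowed to use.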
How do I create a concierge agent on Slack for my team?
Deploy a persistent agent on your Slack workspace that listens for requests. When an engineer messages it, the concierge uses ACPX to dispatch an on-demand disposable agent pod for that specific task. Since Slack does not natively support dynamic multi-agent provisioning, the concierge returns a UI link to a React app hosted in-cluster where the engineer can interact with the spawned agent's session. Automate provisioning through the Goal Operator — never manage Slack app manifests by hand.
What structured output format should my SOP workflow steps use?
Each SOP step should emit structured JSON that is auditable and feeds directly into the next workflow node. Include fields for the step name, decision made, confidence level, evidence (e.g. diff hunks examined), and the routing decision (continue to next step, loop for shallow fix, or escalate to human). This JSON trail makes the entire agent workflow inspectable, debuggable, and improvable. It also enables programmatic quality gates — if a step's output does not meet criteria, the workflow can branch automatically.
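One possible shape for such a step output is sketched below. The field names mirror the list above, but the exact schema, the `[0, 1]` confidence scale, and the `gate` threshold are assumptions rather than a format defined by the framework.

```python
import json
from dataclasses import asdict, dataclass, field

ROUTES = {"continue", "loop", "escalate"}


@dataclass
class StepOutput:
    step: str
    decision: str
    confidence: float  # assumed to be on a 0.0 to 1.0 scale
    evidence: list[str] = field(default_factory=list)  # e.g. diff hunks examined
    route: str = "continue"

    def __post_init__(self) -> None:
        if self.route not in ROUTES:
            raise ValueError(f"unknown route: {self.route!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


def gate(output: StepOutput, min_confidence: float = 0.8) -> str:
    """Quality gate: low-confidence steps escalate regardless of their own routing."""
    if output.confidence < min_confidence:
        return "escalate"
    return output.route
```

Validating the routing value at construction time keeps malformed outputs from propagating to later nodes, and `gate` shows how a workflow could branch automatically on a step's output.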
// Troubleshooting
What is the telephone game anti-pattern in agent orchestration?
The telephone game anti-pattern occurs when you route instructions through a middle model — for example, asking Claude to tell Codex what to do. Each intermediate LLM paraphrases the instructions, introducing subtle wording errors that compound. Since exact wording matters significantly when prompting agents, this degrades output quality. The fix is to use ACP to route instructions directly to the executing harness, bypassing any intermediary paraphrasing layer.
How do I prevent parallel agents from creating conflicting file changes?
Configure a state synchronisation layer before running parallel agents. Give every agent pod read/write access to the GitHub repository, then layer an rsync-style or Dropbox-style synchronisation mechanism on top so that file state stays consistent across pods. Without this, agents editing the same repository concurrently will silently produce conflicting artefacts: merge conflicts, overwritten changes, or divergent branches. This is one of the most common pitfalls when scaling from one agent to many.
What happens if an agent pod crashes mid-task?
The Goal Operator handles pod lifecycle including failure recovery. If a pod crashes, the operator can reprovision a new pod for the same task. Because agents are disposable and task state is tracked via structured JSON outputs at each SOP step, the workflow can resume from the last completed checkpoint rather than restarting from scratch. This is one advantage of the disposable pattern — failure is expected and built into the architecture rather than treated as an exceptional case.
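Checkpoint-based resumption can be sketched with a JSON file standing in for durable task state. This is an illustration under stated assumptions: `run_with_checkpoints`, the file layout, and the step functions are hypothetical, and the Goal Operator's actual recovery mechanism is not shown here.

```python
import json
import tempfile
from pathlib import Path


def run_with_checkpoints(steps, checkpoint_path: Path) -> list[dict]:
    """Run (name, fn) steps in order, skipping those already recorded on disk.

    Each fn receives the list of completed step records and returns a
    JSON-serialisable dict.
    """
    done = []
    if checkpoint_path.exists():
        done = json.loads(checkpoint_path.read_text())
    for name, fn in steps[len(done):]:
        output = fn(done)
        done.append({"step": name, "output": output})
        # Persist after every step so a replacement pod can resume from here.
        checkpoint_path.write_text(json.dumps(done))
    return done


if __name__ == "__main__":
    steps = [
        ("determine_intent", lambda done: {"intent": "fix failing CI"}),
        ("check_ci_status", lambda done: {"ci": "failing"}),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "task-checkpoint.json"
        run_with_checkpoints(steps[:1], path)      # first pod stops after one step
        trail = run_with_checkpoints(steps, path)  # replacement pod resumes at step two
        print([entry["step"] for entry in trail])
```

The sketch assumes steps are deterministic in order and that the checkpoint lives somewhere that outlasts the pod, such as a mounted volume or external store.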
Why shouldn't I trust AI-generated PR descriptions when reviewing with agents?
Most AI-generated PR descriptions are low-signal — they describe what the code changes superficially rather than explaining intent or justification. Treating them as ground truth causes the reviewing agent to misjudge whether the PR is the best possible fix. The SOP mandates that the agent independently determines intent by reading the actual code diff, then compares that to the PR description. Even slop PRs provide valuable data about where the codebase is confusing or broken.
How does this framework handle agent slop or low-quality agent output?
The framework addresses slop through structural constraints: SOPs with explicit step boundaries prevent unchecked agent drift, shallow-bug-vs-fundamental-refactor classification prevents agents from making design decisions, structured JSON outputs at each step enable quality gates, and escalation breakpoints route uncertain decisions to humans. The key insight is that slop comes from asking agents to design, not from asking them to execute. Constrain agents to mechanical judgment within well-defined SOPs and quality stays high.
// Comparisons
How does the disposable agent pattern differ from persistent agent sessions?
Persistent agent sessions maintain state across tasks and require session management, garbage collection, and conflict resolution when multiple users share one instance. Disposable agents spin up a fresh Kubernetes pod per task with a clean environment, execute the task, and are torn down. This trades resource efficiency for simplicity and isolation — no state leakage between tasks, no contention between users, and a dramatically simpler failure model. Each agent gets a full computer, not a sandbox.
How does this compare to using Argo Workflows directly for agent tasks?
Argo Workflows orchestrates raw containers with DAG-structured task dependencies. The Solmaz framework uses the same DAG-structured workflow model but drives interactive harness sessions (Codex, Claude Code) rather than static containers. Agents need real-time context, file editing, and iterative loops — capabilities that static container steps cannot provide. ACPX bridges this gap by embedding Argo-like workflow semantics into an agent-aware execution environment with ACP standardisation.
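The DAG-structured model itself can be illustrated with Python's standard `graphlib` module. The step names and dependencies below are hypothetical, and the sketch only computes a valid execution order; it does not drive any harness session.

```python
from graphlib import TopologicalSorter

# Each key depends on the steps in its value set (hypothetical SOP steps).
SOP_DAG = {
    "determine_intent": set(),
    "check_ci_status": set(),
    "evaluate_fix": {"determine_intent", "check_ci_status"},
    "post_result": {"evaluate_fix"},
}


def execution_order(dag: dict[str, set[str]]) -> list[str]:
    """Return one ordering in which every step runs after its dependencies."""
    return list(TopologicalSorter(dag).static_order())
```

In this example, `determine_intent` and `check_ci_status` have no dependencies on each other, so a workflow engine would be free to run them in parallel before `evaluate_fix`.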
Is it wasteful to spin up a full Kubernetes pod for every single task?
Yes, it is deliberately wasteful in resource terms — and that is the correct tradeoff. The framework explicitly accepts this cost because giving an agent a full computer is dramatically more powerful than constrained sandboxes. Sandbox limitations create edge cases that consume engineering time fixing agent failures. Pod-per-task also provides clean isolation, simple failure handling, and no state leakage. The resource cost is predictable and linear; the engineering cost of working around sandbox limitations is unpredictable and compounds.
// Advanced
Why give each agent a full Kubernetes pod instead of a lightweight container?
Giving each agent a full compute environment is deliberately wasteful in resource terms but provides a fundamentally better abstraction. Agents perform better when they have unrestricted access to a complete operating system — they can install packages, run arbitrary commands, execute tests, and manage files without hitting sandbox limitations. The cost per pod is the price of agent capability. Constrained sandboxes create edge cases that consume more engineering time than the infrastructure savings justify.
Can I use this framework without Kubernetes?
Kubernetes is the recommended infrastructure target because the Goal Operator pattern relies on pod lifecycle management, Helm charts, and container orchestration primitives. However, the conceptual framework — disposable environments per task, ACP standardisation, SOP-driven workflows, concierge pattern — could be adapted to other container orchestration platforms or even VM-based systems. You would lose the operator abstraction and need to rebuild lifecycle management, which significantly increases engineering overhead.
How many parallel agent channels can one person realistically monitor?
The practical range is one to five channels at a time, based on field experience with the framework. Each channel runs one agent on one task, so monitoring five channels means overseeing five concurrent tasks. The human's role is to handle escalations when agents hit fundamental-refactor boundaries, not to watch every step. As SOPs mature and escalation rates drop, you can increase parallelism. The bottleneck is human attention for design decisions, not infrastructure.
Can agents call other agents in this framework?
Yes. Although ACP is primarily designed for human-to-agent communication, agents can also use it to call other agents. ACPX enables this from the command line: any agent can invoke any other agent via ACP. This allows composition, such as a concierge agent dispatching specialist agents or a review agent spawning a test-runner agent. The key constraint is avoiding the telephone game: route instructions directly through ACP rather than having one LLM paraphrase them for another.