Frequently Asked Questions About Lou Bichard's Software Factory Primitives Framework

22 answers covering everything from basics to advanced usage.

// Basics

What are the four primitives in the Software Factory Primitives Framework?

The four primitives are Runtime (where agents execute — threads, containers, VMs, or dev environments), Orchestration (scaling agents up and down horizontally), Triggers (events that bring agents online — webhooks, PR events, ticket creation), and Coordination (how agents interact, pick up tasks, gate progress, and collaborate). The first three are largely solved by existing infrastructure. Coordination is almost always the missing primitive and the primary bottleneck to building a true software factory.

What is the difference between a software factory and just running multiple coding agents?

A software factory incrementally removes the human from the SDLC loop so work flows to production autonomously. Running multiple coding agents in parallel with a human orchestrating each one is not a software factory — it is parallel-agent assistance that still requires constant human initiation and oversight. The distinction matters because the infrastructure requirements are fundamentally different: a factory needs coordination, triggers, and gated micro-steps, while parallel agents only need runtime and orchestration.

What does context rot look like in practice with coding agents?

Context rot manifests as agents progressively forgetting their goals partway through a task, skipping steps they were instructed to complete, producing lower-quality output as the conversation lengthens, or claiming they completed work they actually skipped. You'll notice it most when agents are executing multi-step tasks — they start strong but degrade. The fix is not a bigger context window but Harness Engineering: encoding guardrails into the repository so the agent can be re-grounded at each micro-step.

What is agents.md and how does it fit into the framework?

Agents.md is a file in your repository that provides instructions, context, and guidelines specifically for coding agents. It is a primary tool of Harness Engineering — when you identify that an agent gets lost or makes mistakes in a specific area, you encode that knowledge into agents.md so the agent reads it at the start of each task. It can include coding standards, architectural decisions, step-by-step procedures, and domain-specific rules. It is one of several harness materials alongside skills, context files, and unit tests.

What is a CLI gateway and how does an agent use it?

A CLI gateway is a command-line tool that a locally running coding agent — such as Claude Code or a Cursor-based agent — invokes as a tool call during execution. The agent calls the CLI at each micro-step boundary to ask 'Have I completed this stage? May I proceed to the next?' The CLI checks completion criteria (tests pass, files created, etc.) and returns a gate signal: proceed or halt. This gives local agents access to an external coordination authority without requiring a full server-side orchestration system.
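
As a rough illustration, a gateway of this kind could be sketched in Python as follows. The `gate` command name, the `REQUIRED_FILES` table, and the file names in it are assumptions made for this example, not part of the framework; a real gateway would check whatever completion criteria your pipeline defines.

```python
import argparse
from pathlib import Path
from typing import Optional

# Hypothetical completion criteria: files that must exist after each micro-step.
REQUIRED_FILES = {
    "plan": ["TASKS.md"],
    "build": ["src/main.py"],
}


def check_step(step: str, root: Path) -> tuple[bool, list[str]]:
    """Return (passed, problems) for one micro-step boundary."""
    if step not in REQUIRED_FILES:
        return False, [f"unknown step: {step}"]
    missing = [name for name in REQUIRED_FILES[step] if not (root / name).exists()]
    return not missing, [f"missing file: {name}" for name in missing]


def main(argv: Optional[list[str]] = None) -> int:
    """Print the gate signal and return an exit code (0 = proceed, 1 = halt)."""
    parser = argparse.ArgumentParser(prog="gate")
    parser.add_argument("step", help="micro-step the agent claims to have finished")
    parser.add_argument("--root", default=".", help="repository root to inspect")
    args = parser.parse_args(argv)

    passed, problems = check_step(args.step, Path(args.root))
    print("proceed" if passed else "halt: " + "; ".join(problems))
    return 0 if passed else 1

# Entry-point wiring (for example sys.exit(main())) is left out so the
# sketch stays importable.
```

Under these assumptions, an agent would invoke something like `gate plan --root .` as a tool call and treat a non-zero exit code as a halt signal. The check reads the filesystem rather than trusting the agent's own claim of completion.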

// How To

How do I audit my current agent setup using this framework?

Assess each of the four primitives in turn and mark it as solved, partial, or missing. Runtime: can agents execute code in an isolated environment? Orchestration: can you spin agents up and down horizontally? Triggers: can events bring agents online without human initiation? Coordination: can agents gate their own progress, hand off tasks, and collaborate? Coordination will almost always be the gap. This audit gives you a precise diagnosis before you build anything new.
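
One way to record such an audit is a small lookup table. The sketch below is only an illustration: the `Status` enum, the `find_gaps` helper, and the example values are hypothetical, not something the framework specifies.

```python
from enum import Enum


class Status(Enum):
    SOLVED = "solved"
    PARTIAL = "partial"
    MISSING = "missing"


# The four primitives, in the order the framework lists them.
PRIMITIVES = ("runtime", "orchestration", "triggers", "coordination")


def find_gaps(audit: dict[str, Status]) -> list[str]:
    """Return the primitives that are not fully solved, in framework order."""
    unassessed = [p for p in PRIMITIVES if p not in audit]
    if unassessed:
        raise ValueError(f"not assessed: {unassessed}")
    return [p for p in PRIMITIVES if audit[p] is not Status.SOLVED]


# Example values are made up for illustration.
example_audit = {
    "runtime": Status.SOLVED,
    "orchestration": Status.SOLVED,
    "triggers": Status.PARTIAL,
    "coordination": Status.MISSING,
}
gaps = find_gaps(example_audit)
```

Refusing to report gaps until all four primitives have been assessed keeps the audit from silently skipping one.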

How do I decompose the SDLC into micro-steps for agents?

Take each coarse stage in scope (Plan, Build, Test, Review, Deploy) and list every discrete sub-action an agent must execute. For example, Plan decomposes into: parse spec, identify affected components, write task list, validate task dependencies. Build decomposes into: scaffold files, implement each component, wire integrations. Write these out as a linear, ordered sequence of steps rather than as high-level boxes on a diagram. Each micro-step becomes a node in your coordination layer's state machine, with a gate on each transition.
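
The Plan and Build examples above can be written down as data. This is a minimal sketch; the `linearize` and `next_step` helpers are hypothetical names chosen for illustration.

```python
from typing import Optional

# Micro-steps taken from the Plan and Build examples in the answer above.
SDLC_MICRO_STEPS = {
    "plan": [
        "parse spec",
        "identify affected components",
        "write task list",
        "validate task dependencies",
    ],
    "build": [
        "scaffold files",
        "implement each component",
        "wire integrations",
    ],
}


def linearize(stages: dict[str, list[str]], order: list[str]) -> list[tuple[str, str]]:
    """Flatten the stages into one ordered sequence of (stage, micro_step) pairs."""
    return [(stage, step) for stage in order for step in stages[stage]]


def next_step(
    sequence: list[tuple[str, str]], current: tuple[str, str]
) -> Optional[tuple[str, str]]:
    """Return the micro-step that follows `current`, or None at the end."""
    index = sequence.index(current)
    return sequence[index + 1] if index + 1 < len(sequence) else None


sequence = linearize(SDLC_MICRO_STEPS, ["plan", "build"])
```

Each adjacent pair in `sequence` is a transition where a gate would sit.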

How do I implement machine-checkable gates between micro-steps?

Define objective pass/fail criteria at each micro-step boundary that do not depend on agent self-report. Examples: tests pass (run test suite, check exit code), lint clean (run linter, zero errors), spec validated (schema check against spec file), code compiles (build succeeds), coverage threshold met. For steps where machine checks are harder — like plan quality — use a second agent as a reviewer or validate against a structured template. The goal is to prevent sycophantic false completion signals.
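
A gate based on a process exit code might look like the sketch below. The `run_gate` helper and the `EXAMPLE_GATES` table are assumptions for illustration; substitute the test, lint, or build commands your project actually uses.

```python
import subprocess
import sys


def run_gate(command: list[str], timeout: float = 600.0) -> bool:
    """Pass only if the command runs to completion and exits with status 0."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # A hung or missing tool counts as a failed gate, never as a pass.
        return False
    return result.returncode == 0


# Hypothetical gate definitions; the command is an example, not a requirement.
EXAMPLE_GATES = {
    "tests pass": [sys.executable, "-m", "pytest", "-q"],
}
```

The verdict comes from the exit code of an external process, so the agent's own report of success plays no part in it.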

How do I choose between the CLI gateway and state machine approaches for coordination?

Use a CLI gateway when agents run locally (e.g., Claude Code, Cursor) and need to query an external authority mid-task — the agent invokes a tool call to ask 'May I proceed?' Use a state machine or workflow graph when you are orchestrating multiple agents server-side and need a central representation of pipeline state. Use durable execution when the process must survive interruptions and retries. In practice, many setups combine a state machine backend with a CLI gateway that agents invoke.
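
For the state machine option, a minimal in-memory sketch is shown below. The `Pipeline` class and its method names are hypothetical, and the gates are stand-in lambdas rather than real checks. A production setup would persist this state and might sit behind a CLI gateway.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pipeline:
    """Linear state machine: one node per micro-step, one gate per transition."""

    steps: list[str]
    gates: dict[str, Callable[[], bool]]
    position: int = 0

    @property
    def current(self) -> Optional[str]:
        return self.steps[self.position] if self.position < len(self.steps) else None

    def request_advance(self) -> str:
        """Answer the agent's 'May I proceed?' with proceed, halt, or done."""
        step = self.current
        if step is None:
            return "done"
        gate = self.gates.get(step)
        # A step with no registered gate is treated as a failure, not a pass.
        if gate is None or not gate():
            return "halt"
        self.position += 1
        return "done" if self.current is None else "proceed"


pipeline = Pipeline(
    steps=["scaffold files", "implement component", "wire integrations"],
    gates={
        "scaffold files": lambda: True,        # stand-in for a real check
        "implement component": lambda: False,  # simulated failing gate
    },
)
first = pipeline.request_advance()   # gate passes, pipeline moves to the next step
second = pipeline.request_advance()  # gate fails, position is unchanged
```

Because the position only changes inside `request_advance`, the pipeline state stays authoritative even if an agent claims to be further along.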

// Troubleshooting

Why do agents skip steps even when I give them detailed instructions?

Agents skip steps primarily due to context rot — as the context window fills, earlier instructions get deprioritized or forgotten. Agents also exhibit sycophantic behavior, claiming completion to satisfy the perceived intent rather than following every step. The fix is twofold: decompose instructions into explicit micro-steps with machine-checkable gates (so skipping is caught), and apply Harness Engineering to encode guardrails into the repo so the agent gets re-grounded at each step rather than relying on a long initial prompt.

My agents are creating too much noise in GitHub PRs and Linear tickets — how do I fix this?

This is the 'reusing human tools for agents' antipattern. GitHub and Linear were designed for human coordination, not for the volume of activity that agents generate. Build a purpose-built coordination layer that tracks agent state internally. Surface to GitHub or Linear only the final outputs that humans need to see, such as the completed PR and the summary status. Use the coordination layer's own interface to show agent progress, and design the human UX to highlight only where intervention is needed, not everything that is happening.

My agents keep losing track of what they're supposed to do halfway through a task. How do I fix this?

This is context rot. The agent's performance degrades as its context window fills with conversation history, code, and intermediate outputs. Identify the specific micro-step where the agent drifts by running it through your pipeline and observing where quality drops. Then encode the missing context — goal reminders, step checklists, validation criteria — back into the repository via agents.md files, skills, or context files. This is the core Harness Engineering loop: run, identify failure, encode fix, repeat.

// Comparisons

How does this framework compare to using LangGraph or CrewAI for multi-agent coordination?

LangGraph and CrewAI are agent orchestration frameworks — they help you wire agents together programmatically. The Software Factory Primitives Framework operates at a higher level: it diagnoses which infrastructure primitive you're missing and prescribes the right coordination pattern for your SDLC. You might use LangGraph to implement the coordination layer this framework prescribes, but without the framework's diagnosis, micro-step decomposition, and Harness Engineering practices, an orchestration framework alone won't prevent context rot or step-skipping at scale.

How does this framework compare to just using CI/CD pipelines for agent automation?

CI/CD pipelines handle the build-test-deploy stages well but are not designed for the full SDLC loop including planning, implementation, and review. They also lack the agent-specific coordination primitives this framework prescribes — gated micro-steps, context management, and inter-agent hand-off. CI/CD is part of the Triggers primitive (a PR merge can trigger a deploy agent), but it does not solve the Coordination primitive, which is the gap this framework specifically addresses.

How does VM isolation compare to container isolation for coding agents?

Containers are not a bulletproof security boundary — container escapes are a known attack vector, and on shared Kubernetes clusters, containers create noisy-neighbour compute contention that degrades agent performance. VMs provide hardware-level isolation via hypervisors, making them the baseline for secure agent execution at scale. The framework prescribes VMs or full dev environments for proper development tasks where agents have write access to code and can execute arbitrary commands. Containers may suffice only for simple, stateless tasks.

// Advanced

Can I use this framework if I only have one coding agent, not a swarm?

Yes. Even a single coding agent benefits from this framework's micro-step decomposition and gating. A single agent working through a multi-step task will still experience context rot and step-skipping. Decomposing its workflow into explicit micro-steps with machine-checkable gates catches failures early. As you scale to multiple agents, the coordination layer and swarm/fleet/events patterns become essential, but the diagnostic and Harness Engineering practices apply at any scale.

How do I apply Harness Engineering in practice?

Start by running your agent through the pipeline and recording where it fails, drifts, or skips steps. For each failure, encode the fix into the repository: add a rule to agents.md, create a context file with domain knowledge, write a unit test that catches the specific failure mode, or add a skill definition. Then re-run the agent and verify the failure is resolved. This is an iterative loop — each cycle makes the repository smarter and the agent more reliable. The goal is that the repo itself becomes the agent's ongoing tutor.
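
The "encode the fix" step can be as simple as appending a rule to agents.md. The sketch below assumes rules are stored as a plain bullet list, which is a formatting choice made for this example; `encode_rule` is a hypothetical helper.

```python
from pathlib import Path


def encode_rule(repo_root: Path, rule: str, filename: str = "agents.md") -> bool:
    """Append a rule to the harness file; return False if it was already there."""
    path = repo_root / filename
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    line = f"- {rule.strip()}"
    if line in existing.splitlines():
        return False  # already encoded, keep the file free of duplicates
    # Make sure the new rule starts on its own line.
    separator = "" if (not existing or existing.endswith("\n")) else "\n"
    path.write_text(existing + separator + line + "\n", encoding="utf-8")
    return True
```

After encoding a rule, re-run the agent on the same task to confirm that the failure no longer occurs.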

How do I know if my coding pipeline is ready for the fleet pattern?

You are ready for the fleet pattern when you have a well-tested coordination layer and gated micro-steps working reliably for a single repository, and you need to apply the same automated workflow across many repositories — for example, CVE remediation, dependency updates, test coverage enforcement, or policy compliance. Prerequisites: your coordination layer must be parameterized per-repo, your runtime must support spinning up isolated VMs per repo, and your trigger system must handle webhooks or schedules that initiate fleet-wide runs.
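
Per-repo parameterization might be sketched as below. `RepoRun`, `plan_fleet_run`, the default test command, and the repository names in the usage note are all hypothetical. Provisioning the isolated VM for each run is out of scope for this example.

```python
from dataclasses import dataclass

# Assumed default; each repository can override it.
DEFAULT_TEST_COMMAND = ("pytest", "-q")


@dataclass(frozen=True)
class RepoRun:
    repo: str
    workflow: str
    test_command: tuple[str, ...]


def plan_fleet_run(
    workflow: str,
    repos: list[str],
    overrides: dict[str, list[str]],
) -> list[RepoRun]:
    """Build one parameterized run per repository for a fleet-wide workflow."""
    runs = []
    # De-duplicate and sort so each repo gets exactly one run, in a stable order.
    for repo in sorted(set(repos)):
        command = overrides.get(repo, DEFAULT_TEST_COMMAND)
        runs.append(RepoRun(repo=repo, workflow=workflow, test_command=tuple(command)))
    return runs
```

For instance, `plan_fleet_run("dependency-update", ["org/a", "org/b"], {"org/b": ["npm", "test"]})` would yield two runs that share a workflow but use different gate commands.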

What happens when a gate fails and the agent can't proceed?

When a gate fails, the coordination layer should halt the agent's progression on that micro-step and surface the failure to the human oversight interface. The agent should not be allowed to retry indefinitely or self-approve. Depending on your design, the response may be: route to a human reviewer, trigger a different agent to fix the issue, or log the failure for Harness Engineering. The key principle is that gate failures are signals to improve the harness, not just obstacles to retry past.
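
A bounded-retry policy with escalation could be sketched as follows. The `MAX_ATTEMPTS` limit, the `GateTracker` class, and the returned action names are assumptions for illustration; your design may route failures differently.

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 3  # hypothetical limit; tune it to your pipeline


@dataclass
class GateTracker:
    failures: dict[str, int] = field(default_factory=dict)
    harness_log: list[str] = field(default_factory=list)

    def record_failure(self, step: str, reason: str) -> str:
        """Count the failure, log it for Harness Engineering, and pick a response."""
        count = self.failures.get(step, 0) + 1
        self.failures[step] = count
        # Every failure is kept as input for improving the harness.
        self.harness_log.append(f"{step}: {reason}")
        return "retry" if count < MAX_ATTEMPTS else "escalate_to_human"
```

The decision is made by the tracker, not by the agent, so the agent can neither retry without limit nor approve itself.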

How do I design the human oversight UX for a software factory?

Design the UX so humans are on-the-loop, not in-the-loop. Show aggregate pipeline state: which agents are running, which micro-step each is on, and where gates have failed. Do not route all agent activity through human-facing tools like GitHub notifications. Surface only moments that require intervention — gate failures, security flags, ambiguous specs. The coordination layer's state becomes the source of truth for the UX, not the noise in existing developer tools.
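
The data behind such a view might be shaped like the sketch below. The status names and the `AgentState` record are hypothetical; in practice they would come from your coordination layer's state.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical statuses that call for a human decision.
NEEDS_HUMAN = frozenset({"gate_failed", "security_flag", "ambiguous_spec"})


@dataclass(frozen=True)
class AgentState:
    agent_id: str
    micro_step: str
    status: str


def aggregate(states: list[AgentState]) -> dict[str, int]:
    """Aggregate pipeline view: how many agents are in each status."""
    return dict(Counter(state.status for state in states))


def needs_intervention(states: list[AgentState]) -> list[AgentState]:
    """Only the agents a human has to look at."""
    return [state for state in states if state.status in NEEDS_HUMAN]
```

The aggregate counts feed the overview, while only the filtered list should generate notifications.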

What security risks increase when I scale coding agents to fleet level?

Fleet-scale automation increases the attack surface significantly: compromised agents could push malicious code across hundreds of repositories, leaked credentials could propagate, and supply chain attacks are amplified. VM isolation is the baseline: each agent runs in its own VM with restricted permissions. Beyond that, audit which repos each agent can access, what credentials agents hold, whether agents can escalate privileges, and what monitoring detects anomalous agent behavior. Security is a prerequisite for further human removal, not an afterthought.

Can I use this framework with non-coding agents like documentation or DevOps agents?

Yes, the four primitives and micro-step decomposition apply to any SDLC-adjacent workflow. Documentation agents benefit from gated micro-steps (outline → draft → review → publish) and Harness Engineering (style guides encoded in agents.md). DevOps agents benefit from the events pattern (infrastructure alerts trigger remediation agents) and VM isolation (agents modifying infrastructure need secure execution). The coordination layer prevents these agents from skipping verification steps just as it does for coding agents.