Question 1

What exactly is a software factory in the context of AI coding agents?

Accepted Answer

A software factory is the commitment to incrementally moving the human out of the SDLC loop so that work flows from development into production autonomously. The human is not proactively interacting with a computer to drive each step. This is distinct from running multiple agents in parallel while a human still orchestrates — that is parallel-agent assistance, not a software factory.

Question 2

What is Harness Engineering and how is it different from prompt engineering?

Accepted Answer

Harness Engineering is the practice of encoding process knowledge back into the repository — via agents.md, skills files, context files, and unit tests — so agents receive structured feedback at runtime. Unlike prompt engineering, which focuses on crafting individual prompts, Harness Engineering creates persistent, repository-level infrastructure that improves agent performance across all future runs and survives context window resets.

Question 3

What is a CLI gateway for agent coordination?

Accepted Answer

A CLI gateway is one form factor for the coordination layer: a locally running agent such as Claude Code invokes a CLI tool as a tool call to ask, 'Have I completed this SDLC micro-step? May I proceed to the next?' The gateway checks machine-verifiable conditions (tests pass, lint clean, spec validated) and returns a gate signal, so the agent cannot advance on a self-reported completion that is false.
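
As a rough illustration, a gateway of this kind could look like the sketch below. The `CHECKS` registry, the step name `implement-component`, and the `gate` function are all hypothetical, and the commands are placeholders standing in for whatever test, lint, and spec tools a project actually uses.

```python
import subprocess
import sys

# Hypothetical registry: micro-step name -> commands that must all exit 0.
# The commands are placeholders; substitute real test, lint, and spec tools.
CHECKS = {
    "implement-component": [
        [sys.executable, "-c", "print('test suite would run here')"],
        [sys.executable, "-c", "print('linter would run here')"],
    ],
}


def gate(step, checks=CHECKS):
    """Return 0 (proceed) if every check for `step` exits 0, else 1 (block)."""
    commands = checks.get(step)
    if commands is None:
        print(f"BLOCK: unknown micro-step {step!r}")
        return 1
    for command in commands:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"BLOCK: check exited with code {result.returncode}")
            return 1
    print(f"PROCEED: {step} verified")
    return 0
```

Wired to `sys.argv` and `sys.exit`, the agent would call something like `python gate.py implement-component` and read the exit code. The point of the design is that the verdict comes from the commands' exit codes, never from anything the agent says about its own work.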

Question 4

What are SDLC micro-steps and why do agents need them?

Accepted Answer

Micro-steps are the granular, discrete sub-actions within each coarse SDLC stage (plan, build, test, review, deploy). Agents do not respect coarse-grained SDLC boxes — they skip steps, lose context, and fail to proceed deterministically. Decomposing each stage into explicit micro-steps with gates between them gives agents a clear sequence to follow and provides machine-checkable boundaries to verify completion.

Question 5

How do I audit my current agent setup using the four primitives?

Accepted Answer

For each primitive (Runtime, Orchestration, Triggers, Coordination), mark it as solved, partial, or missing in your current setup. Runtime: do your agents have somewhere to execute? Orchestration: can you scale agents up and down? Triggers: do events bring agents online? Coordination: can agents hand off work and gate progress? Coordination is usually the gap.
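
One way to make the audit concrete is to record it as data. This is only a sketch: the `Status` enum and the `find_gaps` helper are hypothetical names, not part of the framework.

```python
from enum import Enum


class Status(Enum):
    SOLVED = "solved"
    PARTIAL = "partial"
    MISSING = "missing"


PRIMITIVES = ("Runtime", "Orchestration", "Triggers", "Coordination")


def find_gaps(audit):
    """Return the primitives that are not solved, missing ones first."""
    if set(audit) != set(PRIMITIVES):
        raise ValueError("audit must rate exactly the four primitives")
    priority = {Status.MISSING: 0, Status.PARTIAL: 1}
    gaps = [p for p in PRIMITIVES if audit[p] is not Status.SOLVED]
    # sorted() is stable, so ties keep the order of PRIMITIVES.
    return sorted(gaps, key=lambda p: priority[audit[p]])
```

For a setup with a solved runtime and triggers, partial orchestration, and no coordination, `find_gaps` would list Coordination first, then Orchestration.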

Question 6

How do I decompose the SDLC into micro-steps for agents?

Accepted Answer

Take each coarse stage in scope — e.g., Plan — and list every discrete action an agent must execute. Plan becomes: parse spec, identify affected components, generate task list, validate task dependencies. Build becomes: scaffold files, implement component A, implement component B, integrate. Write these as an ordered sequence, not a box. This sequence becomes the backbone of your coordination layer.
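
The ordered sequence above can be written down as data. In this sketch the `MicroStep` type and `next_step` helper are hypothetical; the step names are the ones from the answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroStep:
    stage: str
    name: str


# The Plan and Build examples from the answer, as one ordered sequence.
PIPELINE = [
    MicroStep("plan", "parse spec"),
    MicroStep("plan", "identify affected components"),
    MicroStep("plan", "generate task list"),
    MicroStep("plan", "validate task dependencies"),
    MicroStep("build", "scaffold files"),
    MicroStep("build", "implement component A"),
    MicroStep("build", "implement component B"),
    MicroStep("build", "integrate"),
]


def next_step(completed):
    """Return the first micro-step not yet completed, or None when done."""
    for step in PIPELINE:
        if step.name not in completed:
            return step
    return None
```

Because `next_step` always returns the earliest unfinished step, an agent that jumps ahead is sent back to the step it skipped.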

Question 7

How do I implement machine-checkable gates between agent steps?

Accepted Answer

At each micro-step boundary, define what constitutes pass/fail before the agent proceeds. Use machine-checkable criteria: tests pass, linter reports zero errors, spec validation succeeds, required files exist, coverage threshold met. Avoid relying on agent self-report — agents exhibit sycophantic behavior and will claim completion without finishing. Gates should be automated checks the coordination layer verifies independently.
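
A gate of this kind might be sketched as follows. The function names are hypothetical, and the booleans for test and lint results are assumed to come from running the real tools, not from the agent.

```python
from pathlib import Path


def missing_files(root, required):
    """Return the required files that do not exist under `root`."""
    return [name for name in required if not (Path(root) / name).is_file()]


def coverage_met(measured_percent, threshold_percent):
    """True when measured coverage reaches the threshold."""
    return measured_percent >= threshold_percent


def evaluate_gate(criteria):
    """criteria maps a criterion name to a bool; return (passed, failed names)."""
    failed = [name for name, ok in criteria.items() if not ok]
    return (not failed, failed)
```

The coordination layer would fill `criteria` from its own measurements and let the agent proceed only when `passed` is true; the list of failed names gives the agent something specific to fix.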

Question 8

How do I choose between swarm, fleet, and events patterns?

Accepted Answer

Match pattern to scale: single repo, single task → Swarm (one intent, sub-agents, results funnel to one PR). Cross-repo at org scale → Fleet (agents fan across repositories on schedules or triggers). Background autonomous work → Events (webhooks trigger agents without human initiation). Most mature setups combine all three. Start with the pattern matching your immediate scale target and add others as needs evolve.
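
The matching rule can be reduced to a small function. This is a deliberate simplification: `choose_patterns` and its two parameters are hypothetical, and real setups weigh more factors than repository count and whether work runs in the background.

```python
def choose_patterns(repo_count, runs_in_background):
    """Suggest starting patterns for a given scale target."""
    if repo_count < 1:
        raise ValueError("repo_count must be at least 1")
    # Single repo -> swarm; anything cross-repo -> fleet.
    patterns = ["swarm"] if repo_count == 1 else ["fleet"]
    # Work that starts without a human initiating it adds the events pattern.
    if runs_in_background:
        patterns.append("events")
    return patterns
```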

Question 9

How do I set up human oversight without creating a bottleneck?

Accepted Answer

Design humans to be on-the-loop (able to see state and intervene when needed) but not in-the-loop (not required to drive each step). Build visibility into sub-agent activity — parent agent status, task completion lists, gate pass/fail signals — but route this through your coordination layer, not through GitHub or Linear. The UX should show where to intervene, not everything that is happening.

Question 10

My agents keep skipping steps even though I told them what to do — what's going wrong?

Accepted Answer

This is context rot combined with a missing coordination layer. As the context window fills, agents lose track of goals and skip steps. The fix is twofold: first, decompose your SDLC into explicit micro-steps with machine-checkable gates so agents cannot proceed without verification. Second, apply Harness Engineering — identify where the agent drifts and encode fixes back into agents.md, context files, and tests in the repository.

Question 11

Why is my GitHub project board so noisy after adding coding agents?

Accepted Answer

GitHub was designed for human coordination, not agent-to-agent communication. When agents create PRs, comments, and status updates through GitHub, the signal-to-noise ratio collapses and humans cannot tell when to intervene. The solution is to build a purpose-built coordination layer for agents and surface only actionable state to humans through a separate interface designed for on-the-loop oversight.

Question 12

My agents claim they've written all the tests but some are missing — how do I prevent this?

Accepted Answer

This is agent sycophancy — agents claiming completion without actually finishing. Never rely on agent self-report for gate decisions. Implement machine-checkable gates: run the test suite, verify coverage thresholds, check that test files exist for each component. Your coordination layer should independently verify completion criteria before allowing the agent to proceed to the next micro-step.
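
The "test file exists for each component" check could be sketched like this. It assumes a naming convention of `src/foo.py` paired with `tests/test_foo.py`; that convention and the `missing_tests` name are assumptions, so adjust them to your repository layout.

```python
from pathlib import PurePosixPath


def missing_tests(components, test_files):
    """Return components with no test file named test_<component>.py."""
    present = {PurePosixPath(path).name for path in test_files}
    return [
        component
        for component in components
        if f"test_{PurePosixPath(component).stem}.py" not in present
    ]
```

An empty result is only one gate criterion. A test file that exists can still be empty or failing, so this check belongs alongside running the suite and checking coverage.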

Question 13

Agents work fine on small tasks but fail on larger ones — what's happening?

Accepted Answer

This is context rot. As the context window fills during longer tasks, agents degrade — they lose track of goals, skip steps, and produce lower-quality output. The solution is to decompose large tasks into smaller micro-steps with explicit gates, apply Harness Engineering to encode process knowledge into the repository, and ensure agents can reset context between micro-steps while retaining structured state through the coordination layer.

Question 14

How does the Software Factory Primitives Framework compare to just using Devin or similar autonomous coding tools?

Accepted Answer

Autonomous coding tools like Devin provide a runtime and basic orchestration but typically lack a proper coordination layer for multi-step, multi-agent SDLC workflows. The Software Factory Primitives Framework provides the diagnostic structure to identify what is missing regardless of which tool you use. It prescribes micro-step decomposition, machine-checkable gates, and Harness Engineering — infrastructure layers that sit above any individual agent tool.

Question 15

How does this framework compare to using an AI workflow tool like n8n or LangGraph for agent orchestration?

Accepted Answer

n8n and LangGraph can serve as implementation substrates for the coordination layer, particularly in its state machine / workflow graph form factor. However, they are general-purpose tools. The Software Factory Primitives Framework provides the SDLC-specific mental model: you still need to decompose stages into micro-steps, define machine-checkable gates, implement Harness Engineering, and choose swarm/fleet/events patterns. The framework tells you what to build; tools like n8n tell you how to wire it.

Question 16

How is a software factory different from a CI/CD pipeline?

Accepted Answer

A CI/CD pipeline automates build, test, and deploy steps but assumes a human writes code and creates the PR. A software factory automates the entire SDLC — including plan, build, test, review, and deploy — with agents performing the work autonomously. CI/CD is one component of a software factory's runtime and triggers infrastructure, but it lacks the coordination layer needed for agents to hand off work and gate progress across the full lifecycle.

Question 17

Can I use the Software Factory Primitives Framework with Claude Code or Cursor as my agent?

Accepted Answer

Yes. Claude Code and Cursor serve as the agent runtime — they are where agents execute code. The framework sits above any specific agent tool: you audit your four primitives, decompose your SDLC into micro-steps, build a coordination layer (e.g., a CLI gateway that Claude Code invokes as a tool call), and apply Harness Engineering to encode process knowledge into the repo. The framework is agent-tool agnostic.

Question 18

How do I handle security when scaling agents to fleet-level operations?

Accepted Answer

Start with VM isolation as the baseline: containers share the host kernel, so the framework treats them as insufficient on their own for secure agent execution. Then audit agent permissions: which repositories the fleet can access, which credentials agents hold, and what happens if an agent is compromised. Implement least-privilege access, rotate credentials, and log all agent actions. Security is a prerequisite for moving humans further out of the loop, not an afterthought you add after scaling.

Question 19

What does the coordination layer look like for a fleet pattern across hundreds of repos?

Accepted Answer

For fleet operations, the coordination layer manages agent state across repositories. It tracks which repos have been processed, which micro-steps each agent has completed, and which gates have passed or failed. A state machine per repo instance works well — each repo gets its own coordination graph triggered by an event (e.g., CVE published). The coordination layer aggregates state for human visibility: how many repos processed, how many gates failed, where intervention is needed.
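
The aggregation step might look like the sketch below. `RepoState` and `summarize` are hypothetical names, and a real coordination layer would load this state from durable storage; here it is held in memory.

```python
from dataclasses import dataclass, field


@dataclass
class RepoState:
    repo: str
    completed_steps: list = field(default_factory=list)
    failed_gates: list = field(default_factory=list)
    done: bool = False


def summarize(states):
    """Collapse per-repo state into the few numbers a human needs to see."""
    return {
        "repos_total": len(states),
        "repos_processed": sum(1 for s in states if s.done),
        "gates_failed": sum(len(s.failed_gates) for s in states),
        "needs_intervention": sorted(s.repo for s in states if s.failed_gates),
    }
```

The summary deliberately leaves out per-step detail: across hundreds of repos, the human view needs counts and a list of where to look, not a full activity log.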

Question 20

How do I decide what to encode into agents.md vs context files vs unit tests?

Accepted Answer

Use agents.md for persistent process instructions — SDLC micro-steps, coding standards, workflow rules. Use context files for task-specific information that changes per run — specs, architectural decisions, component relationships. Use unit tests as machine-checkable gates that verify agent output without relying on self-report. The rule: if it is a process rule, it goes in agents.md. If it is task context, it goes in a context file. If it is a verification criterion, it becomes a test.

Question 21

Can I implement the framework incrementally or do I need everything at once?

Accepted Answer

Implement incrementally. Start by auditing your four primitives to find the gap (usually Coordination). Then decompose one SDLC stage into micro-steps and add gates. Apply Harness Engineering after each run to encode fixes. Expand to additional SDLC stages over time. The framework explicitly defines a software factory as the commitment to incrementally moving the human out of the loop — it is designed for progressive adoption.

Question 22

What is the difference between a swarm parent agent and an orchestrator?

Accepted Answer

In the swarm pattern, a parent agent governs sub-agents via message passing — it fans out one intent to multiple sub-agents and funnels results back into a single output. An orchestrator in the generic sense often means a human managing agents. The key distinction in the framework is that the parent agent is itself automated and operates within the coordination layer, whereas a human orchestrator keeps you in parallel-agent assistance mode, not a true software factory.
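
The fan-out and funnel shape of the swarm pattern can be sketched with a thread pool. This simplifies heavily: `run_swarm` is a hypothetical name, `sub_agent` stands in for a real agent invocation, and direct function calls replace the message passing described above.

```python
from concurrent.futures import ThreadPoolExecutor


def run_swarm(intent, subtasks, sub_agent):
    """Fan one intent out to sub-agents and funnel results into one output."""
    # max_workers must be positive, so fall back to 1 for an empty task list.
    with ThreadPoolExecutor(max_workers=len(subtasks) or 1) as pool:
        # pool.map returns results in the same order as `subtasks`.
        changes = list(pool.map(lambda task: sub_agent(intent, task), subtasks))
    return {"intent": intent, "changes": changes}
```

Nothing in `run_swarm` waits on a human. That is the property separating an automated parent agent from a person orchestrating agents by hand.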

Question 23

What if my agents need to coordinate across different programming languages or tech stacks?

Accepted Answer

The coordination layer is language-agnostic — it manages SDLC micro-steps and gates, not code execution details. Each agent runtime can handle its own tech stack. Harness Engineering accommodates this: encode language-specific instructions in agents.md per repo, define language-specific test gates, and use context files for stack-specific architectural decisions. The coordination layer only needs to know whether each gate passed or failed, regardless of the underlying technology.

Question 24

How do durable execution patterns apply to agent coordination?

Accepted Answer

Durable execution ensures the coordination process survives interruptions — agent crashes, network failures, context window resets. Each micro-step completion is persisted, so if an agent fails mid-pipeline, the coordination layer knows exactly which step to resume from. This prevents agents from restarting entire workflows after failures and is critical at fleet scale where interruptions across hundreds of repos would otherwise cause massive rework.
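
A minimal sketch of the resume behaviour, assuming completed steps are persisted to a local JSON file. `run_durably` and its parameters are hypothetical, and a production system would more likely use a database or a durable execution engine.

```python
import json
from pathlib import Path


def run_durably(steps, state_path, execute):
    """Run steps in order, persisting each completion so a restart resumes."""
    path = Path(state_path)
    completed = json.loads(path.read_text()) if path.exists() else []
    for step in steps:
        if step in completed:
            continue  # finished in an earlier run; do not redo it
        execute(step)  # may raise; a failed step is never recorded
        completed.append(step)
        path.write_text(json.dumps(completed))
    return completed
```

If `execute` raises on the third of five steps, the next call with the same `state_path` skips the first two and resumes at the third.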

Question 25

Is the Software Factory Primitives Framework only for coding agents or can it apply to other AI agent systems?

Accepted Answer

The framework was designed specifically for coding agents and the SDLC, but the core concepts — four primitives, micro-step decomposition, machine-checkable gates, context rot management, and Harness Engineering — apply to any multi-agent system where agents must follow a structured process. The SDLC-specific micro-steps would need to be replaced with domain-specific process steps, but the diagnostic and coordination patterns transfer.

Frequently Asked Questions About Lou Bichard Software Factory Primitives Framework
