How Do I Safely Deploy Agents Across CI and Production?

For platform engineers and DevOps teams deploying agentic systems in CI/CD pipelines · Based on the Hablich Agent Interface Engineering Framework

// TL;DR

For platform engineers deploying agentic systems across CI/CD pipelines and production environments, the Hablich Agent Interface Engineering Framework provides a trust-tier model that separates security concerns by deployment context. Classify each environment as Tier 1 (local dev, human in the loop), Tier 2 (CI, data-separated), or Tier 3 (internet-facing, maximum isolation). Share tools across tiers but keep the security models strictly separate, and do not remove human consent friction for convenience in environments where prompt injection is a threat.

Why Can't I Use the Same Security Model for Local Dev and CI Agents?

Because local dev agents and CI agents operate in fundamentally different threat environments, even when they use the same tools. The Hablich framework classifies deployments into three trust tiers:

- Tier 1 (local dev): Human is in the loop. Default browser profile, local data access. Every sensitive action requires explicit, time-bound human consent.

- Tier 2 (CI/controlled): No human in the loop. Requires data separation — containers, separate browser profiles, remote debugging port isolation.

- Tier 3 (full internet access): All Tier 2 controls, plus domain allow lists, prompt injection mitigations, and maximum process isolation.

The critical rule: tools can be shared across tiers, but the security model must never be shared. A `navigate_to_url` tool works the same everywhere, but in Tier 1 it requires consent, in Tier 2 it runs in a container, and in Tier 3 it's restricted to an allow list.
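As a rough illustration of that rule, the sketch below keeps one shared `navigate_to_url` tool and applies a different check per tier before calling it. Everything other than the tool name is an assumption made for the example: the `Tier` enum, `PolicyError`, `guarded_navigate`, the `ask_human` callback, and the `in_container` flag are hypothetical, and the tool body is a stand-in rather than real browser automation. Time-bound consent is not modelled here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable
from urllib.parse import urlparse


class Tier(Enum):
    LOCAL_DEV = 1  # human in the loop
    CI = 2         # no human, data-separated
    INTERNET = 3   # internet-facing, allow-listed


class PolicyError(Exception):
    """Raised when a tier's security policy blocks a tool call."""


def navigate_to_url(url: str) -> str:
    # Stand-in for the shared tool; a real one would drive a browser.
    return f"navigated:{url}"


def guarded_navigate(
    tier: Tier,
    url: str,
    *,
    ask_human: Callable[[str], bool] = lambda prompt: False,  # deny by default
    in_container: bool = False,
    allowed_domains: frozenset = frozenset(),
) -> str:
    """Apply the tier's own security check, then call the shared tool."""
    if tier is Tier.LOCAL_DEV:
        # Tier 1: explicit human consent for the sensitive action.
        if not ask_human(f"Allow navigation to {url}?"):
            raise PolicyError("consent denied")
    elif tier is Tier.CI:
        # Tier 2: no human present, so isolation replaces consent.
        if not in_container:
            raise PolicyError("Tier 2 calls must run in an isolated container")
    else:
        # Tier 3: Tier 2 isolation plus an exact-match domain allow list.
        if not in_container:
            raise PolicyError("Tier 3 calls must run in an isolated container")
        host = urlparse(url).hostname or ""
        if host not in allowed_domains:
            raise PolicyError(f"{host!r} is not on the allow list")
    return navigate_to_url(url)
```

The tool function never changes; only the check in front of it does. The allow-list comparison here is an exact hostname match, so subdomains would need their own entries.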

What Security Risks Do Convenience Features Create in Agent Deployments?

In traditional UX, removing friction is usually treated as a win. In agentic systems, some friction is there by design. Features like auto-remembering permissions, autoconnect (sharing sessions without repeated consent), or blanket tool approvals feel like UX improvements but actually remove the consent checkpoints that protect against prompt injection.

The lethal trifecta (Simon Willison's term) names three capabilities that make prompt injection particularly dangerous when they converge in one agent: access to private data, exposure to untrusted content, and the ability to communicate externally. The Hablich framework responds by treating any convenience request that eliminates a consent checkpoint as a security risk to be evaluated, not a UX win to be shipped.

For Tier 2 CI deployments, the mitigation is different: since no human is present, security comes from isolation (containers, separate profiles, network controls) rather than consent prompts.

How Do I Set Up Data Separation for Tier 2 Agent Deployments?

Tier 2 deployments in CI require three forms of isolation:

1. Container isolation — each agent run should execute in a fresh container with no access to persistent state from previous runs.

2. Separate browser profiles — if agents use browser automation, use isolated profiles that contain no credentials, cookies, or session data from real users or other agent runs.

3. Remote debugging port security — if using Chrome DevTools Protocol or similar, bind debugging ports to localhost only and use unique ports per container to prevent cross-agent communication.

Do not rely on a single isolation mechanism. Layer them. And never promote a Tier 2 setup to Tier 3 (internet-facing) without adding domain allow lists and prompt injection mitigations.
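The profile and port items above can be sketched as a helper that builds a browser command line inside each fresh container. This is a minimal sketch under stated assumptions: the helper names and the `chromium` binary name are hypothetical, and container creation itself is out of scope. The flags used (`--headless=new`, `--user-data-dir`, `--remote-debugging-port`, `--no-first-run`) are standard Chrome/Chromium switches.

```python
import socket
import tempfile


def free_loopback_port() -> int:
    """Ask the OS for a currently unused TCP port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        return s.getsockname()[1]


def isolated_chrome_args(binary: str = "chromium") -> list:
    """Build launch arguments for one agent run with its own empty profile."""
    # Fresh, empty profile directory: no cookies, credentials, or sessions.
    profile_dir = tempfile.mkdtemp(prefix="agent-profile-")
    # Per-run debugging port chosen by the OS instead of a shared fixed one.
    port = free_loopback_port()
    return [
        binary,
        "--headless=new",
        f"--user-data-dir={profile_dir}",
        f"--remote-debugging-port={port}",
        "--no-first-run",
    ]
```

The returned list can be handed to `subprocess.Popen`. Two caveats: the port is released before the browser binds it, so another process could in principle grab it first, and the caller is responsible for deleting the profile directory when the run ends.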

How Do I Monitor Agent Security Across Tiers in Production?

Instrument each tier separately. Track consent events in Tier 1 (how often agents request permissions, how often humans approve or deny). Track isolation violations in Tier 2 (container escapes, cross-profile data access, unexpected network calls). Track domain access patterns in Tier 3 (requests outside allow lists, unusual navigation patterns that suggest prompt injection).
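One way to keep the instrumentation separate per tier is to give each tier its own event vocabulary and reject events recorded against the wrong tier. The sketch below is illustrative only: the event names, `TIER_EVENTS`, and the `TierMetrics` class are assumptions made for the example, not part of the framework.

```python
from collections import Counter

# Hypothetical per-tier event vocabularies, mirroring the paragraph above.
TIER_EVENTS = {
    1: {"consent_requested", "consent_approved", "consent_denied"},
    2: {"container_escape", "cross_profile_access", "unexpected_network_call"},
    3: {"off_allowlist_request", "unusual_navigation"},
}


class TierMetrics:
    """Counts security events, keyed by (tier, event)."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, tier: int, event: str) -> None:
        # Refuse events that do not belong to the given tier's vocabulary.
        if event not in TIER_EVENTS.get(tier, set()):
            raise ValueError(f"event {event!r} is not defined for tier {tier}")
        self.counts[(tier, event)] += 1

    def consent_denial_rate(self) -> float:
        """Share of Tier 1 consent requests that a human denied."""
        requested = self.counts[(1, "consent_requested")]
        if requested == 0:
            return 0.0
        return self.counts[(1, "consent_denied")] / requested
```

In practice these counters would feed whatever metrics backend is already in place; the point of the sketch is only that a Tier 2 isolation event cannot be silently filed under Tier 1.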

Combine security monitoring with the framework's fuel-efficiency metrics — tokens per successful outcome per journey. An unusual spike in token cost for a journey that normally costs little can indicate the agent is being manipulated into unnecessary actions.
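A simple way to flag such a spike is to compare a journey's current cost with the median of its recent history. The sketch below assumes a threshold of three times the median and a minimum of five historical samples; both numbers, and the function names, are illustrative choices rather than values from the framework.

```python
from statistics import median


def tokens_per_success(total_tokens: int, successes: int) -> float:
    """Tokens spent per successful outcome for one journey."""
    if successes <= 0:
        # No successful outcome: treat the cost as unbounded.
        return float("inf")
    return total_tokens / successes


def is_token_spike(history, current: float, factor: float = 3.0,
                   min_samples: int = 5) -> bool:
    """Flag `current` if it exceeds `factor` times the historical median."""
    if len(history) < min_samples:
        # Too little history to define a baseline; do not alert.
        return False
    return current > factor * median(history)
```

For example, with a history of `[1000, 1100, 950, 1050, 1000]` the median is 1000, so a run costing 3500 tokens per success would be flagged and a run costing 1200 would not. A flag is a prompt to investigate, not proof of manipulation.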

Next step: Audit your current agent deployments and classify each into Tier 1, 2, or 3. For any deployment where you find the same security model applied across tiers, create a remediation plan to separate them immediately.

// FREQUENTLY ASKED QUESTIONS

Can I automate consent prompts for Tier 1 local dev agents?

No. The Hablich framework explicitly treats automated consent in Tier 1 as a security risk, not a UX improvement. Tier 1 deployments require human consent at each sensitive action because the agent operates with access to the user's default browser profile and local data. Automating consent removes the by-design friction that protects against prompt injection. If you need unattended operation, promote to Tier 2 with proper data separation instead.

What's the difference between Tier 2 and Tier 3 agent deployments?

Tier 2 runs in controlled environments like CI where the agent's network access is limited to internal resources. Tier 3 involves full internet access — the agent can browse arbitrary websites. Tier 3 requires everything Tier 2 has (containers, isolated profiles) plus domain allow lists, prompt injection mitigations, and maximum process isolation. Tier 3 is sometimes called 'YOLO mode' and demands the most stringent security model.

How do I handle agents that need to operate across multiple tiers?

Share the tools but implement tier-specific security wrappers around each tool invocation. The same tool interface can exist in all three tiers, but the permission model, isolation requirements, and consent flows must be configured per tier. Use environment variables or deployment configuration to select the appropriate security wrapper at runtime. Never default to the least restrictive tier.
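A minimal sketch of runtime selection, assuming a hypothetical `AGENT_TRUST_TIER` environment variable and placeholder wrapper functions: rather than falling back to any tier when the variable is missing or invalid, it refuses to start, which is one way to avoid ever defaulting to a less restrictive model.

```python
import os
from typing import Callable, Mapping, Optional

Wrapper = Callable[[str], str]


# Placeholder wrappers; real ones would enforce consent, isolation,
# or allow-list checks before calling the shared tool.
def tier1_wrapper(url: str) -> str:
    return f"tier1(consent):{url}"


def tier2_wrapper(url: str) -> str:
    return f"tier2(isolated):{url}"


def tier3_wrapper(url: str) -> str:
    return f"tier3(allow-listed):{url}"


WRAPPERS = {"1": tier1_wrapper, "2": tier2_wrapper, "3": tier3_wrapper}


def select_wrapper(env: Optional[Mapping[str, str]] = None) -> Wrapper:
    """Pick the security wrapper named by AGENT_TRUST_TIER, or fail closed."""
    env = os.environ if env is None else env
    raw = env.get("AGENT_TRUST_TIER", "").strip()
    if raw not in WRAPPERS:
        # Missing or invalid configuration: stop instead of guessing a tier.
        raise RuntimeError(
            f"AGENT_TRUST_TIER must be one of 1, 2, 3 (got {raw!r})"
        )
    return WRAPPERS[raw]
```

Calling `select_wrapper()` once at startup and passing the result to the agent keeps the tier decision in deployment configuration rather than in tool code.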
