How Do Platform Teams Deploy Agent Concierge Systems on Slack?

For platform engineering teams at mid-to-large companies · Based on the Solmaz On-Demand Disposable Agent Orchestration Framework

// TL;DR

Platform engineering teams at companies of 100 or more people can deploy a concierge agent on Slack that dispatches on-demand disposable AI agents for production error triage, bug investigation, and developer support. Instead of bottlenecking on a single shared agent, the concierge provisions a dedicated Kubernetes pod per request, gives the engineer a UI link to the agent session, and remains available for the next request. The Goal Operator handles all pod lifecycle management, so the platform team does not maintain pod configurations or per-agent Slack app manifests by hand.

Why Does a Single Shared Agent Bottleneck at Scale?

A 100-person engineering team cannot share one AI agent instance. When several engineers need help at the same time (debugging production errors after a release, investigating flaky tests, triaging customer-reported bugs), a single instance forces their requests into a queue. Engineers wait, context switches stack up, and the agent becomes a bottleneck rather than a force multiplier.

The Solmaz framework solves this with the concierge pattern: one persistent front-door agent on Slack dispatches on-demand disposable agent pods for each request.

How Do You Architect the Concierge Agent on Slack?

The concierge agent is a persistent Slack integration that:

1. Receives requests: Engineers message the concierge in a dedicated Slack channel or via DM.

2. Dispatches a disposable agent: The concierge triggers the Goal Operator to provision a new Kubernetes pod with a full coding agent harness.

3. Returns a session link: Since Slack doesn't natively support multi-agent cosmetic provisioning, the concierge returns a URL to a React app hosted in-cluster where the engineer interacts with their dedicated agent.

4. Stays available: The concierge immediately returns to listening for the next request. No queue, no waiting.
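The four steps above can be sketched as a small handler. This is a minimal illustration, not the framework's implementation: `Concierge`, `AgentRequest`, `Session`, the `provision` callback (a stand-in for whatever call asks the Goal Operator for a pod), and the `SESSION_UI_BASE` URL are all hypothetical names introduced here.

```python
import uuid
from dataclasses import dataclass
from typing import Callable

# Hypothetical base URL of the in-cluster React app; not taken from the framework.
SESSION_UI_BASE = "https://agents.example.internal/sessions"


@dataclass(frozen=True)
class AgentRequest:
    user: str  # Slack user ID of the engineer asking for help
    text: str  # free-form description of the problem


@dataclass(frozen=True)
class Session:
    session_id: str
    url: str


class Concierge:
    """Front-door agent: hands each request to a provisioner and returns at once."""

    def __init__(self, provision: Callable[[str, AgentRequest], None]) -> None:
        # `provision` is an assumed hook that asks the operator for a new pod.
        self._provision = provision

    def handle(self, request: AgentRequest) -> Session:
        session_id = uuid.uuid4().hex
        self._provision(session_id, request)  # one dedicated pod per request
        return Session(session_id, f"{SESSION_UI_BASE}/{session_id}")


def reply_text(request: AgentRequest, session: Session) -> str:
    """Message the concierge posts back to the engineer in Slack."""
    return f"<@{request.user}> your agent is starting: {session.url}"
```

Because `handle` only records the request and returns a link, the concierge never blocks on the agent's work, which is what keeps it free for the next engineer.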

Use ACPX to bind the Slack integration to the ACP layer. This means your concierge works with any ACP-compliant harness (Codex, Claude Code, OpenClaw) without rewriting the Slack integration when you switch or add harnesses.

How Do You Handle Pod Lifecycle and Cost Control?

The Goal Operator manages the entire pod lifecycle:

- Provisioning: When the concierge dispatches a request, the operator creates a pod with the specified harness, pre-loaded with the relevant repository and context.

- Timeout: Pods that idle beyond a configurable threshold are torn down automatically.

- Teardown: When the engineer marks the task complete (or the agent determines it is done), the pod is destroyed.

- Cost controls: Set Kubernetes resource quotas per namespace, configure pod limits, and use priority classes to prevent runaway spending.
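The timeout and teardown rules above amount to a small decision function. The sketch below assumes the operator tracks a last-activity timestamp and a completion flag for each pod. `AgentPod`, `teardown_reason`, and `pods_to_reap` are hypothetical names for illustration, not the Goal Operator's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, Optional


@dataclass
class AgentPod:
    name: str
    last_activity: datetime      # assumed to be tracked per session
    task_complete: bool = False  # set by the engineer or by the agent


def teardown_reason(pod: AgentPod, now: datetime, idle_timeout: timedelta) -> Optional[str]:
    """Return why the pod should be destroyed, or None to keep it running."""
    if pod.task_complete:
        return "completed"
    if now - pod.last_activity > idle_timeout:
        return "idle_timeout"
    return None


def pods_to_reap(pods: Iterable[AgentPod], now: datetime, idle_timeout: timedelta) -> Dict[str, str]:
    """Map pod name to teardown reason for every pod that should go away."""
    reap: Dict[str, str] = {}
    for pod in pods:
        reason = teardown_reason(pod, now, idle_timeout)
        if reason is not None:
            reap[pod.name] = reason
    return reap
```

A reconcile loop could call `pods_to_reap` on a fixed interval and delete whatever it returns. Recording the reason alongside the deletion makes it easier to tune the idle threshold later.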

Deploy with Helm charts for repeatability. Never manage Slack app manifests or pod configurations manually; that approach doesn't scale past five agents.

How Do You Encode Production Error Triage as an SOP?

Production error triage is a recurring class of task, which makes it a good fit for an agent SOP (standard operating procedure):

1. Ingest error context: The agent receives the error log, stack trace, and affected service from the engineer.

2. Reproduce: The agent attempts to reproduce the error in the pod's environment.

3. Root cause analysis: The agent traces the error through the codebase, identifies the likely root cause, and outputs its finding as structured JSON.

4. Suggest fix: For shallow bugs, the agent proposes a fix and opens a PR. For fundamental issues, it escalates with a detailed analysis.

5. Verify: If a fix is proposed, the agent runs tests to confirm the fix doesn't introduce regressions.

This SOP runs in ACPX as an Argo-like workflow, with each step as a node emitting JSON. The platform team can inspect, iterate, and improve the SOP over time.
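To make the "each step is a node emitting JSON" idea concrete, here is a minimal sketch of such a pipeline in plain Python. It does not use ACPX or Argo. The node bodies are stubs, the shallow-versus-fundamental rule is a placeholder, and every name (`run_sop`, `TRIAGE_SOP`, the output fields) is an assumption made for illustration.

```python
import json
from typing import Callable, Dict, List, Tuple

Context = Dict[str, dict]
Node = Tuple[str, Callable[[Context], dict]]


def ingest(ctx: Context) -> dict:
    request = ctx["request"]
    return {"service": request["service"], "has_stack_trace": bool(request.get("stack_trace"))}


def reproduce(ctx: Context) -> dict:
    # Stub: a real node would exercise the failing path inside the pod.
    return {"reproduced": ctx["ingest"]["has_stack_trace"]}


def root_cause(ctx: Context) -> dict:
    # Stub: a real node would have the agent trace the error through the codebase.
    depth = "shallow" if ctx["reproduce"]["reproduced"] else "fundamental"
    return {"depth": depth}


def suggest_fix(ctx: Context) -> dict:
    # Shallow bugs get a PR; fundamental issues are escalated.
    action = "open_pr" if ctx["root_cause"]["depth"] == "shallow" else "escalate"
    return {"action": action}


def verify(ctx: Context) -> dict:
    # Only meaningful when a fix was proposed; the test run itself is not modeled.
    return {"skipped": ctx["suggest_fix"]["action"] != "open_pr"}


TRIAGE_SOP: List[Node] = [
    ("ingest", ingest),
    ("reproduce", reproduce),
    ("root_cause", root_cause),
    ("suggest_fix", suggest_fix),
    ("verify", verify),
]


def run_sop(nodes: List[Node], request: dict) -> List[str]:
    """Run nodes in order; each sees earlier outputs and emits one JSON record."""
    ctx: Context = {"request": request}
    emitted: List[str] = []
    for name, node in nodes:
        output = node(ctx)
        ctx[name] = output
        emitted.append(json.dumps({"step": name, "output": output}, sort_keys=True))
    return emitted
```

Keeping each node's output as a separate JSON record is what lets a platform team inspect a single step in isolation and swap it out without touching the rest of the SOP.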

Next step: Identify your team's top three most frequent engineering support requests. Encode the first one as an SOP in ACPX and deploy a concierge agent on Slack to handle it.

// FREQUENTLY ASKED QUESTIONS

How do I avoid engineers abusing the concierge agent for non-work tasks?

Configure the concierge with scope guards that define which repository contexts and task types it can dispatch agents for. Use Slack channel permissions to control access. Log all dispatched sessions as structured JSON for auditing. The SOP workflow naturally constrains what agents do; requests outside the defined SOPs get a human-readable explanation of what the system supports.
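A scope guard and audit record could look roughly like the following. This is a sketch under assumptions: `ScopeGuard`, `audit_record`, the field names, and the example repository and task names in the usage note are hypothetical and not part of the framework.

```python
import json
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ScopeGuard:
    allowed_repos: FrozenSet[str]
    allowed_tasks: FrozenSet[str]

    def check(self, repo: str, task: str) -> Tuple[bool, str]:
        """Return (allowed, human-readable explanation)."""
        if task not in self.allowed_tasks:
            supported = ", ".join(sorted(self.allowed_tasks))
            return False, f"'{task}' is not a supported task. Supported tasks: {supported}."
        if repo not in self.allowed_repos:
            return False, f"Agents cannot be dispatched for repository '{repo}'."
        return True, "ok"


def audit_record(user: str, repo: str, task: str, allowed: bool, timestamp: str) -> str:
    """One structured JSON line per dispatch attempt, allowed or not."""
    return json.dumps(
        {"user": user, "repo": repo, "task": task, "allowed": allowed, "timestamp": timestamp},
        sort_keys=True,
    )
```

For example, a guard built with `ScopeGuard(frozenset({"payments-api"}), frozenset({"error_triage"}))` rejects a request for any other task and returns the list of supported tasks as its explanation. Logging rejected attempts as well as accepted ones keeps the audit trail complete.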

Can the concierge agent work on Microsoft Teams instead of Slack?

Yes. Because the concierge communicates through ACP, the platform integration is a thin adapter layer. Write one ACP adapter for Teams and the concierge works identically. ACPX handles the binding between the Teams channel and the underlying harness. You don't need to rebuild the agent logic or SOPs; only the platform-specific message routing changes.
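The thin-adapter idea can be illustrated with a small interface. The sketch below does not model ACP or ACPX. `ChatAdapter`, `SlackAdapter`, `TeamsAdapter`, and `handle_message` are hypothetical names, the payload shapes are heavily simplified versions of a Slack message event and a Teams bot activity, and `dispatch` is an assumed stand-in for the concierge's platform-independent logic.

```python
from typing import Callable, Protocol, Tuple


class ChatAdapter(Protocol):
    """Platform-specific message routing; everything else is shared."""

    def parse(self, payload: dict) -> Tuple[str, str]: ...
    def format_reply(self, user: str, session_url: str) -> dict: ...


class SlackAdapter:
    def parse(self, payload: dict) -> Tuple[str, str]:
        event = payload["event"]  # simplified Slack message event
        return event["user"], event["text"]

    def format_reply(self, user: str, session_url: str) -> dict:
        return {"text": f"<@{user}> your agent session: {session_url}"}


class TeamsAdapter:
    def parse(self, payload: dict) -> Tuple[str, str]:
        return payload["from"]["id"], payload["text"]  # simplified bot activity

    def format_reply(self, user: str, session_url: str) -> dict:
        return {"type": "message", "text": f"Your agent session: {session_url}"}


def handle_message(adapter: ChatAdapter, payload: dict,
                   dispatch: Callable[[str, str], str]) -> dict:
    """Shared concierge flow: parse, dispatch an agent, format the reply."""
    user, text = adapter.parse(payload)
    session_url = dispatch(user, text)
    return adapter.format_reply(user, session_url)
```

Only the two adapter classes know anything about the chat platform. `handle_message` and whatever sits behind `dispatch` stay the same for Slack and Teams.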

How many concurrent agent pods can a typical Kubernetes cluster handle?

It depends on your cluster size and pod resource requests. A modest cluster with 10 nodes can typically handle 20-50 concurrent agent pods. With cluster autoscaling enabled, you can scale to hundreds. The Goal Operator handles provisioning; you configure resource quotas to set the ceiling. Start with conservative limits and increase based on observed usage patterns and cost tolerance.
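As a rough worked example of that dependency, the sketch below estimates how many agent pods fit by resource requests alone, then builds a matching Kubernetes `ResourceQuota` manifest as a Python dict. The node and pod sizes in the usage note are illustrative assumptions, not measurements. The estimate ignores other workloads and scheduling constraints, and the quota name `agent-pods` is hypothetical.

```python
def pods_per_node(alloc_cpu_m: int, alloc_mem_mib: int, pod_cpu_m: int, pod_mem_mib: int) -> int:
    """Pods that fit on one node, limited by whichever resource runs out first."""
    return min(alloc_cpu_m // pod_cpu_m, alloc_mem_mib // pod_mem_mib)


def cluster_capacity(nodes: int, alloc_cpu_m: int, alloc_mem_mib: int,
                     pod_cpu_m: int, pod_mem_mib: int) -> int:
    """Upper bound on concurrent agent pods for identical nodes."""
    return nodes * pods_per_node(alloc_cpu_m, alloc_mem_mib, pod_cpu_m, pod_mem_mib)


def resource_quota(namespace: str, max_pods: int, pod_cpu_m: int, pod_mem_mib: int) -> dict:
    """ResourceQuota manifest capping pod count and total requests in a namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "agent-pods", "namespace": namespace},  # hypothetical name
        "spec": {
            "hard": {
                "pods": str(max_pods),
                "requests.cpu": f"{max_pods * pod_cpu_m}m",
                "requests.memory": f"{max_pods * pod_mem_mib}Mi",
            }
        },
    }
```

With 10 nodes that each have 8000m of allocatable CPU and 30000 MiB of allocatable memory, and agent pods requesting 2000m and 4096 MiB, CPU is the limiting resource: 4 pods per node, or 40 across the cluster. Passing that ceiling to `resource_quota` yields hard limits of 40 pods, `80000m` CPU, and `163840Mi` memory for the namespace.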
