Question 1

What's the difference between AI augmentation and an agent-first business?

Accepted Answer

AI augmentation (gen-one AI) keeps humans in the driver's seat: AI is used for autocomplete, single-turn questions, or tab suggestions in an IDE. An agent-first business deploys fully autonomous agents that execute multi-turn tasks with tool access and with no human in the per-step loop. Most people and companies are still in gen-one mode without realizing it. The leap to agent-first requires handing the agent tasks that would take a skilled human hours or days and letting it run autonomously.

Question 2

Can I use a cheaper model instead of a frontier model for my agents?

Accepted Answer

Yes, but only after establishing quality data. First, run the agent on a frontier model and build a Rubric with scored dimensions. Once you have a quality trend line, test switching to a mid-tier model (e.g., Sonnet instead of Opus). If Rubric scores don't decline meaningfully, lock in the cheaper model for routine runs — achieving up to 5x cost savings. Reserve frontier models for high-stakes or complex tasks. Never start with a cheaper model; underpowered models are the most common reason agents disappoint.
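The switch decision described above can be sketched as a simple comparison of Rubric scores. This is a minimal illustration, not part of the framework: the function name `can_downgrade`, the 0-10 score scale, and the 5% tolerance for a "meaningful" decline are all assumptions you would tune yourself.

```python
from statistics import mean


def can_downgrade(frontier_scores, cheaper_scores, max_drop=0.05):
    """Return True if the cheaper model's mean Rubric score drops by no more
    than max_drop, measured as a fraction of the frontier model's mean.

    max_drop=0.05 is an assumed tolerance, not a value from the framework.
    """
    if not frontier_scores or not cheaper_scores:
        raise ValueError("need at least one scored run per model")
    baseline = mean(frontier_scores)
    if baseline <= 0:
        raise ValueError("frontier baseline must be positive")
    relative_drop = (baseline - mean(cheaper_scores)) / baseline
    return relative_drop <= max_drop


# Hypothetical Rubric scores (0-10) from runs of the same agent.
frontier = [8.4, 8.6, 8.5, 8.7]
mid_tier = [8.3, 8.5, 8.2, 8.6]
print(can_downgrade(frontier, mid_tier))  # True: the drop is under 5%
```

Comparing means over several runs, rather than a single run, keeps one unusually good or bad output from driving the decision.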

Question 3

How do I know if my business idea is the right size for an agent-first approach?

Accepted Answer

Target medium-sized markets, roughly a few billion dollars in TAM (total addressable market). That is large enough to build a multi-hundred-million-dollar business on double-digit market share, yet small enough that massive incumbents aren't prioritizing it. Avoid micro-niches (too small to matter) and hundred-billion-dollar categories (too competitive). If you're unsure, let the agent read your Gmail, Slack, and Notion to suggest opportunities tailored to your actual context.

Question 4

What happens if my agent's first output is terrible?

Accepted Answer

That's expected: V1 output is typically about 50% of your quality bar, and that does not mean agents aren't capable. The biggest mistake is one-shotting and abandoning. Instead, identify specific failure modes ('too formal,' 'no data backing claims'), feed them back directly, and have the agent both regenerate the output and update the Skill so the fix is permanent. Quality compounds through daily coaching over 30-90 days.

Question 5

How is the Agent-First Business Builder different from using Zapier or Make for automation?

Accepted Answer

Zapier and Make are workflow automation tools that chain pre-defined actions between apps — they execute deterministic, rule-based steps. The Agent-First Business Builder deploys generally intelligent agents that research, reason, make judgment calls, build artifacts, and self-improve over time. An agent in Founder Mode validates a market opportunity; Zapier moves data between fields. They operate at entirely different levels of cognitive complexity.

Question 6

What is memory defrag and when should I run it?

Accepted Answer

Memory defrag is a maintenance operation that clusters accumulated agent memories by keyword and embedding similarity, identifies duplicates, and lets you consolidate related items. Run it periodically as your agent accumulates memories over weeks and months. Without defragging, the memory store becomes cluttered, potentially degrading retrieval quality and increasing noise. Think of it as tidying up an employee's notes so they stay organized and performant.
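To make the clustering idea concrete, here is a rough sketch of how grouping by keyword and embedding similarity could work. It is not the platform's actual implementation: the greedy strategy, the function names, the thresholds, and the toy two-dimensional embeddings are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_overlap(a, b):
    """Jaccard overlap between the lowercase word sets of two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def cluster_memories(memories, emb_threshold=0.9, kw_threshold=0.5):
    """Greedy clustering: each memory joins the first cluster whose first
    member is similar by embedding OR by keyword overlap; otherwise it
    starts a new cluster. Thresholds are assumed values."""
    clusters = []
    for m in memories:
        for c in clusters:
            seed = c[0]
            if (cosine(m["embedding"], seed["embedding"]) >= emb_threshold
                    or keyword_overlap(m["text"], seed["text"]) >= kw_threshold):
                c.append(m)
                break
        else:
            clusters.append([m])
    return clusters


# Hypothetical memories with toy embeddings.
memories = [
    {"text": "User prefers casual tone", "embedding": [1.0, 0.0]},
    {"text": "User prefers a casual tone in posts", "embedding": [0.98, 0.2]},
    {"text": "Weekly report is due Monday", "embedding": [0.0, 1.0]},
]
for group in cluster_memories(memories):
    print([m["text"] for m in group])
```

Clusters with more than one member are the candidates to review for duplicates and consolidation; the merge itself stays a human decision in this sketch.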

Question 7

Should I let my AI agent post to social media automatically?

Accepted Answer

No — for content and any reputation-sensitive outputs, never go full YOLO (auto-post without review). The correct default is draft-and-review: the agent generates drafts, scores them via Rubric, and delivers top-scoring options for your approval. Reserve full autonomous action for genuinely low-stakes, reversible tasks like acknowledging routine customer emails or scheduling meeting requests.

Question 8

How many agents should I start with?

Accepted Answer

Start with one agent focused on a single well-defined role. Get it stable — Skill pinned, Rubric scoring, run schedule set, feedback loop active. Once it's reliably producing quality output with minimal intervention, build your second agent for a different role. The end-state is a fleet, but rushing to multiple agents before mastering one leads to poor quality across the board. Each agent maps to a human-equivalent role.

Question 9

What's the minimum daily time commitment to get good at building agents?

Accepted Answer

Commit to at least 30 minutes per day, and sustain it for 30 to 90 days. Sporadic weekly experimentation produces nothing meaningful. The framework follows the Door-to-Door vs. Internet Parable: committed daily practice compounds into structural business leverage within six months, while occasional dabbling leaves you stuck in gen-one mode. Treat it like learning a new skill, where consistent reps are what build proficiency.

Question 10

How do I create a good Rubric for my agent?

Accepted Answer

Define 3-5 dimensions that represent what 'great' looks like for your agent's specific output. For a content agent, dimensions might be voice match, hook strength, and data presence. For a research agent: accuracy, completeness, and actionability. You can prompt the agent itself: 'Help me build a rubric to score great [output type].' Pin the Rubric to the agent so an LLM-as-Judge scores every run automatically. Refine dimensions as you learn what matters most.
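As an illustration, a Rubric can be thought of as a small set of named dimensions whose per-dimension scores are combined into one number per run. The sketch below assumes a 0-10 scale and optional per-dimension weights; the `Dimension` class, the weights, and the sample scores are hypothetical, and the LLM-as-Judge call that would produce the scores is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str
    weight: float = 1.0  # assumed: weights are not part of the source framework


def rubric_score(dimensions, scores):
    """Weighted average of per-dimension scores for one run."""
    missing = [d.name for d in dimensions if d.name not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    total_weight = sum(d.weight for d in dimensions)
    return sum(d.weight * scores[d.name] for d in dimensions) / total_weight


# Content-agent dimensions from the answer above; voice match weighted double.
content_rubric = [
    Dimension("voice match", weight=2.0),
    Dimension("hook strength"),
    Dimension("data presence"),
]
# Hypothetical judge output for one run, on a 0-10 scale.
judge_scores = {"voice match": 8, "hook strength": 6, "data presence": 6}
print(rubric_score(content_rubric, judge_scores))  # 7.0
```

Keeping the dimension scores separate, not just the total, makes it easier to see which dimension moved when overall quality changes.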

Question 11

What if I don't have a business idea — can the agent help me find one?

Accepted Answer

Yes. Connect the agent to your Gmail, Slack, Notion, and Granola notes, then ask it to suggest use cases tailored to your actual context. The agent can surface patterns in your daily work, identify pain points you've been experiencing, and propose business opportunities at the right market size. Starting from a blank slate without connecting context is a common pitfall — the agent can only personalize recommendations if it can read your real data.

Question 12

How does the Self-Improvement Loop work in practice?

Accepted Answer

After each run, agents surface suggested memory updates, Skill tweaks, system prompt changes, and new tool recommendations based on what they observed. Your job is to curate these suggestions — accept, reject, or modify them. This is not blind auto-improvement; it's coached improvement. Over time the agent becomes progressively better at its role with decreasing intervention. Ignoring these suggestions means leaving compounding quality gains on the table.

Question 13

Can I deploy agents into Slack for my team to use?

Accepted Answer

Yes. Any agent can be deployed into Slack with one click, where it operates as a virtual co-worker. Team members can interact with it, ask questions within its domain, and receive proactive outputs. For example, a deal flow analyst agent deployed in Slack can chime in with competitive context whenever portfolio companies are discussed. This is part of the Command Center model, in which agents are distributed across the channels where your team already works.

Question 14

Is a $150 agent run really worth it?

Accepted Answer

Apply the Human Equivalent Time Cost Reframe. Ask: what would it cost a human — in time and money — to produce the same output? If a $150 token spend produces a board memo that would take a consultant two days and $3,000, it's extraordinarily cheap. Anchoring to Netflix-style subscription pricing ($10-20/month) is the wrong mental model. The correct comparison is always the human equivalent, not the SaaS equivalent.
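The reframe is simple arithmetic, sketched here with the figures from the example above. The function name is made up for illustration, and the sketch compares dollar cost only; it ignores the two days of elapsed time, which would make the gap larger.

```python
def human_equivalent_ratio(agent_cost, human_cost):
    """How many times more the human-equivalent output costs than the agent run."""
    if agent_cost <= 0:
        raise ValueError("agent_cost must be positive")
    return human_cost / agent_cost


# $150 token spend vs. a $3,000 consultant engagement for the same board memo.
print(human_equivalent_ratio(150, 3000))  # 20.0
```

On these numbers, the agent run costs one twentieth of the human equivalent.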

Question 15

What's the difference between Live Mode and a scheduled agent run?

Accepted Answer

A scheduled run fires at a set time (e.g., daily at 8 a.m.) and delivers output via email or Telegram. Live Mode is always-on: the agent continuously polls for new inputs — tweets, emails, Slack messages — and pushes relevant outputs whenever triggered, without waiting for a manual or scheduled run. Use scheduled runs for predictable daily deliverables; use Live Mode for monitoring and real-time response tasks like competitive intelligence or inbound email triage.
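The contrast between the two triggers can be sketched as follows. This is a conceptual illustration only, not how any particular platform implements it: `scheduled_run_due`, `live_poll`, and the callbacks passed to them are hypothetical names.

```python
from datetime import date, datetime, time


def scheduled_run_due(now, run_at, last_run_date):
    """Scheduled mode: fire once per day, at or after the configured time."""
    return now.time() >= run_at and last_run_date != now.date()


def live_poll(fetch_new_inputs, is_relevant, handle):
    """Live Mode: one polling pass. Every new input is checked and relevant
    ones are handled immediately. An always-on agent would repeat this pass
    continuously. Returns the number of inputs handled."""
    handled = 0
    for item in fetch_new_inputs():
        if is_relevant(item):
            handle(item)
            handled += 1
    return handled


# Scheduled: daily at 8 a.m., last ran yesterday, so a run is due.
print(scheduled_run_due(datetime(2024, 1, 2, 8, 30), time(8, 0), date(2024, 1, 1)))

# Live: hypothetical inbox where only competitor mentions trigger output.
alerts = []
inbox = ["lunch plans", "competitor launch announced", "invoice attached"]
live_poll(lambda: inbox, lambda msg: "competitor" in msg, alerts.append)
print(alerts)
```

The scheduled check depends only on the clock, while the live pass depends on what arrived, which is why Live Mode suits monitoring and triage.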

Question 16

Why can't one agent do everything in my business?

Accepted Answer

Context window limits make role-partitioned agents structurally inevitable, just as they make role-partitioned humans inevitable in companies. One agent handling content marketing, deal flow, customer support, and competitive intelligence would exceed context limits and lose focus. The Command Center model maps each agent to a human-equivalent role with its own Skills, Rubric, and run schedule, producing better quality and easier management.

Question 17

What tools or data sources should I connect to my agent?

Accepted Answer

Connect any context or tool the agent needs to do its job well: Gmail, Slack, Notion, Granola, Linear, Twilio, Twitter/X, Google Maps, and relevant APIs. The more context an agent has, the more personalized and accurate its output. Starting without connecting data sources is a common pitfall — agents can only produce generic outputs without access to your real business context and communication history.

Question 18

How do I avoid the most common mistakes when building AI agents?

Accepted Answer

The top mistakes are: one-shotting and abandoning after a mediocre V1 result, using agents like gen-one chatbots for simple questions, anchoring cost to SaaS pricing instead of human-equivalent time, skipping the Rubric, experimenting sporadically instead of daily, going full YOLO on high-stakes outputs, treating Skills as finished after creation, and ignoring the Self-Improvement Loop suggestions. Awareness of these pitfalls separates casual experimenters from effective agent builders.

Question 19

What does 'Low Floor, High Ceiling' mean for agent platforms?

Accepted Answer

Low Floor means the initial experience must be intuitive enough for a first-time user — no coding required, no complex setup. High Ceiling means the control plane must scale to running a serious business: fleet management, Rubric scoring, memory defrag, model selection, and team deployment. The best agent platforms never sacrifice one for the other. If a tool is easy to start but can't scale, or powerful but impenetrable, it fails this design test.

Question 20

How do I handle an agent that keeps making the same mistake?

Accepted Answer

Identify the specific failure mode and feed it back directly in the thread. Critically, don't just fix the current output — update the Skill so the fix is permanent across all future runs. If the error persists, check whether the Skill instructions are ambiguous, add explicit negative examples ('never do X'), and verify the Rubric includes a dimension that catches this failure. Persistent errors usually mean the Skill or Rubric needs refinement, not that the model is incapable.

Question 21

Can the Agent-First Business Builder work for non-technical people?

Accepted Answer

Yes. The framework's Low Floor, High Ceiling principle specifically requires that the initial experience be intuitive for non-technical users. Skills are created through natural-language interaction, not coding. Rubrics are built by describing what 'great' looks like in plain English. Agents are deployed into familiar channels like Slack and email. The technical ceiling is there for those who need it, but the starting point requires no programming ability.

Frequently Asked Questions About Howie Liu Agent-First Business Builder

// Basics