Frequently Asked Questions About the Rodrigues Product Skill Architecture Method
24 answers covering everything from basics to advanced usage.
// Basics
What is the context gap in AI agents?
The context gap is the delta between what an AI agent knows from its training data and what it needs to know to work correctly with your specific product. This gap includes deprecated API endpoints, proprietary security requirements, post-training-data features, and product-specific workflow sequences. The Rodrigues method treats skill.md as the mechanism for closing this gap, providing the agent with the precise, current knowledge it cannot obtain from its training corpus.
What is front matter in a skill.md file?
Front matter is the name and description metadata at the top of a skill.md file. It serves as the signal an agent uses to decide whether to load the skill for a given task. It must be written precisely — vague or generic front matter means agents may not activate the skill when they should, or may activate it inappropriately. Think of it as the function signature for a document: it determines when and whether the skill is invoked.
What does 'be opinionated' mean in the Rodrigues method?
Being opinionated means encoding the specific workflows you know are most effective for your product, rather than leaving agents to infer sequencing from general training data. You define the exact order of operations — for example, make schema changes, run the advisor, fix issues, then generate the migration. You provide both the rule and the rationale so the agent has reasoning to anchor on. Neutral documentation produces suboptimal agent behavior; opinionated skill files produce correct outcomes.
Why do agents skip reference files?
Agents are reluctant to load additional files: they typically load at most one reference file per task and rarely load a second. This is an observed behavioral pattern across multiple model families, not a bug. The Rodrigues method designs around this reality: anything that cannot be missed goes into skill.md directly. Reference files are reserved for genuinely optional supplementary detail. If evals show an agent skipped critical reference file content, that content must be promoted to skill.md.
Should I use the Rodrigues method if my product is well-documented and open source?
Yes. Even well-documented open-source products have a context gap because models have a training data cutoff. Documentation published after the cutoff is invisible to the model, and the model may also have ingested outdated versions of your docs. A skill.md directs agents to fetch current docs rather than rely on potentially stale training data, and it encodes the opinionated workflows and security requirements that even current documentation may present neutrally rather than prescriptively.
// How To
How do I identify my product's agent failure modes?
Audit agent interactions with your product by cataloguing every incorrect, unsafe, or suboptimal output. Common failure categories include: missed security flags (like RLS bypasses), use of deprecated API endpoints from stale training data, incorrect operation sequencing, generating unnecessary artifacts (like redundant migration files), and hallucinating endpoints that don't exist. Collect real failure examples from developers using AI coding assistants with your product. This audit defines the scope of your entire skill.
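A minimal sketch of how such an audit could be tallied, so the most frequent failure modes are addressed first. The category names and the sample entries are hypothetical placeholders, not real audit data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Failure:
    category: str  # failure category from the audit
    example: str   # short note on what the agent did wrong


# Hypothetical entries; real ones come from observed agent sessions.
AUDIT = [
    Failure("missed_security_flag", "created a table without enabling RLS"),
    Failure("deprecated_endpoint", "called an endpoint that no longer exists"),
    Failure("wrong_sequencing", "generated the migration before running the advisor"),
    Failure("missed_security_flag", "exposed a view that bypasses RLS"),
]


def rank_failure_modes(failures):
    """Return (category, count) pairs, most frequent first."""
    return Counter(f.category for f in failures).most_common()


ranked = rank_failure_modes(AUDIT)
print(ranked)
```

The ranking is only a prioritization aid; a rare but unsafe failure can still deserve a place in skill.md ahead of a frequent cosmetic one.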
How do I decide what goes in skill.md versus reference files?
Apply the principle: if the agent missing this information would cause an incorrect or unsafe outcome, it goes in skill.md. If it's supplementary detail that enriches but isn't essential, it can be a reference file. Security checklists, non-negotiable workflows, and persistent fetch-docs directives always go in skill.md. API usage examples for uncommon endpoints, optional configuration details, and edge-case documentation can go in reference files — but accept they may be skipped.
How do I write effective front matter for a skill.md?
Write the name as a precise identifier for your product's domain — not generic like 'database skill' but specific like 'Supabase Database Development Skill.' The description should concisely state what the skill covers and when the agent should load it. Include key trigger terms the agent would encounter in relevant tasks. Front matter is how agents decide to activate the skill, so precision directly impacts whether the right skill loads at the right time.
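The checks described above can be sketched as a small lint. This assumes the front matter is written as `key: value` lines between two `---` delimiter lines, which is a common convention but an assumption here; the generic-name list and the trigger terms are also illustrative.

```python
GENERIC_NAMES = {"database skill", "api skill", "skill"}  # hypothetical list


def parse_front_matter(text):
    """Parse 'key: value' lines between two '---' delimiter lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def lint_front_matter(fields, trigger_terms):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    name = fields.get("name", "")
    description = fields.get("description", "")
    if not name:
        problems.append("missing name")
    elif name.lower() in GENERIC_NAMES:
        problems.append("name is too generic")
    if not description:
        problems.append("missing description")
    missing = [t for t in trigger_terms if t.lower() not in description.lower()]
    if missing:
        problems.append("description lacks trigger terms: " + ", ".join(missing))
    return problems


SKILL = """---
name: Supabase Database Development Skill
description: Load for Supabase schema changes, migrations, and RLS policies.
---
"""

fields = parse_front_matter(SKILL)
problems = lint_front_matter(fields, ["migration", "RLS"])
print(problems)
```

A lint like this only catches obvious omissions. Whether the skill actually loads at the right time still has to be confirmed with evals.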
How do I structure opinionated workflows in a skill.md?
Write each workflow as a numbered sequence of explicit steps with no ambiguity in ordering. For each step, state the action, why it comes at that position in the sequence, and what must be true before proceeding to the next step. Example: '1. Make schema changes freely in dev → 2. Run the advisor tool → 3. Fix all flagged issues → 4. Only then generate the migration file.' Including rationale helps agents anchor on reasoning rather than just following blind rules.
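One way to keep action, rationale, and precondition together is to hold each step as structured data and render the numbered text from it. The sketch below does that for the four-step example; the rationale and "proceed when" wording is illustrative filler, not text from the method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str     # what the agent does
    rationale: str  # why the step sits at this position
    done_when: str  # what must be true before moving on


# Actions follow the example above; rationale and done_when are illustrative.
WORKFLOW = [
    Step("Make schema changes freely in dev",
         "Iteration is cheap before anything is captured in a migration",
         "The schema matches the intended design"),
    Step("Run the advisor tool",
         "The advisor needs the finished schema to inspect",
         "The advisor has produced its report"),
    Step("Fix all flagged issues",
         "A migration must not capture known problems",
         "A fresh advisor run reports no issues"),
    Step("Only then generate the migration file",
         "One migration then reflects the final, reviewed schema",
         "The migration file exists and has been reviewed"),
]


def render_workflow(steps):
    """Render steps as a numbered sequence with rationale and precondition."""
    lines = []
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step.action}")
        lines.append(f"   Why here: {step.rationale}")
        lines.append(f"   Proceed when: {step.done_when}")
    return "\n".join(lines)


rendered = render_workflow(WORKFLOW)
print(rendered)
```

Keeping the three fields mandatory in the data structure makes it harder to add a step without stating why it belongs where it is.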
How do I set up the three-condition eval framework?
Create at least six task scenarios covering known failure modes. Run the agent on each scenario under three conditions: baseline (no MCP tools, no skill, just the raw model), MCP-only (tools available but no skill guidance), and MCP plus skill (full stack). Score each run on a graded completeness scale, then compare scores across conditions to quantify the skill's impact. The MCP-only condition isolates whether tools alone solve the problem; usually they don't, and the gap between MCP-only and MCP plus skill is the measurable evidence of the skill's value.
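A minimal harness for the three conditions might look like the sketch below. The `run_agent` and `score` callables are assumptions standing in for whatever agent runner and grader you use, and the stub scores are placeholder values, not measured results.

```python
from statistics import mean

CONDITIONS = ("baseline", "mcp_only", "mcp_plus_skill")
MIN_SCENARIOS = 6


def run_evals(scenarios, run_agent, score):
    """Run every scenario under every condition; return mean score per condition."""
    if len(scenarios) < MIN_SCENARIOS:
        raise ValueError(f"need at least {MIN_SCENARIOS} scenarios")
    scores = {condition: [] for condition in CONDITIONS}
    for scenario in scenarios:
        for condition in CONDITIONS:
            transcript = run_agent(scenario, condition)
            scores[condition].append(score(scenario, transcript))
    return {condition: mean(values) for condition, values in scores.items()}


# Hypothetical scenario names, one per known failure mode.
SCENARIOS = [
    "rls_policy", "deprecated_endpoint", "migration_order",
    "redundant_migration", "hallucinated_endpoint", "security_flag",
]

# Placeholder scores so the sketch runs; these are NOT real measurements.
PLACEHOLDER_SCORES = {"baseline": 2, "mcp_only": 5, "mcp_plus_skill": 9}


def stub_run_agent(scenario, condition):
    """Stand-in for a real agent run; returns a fake transcript."""
    return f"{condition}|{scenario}"


def stub_score(scenario, transcript):
    """Stand-in for a real grader; looks up a placeholder score."""
    return PLACEHOLDER_SCORES[transcript.split("|")[0]]


summary = run_evals(SCENARIOS, stub_run_agent, stub_score)
print(summary)
```

In a real setup the two stubs would be replaced by your agent runner and a grader that applies per-scenario scoring criteria.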
How do I make agents actually fetch live documentation instead of using training data?
Include persistent, emphatic directives in skill.md instructing the agent to fetch live docs before any API interaction. Frame this as a hard requirement, not a suggestion. Provide the exact mechanism — URL pattern, docs-over-SSH interface, or search tool. Repeat the fetch-first instruction at multiple points in the skill file, especially before workflow sections where stale knowledge is most dangerous. The repetition counteracts the agent's default laziness toward tool calls.
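If the skill file is assembled by a script, the repetition can be automated. The sketch below assumes workflow sections start with a `## Workflow` heading and uses an illustrative directive; both are assumptions about your file layout, not part of the method.

```python
# Illustrative wording; adapt to your product and fetch mechanism.
FETCH_DIRECTIVE = (
    "You MUST fetch the live docs before any API interaction. "
    "NEVER rely on training data."
)


def repeat_directive(skill_text, directive=FETCH_DIRECTIVE, section_prefix="## Workflow"):
    """Insert the directive after each workflow heading that lacks it."""
    lines = skill_text.splitlines()
    out = []
    for index, line in enumerate(lines):
        out.append(line)
        following = lines[index + 1] if index + 1 < len(lines) else ""
        if line.startswith(section_prefix) and following != directive:
            out.append(directive)
    return "\n".join(out)


DEMO = "\n".join([
    "# Example Skill",
    "## Workflow: schema changes",
    "1. Make schema changes",
    "## Background",
    "Optional notes",
    "## Workflow: migrations",
    "1. Generate the migration",
])

patched = repeat_directive(DEMO)
print(patched)
```

The function skips headings that are already followed by the directive, so it can be re-run on every build without stacking duplicates.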
// Troubleshooting
My skill.md works on Claude but fails on GPT — what should I do?
This is a known fragility pattern. Investigate which specific guidance the failing model ignores. Common fixes include: strengthening imperative language (use 'MUST' and 'NEVER' instead of 'should'), adding explicit rationale so the model has reasoning to anchor on, moving any still-in-reference-files content into skill.md, and restructuring lists as numbered sequences rather than bullets. Test again after each change. A robust skill should work across model families — single-model success is insufficient.
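A quick way to find candidates for stronger wording is to scan the skill file for soft modal language. In this sketch only 'should' comes from the guidance above; the other flagged words are an assumed, illustrative list.

```python
import re

# 'should' is named in the guidance; the rest of the list is an assumption.
WEAK_LANGUAGE = re.compile(r"\b(should|could|might|consider|try to)\b", re.IGNORECASE)


def find_weak_language(skill_text):
    """Return (line_number, matched_phrase) pairs for soft wording."""
    hits = []
    for number, line in enumerate(skill_text.splitlines(), start=1):
        for match in WEAK_LANGUAGE.finditer(line):
            hits.append((number, match.group(0).lower()))
    return hits


SAMPLE = "\n".join([
    "You should enable RLS on every table.",
    "You MUST run the advisor before generating a migration.",
    "Try to fetch the docs first.",
])

hits = find_weak_language(SAMPLE)
print(hits)
```

Each hit is a prompt for review rather than an automatic rewrite; some soft phrasing is appropriate for genuinely optional guidance.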
Agents still miss security checks even with my skill.md — how do I fix this?
First, confirm the security checks are in skill.md, not in reference files. If they are in reference files, move them immediately. If they're already in skill.md, strengthen the language: use bold formatting, place them in a clearly labeled security checklist section near the top, and add explicit consequences (e.g., 'Omitting this flag exposes user data and violates the platform's security model'). Run targeted evals on the specific failure. If necessary, duplicate the check at multiple relevant points in the skill.
What if my skill.md gets too long?
A skill.md that is too long may cause agents to skip or skim sections. The Rodrigues method explicitly warns against over-populating the first version. If your skill has grown large, audit every section: does each piece of guidance genuinely prevent an incorrect or unsafe outcome? If not, move it to a reference file or remove it entirely. Use evals to verify that trimming doesn't regress performance. A focused, minimal skill.md outperforms a comprehensive but ignored one.
How do I handle skill distribution when there's no package manager for skills?
Repo-bundling is currently the most reliable pattern. Place the skill folder (containing skill.md and any reference files) in a well-known directory within your product's repository — .claude, .cursor, or a dedicated .skills directory. Document the skill's location in your README. Version the skill alongside your codebase with semantic versioning. For cross-repo distribution, consider publishing the skill as a standalone repository or Git submodule. A universal skill registry does not yet exist; plan accordingly.
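As a sketch of what "well-known directory" means in practice, the function below searches a repository for skill.md files under the three directory names mentioned above. The nested folder name `my-product` in the demo is hypothetical, and the search does not assume any particular layout inside those directories.

```python
import tempfile
from pathlib import Path

WELL_KNOWN_DIRS = (".claude", ".cursor", ".skills")


def find_skill_files(repo_root):
    """Return every skill.md found under the well-known directories."""
    root = Path(repo_root)
    found = []
    for name in WELL_KNOWN_DIRS:
        base = root / name
        if base.is_dir():
            found.extend(sorted(base.rglob("skill.md")))
    return found


# Demo against a throwaway directory standing in for a repository.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / ".skills" / "my-product"  # hypothetical folder name
    skill_dir.mkdir(parents=True)
    (skill_dir / "skill.md").write_text("---\nname: Example\n---\n")
    demo_found = [p.relative_to(tmp).as_posix() for p in find_skill_files(tmp)]

print(demo_found)
```

A check like this can run in CI to confirm the skill still ships with the repository after a restructuring.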
// Comparisons
How does the Rodrigues method compare to just writing a system prompt?
System prompts and skill.md serve different purposes. System prompts configure general agent behavior and persona. Skill.md provides product-specific knowledge and workflows that are progressively discovered — agents load skills based on task relevance via front matter matching, rather than carrying all instructions in every conversation. Skills also include reference files, versioning, eval infrastructure, and distribution via repos. A system prompt is a blunt instrument; a skill is a scoped, testable, iterable document.
How does skill.md differ from RAG for giving agents product knowledge?
RAG retrieves relevant documentation chunks at query time, while skill.md provides structured, opinionated instructions that the agent loads as a coherent document. RAG is good for breadth — answering arbitrary questions across a large corpus. Skill.md is good for depth — ensuring critical rules, workflows, and security checks are never missed. They are complementary: skill.md can instruct agents to use a RAG-backed search tool for detailed lookups while keeping non-negotiable guidance in the skill itself.
Is the Rodrigues method only for Supabase or can I use it for any product?
The method is product-agnostic. While Pedro Rodrigues developed it at Supabase, the framework applies to any product or platform where agents need current, specific guidance. It works for API platforms, developer tools, SaaS products, infrastructure services, and any system with security requirements or proprietary workflows. The core principles — don't duplicate docs, keep critical info in skill.md, be opinionated, start minimal — are universal to the problem of closing the context gap.
How does skill.md relate to tool-use and function calling?
Skill.md tells agents how to use tools correctly for your specific product. Tool definitions (via MCP or function calling) tell agents what tools are available and their parameters. Without skill guidance, agents may call the right tools in the wrong order, skip required security steps between tool calls, or use deprecated parameter patterns. The skill is the intelligence layer; tool definitions are the capability layer. You need both for correct agent behavior on complex products.
// Advanced
Can I use the Rodrigues method for internal tools, not just public products?
Yes. Internal tools often have an even larger context gap because they are entirely absent from model training data. The method is especially valuable for internal platforms with proprietary security requirements, custom APIs, or complex multi-step workflows that no public model has encountered. Internal teams can distribute skills via internal repos and test against internal-only eval scenarios. The same principles apply: start minimal, audit failure modes, encode opinionated workflows, and run evals.
How do I version and maintain a skill.md over time?
Treat skill.md as a versioned artifact with the same discipline as software. Create new versions for major changes — API overhauls, new security requirements, workflow restructuring. Use semantic versioning in the skill's metadata. Re-run your eval suite after each update to catch regressions. Monitor agent behavior for new failure modes that indicate the skill needs updating. Don't let the skill go stale; its value depends on being more current than the agent's training data.
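The regression check after an update can be as simple as comparing per-scenario scores between two skill versions. This sketch assumes scores are stored as a mapping from scenario name to score; the scenario names and numbers are placeholders.

```python
def find_regressions(previous, current, tolerance=0.0):
    """Return scenarios whose score dropped by more than `tolerance`."""
    return sorted(
        name
        for name, old_score in previous.items()
        if name in current and current[name] < old_score - tolerance
    )


# Placeholder scores for two hypothetical skill versions.
scores_before = {"rls_policy": 9, "migration_order": 8, "deprecated_endpoint": 7}
scores_after = {"rls_policy": 9, "migration_order": 6, "deprecated_endpoint": 8}

regressions = find_regressions(scores_before, scores_after)
print(regressions)
```

A nonzero `tolerance` is useful when agent runs are noisy and a small score drop does not necessarily indicate a real regression.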
What is docs-over-SSH and when should I use it for agent skills?
Docs-over-SSH is an interface pattern where product documentation is exposed via SSH, allowing agents to navigate it using familiar file-system commands (ls, cat, grep) rather than requiring web-fetching tools. Use it when your agents operate in terminal-based environments (like coding assistants) where file-system navigation is more reliable than HTTP requests. It provides a natural interface for agents to explore documentation programmatically and can be referenced in skill.md as the preferred method for fetching live docs.
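From a script's point of view, a docs-over-SSH lookup is an ordinary `ssh <host> <command>` invocation. The sketch below only builds the argument list; the host name is hypothetical, and it assumes the docs server accepts plain `ls`, `cat`, and `grep` commands, which depends on how a given product exposes its docs.

```python
import shlex

ALLOWED_PROGRAMS = {"ls", "cat", "grep"}
DOCS_HOST = "docs.example.com"  # hypothetical host, replace with your own


def docs_argv(remote_command, host=DOCS_HOST):
    """Build the argv for running one docs command over SSH.

    The allowlist checks the first word only. It is a convenience guard
    against typos, not a security boundary.
    """
    parts = shlex.split(remote_command)
    if not parts or parts[0] not in ALLOWED_PROGRAMS:
        raise ValueError(f"unsupported docs command: {remote_command!r}")
    return ["ssh", host, remote_command]


argv = docs_argv("grep -ril 'row level security' guides")
print(argv)
# To execute it for real:
#   subprocess.run(argv, capture_output=True, text=True, check=True)
```

In skill.md itself the same thing would be written as a literal command for the agent to run, alongside the directive to fetch docs before acting.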
How many reference files should a skill include?
As few as possible. Design for the reality that agents typically load at most one reference file per task and rarely load a second. Each reference file should be self-contained for a single sub-topic. Link to them explicitly from skill.md with clear conditions for when each should be loaded. If evals consistently show a reference file being skipped, either its content isn't important (remove it) or it is important (promote it to skill.md). Three to five reference files is a practical upper bound.
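The keep, remove, or promote decision can be written down as a small rule over eval results. The `load_rate` input (the fraction of relevant eval runs in which the agent opened the file) and the 0.5 threshold are assumptions for illustration.

```python
def triage_reference_file(load_rate, needed_for_correctness, skip_threshold=0.5):
    """Decide what to do with a reference file based on eval evidence.

    load_rate: fraction of relevant eval runs in which the file was loaded.
    needed_for_correctness: True if skipping it led to wrong or unsafe output.
    """
    if not 0.0 <= load_rate <= 1.0:
        raise ValueError("load_rate must be between 0 and 1")
    if load_rate >= skip_threshold:
        return "keep"
    return "promote to skill.md" if needed_for_correctness else "remove"


print(triage_reference_file(0.9, needed_for_correctness=True))
print(triage_reference_file(0.1, needed_for_correctness=True))
print(triage_reference_file(0.1, needed_for_correctness=False))
```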
What scoring system should I use for skill evals?
Use a graded completeness score rather than binary pass/fail. A practical approach: score each eval scenario from 0 to 10 based on how completely and correctly the agent performed the task. Define specific scoring criteria per scenario — for example, correct API usage (2 points), security flag present (3 points), correct workflow order (3 points), used live docs instead of training data (2 points). Compare aggregate scores across baseline, MCP-only, and MCP+skill conditions to quantify the skill's impact.
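The example rubric above translates directly into code. This sketch uses the four criteria and point values from the answer; the criterion keys and the sample runs are illustrative, not real results.

```python
# Point values follow the example rubric above and sum to 10.
RUBRIC = {
    "correct_api_usage": 2,
    "security_flag_present": 3,
    "correct_workflow_order": 3,
    "used_live_docs": 2,
}


def score_run(checks, rubric=RUBRIC):
    """Sum the points for every criterion the run satisfied."""
    unknown = set(checks) - set(rubric)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(points for name, points in rubric.items() if checks.get(name, False))


def aggregate(runs_by_condition):
    """Return the mean score per condition."""
    return {
        condition: sum(score_run(run) for run in runs) / len(runs)
        for condition, runs in runs_by_condition.items()
    }


# Illustrative run: everything correct except the security flag.
example = {
    "correct_api_usage": True,
    "security_flag_present": False,
    "correct_workflow_order": True,
    "used_live_docs": True,
}
example_score = score_run(example)
print(example_score)
```

Because the security criterion carries three points, a run that gets everything else right still scores 7 out of 10, which keeps a missed security flag visible in the aggregate.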