How Do Platform Teams Build Agent Skills That Work?

For developer platform teams · Based on the Rodrigues Product Skill Architecture Method

// TL;DR

Developer platform teams use the Rodrigues Product Skill Architecture Method to create skill.md files that prevent AI agents from bypassing security policies, using deprecated APIs, or following incorrect workflows on their platform. The method starts by auditing known agent failure modes, then encodes non-negotiable security rules and opinionated workflows directly into skill.md rather than into reference files that agents are likely to skip. Evals run across multiple models then check whether the skill actually improves agent behavior. This is essential for any platform where incorrect agent behavior could expose user data or break production systems.

Why Do AI Agents Get Your Platform Wrong?

AI agents interact with developer platforms using knowledge from their training data — which is often stale, incomplete, or outright wrong for your current API surface. The result is agents that skip required security flags, call deprecated endpoints, generate redundant artifacts, and follow workflows that made sense two versions ago but cause problems today.

This is the context gap: the delta between what an agent learned during training and what it needs to know to work correctly with your platform right now.

The Rodrigues Product Skill Architecture Method, developed by Pedro Rodrigues at Supabase, gives platform teams a systematic framework for closing this gap through structured skill.md documents.

How Do You Build a Skill for Your Platform?

Start with an audit. List every way agents currently get your platform wrong. Common failure categories for developer platforms include:

- Missing security flags (e.g., row-level security bypasses when creating SQL views)

- Using deprecated API endpoints or parameter patterns

- Generating redundant migration files on every incremental change

- Ignoring your platform's recommended workflow ordering

This audit defines the scope of your skill. Every failure mode becomes a test case in your eval suite.
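As a rough illustration of turning the audit into test cases, the sketch below stores each failure mode as a record and derives one eval scenario per record. The `FailureMode` class, the keys, and the scenario dictionary shape are all hypothetical names chosen for this example, not part of the method itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One way agents currently get the platform wrong (illustrative only)."""
    key: str
    description: str
    security_critical: bool


# Hypothetical audit results mirroring the categories listed above.
AUDIT = [
    FailureMode("missing_security_flag",
                "Creates a SQL view that bypasses row-level security", True),
    FailureMode("deprecated_api",
                "Calls a deprecated endpoint or parameter pattern", False),
    FailureMode("redundant_migrations",
                "Generates a migration file on every incremental change", False),
    FailureMode("workflow_order",
                "Ignores the recommended workflow ordering", False),
]


def eval_scenarios(audit):
    """Derive one eval scenario per audited failure mode."""
    return [
        {
            "scenario": f"eval_{fm.key}",
            "failure_mode": fm.key,
            "checks": fm.description,
            "security_critical": fm.security_critical,
        }
        for fm in audit
    ]


def uncovered(audit, scenarios):
    """Return the failure-mode keys that no scenario covers."""
    covered = {s["failure_mode"] for s in scenarios}
    return [fm.key for fm in audit if fm.key not in covered]
```

Running `uncovered(AUDIT, eval_scenarios(AUDIT))` should return an empty list. A non-empty result means the eval suite has drifted from the audit.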

Next, identify your single source of truth — your canonical documentation. The skill will point agents here persistently and emphatically, rather than duplicating content. If your docs are available via URL, SSH, or a semantic search layer, specify the exact access mechanism in skill.md.

Then apply the critical classification rule: if the agent missing a piece of guidance would cause an incorrect or unsafe outcome, it goes directly into skill.md. Everything else can be a reference file — but assume agents will skip it. Security checklists, non-negotiable flags, and opinionated workflows are skill.md content. Optional configuration details and edge-case documentation are reference file candidates.
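The classification rule can be expressed as a small decision function. This is a minimal sketch: the `Guidance` fields and the two return labels are assumptions made for illustration, and deciding whether a miss is "unsafe" or "incorrect" remains a human judgment call.

```python
from dataclasses import dataclass


@dataclass
class Guidance:
    """A single piece of guidance under review (hypothetical structure)."""
    title: str
    unsafe_if_missed: bool     # a miss could expose data or break production
    incorrect_if_missed: bool  # a miss would produce a wrong result


def placement(item: Guidance) -> str:
    """Apply the rule: anything that must not be missed goes in skill.md."""
    if item.unsafe_if_missed or item.incorrect_if_missed:
        return "skill.md"
    # Everything else may live in a reference file, which agents may skip.
    return "reference"


items = [
    Guidance("Security checklist for SQL views", True, True),
    Guidance("Recommended migration workflow", False, True),
    Guidance("Optional configuration details", False, False),
]
for item in items:
    print(f"{item.title}: {placement(item)}")
```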

What Should Your Skill's Opinionated Workflows Look Like?

Platform teams know the right way to do things on their platform. Encode that knowledge as explicit, ordered sequences:

1. Make schema changes freely in the development environment

2. Run the platform's advisor/linter tool to surface security and performance issues

3. Fix all flagged issues

4. Only then generate the migration file

Include the rationale for each ordering decision, because agents anchor better on reasoning than on bare rules. For example, explain that generating a migration after every small edit creates version control noise and deployment issues, which is why migration generation comes last.
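The ordering above can also be checked mechanically in an eval, for instance by inspecting the sequence of actions an agent took. The sketch below assumes a hypothetical action log made of four string labels, and it assumes that applying fixes does not require a second advisor run. Adjust both to match your platform.

```python
# Hypothetical action labels for the four workflow steps described above.
EDIT, ADVISOR, FIX, MIGRATE = (
    "edit_schema", "run_advisor", "fix_issues", "generate_migration",
)


def check_order(actions):
    """Return a message describing the first ordering violation, or None."""
    advisor_ran = False  # has the advisor run since the latest schema edit?
    for index, action in enumerate(actions):
        if action == EDIT:
            # New edits invalidate any earlier advisor run.
            advisor_ran = False
        elif action == ADVISOR:
            advisor_ran = True
        elif action in (FIX, MIGRATE) and not advisor_ran:
            return (f"step {index}: '{action}' happened before the advisor "
                    "ran on the latest schema edits")
    return None


good = [EDIT, EDIT, ADVISOR, FIX, MIGRATE]
bad = [EDIT, MIGRATE, EDIT, MIGRATE]
print(check_order(good))  # None: the recommended order was followed
print(check_order(bad))   # reports the violation at step 1
```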

How Do You Validate That Your Skill Actually Works?

Write at least six eval scenarios covering your known failure modes. Run each under three conditions:

- Baseline: no MCP tools, no skill — raw model only

- MCP-only: tools available, but no skill guidance

- MCP + skill: full stack with skill.md loaded

Score each run on a graded completeness scale. The MCP+skill condition should consistently outperform the others, especially on security-critical tasks.
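One way to organize these runs is as a scenario × model × condition matrix with a score per cell. The sketch below is an assumption-laden outline: the condition labels, the 0.0 to 1.0 score range, and the function names are hypothetical, and actually executing an agent run is left out.

```python
from itertools import product
from statistics import mean

# Labels for the three conditions described above (names are illustrative).
CONDITIONS = ("baseline", "mcp_only", "mcp_plus_skill")


def build_runs(scenarios, models):
    """Enumerate every scenario/model/condition combination to execute."""
    return [
        {"scenario": s, "model": m, "condition": c}
        for s, m, c in product(scenarios, models, CONDITIONS)
    ]


def mean_by_condition(scores):
    """Average graded scores per condition.

    `scores` maps (scenario, model, condition) to a completeness score,
    assumed here to lie between 0.0 and 1.0.
    """
    result = {}
    for condition in CONDITIONS:
        values = [v for (_, _, c), v in scores.items() if c == condition]
        result[condition] = mean(values) if values else None
    return result


def skill_helps(scores):
    """True if MCP + skill beats both other conditions on average."""
    means = mean_by_condition(scores)
    if None in means.values():
        return False  # not enough data to compare
    full = means["mcp_plus_skill"]
    return full > means["baseline"] and full > means["mcp_only"]
```

With six scenarios and two models, `build_runs` yields 36 runs. Averages can hide a regression on a single security-critical scenario, so it is worth inspecting those cells individually as well.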

Test across at least two model families. A skill that works on Claude but fails on GPT is fragile. If a model fails, strengthen the skill.md language rather than assuming the model is at fault.

When evals show agents skipped reference file content, promote that content to skill.md. Iterate until your eval scores are consistently high across models and scenarios.
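If your eval harness records which reference files each run loaded, promotion candidates can be listed automatically. The sketch below assumes a hypothetical run record with `passed`, `needed`, and `loaded` fields. No particular harness is implied, and the file names are made up.

```python
def promotion_candidates(runs):
    """List reference files that a failed run needed but never loaded."""
    candidates = set()
    for run in runs:
        if run["passed"]:
            continue
        candidates |= set(run["needed"]) - set(run["loaded"])
    return sorted(candidates)


# Hypothetical run records from an eval harness.
runs = [
    {"passed": True, "needed": ["views.md"], "loaded": ["views.md"]},
    {"passed": False, "needed": ["views.md", "auth.md"], "loaded": ["auth.md"]},
    {"passed": False, "needed": ["migrations.md"], "loaded": []},
]
print(promotion_candidates(runs))  # ['migrations.md', 'views.md']
```

Each file in the result is content that agents skipped in a failing run, which makes it a candidate for moving into skill.md.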

What's the Best Way to Distribute Your Skill?

Bundle the skill folder inside your platform's repository — in a `.claude`, `.cursor`, or `.skills` directory. Version it alongside your codebase. Document its location in your README. There is currently no universal skill package manager; repo-bundling is the most reliable distribution pattern.
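A small repository check can confirm the skill is actually bundled where the README says it is. This sketch searches the three directory names mentioned above for a file named `skill.md`. The nested folder layout it tolerates is an assumption, since the exact structure depends on the agent tooling you target.

```python
from pathlib import Path

# Directory names taken from the text above.
SKILL_DIRS = (".claude", ".cursor", ".skills")


def find_skill_files(repo_root):
    """Return every skill.md found under the known skill directories."""
    root = Path(repo_root)
    found = []
    for name in SKILL_DIRS:
        base = root / name
        if base.is_dir():
            found.extend(sorted(base.rglob("skill.md")))
    return found


if __name__ == "__main__":
    matches = find_skill_files(".")
    if matches:
        for path in matches:
            print(f"found skill: {path}")
    else:
        print("no bundled skill.md found")
```

Running a check like this in CI is one way to keep the documented location and the real location from diverging.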

Start building your skill today: audit your failure modes, draft a minimal skill.md with your non-negotiable rules, write an initial three eval scenarios, and run them. Expand from there based on results, working toward the six or more scenarios recommended above.

// FREQUENTLY ASKED QUESTIONS

How do platform teams decide what goes in skill.md versus reference files?

Apply the 'if it can be skipped, it will be skipped' principle. Security checklists, required flags, and core workflow sequences go in skill.md because agents must never miss them. API usage examples for uncommon endpoints, optional configuration details, and supplementary context can go in reference files — but design with the assumption that agents will load at most one reference file per task.

How many evals should a platform team write for their skill?

At least six scenarios covering your most critical failure modes. Include scenarios for security bypasses, deprecated API usage, workflow ordering violations, and any platform-specific gotchas. Run each scenario across baseline, MCP-only, and MCP+skill conditions on at least two model families. More scenarios give more confidence, but start with your highest-impact failure modes and expand from there.

Can a platform team use skill.md alongside their existing MCP server?

Yes, and you should. MCP provides the tools: the actions agents can take on your platform. The skill.md file provides the guidance: which tools to use, in what order, and with which safety checks. Without the skill, agents with MCP access will still miss security steps and follow incorrect workflows. The two are complementary, with MCP as the capability layer and skill.md as the intelligence layer.