How Should AI PMs Align Product Roadmaps With Research?
For AI product managers at enterprise companies · Based on Emit Jane Luma Foundation Lab Method
// TL;DR
AI product managers should stop building product roadmaps independently from research roadmaps. The Foundation Lab Method shows PMs how to evaluate every feature through the lens of joint optimization: does this feature generate training data for the next model? Does the next model improvement eliminate the need for this feature? PMs should target professions with end-to-end workflow solutions, build the thinnest possible product layer, deploy Forward Deployed Creatives to capture process data, and use the 10x logarithmic test to evaluate whether scaling or architecture changes will solve capability gaps.
How should AI product managers think about feature prioritization?
Every feature decision in an AI product should pass two tests: (1) does building this feature generate training data that will improve the next model, and (2) would the next model improvement eliminate the need for this feature entirely?
A feature that generates no training data and that the next model would make unnecessary is misaligned with the Foundation Lab Method. Features that pass the first test by generating training signals are the highest priority, even if they seem simple. Features that exist only to compensate for model gaps and generate no training data are the dangerous ones: they represent six to eight months of engineering that the next model iteration makes irrelevant.
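The two tests can be written down as a small triage rule. The sketch below is only an illustration: the `Feature` fields and the priority labels are hypothetical names chosen for this example, and the "review" bucket is an assumption for a case the method does not address directly.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    generates_training_data: bool   # test 1: does it feed the next training run?
    obsoleted_by_next_model: bool   # test 2: would the next model remove the need?


def triage(feature: Feature) -> str:
    """Return an illustrative priority label based on the two tests."""
    if feature.generates_training_data:
        # Training signal comes first, even for simple features.
        return "highest"
    if feature.obsoleted_by_next_model:
        # Compensates for a model gap and yields no data: an engineering harness.
        return "avoid"
    # No data, but not a stopgap either. Assumption: needs a case-by-case call.
    return "review"


backlog = [
    Feature("capture editor corrections", True, False),
    Feature("multi-agent prompt chain", False, True),
]
for f in backlog:
    print(f"{f.name}: {triage(f)}")
```

Running the same rule over the whole backlog gives a quick first pass at which items deserve discussion with the research team.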
The PM's primary job in a foundation lab is managing the compound loop between product and research, not building a traditional feature backlog.
What's wrong with building engineering harnesses to cover model gaps?
Engineering harnesses—complex orchestration systems, multi-agent pipelines, prompt chains—are the most common trap for AI PMs. They feel productive because they solve immediate user pain. But they are dead ends.
The Foundation Lab Method is explicit: if the model doesn't do something today, that is a data collection job for the next training run, achievable in two to three weeks. It is not a justification for a multi-month engineering project. Every harness you build becomes technical debt that the next model must pay down, and often the next model makes the entire harness irrelevant.
As a PM, your job is to resist the organizational pressure to build workarounds and instead redirect energy toward collecting the specific training data that would make the model natively capable. Work with your research counterpart (in a true foundation lab, this is the same team) to define the data collection that would close each gap.
How should AI PMs define their target users?
Stop thinking in verticals. Think in professions.
When your sales team says 'we're targeting the entertainment vertical,' push back. Ask instead: which profession are we serving? Filmmakers? Editors? Marketing directors at studios? Each profession has a specific end-to-end workflow with specific failure modes and specific magic moments.
A filmmaker's end-to-end workflow is concept → shoot → edit → set changes → final output. If your product only solves 'make a clip,' you're doing spot work, not delivering the promise of AI. The Foundation Lab Method insists that spot-work solutions will be commoditized, while end-to-end solutions for specific professions hold durable value.
Map the full workflow of your target professions. Identify where the model succeeds natively and where it fails. Each failure point is a process data collection opportunity, not a feature request for engineering.
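One way to keep such a map is as a list of stages, each flagged with whether the model handles it natively. This is a minimal sketch: the `Stage` type and the function name are hypothetical, and the True/False values are placeholders, not real assessments of any model.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    model_succeeds_natively: bool


def data_collection_jobs(profession: str, stages: list[Stage]) -> list[str]:
    """Turn every failure point in a workflow into a data collection job."""
    return [
        f"{profession}: collect process data for '{stage.name}'"
        for stage in stages
        if not stage.model_succeeds_natively
    ]


# Filmmaker workflow from the text; the success flags are illustrative only.
filmmaker_workflow = [
    Stage("concept", True),
    Stage("shoot", True),
    Stage("edit", False),
    Stage("set changes", False),
    Stage("final output", True),
]

for job in data_collection_jobs("filmmaker", filmmaker_workflow):
    print(job)
```

The output is a list of data collection jobs rather than feature tickets, which matches the framing that failure points belong to the next training run.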
How do AI PMs evaluate whether to launch consumer features?
Apply the intelligence threshold test before every consumer-facing launch. The test: does the model understand why this content would be interesting to this specific person in this specific context?
If your model can generate impressive outputs but can't tailor them to individual context, humor, and relevance, consumer features will produce novelty spikes followed by retention collapse. In generative AI, PMs often mistake initial engagement metrics for product-market fit, and that mistake is costly.
Focus product energy on enterprise and professional users until the model passes the intelligence threshold. Enterprise customers have clear end-to-end workflow problems where contextual entertainment intelligence isn't required. Every enterprise deployment through Forward Deployed Creatives generates the process data needed to eventually reach consumer-grade intelligence.
Audit your current product roadmap against these principles today. Flag every feature that's an engineering harness, reframe every vertical target as a profession, and establish a shared metric with your research team measuring the compound loop's velocity.
// FREQUENTLY ASKED QUESTIONS
How do I align my AI product roadmap with research priorities?
Create one unified roadmap with shared metrics. Every product feature should either generate training data for the next model or directly benefit from the next model improvement. If a feature does neither, deprioritize it. Replace separate product and research planning cycles with joint sessions where capability gaps are translated into data collection jobs rather than engineering projects. Measure the velocity of the compound loop, not just feature ship rate.
Should I build orchestration features to compensate for model limitations?
Avoid it. The Foundation Lab Method classifies engineering harnesses as six-to-eight-month dead ends. Each model capability gap should be treated as a two-to-three-week data collection job for the next training run. Building complex workarounds creates technical debt that the next model iteration invalidates. Instead, ship the thinnest product that delivers value today and redirect engineering effort toward collecting the specific training data that would make the model natively capable.
How do I measure success as a PM in a foundation lab?
Measure the compound loop velocity: how fast does product usage generate training data, and how fast does each model improvement reduce product stack thickness? Track process data volume captured per deployment, the number of engineering harnesses eliminated by each model iteration, and end-to-end workflow completion rates for target professions. Traditional PM metrics like feature count or sprint velocity are misleading in a foundation lab context.
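Two of these measures can be computed from simple deployment records. The sketch below assumes a hypothetical `Deployment` record; the field names and the idea of counting "process data records" are illustrative choices, not definitions from the Foundation Lab Method.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    profession: str
    process_data_records: int   # process data captured during this deployment
    workflows_started: int
    workflows_completed: int


def data_per_deployment(deployments: list[Deployment]) -> float:
    """Average process data volume captured per deployment."""
    if not deployments:
        return 0.0
    return sum(d.process_data_records for d in deployments) / len(deployments)


def completion_rate(deployments: list[Deployment], profession: str) -> float:
    """End-to-end workflow completion rate for one target profession."""
    matching = [d for d in deployments if d.profession == profession]
    started = sum(d.workflows_started for d in matching)
    completed = sum(d.workflows_completed for d in matching)
    return completed / started if started else 0.0
```

The count of engineering harnesses eliminated per model iteration can be tracked the same way, as one integer per release, so that all three numbers sit on a single dashboard shared with research.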