How Should AI Product Managers Align Product and Research?

For AI product managers at large tech companies · Based on Emit Jane Luma Foundation Lab Method

// TL;DR

AI product managers in large organizations often face a structural problem: research and product roadmaps that are misaligned. The Foundation Lab Method provides a framework for eliminating this divide: every product feature should be reframed as a model capability question, and every model improvement should be defined by the product capability it unlocks. This framework teaches PMs to build thin product stacks, target professions over verticals, deploy Forward Deployed Creatives who both support customers and feed intelligence back to research, and apply the intelligence threshold test before consumer launches. Use it to stop building engineering harnesses that the next model makes irrelevant.

Why do engineering harnesses around model gaps fail?

When the base model cannot do something the product needs, the instinct is to build an engineering workaround — multi-step prompt chains, orchestration systems, retrieval-augmented scaffolding. The Foundation Lab Method calls these 'spaghetti harnesses' and identifies them as six-to-eight-month dead ends.

The reasoning is straightforward: the next model iteration will likely close the gap natively, making the harness irrelevant. Worse, the harness consumes engineering resources that could have been spent collecting the training data needed to close the gap in the model itself.

The correct response to a model capability gap is to treat it as a data collection job for the next training run — typically a two-to-three-week effort — not a months-long engineering project. As a product manager, your job is to maintain the thinnest possible product layer on top of base model capability and ensure that the model absorbs complexity with each iteration.

How should AI product managers think about target markets?

Stop thinking in verticals. 'Healthcare AI' or 'entertainment AI' are abstractions that obscure what real humans need. Think in professions: surgeons, filmmakers, financial analysts, architects. Each profession has a specific end-to-end workflow with specific failure modes and specific magic moments.

The key question is: what does end-to-end look like for this profession? A filmmaker's end-to-end is concept → shoot → edit → set changes → final output — not 'make a clip.' A marketer's end-to-end is understanding the environment → resonant message → localized assets at scale. Your product must target the full workflow, not a fragment.

Spot work products — 'I can make an image' or 'I can write some copy' — will be commoditized as base models improve. End-to-end workflow products built on process data from real professionals create durable competitive advantage.

How do I set up the right feedback loop between product and research?

Deploy Forward Deployed Creatives (FDCs) to your most important enterprise customers. FDCs are domain experts embedded in customer organizations who serve two functions simultaneously: (1) helping customers successfully deploy AI into their workflows, and (2) piping structured intelligence back to the research team about what works, what breaks, and what data is needed.

Every enterprise deployment should be treated as an optimization experiment, not a support ticket. The product telemetry should capture process data — the full path from initial prompt to final output, including iterations, edits, and decision points. This process data is what the internet cannot supply and what end-to-end agents require for training.
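
To make "process data" concrete, here is a minimal Python sketch of what one logged session could look like. The schema is an assumption for illustration, not part of the Foundation Lab Method: the class names, the event labels ("prompt", "iteration", "edit", "decision", "final_output"), and the sample customer are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessEvent:
    """One step on the path from initial prompt to final output."""
    kind: str    # hypothetical labels: prompt, iteration, edit, decision, final_output
    detail: str  # free-text description or a reference to the stored payload


@dataclass
class ProcessTrace:
    """The full path for one piece of work (hypothetical schema)."""
    customer: str
    profession: str
    events: List[ProcessEvent] = field(default_factory=list)

    def log(self, kind: str, detail: str) -> None:
        self.events.append(ProcessEvent(kind, detail))

    def is_complete(self) -> bool:
        # Only traces that run from a prompt to a final output
        # describe an end-to-end workflow.
        return (
            len(self.events) >= 2
            and self.events[0].kind == "prompt"
            and self.events[-1].kind == "final_output"
        )

    def iteration_count(self) -> int:
        # Count the rework steps between the first prompt and the final output.
        return sum(1 for e in self.events if e.kind in ("iteration", "edit"))


# Illustrative session; every value here is made up.
trace = ProcessTrace(customer="example-studio", profession="filmmaker")
trace.log("prompt", "wide shot of a harbor at dawn")
trace.log("iteration", "regenerated with warmer lighting")
trace.log("edit", "trimmed the first two seconds")
trace.log("decision", "chose take 2 over take 1")
trace.log("final_output", "exported clip")
```

The point of the design is that the trace keeps the intermediate steps, not just the first prompt and the last artifact. A completeness check like `is_complete()` is one simple way to separate end-to-end sessions from abandoned ones before anything is handed to research.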

Build your product roadmap and your research roadmap as one document. If a product feature request doesn't translate into a model training priority, question whether it belongs in the roadmap at all.

When should I recommend launching a consumer product?

Apply the intelligence threshold test. Ask: does the model understand context, humor, and the local state of this specific user well enough that the output would be genuinely interesting to that person? If the answer is no, a consumer launch will produce a novelty spike followed by retention collapse.

A generated video is not interesting because it is generated — it is interesting because of what is happening in it. Until the model passes this threshold, enterprise and professional deployments are where you'll find sustainable engagement and the process data to improve the model toward eventual consumer viability.

Next step: Audit your current product stack. Identify every component that exists to compensate for a model capability gap. For each one, estimate whether the next training run could close the gap natively, and reallocate engineering resources from harness maintenance to data collection for model training.

// FREQUENTLY ASKED QUESTIONS

How do I measure whether my product stack is getting thinner over time?

Track the number of engineering components that exist solely to compensate for model limitations — prompt chains, orchestration layers, rule-based post-processing. After each model iteration, measure how many of these components can be eliminated because the model now handles the capability natively. A healthy foundation lab shows a declining count of workaround components and an increasing percentage of product value delivered directly by the base model.
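
One way to keep this count honest is to record the stack as data and compare it across model iterations. The Python sketch below assumes you tag each component by hand as a workaround or not; the `Component` type, the function names, and the sample stacks are hypothetical. It covers only the workaround count, not the share of product value delivered by the base model, which needs its own measurement.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    name: str
    is_workaround: bool  # True if it exists solely to compensate for a model limitation


def workaround_count(stack: List[Component]) -> int:
    """Number of components that only exist to cover a model gap."""
    return sum(1 for c in stack if c.is_workaround)


def retired_workarounds(before: List[Component], after: List[Component]) -> List[str]:
    """Workarounds present before a model iteration and gone after it."""
    remaining = {c.name for c in after}
    return sorted(c.name for c in before if c.is_workaround and c.name not in remaining)


# Illustrative stacks before and after one model iteration (made-up data).
stack_v1 = [
    Component("prompt_chain", True),
    Component("orchestration_layer", True),
    Component("rule_based_post_processing", True),
    Component("editor_ui", False),
]
stack_v2 = [
    Component("orchestration_layer", True),
    Component("editor_ui", False),
]

print(workaround_count(stack_v1), workaround_count(stack_v2))
print(retired_workarounds(stack_v1, stack_v2))
```

Running the comparison after each model release gives you a trend line. If the count is flat or rising, the stack is getting thicker rather than thinner.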

How do I prioritize which profession to target first?

Choose the profession where your current model capability covers the largest share of the end-to-end workflow with the thinnest product stack. Also weight for process data value: professions whose workflows generate the most useful training signals for your next model iteration should be prioritized. The ideal first profession has clear end-to-end workflow boundaries, high tolerance for AI augmentation, and workflows that produce rich process data when logged.
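
If it helps to make the trade-off explicit, the criteria above can be turned into a rough score. The Python sketch below is one possible weighting, not a formula from the Foundation Lab Method: the fields, the weights, and every number in the sample data are assumptions you would replace with your own estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfessionCandidate:
    name: str
    workflow_coverage: float   # 0..1, share of the end-to-end workflow the model handles today
    stack_thickness: int       # workaround components needed to ship for this profession
    process_data_value: float  # 0..1, estimated usefulness of logged workflows for training


def score(p: ProfessionCandidate,
          coverage_weight: float = 1.0,
          data_weight: float = 1.0,
          thickness_penalty: float = 0.1) -> float:
    """Higher is better: reward coverage and data value, penalize a thick stack."""
    return (coverage_weight * p.workflow_coverage
            + data_weight * p.process_data_value
            - thickness_penalty * p.stack_thickness)


# Made-up estimates, purely for illustration.
candidates = [
    ProfessionCandidate("financial analyst", 0.5, 5, 0.6),
    ProfessionCandidate("filmmaker", 0.7, 2, 0.8),
    ProfessionCandidate("architect", 0.4, 3, 0.9),
]

ranked = sorted(candidates, key=score, reverse=True)
for p in ranked:
    print(f"{p.name}: {score(p):.2f}")
```

The score is only as good as the estimates behind it. Its main use is forcing the team to write those estimates down and argue about them.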

What does a unified product-research roadmap look like?

Each item on the roadmap has two columns: the product capability it unlocks and the model training investment it requires. Feature requests from customers are translated into model capability gaps. Model capability milestones are defined by the product features they enable. There is no 'research backlog' separate from the 'product backlog.' The roadmap is sequenced by the compounding effect — items that simultaneously deliver product value and generate high-quality training data for the next iteration are prioritized highest.
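
As a rough illustration, the two-column roadmap can be represented as a single list of items that each carry both columns. The Python sketch below is an assumption about how you might encode it: the `RoadmapItem` fields, the simple 0-to-2 compounding rank, and the sample items are hypothetical, not a prescribed format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RoadmapItem:
    product_capability: str    # column 1: what the item unlocks in the product
    training_investment: str   # column 2: what model training it requires ("" if none)
    delivers_product_value: bool
    generates_training_data: bool


def compounding_rank(item: RoadmapItem) -> int:
    """2 if the item both ships value and produces training data, else 1 or 0."""
    return int(item.delivers_product_value) + int(item.generates_training_data)


def sequence(items: List[RoadmapItem]) -> List[RoadmapItem]:
    """Order the single roadmap so compounding items come first (stable sort)."""
    return sorted(items, key=compounding_rank, reverse=True)


def needs_review(items: List[RoadmapItem]) -> List[RoadmapItem]:
    """Items with no training investment: question whether they belong at all."""
    return [i for i in items if not i.training_investment.strip()]


# Made-up roadmap entries for illustration.
roadmap = [
    RoadmapItem("Custom export presets", "", True, False),
    RoadmapItem("Consistent characters across shots",
                "Collect multi-shot edit traces from professional users", True, True),
]

for item in sequence(roadmap):
    print(compounding_rank(item), item.product_capability)
print([i.product_capability for i in needs_review(roadmap)])
```

Keeping both columns on the same record is what prevents a separate research backlog from forming. An item with an empty training column is not automatically wrong, but it is the one to challenge first.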