Amit Jain Luma Foundation Lab Method

Apply Luma's foundation lab methodology to design AI companies, products, and research loops that are jointly optimized end-to-end — so that product and research compound each other rather than compete.

// TL;DR

The Amit Jain Luma Foundation Lab Method is a strategic framework for designing AI companies where product and research are one unified system, not separate teams. It teaches you to jointly optimize model training and product deployment so they compound each other. Use it when planning an AI company's research-product strategy, deciding between vertical vs. generalist systems, evaluating scaling bets, or designing data flywheels that capture process data (how artifacts are made) rather than just artifacts. The method targets professions instead of verticals and enforces a thin product stack on top of base model capability, with multimodal AGI as the north star.

// When should I use the Luma Foundation Lab Method?

Use this skill when designing or evaluating an AI company's research-product strategy, deciding whether to build a narrow vertical product vs. a generalist system, or planning how to couple model training with real customer usage data.

// What inputs do I need to apply the Foundation Lab Method?

  • Company or project description (required)
    What the company or project is building — the core modality, problem domain, or product area (e.g., visual AI, coding agents, robotics).
  • Current stage (required)
    Where the team is today: pre-product, early product, scaling, or enterprise deployment.
  • Target professions or customer segments (required)
    Which professions or types of people the product serves — framed as professions, not verticals (e.g., filmmakers, marketers, game artists, not 'entertainment industry').
  • Known model capability gaps
    What the current model cannot do that the product needs — the delta between model capability and product promise.
  • Data availability
    What proprietary or scarce data exists or could be collected that the internet cannot supply (e.g., process data, how artifacts were made, not just the artifacts themselves).
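
As a rough illustration, the inputs above could be captured in a small structure that checks the three required fields before the method is applied. This is only a sketch: the class name `MethodInputs`, its field names, and the `problems` helper are hypothetical and not part of the method itself; the stage names come from the "Current stage" input.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Stage names taken from the "Current stage" input above.
STAGES = ("pre-product", "early product", "scaling", "enterprise deployment")


@dataclass
class MethodInputs:  # hypothetical name, used for illustration only
    description: str                       # required
    stage: str                             # required, one of STAGES
    professions: List[str]                 # required, professions not verticals
    capability_gaps: List[str] = field(default_factory=list)  # optional
    data_availability: Optional[str] = None                   # optional

    def problems(self) -> List[str]:
        """Return a list of missing or invalid required inputs."""
        found = []
        if not self.description.strip():
            found.append("description is required")
        if self.stage not in STAGES:
            found.append("stage must be one of: " + ", ".join(STAGES))
        if not self.professions:
            found.append("at least one target profession is required")
        return found


inputs = MethodInputs(
    description="visual AI for film production",
    stage="early product",
    professions=["filmmakers", "game artists"],
)
print(inputs.problems())  # prints []
```

The two optional inputs default to empty values, so a team can start with only the required fields and fill in gaps and data sources later.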

// What are the core principles of the Foundation Lab Method?

Foundation Lab: No Product or Research — Only One Thing

In a foundation lab, product and research are not separate functions. Research produces the product, and the product feeds research by generating training signal. The secret to building a great company in this space is to treat them as one unified system. Foundation labs are the blueprint of companies of the future.

End-to-End Optimization as Prime Directive

The only way AI systems will do meaningful things in the world is through joint end-to-end optimization — top to bottom. Never optimize a narrow sub-problem in isolation for months; if the model doesn't do it today, that is a data collection job for the next training run, not an engineering harness problem.

Promise of AI Is Not Spot Work

The promise of AI is not doing a little bit of spot work. 'I can make copy' or 'I can make a little image for you' is not the goal. The goal is the full end-to-end solution: the book, the campaign, the film production — not a fragment of it. Customers want end-to-end solutions to their problems, not promises of solutions.

Think in Professions, Not Verticals

Do not think about verticals when designing products. Think about professions — which kinds of people you can help. Verticals are abstractions; professions are humans with specific workflows and end-to-end problems that generalist systems must solve.

Thin Stack on Top of Base Model Capability

Build the thinnest possible product on top of base model capability. If the product ends up being 'a little bit fat,' the next model's job is to reduce that fatness. Avoid spaghetti harnesses and complex workaround systems — they are six-month dead ends that the next model iteration makes irrelevant.

Data Flywheel via Deployed Agents

The internet gives you artifacts but not the process of how artifacts were made. To train agents that do end-to-end work, you need process data — the actions, iterations, and decisions that produced the final output. Deploy agents to real customers, observe the best creatives using them, and feed that intelligence directly back into model training.
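
One way to picture the difference between artifact data and process data is a session log that keeps every step on the path to the output, not only the output. The sketch below is an assumption about what such a log might look like; the names `ProcessStep` and `Session`, the action labels, and the example file name are all hypothetical.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ProcessStep:
    action: str  # hypothetical labels, e.g. "prompt", "revise", "reject", "accept"
    detail: str


@dataclass
class Session:
    profession: str
    steps: List[ProcessStep] = field(default_factory=list)
    artifact: str = ""  # reference to the finished output

    def log(self, action: str, detail: str) -> None:
        """Record one action, iteration, or decision on the way to the artifact."""
        self.steps.append(ProcessStep(action, detail))

    def artifact_record(self) -> dict:
        """What the internet typically supplies: the finished output only."""
        return {"artifact": self.artifact}

    def process_record(self) -> dict:
        """What end-to-end training needs: the full path plus the output."""
        return asdict(self)


session = Session(profession="filmmaker")
session.log("prompt", "wide shot of a harbor at dawn")
session.log("reject", "lighting too flat")
session.log("revise", "add low sun and long shadows")
session.log("accept", "final take")
session.artifact = "harbor_shot_v2.mp4"

print(len(session.process_record()["steps"]))  # prints 4
```

The artifact record discards the rejected attempt and the revision, which is exactly the information the principle says end-to-end agents need.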

Multimodal AGI as North Star

The guiding principle is one multimodal AGI. Every product decision, every research bet, and every data collection strategy should be evaluated against whether it moves you toward a single tower that jointly models language, audio, video, images, and physical context — not separate towers per modality.

Distribution and Data Before Model

If you don't think about distribution you are dead in ML. If you don't think about data you are dead. Before training the target model, ask: where does the scarce data come from? If there is no YouTube of your modality, the first product must be something people love to use for free that generates that data at scale.

Think in Logarithms

When evaluating scaling bets, think in logarithms. The right question is: if the next model is 10x larger in compute and parameters, would it be a categorically different thing — not just incrementally better? If the answer is not an obvious yes, the constraint is architectural or data quality, not scale.
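
The heuristic can be written down as a tiny check, shown here as a sketch rather than a prescribed tool. The function names are hypothetical, and the "obvious yes" judgment is assumed to come from the team as a boolean; the code does not estimate it.

```python
import math


def orders_of_magnitude(current: float, proposed: float) -> float:
    """How many factors of 10 separate two compute (or parameter) budgets."""
    return math.log10(proposed / current)


def scaling_bet(obvious_categorical_change: bool) -> str:
    """Apply the rule above: scale only when the answer is an obvious yes."""
    if obvious_categorical_change:
        return "scale"
    return "fix architecture or data quality first"


step = orders_of_magnitude(current=1.0, proposed=10.0)
print(f"{step:.1f} order(s) of magnitude:", scaling_bet(False))
```

Expressing budgets as orders of magnitude keeps the comparison on the logarithmic axis the principle asks for: a 10x jump is one step, whatever the absolute numbers are.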

// How do you apply the Foundation Lab Method step by step?

  1. Define the North Star as Multimodal AGI and Joint End-to-End Optimization

    Before any product or research decision, state the prime directive explicitly: (1) multimodal AGI as the destination and (2) joint end-to-end optimization as the method. Every subsequent decision is evaluated against these two poles. If a product decision does not feed back into the model, or a model improvement does not make the product better, it is misaligned.

  2. Identify the scarce data problem for your modality

    Ask: is there a YouTube of this modality? A Wikipedia of it? If not, the first product must generate that data. Do not wait to know the exact scale needed: scaling laws for new modalities are unknown early. Release something people love to use for free that produces data at scale. Expect not to know whether you need 1 million or 1 trillion examples.

  3. Map current base model capability honestly against the end-to-end product promise

    List what the model can and cannot do today. Flag every gap where the product currently requires an engineering harness or workaround. Each gap is not an engineering project — it is a data collection and training job. Categorize each gap: does it require a fine-tuning run, a new training run, or a full pre-training investment in compute?

  4. Build the thinnest possible product stack on top of current model capability

    Resist the urge to build complex orchestration systems to paper over model gaps. Build the thinnest product that delivers real value to real professions today. Fatness in the product stack is technical debt that the next model iteration must pay down. The product's job is also to generate the training signal for the next model.

  5. Target professions, not verticals

    Reframe every market conversation from 'which vertical' to 'which profession.' Professions have specific workflows, specific failure modes, and specific magic moments. Ask: what does end-to-end look like for this profession? A filmmaker's end-to-end is concept → shoot → edit → set changes → final output — not 'make a clip.' A marketer's end-to-end is understanding the environment → resonant message → localized assets at scale.

  6. Deploy Forward Deployed Creatives (FDCs) to enterprise customers

    FDCs (Forward Deployed Creatives) are not sales engineers — they serve two jobs simultaneously: (1) help customers actually deploy powerful systems into their complex organizational workflows, and (2) pipe the resulting intelligence — what works, what breaks, what data is needed — directly back to research and model training. Treat every enterprise deployment as an optimization loop, not a support ticket.

  7. Capture process data, not just artifact data

    The internet supplies artifacts (movies, images, code). It does not supply how those artifacts were made — the actions, iterations, and decisions. End-to-end agents require process data. Every interaction in your deployed product is a training signal. Build the product so that the path to the artifact, not just the artifact, is logged and usable for training.

  8. Apply the 10x logarithmic scaling test at each model iteration

    Before each major training run, ask: if this model were 10x larger in compute and parameters, would it be categorically different, or just incrementally better? If the answer is not an obvious yes, the bottleneck is not scale. Diagnose whether the constraint is: (a) insufficient modality coverage (e.g., missing audio, missing language tower), (b) data quality/process data gaps, or (c) architectural limitations. Fix the real constraint before scaling.

  9. Unify modalities into a single tower progressively

    The shape of a world model is a single tower that jointly models language, audio, video, images, and physical context as one single signal stream. Do not build separate towers per modality — fuse them. Prioritize: language + video + audio covers approximately 90% of the path to a world model. Start with the highest-leverage fusion (language + image or language + video) and expand. Measure whether each fusion enables things that were categorically impossible before.

  10. Evaluate consumer vs. enterprise deployment using the intelligence threshold test

    Consumers consume; creators create. A generative product aimed at consumers is premature until the models are intelligent enough to understand context, humor, and the local state of the user. Apply the test: does the model understand why this content would be interesting to this specific person in this specific context? If no, enterprise deployment is the correct focus — businesses are responsible for 99% of pixels on screens every day and have clear end-to-end problems the model can solve now.
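
The gap mapping in step 3 can be sketched as a small triage structure that turns each workaround in the product into an entry on the training backlog. Everything named here is an assumption for illustration: the `Remedy` and `CapabilityGap` types, the `training_backlog` helper, and the two example gaps are hypothetical, while the three remedy categories are the ones listed in step 3.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Remedy(Enum):
    # The three categories named in step 3.
    FINE_TUNING = "fine-tuning run"
    NEW_TRAINING_RUN = "new training run"
    PRE_TRAINING = "full pre-training investment"


@dataclass
class CapabilityGap:
    description: str         # what the model cannot do today
    current_workaround: str  # the harness the product uses instead
    remedy: Remedy           # which kind of training job should close the gap


def training_backlog(gaps: List[CapabilityGap]) -> Dict[Remedy, List[str]]:
    """Group gaps by the training job expected to close them."""
    backlog = defaultdict(list)
    for gap in gaps:
        backlog[gap.remedy].append(gap.description)
    return dict(backlog)


# Hypothetical example gaps, not taken from any real product.
gaps = [
    CapabilityGap("keep a character consistent across shots",
                  "manual reference-image pipeline", Remedy.NEW_TRAINING_RUN),
    CapabilityGap("follow a brand style guide",
                  "prompt templates per customer", Remedy.FINE_TUNING),
]

for remedy, items in training_backlog(gaps).items():
    print(remedy.value, "->", items)
```

Recording the current workaround next to each gap makes the fatness of the product stack visible, so the next model iteration has a concrete list of harness code to retire.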

// What does the Foundation Lab Method look like in practice?

A startup building AI tools for architecture firms has good 3D rendering models but struggles to grow beyond individual tool usage into full workflow adoption.

Apply the 'Promise of AI Is Not Spot Work' principle: the product currently solves a spot work problem (make a render faster) rather than the end-to-end problem (go from brief → concept → full permit-ready design package). Reframe the product around the architect profession's end-to-end workflow. Identify the process data gap: the internet has finished buildings but not the decision path from brief to building. Deploy FDCs to architecture firms to capture that process data and pipe it back to model training. Build the thinnest stack that covers the full workflow loop, and let the next model iteration reduce the harness.

An AI company has a strong language model and is debating whether to build a separate vision model or attempt a unified model.

Apply the Single Tower / Unified Model principle. A separate vision model creates two towers that do not jointly optimize. The unified model approach — one backbone fusing language and image tokens — enables things categorically impossible with separate towers (e.g., understanding who a character is across a long production, reasoning about visual states in code). Apply the 10x logarithmic test: would scaling the language-only model 10x produce categorically better visual reasoning? Almost certainly not — the constraint is architectural, not scale. Invest in the unified model architecture even though it is 'ridiculously hard to train' because it is the only path to end-to-end optimization.

A team is considering launching a consumer social network built around AI-generated video content.

Apply the intelligence threshold test before launch. Ask: do the models understand context, humor, and the local state of each user well enough that the generated content would be interesting to a specific person? If no, the product will have a strong day-one spike (novelty of generation) followed by rapid retention collapse. Users scroll for a few days and ask 'now what?', because a video is not interesting merely for being generated; it is interesting because of what is happening in it. Defer consumer launch until the unified model has sufficient intelligence. In the interim, focus on enterprise and professional creator deployments, where end-to-end workflow value does not depend on contextual entertainment intelligence.
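
The launch decision in this last scenario reduces to a three-part check, sketched below. The names `ThresholdCheck` and `deployment_focus` are hypothetical, and each boolean is assumed to be the team's own assessment of the model; the code only encodes the decision rule, not how to measure understanding.

```python
from dataclasses import dataclass


@dataclass
class ThresholdCheck:
    understands_context: bool
    understands_humor: bool
    understands_user_state: bool  # the "local state" of the specific user

    def passed(self) -> bool:
        """All three must hold for the model to clear the threshold."""
        return all((
            self.understands_context,
            self.understands_humor,
            self.understands_user_state,
        ))


def deployment_focus(check: ThresholdCheck) -> str:
    """Below the threshold, enterprise deployment is the recommended focus."""
    return "consumer" if check.passed() else "enterprise"


today = ThresholdCheck(
    understands_context=True,
    understands_humor=False,
    understands_user_state=False,
)
print(deployment_focus(today))  # prints enterprise
```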

// What mistakes should I avoid when using the Foundation Lab Method?

  • Treating product and research as separate teams with separate roadmaps — in a foundation lab they are one unified system and must be jointly optimized.
  • Building complex engineering harnesses to paper over model capability gaps — this is a 6-to-8-month dead end; the correct response is to treat the gap as a 2-to-3-week data collection job for the next training run.
  • Thinking in verticals instead of professions — verticals are abstractions that obscure the actual end-to-end workflow a human needs solved.
  • Chasing consumer deployment before the models are intelligent enough to understand context and local user state: this produces novelty spikes followed by retention collapse, because content is not interesting merely for being generated.
  • Collecting only artifact data (finished outputs) and not process data (how the artifact was made) — agents that do end-to-end work require process data that the internet cannot supply.
  • Assuming scale alone (10x parameters, 10x compute) will fix a categorical capability gap — if the answer to 'would 10x scale make this categorically different' is not an obvious yes, the constraint is architectural or data quality, not scale.
  • Building separate modality towers instead of a unified single tower — separate towers cannot jointly optimize and prevent the model from developing true physical world understanding.
  • Solving spot work problems and calling it AI transformation — the promise of AI is the full end-to-end solution (the book, the campaign, the film), not a faster fragment of the workflow.

// What are the key terms and concepts in the Foundation Lab Method?

Foundation Lab
A company architecture in which product and research are not separate functions but one unified system. Research produces the product; the product feeds research by generating training signal. Foundation labs are described as 'the blueprint of companies of the future' because their economics are driven by compute and research, not by individual software products, enabling new products to be launched at approximately 1% of the balance sheet.
End-to-End Optimization
The prime directive and guiding methodology: joint top-to-bottom optimization across the full stack, from base model training through product deployment and back. The method treats it as the only way AI systems will do meaningful things in the world, and as the opposite of optimizing a narrow sub-problem in isolation.
World Model
A model that has understanding of the physical world and is able to simulate it. Not defined by real-time speed or autoregressive architecture. Defined by understanding laws of physics, causality, time, and human language — all as one single signal stream. The shape of a world model is a single tower jointly modeling language, audio, video, images, and physical context.
Unified Model
A single-backbone model with a language tower and one or more modality towers (image, video, audio) fused into one single thing, jointly trained on both language tokens and continuous signal tokens. The unified model is the architectural path to a world model. Unified models enable things categorically impossible with separate modality towers.
Single Tower
The architectural ideal for a world model: one model that processes language, audio, video, images, and physical context as one single signal stream without separate towers per modality. Analogous to the human brain operating across all modalities without separate systems.
FDC (Forward Deployed Creative)
A Luma-invented role analogous to Palantir's forward deployed engineers, but for creative and visual domains. FDCs serve two simultaneous functions: (1) help enterprise customers deploy powerful AI systems into their complex organizational workflows, and (2) pipe intelligence from real customer usage directly back to model research and training pipelines.
Process Data
Training data that captures how an artifact was made — the actions, iterations, and decisions in the path to a final output — as opposed to artifact data (the finished output itself). Process data is what the internet cannot supply and what end-to-end agents require to learn to do end-to-end work.
Promise of AI Is Not Spot Work
A core principle stating that AI's value is not in doing fragments of workflows faster ('I can make copy,' 'I can make a little image') but in delivering full end-to-end solutions — the book, the campaign, the film production. Spot work solutions will be commoditized; end-to-end solutions are the durable value.
Think in Logarithms
A scaling evaluation heuristic: when assessing the next major model investment, ask whether a 10x increase in compute or parameters would produce a categorically different model, not just an incremental improvement. If the answer is not an obvious yes, the constraint is not scale but architecture, data quality, or missing modality coverage.
Thin Stack
The product architecture principle of building the thinnest possible product layer on top of base model capability. Fatness in the stack represents problems the model cannot yet solve natively. The next model's job is to reduce that fatness. Thick stacks built around model gaps become irrelevant with each new training run.
Intelligence Threshold Test
The test for whether a consumer generative product is viable: does the model understand context, humor, and the local state of the specific user well enough that the output would be genuinely interesting to that person? Below this threshold, consumer generative networks produce novelty spikes followed by retention collapse.

// FREQUENTLY ASKED QUESTIONS

What is the Luma Foundation Lab Method?

The Luma Foundation Lab Method is a framework for building AI companies where product and research are one unified system rather than separate functions. Research produces the product, and the product generates training data for research. It emphasizes end-to-end optimization, targeting professions instead of verticals, building thin product stacks on base model capability, capturing process data through deployed agents, and evaluating every decision against the north star of multimodal AGI.

What is a foundation lab in AI?

A foundation lab is a company architecture where product and research are not separate departments but one unified system. Research directly produces the product, and the product functions as a research instrument by generating training signals. Economics are driven by compute and research rather than individual software products, enabling new products to launch at roughly 1% of the balance sheet. It is described as the blueprint for companies of the future.

How do I apply the Foundation Lab Method to my AI startup?

Start by defining multimodal AGI and joint end-to-end optimization as your explicit north star. Then identify your scarce data problem—if no internet-scale dataset exists for your modality, your first product must generate that data. Map your model's current capabilities against the full end-to-end promise, build the thinnest possible product stack, target specific professions, deploy forward deployed creatives to enterprise customers, and capture process data that feeds back into model training.

How do you decide between building a vertical AI product vs. a generalist system?

Reframe the question entirely: think in professions, not verticals. Verticals are abstractions that obscure real human workflows. Instead, ask which professions you can serve end-to-end. A generalist system built around professions solves the full workflow—brief to final output—rather than spot work fragments. The Foundation Lab Method argues that end-to-end solutions for specific professions are durable value, while narrow vertical tools solving fragments will be commoditized.

How does the Luma Foundation Lab Method compare to traditional AI product development?

Traditional AI product development separates research and product teams with distinct roadmaps. The Foundation Lab Method rejects this entirely—product and research are one system. Traditional approaches often build thick engineering harnesses to patch model gaps; this method treats every gap as a data collection job for the next training run. Traditional approaches optimize sub-problems in isolation; this method demands joint end-to-end optimization top-to-bottom. The result is compounding returns rather than competing priorities.

When should I use the Foundation Lab Method?

Use it when designing or evaluating an AI company's research-product strategy, deciding whether to build a narrow vertical product vs. a generalist system, or planning how to couple model training with real customer usage data. It applies at any stage—pre-product, early product, scaling, or enterprise deployment—but is most critical when making foundational architecture decisions about how product and research relate to each other.

What is process data and why does it matter for AI training?

Process data captures how an artifact was made—the actions, iterations, and decisions in the path to a final output—as opposed to artifact data, which is just the finished result. The internet supplies artifacts (movies, images, code) but not the process behind them. End-to-end AI agents require process data to learn complete workflows. You capture it by deploying agents to real customers and logging every interaction path, not just final outputs.

What results can I expect from applying the Foundation Lab Method?

You can expect product and research to compound each other instead of competing for resources. Each product deployment generates training data that makes the next model better, which in turn makes the product thinner and more capable. You avoid six-to-eight-month dead ends from engineering harnesses. You build durable competitive advantages through proprietary process data flywheels. And you create an organization capable of launching new products at approximately 1% of the balance sheet.

What is the intelligence threshold test for consumer AI products?

The intelligence threshold test asks whether the model understands context, humor, and the local state of the specific user well enough that generated content would be genuinely interesting to that person. Below this threshold, consumer generative products produce novelty spikes followed by retention collapse—users engage for a few days then churn because the content isn't interesting for what's happening in it, only for the novelty of being generated. Enterprise deployment is the correct focus until models pass this threshold.

What does 'think in logarithms' mean for AI scaling decisions?

Think in logarithms is a scaling evaluation heuristic: before each major training run, ask whether a 10x increase in compute or parameters would produce a categorically different model—not just an incremental improvement. If the answer isn't an obvious yes, the bottleneck isn't scale. It's likely architectural limitations, missing modality coverage, or data quality gaps. Fix the real constraint before investing in scale. This prevents wasting massive compute budgets on diminishing returns.
