Frequently Asked Questions About Ng Deep Learning Project Execution Skill

21 answers covering everything from basics to advanced usage.

// Basics

What is the difference between deep learning and machine learning in Ng's framework?

Deep learning is a subset of machine learning that uses multi-layer neural networks, typically trained on large amounts of data. In Ng's framework, the critical practical distinction is how performance scales: traditional ML algorithms tend to plateau as you add more data, while deep learning performance tends to keep improving with more data and larger networks. This means project design should exploit scale, but only after diagnostics confirm that scale is the actual bottleneck in your specific application.

What does Andrew Ng mean by 'data is weird and wonderful'?

Ng means that you never fully know what is in your training data until you actively explore it. The output of any ML system depends on both code and data—you control code 100% but data always contains surprises: unusual accents in speech data, background speakers, class imbalances, labeling errors, or distribution shifts. Treating data exploration as a required step rather than an afterthought prevents derailing surprises later in the project.
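
One cheap first exploration step is simply counting labels to surface class imbalance. The sketch below is illustrative only: the label names and counts are made up, and `summarize_labels` is a hypothetical helper, not part of any library.

```python
from collections import Counter

def summarize_labels(labels):
    """Return per-class counts and the largest-to-smallest class ratio."""
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return counts, ratio

# Hypothetical labels, for illustration only.
labels = ["speech"] * 90 + ["music"] * 9 + ["silence"] * 1
counts, imbalance = summarize_labels(labels)

for name, n in counts.most_common():
    print(f"{name}: {n} ({n / len(labels):.0%})")
print(f"largest/smallest class ratio: {imbalance:.0f}x")
```

A large ratio does not say what to do about the imbalance, only that it exists and deserves a closer look before training.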

What are scaling laws in deep learning and why do they matter for my project?

Scaling laws are the empirically observed, predictable relationship between compute/data investment and model performance. They show that deep learning performance improves reliably as you scale up data and model size. This predictability drove massive infrastructure investment in AI. For your project, scaling laws mean that if diagnostics reveal data or model size as the bottleneck, you can forecast the gains from scaling—but you must verify this applies to your specific application rather than assuming it universally.
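
As a rough sketch of what "forecast the gains" can look like, the code below fits a power law, error ≈ a · n^(-b), to validation errors measured at a few training-set sizes and extrapolates one step further. The measurements are invented, and the pure power-law form (with no irreducible error floor) is an assumption that may not hold for your application.

```python
import math

def fit_power_law(sizes, errors):
    """Least-squares fit of log(error) = log(a) - b * log(n). Returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

def predict_error(a, b, n):
    """Extrapolate the fitted curve to a training-set size n."""
    return a * n ** (-b)

# Hypothetical validation errors at increasing training-set sizes.
sizes = [1_000, 2_000, 4_000, 8_000]
errors = [0.40, 0.32, 0.256, 0.2048]

a, b = fit_power_law(sizes, errors)
forecast = predict_error(a, b, 16_000)
print(f"fitted exponent b = {b:.3f}")
print(f"forecast error at 16,000 examples: {forecast:.3f}")
```

If real measurements bend away from a straight line on a log-log plot, treat the forecast as unreliable and go back to diagnostics.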

How do I know if my problem is greenfield?

A greenfield application is one that no one in the world has worked on before, with no existing parallel projects or research literature to benchmark against. If you cannot find comparable published results, Kaggle competitions, or industry case studies for your specific data type and task, treat it as greenfield. The practical implication is that you cannot estimate data requirements or expected performance upfront—you must collect a small dataset, build a quick baseline, and use that model's performance as your diagnostic instrument.

What is the biggest mistake teams make in deep learning projects?

The biggest mistake is choosing interventions without running diagnostics first. Less experienced teams pick things to work on almost at random (collecting more images one week, buying GPUs the next), driven by hype or gut feeling rather than systematic error analysis. Disciplined teams run diagnostics first to identify the highest-leverage intervention. This single practice is the most important differentiator between teams that finish in days and teams that stay stuck for months, and it is the core of Ng's methodology.

// How To

How do I build a quick and dirty prototype for a deep learning project?

Build the simplest possible version in a contained sandbox environment with no sensitive data and no external exposure. Use AI-assisted coding to accelerate development. Explicitly lower security and scalability requirements—the goal is a feedback instrument, not a shippable product. You want to discover what is in the data and whether your approach is viable. This prototype should take days, not weeks. Run many variants cheaply rather than investing deeply in one approach.

How do I tune hyperparameters effectively in a deep learning project?

Hyperparameters are settings like learning rate, network size, and batch size that control how a neural network trains. Change one variable at a time with a clear hypothesis about its effect. Track all experiments systematically. Learning rate and network size are the most impactful hyperparameters to tune first. This step is not glamorous but it is decisive—your practical skill at hyperparameter tuning directly determines how quickly you get a model to train well and separates fast teams from slow ones.
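
A minimal sketch of a one-variable-at-a-time sweep with a simple experiment log follows. `train_and_evaluate` is a toy placeholder so the example runs; in a real project it would launch a training run and return a validation loss. The baseline values and the sweep range are assumptions chosen for illustration.

```python
import math

def train_and_evaluate(config):
    """Placeholder for a real training run; returns a validation loss.

    Toy stand-in: the loss is lowest when learning_rate is 1e-3.
    """
    return abs(math.log10(config["learning_rate"]) + 3) + 0.1

def sweep_one(baseline, name, values):
    """Vary a single hyperparameter, holding the rest at baseline values."""
    rows = []
    for value in values:
        config = {**baseline, name: value}
        loss = train_and_evaluate(config)
        rows.append({**config, "changed": name, "val_loss": loss})
    return rows

# Hypothetical baseline configuration.
baseline = {"learning_rate": 1e-2, "hidden_units": 128, "batch_size": 32}

# Hypothesis under test: a smaller learning rate lowers validation loss.
log = sweep_one(baseline, "learning_rate", [1e-1, 1e-2, 1e-3, 1e-4])
best = min(log, key=lambda row: row["val_loss"])

for row in log:
    print(row)
print("best learning rate:", best["learning_rate"])
```

Keeping every row, including the failures, is what makes the log useful when someone later asks why a setting was chosen.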

How do I transition a deep learning prototype to production?

Once a prototype proves the approach is viable, shift to production-grade implementation by reintroducing full security, scalability, and reliability requirements. This is an explicit phase transition—do not apply production standards during prototyping and do not skip them during deployment. Use AI-assisted coding but review generated code rigorously, especially for database operations where agentic tools can cause irreversible data loss. Document the architectural decisions from prototyping that inform the production design.

How do I fine-tune a smaller model to reduce LLM API costs?

Engineer a labeled dataset from your existing production traffic—the LLM's outputs on real user queries become training labels. Select a smaller open-source pre-trained model (e.g., a distilled transformer) and continue training it on your task-specific data. Deploy this fine-tuned model as a replacement for expensive API calls. This is the critical skill that bends the cost curve at scale. Validate that the fine-tuned model maintains acceptable quality by testing against a held-out set of production examples before switching over.
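
The dataset-engineering and validation steps might be sketched as below. The log format (`query` and `llm_response` keys), the deduplication rule, the 20% held-out fraction, and exact-match agreement as the quality measure are all assumptions for illustration; real traffic usually needs more careful filtering and a task-appropriate metric.

```python
import random

def build_distillation_sets(traffic_log, holdout_fraction=0.2, seed=0):
    """Turn logged (query, LLM response) pairs into train and held-out sets."""
    seen = set()
    examples = []
    for record in traffic_log:
        query = record["query"].strip()
        response = record["llm_response"].strip()
        if not query or not response or query in seen:
            continue  # skip empty rows and repeated queries
        seen.add(query)
        examples.append({"input": query, "label": response})
    random.Random(seed).shuffle(examples)
    n_holdout = max(1, int(len(examples) * holdout_fraction))
    return examples[n_holdout:], examples[:n_holdout]

def agreement_rate(predict, holdout):
    """Share of held-out examples where predict(input) equals the LLM label."""
    matches = sum(predict(ex["input"]) == ex["label"] for ex in holdout)
    return matches / len(holdout)

# Hypothetical production log: 10 unique queries, 1 duplicate, 1 empty query.
traffic_log = [
    {"query": f"question {i}", "llm_response": f"answer {i}"} for i in range(10)
]
traffic_log.append({"query": "question 3", "llm_response": "answer 3"})
traffic_log.append({"query": "  ", "llm_response": "answer"})

train_set, holdout_set = build_distillation_sets(traffic_log)
print(len(train_set), "training examples,", len(holdout_set), "held out")
```

Here `predict` stands for whatever callable wraps the fine-tuned model; the held-out set should never be used for training.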

// Troubleshooting

My deep learning model's accuracy is stuck and nothing I try works. What should I do?

Stop trying random interventions and run diagnostics. Examine the specific examples the model gets wrong and categorize the failure modes. Determine whether the issue is data quality, data quantity for specific categories, model capacity, hyperparameter settings, or a fundamental task definition mismatch. For complex systems, isolate which component is responsible for most errors. The answer to 'what should I try next' always comes from the diagnostic, never from hype or gut feeling. This single discipline separates teams that finish quickly from those stuck for months.
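
Categorizing failure modes can be as simple as tagging each misclassified example by hand and tallying the tags. The sketch below assumes that tagging has already been done; the tag names and counts are hypothetical.

```python
from collections import Counter

def rank_failure_modes(tags):
    """Return (tag, count, share_of_errors) from most to least common."""
    counts = Counter(tags)
    total = sum(counts.values())
    return [(tag, n, n / total) for tag, n in counts.most_common()]

# Hypothetical tags from hand-reviewing 20 misclassified examples.
error_tags = (
    ["label_error"] * 9
    + ["blurry_image"] * 6
    + ["rare_class"] * 4
    + ["other"] * 1
)

ranking = rank_failure_modes(error_tags)
for tag, count, share in ranking:
    print(f"{tag:<14} {count:>3}  {share:.0%}")
```

The top row of the tally is a candidate for the next intervention; the bottom rows are things not to spend time on yet.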

I've been tuning prompts for weeks and the LLM still can't do what I need. What now?

If, after roughly a month of serious prompt tuning, you cannot close the performance gap, the problem likely requires dropping from the GenAI layer to the deep learning layer. Document your prompt tuning attempts and their results so the team does not cycle back to them. Collect a labeled dataset for your specific task and fine-tune a model directly using deep learning techniques. This is especially true for tasks involving audio, image, video, or structured data, where LLM prompting is often the wrong abstraction layer.

Our AI product's API costs are growing unsustainably. How do we fix this?

You have hit the inflection point where bending the cost curve requires moving from the GenAI layer to the deep learning layer. Engineer a labeled training dataset from your existing production traffic—use the LLM's responses as training labels. Fine-tune a smaller open-source model on that data and deploy it as a replacement for expensive API calls. This can reduce costs by 10x or more while maintaining acceptable quality. This is the critical deep learning skill that makes AI products economically viable at scale.
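
To check whether you are actually past that inflection point, a back-of-the-envelope comparison helps. Every number below is hypothetical (per-request API price, per-request cost of a self-hosted model, and a fixed monthly hosting cost); substitute your own figures.

```python
def monthly_cost(requests, cost_per_request, fixed_cost=0.0):
    """Total monthly cost for a given request volume."""
    return fixed_cost + requests * cost_per_request

def break_even_requests(api_cost, own_cost, own_fixed_cost):
    """Monthly volume above which the self-hosted model is cheaper.

    Returns None if the self-hosted model is not cheaper per request.
    """
    saving_per_request = api_cost - own_cost
    if saving_per_request <= 0:
        return None
    return own_fixed_cost / saving_per_request

# Hypothetical prices, in dollars.
API_COST = 0.01        # per request to the hosted LLM API
OWN_COST = 0.001       # per request on the fine-tuned model
OWN_FIXED = 2_000.0    # monthly hosting for the fine-tuned model

volume = 1_000_000
api_total = monthly_cost(volume, API_COST)
own_total = monthly_cost(volume, OWN_COST, OWN_FIXED)
threshold = break_even_requests(API_COST, OWN_COST, OWN_FIXED)

print(f"API: ${api_total:,.0f}/month, fine-tuned: ${own_total:,.0f}/month")
print(f"break-even at about {threshold:,.0f} requests/month")
```

This leaves out the one-time cost of building the dataset and fine-tuning, which should be weighed against the monthly saving.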

My team keeps arguing about whether to collect more data or buy more compute. How do I resolve this?

Neither decision should be made without diagnostic evidence. Run error analysis on your current model's failures. If the model fails on underrepresented categories, targeted data collection for those categories will help more than general data collection. If the model is training well but underfitting due to capacity limits, a larger model or more compute may help. If the model overfits, more data could help—but so could regularization. The diagnostic tells you which lever to pull. Without it, you are guessing.
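
The decision logic above can be written down as a small rule of thumb based on training error, validation error, and a target error. The 0.02 tolerance and the example error values are arbitrary assumptions; the function is a sketch of the reasoning, not a substitute for looking at the actual failures.

```python
def suggest_lever(train_error, val_error, target_error, tolerance=0.02):
    """Map error measurements to the lever most likely to help."""
    if train_error - target_error > tolerance:
        # The model cannot even fit the training set well enough.
        return "underfitting: try a larger model or more compute"
    if val_error - train_error > tolerance:
        # The model fits training data but does not generalize.
        return "overfitting: try more data or regularization"
    # Aggregate numbers look fine; check per-category failures next.
    return "per-category: run error analysis for targeted data collection"

# Hypothetical measurements from three different models.
print(suggest_lever(train_error=0.15, val_error=0.17, target_error=0.05))
print(suggest_lever(train_error=0.04, val_error=0.20, target_error=0.05))
print(suggest_lever(train_error=0.05, val_error=0.06, target_error=0.05))
```

Writing the rule down gives the team something concrete to argue with, which is usually more productive than arguing about data versus compute in the abstract.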

// Comparisons

How does the Ng deep learning project execution methodology compare to CRISP-DM?

CRISP-DM is a general data mining methodology with six broad phases (business understanding, data understanding, data preparation, modeling, evaluation, deployment). Ng's methodology is specifically designed for deep learning and AI projects and adds critical elements CRISP-DM lacks: scaling law awareness, explicit layer-of-abstraction decisions (GenAI vs. deep learning vs. ML), cost curve analysis, the quick-and-dirty prototyping philosophy, and a diagnostic-first intervention selection process. Ng's approach is more opinionated and actionable for modern AI projects.

How is Ng's approach different from just following a Kaggle competition workflow?

Kaggle workflows optimize a fixed dataset for a fixed metric on a fixed problem. Ng's methodology addresses the full project lifecycle including problem scoping, data strategy, abstraction layer selection, cost management, and production transitions—none of which exist in a Kaggle context. Kaggle also does not address the prototype-to-production gap, cost curve bending, or the strategic decision of when to use LLM APIs versus fine-tuned models. The diagnostic-first principle applies in Kaggle, but Ng's framework is broader and more strategic.

What's the difference between GenAI and deep learning in practice?

GenAI (generative AI) is one application of deep learning, built primarily on transformer neural networks trained on internet-scraped data. It excels at text tasks and increasingly at image and audio generation. Deep learning is the broader field encompassing ConvNets for vision, sequence models for time-series, and many architectures beyond transformers. In practice, treating them as interchangeable is a common pitfall: many use cases in audio, vision, and structured data require deep learning algorithms directly, not LLM prompting.

// Advanced

Can I use this methodology if I'm working with structured data like spreadsheets?

Yes, but the abstraction layer decision is important. Structured data (large tables of numbers) usually requires going directly to deep learning or traditional ML algorithms rather than prompting an LLM. LLMs were built for unstructured data—text, audio, images. For structured data, start with a quick baseline model using standard ML, then assess whether deep learning's scaling advantages apply to your dataset size and problem. The diagnostic workflow and disciplined development process apply identically regardless of data type.

How many experiments should I run during prototyping?

Run at least 20 proof-of-concept variants rather than betting on one approach. Because prototyping cost is now very low—especially with AI-assisted coding—the right strategy is not fewer experiments but faster, cheaper ones. Expect most experiments to fail. The one or two that work will justify all the others. This only works if you maintain sandbox conditions: no sensitive data, no external exposure, and explicitly lowered security and scalability requirements.

What is the 'layers of abstraction' principle in Andrew Ng's framework?

AI capability is layered: CS fundamentals → Machine Learning → Deep Learning → Generative AI. Each layer builds on the one below. When prompting an LLM (the GenAI layer) is not sufficient, drop one layer deeper into deep learning to get the application to work. Knowing which layer your problem actually lives at is a critical strategic decision that gates everything downstream. Many teams waste months operating at the wrong layer—either over-engineering with deep learning when prompts suffice, or under-powering with prompts when deep learning is required.

What does 'move fast and be responsible' mean for AI development?

It is Ng's update to 'move fast and break things.' Speed of iteration in a sandboxed environment is itself a safety mechanism, not a risk. Fast prototyping reveals what is in the data and what users actually want, which is the best way to discover what could go wrong and fix it before production. The key distinction is the sandbox: no sensitive data, no external exposure. Within those guardrails, speed increases both development velocity and responsible outcomes because problems are found earlier.

Is learning to code still important with AI tools that write code for me?

Yes. Andrew Ng considers the advice not to learn to code to be some of the worst career advice ever given. Knowing how computers, deep learning, and GenAI actually work lets you direct AI coding tools with precision. It is like knowing art history when prompting an image generator: you get far greater control than someone who can only ask for generic outputs. Easier coding tools mean more people should code, not fewer. CS and ML fundamentals are the vocabulary you need to be effective with AI.