Module 44

AI Model Limitations

Last updated 2026-06-02

Key points

Lesson 1: What AI model limitations are and why they matter

AI models learn from examples rather than step-by-step instructions, which makes their output nondeterministic (the same input can produce different output). Traditional software follows a fixed recipe; an AI model studies thousands of finished dishes and writes its own. As a result, the same query can produce different results from one run to the next, and every AI component you add is another place for errors to creep in. You need ongoing maintenance and regular evaluations to ensure these systems provide value rather than becoming a headache.
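One simple kind of evaluation is to send the same prompt several times and measure how often the answers agree. The sketch below is a minimal illustration under stated assumptions: `make_flaky_model` is a hypothetical stand-in for a real model call, and the 80% figure inside it is invented for the demo.

```python
import random
from collections import Counter
from typing import Callable


def consistency_rate(model: Callable[[str], str], prompt: str, runs: int = 20) -> float:
    """Return the share of runs that produced the most common answer."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    answers = Counter(model(prompt) for _ in range(runs))
    most_common_count = answers.most_common(1)[0][1]
    return most_common_count / runs


def make_flaky_model(seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a model call: usually right, sometimes not."""
    rng = random.Random(seed)

    def model(prompt: str) -> str:
        # Assumption for the demo: the "model" answers correctly about 80% of the time.
        return "42" if rng.random() < 0.8 else "41"

    return model


if __name__ == "__main__":
    rate = consistency_rate(make_flaky_model(seed=0), "What is 6 * 7?")
    print(f"most common answer appeared in {rate:.0%} of runs")
```

A rate well below 100% on a question with one right answer is a signal that the prompt, the model choice, or the surrounding checks need attention.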

Another key limitation is that AI is still a black box (an opaque system whose internal reasoning is hidden). You can see what the model does but not why it does it, so you must communicate with it extensively and be very clear. Your role shifts from writing code to assuring quality and keeping the system on track. As one expert put it, AI is really a junior developer. With the right spec and framework, you can engineer it into something like a senior, but ask a dumb question and you get a dumb answer.

These limitations matter because AI models are becoming cheaper and more accessible, which means intelligence itself is becoming a commodity. The only things proprietary to your business are your processes, decisions, and historical context. Collating that information and plugging it into the right model with the right framework is what creates value. Model behavior also shifts over time, sometimes for the better and sometimes for the worse, so something that worked perfectly a month ago might need adjustments now. Recognizing these constraints helps you build solid systems and improve them as you learn how they behave in production.

Lesson 2: How to work with AI model limitations: step-by-step

Every AI model has limitations you need to manage. The key is choosing the right model for each specific task. Claude offers different tiers: Opus is the most capable and expensive, while Haiku is cheaper but less powerful. If a task has three steps and only one is difficult enough to need Opus, use Opus for that step and Haiku for the simpler ones. This avoids wasting money.
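The three-step case above can be sketched as a small routing table. This is an illustration, not real pricing: the `RELATIVE_COST` numbers are invented, and the `Step` type and step names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Made-up relative costs per step, for illustration only.
RELATIVE_COST: Dict[str, int] = {"haiku": 1, "opus": 15}


@dataclass(frozen=True)
class Step:
    name: str
    difficult: bool


def pick_tier(step: Step) -> str:
    """Send only the difficult steps to the more capable tier."""
    return "opus" if step.difficult else "haiku"


def route(steps: List[Step]) -> Dict[str, str]:
    """Map each step name to the tier chosen for it."""
    return {step.name: pick_tier(step) for step in steps}


def total_cost(routing: Dict[str, str]) -> int:
    """Sum the relative cost of every routed step."""
    return sum(RELATIVE_COST[tier] for tier in routing.values())


if __name__ == "__main__":
    steps = [
        Step("extract fields", difficult=False),
        Step("resolve conflicting clauses", difficult=True),
        Step("format summary", difficult=False),
    ]
    routing = route(steps)
    print(routing)
    print("mixed:", total_cost(routing))
    print("all opus:", RELATIVE_COST["opus"] * len(steps))
```

In practice the `difficult` flag would come from your own judgment or from evaluation results, since the model cannot reliably tell you which steps it will struggle with.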

The effort parameter (a setting that controls how hard the model thinks) helps you manage compute costs. Set it to low for high-volume routine tasks, medium for everyday work, high for complex problems, and max for peak intelligence. Opus 4.6 added adaptive thinking (automatic effort adjustment), so the model decides when extended reasoning helps, which keeps cost and latency down.
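The mapping from task type to effort level can be written down as a small helper so the choice is made in one place. This is a sketch: `choose_effort` and its flags are hypothetical names, and how the resulting value is passed to a model depends on the client you use, which is left out here.

```python
from enum import Enum


class Effort(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    MAX = "max"


def choose_effort(
    *,
    high_volume: bool = False,
    complex_problem: bool = False,
    needs_peak: bool = False,
) -> Effort:
    """Pick an effort level; the most demanding requirement wins."""
    if needs_peak:
        return Effort.MAX
    if complex_problem:
        return Effort.HIGH
    if high_volume:
        return Effort.LOW
    return Effort.MEDIUM  # everyday work


if __name__ == "__main__":
    print(choose_effort(high_volume=True).value)
    print(choose_effort().value)
    print(choose_effort(complex_problem=True).value)
    print(choose_effort(needs_peak=True).value)
```

The ordering is a design choice: a task that is both high-volume and complex gets high effort here, on the assumption that a wrong answer costs more than the extra compute.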

For coding, use the PIV loop: Plan what you want, let the AI Implement it, then Validate the results. Each cycle improves your code. Separate decisions from execution by writing a recipe (a YAML file with deterministic steps) and letting Claude or Codex be the chef that follows it. You can also run multiple agents in parallel: four agents working simultaneously can scrape comments, create diagrams, and analyze data after you spend 30 seconds setting them up.
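The PIV loop can be sketched as an ordinary loop in which validation feedback flows into the next implementation attempt. Everything here is an assumption made for illustration: `fake_implement` stands in for a real AI call (its first draft is deliberately buggy), and `validate_add` is a hypothetical check for one tiny function.

```python
from typing import Callable, List, Optional


def piv_loop(
    plan: List[str],
    implement: Callable[[List[str], List[str]], str],
    validate: Callable[[str], List[str]],
    max_cycles: int = 3,
) -> Optional[str]:
    """Run implement/validate cycles until validation reports no problems."""
    feedback: List[str] = []
    for _ in range(max_cycles):
        candidate = implement(plan, feedback)
        feedback = validate(candidate)
        if not feedback:
            return candidate
    return None  # still failing after max_cycles; hand back to a human


def fake_implement(plan: List[str], feedback: List[str]) -> str:
    # Hypothetical stand-in for an AI call: the first draft has a bug,
    # and any draft written after feedback fixes it.
    operator = "+" if feedback else "-"
    return f"def add(a, b):\n    return a {operator} b\n"


def validate_add(source: str) -> List[str]:
    """Execute the candidate and return a list of problems (empty means pass)."""
    namespace: dict = {}
    exec(source, namespace)
    problems = []
    if namespace["add"](2, 3) != 5:
        problems.append("add(2, 3) should return 5")
    return problems


if __name__ == "__main__":
    plan = ["write add(a, b) that returns the sum"]
    print(piv_loop(plan, fake_implement, validate_add))
```

The plan and the validation check are written by you and stay fixed; only the implementation step is delegated, which is the same separation of decisions from execution that the recipe idea describes.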

Remember that your output is only as good as your setup. If the data is small enough, put it directly in the system prompt instead of using a separate retrieval system. Claude Code can now sit behind production infrastructure, not just prototypes, but watch your session limits when running many automations.
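The "small enough" decision can be made explicit with a size check. The sketch below rests on two assumptions that are not from the lesson: a rough rule of thumb of about four characters per token, and an arbitrary `inline_budget_tokens` default of 2000. Both should be replaced with figures that fit your model and data.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    # Rough rule of thumb (about four characters per token); an assumption,
    # not an exact tokenizer.
    return max(1, len(text) // 4)


def build_system_prompt(
    instructions: str,
    documents: List[str],
    inline_budget_tokens: int = 2000,  # hypothetical budget
) -> Tuple[str, bool]:
    """Return (system_prompt, inlined); inline the data only if it fits the budget."""
    total = sum(estimate_tokens(doc) for doc in documents)
    if total <= inline_budget_tokens:
        body = "\n---\n".join(documents)
        return f"{instructions}\n\nReference data:\n{body}", True
    # Too large to inline: the caller should fall back to a retrieval system.
    return instructions, False


if __name__ == "__main__":
    prompt, inlined = build_system_prompt(
        "Answer using the reference data.",
        ["Refunds are accepted within 30 days.", "Shipping takes 3 to 5 days."],
    )
    print(inlined)
    print(prompt)
```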

Lesson 3: Best practices and pitfalls

AI models are powerful but have clear limitations. They are still a "black box" (an opaque system whose internal reasoning you can't see). Even advanced models like Opus 4.5 or Gemma 4 can fabricate information: turns where the model did no reasoning produced false results, while turns with deep reasoning were correct. This means a model can look smart while being wrong.

A common mistake is assuming the model understands your intent without clear direction. Your output is only as good as your input. A model will optimize perfectly for the wrong thing if you give it unclear instructions. One expert found bugs in a 20-year-old AI system by running automated experiments, bugs he'd walked past a thousand times. This shows you must be the quality assurance person, not just the person giving orders.

Best practices start with being very clear and specific with your prompts. Use "intent engineering" (designing what you want the model to achieve) rather than just piling in context. Set proper token budgets for reasoning tasks so the model doesn't skimp on thinking. With Claude Code, build "skills" (reusable capabilities for agents) and use routines as schedulers to run multiple agents in parallel. For coding, rescue commands and adversarial commands (pitting models against each other) give fresh perspectives.
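One way to read the idea of pitting models against each other is to have a second model review the first model's answer and to flag disagreements for a human. The sketch below takes that reading as an assumption; `cross_check` is a hypothetical helper, and `author` and `reviewer` are placeholders for whatever model calls you actually use.

```python
from typing import Callable, Dict, Union

ModelFn = Callable[[str], str]


def cross_check(question: str, author: ModelFn, reviewer: ModelFn) -> Dict[str, Union[str, bool]]:
    """Ask one model for an answer and a second model to accept or reject it."""
    answer = author(question)
    review_prompt = (
        f"Question: {question}\n"
        f"Proposed answer: {answer}\n"
        "Reply with exactly AGREE or DISAGREE."
    )
    verdict = reviewer(review_prompt).strip().upper()
    # Anything other than a clear AGREE is escalated to a human.
    return {"answer": answer, "needs_human_review": verdict != "AGREE"}


if __name__ == "__main__":
    # Hypothetical stand-ins for two different models.
    author = lambda prompt: "Paris"
    reviewer = lambda prompt: "agree"
    print(cross_check("What is the capital of France?", author, reviewer))
```

Treating an unclear verdict as a disagreement keeps the check conservative, which fits the point that you remain the quality assurance person.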

The space moves fast, and chasing every new model leads to burnout. Focus on one framework and connect theory to real projects. Test everything. The most dangerous failure looks like success until it's too late.