Module 75

Claude Running Ollama Models

Last updated 2026-06-02

Key points

Lesson 1: What is Claude Running Ollama Models and why it matters

Claude running Ollama models means you can use local, open-source AI models inside Anthropic's Claude desktop app. Instead of using only Anthropic's paid models, you pick a model such as Kimi K2 or Qwen from a drop-down that lists the Ollama lineup (Ollama is a tool for running AI models on your own computer). Claude then handles real work autonomously, such as organizing files or talking to your local apps, using the model you selected.

This matters for AI development because it changes how developers build and test. You can use cheaper or specialized models for different tasks without switching tools. For example, Claude Code (Anthropic's coding agent) also works with these local models, which can cut costs sharply. One command, `ollama launch claude`, pulls the right model, sets up environment variables, and starts the agent in seconds.

There are two current limits: web search and extensions are not supported yet. Open-source models may also misbehave because they were not trained on Claude Code's tools, have small context windows (the amount of text they can remember at once), or do not follow the exact protocol Claude expects. Still, this integration signals a shift toward AI tools that work together rather than compete, plugging into one another instead of walling themselves off. Full setup instructions are at docs.ollama.com/integration/claudedesktop.

Lesson 2: How to use Claude Running Ollama Models: step-by-step

To use Claude with Ollama models (local AI models you run on your own computer), start by downloading Ollama from ollama.com for your operating system. After installation, open your terminal (command-line interface) and run `ollama launch claude`. This single command pulls the right model, wires up environment variables (settings the system needs to connect tools), and starts the Claude agent for you, typically in about 10 seconds.
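
The install-then-launch steps above can be sketched as a small shell function. This is a minimal sketch, not an official script: `launch_claude` is a hypothetical helper name, and the lowercase spelling `claude` is an assumption about how the integration name is written on the command line.

```shell
#!/bin/sh
# Hypothetical helper: confirm Ollama is installed, then start the agent.
launch_claude() {
    # `command -v` succeeds only if an `ollama` executable is on PATH.
    if ! command -v ollama >/dev/null 2>&1; then
        echo "ollama not found; install it from ollama.com first" >&2
        return 1
    fi
    # The single launch command described in this lesson
    # (the lowercase integration name is an assumption).
    ollama launch claude
}
```

Call `launch_claude` from a terminal once the function is defined. It returns a non-zero status without launching anything when Ollama is missing, which makes the failure easier to spot than a bare "command not found".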

Once it is running, reload the Claude desktop app. A new model picker drop-down appears, listing every Ollama model it discovered automatically. You can choose from Kimi K2, GPT-OSS, Qwen, Devstral, Ministral, GLM, or MiniMax. Switch to Claude Cowork, pick a model, and let it handle real tasks autonomously (working on its own), such as organizing files or talking to your local apps. The same drop-down works in both the desktop app and VS Code.

Two limits to know: web search is not supported yet, and extensions are not supported yet. Sub-agents (helper AI tasks) inherit your current model choice. For full setup details, visit docs.ollama.com/integration/claudedesktop.

If you prefer to use Claude Code with a specific custom model, first pull the model with `ollama pull` followed by the model name, for example `ollama pull qwen3.5:9b`. Then run `ollama launch claude` in your VS Code terminal, where you can choose which local model Claude runs on. Note that open-source models may misbehave if they were not trained on Claude Code's tools, have a context window (the amount of text the model can remember at once) too small for Claude's system prompt, or do not follow the exact JSON protocol (data format rules) Claude Code expects.
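
The pull-then-launch sequence can be sketched as below. This is an illustrative sketch under stated assumptions: `run` and `pull_and_launch` are hypothetical helper names, the `DRY_RUN` switch is added here so the commands can be previewed without downloading anything, and the lowercase `claude` spelling is assumed.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Pull a specific model, then start Claude Code against local models.
# The default tag is the example from this lesson; substitute your own.
pull_and_launch() {
    model="${1:-qwen3.5:9b}"
    run ollama pull "$model" || return 1   # stop if the download fails
    run ollama launch claude               # lowercase name is an assumption
}
```

Setting `DRY_RUN=1` before calling `pull_and_launch` prints the two commands instead of running them, which is a low-risk way to confirm the model tag before committing to a large download.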

Lesson 3: Best practices and pitfalls

Running Ollama models through Claude now takes a single command: `ollama launch claude`. This command automatically pulls the right model, wires up environment variables (settings the system needs to connect everything), and starts the agent (an AI that performs tasks for you) in about ten seconds. Before this shortcut, you had to install Ollama manually, pull a model, set three or four environment variables by hand, work out which models were compatible, and then launch the agent. That friction is gone.
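
For comparison, the manual setup that the shortcut replaces might have looked roughly like the sketch below. The lesson does not name the variables, so everything shown is an assumption: `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` are settings that Claude Code reads, `http://localhost:11434` is Ollama's default local address, and the token value is a placeholder. `configure_local_claude` is a hypothetical helper name. Check the current Ollama documentation before relying on any of these values.

```shell
#!/bin/sh
# Assumed manual setup (not taken from the lesson): point Claude Code at a
# local Ollama server instead of Anthropic's hosted API.
configure_local_claude() {
    export ANTHROPIC_BASE_URL="http://localhost:11434"   # Ollama's default port
    export ANTHROPIC_AUTH_TOKEN="ollama"                 # placeholder value
}

# After calling configure_local_claude, the agent would be started by hand
# with a model that has already been pulled, for example:
#   claude --model qwen3.5:9b
```

Seeing the steps spelled out makes it clearer what the one-line launcher is automating: the endpoint, the credential placeholder, and the model choice.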

The integration covers both the desktop app and the editor. The Claude desktop app shows a drop-down with all your local Ollama models, discovered automatically: Kimi K2, GPT-OSS, Qwen, and others. You can also use Claude Code inside VS Code with the same drop-down, and the `ollama launch claude` command works there too.

But pitfalls exist. Open-source models running in Claude Code may misbehave for three specific reasons. They might not have been trained on Claude Code's tools (the functions and commands Claude Code uses to do its work). They might have a context window (the amount of text the model can remember at once) too small for Claude's system prompt (the initial instructions that guide behavior). And they might not follow the exact JSON protocol (the structured data format) that Claude Code expects. Think of it like putting a motorcycle engine into a truck: the parts don't match.

To avoid these problems, use a model with a sufficiently large context window. The Qwen 3.5 model with a 64,000-token context window is a common choice for this reason. Also use Claude skills (pre-built instructions that improve model behavior) such as "superpowers," which makes the model plan before coding, work in an isolated environment, and write tests first. This reduces chaotic behavior from models that were not trained on Claude's specific agent tools.
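
One way to pin a larger context window is an Ollama Modelfile, sketched below. `FROM` and `PARAMETER num_ctx` are standard Modelfile instructions; `write_modelfile` is a hypothetical helper, `qwen-64k` is a made-up model name, and whether a given model and machine can actually handle 64,000 tokens is an assumption to verify on your own hardware.

```shell
#!/bin/sh
# Hypothetical helper: write a Modelfile that sets the context window.
write_modelfile() {
    base_model="$1"   # e.g. qwen3.5:9b (the tag used in this lesson)
    num_ctx="$2"      # context window in tokens, e.g. 64000
    out_file="$3"     # path where the Modelfile is written
    printf 'FROM %s\nPARAMETER num_ctx %s\n' "$base_model" "$num_ctx" > "$out_file"
}

# Example use (nothing runs until these lines are uncommented):
#   write_modelfile qwen3.5:9b 64000 Modelfile
#   ollama create qwen-64k -f Modelfile   # "qwen-64k" is a made-up name
```

The derived model then appears alongside your other local models, so it can be selected like any other entry.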
