“Free models” in OpenClaw can mean two different things, and mixing them up is where most people waste time.
One kind of free is truly free because the model runs locally and you only pay in CPU, RAM, GPU, and electricity. Think Ollama or an OpenAI compatible runtime you host yourself.
The other kind is “free tier” free where a hosted provider gives you a quota, credits, or OAuth access. That can be great, but it comes with rate limits, policy limits, and the occasional surprise outage or sudden cap.
This guide is a long one because model config is one of those things that looks simple until you have to debug why tool calls got slow, why you hit a 429, or why one agent is using a different auth profile than you expected. We’ll keep it practical.
If you are new to OpenClaw and want the basics first, you can read what OpenClaw is and how it works. If you are already running it, let’s wire up models properly.
How OpenClaw model references work
OpenClaw model references use provider/model format. Example: openai/gpt-5.1-codex or ollama/llama3.3.
If you set agents.defaults.models you are effectively creating an allowlist. Only those models are eligible. That is useful when you want to prevent random model selection drift across environments.
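A minimal sketch of that allowlist idea, assuming the same map shape the LM Studio example further down uses (the empty objects just mean “no per-model overrides”):
{
  "agents": {
    "defaults": {
      "models": {
        "ollama/llama3.3": {},
        "openrouter/meta-llama/llama-3.2-3b-instruct:free": {}
      }
    }
  }
}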
CLI helpers that matter in real life:
- openclaw onboard for initial auth and provider setup
- openclaw models list to see what OpenClaw can see
- openclaw models set provider/model to switch quickly without editing config
Those are called out directly in the provider docs and they save you from config typos.
Pick your strategy before you paste random keys
Strategy A: Local first with a hosted fallback
This is my default recommendation for most self-hosters. Local models handle the boring day to day tasks and your fallbacks cover “I need a better brain for this one” or “my local box is busy”. It also keeps cost predictable because most traffic stays local.
Strategy B: Hosted free tiers only
This can work if you do not want to run any local inference and your usage is light. The downside is that free tiers change. You will hit limits. You will want fallbacks.
Strategy C: Local only
This is the cleanest from a privacy and cost perspective. It is also the most hardware dependent. A 3B model on a small machine can feel great for quick automation and then fall apart on long reasoning tasks.
Local models that are truly free
Ollama
Ollama is the most common “I want it free and local” route. OpenClaw supports it as a provider and it can auto-detect a local Ollama server at http://127.0.0.1:11434/v1.
Minimal setup looks like this:
ollama pull llama3.3
openclaw models list
Then set your default model in openclaw.json:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.3"
      }
    }
  }
}
If you want a sane “starter” set, keep your primary as a general model and add one coding model as a fallback. Keep it small at first. You can always scale up later.
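As a concrete sketch of that starter set, assuming ollama/qwen2.5-coder is the coding model you pulled (swap in whatever you actually run):
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.3",
        "fallbacks": ["ollama/qwen2.5-coder"]
      }
    }
  }
}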
Ollama caveat about streaming
Some builds disable streaming for certain Ollama setups because of SDK and streaming format quirks. If you see weird tool tokens leaking into text output this is usually why. Treat streaming as optional rather than mandatory.
LM Studio, vLLM, LiteLLM, llama.cpp, and other OpenAI compatible runtimes
OpenClaw can use almost any OpenAI compatible base URL via models.providers. The docs even show a pattern for LM Studio style endpoints with explicit provider config.
Example pattern:
{
  "agents": {
    "defaults": {
      "model": { "primary": "lmstudio/minimax-m2.1-gs32" },
      "models": { "lmstudio/minimax-m2.1-gs32": { "alias": "MiniMax" } }
    }
  },
  "models": {
    "providers": {
      "lmstudio": {
        "baseUrl": "http://localhost:1234/v1",
        "apiKey": "LMSTUDIO_KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "minimax-m2.1-gs32",
            "name": "MiniMax M2.1",
            "contextWindow": 200000,
            "maxTokens": 8192,
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }
          }
        ]
      }
    }
  }
}
Notice the cost fields are set to zero. That is a nice mental signal in config: local equals no per-token billing.
Hosted options with free tiers and free access routes
This section is about providers where you can run OpenClaw without paying immediately. Terms change, quotas change, and some “free” is really “free until the promo credits are gone”. Treat this as a menu, not a promise.
Qwen OAuth
This is the one I recommend you try if you want a good free tier without playing “rotate 12 API keys”. OpenClaw supports Qwen via an OAuth device-code flow with a bundled plugin.
Enable the plugin and log in:
openclaw plugins enable qwen-portal-auth
openclaw models auth login --provider qwen-portal --set-default
Model refs look like:
- qwen-portal/coder-model
- qwen-portal/vision-model
Qwen’s docs for this flow describe a daily free request quota tied to OAuth access.
My personal take: Qwen is one of the rare “free tier” options that still feels usable for real work if you pick the right model for the right job. I would not run a whole team on it for free, but for a personal agent it can be a sweet spot.
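If you skipped --set-default at login, the models set helper from earlier switches between the two refs without touching config:
openclaw models set qwen-portal/coder-model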
OpenRouter free models
OpenRouter is useful because it aggregates many models behind one provider. Their docs also explain the :free suffix on certain models.
In OpenClaw you typically just set:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openrouter/meta-llama/llama-3.2-3b-instruct:free"
      }
    }
  }
}
Free models can change over time, so treat model choice as something you revisit occasionally. If a free model disappears you want a fallback already configured.
Groq
Groq is popular because it is fast and it often offers generous access for experimentation. Your exact quota depends on what Groq is offering in your region and account tier. It is still worth wiring as a fallback because when it works it feels instant.
OpenClaw includes Groq as a built-in provider that uses GROQ_API_KEY.
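A minimal sketch of wiring it in, where both values are placeholders rather than real ids; pick an actual model from openclaw models list:
# placeholder key and model id, not real values
export GROQ_API_KEY="your-key-here"
openclaw models set groq/your-model-id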
Google Gemini
Gemini is also a built-in provider using GEMINI_API_KEY.
Google’s free usage and rate limits depend on product surface and plan, so I avoid promising specific numbers in a tutorial like this. If you want a rough sanity check, Google’s own community threads discuss free tier daily request caps for API style usage.
Mistral
Mistral is another built-in provider. If you have a free tier available in your region it can be a nice general fallback. OpenClaw uses MISTRAL_API_KEY for auth.
Cohere
Cohere is commonly used for summarization and classification style work. Depending on how you route it you can use a direct provider or an OpenAI compatible proxy. If you have a free tier, keep it as a fallback rather than a primary, unless you are sure it covers your usage.
Moonshot Kimi and Kimi Coding
Kimi is usually not “permanent free” but it is often accessible via promos, credits, or partner programs. OpenClaw’s docs show how to configure Moonshot as a custom provider with an OpenAI compatible base URL.
Example config skeleton from the docs:
{
  "agents": {
    "defaults": { "model": { "primary": "moonshot/kimi-k2.5" } }
  },
  "models": {
    "mode": "merge",
    "providers": {
      "moonshot": {
        "baseUrl": "https://api.moonshot.ai/v1",
        "apiKey": "${MOONSHOT_API_KEY}",
        "api": "openai-completions",
        "models": [{ "id": "kimi-k2.5", "name": "Kimi K2.5" }]
      }
    }
  }
}
Kimi Coding is a separate provider route in the docs using KIMI_API_KEY.
DeepSeek
DeepSeek is often described as “basically free” but it depends on whether you are running it locally or using a hosted API. If you run DeepSeek via Ollama or a local OpenAI compatible server then your cost is hardware. If you use a hosted API it is usually low-cost rather than zero.
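For the local route, a minimal sketch assuming a deepseek-r1 tag is available in your Ollama library (check the current tags before pulling):
ollama pull deepseek-r1
openclaw models set ollama/deepseek-r1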
How to set fallbacks so free tiers do not break your agent
OpenClaw handles failure in two layers: it can rotate auth profiles inside the same provider, and it can fall back to the next model in agents.defaults.model.fallbacks.
That matters more than it sounds, because free tiers hit rate limits and “billing disabled” states more often than paid plans. OpenClaw tracks cooldowns and disables profiles for longer when billing errors happen.
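A sketch of that second layer, reusing model refs from earlier in this guide; the fallbacks array is tried in order when the primary is unavailable:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.3",
        "fallbacks": [
          "qwen-portal/coder-model",
          "openrouter/meta-llama/llama-3.2-3b-instruct:free"
        ]
      }
    }
  }
}
Local stays primary, and the hosted free tiers only absorb traffic when the local box is busy or throttled. That is Strategy A from earlier, expressed as config.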
Auth profile stickiness and why your provider can “change” mid-week
OpenClaw stores API keys and OAuth tokens in auth profiles and it pins a chosen profile per session for cache friendliness.
If you have multiple profiles for the same provider, OpenClaw can rotate them based on config order or a round robin rule.
If you have ever had the feeling that an OAuth login “disappeared”, it is often just rotation. Pin a profile order if you want predictable behavior.
Safe setup for free models
A few things that should be boring policy in your setup:
- Never commit API keys. Put them in env vars or systemd env files (see the sketch after this list).
- Do not expose your gateway to the public internet without auth. Use tokens, a tailnet, or a proper reverse proxy.
- Keep tool access sane. A free model can still run dangerous tools if your agent can.
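For the systemd route, a sketch of the env file pattern; the path is hypothetical, and EnvironmentFile= is plain systemd, not an OpenClaw feature:
# /etc/openclaw/secrets.env -- hypothetical path, readable only by the service user
GROQ_API_KEY=your-key-here
GEMINI_API_KEY=your-key-here
# then reference it from the unit with a drop-in:
# [Service]
# EnvironmentFile=/etc/openclaw/secrets.env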
If you want the “skills layer” side of this, the security mindset is the same. Skills are an execution surface, not content. That topic is covered in our OpenClaw skills guide.
Quick troubleshooting checklist
OpenClaw cannot see any models
- Run openclaw models list and confirm providers show up.
- If using Ollama, make sure the service is reachable and ollama list shows models (see the quick check below).
- If using a proxy, confirm the baseUrl ends with /v1 for OpenAI compatible APIs.
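A quick reachability check for the local case, assuming the default Ollama address from earlier; /v1/models is the standard OpenAI compatible listing endpoint:
curl http://127.0.0.1:11434/v1/models
ollama list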
I keep hitting 401 or 403
- Check the correct env var for that provider. OpenClaw’s provider docs list the expected auth variables.
- If you have multiple auth profiles, a broken one can rotate in. Check auth profile order and cooldown state.
I keep hitting 429 rate limits
- Add fallbacks so OpenClaw can switch models when a provider throttles.
- Reduce concurrency for heavy workflows like long document summarization.
- If you are using free tiers, accept that this is normal and design around it.