“Free models” in OpenClaw can mean two different things, and mixing them up is where most people waste time.
One kind of free is truly free because the model runs locally and you only pay in CPU, RAM, GPU, and electricity. Think Ollama or an OpenAI compatible runtime you host yourself.
The other kind is “free tier” free where a hosted provider gives you a quota, credits, or OAuth access. That can be great, but it comes with rate limits, policy limits, and the occasional surprise outage or sudden cap.
This guide is a long one because model config is one of those things that looks simple until you have to debug why tool calls got slow, why you hit a 429, or why one agent is using a different auth profile than you expected. We’ll keep it practical.
If you are new to OpenClaw and want the basics first, you can read what OpenClaw is and how it works. If you are already running it, let’s wire up models properly.
How OpenClaw model references work
OpenClaw model references use provider/model format. Example: openai/gpt-5.1-codex or ollama/llama3.3.
If you set agents.defaults.models you are effectively creating an allowlist. Only those models are eligible. That is useful when you want to prevent random model selection drift across environments.
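A minimal sketch of that allowlist idea, assuming the same map shape the LM Studio example further down uses (the empty objects just mean “no per-model overrides”):
{
  "agents": {
    "defaults": {
      "models": {
        "ollama/llama3.3": {},
        "openrouter/meta-llama/llama-3.2-3b-instruct:free": {}
      }
    }
  }
}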
CLI helpers that matter in real life:
- openclaw onboard for initial auth and provider setup
- openclaw models list to see what OpenClaw can see
- openclaw models set provider/model to switch quickly without editing config
Those are called out directly in the provider docs and they save you from config typos.
Pick your strategy before you paste random keys
Strategy A: Local first with a hosted fallback
This is my default recommendation for most self-hosters. Local models handle the boring day to day tasks and your fallbacks cover “I need a better brain for this one” or “my local box is busy”. It also keeps cost predictable because most traffic stays local.
Strategy B: Hosted free tiers only
This can work if you do not want to run any local inference and your usage is light. The downside is that free tiers change. You will hit limits. You will want fallbacks.
Strategy C: Local only
This is the cleanest from a privacy and cost perspective. It is also the most hardware dependent. A 3B model on a small machine can feel great for quick automation and then fall apart on long reasoning tasks.
Local models that are truly free
Ollama
Ollama is the most common “I want it free and local” route. OpenClaw supports it as a provider and it can auto-detect a local Ollama server at http://127.0.0.1:11434/v1.
Minimal setup looks like this:
ollama pull llama3.3
openclaw models list
Then set your default model in openclaw.json:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.3"
      }
    }
  }
}
If you want a sane “starter” set, keep your primary as a general model and add one coding model as a fallback. Keep it small at first. You can always scale up later.
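As a concrete sketch of that starter set, assuming ollama/qwen2.5-coder is the coding model you pulled (swap in whatever you actually run):
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.3",
        "fallbacks": ["ollama/qwen2.5-coder"]
      }
    }
  }
}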
Ollama caveat about streaming
Some builds disable streaming for certain Ollama setups because of SDK and streaming format quirks. If you see weird tool tokens leaking into text output this is usually why. Treat streaming as optional rather than mandatory.
LM Studio, vLLM, LiteLLM, llama.cpp, and other OpenAI compatible runtimes
OpenClaw can use almost any OpenAI compatible base URL via models.providers. The docs even show a pattern for LM Studio style endpoints with explicit provider config.
Example pattern:
{
  "agents": {
    "defaults": {
      "model": { "primary": "lmstudio/minimax-m2.1-gs32" },
      "models": { "lmstudio/minimax-m2.1-gs32": { "alias": "MiniMax" } }
    }
  },
  "models": {
    "providers": {
      "lmstudio": {
        "baseUrl": "http://localhost:1234/v1",
        "apiKey": "LMSTUDIO_KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "minimax-m2.1-gs32",
            "name": "MiniMax M2.1",
            "contextWindow": 200000,
            "maxTokens": 8192,
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }
          }
        ]
      }
    }
  }
}
Notice the cost fields are set to zero. That is a nice mental signal in config: local equals no per-token billing.
Hosted options with free tiers and free access routes
This section is about providers where you can run OpenClaw without paying immediately. Terms change, quotas change, and some “free” is really “free until the promo credits are gone”. Treat this as a menu, not a promise.
Qwen OAuth
This is the one I recommend you try if you want a good free tier without playing “rotate 12 API keys”. OpenClaw supports Qwen via an OAuth device-code flow with a bundled plugin.
Enable the plugin and log in:
openclaw plugins enable qwen-portal-auth
openclaw models auth login --provider qwen-portal --set-default
Model refs look like:
- qwen-portal/coder-model
- qwen-portal/vision-model
Qwen’s docs for this flow describe a daily free request quota tied to OAuth access.
My personal take: Qwen is one of the rare “free tier” options that still feels usable for real work if you pick the right model for the right job. I would not run a whole team on it for free, but for a personal agent it can be a sweet spot.
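If you skipped --set-default at login, the models set helper from earlier switches between the two refs without touching config:
openclaw models set qwen-portal/coder-model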
OpenRouter free models
OpenRouter is useful because it aggregates many models behind one provider. Their docs also explain the :free suffix on certain models.
In OpenClaw you typically just set:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openrouter/meta-llama/llama-3.2-3b-instruct:free"
      }
    }
  }
}
Free models can change over time, so treat model choice as something you revisit occasionally. If a free model disappears you want a fallback already configured.
Groq
Groq is popular because it is fast and it often offers generous access for experimentation. Your exact quota depends on what Groq is offering in your region and account tier. It is still worth wiring as a fallback because when it works it feels instant.
OpenClaw includes Groq as a built-in provider that uses GROQ_API_KEY.
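A minimal sketch of wiring it in, where both values are placeholders rather than real ids; pick an actual model from openclaw models list:
# placeholder key and model id, not real values
export GROQ_API_KEY="your-key-here"
openclaw models set groq/your-model-id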
Google Gemini
Gemini is also a built-in provider using GEMINI_API_KEY.
Google’s free usage and rate limits depend on product surface and plan, so I avoid promising specific numbers in a tutorial like this. If you want a rough sanity check, Google’s own community threads discuss free tier daily request caps for API style usage.
Mistral
Mistral is another built-in provider. If you have a free tier available in your region it can be a nice general fallback. OpenClaw uses MISTRAL_API_KEY for auth.
Cohere
Cohere is commonly used for summarization and classification style work. Depending on how you route it you can use a direct provider or an OpenAI compatible proxy. If you have a free tier, keep it as a fallback rather than a primary, unless you are sure it covers your usage.
Moonshot Kimi and Kimi Coding
Kimi is usually not “permanent free” but it is often accessible via promos, credits, or partner programs. OpenClaw’s docs show how to configure Moonshot as a custom provider with an OpenAI compatible base URL.
Example config skeleton from the docs:
{
  "agents": {
    "defaults": { "model": { "primary": "moonshot/kimi-k2.5" } }
  },
  "models": {
    "mode": "merge",
    "providers": {
      "moonshot": {
        "baseUrl": "https://api.moonshot.ai/v1",
        "apiKey": "${MOONSHOT_API_KEY}",
        "api": "openai-completions",
        "models": [{ "id": "kimi-k2.5", "name": "Kimi K2.5" }]
      }
    }
  }
}
Kimi Coding is a separate provider route in the docs using KIMI_API_KEY.
DeepSeek
DeepSeek is often described as “basically free” but it depends on whether you are running it locally or using a hosted API. If you run DeepSeek via Ollama or a local OpenAI compatible server then your cost is hardware. If you use a hosted API it is usually low-cost rather than zero.
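For the local route, a minimal sketch assuming a deepseek-r1 tag is available in your Ollama library (check the current tags before pulling):
ollama pull deepseek-r1
openclaw models set ollama/deepseek-r1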
How to set fallbacks so free tiers do not break your agent
OpenClaw handles failure in two layers: it can rotate auth profiles inside the same provider, and it can fall back to the next model in agents.defaults.model.fallbacks.
That matters more than it sounds, because free tiers hit rate limits and “billing disabled” states more often than paid plans. OpenClaw tracks cooldowns and disables profiles for longer when billing errors happen.
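A sketch of that second layer, reusing model refs from earlier in this guide; the fallbacks array is tried in order when the primary is unavailable:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.3",
        "fallbacks": [
          "qwen-portal/coder-model",
          "openrouter/meta-llama/llama-3.2-3b-instruct:free"
        ]
      }
    }
  }
}
Local stays primary, and the hosted free tiers only absorb traffic when the local box is busy or throttled. That is Strategy A from earlier, expressed as config.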
Auth profile stickiness and why your provider can “change” mid-week
OpenClaw stores API keys and OAuth tokens in auth profiles and it pins a chosen profile per session for cache friendliness.
If you have multiple profiles for the same provider, OpenClaw can rotate them based on config order or a round robin rule.
If you have ever had the feeling that an OAuth login “disappeared”, it is often just rotation. Pin a profile order if you want predictable behavior.
Safe setup for free models
A few things that should be boring policy in your setup:
- Never commit API keys. Put them in env vars or systemd env files (see the sketch after this list).
- Do not expose your gateway to the public internet without auth. Use tokens, a tailnet, or a proper reverse proxy.
- Keep tool access sane. A free model can still run dangerous tools if your agent can.
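For the systemd route, a sketch of the env file pattern; the path is hypothetical, and EnvironmentFile= is plain systemd, not an OpenClaw feature:
# /etc/openclaw/secrets.env -- hypothetical path, readable only by the service user
GROQ_API_KEY=your-key-here
GEMINI_API_KEY=your-key-here
# then reference it from the unit with a drop-in:
# [Service]
# EnvironmentFile=/etc/openclaw/secrets.env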
If you want the “skills layer” side of this, the security mindset is the same. Skills are an execution surface, not content. That topic is covered in our OpenClaw skills guide.
Quick troubleshooting checklist
OpenClaw cannot see any models
- Run openclaw models list and confirm providers show up.
- If using Ollama, make sure the service is reachable and ollama list shows models (see the quick check below).
- If using a proxy, confirm the baseUrl ends with /v1 for OpenAI compatible APIs.
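A quick reachability check for the local case, assuming the default Ollama address from earlier; /v1/models is the standard OpenAI compatible listing endpoint:
curl http://127.0.0.1:11434/v1/models
ollama list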
I keep hitting 401 or 403
- Check the correct env var for that provider. OpenClaw’s provider docs list the expected auth variables.
- If you have multiple auth profiles, a broken one can rotate in. Check auth profile order and cooldown state.
I keep hitting 429 rate limits
- Add fallbacks so OpenClaw can switch models when a provider throttles.
- Reduce concurrency for heavy workflows like long document summarization.
- If you are using free tiers, accept that this is normal and design around it.