How ZeroClaw memory works and how to tune it

Alex

09/03/2026

ZeroClaw remembers your conversations, not just within a single chat session but across days, weeks and platform switches. If you tell your assistant something on Tuesday through Telegram, it can recall that context on Friday through Discord. This persistence is what separates a self-hosted agent from a throwaway ChatGPT conversation. It comes from a hybrid memory system that combines vector similarity search with traditional keyword matching inside a single SQLite database.

Understanding how this works isn't strictly necessary for casual use because the defaults are reasonable. But if you're running ZeroClaw for anything serious, knowing how to tune the memory system makes your assistant noticeably more useful over time.

The hybrid search system

When your assistant needs to recall something, ZeroClaw doesn't just do a simple text search. It runs two queries in parallel and merges the results.

Vector similarity (semantic search) — ZeroClaw generates embedding vectors for every piece of stored memory using a configurable embedding provider. When a new message comes in, it gets embedded too, and ZeroClaw finds stored memories that are semantically similar using cosine similarity. This catches conceptual relationships even when the exact words don't match. If you mentioned "deploying containers" last week, a question about "Docker setup" today will surface that memory even though the phrasing is different.
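To make the idea concrete, here is a minimal sketch of cosine similarity ranking in plain Python. The three-dimensional vectors are made up for illustration (real embeddings have hundreds or thousands of dimensions), and none of this is ZeroClaw's actual code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)

# Toy "embeddings", invented for this example.
stored = {
    "deploying containers": [0.9, 0.1, 0.2],
    "birthday dinner plans": [0.1, 0.9, 0.3],
}
query = [0.8, 0.2, 0.1]  # stand-in for the embedding of "Docker setup"

# Rank stored memories from most to least similar to the query.
ranked = sorted(stored, key=lambda k: cosine_similarity(query, stored[k]), reverse=True)
print(ranked[0])
```

The query vector points in nearly the same direction as the "deploying containers" vector, so that memory ranks first even though the two phrases share no words.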

FTS5 keyword search — SQLite's Full-Text Search extension provides BM25-ranked keyword matching. This catches exact matches that semantic search sometimes misses. If you told your bot a specific error code or a person's name, keyword search finds it reliably because it matches the literal text.
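The keyword side can be tried out with Python's built-in sqlite3 module, provided your SQLite build includes FTS5 (most do). The table name memories_fts and its single content column are hypothetical; ZeroClaw's real schema may look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, chosen for this example only.
conn.execute("CREATE VIRTUAL TABLE memories_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO memories_fts (content) VALUES (?)",
    [
        ("The deploy failed with error E1042 on the staging server",),
        ("Maria prefers meeting notes sent as plain text",),
        ("We talked about container orchestration options",),
    ],
)

def keyword_search(conn, term, limit=5):
    # bm25() returns lower (more negative) values for better matches,
    # so ascending order puts the best match first.
    return conn.execute(
        "SELECT content, bm25(memories_fts) AS score "
        "FROM memories_fts WHERE memories_fts MATCH ? "
        "ORDER BY score LIMIT ?",
        (term, limit),
    ).fetchall()

hits = keyword_search(conn, "E1042")
print(hits[0][0])
```

A literal token like E1042 is exactly the kind of thing an embedding can blur, while the full-text index matches it directly. Search terms that contain punctuation need to be quoted in FTS5 query syntax, which this sketch does not handle.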

By default, ZeroClaw weights these 70% vector and 30% keyword. The weighting is configurable, and we'll get to that.
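How the two result sets are merged internally isn't spelled out here, so treat the following as one plausible sketch rather than ZeroClaw's implementation. It assumes both score sets are min-max normalized to the 0..1 range and that keyword scores have already been flipped so that higher means better (raw SQLite BM25 scores run the other way). The function names and memory IDs are invented.

```python
def normalize(scores):
    """Min-max scale a {memory_id: score} dict into the range 0..1."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 1.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def hybrid_merge(vector_scores, keyword_scores, vector_weight=0.7, keyword_weight=0.3):
    """Combine two score dicts (higher = better) into one ranked list."""
    v = normalize(vector_scores)
    k = normalize(keyword_scores)
    combined = {
        # A memory missing from one result set contributes 0 for that side.
        mem_id: vector_weight * v.get(mem_id, 0.0) + keyword_weight * k.get(mem_id, 0.0)
        for mem_id in set(v) | set(k)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)

vector_scores = {"m1": 0.91, "m2": 0.60, "m3": 0.55}   # cosine similarities
keyword_scores = {"m2": 4.2, "m4": 1.1}                # flipped BM25 scores

for mem_id, score in hybrid_merge(vector_scores, keyword_scores):
    print(mem_id, round(score, 3))
```

With the default 70/30 split, m1 (the strongest semantic match) comes out on top. Calling hybrid_merge(vector_scores, keyword_scores, 0.5, 0.5) moves m2 to first place, because its keyword hit now counts for more.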

What gets stored

ZeroClaw stores conversation turns (your messages and the assistant's responses), along with metadata like timestamps, channel source and embedding vectors. Everything lives in a single file called memory.db in the ~/.zeroclaw/ directory. Your entire conversation history, all the learned context, every interaction your agent has had with you across all channels, fits in one SQLite database.

The database uses WAL (Write-Ahead Logging) mode with tuned PRAGMA settings for production performance. In practice this means reads and writes don't block each other, which matters when your assistant is simultaneously searching memory and storing new interactions.
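If you want to see the WAL behavior for yourself, the sketch below enables it on a throwaway database and reads from a second connection while the first has an uncommitted write. The extra PRAGMA values and the notes table are assumptions picked for the demo, not the settings ZeroClaw ships with.

```python
import os
import sqlite3
import tempfile

# Use a throwaway file; WAL mode is not available for in-memory databases.
db_path = os.path.join(tempfile.mkdtemp(), "memory-demo.db")
conn = sqlite3.connect(db_path)

# The pragma returns the journal mode that is now in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
# Example tuning pragmas (assumed values, not ZeroClaw's actual settings).
conn.execute("PRAGMA synchronous=NORMAL")
conn.execute("PRAGMA busy_timeout=5000")

conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES ('first memory')")
conn.commit()

# Start a write on the first connection and leave it uncommitted.
conn.execute("INSERT INTO notes (body) VALUES ('uncommitted memory')")

# A second connection can still read; it sees the last committed state.
reader = sqlite3.connect(db_path)
visible = reader.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
conn.commit()

print(mode, visible)
```

The reader returns immediately with the one committed row instead of waiting for the writer to finish.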

Memory configuration in config.toml

All memory settings live under the [memory] section in ~/.zeroclaw/config.toml. Here's what the key fields do:

[memory]
backend = "sqlite"
auto_save = true
hygiene_enabled = true
archive_after_days = 90
purge_after_days = 365

vector_weight = 0.7
keyword_weight = 0.3

embedding_provider = "openai"
embedding_model = "text-embedding-3-small"
embedding_dimensions = 1536
embedding_cache_size = 1000

response_cache_enabled = true

backend — The default is "sqlite" which is the hybrid vector + FTS5 system. Other options include "postgresql" (for external database setups), "markdown" (plain text files) and "none" (disable memory entirely). For almost everyone, sqlite is the right choice.

auto_save — When true, conversations are saved to memory automatically. You'd only turn this off if you want manual control over what gets remembered.

hygiene_enabled — Enables automatic memory maintenance. Old entries get archived and eventually purged based on the day thresholds you set.

archive_after_days / purge_after_days — Controls the lifecycle of memories. Archived memories are still searchable but have lower priority in retrieval. Purged memories are deleted. The defaults (90/365) are sensible for most setups.

vector_weight / keyword_weight — How much to weight each search method in hybrid retrieval. The default 70/30 split favors semantic understanding over exact matching. If your use case involves a lot of specific terms, codes or names, bumping keyword_weight to 0.4 or 0.5 can help. These should add up to 1.0.

embedding_provider / embedding_model — Which service generates the vector embeddings. OpenAI's text-embedding-3-small is cheap and effective. If you're running Ollama locally, you can point this at a local embedding model to keep everything on-server, though the quality of local embeddings varies. The Ollama setup guide covers local embedding configuration.

embedding_cache_size — Number of embeddings to cache in memory. Higher values reduce API calls for repeated or similar queries but use more RAM. 1000 is a good default for a VPS with 2+ GB of RAM.
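To get a feel for what embedding_cache_size trades off, here is a small least-recently-used cache sketch. It is a simplification: it only reuses embeddings for exactly repeated text, the EmbeddingCache class and fake_embed function are invented for the example, and ZeroClaw's own cache may work differently.

```python
from collections import OrderedDict

class EmbeddingCache:
    """Least-recently-used cache keyed by the text that was embedded."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, text, embed_fn):
        if text in self._items:
            self._items.move_to_end(text)  # mark as most recently used
            self.hits += 1
            return self._items[text]
        self.misses += 1
        vector = embed_fn(text)  # the expensive call, e.g. an API request
        self._items[text] = vector
        if len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict the least recently used entry
        return vector

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

def fake_embed(text):
    # Placeholder for a real embedding provider.
    return [float(len(text)), float(text.count(" "))]

cache = EmbeddingCache(max_size=2)
for text in ["docker setup", "docker setup", "error E1042", "backup plan", "docker setup"]:
    cache.get_or_compute(text, fake_embed)

print(cache.hits, cache.misses, round(cache.hit_rate(), 2))
```

With room for only two entries, "docker setup" is evicted before it comes around again, so the final lookup is a miss and the hit rate stays at 0.2. A larger cache would have kept it, at the cost of holding more vectors in RAM.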

Memory commands

ZeroClaw provides CLI commands for inspecting and managing memory:

zeroclaw memory list          # Show recent memories
zeroclaw memory get KEY       # Retrieve a specific memory
zeroclaw memory stats         # Show memory database statistics

The stats command is particularly useful. It tells you how many memories are stored, the database file size, embedding dimensions and cache hit rate. If the cache hit rate is very low, you might want to increase embedding_cache_size. If the database is getting large (hundreds of MB), consider lowering archive_after_days.
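Before lowering archive_after_days, it helps to picture which memories the change would affect. The sketch below classifies a memory by age using the two thresholds from the config. Whether ZeroClaw treats the boundary day as inclusive is an assumption here, and lifecycle_state is a made-up helper, not part of the CLI.

```python
from datetime import datetime, timedelta, timezone

def lifecycle_state(created_at, now, archive_after_days=90, purge_after_days=365):
    """Classify a memory as 'active', 'archived' or 'purged' by its age."""
    age_days = (now - created_at).days
    if age_days >= purge_after_days:
        return "purged"
    if age_days >= archive_after_days:
        return "archived"
    return "active"

now = datetime(2026, 3, 9, tzinfo=timezone.utc)
for days_old in (10, 75, 120, 400):
    created = now - timedelta(days=days_old)
    default_state = lifecycle_state(created, now)
    tighter_state = lifecycle_state(created, now, archive_after_days=60)
    print(days_old, default_state, tighter_state)
```

A 75-day-old memory is still active under the default threshold of 90 days but becomes archived once the threshold drops to 60.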

Tuning for different use cases

For a personal assistant that you chat with casually, the defaults work fine. Memory grows gradually and the 70/30 vector/keyword split handles the mix of casual references and specific details well.

For a team assistant in a group chat, consider raising embedding_cache_size to somewhere between 2000 and 5000, since multiple people generate more varied queries. You might also want to shorten archive_after_days to 60 if conversations move fast and older context becomes less relevant.

For a development or DevOps assistant that deals with error codes, hostnames, IP addresses and version numbers, raise keyword_weight to 0.4 or 0.5 and lower vector_weight to match, so the two still add up to 1.0. Semantic search is great for conceptual queries, but keyword matching is more reliable for retrieving specific technical identifiers.
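When you change one weight it's easy to forget the other. A quick sanity check like the one below can catch that before you restart the agent. It works on a plain dict holding the [memory] values, and check_memory_weights is a hypothetical helper, not something ZeroClaw provides.

```python
import math

def check_memory_weights(memory_config):
    """Return a list of problems found in the [memory] weight settings."""
    problems = []
    # Fall back to the documented defaults when a key is absent.
    vector = memory_config.get("vector_weight", 0.7)
    keyword = memory_config.get("keyword_weight", 0.3)
    for name, value in (("vector_weight", vector), ("keyword_weight", keyword)):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be between 0 and 1, got {value}")
    if not math.isclose(vector + keyword, 1.0, abs_tol=1e-9):
        problems.append(f"weights add up to {vector + keyword:.2f}, expected 1.0")
    return problems

devops = {"vector_weight": 0.5, "keyword_weight": 0.5}
forgot_to_lower_vector = {"vector_weight": 0.7, "keyword_weight": 0.5}

print(check_memory_weights(devops))
print(check_memory_weights(forgot_to_lower_vector))
```

On Python 3.11 or newer, the standard library's tomllib module can load config.toml into a dict, and the "memory" table from that dict can be passed straight to this function.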

For a deeper look at advanced memory management patterns like memory graphs and external memory bridges, we've covered some of these concepts in the context of OpenClaw's advanced memory system. Not all of those features have direct equivalents in ZeroClaw yet, but the conceptual framework translates.


FAQ

Can I back up my ZeroClaw memory?

How do I make ZeroClaw forget something?

Do I need an embedding provider if I'm using Ollama?
