
OpenClaw multi-agent coordination: patterns and governance


If you've already worked through the multi-agent setup guide, you know how to configure multiple agents within a single Gateway and route channels to the right one. What that guide doesn't cover is what happens next - when agents need to talk to each other, hand off tasks, share state, and operate without someone manually orchestrating every interaction. That's where most multi-agent setups either get genuinely powerful or fall apart completely.

The research on this is blunt: fewer than 10% of teams successfully scale beyond a single-agent deployment, and the main reasons are coordination complexity, uncontrolled token costs, and state management that nobody thought through properly.

This guide is about avoiding that. It's focused on the patterns that actually work in OpenClaw specifically, not abstract theory about multi-agent systems.

When multiple agents are worth the complexity

The honest answer is: less often than you'd think. A single well-configured agent with good tools covers most use cases without any of the coordination overhead. Before reaching for multiple agents, it's worth asking what specific problem you're solving.

Multi-agent setups genuinely earn their complexity in a few scenarios:

  • Security isolation. You need one agent to handle public Discord with minimal tool access, and another to handle personal DMs with exec permissions and access to sensitive files. Per-agent tool allow/deny lists make this clean, and you don't want those boundaries to be configuration mistakes — you want them enforced architecturally.
  • Domain specialization. A coding agent with dev tools and a research agent with web search running in parallel on a complex task is genuinely faster than one generalist agent doing both sequentially.
  • Multi-user routing. One WhatsApp number, but different users need different agents with different memory contexts and different permissions. Channel bindings handle this without any custom routing code.
  • Parallel workloads. Long-running tasks that can be decomposed and worked in parallel. A coordinator decomposes, specialists execute, coordinator aggregates.

The cases where multi-agent adds cost without real benefit: when you're just organizing prompts differently, when a single agent with good skills would do the same thing, and when the "coordination" is mostly you manually deciding what each agent does. If you're orchestrating it by hand, it's not really a multi-agent system — it's just multiple chatbots.

How OpenClaw routes between agents

Before getting into coordination patterns, it helps to understand how OpenClaw decides which agent handles a given message. The routing layer uses bindings: deterministic mappings from a (channel, accountId, peer/guild) tuple to an agentId. The most specific binding wins. A binding scoped to a specific Discord guild and channel beats a binding scoped to just that guild, which beats a binding scoped to all Discord.

This determinism is actually important. It means you can reason about where a message will go without tracing execution. It also means the routing layer itself doesn't need to be "smart" — that's the coordinator agent's job if further classification is needed after the initial routing.
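To make the most-specific-wins rule concrete, here is a minimal Python sketch of the resolution logic. The binding shape and function name are illustrative, not OpenClaw internals:

```python
# Hypothetical sketch of OpenClaw's most-specific-binding rule: each binding
# scopes some subset of (channel, accountId, peer); the binding that matches
# the message on the most bound fields wins.

def resolve_agent(bindings, channel, account_id, peer):
    """Return the agentId of the most specific matching binding, or None."""
    best, best_score = None, -1
    for b in bindings:
        score = 0
        for field, value in (("channel", channel),
                             ("accountId", account_id),
                             ("peer", peer)):
            if field in b:
                if b[field] != value:
                    break          # bound to a different scope: no match
                score += 1         # more bound fields = more specific
        else:
            if score > best_score:
                best, best_score = b["agentId"], score
    return best

bindings = [
    {"channel": "discord", "agentId": "general"},
    {"channel": "discord", "peer": "guild-42", "agentId": "ops"},
]
# The guild-scoped binding beats the channel-wide one:
print(resolve_agent(bindings, "discord", "acct-1", "guild-42"))  # → ops
```

Because resolution is a pure function of the binding set and the message tuple, you can unit-test your routing table offline before deploying it.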

Each agent in OpenClaw gets:

  • Its own workspace directory (~/.openclaw/agents/<agentId>/workspace) with independent MEMORY.md and memory files
  • Its own sessions directory (~/.openclaw/agents/<agentId>/sessions)
  • Its own tool allow/deny configuration
  • Its own auth profile and model configuration

They share the Gateway process, its concurrency limits, and optionally a shared workspace directory for cross-agent memory. That shared workspace is where coordination gets interesting.

Good coordination patterns

The coordinator-specialist pattern

This is the most reliable pattern for OpenClaw multi-agent setups and the one worth understanding first. The coordinator agent receives incoming tasks, classifies them, and delegates to the right specialist using sessions_spawn or sessions_send. Specialists are stateless — they do one task, return results, and terminate. The coordinator owns MEMORY.md and is responsible for aggregating and persisting anything worth keeping.

A minimal coordinator config looks like this:

{
  "id": "coordinator",
  "systemPrompt": "You decompose incoming tasks and delegate to specialist agents via sessions_send. You own MEMORY.md. Summarize results before storing. Never recurse — if a task comes back to you from a specialist, aggregate and close it.",
  "tools": ["sessions_send", "sessions_list", "memory_search", "read", "write"]
}

The key constraint in the system prompt is the last line. Coordinators that can recurse — sending tasks back to themselves or creating specialist chains that loop back — are how you get infinite delegation loops. More on that below.

Specialists don't need persistent memory. Their config reflects this:

{
  "id": "research-specialist",
  "systemPrompt": "You handle research tasks delegated by the coordinator. Complete the task, return a concise summary, then stop.",
  "tools": ["web_search", "read", "write"],
  "memory": {
    "enabled": false
  }
}

Disabling memory on stateless specialists isn't just an optimization — it prevents them from accumulating context across tasks that contaminates future runs.

Shared state via workspace files

The simplest coordination mechanism is shared files in a common workspace directory. Agents can read and write goal files, status files, and log files that other agents check. The pattern used in event-driven OpenClaw setups typically involves:

  • goal.md — the current task and its decomposition, written by the coordinator
  • plan.md — the execution plan with subtasks and assignments
  • status.md — current state of each subtask (pending / in-progress / complete / blocked)
  • log.md — append-only execution log for auditing and debugging

This file-based approach has the advantage of being completely inspectable. You can open any of these files and see exactly what state the system is in. The disadvantage is that file I/O adds latency and you need to think about write conflicts when multiple specialists update status concurrently.
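As a sketch of how a coordinator might consume that shared state, here is a minimal parser for status.md. The `task-id: state` line format is an assumption for illustration; any convention works as long as every agent writes it consistently:

```python
# Minimal sketch: parse a shared status.md into {task_id: state} so the
# coordinator can find blocked subtasks. The line format is an assumption.

from pathlib import Path

VALID_STATES = {"pending", "in-progress", "complete", "blocked"}

def read_status(path):
    """Parse status.md into {task_id: state}, ignoring malformed lines."""
    tasks = {}
    for line in Path(path).read_text().splitlines():
        if ":" not in line:
            continue
        task_id, _, state = line.partition(":")
        state = state.strip()
        if state in VALID_STATES:
            tasks[task_id.strip()] = state
    return tasks

Path("status.md").write_text(
    "research-1: complete\ncode-1: blocked\ncode-2: in-progress\n"
)
blocked = [t for t, s in read_status("status.md").items() if s == "blocked"]
print(blocked)  # → ['code-1']
```

Ignoring malformed lines instead of failing keeps the coordinator resilient to a specialist that writes a partial or sloppy update.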

The upcoming teams RFC and task mailbox

OpenClaw has a teams RFC in progress that will add a more formal coordination layer: a shared task list with dependency tracking and blocked/claimed task states, plus a per-agent mailbox for async P2P and broadcast messaging. When this lands, it'll replace the file-based workarounds for the most common coordination patterns. For now, file-based state and sessions_send are the stable options.

Preventing loops, deadlocks, and runaway delegation

This is the part that breaks most multi-agent setups that get past initial testing. The failure modes are predictable but easy to miss until they happen in production.

Infinite delegation loops

Symptom: coordinator sends to specialist A, specialist A sends back to coordinator, coordinator sends to specialist A again. Token costs spike, nothing gets done, Gateway hits concurrency limits.

Fix: enforce a strict no-recursion rule in coordinator and specialist prompts. The coordinator should never accept tasks routed back from specialists — if it receives output from a specialist, the correct response is to aggregate and close, not to re-delegate. You can also set max depth limits on spawn chains and use the Gateway's tool deny-list to prevent specialists from calling sessions_send to the coordinator at all.
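The max-depth limit can be sketched as a counter that travels with every delegated task. `sessions_send` here is represented by a plain callback, and the task shape is an assumption:

```python
# Illustrative depth guard against runaway delegation: the coordinator tags
# each delegated task with a depth counter and refuses to delegate past a
# maximum. `send` is a stand-in for the real sessions_send tool call.

MAX_DEPTH = 2

def delegate(task, send, depth=0):
    """Delegate a task unless the spawn chain is already too deep."""
    if depth >= MAX_DEPTH:
        raise RuntimeError(
            f"refusing to delegate {task['id']!r}: depth {depth} >= {MAX_DEPTH}"
        )
    send({**task, "depth": depth + 1})

sent = []
delegate({"id": "t1"}, sent.append)           # ok: depth 0 -> 1
delegate({"id": "t2"}, sent.append, depth=1)  # ok: depth 1 -> 2
try:
    delegate({"id": "t3"}, sent.append, depth=2)  # blocked by the guard
except RuntimeError as e:
    print(e)
```

The key property is that the counter is carried in the task payload itself, so the limit holds even when a specialist re-delegates without the coordinator's knowledge.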

Deadlocks on shared state

Symptom: specialist A is waiting for specialist B to complete a dependency, and specialist B is waiting for specialist A. Both are running, consuming tokens and Gateway concurrency slots, making no progress.

Fix: the RFC task list with explicit dependency tracking and cycle detection handles this cleanly when it ships. In the meantime: assign unique task IDs to everything, check status before claiming a task, and implement per-run timeouts. The teammate_shutdown tool can terminate a stuck agent if the coordinator detects it's been running too long without producing output.
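The claim-before-work-with-timeout interim pattern can be sketched like this. The task dict shape is an assumption; the RFC task list will formalize it:

```python
# Sketch of claim-before-work with a staleness timeout, so two specialists
# never work the same task and a stuck run can be reclaimed.

import time

def try_claim(task, agent_id, timeout_s=300):
    """Claim a pending task, or reclaim one whose previous run timed out."""
    now = time.time()
    stale = (task["state"] == "in-progress"
             and now - task.get("claimed_at", now) > timeout_s)
    if task["state"] == "pending" or stale:
        task.update(state="in-progress", claimed_by=agent_id, claimed_at=now)
        return True
    return False

task = {"id": "research-1", "state": "pending"}
print(try_claim(task, "specialist-a"))  # → True (claims it)
print(try_claim(task, "specialist-b"))  # → False (claimed, not yet stale)
```

Once the timeout elapses with no completion, a second claim succeeds, which is exactly the window in which the coordinator would also consider `teammate_shutdown` on the stuck agent.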

Race conditions on shared files

Symptom: two specialists write to status.md simultaneously, one overwrites the other's update, and the coordinator reads an inconsistent, partially written state.

Fix: use append-only files for logging (never overwrite, always append). For state files that need atomic updates, design writes to be idempotent — writing the same value twice produces the same result as writing it once. File locking, currently under development in OpenClaw, will address this more formally.
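One standard way to get atomic, idempotent state-file updates today is write-then-rename. This is a general filesystem technique, not an OpenClaw API:

```python
# Atomic state-file update: write to a temp file, then os.replace() it over
# status.md. Readers see either the old file or the new one, never a half-
# written mix. os.replace is atomic for same-directory renames on POSIX.

import os
from pathlib import Path

def write_state_atomic(path, content):
    """Replace `path` with `content` atomically; safe to retry."""
    tmp = Path(f"{path}.tmp")
    tmp.write_text(content)
    os.replace(tmp, path)

write_state_atomic("status.md", "research-1: complete\n")
write_state_atomic("status.md", "research-1: complete\n")  # retry: same result
print(Path("status.md").read_text())
```

Retrying the same write is harmless because the end state is identical, which is the idempotency property the fix above calls for.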

Cost runaway

This is a slower-moving failure but a real one. Each agent-to-agent handoff adds token overhead. A coordinator that summarizes verbosely before delegating, delegates to a specialist that summarizes verbosely before returning, then receives and summarizes again before writing to memory... you can burn through significant token budget on a task that a single agent would handle in one pass.

The mitigation is to make specialists genuinely stateless and concise. They should return structured output, not prose summaries. The coordinator is responsible for the final synthesis. Using a cheaper model for the coordinator is also worth considering — coordination decisions (classify, route, aggregate) don't require the most capable model, and the specialists doing the actual work are where quality matters.
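What structured specialist output might look like in practice (the field names here are illustrative, not an OpenClaw schema):

```json
{
  "task_id": "research-1",
  "status": "complete",
  "findings": [
    "OTEL traces are tagged by agentId",
    "Prometheus exporter surfaces per-agent token usage"
  ],
  "sources_consulted": 4,
  "tokens_used": 1820
}
```

A payload like this is a few dozen tokens; the prose summary of the same work is often ten times that, and the coordinator ends up re-summarizing it anyway.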

Managing token costs across agents

Token costs in multi-agent systems scale with the number of agents, the verbosity of inter-agent communication, and how much context each agent carries. A few concrete strategies:

Keep specialists stateless. No persistent memory, no history across tasks. Each run starts clean. This is the biggest single lever for controlling costs because you're not accumulating context that gets re-injected on every subsequent run.

Summarize before storing. When the coordinator writes results to MEMORY.md, it should write a concise summary, not a transcript of the specialist's output. Memory retrieval pulls those snippets into future contexts — dense, verbose memory entries are expensive.

Use concurrency limits deliberately. OpenClaw's Gateway has maxConcurrentRuns configuration and a lane/queue system for session isolation. Don't let six specialists run simultaneously if three will do. Parallel execution is faster but multiplies cost; find the right balance for your workload.

Use cheaper models for lightweight tasks. A coordinator doing task classification doesn't need the same model as a specialist doing complex code generation. OpenClaw's per-agent model configuration makes this easy — assign models by task complexity, not by habit.
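A sketch of how these two levers might look in config. Only maxConcurrentRuns and per-agent model assignment are named above; the surrounding key layout and model names are assumptions, so check your OpenClaw config schema:

```json
{
  "gateway": {
    "maxConcurrentRuns": 3
  },
  "agents": [
    { "id": "coordinator", "model": "small-fast-model" },
    { "id": "dev-specialist", "model": "large-capable-model" }
  ]
}
```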

The cron scheduler guide and the heartbeat vs cron comparison are relevant here too: scheduling heavy multi-agent workflows during off-peak periods and using heartbeats to do cheap status checks before triggering expensive model calls are both meaningful cost levers.

Monitoring multi-agent flows

Single-agent monitoring is straightforward: one process, one log, one trace. Multi-agent monitoring is genuinely harder because failures can be invisible at the individual agent level while the overall system silently misbehaves. The coordinator might be logging "task delegated successfully" while the specialist is stuck in a loop that nobody's watching.

OpenTelemetry tracing

OpenClaw's native OTEL integration emits GenAI-standard traces for LLM calls and tool invocations, tagged by agentId. This means you can trace a coordinator → specialist → sub-tool delegation chain as a single distributed trace and see exactly where time and tokens are being spent. Each span includes the agent ID, the tool called, latency, and token counts.

For production setups, forward these traces to Datadog, Grafana Tempo, or another trace backend. Set up alerts on spans that exceed expected duration thresholds — a specialist span that runs for 10 minutes when it should run for 30 seconds is usually a loop or a stuck tool call, not a slow model.

Prometheus metrics per agent

OpenClaw's Prometheus exporter surfaces per-agent metrics: runs, success/failure counts, token usage, and session counts. In a multi-agent setup, you want separate dashboards per agent, not just aggregate Gateway metrics. An anomaly in the coordinator's success rate while specialists look fine points to a coordination bug; an anomaly in one specialist while others look fine points to a skill or tool problem specific to that specialist's workload.

Behavioral drift

This is the monitoring problem unique to AI agents: the model's behavior can drift over time without any code change. A specialist that worked correctly for two weeks might start producing subtly wrong output because its memory context has accumulated noise, or because model updates changed something in its reasoning. The monitoring guide covers the technical instrumentation; the governance layer below is how you define what "correct" looks like and enforce it.

Governance: permissions, policies, and conflict resolution

Governance in a multi-agent system is the set of rules that prevent agents from doing things you didn't intend — whether that's accessing tools they shouldn't have, routing tasks incorrectly, or accumulating permissions through delegation chains.

Tool permissions per agent

OpenClaw's per-agent tool allow/deny lists are the primary governance mechanism. The deny list wins over the allow list — if a tool is denied, it can't be used regardless of what the allow list says. The principle here is minimal privilege: each agent should have access to exactly the tools it needs for its defined role, nothing more.

A practical example for a three-agent setup:

  • Coordinator: sessions_send, sessions_list, memory_search, read, write — coordination tools only, no exec, no external API calls
  • Research specialist: web_search, read, write — no exec, no sessions tools (can't spawn or communicate with other agents directly)
  • Dev specialist: exec, read, write, git — sandboxed via Docker, no web access by default

Preventing research and dev specialists from using sessions_send is a governance choice that also prevents delegation loops — they physically can't route tasks back to the coordinator or to each other.
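One way to express those boundaries in config. The nested allow/deny shape is an assumption (the agent configs earlier in this guide use a flat tools list), so adapt it to your schema:

```json
{
  "agents": [
    {
      "id": "coordinator",
      "tools": {
        "allow": ["sessions_send", "sessions_list", "memory_search", "read", "write"],
        "deny": ["exec"]
      }
    },
    {
      "id": "research-specialist",
      "tools": {
        "allow": ["web_search", "read", "write"],
        "deny": ["exec", "sessions_send", "sessions_list"]
      }
    },
    {
      "id": "dev-specialist",
      "tools": {
        "allow": ["exec", "read", "write", "git"],
        "deny": ["web_search", "sessions_send"]
      }
    }
  ]
}
```

Because the deny list wins, the specialists stay cut off from sessions tools even if a later edit accidentally adds them to an allow list.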

Sandbox isolation

For agents that run exec tools or handle untrusted input, OpenClaw's Docker sandbox mode (agents.defaults.sandbox.mode: "non-main" or "all") runs those agents in isolated containers with no network access by default. This is a hard boundary that tool allow/deny lists alone don't provide — even if a prompt injection attack convinces the agent to try something it shouldn't, the container's network isolation prevents it from reaching anything.
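A minimal fragment enabling that mode. The agents.defaults.sandbox.mode key is the one named above; the surrounding structure is assumed:

```json
{
  "agents": {
    "defaults": {
      "sandbox": {
        "mode": "non-main"
      }
    }
  }
}
```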

Auth profiles and model configuration per agent

Assign separate auth profiles to agents that connect to external services. If the research specialist's API key gets compromised, it shouldn't give access to the credentials the dev specialist uses. OpenClaw's per-agent auth profile configuration supports this directly. Disable beta_features on agents that don't need experimental capabilities — reducing the attack surface is worth it even when you trust your own setup.

The kill switch

The Gateway acts as a proxy for all LLM calls and tool invocations. This means that stopping the Gateway stops everything — all agents, all sessions, all scheduled tasks. In a situation where you've lost control of a multi-agent system (a loop that's burning API budget, an agent that's sending messages you didn't intend), Gateway shutdown is the emergency stop. Know how to do it quickly: openclaw gateway stop, or for containerized setups, docker compose stop openclaw-gateway.

Conflict resolution via binding specificity

When two bindings could handle the same message, OpenClaw uses the most-specific binding. This determinism means you can reason about conflict resolution statically, before anything runs. Design your binding configuration so the intended behavior is always the most specific match, and you eliminate an entire class of routing ambiguity.

Conflicts that bindings can't resolve (where the question is genuinely "which agent should handle this based on content") are a job for the coordinator, not the routing layer. The routing layer gets the message to the coordinator; the coordinator makes the judgment call based on context and task type.


How do I set up multiple agents?

Follow the multi-agent setup guide; it covers configuration: how to define multiple agents, assign channels, and get them running. This guide covers what happens after that: how agents coordinate, how to prevent common failure modes, how to control costs, and how to enforce governance. Think of them as part one and part two.
