OpenClaw is designed to run in Docker. The official repository ships a docker-setup.sh script, a Dockerfile, and a Docker Compose configuration that together handle the build, onboarding, and initial startup automatically. Most people getting started with Docker don't need to write any of that from scratch.
This guide covers three things: how to use the official setup path correctly and avoid the common pitfalls, how to customize the image and Compose stack for production use, and how to take a working Compose deployment and move it to Kubernetes with proper secrets management, persistent volumes, and autoscaling. If you're already running OpenClaw via systemd and want to know whether containerizing is worth it for your setup, see the discussion in the high availability and clustering guide.
Getting started: the official docker-setup.sh
The fastest path to a working Docker deployment is the official setup script in the OpenClaw repository. Clone the repo and run it:
git clone https://github.com/openclaw/openclaw.git
cd openclaw
./docker-setup.sh
The script does several things in sequence: it builds the Docker image locally from the included Dockerfile (or pulls a pre-built image if you set OPENCLAW_IMAGE first), runs the onboarding wizard interactively inside a container, generates a gateway token for the Control UI, creates the necessary volume directories, and starts the gateway via Docker Compose. When it finishes, the gateway is running at http://127.0.0.1:18789.
A few things that trip people up on first run:
- The build will OOM on a 1GB VPS. The npm install step during image build needs more memory than a 1GB instance provides. If you're on a cheap VPS, create a swap file first, then run the build:
sudo fallocate -l 2G /swapfile && sudo chmod 600 /swapfile && sudo mkswap /swapfile && sudo swapon /swapfile
The swap file can be removed afterward.
- Build takes 5-10 minutes on first run. Layer caching makes subsequent rebuilds much faster, but the initial build is slow because it installs all Node dependencies from scratch. Don't interrupt it.
- ChatGPT OAuth produces a confusing redirect. If you choose OpenAI Codex OAuth during onboarding, OpenClaw opens a browser URL that redirects back to a localhost callback that isn't running. Copy the full localhost URL from your browser and paste it back into the setup wizard. This is expected behavior.
- To skip the local build and use a pre-built image, point OPENCLAW_IMAGE at it before running the script:
export OPENCLAW_IMAGE="ghcr.io/openclaw/openclaw:latest" && ./docker-setup.sh
The script detects that the image name doesn't match the local build default and pulls instead of building.
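If you created a temporary swap file for the build, it can be torn down once the image has built successfully; a minimal sketch (requires root, like the creation step):

```shell
# Deactivate the temporary swap file and delete it
sudo swapoff /swapfile
sudo rm /swapfile
```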
After setup, the two containers running are openclaw-gateway (the long-running Gateway process) and openclaw-cli (a management container for running CLI commands). The volumes created are ~/.openclaw (config, credentials, sessions, cron jobs) and ~/openclaw/workspace (memory files, tools, workspace content).
Basic Compose management commands
# Start in background
docker compose up -d
# Stop everything
docker compose down
# Watch live logs
docker compose logs -f openclaw-gateway
# Restart after a config change
docker compose restart openclaw-gateway
# Run a management command (login, status, etc.)
docker compose run --rm openclaw-cli openclaw status --all
# Interactive channel setup (e.g. Telegram)
docker compose run --rm openclaw-cli openclaw channels login telegram
# WhatsApp QR scan (needs TTY)
docker compose run -it --rm openclaw-cli openclaw channels login whatsapp
Administrative commands should always go through the openclaw-cli container rather than exec-ing into the gateway. The gateway container is meant to run the long-lived process; the CLI container is the right tool for config changes and channel operations.
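If the long docker compose run incantation gets tedious, a small shell wrapper keeps day-to-day CLI usage short. This is a convenience of this guide, not part of OpenClaw, and it assumes you call it from the directory containing docker-compose.yml:

```shell
# Wrapper so `oc status --all` expands to the management invocation above
oc() {
  docker compose run --rm openclaw-cli "$@"
}

# Usage:
#   oc status --all
#   oc channels login telegram
```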
Building a custom Docker image
The official image (node:22-bookworm-slim, roughly 500MB) runs as the non-root node user and ships without Chromium, Playwright, or most system tools. This is the right default for security. If you need additional packages (ffmpeg for media processing, additional build tools for certain skills), use the OPENCLAW_DOCKER_APT_PACKAGES build argument rather than installing at runtime:
export OPENCLAW_DOCKER_APT_PACKAGES="ffmpeg build-essential"
./docker-setup.sh
If you change OPENCLAW_DOCKER_APT_PACKAGES, rerun docker-setup.sh to rebuild the image with the new packages baked in. Installing packages at container startup (not build time) works but means every container restart re-installs them, which is slow and potentially fragile.
Multi-stage Dockerfile for custom production builds
For full control, write your own Dockerfile using a multi-stage build. The first stage installs all dependencies and builds the project; the second stage is a lean runtime image that only copies the built output:
FROM node:22-bookworm AS builder
RUN curl -fsSL https://bun.sh/install | bash
ENV PATH="/root/.bun/bin:${PATH}"
RUN corepack enable
WORKDIR /app
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml .npmrc ./
COPY ui/package.json ./ui/package.json
COPY scripts ./scripts
RUN pnpm install --frozen-lockfile
COPY . .
RUN pnpm build && pnpm ui:install && pnpm ui:build
# Runtime stage
FROM node:22-bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
git curl jq \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/ui/dist ./ui/dist
COPY package.json pnpm-lock.yaml ./
RUN corepack enable && pnpm install --prod --frozen-lockfile
# The node base image already ships a non-root node user (uid 1000)
RUN chown -R node:node /app
USER node
ENV NODE_ENV=production
EXPOSE 18789 18793
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
CMD node dist/index.js health || exit 1
CMD ["node", "dist/index.js"]
The start-period=60s in the healthcheck gives the gateway enough time to complete startup before Docker starts counting failed health checks. Setting it too short causes the container to be restarted before it has finished initializing, which produces confusing restart loops.
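To watch the healthcheck converge during startup, you can poll Docker's view of the container state (the container name openclaw-gateway matches the Compose configuration in this guide):

```shell
# Show the current health state: starting, healthy, or unhealthy
docker inspect --format '{{.State.Health.Status}}' openclaw-gateway

# Dump the recent health probe runs, including their command output
docker inspect --format '{{json .State.Health}}' openclaw-gateway
```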
If you see Permission denied errors on /home/node/.openclaw with a custom image, your host volume mount isn't owned by uid 1000. Fix it: sudo chown -R 1000:1000 ~/.openclaw ~/openclaw/workspace.
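To confirm that ownership is actually the problem before running chown, check the numeric uid on the host:

```shell
# Print numeric owner:group for the host directories the container mounts;
# anything other than 1000:1000 here will fail inside the container
stat -c '%u:%g %n' ~/.openclaw ~/openclaw/workspace
```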
Adding sandbox support
OpenClaw's agent sandboxing (running tool executions in isolated sub-containers) requires Docker CLI inside the gateway image and access to the Docker socket. The build arg for this is OPENCLAW_INSTALL_DOCKER_CLI=1, which docker-setup.sh sets automatically when sandbox mode is enabled. If you're building manually:
docker build --build-arg OPENCLAW_INSTALL_DOCKER_CLI=1 -t openclaw:local .
And mount the Docker socket in your Compose service:
volumes:
- /var/run/docker.sock:/var/run/docker.sock
This is a significant security decision. The Docker socket gives the container root-equivalent access to the host. Only enable it if you've thought through the implications and have other controls in place.
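One common mitigating control is to put a filtering proxy between the gateway and the socket instead of mounting it directly, for example the tecnativa/docker-socket-proxy image, which exposes only whitelisted Docker API endpoints. A sketch (the image name and environment flags are that project's conventions, not OpenClaw's):

```yaml
  docker-proxy:
    image: tecnativa/docker-socket-proxy
    environment:
      CONTAINERS: 1   # allow container lifecycle endpoints
      POST: 1         # allow write methods (needed to launch sandboxes)
      IMAGES: 0       # deny image management
      EXEC: 0         # deny exec into arbitrary containers
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
```

The gateway then talks to tcp://docker-proxy:2375 instead of the raw socket, assuming it honors the standard DOCKER_HOST environment variable. Anything the proxy denies returns a 403 rather than reaching the Docker daemon.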
Production Docker Compose configuration
Here's a complete production-ready Compose file with the gateway, CLI management container, optional QMD memory backend, and nginx reverse proxy:
version: '3.8'
services:
openclaw-gateway:
image: openclaw/openclaw:latest
container_name: openclaw-gateway
restart: unless-stopped
ports:
- "127.0.0.1:18789:18789" # Bind to localhost only; nginx handles external
volumes:
- openclaw-home:/home/node/.openclaw
- openclaw-workspace:/home/node/workspace
environment:
- NODE_ENV=production
- OPENCLAW_HOME=/home/node/.openclaw
secrets:
- anthropic_key
- telegram_token
command: openclaw gateway
deploy:
resources:
limits:
cpus: '2.0'
memory: 2G
reservations:
cpus: '0.5'
memory: 512M
healthcheck:
test: ["CMD-SHELL", "node dist/index.js health || exit 1"]
interval: 30s
timeout: 10s
start_period: 60s
retries: 3
openclaw-cli:
image: openclaw/openclaw:latest
    volumes:
      - openclaw-home:/home/node/.openclaw
      - openclaw-workspace:/home/node/workspace
entrypoint: openclaw
profiles:
- tools # Only starts with: docker compose --profile tools run openclaw-cli ...
qmd:
image: tobi/qmd:latest
restart: unless-stopped
volumes:
- openclaw-workspace:/workspace
environment:
- QMD_WORKSPACE=/workspace
nginx:
image: nginx:alpine
restart: unless-stopped
ports:
- "80:80"
- "443:443"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
- certs:/etc/nginx/certs
depends_on:
- openclaw-gateway
secrets:
anthropic_key:
file: ./secrets/anthropic_key.txt
telegram_token:
file: ./secrets/telegram_token.txt
volumes:
openclaw-home:
openclaw-workspace:
certs:
Binding the gateway port to 127.0.0.1:18789 instead of 0.0.0.0:18789 is important on a public VPS. It means the gateway is only reachable through nginx, not directly from the internet. Exposed OpenClaw gateways on public IPs have been found in the wild leaking API keys and session history. Don't skip this.
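You can verify the binding from the host itself; the listener should appear on the loopback address only (your-vps-ip below is a placeholder for your machine's public address):

```shell
# Should show 127.0.0.1:18789, not 0.0.0.0:18789 or [::]:18789
ss -tln | grep 18789

# From a different machine this must fail with refused/timeout
curl -m 5 http://your-vps-ip:18789/ || echo "good: not reachable externally"
```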
Injecting secrets and environment variables
OpenClaw resolves ${VAR} references in config at startup. There are three practical ways to inject secrets into a Docker deployment, in order of preference.
Docker secrets (recommended)
Docker Compose secrets mount files at /run/secrets/<name> inside the container. Create the secret files first, then reference them:
# Create secret files (chmod them tightly)
mkdir secrets
echo "sk-ant-your-key" > secrets/anthropic_key.txt
chmod 400 secrets/anthropic_key.txt
Then configure OpenClaw's SecretRef to read from that path:
secrets:
providers:
docker:
source: "file"
path: "/run/secrets"
mode: "singleValue"
models:
providers:
anthropic:
apiKey:
source: "file"
provider: "docker"
id: "anthropic_key"
Secret files are mounted read-only inside the container and don't appear in docker inspect output the way environment variables do. This is meaningfully more secure than environment variables for credentials.
Environment variables via .env file
If Docker secrets feel like too much overhead for a single-host personal setup, a .env file in the same directory as your docker-compose.yml is the next best option:
ANTHROPIC_API_KEY=sk-ant-...
TELEGRAM_BOT_TOKEN=123456:ABC-...
Docker Compose reads .env automatically and makes those variables available to services. Make sure the file has restricted permissions (chmod 600 .env) and is in .gitignore if you version-control your Compose config.
Environment variables directly in Compose (avoid for secrets)
Hardcoding values in the environment: block of docker-compose.yml is the least secure option since the values are in the Compose file in plaintext. Fine for non-sensitive config, not for API keys or bot tokens.
Kubernetes deployment
Kubernetes makes sense when you need automatic failover, rolling deployments without downtime, or horizontal scaling beyond what a single host can handle. The setup is more complex than Docker Compose, but the primitives are straightforward once you understand them. If you just want reliable uptime on a single VPS, Docker Compose with restart: unless-stopped and a monitoring alert is usually enough.
Namespace and secrets
Start by creating a namespace to keep OpenClaw resources isolated from other workloads on the cluster:
apiVersion: v1
kind: Namespace
metadata:
name: openclaw
Secrets in Kubernetes are base64-encoded (not encrypted at rest by default, though you can enable encryption at rest in your cluster config). Generate encoded values with echo -n 'your-value' | base64:
apiVersion: v1
kind: Secret
metadata:
name: openclaw-secrets
namespace: openclaw
type: Opaque
data:
ANTHROPIC_API_KEY: <base64-encoded-value>
TELEGRAM_BOT_TOKEN: <base64-encoded-value>
DISCORD_BOT_TOKEN: <base64-encoded-value>
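Hand-encoding values is error-prone: a trailing newline from echo silently corrupts the token. A sketch that avoids that pitfall and, alternatively, lets kubectl do the encoding (the key values shown are placeholders):

```shell
# printf %s avoids the trailing newline that a bare echo would add
printf %s 'sk-ant-your-key' | base64

# Or generate the whole Secret manifest without manual encoding
kubectl create secret generic openclaw-secrets \
  --namespace openclaw \
  --from-literal=ANTHROPIC_API_KEY='sk-ant-your-key' \
  --from-literal=TELEGRAM_BOT_TOKEN='123456:ABC-...' \
  --dry-run=client -o yaml > secret.yaml
```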
For production clusters, use External Secrets Operator to pull from HashiCorp Vault or AWS Secrets Manager rather than baking values into manifests. See the secrets management guide for how to connect external stores.
ConfigMap for non-secret configuration
apiVersion: v1
kind: ConfigMap
metadata:
name: openclaw-config
namespace: openclaw
data:
OPENCLAW_HOME: "/home/node/.openclaw"
NODE_ENV: "production"
Persistent volume claims
This is the most important infrastructure decision for a Kubernetes OpenClaw deployment. You need ReadWriteMany (RWX) access mode if you run more than one replica, because all pods must mount the same config and workspace volumes simultaneously. Standard block storage (AWS EBS, standard Kubernetes local-path) only supports ReadWriteOnce (RWO), which works for single-replica but not multi-replica.
RWX-capable options include NFS, CephFS, and Longhorn. NFS is the easiest to add to an existing cluster:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: openclaw-home-pvc
namespace: openclaw
spec:
accessModes:
- ReadWriteMany
storageClassName: nfs-client
resources:
requests:
storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: openclaw-workspace-pvc
namespace: openclaw
spec:
accessModes:
- ReadWriteMany
storageClassName: nfs-client
resources:
requests:
storage: 20Gi
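The nfs-client storage class referenced above has to come from a provisioner. One sketch using the nfs-subdir-external-provisioner Helm chart, where the NFS server address and export path are placeholders for your environment:

```shell
helm repo add nfs-subdir-external-provisioner \
  https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-provisioner \
  nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=10.0.0.5 \
  --set nfs.path=/exports/openclaw \
  --set storageClass.name=nfs-client

# Confirm both claims reach Bound status before deploying the gateway
kubectl get pvc -n openclaw
```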
Deployment manifest
apiVersion: apps/v1
kind: Deployment
metadata:
name: openclaw-gateway
namespace: openclaw
spec:
replicas: 3
selector:
matchLabels:
app: openclaw-gateway
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
template:
metadata:
labels:
app: openclaw-gateway
spec:
containers:
- name: gateway
image: openclaw/openclaw:latest
ports:
- containerPort: 18789
envFrom:
- secretRef:
name: openclaw-secrets
- configMapRef:
name: openclaw-config
volumeMounts:
- name: home
mountPath: /home/node/.openclaw
- name: workspace
mountPath: /home/node/workspace
resources:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "2"
memory: "2Gi"
livenessProbe:
httpGet:
path: /health
port: 18789
initialDelaySeconds: 60
periodSeconds: 15
failureThreshold: 3
readinessProbe:
httpGet:
path: /health
port: 18789
initialDelaySeconds: 10
periodSeconds: 5
volumes:
- name: home
persistentVolumeClaim:
claimName: openclaw-home-pvc
- name: workspace
persistentVolumeClaim:
claimName: openclaw-workspace-pvc
The initialDelaySeconds: 60 on the liveness probe prevents Kubernetes from killing and restarting the pod before it has finished initializing. Setting it too low is the most common cause of restart loops on fresh deployments.
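When a pod does land in a restart loop, the probe events and the previous container's logs usually show whether the delay is the culprit:

```shell
# Probe failures appear as events on the pod
kubectl describe pod -l app=openclaw-gateway -n openclaw

# Logs from the container instance that was killed
kubectl logs deployment/openclaw-gateway -n openclaw --previous
```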
Service and Ingress
apiVersion: v1
kind: Service
metadata:
name: openclaw-service
namespace: openclaw
spec:
selector:
app: openclaw-gateway
ports:
- name: gateway
port: 18789
targetPort: 18789
type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: openclaw-ingress
namespace: openclaw
annotations:
cert-manager.io/cluster-issuer: "letsencrypt-prod"
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
spec:
rules:
- host: openclaw.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: openclaw-service
port:
number: 18789
The 3600-second proxy timeouts are necessary for WebSocket connections that the gateway uses for some channel communication. Default nginx timeouts of 60 seconds will silently drop long-running connections.
Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: openclaw-hpa
namespace: openclaw
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: openclaw-gateway
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
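Once the HPA is active it owns the replica count: the replicas: 3 in the Deployment manifest is only an initial value and will be overridden as load changes. You can watch it work:

```shell
# Current vs. target utilization and the allowed replica range
kubectl get hpa openclaw-hpa -n openclaw --watch

# The Deployment's replica count now tracks the HPA's decisions
kubectl get deployment openclaw-gateway -n openclaw -w
```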
Pod Disruption Budget
Prevents voluntary disruptions (node drains, cluster upgrades) from taking down all replicas simultaneously. Note that minAvailable: 2 blocks drains entirely whenever the HPA has scaled down to its minimum of 2 replicas; lower it to 1 if that stalls maintenance:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: openclaw-pdb
namespace: openclaw
spec:
minAvailable: 2
selector:
matchLabels:
app: openclaw-gateway
Updating and rolling back
Docker Compose updates
# Pull the latest image
docker compose pull
# Recreate containers with the new image
docker compose up -d --force-recreate
# Watch startup logs
docker compose logs -f openclaw-gateway
Your named volumes survive this operation. If something breaks, tag the old image before pulling so you can return to it:
# Before pulling a new image, tag the current one
docker tag openclaw/openclaw:latest openclaw/openclaw:pre-upgrade
# If the new version has problems, revert
docker compose down
# Edit docker-compose.yml: change image to openclaw/openclaw:pre-upgrade
docker compose up -d
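A more robust variant of the tag-before-pull trick is to record the image digest, which stays valid even after the upstream latest tag moves; a sketch:

```shell
# Record the exact digest of the image currently in use
docker inspect --format '{{index .RepoDigests 0}}' openclaw/openclaw:latest

# Pin docker-compose.yml to that digest, e.g.:
#   image: openclaw/openclaw@sha256:...
```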
Kubernetes rolling updates and rollbacks
Update the image in the deployment, which triggers a rolling update respecting your maxUnavailable setting:
# Update to a specific version
kubectl set image deployment/openclaw-gateway \
gateway=openclaw/openclaw:2026.2.6 \
-n openclaw
# Watch the rollout
kubectl rollout status deployment/openclaw-gateway -n openclaw
# Roll back to the previous version if something goes wrong
kubectl rollout undo deployment/openclaw-gateway -n openclaw
# Roll back to a specific revision
kubectl rollout history deployment/openclaw-gateway -n openclaw
kubectl rollout undo deployment/openclaw-gateway --to-revision=3 -n openclaw
Kubernetes keeps rollout history (10 revisions by default) so you can return to any previous configuration, not just the immediately prior one. For zero-downtime deployments, the rolling update strategy with maxUnavailable: 1 means at least two pods are always serving traffic during a deployment.
For monitoring your running containers and alerting on health check failures, the monitoring guide covers Prometheus integration that works for both Docker Compose and Kubernetes deployments.

