Why scaling n8n is not as simple as adding CPU
When people outgrow a single-node setup, the instinct is to just upgrade the VPS size. More vCPUs, more RAM, done. But n8n isn’t built to magically spread workloads across cores: in the default (regular) mode everything runs in a single Node.js process, and a Node.js process executes JavaScript on one core. The way you actually scale n8n horizontally is with queue mode powered by Redis.
If you’re only running hobby automations, you may never need this. But if you’re processing hundreds of webhooks a minute, handling bulky API calls, or running workflows with long async jobs, queue mode becomes the difference between stability and constant retries.
How queue mode works
The main components
- Main process: receives webhooks, schedules jobs, hands them off to Redis.
- Redis: acts as a message broker, storing jobs until a worker picks them up.
- Workers: one or more processes that pull jobs from Redis and execute them.
- PostgreSQL: still stores workflow definitions, credentials, and execution history.
So the division of labor looks like this:
- Main = orchestration.
- Redis = queue.
- Workers = execution.
- Postgres = persistence.
What this changes
With queue mode, you can run multiple workers in parallel, each consuming jobs independently. This means one slow workflow doesn’t block the others, and scaling is now as easy as running more workers.
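Under the hood, a worker is just the n8n binary started in worker mode. A minimal sketch, assuming n8n is installed locally and the queue-mode variables from the next section are already exported:

# The same n8n binary, started as a worker instead of the main instance.
# Assumes EXECUTIONS_MODE=queue, QUEUE_BULL_REDIS_HOST and the DB
# variables are already set in the environment.
n8n worker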
Setting up Redis for n8n
Choosing Redis deployment
On a VPS you have two main options:
- Local Redis container alongside n8n and Postgres (simplest).
- Managed Redis from your provider (better for production scale).
For most setups I recommend starting with local Redis, then moving to managed once you hit serious throughput.
Docker Compose example
services:
  n8n-main:
    image: n8nio/n8n
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=n8n
      - DB_POSTGRESDB_DATABASE=n8n
      - QUEUE_BULL_REDIS_HOST=redis
      - N8N_ENCRYPTION_KEY=supersecret  # use a long random value in production
      - EXECUTIONS_MODE=queue           # queue mode; EXECUTIONS_PROCESS is deprecated
    ports:
      - "5678:5678"
    depends_on:
      - postgres
      - redis
  n8n-worker:
    image: n8nio/n8n
    command: worker                     # workers are started with "n8n worker"
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=n8n
      - DB_POSTGRESDB_DATABASE=n8n
      - QUEUE_BULL_REDIS_HOST=redis
      - N8N_ENCRYPTION_KEY=supersecret  # must match the main instance
      - EXECUTIONS_MODE=queue
    depends_on:
      - postgres
      - redis
  postgres:
    image: postgres:13
    environment:
      - POSTGRES_USER=n8n
      - POSTGRES_PASSWORD=n8n
      - POSTGRES_DB=n8n
  redis:
    image: redis:7
This creates one main, one worker, Postgres, and Redis. In production, you’ll often run multiple workers spread across nodes.
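Because the worker service holds no state of its own, Docker Compose can replicate it directly. A sketch, using the n8n-worker service name from the example above:

# run three identical workers from the same service definition
docker compose up -d --scale n8n-worker=3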
How many workers do you need?
There’s no magic number. It depends on:
- Workflow type: CPU heavy vs I/O heavy.
- External dependencies: APIs with rate limits.
- VPS resources: CPU cores and memory available.
A rough rule of thumb: one worker per vCPU core is safe, but don’t oversubscribe if your workflows are CPU bound. For I/O heavy automations (API polling, webhook forwarding), you can usually run more workers than cores without issue.
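Process count isn’t the only knob: each worker also has an internal concurrency limit. A sketch, assuming the n8n worker command’s --concurrency flag:

# one worker executing up to 20 jobs concurrently; I/O-bound jobs
# spend most of their time waiting, so this rarely pins a core
n8n worker --concurrency=20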
Redis configuration best practices
Persistence and durability
By default Redis keeps everything in memory and only persists periodic snapshots to disk. If your VPS crashes, you may lose recently queued jobs. If that’s unacceptable, enable Append Only File (AOF) mode:
appendonly yes
appendfsync everysec
This trades some write performance for durability.
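If Redis runs as a container, you can pass the same settings as flags instead of mounting a redis.conf. A sketch for the compose file above; the redis-data volume name is my own choice:

  redis:
    image: redis:7
    # same effect as appendonly/appendfsync in redis.conf
    command: ["redis-server", "--appendonly", "yes", "--appendfsync", "everysec"]
    volumes:
      - redis-data:/data   # AOF is pointless if the data dir dies with the container

volumes:
  redis-data: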
Security
- Bind Redis to localhost or a private network only.
- Require a password with requirepass in redis.conf (see the sketch after this list).
- Use Redis TLS if it’s on a managed service.
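If you set a password, n8n has to present it too. A sketch, pairing requirepass with n8n’s QUEUE_BULL_REDIS_PASSWORD variable on both main and workers:

# redis.conf
requirepass use-a-long-random-string

# n8n main and worker environment
QUEUE_BULL_REDIS_PASSWORD=use-a-long-random-string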
Postgres tuning when scaling workers
Redis gets a lot of attention, but Postgres becomes the bottleneck faster than people expect. With multiple workers slamming the DB, you’ll want to:
- Increase connection limits with max_connections.
- Use a connection pooler like PgBouncer.
- Prune execution history aggressively to avoid bloated tables (see the sketch after this list).
- Put Postgres on NVMe storage for consistent latency.
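Pruning is configured on the n8n side through environment variables. A sketch using n8n’s execution data settings; the values are examples, not recommendations:

# prune automatically, keep executions for 7 days (168 hours)
EXECUTIONS_DATA_PRUNE=true
EXECUTIONS_DATA_MAX_AGE=168
# optionally stop storing successful runs altogether
EXECUTIONS_DATA_SAVE_ON_SUCCESS=none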
Observability in queue mode
Metrics worth tracking
- Redis queue depth: if it keeps growing, workers can’t keep up (a quick check follows this list).
- Worker success vs failure rates: a sudden spike in failures usually points to a misconfiguration.
- Execution duration percentiles: shows if jobs are slowing over time.
- Postgres query times: detect bottlenecks before they stall workers.
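Queue depth is easy to check by hand. A sketch, assuming n8n’s default Bull prefix (bull) and queue name (jobs), which puts waiting jobs in the bull:jobs:wait list:

# number of jobs waiting for a worker to pick them up
redis-cli -h redis LLEN bull:jobs:wait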
Tools I actually use
- Prometheus + Grafana: for graphs and long term trends.
- Uptime Kuma: simple alerts if a worker stops responding.
- RedisInsight: GUI to peek into Redis queues when debugging.
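For Prometheus, n8n can expose its own scrape endpoint. A sketch, assuming the N8N_METRICS setting on the main instance:

# expose /metrics on the main instance for Prometheus to scrape
N8N_METRICS=true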
Common issues and how to fix them
Webhooks failing in queue mode
If webhooks stop reaching workers, check that WEBHOOK_URL is set correctly in .env and that your reverse proxy forwards the original Host and X-Forwarded-* headers.
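A sketch of the relevant .env entry, with n8n.example.com standing in for your real domain:

# public base URL used to generate webhook URLs
WEBHOOK_URL=https://n8n.example.com/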
Workers idle but queue growing
Usually Redis misconfiguration or workers pointing to the wrong host. Confirm both main and workers use the same Redis connection.
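A quick sanity check is to compare the Redis settings each container actually received. A sketch for the compose setup above:

# both commands should print identical host/port/password values
docker compose exec n8n-main printenv | grep QUEUE_BULL_REDIS
docker compose exec n8n-worker printenv | grep QUEUE_BULL_REDIS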
Jobs stuck in the active state
This happens when a worker dies mid-job. Check container logs. Restarting the worker usually releases the job.
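With the compose file above, that’s a one-liner:

docker compose restart n8n-worker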
High database load
Execution logs pile up fast. Use execution pruning or move to external logging.
Scaling strategies in real life
Vertical scaling
Increase the VPS size, but watch for diminishing returns: each n8n process executes JavaScript on a single core, so one giant worker still leaves the extra cores idle.
Horizontal scaling
Add more workers across nodes. Keep them pointing to the same Redis and Postgres. This is the true way to scale.
Hybrid scaling
Combine a slightly larger VPS with more workers. This balances cost and reliability.
FAQ
Do I need Redis to run multiple workers?
Yes. Redis is mandatory for queue mode. Without it, workers cannot coordinate.
Can I run Redis and Postgres on the same VPS as n8n?
Yes, for small to medium setups. For large workloads, separate them to reduce resource contention.
How do I know when to add another worker?
If queue depth consistently grows faster than workers can drain it, add more.
Does Redis persistence slow things down?
Slightly, but it’s worth it if you need durability. AOF mode is a good balance.
What’s the biggest mistake people make?
Treating queue mode like magic. It still requires DB tuning, monitoring, and careful workflow design.