Hermes Agent gateway memory leak: How to mitigate OOM

Ellie Grace Hayes

02/06/2026

If your Hermes Agent gateway is fine for a day, then suddenly the box runs out of memory at midnight and the OOM killer takes the gateway down, you've hit the long-running gateway memory leak. It's a real bug in the upstream code, tracked in the Hermes issue tracker. Every VPS deployment that runs more than 24 hours straight will see it eventually.

Below: how to confirm you're hitting this specific leak (vs other memory issues), the mitigations that work today and what to watch for upstream.

Let's go!

Confirm it's the gateway leak (not something else)

Three different memory problems look similar from the outside. Make sure you're chasing the right one.

Sign 1: memory grows linearly with uptime

If you graph RSS over time, the gateway leak produces a steady upward trend. Not flat. Not spiky. Steadily climbing maybe 30-100 MB per hour depending on conversation throughput.

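while true; do
  # Sample the first matching gateway process; skip the sample if it isn't running
  pid=$(pgrep -f "hermes gateway" | head -1)
  [ -n "$pid" ] && ps -o rss= -p "$pid" | awk -v t="$(date +%H:%M:%S)" '{printf "%s %.1f MB\n", t, $1/1024}'
  sleep 300
done | tee gateway-memory.log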

Let that run for a few hours. Plot it. Linear-up means leak. Flat with occasional bumps means normal operation.
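If you'd rather have a number than eyeball a plot, you can estimate the average growth rate from the first and last samples in gateway-memory.log. This is a minimal sketch, not an official tool. It assumes the log layout from the loop above (time in column one, megabytes in column two) and the 300-second sampling interval. The function name leak_rate is made up for illustration.

```shell
#!/usr/bin/env bash
# leak_rate FILE [INTERVAL_SECONDS]
# Prints the average RSS growth in MB per hour between the first and
# last sample in FILE. Assumes samples are evenly spaced (default 300 s).
leak_rate() {
  awk -v interval="${2:-300}" '
    NF >= 2 {
      v = $2 + 0              # "123.4MB" and "123.4" both coerce to 123.4
      if (n == 0) first = v
      last = v
      n++
    }
    END {
      if (n < 2) { print "need at least two samples" > "/dev/stderr"; exit 1 }
      hours = (n - 1) * interval / 3600
      printf "%.1f\n", (last - first) / hours
    }' "$1"
}
```

Run it as leak_rate gateway-memory.log. The result only means something if the gateway wasn't restarted during the capture, because a restart resets RSS and drags the average down.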

Sign 2: it survives provider switches

Some memory bloat is provider-side (caching response buffers). The gateway leak doesn't care what provider you use. If you switch from Anthropic to OpenRouter mid-session and memory keeps climbing at the same rate, that's the leak.

Sign 3: dmesg shows the OOM killer picking your gateway

dmesg | grep -i "killed process" | grep -i hermes
journalctl -u hermes-gateway --since "24 hours ago" | grep -i "oom\|killed"

If you see a kill entry naming the hermes gateway PID, that confirms a real OOM rather than something else taking it down.
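The kernel's kill line also carries the victim's PID, process name and resident memory at the time of the kill, which is handy when the grep above returns several entries. Here's a small sketch for pulling those fields out. It assumes the usual "Killed process PID (name) ... anon-rss:NNNkB" layout of Linux OOM messages, and oom_victims is a hypothetical helper name. The name in parentheses is the kernel's short process name, which may be your interpreter rather than "hermes", so the PID is the more reliable field.

```shell
#!/usr/bin/env bash
# oom_victims: read kernel log lines on stdin and print one summary line
# per OOM kill. Lines that are not OOM kill records are dropped.
oom_victims() {
  sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*anon-rss:\([0-9][0-9]*\)kB.*/pid=\1 name=\2 anon_rss_kb=\3/p'
}

# Typical use (may need root to read the kernel log):
#   dmesg | oom_victims
#   journalctl -k --since "24 hours ago" | oom_victims
```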

Mitigation 1: scheduled restarts (the boring fix that works)

This is what I run in production. Restart the gateway every 12 hours. The leak never gets bad enough to OOM. Users see at most a few seconds of downtime, and only if the restart happens to overlap with messaging activity.

Restart through systemd

If your gateway runs under systemd (covered in our Hermes Agent systemd setup), add a timer that restarts it:

sudo systemctl edit --force --full hermes-gateway-restart.service

Content:

[Unit]
Description=Restart Hermes Gateway periodically
After=hermes-gateway.service

[Service]
Type=oneshot
ExecStart=/bin/systemctl restart hermes-gateway.service

Then the timer:

sudo systemctl edit --force --full hermes-gateway-restart.timer

[Unit]
Description=Restart Hermes Gateway every 12h

[Timer]
OnBootSec=12h
OnUnitActiveSec=12h
Persistent=true

[Install]
WantedBy=timers.target

sudo systemctl enable --now hermes-gateway-restart.timer
sudo systemctl list-timers --all

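You should see the restart timer scheduled. Note that OnBootSec and OnUnitActiveSec count from boot and from the last run, so the restart times follow whenever the machine last booted, not the wall clock. To pin restarts to a window that doesn't overlap your busiest hour (mine fires at 4 a.m. local and 4 p.m. local), replace those two lines with OnCalendar=*-*-* 04,16:00:00. Persistent=true only has an effect with OnCalendar.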

Why 12 hours and not 24

The leak rate varies. On a busy bot with messaging gateways running, the gateway can OOM in 20 hours. On a quiet personal install, it goes 60+. 12 hours leaves a safe margin for most setups. Quiet boxes can stretch to 24 if you'd rather have fewer restart blips. The busiest boxes may need 6.

Mitigation 2: monitoring and alert before OOM

If you'd rather not restart on a fixed schedule, monitor memory and restart only when usage crosses a threshold.

sudo tee /usr/local/bin/hermes-mem-watchdog.sh > /dev/null << 'EOF'
#!/bin/bash
# Restart the gateway when its resident memory exceeds THRESHOLD (in MB).
THRESHOLD=2048
GATEWAY_PID=$(pgrep -f "hermes gateway" | head -1)
[ -z "$GATEWAY_PID" ] && exit 0   # gateway not running, nothing to do
RSS_MB=$(ps -o rss= -p "$GATEWAY_PID" | awk '{print int($1/1024)}')
[ -z "$RSS_MB" ] && exit 0        # process exited between pgrep and ps
if [ "$RSS_MB" -gt "$THRESHOLD" ]; then
  logger "Hermes gateway at ${RSS_MB}MB, restarting"
  systemctl restart hermes-gateway
fi
EOF
sudo chmod +x /usr/local/bin/hermes-mem-watchdog.sh

Run it every 5 minutes from cron or a systemd timer. Threshold 2048 MB is conservative for a 4 GB box. Tune to your total RAM.
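One way to tune the threshold is to derive it from the machine's RAM instead of hard-coding it. The sketch below keeps the same ratio as the 2048 MB on 4 GB example, which is half of total memory. That 50% default is my assumption, not a Hermes recommendation, and threshold_mb and total_ram_kb are illustrative names.

```shell
#!/usr/bin/env bash
# threshold_mb TOTAL_KB [PERCENT]
# Prints a restart threshold in MB as a percentage of total RAM.
# TOTAL_KB is MemTotal in kB as shown in /proc/meminfo; PERCENT defaults to 50.
threshold_mb() {
  local total_kb="$1" percent="${2:-50}"
  echo $(( total_kb * percent / 100 / 1024 ))
}

# Read MemTotal (kB) from /proc/meminfo on Linux.
total_ram_kb() {
  awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

if [ -r /proc/meminfo ]; then
  echo "Suggested threshold: $(threshold_mb "$(total_ram_kb)") MB"
fi
```

A nominal 4 GB box reports slightly less than 4194304 kB, so expect a figure a little under 2048. Lower the percentage if other services share the machine.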

The trade-off vs scheduled restart: less downtime on quiet days, slightly more risk of an unexpected restart in the middle of busy traffic.

Mitigation 3: bigger box (the lazy fix)

If the leak rate is 50 MB/hour and you have 16 GB of RAM available to the gateway, you can go roughly 13 days before hitting OOM. Not technically a fix but real. At the top of the leak range (100 MB/hour), a 2 GB VPS hits OOM in less than a day, while a 16 GB VPS lasts about a week.
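The arithmetic behind these estimates is plain division: memory available to the gateway divided by the leak rate. Here's a rough sketch for plugging in your own numbers. It's an idealised upper bound that ignores the gateway's baseline footprint and anything else running on the box, and hours_to_oom is a name I made up.

```shell
#!/usr/bin/env bash
# hours_to_oom HEADROOM_MB RATE_MB_PER_HOUR
# Prints whole hours until a steady leak consumes the given headroom.
hours_to_oom() {
  local headroom_mb="$1" rate="$2"
  echo $(( headroom_mb / rate ))
}

# Compare a small and a large box at two leak rates.
for ram_gb in 2 16; do
  for rate in 50 100; do
    echo "${ram_gb} GB at ${rate} MB/hour: $(hours_to_oom $(( ram_gb * 1024 )) "$rate") hours"
  done
done
```

At 100 MB/hour this gives 20 hours for 2 GB and 163 hours for 16 GB. Real headroom is smaller than total RAM, so treat the output as a best case.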

If you're already paying for a small VPS and the gateway is the only thing on it, moving up one tier on LumaDock buys you a lot of breathing room and removes the restart-cadence question for a while. Plans include unmetered bandwidth and no setup fees, so resizing mid-month is painless. Setup details in our Hermes Agent complete guide.

Mitigation 4: reduce conversation history retention

Part of the leak appears to be conversation history accumulating in memory. Aggressive compression keeps the working set smaller.

hermes config set session_max_messages 50
hermes config set session_auto_compress true

Sessions longer than 50 messages get summarised and compressed. The compressed summary stays. The original messages get evicted from working memory.

This isn't a full fix (the leak is still there) but it slows the rate. I've seen leak rates drop from 60 MB/hour to 25 MB/hour with this setting on. Worth doing if you can't get to a fix.

What we know about the cause

From the upstream issue thread, the leak appears related to streaming session buffers not being released after the response completes. Specifically, the _drop_trailing_empty_response_scaffolding code path in the message flush pipeline holds references that should be GC-able but aren't. It's in the same general area as the missing-assistant-messages bug we cover in our "database is locked" piece, but it's a different failure mode.

A fix is in flight in the Hermes repo. Watch the Hermes issues tracker for the specific PR landing. Until then, the restart cadence above is the operational answer.

After the upstream fix lands

Don't immediately rip out your restart timer. New releases sometimes introduce new leaks. Run the patched version with the restart timer still in place for at least a week, monitor the memory graph, confirm it stays flat. Then if you want to remove the timer, fine. I'd leave it on for peace of mind even after the fix, because there's no real cost to a midnight restart.

Logging context that helps if you're filing your own issue

If your leak looks different from what I described (faster than 100 MB/hour, a sawtooth pattern rather than a linear one, or only happening with certain providers), you might be hitting a related but separate bug. Capture:

Hermes version (hermes --version)
Provider, model, gateway channels in use
Average messages per hour
The memory graph from sign 1 above
Last 200 lines of ~/.hermes/logs/gateway.log
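Most of that list can be gathered into one file to attach to the issue. Here's a sketch under a few assumptions: the log lives at ~/.hermes/logs/gateway.log as listed above, the memory samples are in gateway-memory.log in the current directory, and the HERMES_LOG and MEM_LOG overrides plus the function name are my own inventions. Provider, model, channels and message rate still have to be filled in by hand.

```shell
#!/usr/bin/env bash
# collect_leak_report [OUTPUT_FILE]
# Writes version, memory samples and the log tail into one text file.
collect_leak_report() {
  local out="${1:-hermes-leak-report.txt}"
  local log="${HERMES_LOG:-$HOME/.hermes/logs/gateway.log}"
  local memlog="${MEM_LOG:-gateway-memory.log}"
  {
    echo "== Hermes version =="
    if command -v hermes > /dev/null 2>&1; then
      hermes --version
    else
      echo "hermes not found on PATH"
    fi
    echo "== Provider / model / channels / messages per hour =="
    echo "(fill in by hand)"
    echo "== Memory samples =="
    if [ -f "$memlog" ]; then cat "$memlog"; else echo "(no $memlog)"; fi
    echo "== Last 200 lines of $log =="
    if [ -f "$log" ]; then tail -n 200 "$log"; else echo "(log file not found)"; fi
  } > "$out"
}
```

Read the output before posting it. Gateway logs can contain message content or tokens you don't want in a public issue.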

File it on the GitHub repo. Upstream maintainers usually triage memory issues quickly because they hurt every VPS deployment.

Where this fits with the rest of the stack

Memory monitoring should sit alongside your existing health checks. If you've got Prometheus + Grafana already, the gateway RSS metric is a useful one to chart. If you don't, the watchdog script above is enough for a single VPS. Production hardening more broadly lives in our Hermes production hardening checklist.
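If node_exporter is part of your Prometheus setup, one low-effort way to get gateway RSS into Grafana is its textfile collector, which reads *.prom files from a directory you configure. The sketch below only formats the metric. The metric name hermes_gateway_rss_bytes is hypothetical, and the output directory in the usage comment is an example you'd replace with your own.

```shell
#!/usr/bin/env bash
# format_rss_metric RSS_KB
# Prints the gateway RSS in Prometheus text exposition format.
# RSS_KB is the value printed by `ps -o rss=` (kilobytes).
format_rss_metric() {
  local rss_kb="$1"
  echo "# HELP hermes_gateway_rss_bytes Resident set size of the Hermes gateway process."
  echo "# TYPE hermes_gateway_rss_bytes gauge"
  echo "hermes_gateway_rss_bytes $(( rss_kb * 1024 ))"
}

# Typical use from cron, writing atomically so a scrape never sees a partial file:
#   pid=$(pgrep -f "hermes gateway" | head -1)
#   format_rss_metric "$(ps -o rss= -p "$pid")" > /path/to/textfile-dir/hermes.prom.tmp
#   mv /path/to/textfile-dir/hermes.prom.tmp /path/to/textfile-dir/hermes.prom
```

Charting this as a gauge gives you the same linear-up picture as sign 1, and an alert rule on it can replace the watchdog threshold.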
