Hermes Agent gateway memory leak: How to mitigate OOM

If your Hermes Agent gateway is fine for a day, then suddenly the box runs out of memory at midnight and the OOM killer takes the gateway down, you've hit the long-running gateway memory leak. It's a real bug in the upstream code, tracked in the Hermes issue tracker. Every VPS deployment that runs more than 24 hours straight will see it eventually.

Below: how to confirm you're hitting this specific leak (vs other memory issues), the mitigations that work today and what to watch for upstream.

Let's go!

Confirm it's the gateway leak (not something else)

Several different memory problems look similar from the outside. The three signs below help make sure you're chasing the right one.

Sign 1: memory grows linearly with uptime

If you graph RSS over time, the gateway leak produces a steady upward trend. Not flat. Not spiky. It climbs steadily, roughly 30-100 MB per hour depending on conversation throughput.

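while true; do
  # pgrep -o returns only the oldest match, so ps always gets a single PID
  pid=$(pgrep -o -f "hermes gateway")
  # Timestamp comes from date because strftime is not available in every awk
  [ -n "$pid" ] && ps -o rss= -p "$pid" | awk -v t="$(date +%H:%M:%S)" '{print t, $1/1024 "MB"}'
  sleep 300
done | tee gateway-memory.log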

Let that run for a few hours. Plot it. Linear-up means leak. Flat with occasional bumps means normal operation.
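If you'd rather get a number before you plot anything, the growth rate can be estimated straight from the log. This is a minimal sketch, assuming the log format the loop above writes (a time, then RSS with an MB suffix) and one sample every 300 seconds. The leak_rate name is a hypothetical helper, not part of Hermes.

```shell
# leak_rate LOGFILE
# Estimates growth in MB/hour from the first and last sample.
# Elapsed time is derived from the sample count (one sample per 300 s),
# so a log that crosses midnight is handled without parsing timestamps.
leak_rate() {
  awk '{
         sub(/MB$/, "", $2)          # strip the unit so the value is numeric
         if (NR == 1) first = $2
         last = $2
       }
       END {
         if (NR < 2) { print "need at least two samples"; exit 1 }
         hours = (NR - 1) * 300 / 3600
         printf "%.1f MB/hour over %.1f hours\n", (last - first) / hours, hours
       }' "$1"
}

# Usage, once the sampling loop has collected some data:
#   leak_rate gateway-memory.log
```

A result that stays inside the 30-100 MB/hour band across several hours points at the leak. A result near zero points at normal operation. Two endpoints hide the shape of the curve, so still plot the data if the number looks odd.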

Sign 2: it survives provider switches

Some memory bloat is provider-side (caching response buffers). The gateway leak doesn't care what provider you use. If you switch from Anthropic to OpenRouter mid-session and memory keeps climbing at the same rate, that's the leak.

Sign 3: dmesg shows the OOM killer picking your gateway

# Reading the kernel log needs root on many distros; -T prints readable timestamps
sudo dmesg -T | grep -i "killed process" | grep -i hermes
journalctl -u hermes-gateway --since "24 hours ago" | grep -i "oom\|killed"

If you see a kill entry naming the hermes gateway PID, that confirms a real OOM rather than something else taking it down.

Mitigation 1: scheduled restarts (the boring fix that works)

This is what I run in production. Restart the gateway every 12 hours. The leak never gets bad enough to OOM. Users see at most a few seconds of downtime, and only if the restart overlaps with messaging activity.

Restart through systemd

If your gateway runs under systemd (covered in our Hermes Agent systemd setup), add a timer that restarts it:

sudo systemctl edit --force --full hermes-gateway-restart.service

Content:

[Unit]
Description=Restart Hermes Gateway periodically
After=hermes-gateway.service

[Service]
Type=oneshot
ExecStart=/bin/systemctl restart hermes-gateway.service

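Then the timer:

sudo systemctl edit --force --full hermes-gateway-restart.timer

Content:

[Unit]
Description=Restart Hermes Gateway every 12h

[Timer]
# Fixed wall-clock times (04:00 and 16:00 local). OnBootSec/OnUnitActiveSec would drift with boot time.
OnCalendar=*-*-* 04,16:00:00
# Catch up after boot if a scheduled run was missed (only has an effect with OnCalendar)
Persistent=true

[Install]
WantedBy=timers.target

Enable it and check the schedule:

sudo systemctl enable --now hermes-gateway-restart.timer
sudo systemctl list-timers --all

You should see the restart timer scheduled. Pick a window that doesn't overlap your busiest hour (mine fires at 4 a.m. local and 4 p.m. local).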

Why 12 hours and not 24

The leak rate varies. On a busy bot with messaging gateways running, the gateway can OOM in 20 hours. On a quiet personal install, it goes 60+. 12 hours is the safe upper bound for most setups. Quiet boxes can stretch to 24 if you'd rather have fewer restart blips. Busy boxes need 6.

Mitigation 2: monitor memory and restart before OOM

If you'd rather not restart on a fixed schedule, monitor memory and restart only when usage crosses a threshold.

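sudo tee /usr/local/bin/hermes-mem-watchdog.sh > /dev/null << 'EOF'
#!/bin/bash
# Restart the gateway when its resident memory crosses THRESHOLD (in MB)
THRESHOLD=2048
GATEWAY_PID=$(pgrep -f "hermes gateway" | head -1)
# Nothing to do if the gateway isn't running
[ -z "$GATEWAY_PID" ] && exit 0
# ps reports RSS in KB; convert to whole MB
RSS_MB=$(ps -o rss= -p "$GATEWAY_PID" | awk '{print int($1/1024)}')
# The process may have exited between pgrep and ps
[ -z "$RSS_MB" ] && exit 0
if [ "$RSS_MB" -gt "$THRESHOLD" ]; then
  logger "Hermes gateway at ${RSS_MB}MB, restarting"
  systemctl restart hermes-gateway
fi
EOF
sudo chmod +x /usr/local/bin/hermes-mem-watchdog.sh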

Run it every 5 minutes from cron or a systemd timer. Threshold 2048 MB is conservative for a 4 GB box. Tune to your total RAM.
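For the cron route, one way to wire it up is a file in /etc/cron.d. This is a sketch, assuming a distro that reads /etc/cron.d (Debian and Ubuntu do) and the script path used above. The write_watchdog_cron helper and the hermes-mem-watchdog file name are my own choices.

```shell
# Write a cron.d entry that runs the watchdog every 5 minutes as root.
# Files in /etc/cron.d use the system crontab format, which has a user field
# between the schedule and the command.
write_watchdog_cron() {
  local target=$1
  printf '%s\n' '*/5 * * * * root /usr/local/bin/hermes-mem-watchdog.sh' > "$target"
}

# Build the file somewhere writable and review it first
tmp=$(mktemp)
write_watchdog_cron "$tmp"
cat "$tmp"

# Then install it with root ownership (uncomment once it looks right):
# sudo install -m 0644 "$tmp" /etc/cron.d/hermes-mem-watchdog
```

The script needs root because it calls systemctl restart, which is why the entry names root as the user.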

The trade-off vs scheduled restart: less downtime on quiet days, slightly more risk of an unexpected restart in the middle of busy traffic.

Mitigation 3: bigger box (the lazy fix)

If the leak rate is 50 MB/hour and you have 16 GB of RAM available to the gateway, you can go a long time before hitting OOM. Not technically a fix but real. On a 2 GB VPS the leak bites in less than a day. On a 16 GB VPS you have a week.
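The arithmetic behind those numbers is just headroom divided by leak rate. Here is a rough sketch you can plug your own figures into. The headroom values are illustrative assumptions, not measurements, and hours_to_oom is a hypothetical helper name.

```shell
# hours_to_oom HEADROOM_MB LEAK_MB_PER_HOUR
# HEADROOM_MB is the RAM left for the gateway to grow into: total RAM minus
# the OS, other services and the gateway's own baseline.
hours_to_oom() {
  local headroom_mb=$1 leak_mb_per_hour=$2
  echo $(( headroom_mb / leak_mb_per_hour ))
}

hours_to_oom 1024 50     # 2 GB box with about half already in use: 20 hours
hours_to_oom 15360 50    # 16 GB box with about 1 GB in use: 307 hours, roughly 13 days
hours_to_oom 15360 100   # same box at the top of the leak range: 153 hours, about 6 days
```

Use the leak rate you measured in sign 1 rather than the 50 MB/hour example, since the rate depends heavily on traffic.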

If you're already paying for a small VPS and the gateway is the only thing on it, moving up one tier on LumaDock buys you a lot of breathing room and removes the restart-cadence question for a while. Plans include unmetered bandwidth and no setup fees, so resizing mid-month is painless. Setup details in our Hermes Agent complete guide.

Mitigation 4: reduce conversation history retention

Part of the leak appears to be conversation history accumulating in memory. Aggressive compression keeps the working set smaller.

hermes config set session_max_messages 50
hermes config set session_auto_compress true

Sessions longer than 50 messages get summarised and compressed. The compressed summary stays. The original messages get evicted from working memory.

This isn't a full fix (the leak is still there) but it slows the rate. I've seen leak rates drop from 60 MB/hour to 25 MB/hour with this setting on. Worth doing if you can't get to a fix.

What we know about the cause

From the upstream issue thread, the leak appears related to streaming session buffers not being released after the response completes. Specifically, the _drop_trailing_empty_response_scaffolding code path in the message flush pipeline holds references that should be eligible for garbage collection but aren't. Same general area as the missing-assistant-messages bug we cover in our "database is locked" piece, but a different specific failure mode.

A fix is in flight in the Hermes repo. Watch the Hermes issues tracker for the specific PR landing. Until then, the restart cadence above is the operational answer.

After the upstream fix lands

Don't immediately rip out your restart timer. New releases sometimes introduce new leaks. Run the patched version with the restart timer still in place for at least a week, monitor the memory graph, confirm it stays flat. Then if you want to remove the timer, fine. I'd leave it on for peace of mind even after the fix, because there's no real cost to an off-peak restart.

Logging context that helps if you're filing your own issue

If your leak looks different from what I described (faster than 100 MB/hour, a sawtooth pattern rather than a linear one, or growth that only happens with certain providers), you might be hitting a related but separate bug. Capture:

  • Hermes version (hermes --version)
  • Provider, model, gateway channels in use
  • Average messages per hour
  • The memory graph from sign 1 above
  • Last 200 lines of ~/.hermes/logs/gateway.log

File it on the GitHub repo. Upstream maintainers usually triage memory issues quickly because they hurt every VPS deployment.
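To save some copy-pasting, the machine-readable parts of that list can be gathered into one file. This is a sketch, assuming the sampling loop from sign 1 wrote gateway-memory.log to the current directory. The collect_gateway_report name and the report file name are hypothetical.

```shell
# collect_gateway_report OUTFILE
# Gathers the version, the memory samples and the gateway log tail.
# Missing pieces are noted in the report instead of aborting the run.
collect_gateway_report() {
  local out=$1
  {
    echo "== Hermes version =="
    if command -v hermes > /dev/null 2>&1; then
      hermes --version 2>&1
    else
      echo "hermes not found on PATH"
    fi
    echo "== Memory samples (gateway-memory.log) =="
    cat gateway-memory.log 2> /dev/null || echo "no gateway-memory.log in this directory"
    echo "== Last 200 lines of gateway.log =="
    tail -n 200 "$HOME/.hermes/logs/gateway.log" 2> /dev/null || echo "gateway.log not found"
  } > "$out"
}

collect_gateway_report hermes-leak-report.txt
```

Provider, model, channels and average messages per hour still need to be added by hand. Skim the log tail for tokens or private messages before you post it publicly.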

Where this fits with the rest of the stack

Memory monitoring should sit alongside your existing health checks. If you've got Prometheus + Grafana already, the gateway RSS metric is a useful one to chart. If you don't, the watchdog script above is enough for a single VPS. Production hardening more broadly lives in our Hermes production hardening checklist.
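If you do run Prometheus, one low-effort way to get gateway RSS into it is node_exporter's textfile collector. This is a sketch under assumptions: the metric name hermes_gateway_rss_bytes and the directory below are placeholders I picked, and the directory has to match whatever your node_exporter is started with in --collector.textfile.directory.

```shell
# Format one gauge sample in the Prometheus text exposition format.
# hermes_gateway_rss_bytes is a hypothetical metric name chosen for this sketch.
format_rss_metric() {
  local rss_kb=$1
  printf '# HELP hermes_gateway_rss_bytes Resident memory of the Hermes gateway process.\n'
  printf '# TYPE hermes_gateway_rss_bytes gauge\n'
  printf 'hermes_gateway_rss_bytes %d\n' "$(( rss_kb * 1024 ))"
}

# Assumed location; change it to match your node_exporter setup
TEXTFILE_DIR="${TEXTFILE_DIR:-/var/lib/node_exporter/textfile_collector}"

pid=$(pgrep -f "hermes gateway" | head -1)
if [ -n "$pid" ] && [ -d "$TEXTFILE_DIR" ]; then
  rss_kb=$(ps -o rss= -p "$pid" | tr -d ' ')
  if [ -n "$rss_kb" ]; then
    # Write to a temp name and rename so a scrape never sees a partial file
    format_rss_metric "$rss_kb" > "$TEXTFILE_DIR/hermes_gateway.prom.$$"
    mv "$TEXTFILE_DIR/hermes_gateway.prom.$$" "$TEXTFILE_DIR/hermes_gateway.prom"
  fi
fi
```

Run it on the same 5-minute cadence as the watchdog and chart the gauge in Grafana. A straight upward line between restarts is the leak.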

Answers to common questions...

How do I know if my Hermes Agent gateway has a memory leak or just normal memory growth?

A leak shows steady linear memory growth over hours, around 30-100 MB per hour depending on traffic. Normal operation is mostly flat. Track RSS every 5 minutes for a few hours and plot it.
