Fix Hermes 402 quota errors with fallback providers

HTTP 402 from your LLM provider means you hit a billing or quota limit: you are out of credits, you exhausted a daily token allowance, or your payment method was declined. The painful part isn't the 402 itself. It's that older Hermes versions treated a single 402 as fatal and killed the gateway. Suddenly your Telegram bot is silent, your scheduled briefing didn't fire, and the agent looks dead.

This guide covers three things: upgrading to a version that handles 402 cleanly, configuring fallback providers so a single quota issue doesn't take down the whole agent, and monitoring so you find out before users do.

What 402 looks like across providers

Provider-specific text varies:

  • Anthropic: HTTP 402: Your account is out of credits
  • OpenRouter: 402: insufficient_credits
  • MiniMax: daily_limit_busy with 402 status
  • OpenAI: usually returns 429 rather than 402, even when the account is out of credits (see the OpenAI notes below)

Step 1: Upgrade Hermes if your gateway dies on 402

Older Hermes versions propagated 402 up as a fatal gateway error. Newer versions retry and surface a clean error. If your gateway log shows a 402 followed by the gateway shutting down, you're on a version that needs upgrading:

hermes upgrade
hermes --version

Check the Hermes releases page for the current version. The 402 handling improvements landed in v0.13 or later.
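If you want to script the version check, a small comparison helper is enough. This is a sketch only: `version_at_least` is a hypothetical helper, and because the exact output format of `hermes --version` is not shown here, the function takes a bare version string that you extract yourself.

```shell
# Return 0 if version $1 (e.g. "0.14.2" or "v0.13") is at least
# major.minor $2. Only major and minor are compared.
version_at_least() {
  local have="${1#v}"
  local want="${2#v}"
  local have_major="${have%%.*}"
  local want_major="${want%%.*}"
  local have_rest="${have#*.}"
  local want_rest="${want#*.}"
  local have_minor="${have_rest%%.*}"
  local want_minor="${want_rest%%.*}"
  [ "$have_major" -gt "$want_major" ] && return 0
  [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Example: warn when the installed version predates the 402 handling.
# version_at_least "0.12.4" "0.13" || echo "upgrade needed"
```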

Step 2: Why fallback providers matter

Even with a clean upgrade, a single provider's 402 means the agent stops answering until the quota resets. Hit a monthly cap a week before it resets and you get a week of downtime. Hit a daily cap and you wait until midnight UTC. Either way the agent looks broken to users.

Fallback providers give Hermes a second route. When the primary returns 402, the gateway transparently retries through the secondary. Users see no failure. You get a log entry telling you the primary is exhausted, but the bot keeps working.

Step 3: Configure the fallback

Add a second provider

hermes provider list
hermes provider add openrouter --priority 2
hermes provider show

Priority 1 is the default, tried first. Priority 2 is fallback, only used when primary fails with a retriable error. You can chain priority 3 and 4 if you want belts and braces.
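To make the priority order concrete, here is a minimal sketch of the routing behaviour described above. It is not Hermes's implementation: `call_provider` is a hypothetical stub that returns canned HTTP statuses, and `route_request` is an illustrative name.

```shell
# Hypothetical stand-in for a real provider call: prints an HTTP status.
# In this stub "anthropic" is out of credits and "openrouter" is healthy.
call_provider() {
  case "$1" in
    anthropic) echo 402 ;;
    openrouter) echo 200 ;;
    *) echo 500 ;;
  esac
}

# Try providers in the order given (priority 1 first).
# Prints the name of the first provider that answers with 200.
route_request() {
  local provider
  local status
  for provider in "$@"; do
    status="$(call_provider "$provider")"
    case "$status" in
      200) echo "$provider"; return 0 ;;
      402|429|5??) echo "fallback: $provider returned $status" >&2 ;;
      *) echo "fatal: $provider returned $status" >&2; return 1 ;;
    esac
  done
  echo "all providers exhausted" >&2
  return 1
}
```

Calling `route_request anthropic openrouter` with this stub logs the 402 to stderr and prints `openrouter`, which mirrors the "users see no failure, you get a log entry" behaviour.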

Pick a fallback model that mirrors the primary

If primary is Anthropic Sonnet, set the OpenRouter fallback to a Sonnet-equivalent model (Sonnet 4.6 is available on OpenRouter too). Same model class, different billing relationship, so an Anthropic credit exhaustion doesn't take you down.

Step 4: Tune which errors trigger fallback

Default fallback statuses are 402, 429, 5xx. You can change this:

hermes config set fallback_on_status "402,429,500,502,503,504"
hermes config set fallback_max_retries 2

Don't add 401 to the fallback list

If you do, real auth misconfigurations get covered up. You only find out the primary key is broken when both providers exhaust together. See our Hermes 401 auth errors piece for why 401 should always stay visible.
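The matching rule itself is simple. The sketch below shows how a comma-separated status list like the one above could be checked; `should_fall_back` is a hypothetical name used for illustration, not a Hermes function.

```shell
# Return 0 if HTTP status $1 appears in the comma-separated list $2.
# Wrapping both sides in commas avoids partial matches such as
# "40" matching inside "402".
should_fall_back() {
  local status="$1"
  local list="$2"
  case ",${list}," in
    *",${status},"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: 402 falls through, 401 stays visible.
# should_fall_back 402 "402,429,500,502,503,504" && echo "retry on secondary"
# should_fall_back 401 "402,429,500,502,503,504" || echo "surface the auth error"
```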

Credential pools for the same provider

If your usage justifies a second account at the same provider, Hermes can rotate between two keys associated with the same provider config:

hermes auth add anthropic --pool primary
hermes auth add anthropic --pool secondary
hermes provider set anthropic --credential-pool primary,secondary

This is rate-limit smoothing, not quota smoothing. Two accounts hitting the same monthly cap together still hit the cap. The real win is for short bursts where one account would 429 but a pool of two stays under the per-account RPM.
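As an illustration of why a pool helps with bursts, here is a sketch of plain round-robin rotation, which spreads consecutive requests across keys so each account sees roughly half the request rate. How Hermes actually selects a key from the pool is not described here, so treat the rotation policy and the names `POOL` and `next_key` as assumptions.

```shell
# Hypothetical pool of key labels for one provider, space-separated.
POOL="primary secondary"

# Put the next label in NEXT_KEY and move it to the back of the pool.
# A global is used instead of echo so the rotation survives the call
# (command substitution would run the function in a subshell).
next_key() {
  NEXT_KEY="${POOL%% *}"
  case "$POOL" in
    *" "*) POOL="${POOL#* } $NEXT_KEY" ;;   # rotate only with 2+ labels
  esac
}
```

Three consecutive calls yield `primary`, `secondary`, `primary`.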

Monitoring to catch 402 early

The simplest pattern is to alert on the first 402 of a billing period. Every subsequent one is noise.

If you run the gateway under systemd (covered in our systemd setup), tail the gateway log into your monitoring stack:

journalctl -u hermes-gateway -f | grep --line-buffered "402" | while read -r line; do echo "$line" | mail -s "Hermes 402" [email protected]; done

Primitive but it works. If you have a real monitoring stack (Prometheus, Loki, anything with log alerting), wire the same rule.
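To alert only on the first 402 of a billing period, you can remember the last period you alerted for. The sketch below assumes calendar-month billing and uses a hypothetical state file path; adjust both to match your provider and host.

```shell
# Hypothetical state file remembering the last period we alerted for.
STATE_FILE="${STATE_FILE:-/tmp/hermes-402-alerted}"

# Return 0 (and record the period) only for the first 402 seen in
# period $1, e.g. "2025-01". Later calls for the same period return 1.
first_402_in_period() {
  local period="$1"
  if [ -f "$STATE_FILE" ] && [ "$(cat "$STATE_FILE")" = "$period" ]; then
    return 1   # already alerted this period; stay quiet
  fi
  printf '%s\n' "$period" > "$STATE_FILE"
  return 0
}

# Example wiring (ops@example.com is a placeholder address):
# journalctl -u hermes-gateway -f | grep --line-buffered "402" |
#   while read -r line; do
#     first_402_in_period "$(date +%Y-%m)" &&
#       echo "$line" | mail -s "Hermes 402" ops@example.com
#   done
```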

Provider-specific 402 quirks

Anthropic

Returns 402 cleanly with a clear message. Fallback works as expected.

OpenAI

Returns 429 for both rate limits and exhausted credits. The error code in the response body tells you which: insufficient_quota means you are out of credits. If you only configured fallback on 402, OpenAI exhaustion won't fall through, so keep 429 in the fallback statuses.
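If you post-process gateway logs or provider responses yourself, you can separate the two kinds of 429. The sketch below assumes the out-of-credits case carries the string `insufficient_quota` in the response body; `classify_failure` is a hypothetical helper, not part of Hermes.

```shell
# Classify a failed LLM API response as "quota", "rate_limit" or "other".
# $1 = HTTP status, $2 = raw response body.
classify_failure() {
  local status="$1"
  local body="$2"
  case "$status" in
    402) echo "quota" ;;
    429)
      # Assumption: an out-of-credits 429 mentions insufficient_quota.
      if printf '%s' "$body" | grep -q 'insufficient_quota'; then
        echo "quota"
      else
        echo "rate_limit"
      fi
      ;;
    *) echo "other" ;;
  esac
}
```

A quota result means waiting will not help until the account is topped up, while a rate-limit result usually clears on its own after a short backoff.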

OpenRouter

Returns 402 with a clear message telling you to top up. Fallback works, but the direct fix is usually quicker: just top up the OpenRouter account.

MiniMax and GLM

These providers sometimes return 401 for what is really a quota error. Hermes treats it as an auth failure, so fallback doesn't trigger. Workaround: mark these providers as fallback-only, never primary.

My production setup

For the production gateway running my Telegram and Discord channels:

  • Primary: Anthropic Sonnet 4.6 with credit auto-recharge at the account level
  • Fallback: OpenRouter routed to Sonnet 4.6 (same model, different billing)
  • Alert: 402 line piped to email

Two real 402 events in the last six months, both caught by fallback, both fixed before anyone noticed.

For the local dev agent: no fallback. If Anthropic 402s during dev work, I want to see the error immediately so I can decide to top up or move to something else.

Cost tuning is the cheaper option

Fallback providers buy reliability. They don't reduce cost. The cheaper way to handle 402 is to not hit it in the first place: set provider account budgets, tune token usage with the patterns in our token costs guide, and use /compress aggressively in long sessions.

A well-tuned agent rarely 402s on a Sonnet account even with daily use.

Pre-upgraded gateway on LumaDock

The Hermes Agent template on LumaDock includes the upgraded gateway version that handles 402 cleanly out of the box, plus the same systemd unit that auto-restarts after transient errors. Unmetered bandwidth and no setup fees. Setup walkthrough in our Hermes Agent complete guide.

Extra answers

How do I stop a 402 from crashing the Hermes gateway?

Upgrade Hermes to a version that handles 402 cleanly (v0.13 or later). Add a fallback provider so the gateway retries through a secondary when the primary 402s.