If your Hermes Agent regularly hits 429 rate limits during busy hours, credential pools are the feature you didn't know you needed. They let you associate multiple API keys with the same provider config, and Hermes rotates between them per request. Same provider, same model, just twice (or three times) the per-account rate budget.
Pools are an official Hermes feature, documented in the credential pools docs. Most tutorials skip them because they sound advanced. They aren't.
The problem credential pools solve
Two related problems.
Rate limits on a single account. Anthropic gives you N requests per minute on a tier. Same for OpenAI, OpenRouter, every hosted provider. With one key, you hit that ceiling and Hermes either queues or 429s. With three keys in a pool, you've got up to 3N RPM before you need to upgrade the tier.
Per-account daily quotas. Some providers cap daily token spend per account regardless of plan (especially the free tiers on OpenRouter, Gemini, Groq). One account exhausts, the agent stops working. Two accounts pooled: you get double the budget.
The catch: pooling doesn't help when the real constraint is your overall spend rather than a per-account limit (two keys just burn through the same budget twice as fast). Pools shine when the per-account limit is the bottleneck, not your total spend.
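To size a pool, the arithmetic is simple enough to sketch. This is a rough estimate only: the function name and the numbers are made up for illustration, and it assumes each key's limit is counted independently by the provider, which you should confirm for your accounts.

```python
import math


def keys_needed(peak_rpm: int, per_key_rpm: int) -> int:
    """Smallest pool size whose combined per-key limit covers the peak.

    Assumes every key has its own independent limit (an assumption,
    not something every provider guarantees).
    """
    if per_key_rpm <= 0:
        raise ValueError("per_key_rpm must be positive")
    return max(1, math.ceil(peak_rpm / per_key_rpm))


# Illustrative numbers only, not real provider limits:
# 50 requests per minute per key, 120 requests per minute at peak.
needed = keys_needed(peak_rpm=120, per_key_rpm=50)
print(needed)  # 3
```

If the answer comes out at one, a pool adds nothing and a single key is enough.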
Setting up a pool
The CLI exposes pool management through hermes auth. Exact syntax has shifted between versions; check hermes auth --help on your install to confirm. As of recent versions:
Add multiple keys to a provider
hermes auth add anthropic --pool primary
hermes auth add anthropic --pool secondary
hermes auth list anthropic
The CLI walks you through pasting each key. The pool name is arbitrary (I use "primary" and "secondary" but "key1" / "key2" works too).
Tell the provider config to use the pool
hermes provider set anthropic --credential-pool primary,secondary
hermes provider show anthropic
Now Hermes will rotate between the two keys on each request. Round-robin by default.
Rotation strategies
Hermes supports a few rotation modes. Pick based on what you're optimising.
Round-robin (default)
Each request goes to the next key in the list. Predictable, easy to reason about. Fine for most use cases.
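As a mental model, round-robin is just a cycle over the key labels. The sketch below is my own illustration in Python, not Hermes's internal code; the class name is hypothetical and the labels match the pool names used earlier.

```python
from itertools import cycle


class RoundRobinPool:
    """Hands out key labels in a fixed, repeating order."""

    def __init__(self, labels):
        labels = list(labels)
        if not labels:
            raise ValueError("pool needs at least one key")
        self._rotation = cycle(labels)

    def next_key(self):
        return next(self._rotation)


pool = RoundRobinPool(["primary", "secondary"])
picks = [pool.next_key() for _ in range(5)]
print(picks)  # ['primary', 'secondary', 'primary', 'secondary', 'primary']
```

Because the order is fixed, each key sees an equal share of requests over any full cycle.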
Random
Each request picks a key at random. Slightly better at spreading load if your traffic is bursty.
hermes provider set anthropic --pool-strategy random
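For comparison, random selection can be sketched the same way. Again this is an illustration rather than Hermes's implementation, and the function name is hypothetical. The seeded generator is only there to make the demo repeatable.

```python
import random
from collections import Counter


def pick_random(labels, rng=random):
    """Pick one key label uniformly at random for this request."""
    return rng.choice(labels)


# Seeded generator so the demo is repeatable; real traffic would not be seeded.
rng = random.Random(0)
counts = Counter(
    pick_random(["primary", "secondary"], rng) for _ in range(1000)
)
print(counts)
```

Over many requests the split evens out, but there is no strict alternation, so short runs on the same key are normal.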
Failover (least common)
Always use the first key. Only switch to the next one if the first 429s or 5xxs. This is closer to fallback-provider behaviour applied within a single provider.
hermes provider set anthropic --pool-strategy failover
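The failover behaviour described above can be sketched as a loop that only moves on when a key returns a 429 or a 5xx. The function names and the fake sender are hypothetical stand-ins, not part of Hermes.

```python
def should_fail_over(status: int) -> bool:
    """True for the statuses that justify trying the next key."""
    return status == 429 or 500 <= status <= 599


def send_with_failover(labels, send):
    """Try keys in order; return (label, status) for the first usable answer.

    `send` is any callable that takes a key label and returns an HTTP status.
    Returns (None, last_status) if every key was rate-limited or erroring.
    """
    last_status = None
    for label in labels:
        status = send(label)
        if not should_fail_over(status):
            return label, status
        last_status = status
    return None, last_status


def fake_send(label):
    # Stand-in for a real request: the primary key is rate-limited.
    return 429 if label == "primary" else 200


result = send_with_failover(["primary", "secondary"], fake_send)
print(result)  # ('secondary', 200)
```

Note that a 401 is deliberately not treated as a reason to fail over here, so a broken key surfaces as an error instead of being silently masked.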
I use round-robin in production. Random has slightly less predictable cost attribution (you can't trace which key paid for which request from external billing alone). Failover is fine if you actively want one account to be the "primary" and the other a safety net.
Credential pools vs fallback providers
These two features sound similar. They aren't. Quick decision guide.
| Concern | Use this |
|---|---|
| Same provider hits rate limit | Credential pool |
| Same provider hits daily quota | Credential pool |
| Whole provider goes down | Fallback provider |
| Want to compare provider quality | Fallback provider |
| Need same model across two billing relationships | Credential pool (one provider) or fallback (Anthropic + OpenRouter routing to Sonnet) |
You can use both together. I do. Pool of two Anthropic keys as primary, single OpenRouter key as fallback. Rate limits get absorbed by the pool, full provider outages get caught by the fallback. Setup pattern is in our Hermes 402 quota fallback piece.
Where you can't pool
A few edge cases worth knowing.
Some providers tie keys to specific projects or organisations. Pooling keys from two different projects on the same provider is fine. Pooling keys belonging to the same user in the same org sometimes triggers anti-abuse detection ("looks like one user spinning up multiple keys to dodge limits"). Provider TOS varies. Read the fine print.
Local providers (Ollama, LM Studio, vLLM) don't have rate limits in the same way and pooling doesn't apply. Just point at one endpoint.
Streaming responses from some providers are pinned to a specific key for the duration of the stream. Pooling between requests works fine, but a single long streaming response stays on one key. So pools don't help you mid-response if that key gets rate-limited partway through the stream.
Billing visibility with pools
Your provider bill now has charges across two (or more) accounts. If you need clean per-team or per-channel cost attribution, pooling makes it harder. You can mitigate this by:
- Naming the keys descriptively when you create them on the provider dashboard ("hermes-bot-primary", "hermes-bot-secondary")
- Setting separate budgets on each account at the provider level
- Tracking spend on the Hermes side instead, covered in our Hermes cost tracking and budgets tutorial
For small teams this isn't worth worrying about. For org-level deployments where finance needs to allocate per-team, decide upfront how you want to track and stick to it.
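If you go the "track it yourself" route, the core of it is a tally keyed by both the credential and whoever the request was for. The sketch below is a hypothetical ledger in Python, not a Hermes feature; the class name, team names, and token counts are invented for illustration.

```python
from collections import defaultdict


class SpendLedger:
    """Tallies token usage per (key label, team) pair."""

    def __init__(self):
        self._tokens = defaultdict(int)

    def record(self, key_label, team, tokens):
        self._tokens[(key_label, team)] += tokens

    def by_team(self):
        totals = defaultdict(int)
        for (_, team), tokens in self._tokens.items():
            totals[team] += tokens
        return dict(totals)

    def by_key(self):
        totals = defaultdict(int)
        for (key_label, _), tokens in self._tokens.items():
            totals[key_label] += tokens
        return dict(totals)


ledger = SpendLedger()
ledger.record("hermes-bot-primary", "support", 1200)
ledger.record("hermes-bot-secondary", "support", 800)
ledger.record("hermes-bot-primary", "sales", 500)

print(ledger.by_team())  # {'support': 2000, 'sales': 500}
print(ledger.by_key())   # {'hermes-bot-primary': 1700, 'hermes-bot-secondary': 800}
```

The per-team view stays stable no matter which key served a request, which is the property finance usually cares about.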
Real-world setup: My production pool
I have two Anthropic keys (primary and secondary, both on the same workspace) and one OpenRouter key as fallback. Pool strategy: round-robin. Anthropic monthly budgets are set on the dashboard so an unusual day doesn't blow my month. OpenRouter has a hard cap because it's only fallback.
This setup has survived two Anthropic rate-limit incidents and one OpenRouter outage in the past four months. Users noticed nothing. The only operational task was checking the gateway logs the next morning to confirm what happened.
What happens when both keys in the pool 429
Hermes retries with exponential backoff a few times, then falls through to whatever fallback provider you've configured. If you have no fallback, the request errors. Setup detail in the fallback providers tutorial.
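That flow can be sketched roughly as follows. The retry count, the delays, and every name here are assumptions chosen for illustration; the source doesn't state Hermes's actual values, so treat this as a model of the behaviour and not a description of the implementation.

```python
import time


def call_with_pool(pool_keys, send, fallback=None,
                   max_rounds=3, base_delay=0.5, sleep=time.sleep):
    """Try every pooled key, backing off between rounds, then fall back.

    `send` takes a key label and returns an HTTP status.
    `fallback` is an optional zero-argument callable for the fallback provider.
    """
    for attempt in range(max_rounds):
        for label in pool_keys:
            status = send(label)
            if status != 429:
                return ("pool", label, status)
        # Every key was rate-limited: wait 0.5s, 1s, 2s, ... before retrying.
        if attempt < max_rounds - 1:
            sleep(base_delay * 2 ** attempt)
    if fallback is not None:
        return ("fallback", None, fallback())
    raise RuntimeError("all pool keys rate-limited and no fallback configured")


# Demo with fakes: both keys always 429, and delays are recorded, not slept.
delays = []
outcome = call_with_pool(
    ["primary", "secondary"],
    send=lambda label: 429,
    fallback=lambda: 200,
    sleep=delays.append,
)
print(outcome)  # ('fallback', None, 200)
print(delays)   # [0.5, 1.0]
```

Passing `sleep` in as a parameter keeps the sketch testable without real waiting.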
How pools interact with the 401 troubleshooting flow
If only one key in the pool is broken (you regenerated one and forgot to update it in Hermes), you get intermittent 401s that look random. With a two-key round-robin pool, roughly half your requests succeed and half fail. The fix is to verify each key in the pool independently. See our Hermes 401 auth errors tutorial for the curl-based verification flow. Run it once per key in the pool.
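A quick simulation shows why the failures look intermittent, and what "verify each key independently" amounts to. Both functions are hypothetical helpers; `check` stands in for whatever real verification request you run per key and is faked here.

```python
from itertools import cycle


def failure_rate(labels, broken, requests=100):
    """Share of requests that land on a broken key under round-robin."""
    rotation = cycle(labels)
    failures = sum(1 for _ in range(requests) if next(rotation) in broken)
    return failures / requests


def find_broken_keys(labels, check):
    """Run `check` once per key and return the labels that answer 401."""
    return [label for label in labels if check(label) == 401]


two_key_rate = failure_rate(["primary", "secondary"], broken={"secondary"})
print(two_key_rate)  # 0.5


def fake_check(label):
    # Stand-in for a real per-key verification call.
    return 401 if label == "secondary" else 200


broken = find_broken_keys(["primary", "secondary"], fake_check)
print(broken)  # ['secondary']
```

With three keys and one broken, the same simulation gives a one-in-three failure rate, which is even easier to mistake for flakiness.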
What I'd skip
Don't bother with pools if you're running a personal Hermes that gets a few dozen messages a day. The complexity isn't worth the gains. Pools start mattering at the point where you're routinely seeing 429s during peak hours or your daily quota runs out before the day is over.
Pre-configured on LumaDock
The Hermes Agent template on LumaDock supports pools out of the box once you add a second key through the auth wizard. No special setup beyond what's covered above. Unmetered bandwidth and no setup fees on every plan. Full setup walkthrough in our Hermes Agent complete guide.

