Track Hermes Agent token spend and set budget alerts

Ellie Grace Hayes

27/05/2026

The Hermes Agent bill creeps up on people. Hermes is great. You start using it for everything. The agent makes hundreds of small API calls a day. End of month, your Anthropic statement is twice what you expected. You promise yourself you'll keep an eye on it next month. You don't. Bill grows again.

This article is the boring operational answer to that pattern. What Hermes shows you out of the box. What it doesn't. What dashboards bolt on top. How to set alerts that fire before the bill does damage.

What Hermes shows you for free

Two built-in mechanisms. Both useful, but both incomplete on their own.

The /usage slash command

Inside any chat session, run /usage. The agent prints the token counts for the current session: input tokens, output tokens, and an estimated cost based on the provider's pricing as Hermes knows it.

Useful for: a quick check during a long task ("am I about to spend a fortune on this debug session"). Useless for: cumulative spend, daily totals, per-channel attribution.

hermes usage command from the CLI

hermes usage --since "7 days ago"
hermes usage --by-provider
hermes usage --by-session

The CLI version aggregates across sessions. By default it shows you the last 24 hours; you can pass --since with a relative time or an ISO date. The --by-provider and --by-session flags break the totals down. This is the most useful command nobody runs because it's not in the obvious docs.

Run it once a week as a habit. If your spend is climbing, you'll see it here a month before the credit card statement makes you ask why.

What Hermes doesn't show you

Three blind spots worth knowing about.

Per-channel attribution. If you have Telegram, Discord and a scheduler all hitting the same Hermes instance, the default usage report lumps them together. You can't tell "Telegram cost X this month, Discord cost Y" without the per-channel breakdown covered later in this article. This bites teams more than individuals.

Forecasting. The reports are retrospective. They don't project the rest of the month based on current pace. You have to do the maths yourself.
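A minimal sketch of that maths, assuming a straight-line projection: average daily spend so far, stretched across the whole month. The function name project_month_end is made up for illustration; it isn't a Hermes command.

```shell
#!/bin/sh
# project_month_end SPEND_SO_FAR DAY_OF_MONTH DAYS_IN_MONTH
# Linear projection: (spend so far / days elapsed) * days in the month.
# Hypothetical helper, not part of Hermes.
project_month_end() {
  LC_ALL=C awk -v spend="$1" -v day="$2" -v days="$3" \
    'BEGIN { printf "%.2f\n", spend / day * days }'
}

# Example: $42 spent by day 12 of a 30-day month
project_month_end 42 12 30   # prints 105.00
```

Feed it the month-to-date total from hermes usage and today's day number (date +%d). A straight line overshoots after a one-off spike early in the month, so treat the result as a prompt to look closer, not a prediction.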

Provider-side billing reality. Hermes estimates cost based on the model's published per-token pricing. Real billing can drift: provider price changes, billing rounding, tier discounts you forgot you have. The Hermes number is close but not exact. Always reconcile against the provider's own dashboard at month-end.
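The month-end reconciliation is a one-line percentage. A small sketch, assuming you copy both numbers in by hand: the Hermes estimate for the month and the total on the provider's invoice. drift_pct is an illustrative name, not a Hermes feature.

```shell
#!/bin/sh
# drift_pct HERMES_ESTIMATE PROVIDER_INVOICE
# Prints how far the provider's bill is from the Hermes estimate, in percent.
# Positive means the provider charged more than Hermes estimated.
# Hypothetical helper, not part of Hermes.
drift_pct() {
  LC_ALL=C awk -v est="$1" -v actual="$2" \
    'BEGIN { printf "%.1f\n", (actual - est) / est * 100 }'
}

# Example: Hermes estimated $100, the invoice says $104
drift_pct 100 104   # prints 4.0
```

A drift of a few percent is the rounding and pricing noise described above. A number that keeps growing month over month is worth chasing.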

Third-party dashboard: hermes-dashboard

The community project at Bichev/hermes-dashboard bolts a proper dashboard on top of Hermes's usage data. Spend over time, per-provider charts, per-session drill-down, per-channel breakdown if you've tagged sessions appropriately.

Setup is roughly:

git clone https://github.com/Bichev/hermes-dashboard.git
cd hermes-dashboard
docker compose up -d

It runs on port 8080 by default, connects to your Hermes state.db and gives you a web UI. It works locally, or you can expose it through Nginx or Tailscale (see our Tailscale remote access tutorial for the pattern).

The dashboard opens state.db read-only, so there's no risk to your Hermes data from running it. Worst case, you stop using the dashboard and tear down the Docker stack.

Setting a budget on the provider side

The other half of cost control: set a hard limit at the provider, not just in Hermes. Anthropic and OpenAI both let you set monthly usage budgets in the account dashboard. OpenRouter runs on prepaid credits, so the equivalent control there is the auto-recharge limit.

Anthropic: Console > Settings > Spend Limits
OpenAI: Platform > Billing > Usage Limits
OpenRouter: Settings > Credits > Set Auto-recharge limit

Pick a number that's roughly 1.5x your current monthly spend. When you cross it, the provider hard-blocks new requests. Your bot starts getting billing errors (HTTP 402 Payment Required on OpenRouter, for example). You see the alert. You make a decision: top up or hold.

This is the "I don't trust myself to monitor spend" answer. Set the limit once, forget it, the provider catches you if you forget.

Alerting before the limit hits

Hard limits are nice but waking up to a dead bot at 7 a.m. isn't. Set soft alerts that fire at 70-80% of your monthly budget.
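To turn those rules of thumb into numbers, here's a small sketch. It assumes the hard limit is 1.5x your current monthly spend and the soft alerts sit at 70% and 80% of that limit. budget_thresholds is a hypothetical helper name.

```shell
#!/bin/sh
# budget_thresholds CURRENT_MONTHLY_SPEND
# Prints the provider hard limit (1.5x current spend) and two soft alert
# levels at 70% and 80% of that limit. Hypothetical helper, not part of Hermes.
budget_thresholds() {
  LC_ALL=C awk -v spend="$1" 'BEGIN {
    limit = spend * 1.5
    printf "hard_limit=%.2f soft_low=%.2f soft_high=%.2f\n", limit, limit * 0.70, limit * 0.80
  }'
}

# Example: currently spending about $80 a month
budget_thresholds 80   # prints hard_limit=120.00 soft_low=84.00 soft_high=96.00
```

Round the results to whatever number you'll actually remember. The point is that the soft alert lands well before the hard block.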

If you have Prometheus already

The hermes-dashboard project mentioned above exposes metrics in Prometheus format. Wire up an alert rule:

groups:
  - name: hermes-cost
    rules:
      - alert: HermesMonthlySpendApproaching
        expr: hermes_monthly_spend_usd > 70
        for: 5m
        annotations:
          summary: "Hermes spend at {{ $value }} USD this month"

The threshold depends on your budget. Pick a round number that gets attention.

If you don't have Prometheus

Cron job that runs the usage report once a day, parses the number, emails you if it crosses a threshold.

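cat > /usr/local/bin/hermes-cost-alert.sh << 'EOF'
#!/bin/bash
# Email an alert when month-to-date Hermes spend crosses THRESHOLD (USD).
THRESHOLD=70
ALERT_EMAIL="you@example.com"  # placeholder: replace with your own address
# First day of the current month, e.g. 2026-05-01
MONTH_START=$(date +%Y-%m-01)
SPEND=$(hermes usage --since "$MONTH_START" --json | jq -r '.total_usd // empty')
# Exit quietly if the report returned no number
[ -z "$SPEND" ] && exit 0
if (( $(echo "$SPEND > $THRESHOLD" | bc -l) )); then
  echo "Hermes spend at \$$SPEND this month, threshold \$$THRESHOLD" | mail -s "Hermes spend alert" "$ALERT_EMAIL"
fi
EOF
chmod +x /usr/local/bin/hermes-cost-alert.sh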

Crontab entry to run it every day at 8 a.m.:

0 8 * * * /usr/local/bin/hermes-cost-alert.sh

Per-channel cost tracking

If you want to know what Telegram is costing you vs Discord vs the scheduler, you need to tag sessions when they originate. Hermes auto-tags by channel for messaging gateways, so:

hermes usage --by-channel --since "30 days ago"

...gives you the breakdown. Useful for deciding which channel is worth keeping and which to throttle.

If you have multiple users on the same Hermes (small team setup), tag sessions by user. The hermes-dashboard handles this if you've configured user identity properly across gateways. Configuration is per-channel; see the relevant gateway tutorial (Telegram, multi-platform) for how to identify users per channel.

The cost levers worth knowing

If your spend is too high, three knobs make the biggest difference (in this order):

Model choice. Sonnet 4.6 vs Haiku 4.5 is roughly 6x cost difference for similar tasks. Routing routine work to Haiku and keeping Sonnet for the hard bits halves most bills. Our "cut Hermes token costs" guide covers the routing patterns.
Skill count. Every enabled skill adds tokens to every prompt. Disable skills you aren't using. Use the per-profile skill toggling pattern from our "skills without breaking loop" tutorial.
Session length. Long sessions where you never run /compress accumulate tokens. /compress halfway through a session brings the working context back down. The token costs guide covers this in detail.
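For the first lever, the arithmetic is worth seeing once. This sketch takes the roughly 6x price ratio quoted above as given and assumes routed work uses the same number of tokens on the cheaper model. blended_cost_fraction is an illustrative helper, not part of Hermes.

```shell
#!/bin/sh
# blended_cost_fraction ROUTED_SHARE PRICE_RATIO
# ROUTED_SHARE: share of current spend moved to the cheaper model (0 to 1)
# PRICE_RATIO:  how many times cheaper the small model is (6 for a 6x gap)
# Prints the new bill as a fraction of the old one.
# Hypothetical helper, not part of Hermes.
blended_cost_fraction() {
  LC_ALL=C awk -v f="$1" -v r="$2" \
    'BEGIN { printf "%.2f\n", (1 - f) + f / r }'
}

# Example: route 60% of the work to a model that is 6x cheaper
blended_cost_fraction 0.6 6   # prints 0.50
```

So under these assumptions the bill halves once about 60% of the work goes to the cheaper model. With a smaller price gap you need to route a larger share to get the same saving.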

My monitoring setup

For my production Hermes:

Anthropic spend limit set at $200/month (covers expected use plus headroom)
hermes-dashboard running on the same VPS, behind Tailscale
Cron at 8 a.m. daily emails me if monthly spend crosses $120
Weekly habit: hermes usage --by-provider --since "7 days ago" over morning coffee

This has caught two unexpected spikes in a year. Both were skill-related (a new skill I added did more API calls than I expected). Fixed within a day each time.

When per-token tracking doesn't matter

If your spend is under $20/month, none of this matters. Just look at the provider statement once a month and move on. Per-channel breakdowns, dashboards, alerts: all overkill at that scale. Worth setting up when you cross into territory where a 30% spend increase would be a real budget item.

Hosting the dashboard alongside Hermes

hermes-dashboard runs comfortably on the same VPS as the agent. The LumaDock Hermes Agent template has enough RAM headroom on the standard tiers for both. Unmetered bandwidth (which matters because dashboards make a lot of small DB reads) and no setup fees. Full setup details in our Hermes Agent complete guide.
