Track Hermes Agent token spend and set budget alerts

The Hermes Agent bill creeps up on people. Hermes is great. You start using it for everything. The agent makes hundreds of small API calls a day. End of month, your Anthropic statement is twice what you expected. You promise yourself you'll keep an eye on it next month. You don't. Bill grows again.

This article is the boring operational answer to that pattern. What Hermes shows you out of the box. What it doesn't. What dashboards bolt on top. How to set alerts that fire before the bill does damage.

What Hermes shows you for free

Two built-in mechanisms. Both useful, both incomplete on their own.

The /usage slash command

Inside any chat session, run /usage. The agent prints the token count for the current session: input tokens, output tokens, estimated cost based on the provider's pricing as Hermes knows it.

Useful for: a quick check during a long task ("am I about to spend a fortune on this debug session"). Useless for: cumulative spend, daily totals, per-channel attribution.

The hermes usage CLI command

hermes usage --since "7 days ago"
hermes usage --by-provider
hermes usage --by-session

The CLI version aggregates across sessions. By default it shows you the last 24 hours; you can pass --since with a relative time or an ISO date. The --by-provider and --by-session flags break the totals down. This is the most useful command nobody runs because it's not in the obvious docs.

Run it once a week as a habit. If your spend is climbing, you'll see it here a month before the credit card statement makes you ask why.

What Hermes doesn't show you

Three blind spots worth knowing about.

Per-channel attribution. If you have Telegram, Discord and a scheduler all hitting the same Hermes instance, the default usage report lumps them together. You can't tell at a glance that "Telegram cost X this month, Discord cost Y" unless you ask for the per-channel breakdown covered later in this article. This bites teams more than individuals.

Forecasting. The reports are retrospective. They don't project the rest of the month based on current pace. You have to do the maths yourself.
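
Doing that maths by hand is a straight-line projection: month-to-date spend divided by days elapsed, multiplied by the number of days in the month. A minimal sketch, assuming you paste in the spend figure yourself; project_month_spend is a hypothetical helper name, not a Hermes command:

```shell
#!/bin/bash
# project_month_spend SPEND_SO_FAR DAY_OF_MONTH DAYS_IN_MONTH
# Hypothetical helper: linear projection of month-end spend in USD.
project_month_spend() {
  awk -v spend="$1" -v day="$2" -v total="$3" \
    'BEGIN { printf "%.2f\n", spend / day * total }'
}

# Example: $42.00 spent by day 12 of a 30-day month.
project_month_spend 42 12 30   # prints 105.00
```

A linear projection overshoots if a one-off spike landed early in the month, so treat the result as a rough pace indicator rather than a forecast.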

Provider-side billing reality. Hermes estimates cost based on the model's published per-token pricing. Real billing can drift: provider price changes, billing rounding, tier discounts you forgot you have. The Hermes number is close but not exact. Always reconcile against the provider's own dashboard at month-end.
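
The month-end reconciliation comes down to one percentage: how far the provider's invoice sits from the Hermes estimate. A rough sketch, with both figures typed in by hand; billing_drift_pct is a hypothetical helper name:

```shell
#!/bin/bash
# billing_drift_pct HERMES_ESTIMATE PROVIDER_INVOICE
# Hypothetical helper: how far the provider's figure is from the
# Hermes estimate, as a percentage of the estimate.
billing_drift_pct() {
  awk -v est="$1" -v actual="$2" 'BEGIN {
    if (est == 0) { print "n/a"; exit }   # avoid dividing by zero
    printf "%.1f\n", (actual - est) / est * 100
  }'
}

# Example: Hermes estimated $100.00, the provider invoiced $90.00.
billing_drift_pct 100 90   # prints -10.0
```

A positive number means the provider billed more than Hermes estimated. If the drift stays small and stable, the Hermes figure is good enough for day-to-day decisions.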

Third-party dashboard: hermes-dashboard

The community project at Bichev/hermes-dashboard bolts a proper dashboard on top of Hermes's usage data. Spend over time, per-provider charts, per-session drill-down, per-channel breakdown if you've tagged sessions appropriately.

Setup is roughly:

git clone https://github.com/Bichev/hermes-dashboard.git
cd hermes-dashboard
docker compose up -d

It runs on port 8080 by default, connects to your Hermes state.db and gives you a web UI. Use it locally, or expose it through Nginx or Tailscale (see our Tailscale remote access tutorial for the pattern).

The dashboard opens state.db read-only, so there's no risk to your Hermes data from running it. Worst case, you stop using the dashboard and remove the stack with docker compose down.

Setting a budget on the provider side

The other half of cost control: set a hard limit at the provider, not just in Hermes. Anthropic and OpenAI let you set monthly usage budgets in the account dashboard, and OpenRouter lets you cap auto-recharge on prepaid credits.

  • Anthropic: Console > Settings > Spend Limits
  • OpenAI: Platform > Billing > Usage Limits
  • OpenRouter: Settings > Credits > Set Auto-recharge limit

Pick a number that's roughly 1.5x your current monthly spend. When you cross it, the provider hard-blocks new requests and your bot starts failing with billing errors (a 402 or similar, depending on the provider). You see the alert. You make a decision: top up or hold.

This is the "I don't trust myself to monitor spend" answer. Set the limit once and forget it; the provider catches you if your spend runs away.

Alerting before the limit hits

Hard limits are nice but waking up to a dead bot at 7 a.m. isn't. Set soft alerts that fire at 70-80% of your monthly budget.
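
To turn those rules of thumb into numbers: the sketch below takes your current monthly spend, applies the 1.5x hard limit and prints the 70% and 80% soft-alert marks. It treats the hard limit as the budget, which is an assumption, and budget_thresholds is a hypothetical helper name:

```shell
#!/bin/bash
# budget_thresholds CURRENT_MONTHLY_SPEND
# Hypothetical helper: prints the hard limit (1.5x current spend) and the
# soft-alert band (70% and 80% of that limit), all in USD.
budget_thresholds() {
  awk -v spend="$1" 'BEGIN {
    limit = spend * 1.5
    printf "hard_limit=%.2f soft_low=%.2f soft_high=%.2f\n", limit, limit * 0.7, limit * 0.8
  }'
}

# Example: current spend of about $80/month.
budget_thresholds 80   # prints hard_limit=120.00 soft_low=84.00 soft_high=96.00
```

Round the soft threshold to something memorable before you wire it into an alert.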

If you have Prometheus already

The hermes-dashboard project mentioned above exposes metrics in Prometheus format. Wire up an alert rule:

groups:
  - name: hermes-cost
    rules:
      - alert: HermesMonthlySpendApproaching
        expr: hermes_monthly_spend_usd > 70
        for: 5m
        annotations:
          summary: "Hermes spend at {{ $value }}USD this month"

The threshold depends on your budget: 70 here is 70% of a $100 monthly budget. Pick a round number that gets attention.

If you don't have Prometheus

A cron job that runs the usage report once a day, parses the total and emails you if it crosses a threshold. The script needs jq, bc and a working mail command on the host.

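cat > /usr/local/bin/hermes-cost-alert.sh << 'EOF'
#!/bin/bash
# Email an alert when month-to-date Hermes spend crosses THRESHOLD (USD).
THRESHOLD=70
ALERT_EMAIL="you@example.com"   # placeholder: replace with your own address
# %Y-%m-01 expands to the first day of the current month.
SPEND=$(hermes usage --since "$(date +%Y-%m-01)" --json | jq -r '.total_usd // empty')
# Bail out loudly if the report returned no total (bc would choke on an empty value).
[ -z "$SPEND" ] && { echo "hermes usage returned no total_usd" >&2; exit 1; }
if (( $(echo "$SPEND > $THRESHOLD" | bc -l) )); then
  echo "Hermes spend at \$$SPEND this month, threshold \$$THRESHOLD" | mail -s "Hermes spend alert" "$ALERT_EMAIL"
fi
EOF
chmod +x /usr/local/bin/hermes-cost-alert.sh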

Crontab entry to run it daily at 8 a.m.:

0 8 * * * /usr/local/bin/hermes-cost-alert.sh

Per-channel cost tracking

If you want to know what Telegram is costing you vs Discord vs the scheduler, you need to tag sessions when they originate. Hermes auto-tags by channel for messaging gateways, so:

hermes usage --by-channel --since "30 days ago"

...gives you the breakdown. Useful for deciding which channel is worth keeping and which to throttle.
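
When deciding what to throttle, each channel's share of the total is often easier to read than the raw dollar figures. A small sketch, assuming you copy the per-channel totals into "name amount" pairs by hand; that input format and the channel_shares name are hypothetical, not Hermes output:

```shell
#!/bin/bash
# channel_shares: reads "name amount_usd" pairs on stdin (a hypothetical
# hand-copied format, not Hermes output) and prints each channel's share
# of the combined total.
channel_shares() {
  awk '{ name[NR] = $1; usd[NR] = $2; total += $2 }
       END {
         if (total == 0) exit            # nothing to divide by
         for (i = 1; i <= NR; i++)
           printf "%s %.1f%%\n", name[i], usd[i] / total * 100
       }'
}

# Example with made-up numbers.
printf 'telegram 30\ndiscord 15\nscheduler 5\n' | channel_shares
```

With those made-up numbers the output is telegram 60.0%, discord 30.0% and scheduler 10.0%, one per line.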

If you have multiple users on the same Hermes (small team setup), tag sessions by user. The hermes-dashboard handles this if you've configured user identity properly across gateways. Configuration is per-channel; see the relevant gateway tutorial (Telegram, multi-platform) for how to identify users per channel.

The cost levers worth knowing

If your spend is too high, three knobs make the biggest difference (in this order):

  1. Model choice. Sonnet 4.6 costs roughly 3x what Haiku 4.5 does per token at published list prices (check the current pricing page, since these change). Routing routine work to Haiku and keeping Sonnet for the hard bits halves most bills. Our cut Hermes token costs guide covers the routing patterns.
  2. Skill count. Every enabled skill adds tokens to every prompt. Disable skills you aren't using. Use the per-profile skill toggling pattern from our skills without breaking loop tutorial.
  3. Session length. Long sessions where you never run /compress accumulate tokens. /compress halfway through a session brings the working context back down. The token costs guide covers this in detail.
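
To see why model routing comes first, work out the blended bill as a fraction of the all-expensive-model baseline. The sketch below assumes token volume is what gets routed and that one price ratio covers both input and output tokens, which is a simplification; blended_cost_fraction is a hypothetical helper name:

```shell
#!/bin/bash
# blended_cost_fraction ROUTED_SHARE PRICE_RATIO
# Hypothetical helper: bill as a fraction of the baseline where everything
# runs on the expensive model, when ROUTED_SHARE (0..1) of token volume
# moves to a model that is PRICE_RATIO times cheaper.
blended_cost_fraction() {
  awk -v share="$1" -v ratio="$2" \
    'BEGIN { printf "%.2f\n", (1 - share) + share / ratio }'
}

# Example: 75% of traffic moved to a model 3x cheaper.
blended_cost_fraction 0.75 3   # prints 0.50
# Example: 60% of traffic moved to a model 6x cheaper.
blended_cost_fraction 0.6 6    # prints 0.50
```

Either way the bill roughly halves. The bigger the price gap, the less traffic you need to reroute to get there.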

My monitoring setup

For my production Hermes:

  • Anthropic spend limit set at $200/month (covers expected use plus headroom)
  • hermes-dashboard running on the same VPS, behind Tailscale
  • Cron at 8 a.m. daily emails me if monthly spend crosses $120
  • Weekly habit: hermes usage --by-provider --since "7 days ago" over morning coffee

This has caught two unexpected spikes in a year. Both were skill-related (a new skill I added did more API calls than I expected). Fixed within a day each time.

When per-token tracking doesn't matter

If your spend is under $20/month, none of this matters. Just look at the provider statement once a month and move on. Per-channel breakdowns, dashboards, alerts: all overkill at that scale. Worth setting up when you cross into territory where a 30% spend increase would be a real budget item.

Hosting the dashboard alongside Hermes

hermes-dashboard runs comfortably on the same VPS as the agent. The LumaDock Hermes Agent template has enough RAM headroom on the standard tiers for both. Unmetered bandwidth (which matters because dashboards make a lot of small DB reads) and no setup fees. Full setup details in our Hermes Agent complete guide.

Extra questions

How do I see how much Hermes Agent is costing me right now?

Run hermes usage --by-provider --since "30 days ago" in your terminal. The CLI aggregates token spend across all sessions in the time range and breaks it down per provider.
