
AI Agent Cost Calculator (Tokens → $/month)

Estimate monthly burn rate, compare models, and set budgets.

Client-side calculator • Configurable pricing • Shareable URL (numbers only)

Inputs
Runs / day
Days / month
Input tokens / run
Output tokens / run
Model
Prices are configurable. Last updated: —.
⚙️ Advanced
Cache hit rate %
Retry rate %
Context overhead tokens
Results
Monthly burn
Input
Output
Tokens/month: in + out
Per run
What this calculator counts
Runs/month = runs/day × days/month
Paid input tokens/month = runs/month × (input + overhead) × (1 + retry%) × (1 − cache%)
Output tokens/month = runs/month × output × (1 + retry%)
Cost/month = paid_input_tokens/1M × price_in + output_tokens/1M × price_out
If your provider bills cached tokens differently, update pricing.json (or set cache to 0%).
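
The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the `Workload` class, the field names, and the example prices ($3.00 in, $15.00 out per 1M tokens) are assumptions chosen for the example. Cached tokens are treated as free, as in the formula.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    runs_per_day: float
    days_per_month: float
    input_tokens: float          # average per run
    output_tokens: float         # average per run
    overhead_tokens: float = 0.0 # system prompt, tool schemas, etc.
    retry_rate: float = 0.0      # fraction, e.g. 0.10 for 10%
    cache_hit_rate: float = 0.0  # fraction of input tokens not billed


def monthly_cost(w: Workload, price_in: float, price_out: float) -> dict:
    """Return monthly cost figures; prices are USD per 1M tokens."""
    runs = w.runs_per_day * w.days_per_month
    paid_in = (runs * (w.input_tokens + w.overhead_tokens)
               * (1 + w.retry_rate) * (1 - w.cache_hit_rate))
    out = runs * w.output_tokens * (1 + w.retry_rate)
    cost_in = paid_in / 1_000_000 * price_in
    cost_out = out / 1_000_000 * price_out
    total = cost_in + cost_out
    return {
        "runs": runs,
        "input_cost": cost_in,
        "output_cost": cost_out,
        "total": total,
        "per_run": total / runs if runs else 0.0,
    }


# Hypothetical workload and prices, for illustration only.
example = Workload(1000, 30, 2000, 500, overhead_tokens=500,
                   retry_rate=0.10, cache_hit_rate=0.20)
result = monthly_cost(example, price_in=3.00, price_out=15.00)
print(f"${result['total']:.2f}/month, ${result['per_run']:.5f}/run")
```

With these made-up numbers the workload comes to about $445.50 per month: roughly $198 for input and $247.50 for output.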
What if (sensitivity)
Runs/day ±
Output tokens ±
Retry rate ±
This changes monthly cost by
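
A what-if check is just the cost formula evaluated twice. The sketch below assumes each "±" is a relative percentage change applied to one input; the dictionary keys, the `what_if` helper, and the prices are hypothetical, and the real widget may interpret the sliders differently (for example, retry rate in percentage points).

```python
def monthly_cost(p: dict) -> float:
    """Monthly cost in USD; prices are per 1M tokens, rates are fractions."""
    runs = p["runs_per_day"] * p["days_per_month"]
    retry = 1 + p["retry_rate"]
    paid_in = (runs * (p["input_tokens"] + p["overhead_tokens"])
               * retry * (1 - p["cache_hit_rate"]))
    out = runs * p["output_tokens"] * retry
    return paid_in / 1e6 * p["price_in"] + out / 1e6 * p["price_out"]


def what_if(base: dict, field: str, pct_change: float) -> float:
    """Change one input by pct_change percent; return the change in monthly cost."""
    changed = {**base, field: base[field] * (1 + pct_change / 100)}
    return monthly_cost(changed) - monthly_cost(base)


# Hypothetical baseline, for illustration only.
base = {
    "runs_per_day": 1000, "days_per_month": 30,
    "input_tokens": 2000, "overhead_tokens": 500, "output_tokens": 500,
    "retry_rate": 0.10, "cache_hit_rate": 0.20,
    "price_in": 3.00, "price_out": 15.00,
}
for field, pct in [("runs_per_day", 10), ("output_tokens", 20), ("retry_rate", 50)]:
    print(f"{field} {pct:+d}%: {what_if(base, field, pct):+.2f} $/month")
```

Because cost is linear in runs/day, a 10% change in runs moves the bill by 10%. Changes to output tokens only move the output share of the bill.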
Set a budget (OnceOnly bridge)
Budget: $ / month
Budget: $ / day
Alert threshold %
Max retries
Max tool calls

Compare models

Same workload, different prices. Useful for “GPT vs Claude cost” comparisons.

Model | Input / 1M | Output / 1M | Total / month | $ / run
Updates automatically based on the inputs above.
Showing the first 6 models from pricing.json. Add or reorder models there to change this table.
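
To reproduce a comparison like this offline, price the same per-run token counts against each model. The sketch below assumes a pricing file shaped as a list of objects with `name`, `input_per_1m`, and `output_per_1m` keys. That schema, the model names, and the prices are all hypothetical, so check the real pricing.json before reusing it.

```python
import json

# Hypothetical pricing data; the real pricing.json schema may differ.
PRICING_JSON = """
[
  {"name": "model-a", "input_per_1m": 3.00, "output_per_1m": 15.00},
  {"name": "model-b", "input_per_1m": 0.50, "output_per_1m": 1.50}
]
"""


def compare_models(models, runs_per_month, paid_in_per_run, out_per_run, limit=6):
    """Price one workload against the first `limit` models."""
    rows = []
    for m in models[:limit]:
        per_run = (paid_in_per_run / 1e6 * m["input_per_1m"]
                   + out_per_run / 1e6 * m["output_per_1m"])
        rows.append({
            "model": m["name"],
            "input_per_1m": m["input_per_1m"],
            "output_per_1m": m["output_per_1m"],
            "total_per_month": per_run * runs_per_month,
            "per_run": per_run,
        })
    return rows


# Per-run token counts already include overhead, retries, and cache savings.
rows = compare_models(json.loads(PRICING_JSON), runs_per_month=30_000,
                      paid_in_per_run=2_200, out_per_run=550)
for r in rows:
    print(f"{r['model']}: ${r['total_per_month']:.2f}/month, ${r['per_run']:.5f}/run")
```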

How to estimate AI agent cost

A useful estimate needs only a few numbers:

  • Runs/day × days/month
  • Average input tokens + context overhead (system prompt, tool schemas, instructions)
  • Average output tokens (agent plans + tool calls + final answers)
  • Retry rate and cache hit rate, if you have them

Why agents get expensive

  • Retries: network timeouts, rate limits, tool failures → same task repeats.
  • Long context: tool schemas, instructions, memory, and traces add overhead every run.
  • Tool loops: “search → browse → refine → browse again” adds a model call per step, ballooning output tokens and re-sent context.
  • Verbose agents: long rationales and step-by-step outputs increase output tokens, which are typically priced higher per token than input tokens.

Budgeting: daily caps & alerts

Budgets work best when they’re enforced at runtime, not after the invoice.

  • Set $ / day caps to limit blast radius from a broken loop.
  • Set $ / month caps to keep spend predictable.
  • Alert at 80% so you can react before hitting hard limits.
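
A pre-run budget check along these lines might look like the sketch below. It is a generic illustration, not OnceOnly's implementation: `budget_status`, `check_run`, and the "ok" / "alert" / "blocked" labels are hypothetical names. It assumes you track spend so far and can estimate the cost of the next run.

```python
def budget_status(projected: float, cap: float, alert_pct: float = 80.0) -> str:
    """Classify projected spend against a cap and an alert threshold."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    if projected > cap:
        return "blocked"          # the run would exceed the hard cap
    if projected >= cap * alert_pct / 100:
        return "alert"            # still allowed, but time to react
    return "ok"


def check_run(spent_today: float, spent_month: float, next_run_cost: float,
              daily_cap: float, monthly_cap: float, alert_pct: float = 80.0) -> str:
    """Check a run before it starts; return the more severe of the two statuses."""
    severity = ["ok", "alert", "blocked"]
    daily = budget_status(spent_today + next_run_cost, daily_cap, alert_pct)
    monthly = budget_status(spent_month + next_run_cost, monthly_cap, alert_pct)
    return max(daily, monthly, key=severity.index)


# Hypothetical caps: $15/day and $450/month, alert at 80%.
print(check_run(10.0, 100.0, 0.5, daily_cap=15, monthly_cap=450))
print(check_run(12.5, 100.0, 0.5, daily_cap=15, monthly_cap=450))
print(check_run(14.9, 100.0, 0.5, daily_cap=15, monthly_cap=450))
```

Checking before the run, rather than after, is what keeps a broken loop from spending past the cap.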

Cost per task vs cost per user

Two useful ways to model unit economics:

  • Cost / task: $ per run, if one task is one run. Monthly total = $ per run × tasks per month.
  • Cost / user: (tasks per user × $ per run) + cost of background maintenance runs.
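
As a rough worked example of the per-user view, the sketch below assumes background maintenance runs cost the same per run as user-triggered ones. The function name, the `runs_per_task` parameter, and all numbers are hypothetical.

```python
def cost_per_user(tasks_per_user: float, cost_per_run: float,
                  maintenance_runs_per_user: float = 0.0,
                  runs_per_task: float = 1.0) -> float:
    """Monthly cost attributable to one user, in USD."""
    task_cost = cost_per_run * runs_per_task
    return tasks_per_user * task_cost + maintenance_runs_per_user * cost_per_run


# 40 tasks per user per month at $0.015/run, plus 10 background runs per user.
print(f"${cost_per_user(40, 0.015, maintenance_runs_per_user=10):.2f} per user per month")
```

Comparing this figure with revenue per user shows whether a plan's pricing covers its agent usage.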

How OnceOnly enforces budgets at runtime

OnceOnly sits between your agent and the outside world, enforcing guardrails before costs spiral:

  • Hard budget caps (daily, monthly) + alert thresholds.
  • Execution controls: max retries, max tool calls, and per-run limits.
  • Stop “runaway loops” early, while keeping good runs fast.
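
Execution controls of this kind can be pictured as counters checked before each step. The sketch below is a generic illustration and does not show OnceOnly's API: `RunGuard`, `LimitExceeded`, and the default limits are hypothetical.

```python
class LimitExceeded(RuntimeError):
    """Raised when a run hits one of its execution limits."""


class RunGuard:
    """Per-run counters for retries and tool calls (illustrative only)."""

    def __init__(self, max_retries: int = 3, max_tool_calls: int = 20):
        self.max_retries = max_retries
        self.max_tool_calls = max_tool_calls
        self.retries = 0
        self.tool_calls = 0

    def record_tool_call(self) -> None:
        # Check before counting so the call that would exceed the limit never runs.
        if self.tool_calls >= self.max_tool_calls:
            raise LimitExceeded(f"max tool calls ({self.max_tool_calls}) reached")
        self.tool_calls += 1

    def record_retry(self) -> None:
        if self.retries >= self.max_retries:
            raise LimitExceeded(f"max retries ({self.max_retries}) reached")
        self.retries += 1


# Simulate a runaway loop that would otherwise make 100 tool calls.
guard = RunGuard(max_retries=2, max_tool_calls=5)
try:
    for step in range(100):
        guard.record_tool_call()
        # ... the agent would call its tool here ...
except LimitExceeded as err:
    print(f"stopped after {guard.tool_calls} tool calls: {err}")
```

A run that stays under its limits only pays for a counter increment per step, so well-behaved runs are not slowed down.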

Generate a policy above, then enforce it automatically with OnceOnly →