Calculator
AI Token & Pricing Comparator
Estimate input/output token cost across OpenAI, Anthropic, Google, xAI, Mistral, and more — including prompt-cache savings.
Pricing data refreshed:
The AITOT Token & Pricing Comparator lets you compare per-token cost across 22 leading LLMs in 2026 — including OpenAI GPT-5, Claude Opus 4.7, Gemini 2.5 Pro, Llama 4 70B, DeepSeek V3, Mistral Large 2, and Amazon Nova. Plug in your average input and output tokens, hit calculate, and get per-request + monthly cost side-by-side.
Output tokens dominate most bills — they cost 3–5× input tokens on every major provider. The comparator sorts by total cost, not headline rate, so you see real-world impact. Prompt caching toggles cut input cost 60–90% on Anthropic and 50% on OpenAI when your system prompt is stable.
All pricing is sourced from official provider documentation and refreshed on the first of every month. Real bills land within 5–15% of these estimates — variance comes from caching, batching, region surcharges, and rate-limit headroom. No login required; results compute client-side.
Cheapest
Amazon · Nova Lite
$14.40
Per month
| Provider | Model | Input / 1M | Output / 1M | Per request | Per month |
|---|---|---|---|---|---|
| Amazon | Nova Lite | $0.06 | $0.24 | $0.0001 | $14.40 |
| OpenAI | GPT-5 nano | $0.05 | $0.40 | $0.0002 | $20.00 |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | $0.0002 | $24.00 | |
| Cohere | Command R | $0.15 | $0.60 | $0.0004 | $36.00 |
| Mistral | Mistral Small 3 | $0.20 | $0.60 | $0.0004 | $40.00 |
| DeepSeek | DeepSeek V3 | $0.27 | $1.10 | $0.0007 | $65.60 |
| OpenAI | GPT-5.4 nano | $0.20 | $1.25 | $0.0007 | $66.00 |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | $0.0008 | $80.00 | |
| OpenAI | GPT-5 mini | $0.25 | $2.00 | $0.001 | $100.00 |
| Meta (Together) | Llama 4 70B | $0.88 | $0.88 | $0.0011 | $105.60 |
| Gemini 2.5 Flash | $0.30 | $2.50 | $0.0012 | $124.00 | |
| DeepSeek | DeepSeek R1 | $0.55 | $2.19 | $0.0013 | $131.60 |
| xAI | Grok 4 mini | $0.60 | $2.40 | $0.0014 | $144.00 |
| Amazon | Nova Pro | $0.80 | $3.20 | $0.0019 | $192.00 |
| OpenAI | GPT-5.4 mini | $0.75 | $4.50 | $0.0024 | $240.00 |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | $0.0028 | $280.00 |
| Mistral | Mistral Large 2 | $2.00 | $6.00 | $0.004 | $400.00 |
| Meta (Together) | Llama 4 405B | $3.50 | $3.50 | $0.0042 | $420.00 |
| OpenAI | o3 | $2.00 | $8.00 | $0.0048 | $480.00 |
| Gemini 3.5 Flash | $1.50 | $9.00 | $0.0048 | $480.00 | |
| OpenAI | GPT-5 | $1.25 | $10.00 | $0.005 | $500.00 |
| Gemini 2.5 Pro | $1.25 | $10.00 | $0.005 | $500.00 | |
| Cohere | Command R+ | $2.50 | $10.00 | $0.006 | $600.00 |
| Gemini 3.1 Pro | $2.00 | $12.00 | $0.0064 | $640.00 | |
| OpenAI | GPT-5.4 | $2.50 | $15.00 | $0.008 | $800.00 |
| Gemini 2.5 Pro (long ctx >200K) | $2.50 | $15.00 | $0.008 | $800.00 | |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $15.00 | $0.0084 | $840.00 |
| Anthropic | Claude Opus 4.8 | $5.00 | $25.00 | $0.014 | $1,400.00 |
| xAI | Grok 4 | $5.00 | $25.00 | $0.014 | $1,400.00 |
| OpenAI | GPT-5.5 | $5.00 | $30.00 | $0.016 | $1,600.00 |
| OpenAI | GPT-5.5 Pro | $30.00 | $180.00 | $0.096 | $9,600.00 |
Estimates only. Real bills can vary 5–15% depending on caching, batching, and region.
What this calculator does
22 LLMs in one table
GPT-5, Claude Opus 4.7, Gemini 2.5 Pro, Llama 4, DeepSeek V3, Mistral, Amazon Nova, Cohere Command — all comparable side-by-side.
Prompt cache modeling
Toggle cache hit rate from 0–100% to see Anthropic (10% on hit), OpenAI (50%), and Google (25%) effective rates.
Per-request + per-month
Calculator surfaces both per-request cost (UX-level) and monthly total (billing-level) for every model.
Workload presets
Chat, RAG, agent, summarization, and code-gen presets prefill realistic token splits — no guessing.
Output:input ratio
Most chat workloads are 4:1 input:output; code-gen 3:1; summarization 10:1. Slider lets you tune to your real ratio.
Export + share
Save scenarios to localStorage, export CSV, share via permalink with your team.
Quick comparison
Token pricing across the top LLMs (per 1M tokens)
| Model | Input | Output | Blended 50:50 |
|---|---|---|---|
| Amazon Nova Lite | $0.06 | $0.24 | $0.15 |
| DeepSeek V3 | $0.27 | $1.10 | $0.69 |
| Gemini 2.5 Flash | $0.30 | $2.50 | $1.40 |
| GPT-5 mini | $0.40 | $1.60 | $1.00 |
| Claude Haiku 4.5 | $0.80 | $4.00 | $2.40 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $9.00 |
| OpenAI GPT-5 | $10.00 | $30.00 | $20.00 |
| Claude Opus 4.7 | $15.00 | $75.00 | $45.00 |
Output dominates most workloads. Use the calculator for your real input:output ratio.
How to use this calculator
Estimate input + output token cost for your workload across 22 LLMs in under 60 seconds.
- 1
Pick a workload preset
Choose chat, RAG, agent, summarization, or code-gen. The preset prefills realistic input/output token splits.
- 2
Set requests per month
Enter your expected monthly request volume. The calculator scales per-request cost to monthly total.
- 3
Toggle prompt caching
If your system prompt is stable, set cache hit rate to 50–80% to see effective input rates.
- 4
Compare and pick
Sort the result table by monthly cost. Pick the cheapest model that meets your quality bar.
Why use this calculator
- ✓Free forever — no login, no credit card
- ✓22 LLMs refreshed monthly from official docs
- ✓Runs client-side — your inputs stay private
- ✓Workload presets, not generic averages
- ✓Includes prompt cache + batch discounts
- ✓Permalinks for team sharing