Question 1

How do I forecast my LLM API spend for 12 months?

Accepted Answer

Three inputs: requests per month (month 1), growth pattern (flat/linear/exponential), and average input/output tokens per request. The calculator projects month-by-month spend and gives you the year-1 total. Save scenarios to compare model choices side-by-side.
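As a rough sketch of how a month-by-month projection like this could be computed: the calculator's actual formulas are not shown here, so the function names, the meaning of `rate`, and the example prices ($3 and $15 per million input and output tokens) are all assumptions for illustration.

```python
def monthly_requests(month1, pattern, rate, months=12):
    """Request volume per month under an assumed growth pattern."""
    if pattern == "flat":
        return [month1 for _ in range(months)]
    if pattern == "linear":
        # Assumption: add a fixed share of month-1 volume every month.
        return [month1 * (1 + rate * m) for m in range(months)]
    if pattern == "exponential":
        # Assumption: compound the previous month's volume by `rate`.
        return [month1 * (1 + rate) ** m for m in range(months)]
    raise ValueError(f"unknown pattern: {pattern}")


def forecast_spend(month1, pattern, rate, in_tokens, out_tokens,
                   in_price_per_m, out_price_per_m, months=12):
    """Dollar spend per month; prices are per million tokens."""
    cost_per_request = (in_tokens * in_price_per_m
                        + out_tokens * out_price_per_m) / 1_000_000
    return [r * cost_per_request
            for r in monthly_requests(month1, pattern, rate, months)]


# Hypothetical workload: 100,000 requests in month 1,
# 800 input and 200 output tokens per request.
flat = forecast_spend(100_000, "flat", 0.0, 800, 200, 3.0, 15.0)
print(f"year-1 total (flat): ${sum(flat):.2f}")
```

Swapping `"flat"` for `"linear"` or `"exponential"` with the same inputs gives the other two scenarios for a side-by-side comparison.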

Question 2

Which growth pattern should I use — flat, linear, or exponential?

Accepted Answer

Flat: stable internal tools or B2B SaaS at scale. Linear: a typical growth product adding ~10% month over month. Exponential: pre-PMF startups or viral consumer apps doubling every 1–2 months. Most AI products end up somewhere between linear growth and exponential growth of about 1.3× per month.
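To compare these patterns on one scale, it can help to convert a doubling time into a monthly growth rate. This is a minimal sketch using the standard compounding identity; the function names are hypothetical and not part of the calculator.

```python
def monthly_rate_from_doubling(doubling_months):
    """Monthly growth rate that doubles volume every `doubling_months`."""
    return 2 ** (1 / doubling_months) - 1


def month12_multiple(rate, compounding=True):
    """Month-12 volume relative to month 1 (11 growth steps)."""
    if compounding:
        return (1 + rate) ** 11
    return 1 + rate * 11


for months in (1, 2):
    rate = monthly_rate_from_doubling(months)
    print(f"doubling every {months} month(s): "
          f"{rate:.1%} per month, "
          f"{month12_multiple(rate):.0f}x volume by month 12")

# The same 10% step, applied additively vs. compounded.
print(f"10% linear:     {month12_multiple(0.10, compounding=False):.2f}x")
print(f"10% compounded: {month12_multiple(0.10):.2f}x")
```

The gap between the additive and compounded results is why the choice of pattern dominates the year-1 total.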

Question 3

Is GPT-5 or Claude Sonnet 4.6 cheaper at 100M tokens per month?

Accepted Answer

At 100M tokens (80M input, 20M output): GPT-5 costs $1,400/month and Claude Sonnet 4.6 costs $540/month, about 61% less. Sonnet 4.6 wins on price at virtually every scale. Switch unless you need GPT-5-specific features.
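The arithmetic behind a comparison like this is a weighted sum of input and output tokens. In the sketch below, the $3 and $15 per-million prices are assumptions chosen because they reproduce the $540 figure; the answer does not state the per-token prices it used, so treat them as placeholders.

```python
def monthly_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Monthly cost in dollars; prices are per million tokens."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000


def percent_cheaper(cheaper, pricier):
    """How much less `cheaper` costs, as a percentage of `pricier`."""
    return (pricier - cheaper) / pricier * 100


# Assumed prices: $3/M input, $15/M output (not stated in the answer).
sonnet = monthly_cost(80_000_000, 20_000_000, 3.00, 15.00)
print(f"80M in / 20M out at assumed prices: ${sonnet:,.0f}")

# Gap between the two monthly totals quoted in the answer.
print(f"${540} vs ${1400}: {percent_cheaper(540, 1400):.1f}% less")
```

Because output tokens usually cost several times more than input tokens, the input/output split can matter as much as the total token count.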

Question 4

Does this calculator include prompt caching savings?

Accepted Answer

Yes. Toggle "cache hit rate" to model it. On cache hits, Anthropic charges 10% of the normal input price, OpenAI 50%, and Google 25%. At a 60% cache hit rate on a RAG workload, Anthropic input cost drops 54% (0.60 × 0.90). The savings are significant for apps with long system prompts.
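The savings follow from multiplying the hit rate by the discount on cached tokens. This is a minimal sketch using the cached-price fractions quoted above; it assumes the discount applies to every cached input token and ignores any cache-write surcharge or storage fee, which a real bill may include.

```python
# Cached input price as a fraction of the normal input price,
# taken from the figures quoted in the answer.
CACHED_PRICE_FRACTION = {
    "anthropic": 0.10,
    "openai": 0.50,
    "google": 0.25,
}


def input_cost_reduction(provider, hit_rate):
    """Fraction by which input cost drops at a given cache hit rate."""
    return hit_rate * (1 - CACHED_PRICE_FRACTION[provider])


for provider in CACHED_PRICE_FRACTION:
    drop = input_cost_reduction(provider, 0.60)
    print(f"{provider}: input cost drops {drop:.0%} at a 60% hit rate")
```

Output tokens are never cached, so the reduction applies only to the input side of the bill.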

Question 5

How accurate is a 12-month LLM forecast?

Accepted Answer

For the first 3 months: within 10% if your traffic estimate is realistic. For months 6–12: ±30% is normal because pricing changes and you may switch models. Re-run the forecast monthly and pin the saved scenario for executive reporting.

Question 6

What is the cheapest way to serve 1 billion LLM tokens per month?

Accepted Answer

Three paths: (1) DeepSeek V3 at $1.10/M output tokens = ~$220/month for 200M output tokens, (2) Together Llama 4 70B at $0.88/M = $176/month for the same 200M output tokens, (3) self-hosted vLLM on 4× H100 at $2.50/hr per GPU = $7,200/month flat (worth it above ~3B tokens/month). The calculator compares all three.
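The three figures can be reproduced with a few lines of arithmetic. This sketch assumes a 30-day month running around the clock and that the $2.50/hr rate is per GPU (which is what makes four GPUs come to $7,200); the helper names are hypothetical.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, always on


def api_cost(tokens_millions, price_per_m):
    """Pay-per-token cost in dollars."""
    return tokens_millions * price_per_m


def self_hosted_cost(gpus, hourly_rate_per_gpu):
    """Flat monthly cost of rented GPUs, independent of traffic."""
    return gpus * hourly_rate_per_gpu * HOURS_PER_MONTH


def break_even_millions(fixed_monthly, price_per_m):
    """Monthly volume (in millions of tokens) where API cost equals the flat cost."""
    return fixed_monthly / price_per_m


print(f"DeepSeek V3:  ${api_cost(200, 1.10):,.0f}")
print(f"Llama 4 70B:  ${api_cost(200, 0.88):,.0f}")
fixed = self_hosted_cost(4, 2.50)
print(f"Self-hosted:  ${fixed:,.0f}")
print(f"Break-even vs $1.10/M: {break_even_millions(fixed, 1.10):,.0f}M tokens")
```

The break-even volume depends heavily on which API price you compare against: a ~3B tokens/month threshold corresponds to an API price of about $2.40 per million tokens, while against a $1.10/M price the flat cost is only recovered at a much higher volume.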

Model	Month-1	Year-1 Total	vs Sonnet
Amazon Nova Lite	$10	$120	0.02×
Gemini 2.5 Flash	$74	$888	0.14×
DeepSeek V3	$80	$960	0.15×
Claude Haiku 4.5	$144	$1,728	0.27×
Claude Sonnet 4.6	$540	$6,480	1.00×
OpenAI GPT-5	$1,400	$16,800	2.59×
Claude Opus 4.7	$2,700	$32,400	5.00×
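The table's derived columns can be recomputed from the month-1 figures alone. In this sketch the year-1 total is taken as twelve times month 1, which matches every row and implies a flat growth pattern; that reading is inferred from the numbers, not stated by the page.

```python
# Month-1 cost in dollars, copied from the table.
MONTH1 = {
    "Amazon Nova Lite": 10,
    "Gemini 2.5 Flash": 74,
    "DeepSeek V3": 80,
    "Claude Haiku 4.5": 144,
    "Claude Sonnet 4.6": 540,
    "OpenAI GPT-5": 1400,
    "Claude Opus 4.7": 2700,
}
BASELINE = "Claude Sonnet 4.6"


def year1_total(month1_cost):
    """Year-1 total under flat growth (inferred from the table)."""
    return month1_cost * 12


def ratio_vs_baseline(month1_cost):
    """Cost relative to the Sonnet row, rounded as in the table."""
    return round(month1_cost / MONTH1[BASELINE], 2)


for model, cost in sorted(MONTH1.items(), key=lambda kv: kv[1]):
    print(f"{model:<20} ${cost:>6,} ${year1_total(cost):>7,} "
          f"{ratio_vs_baseline(cost):.2f}x")
```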

LLM API Monthly Cost Estimator

What this calculator does

Month-by-month forecast

Growth patterns

Prompt cache modeling

22 models compared

Scenario saver

Year-1 cumulative
