AI Embeddings Cost Calculator

Estimate one-time and recurring embedding cost across 9+ providers. Plug in document corpus size, chunk strategy, and refresh frequency.

The AITOT Embeddings Cost calculator estimates one-time corpus embedding plus recurring re-embed cost across 9 providers — OpenAI text-embedding-3-small/large, Cohere Embed v4, Voyage 3 (Lite + standard), Jina v3, BGE-M3 (self-host), Mistral Embed, Google text-embedding-005, and Azure OpenAI Embed.

A 1M-document corpus at an average of 500 tokens per doc comes to 500M tokens. Embedding it once costs $10 with OpenAI text-embedding-3-small, $65 with OpenAI text-embedding-3-large, $50 with Cohere Embed v4, and $10 with Voyage 3 Lite. Most one-time embed bills are small; recurring re-embedding from doc updates is what scales.

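Set the refresh frequency in full re-embeds per month (0 = never, 0.25 = every 4 months, 1 = monthly, 4 = weekly). The calculator surfaces the self-host break-even alongside managed provider costs. With self-hosted BGE-M3 on a single H100 at about $1.3k/month flat, the break-even against OpenAI text-embedding-3-large ($0.13 per 1M tokens) is roughly 10B tokens/month; cheaper managed models push it higher.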

Cheapest (year 1): Together BGE-M3 · 1024 dim · 8,192 max tokens · $2

| Provider | Model | Details | $ / 1M tokens | One-time embed cost | Monthly cost | Year 1 |
|---|---|---|---|---|---|---|
| Together | BGE-M3 | 1024 dim · Self-host open weights for $0 | $0.008 | $0.40 | $0.14 | $2 |
| Together | bge-large-en-v1.5 | 1024 dim | $0.008 | $0.40 | $0.14 | $2 |
| Fireworks | nomic-embed-text-v1.5 | 768 dim | $0.008 | $0.40 | $0.14 | $2 |
| Jina AI | jina-embeddings-v3 | 1024 dim · configurable | $0.012 | $0.60 | $0.21 | $3 |
| Jina AI | jina-embeddings-v4 | 2048 dim · configurable | $0.018 | $0.90 | $0.31 | $5 |
| OpenAI | text-embedding-3-small | 1536 dim · configurable | $0.02 | $1.00 | $0.35 | $5 |
| Voyage AI | voyage-4-lite | 512 dim · 200M tokens free | $0.02 | $1.00 | $0.35 | $5 |
| Voyage AI | voyage-3-lite | 512 dim | $0.02 | $1.00 | $0.35 | $5 |
| Amazon Bedrock | Titan Embed v2 | 1024 dim · configurable | $0.02 | $1.00 | $0.35 | $5 |
| Voyage AI | voyage-4 | 1024 dim · configurable · 200M tokens free | $0.06 | $3.00 | $1.05 | $16 |
| Voyage AI | voyage-3 | 1024 dim | $0.06 | $3.00 | $1.05 | $16 |
| Cohere | embed-english-v3.0 | 1024 dim | $0.10 | $5.00 | $1.75 | $26 |
| Cohere | embed-multilingual-v3.0 | 1024 dim | $0.10 | $5.00 | $1.75 | $26 |
| Cohere | embed-english-light-v3.0 | 384 dim · Smaller, cheaper at inference | $0.10 | $5.00 | $1.75 | $26 |
| Mistral | mistral-embed | 1024 dim | $0.10 | $5.00 | $1.75 | $26 |
| Voyage AI | voyage-4-large | 1024 dim · configurable · Top MTEB 2026; 200M tokens free | $0.12 | $6.00 | $2.10 | $31 |
| OpenAI | text-embedding-3-large | 3072 dim · configurable · Matryoshka: truncate to 256/512/1024 without retrain | $0.13 | $6.50 | $2.28 | $34 |
| Google | Gemini Embedding | 3072 dim · configurable · Text-only | $0.15 | $7.50 | $2.63 | $39 |
| Voyage AI | voyage-3-large | 1024 dim · configurable · Legacy v3; consider voyage-4-large | $0.18 | $9.00 | $3.15 | $47 |
| Voyage AI | voyage-code-3 | 1024 dim · Optimized for code retrieval | $0.18 | $9.00 | $3.15 | $47 |
| Google | Gemini Embedding 2 | 3072 dim · configurable · Multimodal: text $0.20, image $0.45, audio $6.50, video $12 per 1M tokens | $0.20 | $10.00 | $3.50 | $52 |

A refresh frequency of 0.25 means re-embedding the full corpus once every 4 months. Models marked "configurable" support Matryoshka truncation: you can reduce the number of dimensions after the fact without re-embedding.
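The three cost columns can be sketched with a simple per-token model. This is not the calculator's actual code. The function name and the inputs (a 50M-token corpus, 0.25 re-embeds per month, 5M query tokens per month) are assumptions inferred from the table, chosen because they reproduce the $0.008 row.

```python
def embedding_costs(price_per_m, corpus_tokens, refreshes_per_month, query_tokens_per_month):
    """Return (one_time, monthly, year_one) cost in dollars.

    price_per_m is the provider's price per 1M input tokens.
    """
    # One-time cost: embed the whole corpus once.
    one_time = corpus_tokens / 1e6 * price_per_m
    # Recurring tokens: re-embedded share of the corpus plus query traffic.
    monthly_tokens = corpus_tokens * refreshes_per_month + query_tokens_per_month
    monthly = monthly_tokens / 1e6 * price_per_m
    # Year 1 is the initial embed plus twelve months of recurring cost.
    year_one = one_time + 12 * monthly
    return one_time, monthly, year_one


# Assumed inputs (inferred from the table, not stated on the page).
one_time, monthly, year_one = embedding_costs(
    price_per_m=0.008,
    corpus_tokens=50_000_000,
    refreshes_per_month=0.25,
    query_tokens_per_month=5_000_000,
)
print(f"one-time ${one_time:.2f}, monthly ${monthly:.2f}, year 1 ${year_one:.0f}")
```

Under these assumed inputs the result matches the $0.40 / $0.14 / $2 row. Changing `price_per_m` to 0.13 gives the text-embedding-3-large row.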

What this calculator does

9 providers compared

OpenAI 3-small/large, Cohere v4, Voyage 3, Jina, Mistral, Google, Azure, BGE-M3 self-host.

One-time + recurring

Initial corpus embed cost + monthly re-embed cost shown separately.

Refresh frequency slider

Model how often you re-embed (never, quarterly, monthly, weekly).

Self-host break-even

Compares managed APIs to BGE-M3 on a rented H100. At about $1,300/month flat, break-even against text-embedding-3-large ($0.13 per 1M tokens) is roughly 10B tokens/month.

Dimension truncation

Matryoshka models (OpenAI 3-large) let you truncate dimensions for storage savings.

Query token modeling

Queries are billed at the same per-token rate as documents, so query tokens count too. This is often overlooked.

Quick comparison

Cost to embed a 500M-token corpus + 50M monthly query tokens

| Provider | One-time | Monthly | $ / 1M tokens |
|---|---|---|---|
| Jina v3 | $9 | $0.90 | $0.018 |
| Voyage 3 Lite | $10 | $1 | $0.02 |
| OpenAI text-embed-3-small | $10 | $1 | $0.02 |
| Cohere Embed v4 Light | $50 | $5 | $0.10 |
| Voyage 3 Large | $65 | $6.50 | $0.13 |
| OpenAI text-embed-3-large | $65 | $6.50 | $0.13 |
| Self-host BGE-M3 (H100) | ~$45 | ~$1,300 | flat /mo |

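Self-host wins above roughly 10B tokens/month of total throughput against the $0.13 per 1M rate ($1,300 ÷ $0.13 per 1M tokens); against cheaper APIs the threshold is higher.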
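One way to sanity-check a break-even figure is to divide the flat monthly self-host cost by a managed per-token rate. The sketch below uses the ~$1,300/month figure and the per-1M rates from the table above. The function name is hypothetical, and the model ignores engineering time, the one-time embed, and any quality differences between models.

```python
def break_even_tokens_per_month(flat_monthly_cost, price_per_m):
    """Monthly token volume at which a flat-cost deployment costs the same as a per-token API."""
    return flat_monthly_cost / price_per_m * 1_000_000


SELF_HOST_MONTHLY = 1300.0  # "~$1,300 flat /mo" from the quick comparison

# Managed rates ($ per 1M tokens) taken from the quick comparison.
managed_rates = [
    ("OpenAI text-embed-3-small", 0.02),
    ("Cohere Embed v4 Light", 0.10),
    ("OpenAI text-embed-3-large", 0.13),
]

for name, price in managed_rates:
    tokens = break_even_tokens_per_month(SELF_HOST_MONTHLY, price)
    print(f"{name}: break-even at about {tokens / 1e9:.0f}B tokens/month")
```

The cheaper the managed rate, the more monthly volume is needed before a flat-cost GPU pays for itself.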

How to use this calculator

Calculate one-time corpus embedding + recurring re-embed cost across 9 providers.

  1. Enter corpus size

    Tokens in your full corpus. Documents × avg tokens/doc. Typical: 1 doc = 500 tokens.

  2. Set refresh frequency

    0 = never, 1 = monthly, 4 = weekly. Most production corpora re-embed quarterly.

  3. Add query volume

    Monthly query tokens (queries × tokens/query). Often the biggest line item over time.

  4. Compare and pick

    Sort by monthly cost. Self-host BGE-M3 wins only at high volume: roughly 10B tokens/month against a $0.13 per 1M rate.
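The four steps above can be sketched as one function that turns document and query counts into monthly tokens and ranks providers by recurring cost. The prices come from the quick comparison. The function name and the example query volume (1M queries of 50 tokens each) are illustrative assumptions, not values from the calculator.

```python
# $ per 1M tokens, taken from the quick comparison.
PRICES_PER_M = {
    "Jina v3": 0.018,
    "Voyage 3 Lite": 0.02,
    "OpenAI text-embed-3-small": 0.02,
    "Cohere Embed v4 Light": 0.10,
    "OpenAI text-embed-3-large": 0.13,
}


def rank_by_monthly_cost(docs, avg_tokens_per_doc, refreshes_per_month,
                         queries_per_month, tokens_per_query):
    """Return (provider, monthly_cost) pairs sorted from cheapest to most expensive."""
    corpus_tokens = docs * avg_tokens_per_doc              # step 1: corpus size
    reembed_tokens = corpus_tokens * refreshes_per_month   # step 2: refresh frequency
    query_tokens = queries_per_month * tokens_per_query    # step 3: query volume
    monthly_tokens = reembed_tokens + query_tokens
    costs = {
        name: monthly_tokens / 1e6 * price
        for name, price in PRICES_PER_M.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])  # step 4: compare


# Hypothetical workload: 1M docs, 500 tokens each, re-embedded every 4 months,
# plus 1M queries per month at 50 tokens per query.
ranking = rank_by_monthly_cost(1_000_000, 500, 0.25, 1_000_000, 50)
for name, cost in ranking:
    print(f"{name}: ${cost:.2f}/month")
```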

Why use this calculator

  • 9 providers refreshed monthly
  • One-time + recurring split
  • Self-host break-even modeled
  • Matryoshka dimension truncation
  • Query tokens included
  • No login required

Frequently Asked Questions

What is the cheapest embeddings provider in 2026?
For one-time corpus embedding, the cheapest managed options here are Jina v3 at $0.018/M tokens, then Voyage 3 Lite and OpenAI text-embedding-3-small at $0.02/M each. Cohere Embed v4 Light is $0.10/M. Self-hosted BGE-M3 is effectively free at scale. For a balance of quality and price, OpenAI text-embedding-3-large at $0.13/M wins.
How much does it cost to embed a 1M-document corpus?
1M docs × 500 tokens/doc on average = 500M tokens. OpenAI text-embedding-3-small: $10. OpenAI text-embedding-3-large: $65. Cohere Embed v4: $50. For most one-time corpus embeds the bill is small; recurring re-embedding from doc updates is what scales.
How often should I re-embed my corpus?
Static reference data (legal, scientific): annually or on schema change. Frequently updated docs (product catalog, docs site): weekly delta re-embed of changed chunks only. Don't batch re-embed unchanged data; use change detection based on file hashes or last-modified timestamps.
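Hash-based change detection can be sketched in a few lines. This is a minimal illustration, not a production pipeline. The function name and the in-memory dictionaries are assumptions, and a real system would persist the hashes next to the vectors and also handle deleted chunks.

```python
import hashlib


def chunks_to_reembed(chunks, stored_hashes):
    """Return (ids needing re-embedding, updated hash map).

    chunks: dict of chunk_id -> current text
    stored_hashes: dict of chunk_id -> SHA-256 hex digest from the previous run
    """
    changed = []
    new_hashes = {}
    for chunk_id, text in chunks.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        new_hashes[chunk_id] = digest
        # New chunks have no stored hash; edited chunks have a different one.
        if stored_hashes.get(chunk_id) != digest:
            changed.append(chunk_id)
    return changed, new_hashes


# First run: nothing stored yet, so every chunk is embedded.
docs_v1 = {"a": "Return policy: 30 days.", "b": "Shipping: 3-5 days."}
changed_v1, hashes = chunks_to_reembed(docs_v1, {})

# Second run: only chunk "b" was edited, so only "b" is re-embedded.
docs_v2 = {"a": "Return policy: 30 days.", "b": "Shipping: 2-4 days."}
changed_v2, hashes = chunks_to_reembed(docs_v2, hashes)
print(changed_v1, changed_v2)
```

Only the tokens in changed chunks are billed on each refresh, which is what keeps the recurring cost well below a full re-embed.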
Should I use 1536 or 3072 dimension embeddings?
1536 (OpenAI default) is sufficient for 90% of use cases. 3072 wins on long-context retrieval (legal, scientific). 1536-dim vectors cost half as much to store in your vector DB and query faster. Use Matryoshka truncation to test 512 → 1024 → 1536; gains often plateau at 1024.
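Truncating a Matryoshka-style embedding usually means keeping the leading components and rescaling the result to unit length. The sketch below shows that step together with a rough storage estimate. It assumes float32 storage with no index overhead, uses hypothetical helper names, and only makes sense for models trained to support truncation.

```python
import math


def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale to unit length."""
    head = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


def storage_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw vector storage, assuming float32 values and no index overhead."""
    return num_vectors * dims * bytes_per_value


# Toy 4-dim unit vector standing in for a real embedding.
full = [0.5, 0.5, 0.5, 0.5]
short = truncate_embedding(full, 2)

# Storage for 1M vectors at the dimensions discussed above.
for dims in (512, 1024, 1536, 3072):
    gigabytes = storage_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dims: {gigabytes:.1f} GB")
```

Storage scales linearly with dimension, so halving from 3072 to 1536 halves the raw vector footprint. Whether retrieval quality holds up at the smaller size has to be measured on your own data.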
Is self-hosting BGE-M3 cheaper than OpenAI embeddings?
Only at very high volume. BGE-M3 on a single H100 ($1.85–$2.50/hr) runs ~2M tokens/sec, which is about 5T tokens/month of capacity for roughly $1.3k/month flat. OpenAI text-embedding-3-large at $0.13/M costs $130 per billion tokens, so on these figures self-hosting breaks even at about 10B tokens/month against 3-large, and at higher volumes against cheaper models.
How are embeddings priced — by tokens or by documents?
Always by input tokens. The calculator converts your doc count × avg tokens/doc into the billable token count. OpenAI, Cohere, Voyage, and Jina all charge per million input tokens regardless of dimension. Storage is separate (paid to your vector DB).