19 models · bundle prices

AI token cost calculator

Calculate and compare the real cost of calling AI models. Enter tokens or paste a prompt — get instant estimates.

OpenAI · Anthropic · Google · Mistral · xAI

Calculate cost

OpenAI · Input $2.5/M · Output $10/M · 128k ctx

Select a model, enter token counts, and hit Calculate.
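As a rough sketch of the arithmetic behind this estimate, assuming the usual per-million-token billing model: the cost of one call is the input tokens times the input rate plus the output tokens times the output rate, divided by one million. The function name `api_call_cost` and the token counts are hypothetical. The rates are the $2.5/M and $10/M shown above.

```python
def api_call_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one API call, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Rates from the model card above; the token counts are illustrative only.
cost = api_call_cost(input_tokens=1_000, output_tokens=500,
                     input_price_per_m=2.5, output_price_per_m=10.0)
print(f"${cost:.6f}")
```

Output tokens dominate here because the output rate is four times the input rate, so trimming response length tends to save more than trimming the prompt.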

Pipeline composition

Chain models together — output of each step feeds the next.

The prompt / context that kicks off the pipeline.

Step 1 · $2.5 / $10 per 1M
→ feeds as input to the next step

Step 2 · $2.5 / $10 per 1M
→ feeds as input to the next step

Configure your pipeline steps and hit calculate to see the total cost breakdown.
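The chaining described above can be sketched as follows, assuming each step's input is exactly the previous step's output (no extra system prompt or context added per step, which a real pipeline may well have). The `Step` class, the `pipeline_cost` function, and the token counts are hypothetical. The prices are the $2.5 / $10 per 1M shown for each step.

```python
from dataclasses import dataclass


@dataclass
class Step:
    input_price_per_m: float   # USD per 1M input tokens
    output_price_per_m: float  # USD per 1M output tokens
    output_tokens: int         # tokens this step is expected to generate


def pipeline_cost(prompt_tokens: int, steps: list[Step]) -> list[float]:
    """Return the USD cost of each step; step N's output is step N+1's input."""
    costs = []
    input_tokens = prompt_tokens
    for step in steps:
        cost = (input_tokens * step.input_price_per_m
                + step.output_tokens * step.output_price_per_m) / 1_000_000
        costs.append(cost)
        input_tokens = step.output_tokens  # feed the output forward
    return costs


# Two steps at the rates shown above; token counts are illustrative only.
steps = [Step(2.5, 10.0, output_tokens=500), Step(2.5, 10.0, output_tokens=200)]
per_step = pipeline_cost(prompt_tokens=1_000, steps=steps)
print([f"${c:.5f}" for c in per_step], f"total ${sum(per_step):.5f}")
```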

Compare models

Select two or more models to compare side by side.

Select models

Self-hosted cost

Pick a model, GPU, and cluster config. Costs are calculated from real benchmarks.

Benchmarks · Apr 6, 2026

Llama

Mistral

Mixtral

Qwen

Phi

Gemma

DeepSeek

FP16 (half precision, unquantized)

16 GB VRAM required for Llama 3.1 8B in FP16
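The 16 GB figure is consistent with a weights-only estimate: parameter count times bytes per parameter. The sketch below assumes that rule. It ignores the KV cache, activations, and framework overhead, so real usage will be higher. The function name and the INT8/INT4 entries are illustrative, not taken from this page.

```python
# Bytes needed to store one parameter at each precision (weights only).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_vram_gb(params_billions: float, precision: str = "fp16") -> float:
    """Estimate VRAM (decimal GB) needed just to hold the model weights."""
    # params_billions * 1e9 params * bytes/param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[precision]


# An 8B-parameter model in FP16: 8 * 2 bytes = 16 GB of weights.
print(weight_vram_gb(8, "fp16"))
```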

Number of GPUs

Utilization %

Leave at 0 to use the benchmark value above. Set your own if you've profiled the actual model on your hardware.

Input tokens

Output tokens

Self-hosted inference has a unified cost — input and output share the same throughput budget.

Self-hosted · 1 GPU · Llama 3.1 8B · FP16

Cost per million tokens: $0.4336
Effective throughput: 2,242 tok/s
Cost per call: $0.000650
Cluster cost / hour: $3.50/h ($2,555/mo at 730 h)
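The figures in this panel can be reproduced from two inputs, the hourly cluster cost and the effective throughput. The sketch below assumes the calculator divides the hourly cost by the tokens generated per hour. That assumption matches the displayed numbers, but how the utilization setting enters the effective throughput is not shown here. The function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # the monthly figure on this page uses 730 h


def cost_per_million_tokens(cluster_cost_per_hour: float,
                            tokens_per_second: float) -> float:
    """USD per 1M tokens for a cluster with the given sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return cluster_cost_per_hour / tokens_per_hour * 1_000_000


def monthly_cost(cluster_cost_per_hour: float) -> float:
    """USD per month if the cluster runs around the clock."""
    return cluster_cost_per_hour * HOURS_PER_MONTH


rate = cost_per_million_tokens(3.50, 2_242)
print(f"${rate:.4f} per 1M tokens")      # about $0.4336
print(f"${monthly_cost(3.50):,.0f}/mo")  # $2,555/mo
```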

At scale

100 calls: $0.0650
1,000 calls: $0.6504
1M calls: $650.43
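Because self-hosted inference uses one rate for input and output, the per-call cost is the total token count times the cost per million tokens. The displayed $0.000650 per call is consistent with about 1,500 total tokens per call. The 1,000 input / 500 output split below is an assumption, and the function name is hypothetical.

```python
def self_hosted_call_cost(input_tokens: int, output_tokens: int,
                          cost_per_million: float) -> float:
    """USD per call when input and output tokens share one unified rate."""
    return (input_tokens + output_tokens) * cost_per_million / 1_000_000


# $0.4336 per 1M tokens is the rounded rate shown on this page.
per_call = self_hosted_call_cost(1_000, 500, cost_per_million=0.4336)

for calls in (100, 1_000, 1_000_000):
    print(f"{calls:>9,} calls  ${per_call * calls:,.4f}")
```

With the rounded rate, the 1M-call total comes out a few cents below the $650.43 shown, which suggests the page scales from the unrounded rate.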