19 models · bundle prices

AI token cost calculator

Calculate and compare the real cost of calling AI models. Enter tokens or paste a prompt — get instant estimates.

OpenAI · Anthropic · Google · Mistral · xAI

Calculate cost

Model

OpenAI · Input $2.5/M · Output $10/M · 128k ctx

Input tokens

Output tokens

Select a model, enter token counts, and hit Calculate.
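
The estimate behind this form is per-token arithmetic. Below is a minimal Python sketch, assuming prices are quoted in US dollars per 1M tokens as on the model card above ($2.5 input, $10 output). The function name `call_cost` and the example token counts are illustrative and are not taken from the calculator's own code.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the dollar cost of one call given per-1M-token prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Example (hypothetical call size): 1,000 input and 500 output tokens
# at $2.5 / $10 per 1M tokens.
example_cost = call_cost(1_000, 500, 2.5, 10.0)
print(f"${example_cost:.4f}")  # 0.0025 + 0.0050 = $0.0075
```

Output tokens are priced four times higher than input tokens here, so a short prompt with a long completion can cost more than the reverse.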

Pipeline

Pipeline composition

Chain models together — output of each step feeds the next.

Initial input tokens

The prompt / context that kicks off the pipeline.

Model

$2.5 / $10 per 1M

Expected output tokens

→ feeds as input to the next step

— tokens

Model

$2.5 / $10 per 1M

Expected output tokens

→ feeds as input to the next step


Configure your pipeline steps and hit calculate to see the total cost breakdown.
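
The chaining rule described above (each step's output becomes the next step's input) can be sketched in Python as follows. This assumes the next step receives exactly the previous step's output tokens, with no extra system prompt or context added. The `Step` class, the `pipeline_cost` function, and the token counts are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    input_price_per_m: float       # dollars per 1M input tokens
    output_price_per_m: float      # dollars per 1M output tokens
    expected_output_tokens: int    # tokens this step is expected to emit


def pipeline_cost(initial_input_tokens: int, steps: list[Step]) -> list[float]:
    """Return the dollar cost of each step, in order."""
    costs = []
    tokens_in = initial_input_tokens
    for step in steps:
        cost = (tokens_in * step.input_price_per_m
                + step.expected_output_tokens * step.output_price_per_m) / 1_000_000
        costs.append(cost)
        # Assumption: the output of this step is the whole input of the next.
        tokens_in = step.expected_output_tokens
    return costs


# Two steps at $2.5 / $10 per 1M, starting from a 2,000-token prompt.
steps = [Step(2.5, 10.0, 800), Step(2.5, 10.0, 300)]
per_step = pipeline_cost(2_000, steps)
total = sum(per_step)
print(per_step, total)  # step 1: $0.013, step 2: $0.005, total: $0.018
```

If a real pipeline prepends instructions or retrieved context at each step, add those tokens to `tokens_in` before pricing the step.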

Compare models

Select 2 or more models to compare side-by-side.

Select models

Input tokens

Output tokens

Self-hosted

Self-hosted cost

Pick a model, GPU and cluster config — costs calculated from real benchmarks

Benchmarks · Apr 6, 2026

1 · Model to run

Llama

Mistral

Mixtral

Qwen

Phi

Gemma

DeepSeek

2 · Quantization

FP16 (full precision)

16 GB VRAM required for Llama 3.1 8B in FP16
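
The 16 GB figure is consistent with a weights-only estimate: parameter count times bytes per parameter. A rough Python sketch, assuming 1 GB = 10^9 bytes and ignoring KV cache, activations, and framework overhead (all of which add to real usage). Whether the calculator itself uses this formula is an assumption, and `weight_vram_gb` is a hypothetical helper name.

```python
def weight_vram_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate VRAM (GB) needed to hold model weights only."""
    if params_billion <= 0 or bits_per_param <= 0:
        raise ValueError("parameter count and bit width must be positive")
    return params_billion * bits_per_param / 8


# 8B parameters at 16 bits (FP16) -> 2 bytes per parameter.
print(weight_vram_gb(8, 16))  # 16.0

# The same arithmetic for lower-precision formats (illustrative only,
# not figures taken from the page).
print(weight_vram_gb(8, 8))   # 8.0
print(weight_vram_gb(8, 4))   # 4.0
```

Leave headroom beyond this number when picking a GPU, since long contexts and large batches grow the KV cache.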

3 · GPU

4 · Cluster config

Number of GPUs

Utilization %

Throughput override (tok/s — optional)

Leave at 0 to use the benchmark value above. Set your own if you've profiled the actual model on your hardware.

Tokens per call

Input tokens

Output tokens

Self-hosted inference has a unified cost — input and output share the same throughput budget.

Compare breakeven against cloud model (optional)

Self-hosted · 1 GPU · Llama 3.1 8B · FP16

Cost per million tokens

$0.4336

Effective throughput: 2,242 tok/s

Cost per call

$0.000650

Cluster cost / hour

$3.50/h

$2,555/mo (730 h)

At scale

100 calls      $0.0650
1,000 calls    $0.6504
1M calls       $650.43
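
The figures above can be reproduced from the cluster cost and effective throughput. A minimal Python sketch, assuming cost per token is simply hourly cost divided by tokens generated per hour, with input and output tokens priced the same. The 1,500 tokens per call is inferred from the displayed per-call cost, not stated on the page, and the function names are hypothetical. Because the displayed throughput is rounded, results match the page only to about four significant digits.

```python
def cost_per_million_tokens(cluster_cost_per_hour: float,
                            effective_tok_per_s: float) -> float:
    """Dollar cost of 1M tokens at a given hourly cost and throughput."""
    if effective_tok_per_s <= 0:
        raise ValueError("throughput must be positive")
    tokens_per_hour = effective_tok_per_s * 3600
    return cluster_cost_per_hour / tokens_per_hour * 1_000_000


def cost_per_call(cost_per_m: float, input_tokens: int, output_tokens: int) -> float:
    """Unified pricing: input and output tokens share one rate."""
    return cost_per_m * (input_tokens + output_tokens) / 1_000_000


per_m = cost_per_million_tokens(3.50, 2242)   # about $0.4336 per 1M tokens
per_call = cost_per_call(per_m, 1_000, 500)   # about $0.00065 (assumed 1,500 tokens)
monthly = 3.50 * 730                          # $2,555 for a 730-hour month

print(f"${per_m:.4f}/M  ${per_call:.6f}/call  ${monthly:.0f}/mo")
```

The monthly cluster cost is fixed regardless of traffic, so the per-token figure only holds if the hardware stays as busy as the effective throughput implies.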
