How AI token pricing works

Token pricing sounds opaque until you separate three ideas: how text becomes tokens, how providers price input versus output, and how your traffic multiplies both.

Token calculation explanation

Providers bill tokens, not words. Tokenizers determine the split; different models tokenize the same string differently.

Words-to-token examples

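Use heuristics for proposals and tokenizer APIs for production forecasts. A common rule of thumb for English text is about 4 characters per token, or roughly 75 words per 100 tokens; actual counts vary by model and language.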

Prompt optimization tips

Tight prompts reduce input tokens; tight answer formats reduce output tokens.

Token reduction techniques

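Summarizing conversation history, caching repeated prompts or responses, retrieving only the relevant passages, and routing simple requests to cheaper models each reduce either the tokens you are billed for or the rate you pay for them.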

Context window explanation

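The context window is a hard cap on the tokens a single request can carry. Requests that approach the cap fail more often, and splitting work into extra calls to stay under it can raise cost, because shared instructions are sent again with every call.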
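The effect of chunking on prompt tokens can be sketched as follows. This is a minimal illustration, not a provider formula: the function name, the parameters, and the example numbers are all assumptions, and it assumes the same instructions are re-sent with every chunk.

```python
import math


def chunked_prompt_tokens(doc_tokens: int, window: int,
                          instruction_tokens: int, reserved_output: int):
    """Return (calls, total_prompt_tokens) for a document split to fit the window.

    Assumption: every call repeats the same instructions and reserves
    the same number of tokens for the completion.
    """
    room = window - instruction_tokens - reserved_output
    if room <= 0:
        raise ValueError("window too small for instructions plus reserved output")
    calls = math.ceil(doc_tokens / room)
    # The document is sent once in total; the instructions are sent once per call.
    total_prompt_tokens = doc_tokens + calls * instruction_tokens
    return calls, total_prompt_tokens


# Hypothetical numbers: a 300K-token document and a 128K-token window.
calls, total = chunked_prompt_tokens(
    doc_tokens=300_000, window=128_000,
    instruction_tokens=2_000, reserved_output=4_000,
)
print(calls, total)  # 3 306000
```

In this example the repeated instructions add 6,000 prompt tokens on top of the document itself, and each extra call also produces its own completion tokens.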

Real pricing examples

Per-request cost = prompt_tokens/1000 * input_rate + completion_tokens/1000 * output_rate, where both rates are quoted in USD per 1K tokens. Monthly cost = per-request cost * daily_requests * days.
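The two formulas translate directly into code. The sketch below assumes rates quoted per 1K tokens; the function names and the workload numbers (1,200 prompt tokens, 400 completion tokens, 5,000 requests a day) are illustrative, and the rates are the GPT-4o sample rates listed on this page.

```python
def request_cost(prompt_tokens: int, completion_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost of one request in USD; rates are USD per 1K tokens."""
    return (prompt_tokens / 1000 * input_rate
            + completion_tokens / 1000 * output_rate)


def monthly_cost(per_request: float, daily_requests: int, days: int) -> float:
    """Monthly projection: per-request cost times daily volume times days."""
    return per_request * daily_requests * days


# Hypothetical workload at $0.0025 in and $0.0100 out per 1K tokens.
per_request = request_cost(1200, 400, 0.0025, 0.0100)
monthly = monthly_cost(per_request, daily_requests=5000, days=30)
print(f"per request: ${per_request:.4f}")  # per request: $0.0070
print(f"monthly:     ${monthly:,.2f}")     # monthly:     $1,050.00
```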

Sample configured rates

Model	Provider	Input (USD / 1K tokens)	Output (USD / 1K tokens)
GPT-4o	OpenAI	0.0025	0.0100
GPT-4o mini	OpenAI	0.0002	0.0006
Claude 3.5 Sonnet	Anthropic	0.0030	0.0150
DeepSeek Chat	DeepSeek	0.0001	0.0003

FAQ: Token pricing mechanics

Short answers mirror the structured data on this page for search engines and readers.

Why is output pricing higher?
Each output token is generated sequentially and needs its own forward pass, while input tokens are processed in parallel. Providers reflect that extra compute in their rates.

Are there taxes or fees on top?
Cloud marketplaces may add their own line items, so read the full invoice rather than the token charges alone.

Do batch APIs change unit price?
Often yes. If your workload tolerates delayed results, batch endpoints are often priced below synchronous calls.

Can I negotiate pricing?
At scale, often. Prepare documented usage forecasts before the conversation.

See pricing math interactively

Adjust input/output split to learn which lever matters for your workload.

Prefilled for this page’s scenario. Pricing loads from config/models.php and /api/pricing.
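To see which lever matters without the interactive tool, you can compute the input and output shares of a request's cost. This is a small sketch under assumed names; the token counts and rates in the example are illustrative, not measurements.

```python
def cost_shares(prompt_tokens: int, completion_tokens: int,
                input_rate: float, output_rate: float):
    """Return (input_share, output_share) of one request's cost, each in [0, 1].

    Rates are USD per 1K tokens.
    """
    input_cost = prompt_tokens / 1000 * input_rate
    output_cost = completion_tokens / 1000 * output_rate
    total = input_cost + output_cost
    if total == 0:
        return 0.0, 0.0
    return input_cost / total, output_cost / total


# Hypothetical retrieval-heavy request: long prompt, short answer.
in_share, out_share = cost_shares(3000, 500, input_rate=0.0030, output_rate=0.0150)
print(f"input {in_share:.0%}, output {out_share:.0%}")
```

Even with six times more prompt tokens than completion tokens, output still accounts for close to half the cost here, because the assumed output rate is five times the input rate.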

Calculator

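Cost = (prompt tokens ÷ 1000 × P_in) + (completion tokens ÷ 1000 × P_out), multiplied by the number of requests, where P_in and P_out are the input and output rates per 1K tokens.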

Inputs: primary model, prompt tokens, completion tokens, requests, currency.

Multi-model comparison

Toggle models to compare the same workload. The cheapest option is highlighted.
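The same comparison can be sketched in a few lines. The rates below are copied from the sample configured rates on this page; the function name, the dictionary layout, and the workload numbers are assumptions made for illustration.

```python
# Sample rates from this page, as (input, output) in USD per 1K tokens.
RATES = {
    "GPT-4o": (0.0025, 0.0100),
    "GPT-4o mini": (0.0002, 0.0006),
    "Claude 3.5 Sonnet": (0.0030, 0.0150),
    "DeepSeek Chat": (0.0001, 0.0003),
}


def compare(prompt_tokens: int, completion_tokens: int, requests: int,
            rates: dict = RATES):
    """Return (model, cost_per_request, batch_cost) rows, cheapest first."""
    rows = []
    for model, (p_in, p_out) in rates.items():
        per_request = prompt_tokens / 1000 * p_in + completion_tokens / 1000 * p_out
        rows.append((model, per_request, per_request * requests))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical workload: 1,000 prompt tokens, 500 completion tokens, 10,000 requests.
rows = compare(1000, 500, 10_000)
for model, per_request, batch in rows:
    print(f"{model:<18} ${per_request:.5f}/req  ${batch:,.2f} batch")
print("cheapest:", rows[0][0])
```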

Monthly cost simulator

Projects monthly cost from average daily requests and working days per month, using the prompt and completion token counts from the calculator and the primary model's rates.

Token estimator

Rough heuristic: ~4 characters ≈ 1 token for Latin text (indicative only).

Input: pasted prompt or completion text. Outputs: estimated token count and cost at the primary model.
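The heuristic above amounts to a character count divided by four. The sketch below is only that rough estimate, with an assumed function name; a real tokenizer will give different numbers, especially for code or non-Latin scripts.

```python
import math

CHARS_PER_TOKEN = 4  # rough heuristic for Latin-script text, indicative only


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


sample = "Summarize the following support ticket in two sentences."
print(len(sample), estimate_tokens(sample))  # 56 14
```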

API budget planner

Set a monthly cap to see how many identical requests fit (primary model).

Input: monthly budget (USD). Output: approximate maximum number of requests.
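A budget cap converts to a request count by dividing it by the per-request cost and rounding down. The following is a sketch with assumed names and illustrative numbers; rates are USD per 1K tokens.

```python
import math


def max_requests(monthly_budget: float, prompt_tokens: int, completion_tokens: int,
                 input_rate: float, output_rate: float) -> int:
    """Approximate number of identical requests that fit in a monthly budget."""
    per_request = (prompt_tokens / 1000 * input_rate
                   + completion_tokens / 1000 * output_rate)
    if per_request <= 0:
        raise ValueError("per-request cost must be positive")
    return math.floor(monthly_budget / per_request)


# Hypothetical: $500 per month, 1,200 prompt and 400 completion tokens per request.
print(max_requests(500, 1200, 400, input_rate=0.0025, output_rate=0.0100))  # 71428
```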

Prompt optimization analyzer

Collapse whitespace and tighten wording to preview savings at the primary model.

Input: draft prompt. Outputs: a suggested shorter form, the token delta, and estimated savings per 1,000 calls.
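The whitespace part of this analysis can be approximated as below. This is a sketch, not the page's implementation: it only collapses whitespace (it does not tighten wording), it estimates tokens with the rough 4-characters-per-token heuristic, and all names are assumptions.

```python
import math
import re


def collapse_whitespace(prompt: str) -> str:
    """Replace every run of whitespace with one space and trim the ends."""
    return re.sub(r"\s+", " ", prompt).strip()


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token."""
    return math.ceil(len(text) / 4)


def savings_per_1k_calls(draft: str, input_rate: float):
    """Return (shorter_prompt, token_delta, usd_saved_per_1000_calls).

    input_rate is USD per 1K tokens, so the saving over 1,000 calls is
    token_delta / 1000 * input_rate * 1000 = token_delta * input_rate.
    """
    shorter = collapse_whitespace(draft)
    delta = estimate_tokens(draft) - estimate_tokens(shorter)
    return shorter, delta, delta * input_rate


draft = "You are a support agent.\n\n\n    Answer   in  two   sentences.   "
shorter, delta, saved = savings_per_1k_calls(draft, input_rate=0.0025)
print(repr(shorter), delta, f"${saved:.4f}")
```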

Fine-tuning cost sketch

Order-of-magnitude helper: training tokens × epochs × rate + storage.

Inputs: training tokens (billions), epochs, USD per 1M training tokens, checkpoint storage (GB), storage USD per GB per month. Output: estimated training cost plus one month of storage.
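The order-of-magnitude formula can be written out as follows. The function and parameter names are assumptions, and the example prices are placeholders rather than any provider's actual fine-tuning rates.

```python
def fine_tune_cost(training_tokens_billions: float, epochs: int,
                   usd_per_million_tokens: float,
                   storage_gb: float, storage_usd_per_gb_month: float,
                   months: int = 1) -> float:
    """Training tokens x epochs x rate, plus checkpoint storage, in USD."""
    # Convert billions of tokens to millions, then count every epoch.
    million_tokens_trained = training_tokens_billions * 1000 * epochs
    training = million_tokens_trained * usd_per_million_tokens
    storage = storage_gb * storage_usd_per_gb_month * months
    return training + storage


# Hypothetical: 0.5B tokens, 3 epochs, $8 per 1M tokens, 20 GB at $0.10 per GB-month.
print(f"${fine_tune_cost(0.5, 3, 8.0, 20, 0.10):,.2f}")  # $12,002.00
```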

Team usage calculator

Multiply per-person daily volume by team size (primary model).

Inputs: team members, requests per person per day. Output: team monthly cost, assuming 22 working days.
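Team cost is a straight multiplication over the 22-day month used here. The sketch below assumes a per-request cost has already been computed; the names and example values are illustrative.

```python
WORKING_DAYS = 22  # working days per month, as in the team figure above


def team_monthly_cost(members: int, requests_per_person_per_day: int,
                      cost_per_request: float,
                      working_days: int = WORKING_DAYS) -> float:
    """Team size x daily requests per person x working days x cost per request."""
    return members * requests_per_person_per_day * working_days * cost_per_request


# Hypothetical: 12 people, 40 requests each per day, $0.007 per request.
team_cost = team_monthly_cost(12, 40, 0.007)
print(f"${team_cost:,.2f}")  # $73.92
```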

Cost per feature

Price a single product surface (e.g., one chat turn or one generated article).

Inputs: feature label, uses per day. Each use is priced with the calculator's prompt and completion tokens for one invocation. Outputs: cost per use and monthly cost at that cadence.

Share & export

Serialize inputs in the URL hash or copy a text summary.
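One plausible way to serialize inputs into a URL hash is a query-string style fragment. The key names and the fragment layout below are hypothetical; the page does not document its actual format.

```python
from urllib.parse import parse_qsl, urlencode


def to_hash(inputs: dict) -> str:
    """Encode calculator inputs as a URL fragment such as '#model=...&prompt=...'."""
    return "#" + urlencode(inputs)


def from_hash(fragment: str) -> dict:
    """Decode a fragment back into a dict; all values come back as strings."""
    return dict(parse_qsl(fragment.lstrip("#")))


# Hypothetical input names, chosen for illustration only.
state = {"model": "gpt-4o", "prompt": 1200, "completion": 400, "requests": 5000}
fragment = to_hash(state)
print(fragment)  # #model=gpt-4o&prompt=1200&completion=400&requests=5000
print(from_hash(fragment))
```

Because decoded values are strings, numeric fields need to be converted back before they are used in cost calculations.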

Calculation history

Stored in your browser only (LocalStorage).

Primary results

Outputs: cost per request, input share, output share, total for the batch, and the monthly and yearly projections from the simulator.

Comparison table

Columns: model, cost per request ($/req), batch total.

Currency note

Exchange rates are static snapshots for display only, not live market data. USD is the base currency in app.js; adjust as needed.