Input vs output token cost

If you only watch prompt size, you will be surprised by invoices—output tokens often drive the majority of spend on assistant-style workloads.

Token calculation explanation

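Input tokens are everything you send before generation: the system prompt, conversation history, retrieved context, and the user message. Output tokens are everything the model emits, including formatting tokens that your UI may never display.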

Words-to-token examples

A verbose ten-paragraph answer can cost more than a massive prompt with a one-sentence reply—profile both.

Prompt optimization tips

Focus on input when you control large contexts; focus on output when users ask for “detailed” answers.

Token reduction techniques

Cap completions with max_tokens, ask for structured formats instead of free-form prose, and nudge users toward concise modes in the UI.

Context window explanation

Large inputs leave less room for outputs inside the same window—plan generation headroom explicitly.
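As a rough illustration of planning headroom, the sketch below subtracts the prompt size and an optional safety reserve from the window. The function name, the reserve parameter, and the example numbers are hypothetical and not taken from any specific model.

```python
def output_headroom(context_window: int, prompt_tokens: int, reserve: int = 0) -> int:
    """Tokens left for generation after the prompt and an optional safety reserve."""
    remaining = context_window - prompt_tokens - reserve
    # A prompt that already fills the window leaves nothing for output.
    return max(remaining, 0)


# Hypothetical numbers: an 8,000-token window and a 6,500-token prompt.
print(output_headroom(context_window=8_000, prompt_tokens=6_500, reserve=200))  # 1300
```

If the headroom is smaller than the longest answer you expect, trim the prompt or lower the completion cap before the request is sent.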

Real pricing examples

If output price is double input price, a fifty-fifty token split is not a fifty-fifty cost split—weight by rates.
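The weighting can be sketched in a few lines. The function below is illustrative only, and the rates are placeholders chosen so that output costs twice as much as input.

```python
def cost_shares(input_tokens: int, output_tokens: int,
                price_in: float, price_out: float) -> tuple[float, float]:
    """Return (input_share, output_share) of total cost, each between 0 and 1."""
    input_cost = input_tokens * price_in
    output_cost = output_tokens * price_out
    total = input_cost + output_cost
    if total == 0:
        return 0.0, 0.0
    return input_cost / total, output_cost / total


# Fifty-fifty token split, output priced at double the input rate (placeholder rates).
in_share, out_share = cost_shares(1_000, 1_000, price_in=1.0, price_out=2.0)
print(f"input {in_share:.1%}, output {out_share:.1%}")  # input 33.3%, output 66.7%
```

With equal token counts, output accounts for two thirds of the spend, which is why the token split alone is a poor guide to where to optimize.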

Output-heavy vs input-heavy months

Output-heavy assistant
2,000 calls/day, short prompts, long answers.
Per request: $0.0180
Monthly (2,000 req/day × 22 days): $792.00

Input-heavy RAG
600 calls/day, huge prompts, tight answers.
Per request: $0.0325
Monthly (600 req/day × 22 days): $429.00
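The monthly figures follow from multiplying the per-request cost by daily volume and working days. A minimal sketch, with a hypothetical function name and the 22-day month used on this page as the default:

```python
def monthly_cost(cost_per_request: float, requests_per_day: int,
                 working_days: int = 22) -> float:
    """Project monthly spend in USD, rounded to cents."""
    return round(cost_per_request * requests_per_day * working_days, 2)


print(monthly_cost(0.0180, 2_000))  # 792.0  (output-heavy assistant)
print(monthly_cost(0.0325, 600))    # 429.0  (input-heavy RAG)
```

The RAG workload costs more per request but less per month, because volume matters as much as the per-call price.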

FAQ: Input vs output pricing

Short answers mirror the structured data on this page for search engines and readers.

Which side should I optimize first?
Whichever contributes more dollars in your cost breakdown. Measure before guessing.

Do retries double output tokens?
Failed attempts may still bill partial outputs, so monitor failure paths.

Are system prompts input tokens?
Yes. Every instruction you include counts as input.

Does streaming change the split?
No. Streaming changes how tokens are delivered, not how many are billed.

Split input vs output visually

Watch the share bars in results as you change completion length.

Prefilled for this page’s scenario. Pricing loads from config/models.php and /api/pricing.

Calculator

Cost = (prompt tokens ÷ 1000 × P_in) + (completion tokens ÷ 1000 × P_out), then × requests, where P_in and P_out are the input and output prices per 1,000 tokens.
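The formula translates directly into code. This is a sketch rather than the calculator's actual implementation; the function name and the example rates are hypothetical.

```python
def batch_cost(prompt_tokens: int, completion_tokens: int,
               p_in: float, p_out: float, requests: int = 1) -> float:
    """Total cost in USD; p_in and p_out are prices per 1,000 tokens."""
    per_request = (prompt_tokens / 1000) * p_in + (completion_tokens / 1000) * p_out
    return per_request * requests


# Hypothetical rates: $0.002 per 1k input tokens, $0.004 per 1k output tokens.
total = batch_cost(prompt_tokens=500, completion_tokens=1_500,
                   p_in=0.002, p_out=0.004, requests=100)
print(f"{total:.2f}")  # 0.70
```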

Inputs: primary model, prompt tokens, completion tokens, requests, currency, and optional usage presets.

Multi-model comparison

Toggle models to compare the same workload. The cheapest option is highlighted.
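A comparison like this amounts to pricing one workload against several rate pairs and taking the minimum. In the sketch below, the model names and rates are invented placeholders, not real pricing.

```python
# Hypothetical (input, output) prices in USD per 1,000 tokens.
RATES = {
    "model-a": (0.0005, 0.0015),
    "model-b": (0.0030, 0.0060),
    "model-c": (0.0010, 0.0010),
}


def compare(prompt_tokens: int, completion_tokens: int,
            rates: dict[str, tuple[float, float]]) -> tuple[dict[str, float], str]:
    """Return per-request cost for each model and the name of the cheapest one."""
    costs = {
        name: prompt_tokens / 1000 * p_in + completion_tokens / 1000 * p_out
        for name, (p_in, p_out) in rates.items()
    }
    cheapest = min(costs, key=costs.get)
    return costs, cheapest


print(compare(2_000, 500, RATES)[1])  # model-a (input-heavy workload)
print(compare(200, 3_000, RATES)[1])  # model-c (output-heavy workload)
```

The cheapest model changes with the input/output split, so compare models on your real workload rather than on headline rates.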

Monthly cost simulator

Project monthly spend from average requests per day and working days per month, using the token counts above and the primary model's rates.

Token estimator

Rough heuristic: ~4 characters ≈ 1 token for Latin text (indicative only).
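A minimal version of that heuristic is shown below. Rounding up is an assumption of this sketch, and real tokenizers will give different counts, especially for code and non-Latin scripts.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (indicative only)."""
    if not text:
        return 0
    # Round up so that short strings never estimate to zero tokens.
    return math.ceil(len(text) / chars_per_token)


print(estimate_tokens("a" * 400))  # 100
```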

API budget planner

Set a monthly cap to see how many identical requests fit (primary model).
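The arithmetic behind the planner is a floor division of budget by per-request cost. The sketch below uses a hypothetical function name and a hypothetical $500 budget with the $0.0180 per-request figure from the output-heavy example on this page.

```python
import math


def max_requests(monthly_budget_usd: float, cost_per_request: float) -> int:
    """Number of identical requests that fit inside a monthly budget."""
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    return math.floor(monthly_budget_usd / cost_per_request)


print(max_requests(500, 0.0180))  # 27777
```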

Prompt optimization analyzer

Collapse whitespace and tighten wording to preview savings at the primary model.
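The whitespace step is easy to sketch. The code below only collapses whitespace (it does not tighten wording), reuses the rough four-characters-per-token estimate, and uses hypothetical function names.

```python
import math
import re


def collapse_whitespace(prompt: str) -> str:
    """Replace every run of whitespace with one space and trim the ends."""
    return re.sub(r"\s+", " ", prompt).strip()


def token_delta(original: str, shortened: str, chars_per_token: float = 4.0) -> int:
    """Estimated tokens saved, using the rough characters-per-token heuristic."""
    before = math.ceil(len(original) / chars_per_token)
    after = math.ceil(len(shortened) / chars_per_token)
    return before - after


def savings_per_1k_calls(delta_tokens: int, p_in: float) -> float:
    """USD saved over 1,000 calls; p_in is the input price per 1,000 tokens."""
    return delta_tokens / 1000 * p_in * 1000


draft = "Summarize   the report.\n\n\n   Keep it   short.  "
short = collapse_whitespace(draft)
print(short)  # Summarize the report. Keep it short.
print(token_delta(draft, short))
```

Whitespace cleanup is safe for prose prompts, but skip it for content where whitespace carries meaning, such as indented code or preformatted tables.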

Fine-tuning cost sketch

Order-of-magnitude helper: training tokens × epochs × rate + storage.

Inputs: training tokens (billions), epochs, USD per 1M training tokens, checkpoint storage (GB), storage USD per GB per month.
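As a sketch of the fine-tuning estimate, the function below converts billions of tokens to millions, applies the per-million training rate for each epoch, and adds one month of checkpoint storage. All numbers in the example are hypothetical.

```python
def fine_tune_estimate(training_tokens_billions: float, epochs: int,
                       usd_per_million_tokens: float,
                       storage_gb: float, storage_usd_per_gb_month: float) -> float:
    """Order-of-magnitude cost: training plus one month of checkpoint storage."""
    million_tokens = training_tokens_billions * 1_000  # 1 billion = 1,000 million
    training = million_tokens * epochs * usd_per_million_tokens
    storage = storage_gb * storage_usd_per_gb_month
    return training + storage


# Hypothetical: 0.5B tokens, 3 epochs, $8 per 1M tokens, 20 GB at $0.10 per GB per month.
print(f"{fine_tune_estimate(0.5, 3, 8.0, 20, 0.10):.2f}")  # 12002.00
```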

Team usage calculator

Multiply per-person daily volume by team size and 22 working days, priced at the primary model's per-request cost.

Inputs: team members, requests per person per day.

Cost per feature

Price a single product surface (e.g., one chat turn or one generated article).

Inputs: feature label, uses per day. Each use is priced with the prompt and completion tokens from the calculator.

Share & export

Serialize inputs in the URL hash or copy a text summary.
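One way to picture the URL-hash approach is a query-string encoding of the inputs placed after the `#`. The field names below are hypothetical and the page's actual hash format is not documented here, so treat this as an illustration of the idea only.

```python
from urllib.parse import parse_qsl, urlencode


def to_hash(inputs: dict) -> str:
    """Encode calculator inputs as a URL fragment such as '#prompt=500&requests=100'."""
    return "#" + urlencode(inputs)


def from_hash(fragment: str) -> dict:
    """Decode a fragment back into a dict; all values come back as strings."""
    return dict(parse_qsl(fragment.lstrip("#")))


# Hypothetical field names.
fragment = to_hash({"model": "model-a", "prompt": 500, "completion": 1500, "requests": 100})
print(fragment)
print(from_hash(fragment))
```

Because decoded values are strings, numeric fields need to be converted and validated before they are used in a cost calculation.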

Calculation history

Stored only in your browser (localStorage).

Primary results

Reported values: cost per request, input share, output share, batch total, and the simulator's monthly and yearly projections.

Currency note

FX rates are static snapshots for display convenience, not live market data. USD is the base currency in app.js; adjust as needed.