Comparison · Value

Which AI API provider is cheapest?

“Cheapest” is undefined without a workload. A flash-tier model with thrifty prompts can beat a flagship model that produces verbose answers.

This guide helps you answer the question for your data—not for marketing headlines.

Value-tier rate snapshot

Model	Provider	Input	Output
DeepSeek Chat	DeepSeek	$0.0001 / 1K in	$0.0003 / 1K out
GPT-4o mini	OpenAI	$0.0002 / 1K in	$0.0006 / 1K out
Gemini 2.5 Flash	Google	$0.0001 / 1K in	$0.0003 / 1K out
Claude 3.5 Haiku	Anthropic	$0.0008 / 1K in	$0.0040 / 1K out

Hidden costs beyond list price

Retries, support time, compliance tooling, and downtime can dwarf token savings, so include them in total cost of ownership (TCO).

How to compare fairly

Freeze an evaluation set, measure tokens and accuracy per model, multiply tokens by rates, and add rework costs.
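
A minimal Python sketch of that procedure, assuming tokens and accuracy have already been measured on the frozen set. The model names, token counts, accuracy figures, and the flat rework cost per failed answer are hypothetical placeholders; the rates echo two rows of the snapshot table.

```python
from dataclasses import dataclass


@dataclass
class ModelRun:
    name: str
    prompt_tokens: float      # mean prompt tokens per request on the frozen set
    completion_tokens: float  # mean completion tokens per request
    accuracy: float           # fraction of evaluation items judged correct (0..1)
    price_in_per_1k: float    # USD per 1K input tokens
    price_out_per_1k: float   # USD per 1K output tokens


def effective_cost(run: ModelRun, rework_cost_usd: float) -> float:
    """Token cost per request plus the expected cost of reworking failures."""
    token_cost = (run.prompt_tokens / 1000 * run.price_in_per_1k
                  + run.completion_tokens / 1000 * run.price_out_per_1k)
    return token_cost + (1 - run.accuracy) * rework_cost_usd


# Hypothetical measurements; replace with your own evaluation results.
runs = [
    ModelRun("model-a", 800, 300, 0.90, 0.0001, 0.0003),
    ModelRun("model-b", 800, 250, 0.97, 0.0008, 0.0040),
]
REWORK_COST_USD = 0.05  # assumed cost of fixing one wrong answer

ranked = sorted(runs, key=lambda r: effective_cost(r, REWORK_COST_USD))
for run in ranked:
    print(f"{run.name}: ${effective_cost(run, REWORK_COST_USD):.5f} per request")
```

With these made-up inputs the model with the higher list price ranks first once rework is counted, which is the kind of reversal a list-price comparison hides.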

Optimization beats vendor hopping

Ten percent fewer output tokens often saves more than switching logos—optimize prompts first.

When ultra-cheap models fail

High-stakes workflows—finance, safety, regulated advice—may need premium models regardless of list price.

FAQ: Cheapest LLM APIs

Short answers mirror the structured data on this page for search engines and readers.

Is DeepSeek always the cheapest? On many text workloads it has the lowest list price, but verify output length and compliance fit.
What about self-hosted models? Compare fully loaded GPU and staffing costs, not just electricity.
Do batch endpoints change the answer? They can materially lower the effective price if the added latency is acceptable.
How often should we re-evaluate? Quarterly, or whenever a vendor changes pricing or you launch a major new feature.

Find the cheapest configured model for your mix

Sort comparison rows after entering your real prompt and completion medians.

Prefilled for this page’s scenario. Pricing loads from config/models.php and /api/pricing.

Calculator

Cost = (prompt ÷ 1000 × P_in) + (completion ÷ 1000 × P_out), then × requests.
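
The formula can be sketched in Python as follows. The function name and the example workload are illustrative, and treating "input share" and "output share" as fractions of the per-request cost is an assumption about how the page computes them.

```python
def cost_breakdown(prompt_tokens, completion_tokens, p_in, p_out, requests=1):
    """Apply the calculator formula; p_in and p_out are USD per 1K tokens."""
    input_cost = prompt_tokens / 1000 * p_in
    output_cost = completion_tokens / 1000 * p_out
    per_request = input_cost + output_cost
    return {
        "per_request": per_request,
        # Assumed definition: each side's fraction of the per-request cost.
        "input_share": input_cost / per_request if per_request else 0.0,
        "output_share": output_cost / per_request if per_request else 0.0,
        "total": per_request * requests,
    }


# Hypothetical workload priced at the GPT-4o mini snapshot rates.
result = cost_breakdown(1200, 400, p_in=0.0002, p_out=0.0006, requests=1000)
print(f"per request: ${result['per_request']:.5f}, batch: ${result['total']:.2f}")
```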

Inputs: primary model, prompt tokens, completion tokens, requests, currency, and usage presets.

Multi-model comparison

Toggle models to compare the same workload. The cheapest option is highlighted.

Monthly cost simulator

Project from average daily requests (uses tokens above).

Inputs: average requests per day and working days per month.

Uses primary model rates for projections.
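
A rough sketch of the projection, assuming the monthly figure is simply cost per request × daily requests × working days, and that the yearly figure is twelve such months (the page does not state how it derives the yearly number). The inputs are hypothetical.

```python
def project_costs(cost_per_request, requests_per_day, working_days_per_month):
    """Return (monthly, yearly) cost projections in USD."""
    monthly = cost_per_request * requests_per_day * working_days_per_month
    yearly = monthly * 12  # assumption: twelve identical months
    return monthly, yearly


# Hypothetical: $0.00048 per request, 5,000 requests a day, 22 working days.
monthly, yearly = project_costs(0.00048, 5000, 22)
print(f"monthly: ${monthly:.2f}, yearly: ${yearly:.2f}")
```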

Token estimator

Rough heuristic: ~4 characters ≈ 1 token for Latin text (indicative only).
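
As a sketch, the heuristic amounts to dividing the character count by four. Rounding up is an assumption made here so that any non-empty text counts as at least one token; real tokenizers vary by model and language, so treat the result as indicative only.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate tokens from character count (~4 characters per token)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


sample = "Summarize the attached report in three bullet points."
print(estimate_tokens(sample))
```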

API budget planner

Set a monthly cap to see how many identical requests fit (primary model).
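
A minimal sketch of the budget calculation, assuming every request costs the same and that the count is rounded down to whole requests. The function name and sample figures are illustrative.

```python
import math


def max_requests(monthly_budget_usd: float, cost_per_request: float) -> int:
    """How many identical requests fit under a monthly cap."""
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    return math.floor(monthly_budget_usd / cost_per_request)


# Hypothetical: a $100 cap at $0.00048 per request.
print(max_requests(100, 0.00048))
```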

Prompt optimization analyzer

Collapse whitespace and tighten wording to preview savings at the primary model.
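
The whitespace half of this can be sketched as below; tightening the wording needs human or model judgment and is not attempted. The token count reuses the rough 4-characters-per-token heuristic, and the input rate in the example is a hypothetical USD-per-1K figure.

```python
import math
import re


def collapse_whitespace(prompt: str) -> str:
    """Replace every run of whitespace with one space and trim the ends."""
    return re.sub(r"\s+", " ", prompt).strip()


def rough_tokens(text: str) -> int:
    """Indicative token count: ~4 characters per token, rounded up."""
    return math.ceil(len(text) / 4)


def preview_savings(prompt: str, p_in_per_1k: float):
    """Return (shorter prompt, token delta, estimated USD saved per 1,000 calls)."""
    shorter = collapse_whitespace(prompt)
    delta = rough_tokens(prompt) - rough_tokens(shorter)
    # delta tokens per call * 1,000 calls / 1,000 tokens * rate = delta * rate
    return shorter, delta, delta * p_in_per_1k


draft = "Please   summarize\n\n\n  the   report.   "
shorter, delta, saved = preview_savings(draft, p_in_per_1k=0.0002)
print(repr(shorter), delta, saved)
```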

Fine-tuning cost sketch

Order-of-magnitude helper: training tokens × epochs × rate + storage.
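
A sketch of that arithmetic, using the units shown in the form (training tokens in billions, rate in USD per 1M training tokens, storage in GB at a monthly USD per GB rate) and one month of storage. The sample values are hypothetical and not any vendor's actual prices.

```python
def fine_tuning_estimate(train_tokens_billions, epochs, usd_per_1m_tokens,
                         storage_gb, storage_usd_per_gb_month):
    """Order-of-magnitude cost: training plus one month of checkpoint storage."""
    million_tokens = train_tokens_billions * 1000  # 1 billion = 1,000 million
    training = million_tokens * epochs * usd_per_1m_tokens
    storage = storage_gb * storage_usd_per_gb_month  # one month only
    return training + storage


# Hypothetical: 0.5B tokens, 3 epochs, $8 per 1M tokens, 20 GB at $0.10/GB/month.
estimate = fine_tuning_estimate(0.5, 3, 8.0, 20, 0.10)
print(f"${estimate:,.2f}")
```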

Inputs: training tokens (billions), epochs, USD per 1M training tokens, checkpoint storage (GB), and storage USD per GB per month. Output: estimated training cost plus one month of storage.

Team usage calculator

Multiply per-person daily volume by team size (primary model).
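
As a sketch, the team figure is the per-request cost scaled by headcount, daily volume, and working days. The default of 22 days follows the "22d" label on this page's output; the other values are hypothetical.

```python
def team_monthly_cost(team_members, requests_per_person_per_day,
                      cost_per_request, working_days=22):
    """Monthly cost for a team that all use the primary model."""
    monthly_requests = team_members * requests_per_person_per_day * working_days
    return monthly_requests * cost_per_request


# Hypothetical: 12 people, 40 requests each per day, $0.00048 per request.
team_cost = team_monthly_cost(12, 40, 0.00048)
print(f"${team_cost:.2f}")
```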

Inputs: team members and requests per person per day. Output: team monthly cost over 22 working days.

Cost per feature

Price a single product surface (e.g., one chat turn or one generated article).

Inputs: feature label and uses per day.

Uses prompt & completion tokens from the calculator for one invocation.

Outputs: cost per use and monthly cost at that cadence.

Share & export

Serialize inputs in the URL hash or copy a text summary.

Calculation history

Stored in your browser only (LocalStorage).

Primary results

Cost per request, input share, output share, total (batch), monthly (simulator), and yearly (simulator).

Currency note

FX rates are static snapshots for UX (not trading data). USD is the base in app.js; adjust as needed.