Chatbot API cost calculator

Customer-facing chatbots mix short user messages with variable-length answers, tool calls, and sometimes retrieved documents. That variability makes “one number” pricing misleading unless you anchor estimates to measured token histograms.

This page walks through realistic chatbot token patterns, shows how they translate into daily and monthly bills, and links to optimization tactics your team can implement without hurting UX.

Use the calculator with the embedded preset tuned for support-style chats, then adjust completion caps to match your brand voice.

Expected token usage patterns

Most production chatbots spend input tokens on system instructions, the latest user turn, and a sliding window of recent history. Retrieval adds chunks—often the biggest surprise when finance first reviews logs.

Output tokens track answer length, empathetic padding, and structured payloads such as JSON for buttons. Streaming does not reduce billed tokens compared with non-streaming responses; it only changes how the output is delivered.
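To make the input side concrete, the sketch below adds up the components named above for a single turn. Every number, and the function name `input_tokens_per_turn`, is a hypothetical chosen for illustration rather than a measurement from this page.

```python
def input_tokens_per_turn(system, user_turn, history, chunk_sizes):
    """Sum the billed input tokens for one turn (all counts are hypothetical)."""
    return system + user_turn + history + sum(chunk_sizes)

# Hypothetical support-style turn, first without retrieval, then with three chunks.
without_retrieval = input_tokens_per_turn(250, 40, 600, [])
with_retrieval = input_tokens_per_turn(250, 40, 600, [300, 300, 300])

print(without_retrieval)  # 890
print(with_retrieval)     # 1790
```

Under these assumed sizes, three 300-token chunks roughly double the input tokens for the turn, which is the kind of jump that tends to show up when retrieval ships.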

Pattern checklist

Measure p50 and p95 prompt sizes per locale.
Separate “free tier” user behavior from paid users.
Account for moderator or safety classifier passes if they call the same model.
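One way to approach the first checklist item is sketched below: group logged prompt sizes by locale and take a nearest-rank p50 and p95. The log rows and the `percentile` helper are assumptions for illustration; a real pipeline would read from your own logs and may use a different percentile definition.

```python
import math
from collections import defaultdict

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical log rows: (locale, prompt_tokens).
rows = [("en", 300), ("en", 420), ("en", 380), ("en", 2600),
        ("de", 500), ("de", 540), ("de", 610), ("de", 3100)]

by_locale = defaultdict(list)
for locale, tokens in rows:
    by_locale[locale].append(tokens)

# Map each locale to (p50, p95) prompt size.
summary = {loc: (percentile(v, 50), percentile(v, 95)) for loc, v in by_locale.items()}
print(summary)
```

The gap between p50 and p95 in this toy data comes from a single long prompt per locale, which is why the median alone can understate cost.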

Example chatbot workloads

Scenario	Prompt tokens	Output tokens	Model (est.)	Cost / request
Retail FAQ	380	140	GPT-4o mini	$0.0001
Technical support with logs	2400	320	GPT-4o	$0.0092
Sales concierge	900	260	Claude 3.5 Haiku	$0.0018

Figures use rates from config/models.php; confirm against your provider before billing decisions.
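The per-request figures in the table can be approximated with a short calculation. The rates below are assumed USD prices per million tokens, not values read from config/models.php; they happen to reproduce the table when rounded to four decimals, but should be checked against your provider.

```python
# Assumed USD rates per million tokens as (input, output); not taken from config/models.php.
RATES = {
    "GPT-4o mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Haiku": (0.80, 4.00),
}

def cost_per_request(model, prompt_tokens, output_tokens):
    """Cost of one request in USD under the assumed rates."""
    rate_in, rate_out = RATES[model]
    return (prompt_tokens * rate_in + output_tokens * rate_out) / 1_000_000

workloads = [
    ("Retail FAQ", "GPT-4o mini", 380, 140),
    ("Technical support with logs", "GPT-4o", 2400, 320),
    ("Sales concierge", "Claude 3.5 Haiku", 900, 260),
]
for name, model, prompt, output in workloads:
    print(f"{name}: ${cost_per_request(model, prompt, output):.4f}")
```

Note that output tokens carry a higher rate than input tokens in each assumed pair, so a modest increase in answer length can matter more than the same increase in prompt length.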

Daily and monthly API estimates

Scenario	Volume	Per request	Monthly
Growth-stage SaaS	8,000 sessions per weekday × 22 days	$0.0002	$35.90
Weekend-heavy consumer app	25,000 shorter turns per day × 30 days	$0.0001	$72.00
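The monthly projection is per-request cost times daily volume times billable days. The sketch below uses the rounded per-request figures shown on this page, so its results differ slightly from the monthly totals listed, which presumably come from unrounded costs. The function name `monthly_cost` is hypothetical.

```python
def monthly_cost(cost_per_request, requests_per_day, billable_days):
    """Project a monthly bill in USD from a per-request cost."""
    return cost_per_request * requests_per_day * billable_days

# Rounded per-request figures from this page; volumes and day counts as stated.
saas = monthly_cost(0.0002, 8_000, 22)       # weekday-only traffic
consumer = monthly_cost(0.0001, 25_000, 30)  # traffic every day

print(f"Growth-stage SaaS: ${saas:.2f}")
print(f"Weekend-heavy consumer app: ${consumer:.2f}")
```

Because the inputs are rounded to four decimals, treat projections like these as order-of-magnitude checks and rerun them with unrounded costs before committing to a budget.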

Infrastructure considerations

Beyond raw tokens, budget for retries, shadow traffic in canary deployments, and duplicate calls from mobile clients with poor connectivity. Observability sinks (logging tokens to your warehouse) also have a cost—usually tiny versus model bills but worth noting.

Model recommendations

Start with a mini or Haiku-class default for breadth, route conversations with frustration signals or high-LTV accounts to larger models, and keep human handoff paths cheap by truncating context intelligently.
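A routing rule along these lines could be as small as the sketch below. The thresholds, the tier labels, and the idea of a numeric frustration score are all assumptions for illustration; how you detect frustration or value an account is up to your own product data.

```python
def pick_model(frustration_score, account_ltv,
               frustration_threshold=0.7, ltv_threshold=10_000):
    """Return a hypothetical model tier for one conversation.

    frustration_score: 0.0 to 1.0, from whatever signal you trust (assumed).
    account_ltv: estimated lifetime value in USD (assumed).
    """
    if frustration_score >= frustration_threshold or account_ltv >= ltv_threshold:
        return "large"
    return "mini"

print(pick_model(0.2, 500))     # mini
print(pick_model(0.9, 500))     # large
print(pick_model(0.1, 25_000))  # large
```

Keeping the rule in one small function makes it easier to log which tier handled each conversation and to compare cost against resolution rate per tier.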

Optimization recommendations

Summarize stale turns instead of sending entire transcripts, store stable persona text server-side, and test tighter max_tokens settings with product design so answers stay concise without feeling curt.
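The first tactic can be sketched as follows: keep the most recent turns verbatim and collapse everything older into a single summary entry. The default summarizer here is only a placeholder string; in practice it would be replaced by a model call or heuristic of your choosing, and `build_context` is a hypothetical name.

```python
def build_context(turns, keep_last=4, summarize=None):
    """Keep the last `keep_last` turns (must be >= 1) and summarize older ones."""
    if summarize is None:
        # Placeholder summarizer (assumption): a real one would condense the text itself.
        summarize = lambda old: f"[summary of {len(old)} earlier turns]"
    if len(turns) <= keep_last:
        return list(turns)
    stale, recent = turns[:-keep_last], turns[-keep_last:]
    return [summarize(stale)] + list(recent)

turns = [f"turn {i}" for i in range(1, 11)]
context = build_context(turns)
print(context)
# ['[summary of 6 earlier turns]', 'turn 7', 'turn 8', 'turn 9', 'turn 10']
```

The saving depends on how short the summary is relative to the turns it replaces, so measure input tokens before and after rather than assuming a fixed reduction.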

ROI estimation examples

If a chatbot deflects twenty percent of tickets costing twelve dollars each, model spend can often be justified even at flagship pricing—provided you measure deflection honestly with QA sampling.
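As a worked version of that example, the sketch below uses the twenty percent deflection rate and twelve dollar ticket cost from the text. The ticket volume and the monthly model spend are hypothetical placeholders to be replaced with your own figures.

```python
def monthly_savings(tickets_per_month, deflection_rate, cost_per_ticket):
    """Support cost avoided per month, in USD, from deflected tickets."""
    return tickets_per_month * deflection_rate * cost_per_ticket

tickets = 10_000        # hypothetical monthly ticket volume
model_spend = 1_500.00  # hypothetical monthly model bill in USD

savings = monthly_savings(tickets, 0.20, 12.00)  # rate and ticket cost from the text
net = savings - model_spend

print(f"Savings: ${savings:,.2f}")  # Savings: $24,000.00
print(f"Net:     ${net:,.2f}")      # Net:     $22,500.00
```

The result is only as good as the deflection rate fed into it, which is why QA sampling of supposedly deflected conversations matters.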

API budget planning guidance

Build a simple spreadsheet: tokens per session × sessions per month × blended rate. Add a ten to twenty percent buffer for growth and experimentation, then reconcile weekly during launch months.
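The same spreadsheet formula can be written as a small function. The token count, session volume, and blended rate below are hypothetical inputs, and the rate is expressed per million tokens as an assumption about how your provider quotes prices.

```python
def monthly_budget(tokens_per_session, sessions_per_month,
                   blended_rate_per_million, buffer=0.15):
    """Return (base, buffered) monthly budget in USD.

    buffer: fractional headroom, e.g. 0.10 to 0.20 for growth and experiments.
    """
    base = tokens_per_session * sessions_per_month * blended_rate_per_million / 1_000_000
    return base, base * (1 + buffer)

# Hypothetical inputs: 1,200 tokens per session, 176,000 sessions, $0.50 per million tokens.
base, with_buffer = monthly_budget(1_200, 176_000, 0.50, buffer=0.15)

print(f"Base:        ${base:.2f}")         # Base:        $105.60
print(f"With buffer: ${with_buffer:.2f}")  # With buffer: $121.44
```

A blended rate hides the difference between input and output pricing, so recompute it whenever the ratio of prompt to completion tokens shifts.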

FAQ: Chatbot API pricing

Short answers mirror the structured data on this page for search engines and readers.

Why did our chatbot bill double after adding retrieval?

Each retrieved chunk adds input tokens. Measure average chunk count and size after launch.

Should we use the same model for free and paid users?

Often no: tier models by SLA and features, but disclose differences ethically if users compare quality.

How do handoffs to humans affect cost?

They reduce future model spend on long threads but may increase CRM storage; track both.

What metric best predicts chatbot cost?

Output tokens per resolved session, paired with resolution rate, not raw message count alone.
