Document processing AI cost guide
Document AI stacks combine OCR, chunking, embeddings, and LLM summarization. Tokens accumulate across stages, so isolate each stage in your ledger.
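One way to isolate stages is a small ledger keyed by stage name. The sketch below is illustrative only: the `StageLedger` class, its method names, and the stage labels are hypothetical, and the token counts are made-up inputs rather than measurements.

```python
from collections import defaultdict


class StageLedger:
    """Accumulates token counts per pipeline stage (hypothetical helper)."""

    def __init__(self):
        self._totals = defaultdict(lambda: {"input": 0, "output": 0})

    def record(self, stage, input_tokens, output_tokens=0):
        # Add one call's usage to the running totals for its stage.
        self._totals[stage]["input"] += input_tokens
        self._totals[stage]["output"] += output_tokens

    def summary(self):
        # Return plain dicts so the result is easy to log or serialize.
        return {stage: dict(counts) for stage, counts in self._totals.items()}


ledger = StageLedger()
ledger.record("embeddings", input_tokens=8_000)
ledger.record("summarization", input_tokens=12_000, output_tokens=900)
ledger.record("summarization", input_tokens=3_000, output_tokens=250)
print(ledger.summary())
```

Keeping embeddings and summarization in separate rows makes it possible to price each stage at its own rate later.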
Token patterns in document pipelines
Long PDFs can span multiple LLM calls if you chunk. Each chunk repeats the instruction template, so that overhead is paid once per call; keep chunk templates short.
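The effect of template length can be estimated with simple arithmetic. This is a rough sketch assuming non-overlapping chunks and a fixed template size; the function name and the example numbers (a 60,000-token PDF, 4,000-token chunks) are hypothetical.

```python
import math


def chunked_prompt_tokens(doc_tokens, chunk_size, template_tokens):
    """Total prompt tokens when each chunk is sent with the same template."""
    n_chunks = math.ceil(doc_tokens / chunk_size)
    return doc_tokens + n_chunks * template_tokens


# Same document and chunk size, two template lengths.
short_template = chunked_prompt_tokens(60_000, 4_000, template_tokens=150)
long_template = chunked_prompt_tokens(60_000, 4_000, template_tokens=800)
print(short_template, long_template)
```

With 15 chunks, the longer template adds 12,000 prompt tokens of pure overhead versus 2,250 for the shorter one.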
Processing scenarios
| Scenario | Prompt tokens | Output tokens | Model (est.) | Cost / request |
|---|---|---|---|---|
| Contract clause map | 12000 | 900 | Claude 3.5 Sonnet | $0.0495 |
| Invoice field extract | 3500 | 400 | GPT-4o | $0.0128 |
| Daily news digest | 6000 | 500 | gemini-2.5-flash | $0.0006 |
Figures use rates from config/models.php; confirm against your provider before billing decisions.
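The per-request figures follow from a simple rate formula. The sketch below assumes per-million-token rates of $3.00 input / $15.00 output for Claude 3.5 Sonnet and $2.50 input / $10.00 output for GPT-4o. These rates are assumptions chosen because they reproduce the first two table rows; the actual contents of config/models.php are not shown on this page.

```python
def request_cost(prompt_tokens, output_tokens, input_rate_per_m, output_rate_per_m):
    """Cost in USD for one request, given per-million-token rates."""
    return (prompt_tokens * input_rate_per_m
            + output_tokens * output_rate_per_m) / 1_000_000


# Assumed (input, output) USD rates per million tokens; verify with your provider.
ASSUMED_RATES = {
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "GPT-4o": (2.50, 10.00),
}

contract = request_cost(12_000, 900, *ASSUMED_RATES["Claude 3.5 Sonnet"])
invoice = request_cost(3_500, 400, *ASSUMED_RATES["GPT-4o"])
print(contract, invoice)
```

Under these assumptions the contract row comes to $0.0495 and the invoice row to $0.01275, which the table shows as $0.0128.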
Monthly estimates
- Back-office batch: 350 large docs per weekday.
  - Per request: $0.0383
  - Monthly (350 req/day × 22 days): $294.53
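The monthly figure is requests per day times working days times cost per request. As a quick check, multiplying the displayed $0.0383 by 7,700 requests gives about $294.91, slightly above $294.53. That gap is consistent with the total having been computed from an unrounded per-request cost near $0.03825, though this is an inference, not something the page states.

```python
def monthly_cost(cost_per_request, requests_per_day, days):
    """Monthly spend in USD for a steady daily volume."""
    return cost_per_request * requests_per_day * days


total_requests = 350 * 22
from_rounded = monthly_cost(0.0383, 350, 22)

# Per-request cost implied by the published monthly total.
implied_per_request = 294.53 / total_requests
print(total_requests, round(from_rounded, 2), round(implied_per_request, 5))
```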
Infrastructure considerations
Object storage, OCR vendors, and GPU preprocessing belong in the same business case as LLM tokens.
Model recommendations
Use flash/mini tiers for first-pass extraction and premium models for adjudication.
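A tiered setup can be budgeted as a cheap pass on every document plus a premium pass on the fraction that gets escalated. The sketch below is a minimal model of that idea; the confidence threshold, the 20% escalation rate, and the function names are hypothetical, and the two per-call costs are borrowed from the scenario table purely as illustrative inputs.

```python
def route(confidence, threshold=0.85):
    """Decide whether a first-pass result needs premium adjudication."""
    return "accept_first_pass" if confidence >= threshold else "escalate_to_premium"


def blended_cost(cheap_cost, premium_cost, escalation_rate):
    """Expected cost per document: every doc pays the cheap pass,
    and a fraction `escalation_rate` also pays the premium pass."""
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return cheap_cost + escalation_rate * premium_cost


expected = blended_cost(cheap_cost=0.0006, premium_cost=0.0495, escalation_rate=0.2)
print(route(0.92), route(0.40), expected)
```

The escalation rate is the number to measure during a pilot, since it dominates the blended cost.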
Optimization recommendations
Deduplicate documents, store intermediate JSON, and avoid re-sending unchanged sections.
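A content hash is one simple way to skip unchanged sections and reuse stored JSON. The sketch below keeps results in memory; `ExtractionCache` and `fake_extract` are hypothetical names, and `fake_extract` stands in for a real model call.

```python
import hashlib
import json


class ExtractionCache:
    """Caches extraction results as JSON, keyed by a hash of the section text."""

    def __init__(self):
        self._store = {}
        self.model_calls = 0

    def get_or_extract(self, section_text, extract_fn):
        key = hashlib.sha256(section_text.encode("utf-8")).hexdigest()
        if key not in self._store:
            # Only unseen content reaches the (billable) extractor.
            self.model_calls += 1
            self._store[key] = json.dumps(extract_fn(section_text))
        return json.loads(self._store[key])


def fake_extract(text):
    # Stand-in for an LLM extraction call.
    return {"length": len(text)}


cache = ExtractionCache()
sections = ["Clause 1: term", "Clause 2: fees", "Clause 1: term"]
results = [cache.get_or_extract(s, fake_extract) for s in sections]
print(cache.model_calls, results)
```

In production the store would typically live in a database or object storage rather than a dict, but the keying logic is the same.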
ROI examples
Compare manual review hours avoided versus model cost—legal and finance teams often have ready hourly benchmarks.
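The comparison reduces to labor savings minus model cost. In the sketch below, the 40 hours avoided and the $60 hourly rate are hypothetical placeholders; the model cost reuses the monthly estimate from this page as an example input.

```python
def monthly_roi(hours_avoided, hourly_rate, model_cost):
    """Compare labor savings against model spend for one month."""
    labor_savings = hours_avoided * hourly_rate
    net_benefit = labor_savings - model_cost
    return {
        "labor_savings": labor_savings,
        "net_benefit": net_benefit,
        "roi_multiple": net_benefit / model_cost,
    }


roi = monthly_roi(hours_avoided=40, hourly_rate=60.0, model_cost=294.53)
print(roi)
```

A fuller business case would add OCR, storage, and engineering time to `model_cost`.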
Budget guidance
Pilot with stratified samples across document types so medians reflect messy real-world scans.
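A stratified pilot can be as simple as sampling a fixed number of documents per type and reporting the median token count for each. The sketch below uses only the standard library; the document types, token counts, and function names are hypothetical.

```python
import random
from statistics import median


def stratified_sample(docs_by_type, per_type, seed=0):
    """Draw up to `per_type` documents from each document type."""
    rng = random.Random(seed)
    return {
        doc_type: rng.sample(docs, min(per_type, len(docs)))
        for doc_type, docs in docs_by_type.items()
    }


def median_tokens_by_type(sample):
    """Median token count per document type."""
    return {t: median(d["tokens"] for d in docs) for t, docs in sample.items()}


corpus = {
    "clean_pdf": [{"tokens": t} for t in (3_000, 3_500, 4_000)],
    "scanned": [{"tokens": t} for t in (5_000, 9_000, 14_000)],
}
pilot = stratified_sample(corpus, per_type=2)
print(median_tokens_by_type(pilot))
```

Reporting medians per type, rather than one pooled median, keeps clean PDFs from masking the cost of messy scans.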
FAQ: Document AI pricing
- Do embeddings count as LLM tokens?
  - Embedding models have their own pricing; track them separately from chat completions.
- How does chunk overlap affect cost?
  - Overlap duplicates tokens across calls, so minimize it while preserving enough context.
- What about handwritten notes?
  - Handwriting lowers OCR quality, which drives retries; budget for higher variance in completion tokens.
- Can on-device OCR reduce bills?
  - It can reduce cloud OCR fees but may shift cost to engineering; compare total cost, not just the OCR line item.
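The overlap question above can be quantified. The sketch below assumes a sliding window with a fixed chunk size and overlap, and counts document tokens only (template overhead is excluded); the function name and example sizes are hypothetical.

```python
import math


def overlapped_tokens(doc_tokens, chunk_size, overlap):
    """Total document tokens sent when consecutive chunks share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    if doc_tokens <= chunk_size:
        return doc_tokens
    stride = chunk_size - overlap
    n_chunks = 1 + math.ceil((doc_tokens - chunk_size) / stride)
    # Every chunk after the first re-sends `overlap` tokens already sent.
    return doc_tokens + (n_chunks - 1) * overlap


no_overlap = overlapped_tokens(10_000, 2_000, overlap=0)
small = overlapped_tokens(10_000, 2_000, overlap=200)
large = overlapped_tokens(10_000, 2_000, overlap=500)
print(no_overlap, small, large)
```

For this 10,000-token example, a 200-token overlap adds 10% to the tokens sent and a 500-token overlap adds 30%.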