Cheapest AI APIs in 2025: Full Price and Value Comparison

A full pricing matrix for 30+ AI models — input cost, output cost, and a value score combining price with benchmark performance. Essential reading for developers choosing models for production applications.

Travis Johnson

Founder, Deepest

August 28, 2025 · 10 min read

AI API pricing spans three orders of magnitude, from under $0.10 per million tokens to $75 per million tokens. DeepSeek V3 and Gemini 2.0 Flash Lite offer the best price-performance ratio; GPT-4o mini and Claude 3.5 Haiku are the best value from major US providers.

Understanding AI API Pricing

AI APIs charge separately for input tokens (the text you send) and output tokens (the text the model generates). Output tokens typically cost 3–5x as much as input tokens because each one is generated sequentially, which requires more computation. As a result, tasks that need long responses cost relatively more than tasks that need long inputs.
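As a rough illustration of that asymmetry, the sketch below prices two requests at the GPT-4o rates quoted in this article ($2.50/M input, $10.00/M output). The function name and the token counts are illustrative assumptions, not part of any provider's API.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request, given prices in dollars per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# GPT-4o list prices as quoted in this article (USD per million tokens).
GPT4O_IN, GPT4O_OUT = 2.50, 10.00

# Hypothetical workloads: same total token count, opposite shape.
summarize = request_cost(4_000, 500, GPT4O_IN, GPT4O_OUT)  # long input, short output
generate = request_cost(500, 4_000, GPT4O_IN, GPT4O_OUT)   # short input, long output

print(f"summarize: ${summarize:.5f}")
print(f"generate:  ${generate:.5f}")
```

With these assumed token counts, the output-heavy request costs a little under three times as much as the input-heavy one, even though both move the same number of tokens.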

Full Pricing Matrix (April 2026)

| Model | Provider | Input ($/M tokens) | Output ($/M tokens) | Value Score* |
| --- | --- | --- | --- | --- |
| Gemini 2.0 Flash Lite | Google | $0.075 | $0.30 | ⭐⭐⭐⭐⭐ |
| DeepSeek V3 | DeepSeek AI | $0.27 | $1.10 | ⭐⭐⭐⭐⭐ |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | ⭐⭐⭐⭐⭐ |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | ⭐⭐⭐⭐ |
| Claude 3.5 Haiku | Anthropic | $0.80 | $4.00 | ⭐⭐⭐⭐ |
| Mistral Small | Mistral AI | $0.20 | $0.60 | ⭐⭐⭐⭐ |
| Llama 3.3 70B (Together.ai) | Meta / Together | $0.60 | $0.90 | ⭐⭐⭐⭐ |
| DeepSeek R1 | DeepSeek AI | $0.55 | $2.19 | ⭐⭐⭐⭐ (reasoning) |
| Mistral Large 2 | Mistral AI | $2.00 | $6.00 | ⭐⭐⭐ |
| GPT-4o | OpenAI | $2.50 | $10.00 | ⭐⭐⭐ |
| Claude 3.5 Sonnet | Anthropic | $3.00 | $15.00 | ⭐⭐⭐ |
| Llama 4 Maverick (Together.ai) | Meta / Together | $0.27 | $0.85 | ⭐⭐⭐⭐ |
| Gemini 2.0 Pro | Google | $10.00 | $30.00 | ⭐⭐ (specialty only) |
| Claude 4 Opus | Anthropic | $15.00 | $75.00 | ⭐⭐ (specialty only) |
| o3 | OpenAI | $10.00 | $40.00 | ⭐⭐ (reasoning specialty) |
| o4-mini | OpenAI | $1.10 | $4.40 | ⭐⭐⭐⭐ (reasoning) |
| GPT-5 | OpenAI | $7.50 | $30.00 | ⭐⭐⭐ (frontier) |

*Value Score = capability relative to cost. ⭐⭐⭐⭐⭐ = exceptional value; ⭐ = expensive relative to capability.

The Best Value Models by Use Case

Best for High-Volume Text Processing

Gemini 2.0 Flash Lite ($0.075/M input) is the cheapest capable model in this comparison. At this price, processing 1 billion input tokens costs $75, which is affordable for large-scale applications. Quality is below frontier models but sufficient for categorization, extraction, and simple generation tasks.

Best for Frontier Quality at Low Cost

DeepSeek V3 ($0.27/M input) achieves near-GPT-4o performance at one-ninth the price. For applications needing high quality without requiring the absolute frontier, DeepSeek V3 offers the best quality-to-cost ratio. Caveat: Chinese origin and data sovereignty considerations apply.

Best Value from a US Provider

GPT-4o mini ($0.15/M input) or Claude 3.5 Haiku ($0.80/M input). GPT-4o mini is cheaper but slightly lower quality. Claude 3.5 Haiku costs about 5x as much as GPT-4o mini on input ($0.80 vs. $0.15) and nearly 7x as much on output ($4.00 vs. $0.60), but maintains Anthropic's safety and reliability characteristics.

Best for Reasoning Tasks

o4-mini ($1.10/M input) offers near-o3 reasoning capability at roughly 9x lower cost ($1.10 vs. $10.00 per million input tokens, $4.40 vs. $40.00 per million output tokens). For math, logic, and complex coding tasks where you need reasoning model quality but can't justify o3 pricing, o4-mini is the best option.

Practical Cost Calculations

To ground these numbers, assume a typical query sends a 1,000-word document (approximately 1,300 input tokens) and receives a response of about 300 output tokens:

| Model | Cost per 1,000-word query | Cost per 1M queries |
| --- | --- | --- |
| Gemini 2.0 Flash Lite | $0.000188 | $188 |
| DeepSeek V3 | $0.000681 | $681 |
| GPT-4o mini | $0.000375 | $375 |
| GPT-4o | $0.00625 | $6,250 |
| Claude 3.5 Sonnet | $0.00840 | $8,400 |
| Claude 4 Opus | $0.0420 | $42,000 |
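The per-query figures follow directly from the pricing matrix. A minimal sketch that reproduces them, assuming the 1,300-input / 300-output token split used above (the dictionary and function names are illustrative):

```python
# Prices in USD per million tokens, copied from the pricing matrix in this article.
PRICES = {
    "Gemini 2.0 Flash Lite": (0.075, 0.30),
    "DeepSeek V3": (0.27, 1.10),
    "GPT-4o mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Claude 4 Opus": (15.00, 75.00),
}

INPUT_TOKENS = 1_300   # roughly a 1,000-word document
OUTPUT_TOKENS = 300    # assumed response length

def cost_per_query(model: str) -> float:
    """USD cost of one query for `model` at the token counts above."""
    input_price, output_price = PRICES[model]
    return (INPUT_TOKENS * input_price + OUTPUT_TOKENS * output_price) / 1_000_000

for model in PRICES:
    per_query = cost_per_query(model)
    per_million = per_query * 1_000_000
    print(f"{model:<24} ${per_query:.6f} per query   ${per_million:,.2f} per 1M queries")
```

Swapping in your own token counts is usually more informative than the defaults, since real workloads rarely match the 1,300 / 300 split.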

Pricing Gotchas to Know

  • Caching discounts: OpenAI and Anthropic offer prompt caching — if you send the same system prompt repeatedly, cached tokens cost 50–90% less. At scale, this significantly reduces costs for applications with fixed system prompts.
  • Batch API discounts: Some providers offer 50% discounts for batch (non-real-time) processing. If you can tolerate 24-hour turnaround, batch is substantially cheaper.
  • Thinking tokens: Reasoning models generate thinking tokens that also cost money, even if not shown to users. o3 and DeepSeek R1 can generate many thinking tokens on hard problems — actual cost may be much higher than list price suggests.
  • Different models for different tasks: Production applications often use cheap models for simple tasks and expensive models for complex ones. This "model tiering" can reduce overall costs by 60–80%.
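To see how these adjustments move the numbers, here is a hedged sketch that applies them to the per-query example from the previous section. Everything in it beyond the list prices is an assumption for illustration: the function and parameter names, the 1,000-token cached system prompt, the 5,000 thinking tokens, the 80/20 tiering split, and the premise that thinking tokens are billed at the output rate. Actual discount rates and whether discounts combine vary by provider.

```python
def effective_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m,
                   cached_tokens=0, cache_discount=0.0,
                   thinking_tokens=0, batch_discount=0.0):
    """USD cost of one request after caching, thinking-token, and batch adjustments.

    cached_tokens   -- part of input_tokens served from the prompt cache
    cache_discount  -- fraction taken off the input price for cached tokens
    thinking_tokens -- hidden reasoning tokens, assumed billed at the output rate
    batch_discount  -- fraction taken off the whole request for batch processing
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    billed_input = fresh_tokens + cached_tokens * (1 - cache_discount)
    billed_output = output_tokens + thinking_tokens
    total = billed_input * input_price_per_m + billed_output * output_price_per_m
    return total * (1 - batch_discount) / 1_000_000

# GPT-4o at list price, then with a cached system prompt, then via a batch API.
baseline = effective_cost(1_300, 300, 2.50, 10.00)
cached = effective_cost(1_300, 300, 2.50, 10.00, cached_tokens=1_000, cache_discount=0.5)
batched = effective_cost(1_300, 300, 2.50, 10.00, batch_discount=0.5)

# o3 with and without a hypothetical 5,000 hidden thinking tokens.
o3_listed = effective_cost(1_300, 300, 10.00, 40.00)
o3_actual = effective_cost(1_300, 300, 10.00, 40.00, thinking_tokens=5_000)

# Model tiering: assume 80% of queries go to GPT-4o mini and 20% to GPT-4o.
mini = effective_cost(1_300, 300, 0.15, 0.60)
tiered = 0.8 * mini + 0.2 * baseline
savings = 1 - tiered / baseline

print(f"GPT-4o baseline ${baseline:.5f}, cached ${cached:.5f}, batched ${batched:.5f}")
print(f"o3 listed ${o3_listed:.3f}, with thinking tokens ${o3_actual:.3f}")
print(f"tiered average ${tiered:.5f} per query, {savings:.0%} below all-GPT-4o")
```

Under these assumptions the tiered setup lands inside the 60–80% savings range mentioned above, while the o3 request with thinking tokens costs several times its apparent list-price cost.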

Frequently Asked Questions

What's the cheapest AI API that's still good?

Gemini 2.0 Flash and GPT-4o mini offer the best quality-to-cost ratio among US providers. DeepSeek V3 offers near-frontier quality at $0.27/M input tokens ($1.10/M output) if you're comfortable with its Chinese origin.

Is paying for a subscription better than API access?

Depends on volume and on how you use the model. ChatGPT Plus ($20/month) and Claude Pro ($20/month) are flat-rate consumer subscriptions with generous but capped quotas, intended for interactive use. If you're processing large volumes of text programmatically, above a few hundred queries per day, API access is the more economical and practical option.

Do prices change frequently?

AI API prices have generally decreased over time, sometimes dramatically. GPT-4 launched in 2023 at $30/M input and $60/M output tokens; in 2025, GPT-4o delivers GPT-4-class capability at $2.50/M input and $10/M output. Prices for established models tend to decrease; new frontier models launch at premium prices.

How can I estimate my monthly API costs?

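Input and output tokens are priced separately, so estimate them separately: multiply your monthly input tokens by the input price and your monthly output tokens by the output price (both per million tokens), then add the two. Add a 20% buffer for variance. Most providers offer usage dashboards and alerts to help track costs in real time.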
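A minimal sketch of that estimate, assuming a hypothetical workload of 10,000 queries per day at 1,300 input and 300 output tokens each, priced at the GPT-4o mini rates from this article (the function name and workload are illustrative):

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float,
                 buffer: float = 0.20) -> float:
    """Estimated monthly spend in USD, with a safety buffer for variance.

    Token counts are monthly totals; prices are USD per million tokens.
    """
    base = (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000
    return base * (1 + buffer)

# Hypothetical workload: 10,000 queries/day over a 30-day month.
queries = 10_000 * 30
estimate = monthly_cost(
    input_tokens=queries * 1_300,
    output_tokens=queries * 300,
    input_price_per_m=0.15,   # GPT-4o mini input price from this article
    output_price_per_m=0.60,  # GPT-4o mini output price from this article
)
print(f"Estimated monthly cost with 20% buffer: ${estimate:,.2f}")
```

For reasoning models, include expected thinking tokens in the output count, otherwise the estimate will run low.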
