AI Model Pricing Calculator (2025)
Calculate your monthly AI costs. Compare pricing for 4,587+ models across
95 providers. Real-time cost estimation based on your token usage.
4,587+ Models
95 Providers
81 Free Models
1,374 With Cache Pricing
Quick Cost Comparison
Monthly cost for 1M input + 0.5M output tokens across popular models.
| Model | Monthly cost | Input $/1M | Output $/1M |
|---|---|---|---|
| gpt-4o | $7.50 | $2.5 | $10 |
| gpt-4o-mini | $0.45 | $0.15 | $0.6 |
| gpt-4.1 | $6.00 | $2 | $8 |
| gpt-4.1-mini | $1.20 | $0.4 | $1.6 |
| o3 | $30.00 | $10 | $40 |
| o4-mini | $3.30 | $1.1 | $4.4 |
| gemini-2.5-pro | $6.25 | $1.25 | $10 |
| gemini-2.5-flash | $1.55 | $0.3 | $2.5 |
| gemini-2.0-flash | $0.30 | $0.1 | $0.4 |
| deepseek-chat | $0.28 | $0.14 | $0.28 |
| deepseek-r1 | $4.05 | $1.35 | $5.4 |
| llama-4-maverick | $0.69 | $0.25 | $0.87 |
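The monthly figures in the comparison above follow from the per-million prices: cost = (input tokens / 1M) × input price + (output tokens / 1M) × output price. The following is a minimal sketch of that arithmetic, not the calculator's actual code. The `PRICES` table and the `monthly_cost` helper are illustrative names, and rounding half-up to cents is an assumption chosen because it reproduces the listed values.

```python
from decimal import Decimal, ROUND_HALF_UP

TOKENS_PER_MILLION = Decimal(1_000_000)

# (input $/1M, output $/1M), copied from the comparison above.
PRICES = {
    "gpt-4o": ("2.5", "10"),
    "gpt-4o-mini": ("0.15", "0.6"),
    "llama-4-maverick": ("0.25", "0.87"),
}


def monthly_cost(model, input_tokens, output_tokens):
    """Return the monthly cost in USD, rounded to cents.

    Half-up rounding is an assumption, not a documented rule of the page.
    """
    input_price, output_price = (Decimal(p) for p in PRICES[model])
    cost = (
        Decimal(input_tokens) / TOKENS_PER_MILLION * input_price
        + Decimal(output_tokens) / TOKENS_PER_MILLION * output_price
    )
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Same workload as the comparison: 1M input + 0.5M output tokens.
    for name in PRICES:
        print(f"{name}: ${monthly_cost(name, 1_000_000, 500_000)}/mo")
```

Using `Decimal` rather than floats keeps the cent rounding deterministic, which matters for borderline values such as llama-4-maverick (0.25 + 0.435 = 0.685).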
Cheapest Models Overall
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-image-1-mini | aimlapi | $0.007 | $0.676 | ? |
| mistralai--Mistral-Nemo-Instruct-2407 | klusterai | $0.008 | $0.001 | 131K |
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| openai--gpt-image-1-model | aimlapi | $0.012 | $0.175 | ? |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| meta-llama-3.1-8b-instruct-turbo | deepinfra | $0.02 | $0.03 | 131K |
| meta-llama-3.1-8b-instruct | deepinfra | $0.02 | $0.05 | 131K |
Cheapest with Tool Calling
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K |
| gpt-oss-20b | inferencenet | $0.03 | $0.15 | 131K |
| schematron-v2-turbo | inferencenet | $0.03 | $0.15 | 131K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
Cheapest with Reasoning
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| gpt-oss-20b | deepinfra | $0.03 | $0.14 | 131K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | deepinfra | $0.039 | $0.19 | 131K |
| nvidia-nemotron-nano-9b-v2 | deepinfra | $0.04 | $0.16 | 131K |
| openai--gpt-oss-20b | novitaai | $0.04 | $0.15 | 131K |
| nemotron-3-nano-30b-a3b | deepinfra | $0.05 | $0.2 | 262K |
Cheapest with Vision
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| paddlepaddle--paddleocr-vl | novitaai | $0.02 | $0.02 | 16K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| deepseek--deepseek-ocr-2 | novitaai | $0.03 | $0.03 | 8K |
| deepseek--deepseek-ocr | novitaai | $0.03 | $0.03 | 8K |
| reka-edge-2 | reka | $0.03 | $0.1 | 131K |
| zai-org--autoglm-phone-9b-multilingual | novitaai | $0.035 | $0.138 | 65K |
| gemini-1.5-flash-8b | deepinfra | $0.0375 | $0.15 | 1M |
| google-gemma-3-4b | amazon-bedrock | $0.04 | $0.08 | 131K |
How to Reduce Your AI Costs
- Use smaller models for simple tasks: at the prices listed above, GPT-4o Mini is roughly 17x cheaper than GPT-4o ($0.15 vs. $2.50 per 1M input tokens)
- Enable prompt caching: 1,374 models offer cache pricing (typically 50% off cached input tokens)
- Choose open-weight models: 527 models can be self-hosted for fixed infrastructure costs
- Try free models for prototyping: 81 models are available at zero cost before you commit to paid APIs
- Compare across providers: the same model (e.g., Llama 4) may be priced differently on Groq, Together, and Fireworks
- Use the interactive catalog: filter by capability and sort by price
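To make the caching tip concrete, here is a rough sketch of how a cache discount changes a monthly bill. The `cost_with_cache` function and its parameters are hypothetical names introduced for illustration. The default 50% discount mirrors the "typically 50% off" figure in the list above, and actual discounts and cache-hit rates vary by provider and workload.

```python
def cost_with_cache(input_tokens, output_tokens, input_price, output_price,
                    cached_fraction=0.0, cache_discount=0.5):
    """Estimate cost in USD when part of the input is served from a prompt cache.

    Prices are USD per 1,000,000 tokens. cache_discount=0.5 is an assumption
    taken from the "typically 50% off" figure; check your provider's rates.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    if not 0.0 <= cache_discount <= 1.0:
        raise ValueError("cache_discount must be between 0 and 1")

    cached_tokens = input_tokens * cached_fraction
    uncached_tokens = input_tokens - cached_tokens
    # Cached tokens are billed at the discounted rate, the rest at full price.
    billable_input = uncached_tokens + cached_tokens * (1.0 - cache_discount)
    return (billable_input * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # gpt-4o prices from the comparison ($2.5 in / $10 out per 1M),
    # with a hypothetical 50% cache-hit rate on input tokens.
    baseline = cost_with_cache(1_000_000, 500_000, 2.5, 10)
    cached = cost_with_cache(1_000_000, 500_000, 2.5, 10, cached_fraction=0.5)
    print(f"no cache: ${baseline:.2f}, with cache: ${cached:.2f}")
```

Because the discount applies only to input tokens, the saving is largest for workloads that resend a long, stable prompt prefix and generate comparatively little output.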
Methodology
All pricing data is sourced from first-party provider APIs. Prices are per million
tokens (1M = 1,000,000 tokens). Aggregator providers are excluded from ranking tables to avoid
duplicate models. Cache pricing is shown separately where available.
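A ranking table of this kind could be produced along the following lines. This is a sketch under stated assumptions, not the site's actual pipeline. `ModelPrice`, `rank_cheapest`, and the capability labels are hypothetical names. The capability sets are inferred from which tables each model appears in. The `example-aggregator` row is invented purely to exercise the exclusion filter. Sorting by input price alone is also an assumption, since the tie-breaking rule is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    model: str
    provider: str
    input_per_m: float   # USD per 1,000,000 input tokens
    output_per_m: float  # USD per 1,000,000 output tokens
    capabilities: frozenset = frozenset()


def rank_cheapest(rows, capability=None, excluded_providers=frozenset(), limit=10):
    """Drop excluded providers, optionally filter by capability, sort by input price."""
    kept = [
        r for r in rows
        if r.provider not in excluded_providers
        and (capability is None or capability in r.capabilities)
    ]
    # sorted() is stable, so rows with equal input prices keep their original order.
    return sorted(kept, key=lambda r: r.input_per_m)[:limit]


ROWS = [
    ModelPrice("qwen3.5-0.8b", "deepinfra", 0.01, 0.05, frozenset({"reasoning", "vision"})),
    ModelPrice("ling-2.6-flash", "inclusionai", 0.01, 0.03, frozenset({"tools"})),
    ModelPrice("bdc-coder", "inferencenet", 0.01, 0.01, frozenset({"tools"})),
    # HYPOTHETICAL duplicate listing, invented only to demonstrate exclusion.
    ModelPrice("bdc-coder", "example-aggregator", 0.005, 0.005, frozenset({"tools"})),
    ModelPrice("granite-4.0-h-micro", "cloudflare", 0.017, 0.112, frozenset({"tools"})),
]

if __name__ == "__main__":
    ranked = rank_cheapest(ROWS, capability="tools",
                           excluded_providers={"example-aggregator"})
    for row in ranked:
        print(f"{row.model} ({row.provider}): ${row.input_per_m} in / ${row.output_per_m} out")
```

Excluding providers before sorting keeps a re-listed copy of a model from displacing the first-party entry in the ranking.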
More Resources
- Interactive Catalog: search, filter, compare all models
- LLM Pricing Comparison: detailed pricing tables
- Cheapest AI Models: lowest price LLMs
- Free AI Models: 81 models at zero cost
- Best AI Models: curated by use case
- Best AI Models for Coding: code-focused comparison
- Best AI Models for Agents: agentic model comparison
- Tool Calling Models Comparison: function calling LLMs
- Reasoning Models Comparison: o1, R1, Claude, Gemini
- OpenAI Alternatives: 95 providers compared
- AI Models by Provider: browse by provider
- Context Window Comparison: largest context LLMs
- GitHub Repository: star, fork, contribute
- Open Source AI Models (527 models)
- Multimodal AI Models (1,548 models)
- State of AI Models 2025
- Best AI Models for Image Generation: DALL·E, Imagen, GPT-5 Image compared
- Best AI Models for Vision: GPT-4o, Claude, Gemini vision compared
- Structured Output Models Comparison: JSON mode, function calling compared
- Small Language Models
- AI Model Picker
- GitHub Action