Real pricing data for 4587 AI models across 95 providers. All prices are per million tokens, sourced from first-party APIs. No third-party aggregators.
The most affordable models, ranked by price per million input tokens.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-image-1-mini | aimlapi | $0.007 | $0.676 | ? |
| mistralai--Mistral-Nemo-Instruct-2407 | klusterai | $0.008 | $0.001 | 131K |
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| openai--gpt-image-1-model | aimlapi | $0.012 | $0.175 | ? |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| meta-llama-3.1-8b-instruct-turbo | deepinfra | $0.02 | $0.03 | 131K |
| meta-llama-3.1-8b-instruct | deepinfra | $0.02 | $0.05 | 131K |
| mistral-nemo-instruct-2407 | deepinfra | $0.02 | $0.04 | 131K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K |
The most affordable models that support function/tool calling, which is essential for AI agents.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K |
| gpt-oss-20b | inferencenet | $0.03 | $0.15 | 131K |
| schematron-v2-turbo | inferencenet | $0.03 | $0.15 | 131K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
The most affordable models with advanced reasoning capabilities.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| gpt-oss-20b | deepinfra | $0.03 | $0.14 | 131K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | deepinfra | $0.039 | $0.19 | 131K |
| nvidia-nemotron-nano-9b-v2 | deepinfra | $0.04 | $0.16 | 131K |
| openai--gpt-oss-20b | novitaai | $0.04 | $0.15 | 131K |
| nemotron-3-nano-30b-a3b | deepinfra | $0.05 | $0.2 | 262K |
The most affordable models that can understand images.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| paddlepaddle--paddleocr-vl | novitaai | $0.02 | $0.02 | 16K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| deepseek--deepseek-ocr-2 | novitaai | $0.03 | $0.03 | 8K |
| deepseek--deepseek-ocr | novitaai | $0.03 | $0.03 | 8K |
| reka-edge-2 | reka | $0.03 | $0.1 | 131K |
| zai-org--autoglm-phone-9b-multilingual | novitaai | $0.035 | $0.138 | 65K |
| gemini-1.5-flash-8b | deepinfra | $0.0375 | $0.15 | 1M |
| google-gemma-3-4b | amazon-bedrock | $0.04 | $0.08 | 131K |
The most affordable models with large context windows (128K+ tokens).
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| mistralai--Mistral-Nemo-Instruct-2407 | klusterai | $0.008 | $0.001 | 131K |
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| meta-llama-3.1-8b-instruct-turbo | deepinfra | $0.02 | $0.03 | 131K |
| meta-llama-3.1-8b-instruct | deepinfra | $0.02 | $0.05 | 131K |
| mistral-nemo-instruct-2407 | deepinfra | $0.02 | $0.04 | 131K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
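
As an illustration of how a ranking like the one above could be produced, here is a minimal sketch that filters rows by context window and sorts by input price. The `ModelPrice` shape, the `cheapestLargeContext` helper, and the 128,000-token threshold are assumptions made for this example, not the site's actual code. Context sizes are the rounded values shown in the tables (for example, 131K is written as 131000).

```typescript
// Row shape mirroring the table columns; field names are illustrative.
interface ModelPrice {
  model: string;
  provider: string;
  inputPerM: number;      // USD per 1M input tokens
  outputPerM: number;     // USD per 1M output tokens
  context: number | null; // context window in tokens; null where the table shows "?"
}

// A few rows copied from the tables on this page.
const rows: ModelPrice[] = [
  { model: "qwen3.5-0.8b", provider: "deepinfra", inputPerM: 0.01, outputPerM: 0.05, context: 262000 },
  { model: "paddlepaddle--paddleocr-vl", provider: "novitaai", inputPerM: 0.02, outputPerM: 0.02, context: 16000 },
  { model: "mistralai--Mistral-Nemo-Instruct-2407", provider: "klusterai", inputPerM: 0.008, outputPerM: 0.001, context: 131000 },
  { model: "openai--gpt-image-1-mini", provider: "aimlapi", inputPerM: 0.007, outputPerM: 0.676, context: null },
];

// Keep models whose context window is known and at least minContext tokens,
// then sort by input price, breaking ties on output price.
// filter() returns a new array, so the caller's array is not reordered.
function cheapestLargeContext(models: ModelPrice[], minContext: number = 128000): ModelPrice[] {
  return models
    .filter((m) => m.context !== null && m.context >= minContext)
    .sort((a, b) => a.inputPerM - b.inputPerM || a.outputPerM - b.outputPerM);
}

const ranked = cheapestLargeContext(rows);
console.log(ranked.map((m) => `${m.model} (${m.provider})`));
```

Rows with an unknown context window are excluded rather than guessed, which matches how the "?" entries are absent from the large-context table.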
How much do the top AI models cost? A side-by-side comparison of the most popular models.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| gpt-4.1 | openai | $2 | $8 | 1M |
| gpt-4o | openai | $2.5 | $10 | 128K |
| gpt-4o-mini | openai | $0.15 | $0.6 | 128K |
| gemini-2.5-pro | deepinfra | $1.25 | $10 | 1M |
| gemini-2.5-flash | deepinfra | $0.3 | $2.5 | 1M |
| llama-4-maverick | digitalocean | $0.25 | $0.87 | 1M |
| deepseek-r1 | amazon-bedrock | $1.35 | $5.4 | 65K |
| deepseek-v3 | deepinfra | $0.32 | $0.89 | 163K |
0 models currently list cache pricing. Where it is offered, cache pricing significantly reduces costs for repeated prompts and is typically 50-90% cheaper than standard input pricing.
| Model | Provider | Input $/1M | Cache $/1M | Savings |
|---|---|---|---|---|
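
The Savings column can be read as the fraction of the standard input price that the cache price avoids. A minimal sketch of that arithmetic follows. The `cacheSavings` helper is hypothetical, and the prices in the usage line are invented for illustration, since no cache prices are listed in the data.

```typescript
// Fraction saved when a prompt token is billed at the cache rate
// instead of the standard input rate. Both prices are USD per 1M tokens.
function cacheSavings(inputPerM: number, cachePerM: number): number {
  if (inputPerM <= 0) {
    throw new RangeError("inputPerM must be positive");
  }
  return (inputPerM - cachePerM) / inputPerM;
}

// Hypothetical prices, not taken from the dataset.
const savings = cacheSavings(2.0, 0.5);
console.log(`${(savings * 100).toFixed(0)}%`); // prints "75%"
```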
All pricing data is sourced from first-party APIs, not third-party aggregators. Prices are per million tokens (input and output separately). Aggregator providers (OpenRouter, Requesty, etc.) are excluded from ranking tables to avoid duplicate models. Cache pricing is shown where available.
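
Because input and output are priced separately per million tokens, the cost of a single request is the sum of two scaled terms. The sketch below shows that calculation. The `requestCostUsd` helper and the token counts are assumptions for illustration, while the prices are the gpt-4o-mini row from the comparison table ($0.15 input, $0.6 output).

```typescript
const TOKENS_PER_MILLION = 1000000;

// Estimated cost in USD of one request, given token counts and
// per-million-token prices for input and output.
function requestCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputPerM: number,
  outputPerM: number,
): number {
  return (
    (inputTokens / TOKENS_PER_MILLION) * inputPerM +
    (outputTokens / TOKENS_PER_MILLION) * outputPerM
  );
}

// 10,000 input tokens and 2,000 output tokens at gpt-4o-mini prices.
const cost = requestCostUsd(10000, 2000, 0.15, 0.6);
console.log(cost.toFixed(4)); // prints "0.0027"
```

Output tokens often cost several times more than input tokens, so a model with the lowest input price is not necessarily the cheapest for generation-heavy workloads.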
Data is auto-scraped and validated with Zod schemas. Last updated: 2025-05-21.