💰 LLM Pricing Comparison: 4587 AI Models

Real pricing data for 4587 AI models across 95 providers. All prices are per million tokens, sourced from first-party APIs. No third-party aggregators.

$0.01 Cheapest Input/1M
$150.00 Most Expensive Input/1M
$1.61 Average Input/1M
81 Free Models
0 With Cache Pricing
💡 Pro tip: Use the interactive catalog's price calculator to estimate monthly costs based on your actual token usage.
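The arithmetic behind such an estimate is simple enough to sketch by hand. The snippet below is a minimal illustration, not the catalog calculator's actual code: the function name `estimate_monthly_cost` and the workload figures are assumptions, and the two prices are the gpt-4o-mini rates listed on this page.

```python
def estimate_monthly_cost(input_tokens, output_tokens,
                          input_price_per_m, output_price_per_m):
    """Return an estimated monthly cost in USD.

    Prices are in USD per million tokens, matching the tables on this page.
    """
    if min(input_tokens, output_tokens,
           input_price_per_m, output_price_per_m) < 0:
        raise ValueError("token counts and prices must be non-negative")
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical workload: 50M input tokens and 10M output tokens per month,
# priced at the gpt-4o-mini rates shown on this page ($0.15 in, $0.60 out).
monthly = estimate_monthly_cost(50_000_000, 10_000_000, 0.15, 0.60)
print(f"${monthly:.2f}")  # 50 * 0.15 + 10 * 0.60 = $13.50
```

Input and output are billed at different rates, so both token counts matter. A workload that generates long responses can cost far more than its input price suggests.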

💵 Cheapest AI Models Overall

The most affordable models per million input tokens.

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-image-1-mini | aimlapi | $0.007 | $0.676 | ? |
| mistralai--Mistral-Nemo-Instruct-2407 | klusterai | $0.008 | $0.001 | 131K |
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| openai--gpt-image-1-model | aimlapi | $0.012 | $0.175 | ? |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| meta-llama-3.1-8b-instruct-turbo | deepinfra | $0.02 | $0.03 | 131K |
| meta-llama-3.1-8b-instruct | deepinfra | $0.02 | $0.05 | 131K |
| mistral-nemo-instruct-2407 | deepinfra | $0.02 | $0.04 | 131K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K |
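Ranking by input price alone can mislead when a workload produces a lot of output. As a rough sketch, the snippet below re-ranks a few rows copied from the table above by a blended price. The helper names and the 25% output share are illustrative assumptions, not part of the catalog.

```python
# (model, provider, input $/1M, output $/1M) rows copied from the table above.
ROWS = [
    ("openai--gpt-image-1-mini", "aimlapi", 0.007, 0.676),
    ("mistralai--Mistral-Nemo-Instruct-2407", "klusterai", 0.008, 0.001),
    ("qwen3.5-0.8b", "deepinfra", 0.01, 0.05),
    ("bdc-coder", "inferencenet", 0.01, 0.01),
    ("granite-4.0-h-micro", "cloudflare", 0.017, 0.112),
]


def blended_price(input_price, output_price, output_share):
    """Price per 1M tokens when `output_share` of all tokens are output."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    return (1.0 - output_share) * input_price + output_share * output_price


def rank_by_blended_price(rows, output_share):
    """Sort rows from cheapest to most expensive blended price."""
    return sorted(rows, key=lambda r: blended_price(r[2], r[3], output_share))


# Hypothetical workload in which 25% of all tokens are output tokens.
for model, provider, inp, out in rank_by_blended_price(ROWS, 0.25):
    print(f"{model:40s} {provider:14s} ${blended_price(inp, out, 0.25):.4f}")
```

Under this assumed mix, the row with the lowest input price moves to the bottom of the sample because of its much higher output price.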

🔧 Cheapest Models with Tool Calling

The most affordable models that support function/tool calling, which is essential for AI agents.

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K |
| gpt-oss-20b | inferencenet | $0.03 | $0.15 | 131K |
| schematron-v2-turbo | inferencenet | $0.03 | $0.15 | 131K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |

🧠 Cheapest Models with Reasoning

The most affordable models with advanced reasoning capabilities.

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| gpt-oss-20b | deepinfra | $0.03 | $0.14 | 131K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | deepinfra | $0.039 | $0.19 | 131K |
| nvidia-nemotron-nano-9b-v2 | deepinfra | $0.04 | $0.16 | 131K |
| openai--gpt-oss-20b | novitaai | $0.04 | $0.15 | 131K |
| nemotron-3-nano-30b-a3b | deepinfra | $0.05 | $0.2 | 262K |

πŸ‘οΈ Cheapest Models with Vision

The most affordable models that can understand images.

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| paddlepaddle--paddleocr-vl | novitaai | $0.02 | $0.02 | 16K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| deepseek--deepseek-ocr-2 | novitaai | $0.03 | $0.03 | 8K |
| deepseek--deepseek-ocr | novitaai | $0.03 | $0.03 | 8K |
| reka-edge-2 | reka | $0.03 | $0.1 | 131K |
| zai-org--autoglm-phone-9b-multilingual | novitaai | $0.035 | $0.138 | 65K |
| gemini-1.5-flash-8b | deepinfra | $0.0375 | $0.15 | 1M |
| google-gemma-3-4b | amazon-bedrock | $0.04 | $0.08 | 131K |

📏 Cheapest Models with 128K+ Context

The most affordable models with large context windows (128K+ tokens).

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| mistralai--Mistral-Nemo-Instruct-2407 | klusterai | $0.008 | $0.001 | 131K |
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| meta-llama-3.1-8b-instruct-turbo | deepinfra | $0.02 | $0.03 | 131K |
| meta-llama-3.1-8b-instruct | deepinfra | $0.02 | $0.05 | 131K |
| mistral-nemo-instruct-2407 | deepinfra | $0.02 | $0.04 | 131K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |

πŸ† Flagship Model Prices

How much do the top AI models cost? A side-by-side comparison of the most popular models.

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| gpt-4.1 | openai | $2 | $8 | 1M |
| gpt-4o | openai | $2.5 | $10 | 128K |
| gpt-4o-mini | openai | $0.15 | $0.6 | 128K |
| gemini-2.5-pro | deepinfra | $1.25 | $10 | 1M |
| gemini-2.5-flash | deepinfra | $0.3 | $2.5 | 1M |
| llama-4-maverick | digitalocean | $0.25 | $0.87 | 1M |
| deepseek-r1 | amazon-bedrock | $1.35 | $5.4 | 65K |
| deepseek-v3 | deepinfra | $0.32 | $0.89 | 163K |

⚡ Cache Pricing

No models in this dataset currently list cache pricing, so there are no rows to show here. Where a provider does offer it, cached input is typically 50-90% cheaper than standard input, which significantly reduces costs for repeated prompts.
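To see what a 50-90% cache discount would mean in practice, here is a minimal sketch of the average input price when part of the input is billed at a cached rate. Since no cache prices are listed on this page, the base price, discount values, hit rate, and function name are all hypothetical.

```python
def effective_input_price(input_price_per_m, cache_discount, cache_hit_rate):
    """Average input price per 1M tokens when some input is served from cache.

    cache_discount: fraction by which cached tokens are cheaper (0.5 = 50% off).
    cache_hit_rate: fraction of input tokens billed at the cached rate.
    """
    for name, value in (("cache_discount", cache_discount),
                        ("cache_hit_rate", cache_hit_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    cached_price = input_price_per_m * (1.0 - cache_discount)
    uncached_share = 1.0 - cache_hit_rate
    return uncached_share * input_price_per_m + cache_hit_rate * cached_price


# Hypothetical: $2.00 per 1M input tokens, 80% of input tokens hit the cache.
for discount in (0.5, 0.9):
    price = effective_input_price(2.00, discount, 0.8)
    print(f"{discount:.0%} discount -> ${price:.2f} per 1M input tokens")
```

With these assumed numbers, the effective input price falls from $2.00 to $1.20 at a 50% discount and to $0.56 at a 90% discount.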

📊 Methodology

All pricing data is sourced from first-party APIs, not third-party aggregators. Prices are per million tokens (input and output separately). Aggregator providers (OpenRouter, Requesty, etc.) are excluded from ranking tables to avoid duplicate models. Cache pricing is shown where available.

Data is auto-scraped and validated with Zod schemas. Last updated: 2025-05-21.
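The project validates with Zod, a TypeScript library. The sketch below is not that schema. It is a plain-Python analog showing the kind of checks such validation might perform on one scraped record. The field names (`input_per_m`, `output_per_m`, `context_tokens`) and the `parse_record` helper are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PriceRecord:
    model: str
    provider: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    context_tokens: Optional[int] = None  # None when unknown ("?" in tables)


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_record(raw: dict) -> PriceRecord:
    """Validate one raw record; raise ValueError on malformed data."""
    for key in ("model", "provider"):
        if not isinstance(raw.get(key), str) or not raw[key].strip():
            raise ValueError(f"{key} must be a non-empty string")
    prices = {}
    for key in ("input_per_m", "output_per_m"):
        value = raw.get(key)
        if not _is_number(value) or not math.isfinite(value) or value < 0:
            raise ValueError(f"{key} must be a finite, non-negative number")
        prices[key] = float(value)
    context = raw.get("context_tokens")
    if context is not None:
        if not isinstance(context, int) or isinstance(context, bool) or context <= 0:
            raise ValueError("context_tokens must be a positive integer or None")
    return PriceRecord(raw["model"], raw["provider"],
                       prices["input_per_m"], prices["output_per_m"], context)


# Illustrative record using the gpt-4o-mini row from this page.
record = parse_record({
    "model": "gpt-4o-mini", "provider": "openai",
    "input_per_m": 0.15, "output_per_m": 0.6, "context_tokens": 128_000,
})
print(record)
```

Rejecting negative, non-numeric, or non-finite prices at ingestion keeps malformed scrapes out of the ranking tables.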

🔗 More Resources

Small Language Models

🎯 AI Model Picker

⚡ GitHub Action