Compare 829 AI models that support structured output (JSON mode), including GPT-4o, Claude, Gemini, and more, with pricing and capabilities drawn from first-party data.

The top-tier models from each major provider, all supporting structured output and tool calling.

| Model | Provider | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|---|
| gpt-4o | openai | $2.50 | $10 | 128K | ✅ | |
| gpt-4o-mini | openai | $0.15 | $0.60 | 128K | ✅ | |
| o3 | openai | $2 | $8 | 200K | ✅ | ✅ |
| o4-mini | openai | $1.10 | $4.40 | 200K | ✅ | ✅ |
| claude-sonnet-4-20250514 | anthropic | $3 | $15 | 200K | ✅ | ✅ |
| claude-opus-4-20250514 | anthropic | $15 | $75 | 200K | ✅ | ✅ |
| gemini-2.5-pro | google | $1.25 | $10 | 1M | ✅ | ✅ |
| gemini-2.5-flash | google | $0.15 | $0.60 | 1M | ✅ | ✅ |
| deepseek-r1 | deepseek | $0.55 | $2.19 | 128K | ✅ | |
| grok-3 | xai | $3 | $15 | 131K | ✅ | ✅ |
| qwen3-235b-a22b | alibaba | $0.14 | $0.42 | 128K | ✅ | ✅ |
| llama4-maverick | meta | $0.20 | $0.80 | 1M | ✅ | |

The most affordable models with structured output, suited to high-volume production applications.

| Model | Provider | Input $/1M | Output $/1M | Context | Tool Call |
|---|---|---|---|---|---|
| gemini-2.0-flash-lite | google | $0.075 | $0.30 | 1M | ✅ |
| gemini-2.5-flash | google | $0.15 | $0.60 | 1M | ✅ |
| gpt-4o-mini | openai | $0.15 | $0.60 | 128K | ✅ |
| qwen3-235b-a22b | alibaba | $0.14 | $0.42 | 128K | ✅ |
| llama4-maverick | meta | $0.20 | $0.80 | 1M | ✅ |
| deepseek-chat | deepseek | $0.14 | $0.28 | 128K | |
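
Because prices are quoted per million tokens, comparing these models for a real workload takes a little arithmetic. The sketch below shows one way to do it, assuming a fixed number of input and output tokens per request. The prices are copied from the table above; the function names and the workload figures are illustrative assumptions, not part of the source data.

```python
# USD per 1M tokens as (input, output), copied from the budget table above.
PRICES = {
    "gemini-2.0-flash-lite": (0.075, 0.30),
    "gpt-4o-mini": (0.15, 0.60),
    "deepseek-chat": (0.14, 0.28),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a single request."""
    input_price, output_price = PRICES[model]
    # Prices are per million tokens, so scale the token counts down by 1e6.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def monthly_cost(model, requests, input_tokens, output_tokens):
    """Return the USD cost of `requests` identical requests."""
    return requests * request_cost(model, input_tokens, output_tokens)


# Hypothetical workload: 1M requests, each with 500 input and 200 output tokens.
for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 1_000_000, 500, 200):.2f}")
```

Under that assumed workload the totals come to roughly $97.50 for gemini-2.0-flash-lite, $126.00 for deepseek-chat, and $195.00 for gpt-4o-mini. The ordering can change with the input/output mix, since output tokens cost more than input tokens for every model listed.
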

Structured output models available at zero cost, useful for prototyping JSON-mode applications.

| Model | Provider | Context | Tool Call | Reasoning |
|---|---|---|---|---|
| gemini-2.0-flash | google | 1M | ✅ | |
| gemini-2.5-flash | google | 1M | ✅ | ✅ |
| llama4-scout-17b-16e | meta | 10M | | |
| qwen3-30b-a3b | alibaba | 128K | ✅ | |

780 models support both structured output and tool calling, the ideal combination for building AI agents that return structured data from function calls.

| Model | Provider | Input $/1M | Output $/1M | Context | Reasoning |
|---|---|---|---|---|---|
| gemini-2.0-flash-lite | google | $0.075 | $0.30 | 1M | |
| gemini-2.5-flash | google | $0.15 | $0.60 | 1M | ✅ |
| gpt-4o-mini | openai | $0.15 | $0.60 | 128K | |
| qwen3-235b-a22b | alibaba | $0.14 | $0.42 | 128K | ✅ |
| claude-sonnet-4-20250514 | anthropic | $3 | $15 | 200K | ✅ |
| grok-3-mini | xai | $0.30 | $0.50 | 131K | ✅ |

672 models offer both structured output and reasoning capabilities, for complex tasks that require both thinking and structured responses.

| Model | Provider | Input $/1M | Output $/1M | Context | Tool Call |
|---|---|---|---|---|---|
| gemini-2.5-flash | google | $0.15 | $0.60 | 1M | ✅ |
| qwen3-235b-a22b | alibaba | $0.14 | $0.42 | 128K | ✅ |
| deepseek-chat | deepseek | $0.14 | $0.28 | 128K | |
| deepseek-r1 | deepseek | $0.55 | $2.19 | 128K | |
| o4-mini | openai | $1.10 | $4.40 | 200K | ✅ |
| o3 | openai | $2 | $8 | 200K | ✅ |
| claude-sonnet-4-20250514 | anthropic | $3 | $15 | 200K | ✅ |

Recommended models by use case:

| Use Case | Recommended Model | Why |
|---|---|---|
| API response parsing | gpt-4o-mini | Low cost with structured output + tool calling |
| Data extraction | gemini-2.5-flash | 1M context + structured output + reasoning + low cost |
| AI agents | claude-sonnet-4-20250514 | Best tool calling + structured output + reasoning |
| High volume / cheap | gemini-2.0-flash-lite | Lowest cost at $0.075 per 1M input tokens |
| Complex reasoning | o3 | Best reasoning + structured output + tool calling |
| Prototyping | gemini-2.5-flash | Free tier, 1M context, all capabilities |

All data is sourced from first-party APIs. A model is included when its metadata sets `structured_output: true`. Aggregator providers are excluded from ranking tables to avoid listing the same model twice. Pricing is per million tokens.
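
The selection rules described here can be sketched as a small filter. This is a minimal illustration, not the pipeline that produced the tables: only the `structured_output` flag comes from the text, while the `ModelInfo` record, its other field names, the `is_aggregator` marker, and the two `example-*` entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    model_id: str
    provider: str
    input_price: float            # USD per 1M input tokens
    structured_output: bool       # the metadata flag the text describes
    tool_call: bool = False
    reasoning: bool = False
    is_aggregator: bool = False   # hypothetical marker for aggregator providers


def rank_models(models, need_tool_call=False, need_reasoning=False):
    """Return first-party structured-output models, cheapest input price first."""
    eligible = [
        m for m in models
        if m.structured_output
        and not m.is_aggregator
        and (m.tool_call or not need_tool_call)
        and (m.reasoning or not need_reasoning)
    ]
    return sorted(eligible, key=lambda m: m.input_price)


CATALOG = [
    # Capabilities and prices as listed in the tables above.
    ModelInfo("gpt-4o-mini", "openai", 0.15, True, tool_call=True),
    ModelInfo("deepseek-chat", "deepseek", 0.14, True, reasoning=True),
    ModelInfo("claude-sonnet-4-20250514", "anthropic", 3.0, True,
              tool_call=True, reasoning=True),
    # Hypothetical entries, included only to exercise the filters.
    ModelInfo("gpt-4o-mini", "example-aggregator", 0.15, True,
              tool_call=True, is_aggregator=True),
    ModelInfo("example-text-only", "example-lab", 0.05, False),
]

agent_ready = [m.model_id for m in rank_models(CATALOG, need_tool_call=True)]
print(agent_ready)  # ['gpt-4o-mini', 'claude-sonnet-4-20250514']
```

Dropping aggregator entries before ranking is what keeps the resold copy of gpt-4o-mini from appearing twice, and the model without structured output never reaches the list regardless of its price.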