Compare the top AI models for building autonomous agents. 1,080+ models with tool calling, the key capability for agentic workflows.
Models with all three agentic capabilities: tool calling, reasoning, and structured output. Best for complex autonomous workflows.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| Qwen--Qwen3.6-35B-A3B | neuralwatt | $0.05 | $0.1 | ? |
| openai--gpt-oss-120b | novitaai | $0.05 | $0.25 | 131K |
| Nemotron-3-Nano-Omni | nebius | $0.06 | $0.24 | 128K |
| hermes-4-llama-3.1-8b | nousresearch | $0.06 | $0.12 | 131K |
| zai-org--glm-4.7-flash | novitaai | $0.07 | $0.4 | 200K |
| Qwen--Qwen3-32B-TEE | chutes | $0.08 | $0.24 | 40K |
| Gemma-3-27b-it | nebius | $0.1 | $0.3 | 96K |
| Qwen3-32B | nebius | $0.1 | $0.3 | 128K |
| xiaomimimo--mimo-v2-flash | novitaai | $0.1 | $0.3 | 262K |
| Qwen--Qwen3-235B-A22B-Thinking-2507 | chutes | $0.11 | $0.6 | 262K |
| deepseek-v4-flash | baidu | $0.126 | $0.252 | 1M |
| google--gemma-4-31B-turbo-TEE | chutes | $0.13 | $0.38 | 131K |
| Hermes-4-70B | nebius | $0.13 | $0.4 | 128K |
| google--gemma-4-26b-a4b-it | novitaai | $0.13 | $0.4 | 262K |
Models that can both call tools and reason about when/how to use them. Essential for ReAct-style agents.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | inferencenet | $0.05 | $0.45 | 131K |
| Qwen--Qwen3.6-35B-A3B | neuralwatt | $0.05 | $0.1 | ? |
| openai--gpt-oss-120b | novitaai | $0.05 | $0.25 | 131K |
| qwen3-30b-a3b-fp8 | cloudflare | $0.051 | $0.335 | 40K |
| glm-4.7-flash | cloudflare | $0.06 | $0.4 | 131K |
| Nemotron-3-Nano-Omni | nebius | $0.06 | $0.24 | 128K |
| hermes-4-llama-3.1-8b | nousresearch | $0.06 | $0.12 | 131K |
| seed-1.6-flash | bytedance | $0.07 | $0.3 | 262K |
| ring-2.6-1t | inclusionai | $0.07 | $0.62 | 262K |
| zai-org--glm-4.7-flash | novitaai | $0.07 | $0.4 | 200K |
| microsoft-phi-4-mini-reasoning | microsoft | $0.075 | $0.3 | 128K |
| Qwen--Qwen3-32B-TEE | chutes | $0.08 | $0.24 | 40K |
| gpt-oss-120b | clarifai | $0.09 | $0.36 | 131K |
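To make the ReAct pattern concrete, here is a minimal sketch of the loop such an agent runs: the model alternates between requesting a tool and reading the observation until it produces a final answer. Everything named here (`TOOLS`, `scripted_model`, `react_loop`, and the `Action: name[arg]` text format) is hypothetical and chosen for illustration. The model is a scripted stand-in, not a call to any provider listed above.

```python
from typing import Callable

# Hypothetical tool registry: plain Python functions keyed by name.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split(","))),
}

def scripted_model(transcript: list[str]) -> str:
    """Stand-in for a real model call (an assumption, not a provider API).

    Requests the `add` tool once, then answers from the observation.
    """
    observations = [line for line in transcript if line.startswith("Observation: ")]
    if not observations:
        return "Action: add[2,3]"
    result = observations[-1].removeprefix("Observation: ")
    return f"Final Answer: {result}"

def react_loop(question: str, model=scripted_model, max_steps: int = 5) -> str:
    """Alternate model replies and tool observations until an answer appears."""
    transcript = [f"Question: {question}"]
    for _step in range(max_steps):
        reply = model(transcript)
        transcript.append(reply)
        if reply.startswith("Final Answer:"):
            return reply.removeprefix("Final Answer:").strip()
        if reply.startswith("Action:"):
            # Parse the assumed "Action: tool_name[argument]" format.
            call = reply.removeprefix("Action:").strip()
            name, _sep, rest = call.partition("[")
            argument = rest.rstrip("]")
            tool = TOOLS.get(name)
            observation = tool(argument) if tool else f"unknown tool {name}"
            transcript.append(f"Observation: {observation}")
    return "no answer within step budget"

answer = react_loop("What is 2 + 3?")
print(answer)
```

The `max_steps` cap is a design choice worth keeping in a real agent: a model that keeps requesting tools without converging would otherwise loop, and spend tokens, indefinitely.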
Most affordable models with tool calling for budget-conscious agent deployments.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K |
| gpt-oss-20b | inferencenet | $0.03 | $0.15 | 131K |
| schematron-v2-turbo | inferencenet | $0.03 | $0.15 | 131K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| liquid-ai--LFM2-24B-A2B | togetherai | $0.03 | $0.12 | 131K |
| amazon-nova-micro | amazon | $0.035 | $0.14 | 128K |
| amazon-nova-micro | amazon-bedrock | $0.035 | $0.14 | 128K |
| mistral-nemo-12b-instruct--fp-8 | inferencenet | $0.0375 | $0.1 | 131K |
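Because prices are quoted per 1M tokens, the cost of a single agent run depends on how many input and output tokens it uses. The sketch below shows the arithmetic for two rows copied from the table above. The workload (200,000 input tokens and 20,000 output tokens) and the helper name `run_cost` are assumptions chosen for illustration.

```python
# Prices copied from the table above, in USD per 1M tokens: (input, output).
PRICES = {
    "ling-2.6-flash": (0.01, 0.03),
    "gpt-oss-20b": (0.03, 0.15),
}

def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one run at the listed per-1M-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Hypothetical workload: an agent run that reads far more than it writes.
for name in PRICES:
    print(f"{name}: ${run_cost(name, 200_000, 20_000):.4f}")
```

Under these assumptions the run costs about $0.0026 on ling-2.6-flash and about $0.0090 on gpt-oss-20b. Agent loops tend to resend the growing transcript on every step, so the input price often matters more than the output price.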
Zero-cost models for building and testing agents.
| Model | Provider | Context | Reasoning | Structured Output |
|---|---|---|---|---|
| openrouter--owl-alpha | openrouter | 1M | ✓ | |
| deepseek--deepseek-v4-flash--free | openrouter | 1M | ✓ | |
| qwen--qwen3-coder--free | openrouter | 1M | | |
| nvidia--nemotron-3-super-120b-a12b--free | openrouter | 1M | ✓ | ✓ |
| gemma-4-26b-a4b-it | auriko | 262K | ✓ | ✓ |
| gemma-4-31b-it | auriko | 262K | ✓ | ✓ |
| arcee-ai--trinity-large-thinking--free | openrouter | 262K | ✓ | |
| google--gemma-4-26b-a4b-it--free | openrouter | 262K | ✓ | ✓ |
| google--gemma-4-31b-it--free | openrouter | 262K | ✓ | ✓ |
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | openrouter | 256K | ✓ | |
Run agent models locally for full privacy and zero API costs at scale.
| Model | Provider | Context | Reasoning | Structured Output |
|---|---|---|---|---|
| google--gemma-4-31b-it | orcarouter | 1M | | |
| qwen--qwen3.5-flash-2026-02-23 | orcarouter | 1M | | |
| qwen--qwen3.5-flash | orcarouter | 1M | | |
| qwen--qwen3.6-flash-2026-04-16 | orcarouter | 1M | | |
| qwen--qwen3.6-flash | orcarouter | 1M | | |
| meta-llama-4-maverick-17b | amazon-bedrock | 1M | | |
| meta-llama-4-scout-17b | amazon-bedrock | 1M | | |
| minimax-m2-1 | amazon-bedrock | 1M | | |
| minimax-m2-5 | amazon-bedrock | 1M | | |
| minimax-m2 | amazon-bedrock | 1M | | |
Models with 128K+ context and tool calling for agents that need to process large documents or maintain long conversation history.
| Model | Provider | Context | Input $/1M | Reasoning |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | 262K | $0.01 | |
| bdc-coder | inferencenet | 131K | $0.01 | |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | 131K | $0.015 | |
| granite-4.0-h-micro | cloudflare | 131K | $0.017 | |
| llama-3.1-8b-instruct--fp-16 | inferencenet | 131K | $0.02 | |
| schematron-3b | inferencenet | 131K | $0.02 | |
| schematron-v3 | inferencenet | 131K | $0.02 | |
| gpt-oss-20b | inferencenet | 131K | $0.03 | |
| schematron-v2-turbo | inferencenet | 131K | $0.03 | |
| qwen--qwen3-4b-fp8 | novitaai | 128K | $0.03 | ✓ |
| liquid-ai--LFM2-24B-A2B | togetherai | 131K | $0.03 | |
| amazon-nova-micro | amazon | 128K | $0.035 | |
| amazon-nova-micro | amazon-bedrock | 128K | $0.035 | |
| mistral-nemo-12b-instruct--fp-8 | inferencenet | 131K | $0.0375 | |
| klusterai--Meta-Llama-3.3-70B-Instruct-Turbo | klusterai | 131K | $0.038 | |
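The 128K+ cutoff can be applied programmatically once the context labels are converted to numbers. The sketch below is one way to do that, assuming K means 1,000 tokens, M means 1,000,000 tokens, and `?` means the context length is unknown. The helper name `parse_context` is hypothetical, and the sample rows are taken from the tables on this page.

```python
from typing import Optional

def parse_context(label: str) -> Optional[int]:
    """Convert a context label such as '131K' or '1M' to a token count.

    Assumes K = 1,000 and M = 1,000,000; returns None when unknown ('?').
    """
    label = label.strip().upper()
    multipliers = {"K": 1_000, "M": 1_000_000}
    if label and label[-1] in multipliers and label[:-1].isdigit():
        return int(label[:-1]) * multipliers[label[-1]]
    return int(label) if label.isdigit() else None

# (model, context label) pairs as they appear in the tables.
rows = [
    ("ling-2.6-flash", "262K"),
    ("Qwen--Qwen3-32B-TEE", "40K"),
    ("openai--gpt-oss-20b", "?"),
    ("deepseek-v4-flash", "1M"),
]

# Unknown context lengths are treated as not meeting the threshold.
long_context = [name for name, ctx in rows if (parse_context(ctx) or 0) >= 128_000]
print(long_context)
```

Treating an unknown context as failing the filter is the conservative choice here; a listing that prefers recall over precision could keep those rows and flag them instead.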
All data is sourced from first-party APIs. Agentic capability is defined by tool calling (function calling), reasoning (chain-of-thought), and structured output (JSON mode). Aggregator providers are excluded from ranking tables to avoid duplicate models.
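The three-capability definition above maps naturally onto boolean flags per model. The following sketch shows how the "all three capabilities" and "tool calling plus reasoning" groupings could be derived from such flags. The `ModelEntry` class and the `model-a` to `model-c` entries are hypothetical placeholders, not the schema or data behind this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical record holding the three agentic capability flags."""
    name: str
    tool_calling: bool
    reasoning: bool
    structured_output: bool

    @property
    def capability_count(self) -> int:
        # Booleans sum as 0 or 1, giving the number of supported capabilities.
        return sum((self.tool_calling, self.reasoning, self.structured_output))

# Placeholder entries for illustration only.
catalog = [
    ModelEntry("model-a", tool_calling=True, reasoning=True, structured_output=True),
    ModelEntry("model-b", tool_calling=True, reasoning=True, structured_output=False),
    ModelEntry("model-c", tool_calling=True, reasoning=False, structured_output=False),
]

fully_agentic = [m.name for m in catalog if m.capability_count == 3]
tool_and_reasoning = [m.name for m in catalog if m.tool_calling and m.reasoning]
print(fully_agentic, tool_and_reasoning)
```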