πŸ€– Best AI Models for Agents (2025)

Compare the top AI models for building autonomous agents. 1,080+ models with tool calling β€” the key capability for agentic workflows.

1,080Agentic Models
2,350Tool Calling
1,306Reasoning
829Structured Output
πŸ” Interactive Catalog ⭐ Star on GitHub
πŸ’‘ What makes a model "agentic"? The three key capabilities are: Tool calling (invoke APIs/functions), Reasoning (plan multi-step actions), and Structured output (return parseable JSON). Models with all three are the most capable agents.

πŸ† Top Agentic Models β€” Full Stack (Tool Call + Reasoning + Structured Output)

Models with all three agentic capabilities. Best for complex autonomous workflows.

Model Provider Input $/1M Output $/1M Context
openai--gpt-oss-20b neuralwatt $0.03 $0.16 ?
Qwen--Qwen3.6-35B-A3B neuralwatt $0.05 $0.1 ?
openai--gpt-oss-120b novitaai $0.05 $0.25 131K
Nemotron-3-Nano-Omni nebius $0.06 $0.24 128K
hermes-4-llama-3.1-8b nousresearch $0.06 $0.12 131K
zai-org--glm-4.7-flash novitaai $0.07 $0.4 200K
Qwen--Qwen3-32B-TEE chutes $0.08 $0.24 40K
Gemma-3-27b-it nebius $0.1 $0.3 96K
Qwen3-32B nebius $0.1 $0.3 128K
xiaomimimo--mimo-v2-flash novitaai $0.1 $0.3 262K
Qwen--Qwen3-235B-A22B-Thinking-2507 chutes $0.11 $0.6 262K
deepseek-v4-flash baidu $0.126 $0.252 1M
google--gemma-4-31B-turbo-TEE chutes $0.13 $0.38 131K
Hermes-4-70B nebius $0.13 $0.4 128K
google--gemma-4-26b-a4b-it novitaai $0.13 $0.4 262K

πŸ”§ Tool Calling + Reasoning

Models that can both call tools and reason about when/how to use them. Essential for ReAct-style agents.

Model Provider Input $/1M Output $/1M Context
openai--gpt-oss-20b neuralwatt $0.03 $0.16 ?
qwen--qwen3-4b-fp8 novitaai $0.03 $0.03 128K
gpt-oss-120b inferencenet $0.05 $0.45 131K
Qwen--Qwen3.6-35B-A3B neuralwatt $0.05 $0.1 ?
openai--gpt-oss-120b novitaai $0.05 $0.25 131K
qwen3-30b-a3b-fp8 cloudflare $0.051 $0.335 40K
glm-4.7-flash cloudflare $0.06 $0.4 131K
Nemotron-3-Nano-Omni nebius $0.06 $0.24 128K
hermes-4-llama-3.1-8b nousresearch $0.06 $0.12 131K
seed-1.6-flash bytedance $0.07 $0.3 262K
ring-2.6-1t inclusionai $0.07 $0.62 262K
zai-org--glm-4.7-flash novitaai $0.07 $0.4 200K
microsoft-phi-4-mini-reasoning microsoft $0.075 $0.3 128K
Qwen--Qwen3-32B-TEE chutes $0.08 $0.24 40K
gpt-oss-120b clarifai $0.09 $0.36 131K

πŸ’° Cheapest Tool Calling Models

Most affordable models with tool calling for budget-conscious agent deployments.

Model Provider Input $/1M Output $/1M Context
ling-2.6-flash inclusionai $0.01 $0.03 262K
bdc-coder inferencenet $0.01 $0.01 131K
klusterai--Meta-Llama-3.1-8B-Instruct-Turbo klusterai $0.015 $0.02 131K
granite-4.0-h-micro cloudflare $0.017 $0.112 131K
llama-3.1-8b-instruct--fp-16 inferencenet $0.02 $0.03 131K
schematron-3b inferencenet $0.02 $0.05 131K
schematron-v3 inferencenet $0.02 $0.05 131K
gpt-oss-20b inferencenet $0.03 $0.15 131K
schematron-v2-turbo inferencenet $0.03 $0.15 131K
openai--gpt-oss-20b neuralwatt $0.03 $0.16 ?
qwen--qwen3-4b-fp8 novitaai $0.03 $0.03 128K
liquid-ai--LFM2-24B-A2B togetherai $0.03 $0.12 131K
amazon-nova-micro amazon $0.035 $0.14 128K
amazon-nova-micro amazon-bedrock $0.035 $0.14 128K
mistral-nemo-12b-instruct--fp-8 inferencenet $0.0375 $0.1 131K

πŸ†“ Free Models with Tool Calling

Zero-cost models for building and testing agents.

Model Provider Context Reasoning Structured Output
openrouter--owl-alpha openrouter 1M βœ…
deepseek--deepseek-v4-flash--free openrouter 1M βœ…
qwen--qwen3-coder--free openrouter 1M
nvidia--nemotron-3-super-120b-a12b--free openrouter 1M βœ… βœ…
gemma-4-26b-a4b-it auriko 262K βœ… βœ…
gemma-4-31b-it auriko 262K βœ… βœ…
arcee-ai--trinity-large-thinking--free openrouter 262K βœ…
google--gemma-4-26b-a4b-it--free openrouter 262K βœ… βœ…
google--gemma-4-31b-it--free openrouter 262K βœ… βœ…
nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free openrouter 256K βœ…

πŸ”“ Open-Weight Models with Tool Calling

Run agent models locally for full privacy and zero API costs at scale.

Model Provider Context Reasoning Structured Output
google--gemma-4-31b-it orcarouter 1M
qwen--qwen3.5-flash-2026-02-23 orcarouter 1M
qwen--qwen3.5-flash orcarouter 1M
qwen--qwen3.6-flash-2026-04-16 orcarouter 1M
qwen--qwen3.6-flash orcarouter 1M
meta-llama-4-maverick-17b amazon-bedrock 1M
meta-llama-4-scout-17b amazon-bedrock 1M
minimax-m2-1 amazon-bedrock 1M
minimax-m2-5 amazon-bedrock 1M
minimax-m2 amazon-bedrock 1M

πŸ“ Large Context + Tool Calling

Models with 128K+ context and tool calling for agents that need to process large documents or maintain long conversation history.

Model Provider Context Input $/1M Reasoning
ling-2.6-flash inclusionai 262K $0.01
bdc-coder inferencenet 131K $0.01
klusterai--Meta-Llama-3.1-8B-Instruct-Turbo klusterai 131K $0.015
granite-4.0-h-micro cloudflare 131K $0.017
llama-3.1-8b-instruct--fp-16 inferencenet 131K $0.02
schematron-3b inferencenet 131K $0.02
schematron-v3 inferencenet 131K $0.02
gpt-oss-20b inferencenet 131K $0.03
schematron-v2-turbo inferencenet 131K $0.03
qwen--qwen3-4b-fp8 novitaai 128K $0.03 βœ…
liquid-ai--LFM2-24B-A2B togetherai 131K $0.03
amazon-nova-micro amazon 128K $0.035
amazon-nova-micro amazon-bedrock 128K $0.035
mistral-nemo-12b-instruct--fp-8 inferencenet 131K $0.0375
klusterai--Meta-Llama-3.3-70B-Instruct-Turbo klusterai 131K $0.038

πŸ“Š Methodology

All data is sourced from first-party APIs. Agentic capability is defined by tool calling (function calling), reasoning (chain-of-thought), and structured output (JSON mode). Aggregator providers are excluded from ranking tables to avoid duplicate models.

πŸ”— More Resources

Small Language Models

🎯 AI Model Picker

⚑ GitHub Action