🤖 Best AI Models for Agents (2025)

Compare the top AI models for building autonomous agents: 1,080+ agentic models, all with tool calling, the key capability for agentic workflows.

1,080 Agentic Models

2,350 Tool Calling

1,306 Reasoning

829 Structured Output


💡 What makes a model "agentic"? The three key capabilities are: Tool calling (invoke APIs/functions), Reasoning (plan multi-step actions), and Structured output (return parseable JSON). Models with all three are the most capable agents.
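As a rough illustration of the three-capability definition above, the sketch below labels a model by the capability flags it carries. The `ModelCaps` record and `agentic_tier` function are hypothetical names introduced for this example and are not part of the catalog's data format; the sample flags are taken from the free-models table on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCaps:
    """Hypothetical capability record; field names are illustrative only."""
    name: str
    tool_calling: bool = False
    reasoning: bool = False
    structured_output: bool = False


def agentic_tier(m: ModelCaps) -> str:
    """Label a model by which of the three agentic capabilities it has."""
    if not m.tool_calling:
        # Tool calling is treated as the baseline requirement for an agent.
        return "not agentic"
    extras = []
    if m.reasoning:
        extras.append("reasoning")
    if m.structured_output:
        extras.append("structured output")
    if len(extras) == 2:
        return "full stack"
    return " + ".join(["tool calling", *extras])


# Flags as shown in the free-models table on this page.
samples = [
    ModelCaps("nvidia--nemotron-3-super-120b-a12b--free", True, True, True),
    ModelCaps("openrouter--owl-alpha", True, structured_output=True),
    ModelCaps("qwen--qwen3-coder--free", True),
]
for m in samples:
    print(m.name, "->", agentic_tier(m))
```

Treating tool calling as the gate and the other two flags as additions mirrors how the sections on this page are grouped.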

🏆 Top Agentic Models — Full Stack (Tool Call + Reasoning + Structured Output)

Models with all three agentic capabilities. Best for complex autonomous workflows.

Model	Provider	Input $/1M	Output $/1M	Context
openai--gpt-oss-20b	neuralwatt	$0.03	$0.16	?
Qwen--Qwen3.6-35B-A3B	neuralwatt	$0.05	$0.1	?
openai--gpt-oss-120b	novitaai	$0.05	$0.25	131K
Nemotron-3-Nano-Omni	nebius	$0.06	$0.24	128K
hermes-4-llama-3.1-8b	nousresearch	$0.06	$0.12	131K
zai-org--glm-4.7-flash	novitaai	$0.07	$0.4	200K
Qwen--Qwen3-32B-TEE	chutes	$0.08	$0.24	40K
Gemma-3-27b-it	nebius	$0.1	$0.3	96K
Qwen3-32B	nebius	$0.1	$0.3	128K
xiaomimimo--mimo-v2-flash	novitaai	$0.1	$0.3	262K
Qwen--Qwen3-235B-A22B-Thinking-2507	chutes	$0.11	$0.6	262K
deepseek-v4-flash	baidu	$0.126	$0.252	1M
google--gemma-4-31B-turbo-TEE	chutes	$0.13	$0.38	131K
Hermes-4-70B	nebius	$0.13	$0.4	128K
google--gemma-4-26b-a4b-it	novitaai	$0.13	$0.4	262K

🔧 Tool Calling + Reasoning

Models that can both call tools and reason about when/how to use them. Essential for ReAct-style agents.

Model	Provider	Input $/1M	Output $/1M	Context
openai--gpt-oss-20b	neuralwatt	$0.03	$0.16	?
qwen--qwen3-4b-fp8	novitaai	$0.03	$0.03	128K
gpt-oss-120b	inferencenet	$0.05	$0.45	131K
Qwen--Qwen3.6-35B-A3B	neuralwatt	$0.05	$0.1	?
openai--gpt-oss-120b	novitaai	$0.05	$0.25	131K
qwen3-30b-a3b-fp8	cloudflare	$0.051	$0.335	40K
glm-4.7-flash	cloudflare	$0.06	$0.4	131K
Nemotron-3-Nano-Omni	nebius	$0.06	$0.24	128K
hermes-4-llama-3.1-8b	nousresearch	$0.06	$0.12	131K
seed-1.6-flash	bytedance	$0.07	$0.3	262K
ring-2.6-1t	inclusionai	$0.07	$0.62	262K
zai-org--glm-4.7-flash	novitaai	$0.07	$0.4	200K
microsoft-phi-4-mini-reasoning	microsoft	$0.075	$0.3	128K
Qwen--Qwen3-32B-TEE	chutes	$0.08	$0.24	40K
gpt-oss-120b	clarifai	$0.09	$0.36	131K

💰 Cheapest Tool Calling Models

Most affordable models with tool calling for budget-conscious agent deployments.

Model	Provider	Input $/1M	Output $/1M	Context
ling-2.6-flash	inclusionai	$0.01	$0.03	262K
bdc-coder	inferencenet	$0.01	$0.01	131K
klusterai--Meta-Llama-3.1-8B-Instruct-Turbo	klusterai	$0.015	$0.02	131K
granite-4.0-h-micro	cloudflare	$0.017	$0.112	131K
llama-3.1-8b-instruct--fp-16	inferencenet	$0.02	$0.03	131K
schematron-3b	inferencenet	$0.02	$0.05	131K
schematron-v3	inferencenet	$0.02	$0.05	131K
gpt-oss-20b	inferencenet	$0.03	$0.15	131K
schematron-v2-turbo	inferencenet	$0.03	$0.15	131K
openai--gpt-oss-20b	neuralwatt	$0.03	$0.16	?
qwen--qwen3-4b-fp8	novitaai	$0.03	$0.03	128K
liquid-ai--LFM2-24B-A2B	togetherai	$0.03	$0.12	131K
amazon-nova-micro	amazon	$0.035	$0.14	128K
amazon-nova-micro	amazon-bedrock	$0.035	$0.14	128K
mistral-nemo-12b-instruct--fp-8	inferencenet	$0.0375	$0.1	131K
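
Prices in these tables are quoted per one million tokens, so the cost of a single agent run depends on how many input and output tokens it consumes. The sketch below shows the arithmetic; the function name and the token counts are assumptions chosen for illustration, while the prices are the ling-2.6-flash row above. By hand, 200,000 input tokens and 20,000 output tokens come to 0.2 × $0.01 + 0.02 × $0.03 = $0.0026.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_per_m: float, output_per_m: float) -> float:
    """Cost in USD given token counts and per-1M-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Hypothetical agent run: 200K tokens in, 20K tokens out,
# priced at the ling-2.6-flash rates listed above ($0.01 in, $0.03 out).
cost = request_cost_usd(200_000, 20_000, input_per_m=0.01, output_per_m=0.03)
print(f"${cost:.4f} per run")
```

Agent loops tend to resend the growing conversation on every step, so input tokens often dominate the bill even when the output price is higher.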

🆓 Free Models with Tool Calling

Zero-cost models for building and testing agents.

Model	Provider	Context	Reasoning	Structured Output
openrouter--owl-alpha	openrouter	1M		✅
deepseek--deepseek-v4-flash--free	openrouter	1M	✅
qwen--qwen3-coder--free	openrouter	1M
nvidia--nemotron-3-super-120b-a12b--free	openrouter	1M	✅	✅
gemma-4-26b-a4b-it	auriko	262K	✅	✅
gemma-4-31b-it	auriko	262K	✅	✅
arcee-ai--trinity-large-thinking--free	openrouter	262K	✅
google--gemma-4-26b-a4b-it--free	openrouter	262K	✅	✅
google--gemma-4-31b-it--free	openrouter	262K	✅	✅
nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free	openrouter	256K	✅

🔓 Open-Weight Models with Tool Calling

Run these models locally for full privacy and no per-token API fees; the providers below also offer them as hosted endpoints.

Model	Provider	Context
google--gemma-4-31b-it	orcarouter	1M
qwen--qwen3.5-flash-2026-02-23	orcarouter	1M
qwen--qwen3.5-flash	orcarouter	1M
qwen--qwen3.6-flash-2026-04-16	orcarouter	1M
qwen--qwen3.6-flash	orcarouter	1M
meta-llama-4-maverick-17b	amazon-bedrock	1M
meta-llama-4-scout-17b	amazon-bedrock	1M
minimax-m2-1	amazon-bedrock	1M
minimax-m2-5	amazon-bedrock	1M
minimax-m2	amazon-bedrock	1M

📏 Large Context + Tool Calling

Models with 128K+ context and tool calling for agents that need to process large documents or maintain long conversation history.

Model	Provider	Context	Input $/1M	Reasoning
ling-2.6-flash	inclusionai	262K	$0.01
bdc-coder	inferencenet	131K	$0.01
klusterai--Meta-Llama-3.1-8B-Instruct-Turbo	klusterai	131K	$0.015
granite-4.0-h-micro	cloudflare	131K	$0.017
llama-3.1-8b-instruct--fp-16	inferencenet	131K	$0.02
schematron-3b	inferencenet	131K	$0.02
schematron-v3	inferencenet	131K	$0.02
gpt-oss-20b	inferencenet	131K	$0.03
schematron-v2-turbo	inferencenet	131K	$0.03
qwen--qwen3-4b-fp8	novitaai	128K	$0.03	✅
liquid-ai--LFM2-24B-A2B	togetherai	131K	$0.03
amazon-nova-micro	amazon	128K	$0.035
amazon-nova-micro	amazon-bedrock	128K	$0.035
mistral-nemo-12b-instruct--fp-8	inferencenet	131K	$0.0375
klusterai--Meta-Llama-3.3-70B-Instruct-Turbo	klusterai	131K	$0.038
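
To apply a 128K+ filter like the one above to your own shortlist, the context labels need to be turned into numbers first. The following is a minimal sketch, assuming `K` means 1,000 tokens and `M` means 1,000,000 (the catalog may count differently) and treating `?` as unknown. The helper names are hypothetical.

```python
from typing import Optional


def parse_context(label: str) -> Optional[int]:
    """Convert labels such as '131K', '1M', or '?' to a token count."""
    label = label.strip().upper()
    if not label or label == "?":
        return None  # context window not reported
    multipliers = {"K": 1_000, "M": 1_000_000}  # assumption: decimal units
    suffix = label[-1]
    if suffix in multipliers:
        return int(float(label[:-1]) * multipliers[suffix])
    return int(label)


def meets_min_context(label: str, minimum: int = 128_000) -> bool:
    """True only when the context size is known and at least `minimum`."""
    size = parse_context(label)
    return size is not None and size >= minimum


for label in ["262K", "131K", "40K", "1M", "?"]:
    print(label, meets_min_context(label))
```

Unknown context sizes are rejected rather than guessed, which is the conservative choice when an agent must fit a large document in one request.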

📊 Methodology

All data is sourced from first-party APIs. Agentic capability is defined by tool calling (function calling), reasoning (chain-of-thought), and structured output (JSON mode). Aggregator providers are excluded from ranking tables to avoid duplicate models.
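
The exclusion step described above can be sketched as a filter followed by a sort. This is an illustration under stated assumptions, not the catalog's actual pipeline: the page does not publish its aggregator list, so the `AGGREGATORS` set here is a placeholder, the row layout is hypothetical, and sorting by input price is inferred from how the pricing tables appear to be ordered.

```python
from typing import Dict, Iterable, List, Set

# Assumption: placeholder list; the catalog's real aggregator list is not given.
AGGREGATORS: Set[str] = {"openrouter"}


def rank_by_input_price(rows: Iterable[Dict],
                        aggregators: Set[str] = AGGREGATORS) -> List[Dict]:
    """Drop aggregator-hosted rows, then sort by input price (ascending)."""
    kept = [r for r in rows if r["provider"] not in aggregators]
    # sorted() is stable, so rows with equal prices keep their original order.
    return sorted(kept, key=lambda r: r["input_per_m"])


rows = [
    {"model": "gpt-oss-120b", "provider": "clarifai", "input_per_m": 0.09},
    {"model": "openai--gpt-oss-20b", "provider": "neuralwatt", "input_per_m": 0.03},
    # Hypothetical aggregator row, included only to show the exclusion.
    {"model": "example-model", "provider": "openrouter", "input_per_m": 0.0},
    {"model": "openai--gpt-oss-120b", "provider": "novitaai", "input_per_m": 0.05},
]
for r in rank_by_input_price(rows):
    print(r["model"], r["provider"], r["input_per_m"])
```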

🔗 More Resources

Interactive Catalog — search, filter, compare all models
Best AI Models — curated by use case
Best AI Models for Coding — code-focused comparison
Free AI Models — 81 models at zero cost
LLM Pricing Comparison — detailed pricing tables
OpenAI Alternatives — 95 providers compared
AI Models by Provider — browse by provider
Context Window Comparison — largest context LLMs
GitHub Repository — star, fork, contribute
🔓 Open Source AI Models (527 models)
🎨 Multimodal AI Models (1,548 models)
State of AI Models 2025
Cheapest AI Models — lowest price LLMs
Reasoning Models Comparison — o1, R1, Claude, Gemini compared
Tool Calling Models Comparison — function calling LLMs
AI Model Pricing Calculator — LLM cost calculator
Best AI Models for Image Generation — DALL·E, Imagen, GPT-5 Image compared
Best AI Models for Vision — GPT-4o, Claude, Gemini vision compared
Structured Output Models Comparison — JSON mode, function calling compared

Small Language Models

🎯 AI Model Picker

⚡ GitHub Action