Choosing LLMs for CrewAI Agents — GPT, Claude, Gemini, Open-Source

Why LLM Choice Matters

The LLM you choose directly determines your agent's capabilities — reasoning quality, speed, cost, and supported features like tool use and structured output. Choosing the wrong model leads to poor results or excessive costs.

Why this matters for your career:

  • LLM selection is a key skill for building effective AI agents
  • Cost optimization (choosing the right model for each task) can cut API costs by 50-90% on tasks that do not need a frontier model
  • Understanding model strengths helps you design better agent architectures
  • Multi-model strategies (using different models for different agents) maximize quality and efficiency

LLM Comparison

| Model | Reasoning | Coding | Creative Writing | Structured Output | Speed | Cost per 1M tokens (input / output) |
|-------|-----------|--------|------------------|-------------------|-------|-------------------------------------|
| GPT-4o | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Fast | $2.50 / $10.00 |
| GPT-4o-mini | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Very Fast | $0.15 / $0.60 |
| Claude 3.5 Sonnet | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Fast | $3.00 / $15.00 |
| Claude 3 Haiku | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Very Fast | $0.25 / $1.25 |
| Gemini 1.5 Pro | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | Fast | $1.25 / $5.00 |
| Gemini 1.5 Flash | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | Very Fast | $0.075 / $0.30 |
| Llama 3.1 70B | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | Medium | Free (self-hosted) |
| Llama 3.1 8B | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐ | Very Fast | Free (self-hosted) |
| Mistral Large | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | Fast | $2.00 / $6.00 |
| Mistral Small | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | Very Fast | $0.20 / $0.60 |
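To turn the per-token prices in the table into a per-call figure, a small estimator can help. The sketch below is only an illustration: the dictionary keys are hypothetical short labels (not API model identifiers), the prices are read as input / output per 1M tokens, and the token counts in the usage lines are made up.

```python
# USD per 1M tokens as (input, output), copied from the comparison table.
# The keys are illustrative labels, not official model identifiers.
PRICES_PER_1M = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3.5-sonnet": (3.00, 15.00),
    "claude-3-haiku": (0.25, 1.25),
    "gemini-1.5-pro": (1.25, 5.00),
    "gemini-1.5-flash": (0.075, 0.30),
    "mistral-large": (2.00, 6.00),
    "mistral-small": (0.20, 0.60),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single call."""
    input_price, output_price = PRICES_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2,000 input tokens and 500 output tokens per call.
for name in ("gpt-4o", "gpt-4o-mini"):
    print(f"{name}: ${estimate_cost(name, 2_000, 500):.6f} per call")
```

Because output tokens cost several times more than input tokens for most models in the table, agents that produce long answers shift the comparison more than agents that mostly read.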

Matching Models to Agent Roles

🧠 Reasoning & Analysis Agent

Best model: Claude 3.5 Sonnet or GPT-4o

Use for agents that need deep reasoning, analysis, and decision-making:

  • Financial analysis agent
  • Data interpretation agent
  • Strategy formulation agent
  • Scientific research agent

from crewai import Agent

analyst = Agent(
    role='Senior Data Analyst',
    goal='Analyze complex datasets and provide actionable insights.',
    backstory='You are a world-class data analyst with expertise in statistical analysis.',
    llm='gpt-4o',  # Strong reasoning for complex analysis
    verbose=True
)

💻 Coding Agent

Best model: GPT-4o or Claude 3.5 Sonnet

Use for agents that write, review, or debug code:

  • Code generation agent
  • Code review agent
  • Test writing agent
  • Bug fixing agent

coder = Agent(
    role='Senior Software Engineer',
    goal='Write clean, efficient, well-tested code.',
    backstory='You are a senior engineer with expertise in full-stack development.',
    llm='gpt-4o',  # Best for code generation
    verbose=True
)

✍️ Creative Writing Agent

Best model: Claude 3.5 Sonnet

Use for agents that create content, markdown, or documentation:

  • Documentation writer
  • Content creator
  • Marketing copywriter
  • Tutorial generator

writer = Agent(
    role='Technical Writer',
    goal='Create clear, engaging documentation and tutorials.',
    backstory='You are an experienced technical writer specializing in developer documentation.',
    llm='claude-3-5-sonnet-20241022',  # Best for writing quality
    verbose=True
)

📋 Structured Output Agent

Best model: GPT-4o or Mistral Large

Use for agents that need consistent JSON, schema, or formatted output:

  • JSON formatter
  • Data extraction agent
  • API response generator
  • Report generator

formatter = Agent(
    role='Data Formatter',
    goal='Extract and structure data into JSON format.',
    backstory='You are a data processing expert. You always output valid JSON.',
    llm='gpt-4o',  # Excellent JSON mode
    verbose=True
)

Multi-Model Strategies

Use different models for different agents in the same crew:

# Use expensive model for critical reasoning
planner = Agent(llm='gpt-4o', ...)

# Use cheap model for simple tasks
researcher = Agent(llm='gpt-4o-mini', ...)

# Use open-source for sensitive data
local_processor = Agent(
    llm='ollama/llama3.1:70b',  # Runs locally, data never leaves your server
    ...
)
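
The same split can be written as a small routing function, so the choice of model lives in one place. This is a minimal sketch under stated assumptions: `choose_model`, its `complexity` and `sensitive` parameters, and the crew layout are hypothetical names made up for illustration; only the three model strings come from the snippet above.

```python
def choose_model(complexity: str, sensitive: bool = False) -> str:
    """Pick a model identifier string for an agent.

    complexity: "high" for critical reasoning; anything else counts as routine.
    sensitive:  True when the data must stay on your own hardware.
    """
    if sensitive:
        return "ollama/llama3.1:70b"  # local model, data stays on your server
    if complexity == "high":
        return "gpt-4o"  # expensive model for critical reasoning
    return "gpt-4o-mini"  # cheap model for simple tasks


# Hypothetical crew layout: (agent name, complexity, handles sensitive data)
crew_plan = [
    ("planner", "high", False),
    ("researcher", "low", False),
    ("local_processor", "low", True),
]

assignments = {name: choose_model(level, private) for name, level, private in crew_plan}
print(assignments)
```

The returned string can be passed as the `llm` argument in the same way as the hard-coded identifiers above. Checking sensitivity first means a high-complexity agent that handles private data still stays local.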

Cost Optimization Example

| Agent | Task Complexity | Model | Cost per 1000 Tasks |
|-------|-----------------|-------|---------------------|
| Planner | High | GPT-4o | $2.50 |
| Researcher | Medium | GPT-4o-mini | $0.15 |
| Writer | Medium | Claude 3 Haiku | $0.25 |
| Reviewer | High | GPT-4o | $2.50 |
| Total | | | $5.40 |

Using GPT-4o for everything: $2.50 × 4 = $10.00. The mixed crew costs $5.40, a saving of 46%.
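
The comparison can be checked with a few lines of arithmetic. The sketch below takes the per-agent figures from the table and assumes the table's own GPT-4o figure ($2.50 per 1,000 tasks) as the baseline for an all-GPT-4o crew; the variable names are illustrative.

```python
# Cost per 1,000 tasks in USD, taken from the table above.
mixed_crew = {
    "Planner": 2.50,     # GPT-4o
    "Researcher": 0.15,  # GPT-4o-mini
    "Writer": 0.25,      # Claude 3 Haiku
    "Reviewer": 2.50,    # GPT-4o
}
GPT4O_COST = 2.50  # the table's GPT-4o figure, used as the baseline

mixed_total = sum(mixed_crew.values())
baseline_total = GPT4O_COST * len(mixed_crew)
savings_pct = (baseline_total - mixed_total) / baseline_total * 100

print(f"Mixed crew:  ${mixed_total:.2f}")
print(f"All GPT-4o:  ${baseline_total:.2f}")
print(f"Savings:     {savings_pct:.0f}%")
```

With these inputs the mixed crew comes to $5.40 against $10.00, so the saving is about 46%. The saving grows as more of the crew's work moves to the cheaper models.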

Self-Hosted Models

For data privacy or cost control, run open-source models locally:

# Pull the models with Ollama (shell commands)
ollama pull llama3.1:70b
ollama pull mistral:7b
ollama pull qwen2.5:32b

# Reference the local model in CrewAI (Python)
local_agent = Agent(
    llm='ollama/llama3.1:70b',
    ...
)

| Model | RAM Required | Speed | Quality |
|-------|--------------|-------|---------|
| Llama 3.1 8B | 8 GB | Very fast | Good for simple tasks |
| Llama 3.1 70B | 48 GB | Medium | Excellent (close to GPT-4) |
| Mistral 7B | 8 GB | Very fast | Good for structured output |
| Qwen 2.5 32B | 24 GB | Fast | Very good for reasoning |
| DeepSeek Coder V2 | 16 GB | Fast | Excellent for code tasks |
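
A quick way to use the table is to filter it by the memory you actually have. The following is a rough sketch: the RAM figures are copied from the table, while `models_that_fit` and the 32 GB example are hypothetical and ignore factors such as quantization level and context length.

```python
# (model, RAM required in GB), copied from the table above.
LOCAL_MODELS = [
    ("Llama 3.1 8B", 8),
    ("Llama 3.1 70B", 48),
    ("Mistral 7B", 8),
    ("Qwen 2.5 32B", 24),
    ("DeepSeek Coder V2", 16),
]


def models_that_fit(available_ram_gb: float) -> list[str]:
    """Return the models that fit in the given RAM, largest requirement first."""
    fitting = [(name, ram) for name, ram in LOCAL_MODELS if ram <= available_ram_gb]
    # A stable sort keeps table order among models with equal RAM needs.
    fitting.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in fitting]


# Hypothetical machine with 32 GB of RAM.
print(models_that_fit(32))
```

Sorting by requirement puts the largest model that still fits first, which is usually the one worth benchmarking before the smaller options.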

Summary

Choose your LLM based on the agent's task type. Use GPT-4o or Claude 3.5 Sonnet for complex reasoning and coding. Use GPT-4o-mini or Claude 3 Haiku for simple tasks to save costs. Use self-hosted models for data privacy. A multi-model strategy gives the best quality-to-cost ratio.

Key takeaways:

  • GPT-4o and Claude 3.5 Sonnet are best for reasoning, coding, and complex tasks
  • GPT-4o-mini and Claude 3 Haiku are cost-effective for simple tasks
  • Gemini 1.5 Flash is the cheapest hosted option with decent quality
  • Llama 3.1 70B is the strongest open-source model in this comparison (close to GPT-4 quality)
  • Use a multi-model strategy: expensive models for critical agents, cheap for routine
  • Self-host models for data privacy or to avoid API costs
  • Consider latency and throughput requirements (cheaper models are usually faster)
  • Always benchmark with your specific use case before committing
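
Benchmarking on your own use case can start very small. The harness below is a sketch, not a full evaluation framework: `benchmark`, `fake_model`, and the test cases are all hypothetical, and the substring check stands in for whatever quality measure fits your task. In practice `call_model` would wrap a call to your agent or crew.

```python
import time
from statistics import mean
from typing import Callable


def benchmark(call_model: Callable[[str], str], cases: list[tuple[str, str]]) -> dict:
    """Run each (prompt, expected_substring) case; report accuracy and mean latency."""
    latencies = []
    passed = 0
    for prompt, expected in cases:
        start = time.perf_counter()
        answer = call_model(prompt)
        latencies.append(time.perf_counter() - start)
        if expected.lower() in answer.lower():
            passed += 1
    return {"accuracy": passed / len(cases), "avg_latency_s": mean(latencies)}


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; replace with your own agent invocation."""
    return "Paris" if "France" in prompt else "I am not sure"


cases = [("Capital of France?", "Paris"), ("Capital of Peru?", "Lima")]
result = benchmark(fake_model, cases)
print(result)
```

Running the same cases against each candidate model gives a like-for-like view of quality and latency before you commit to one.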

What's Next: Advanced Prompting

The next chapter covers advanced prompting techniques for CrewAI agents — chain-of-thought, few-shot prompting, role prompting, and structured output formatting.

Setting the LLM in CrewAI

# Option 1: Use a string identifier
agent = Agent(
    llm='gpt-4o',
    ...
)

# Option 2: Use a ChatOpenAI instance from LangChain
from langchain_openai import ChatOpenAI

agent = Agent(
    llm=ChatOpenAI(
        model='gpt-4o-mini',
        temperature=0.3,
        max_tokens=4096
    ),
    ...
)

# Option 3: Use Ollama for local models
agent = Agent(
    llm='ollama/llama3.1:70b',
    ...
)

# Option 4: Use Anthropic Claude
agent = Agent(
    llm='claude-3-5-sonnet-20241022',
    ...
)
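
The string identifiers above come in two shapes: a bare model name such as `gpt-4o`, and a `provider/model` form such as `ollama/llama3.1:70b`. For logging or auditing which agents run locally, a small helper can split them. This is an illustrative sketch only; `split_model_id` and `runs_locally` are hypothetical helpers and do not reflect how CrewAI itself resolves identifiers.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model); bare names get provider ''."""
    provider, separator, name = model_id.partition("/")
    if not separator:
        return "", model_id
    return provider, name


def runs_locally(model_id: str) -> bool:
    """Treat identifiers with the 'ollama' prefix as local models (an assumption)."""
    provider, _ = split_model_id(model_id)
    return provider == "ollama"


for model_id in ("gpt-4o", "ollama/llama3.1:70b", "claude-3-5-sonnet-20241022"):
    print(model_id, split_model_id(model_id), runs_locally(model_id))
```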

Recommendation Matrix

| Your Priority | Recommended Model | Runner Up |
|---------------|-------------------|-----------|
| Best quality (no budget limit) | Claude 3.5 Sonnet | GPT-4o |
| Best value (quality per dollar) | GPT-4o-mini | Claude 3 Haiku |
| Cheapest | Gemini 1.5 Flash | GPT-4o-mini |
| Data privacy (self-hosted) | Llama 3.1 70B | Qwen 2.5 32B |
| Fastest execution | GPT-4o-mini | Claude 3 Haiku |
| Best for coding | GPT-4o | Claude 3.5 Sonnet |
| Best for creative writing | Claude 3.5 Sonnet | GPT-4o |
| Best JSON/structured output | GPT-4o | Mistral Large |

This matrix helps you quickly choose the right model based on your priorities.
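
The matrix can also be kept as a lookup with a fallback to the runner-up when the first choice is not available to you (for example, no account with that provider). The sketch below copies the pairs from the matrix; the short priority keys, the `recommend` function, and the `available` set are hypothetical.

```python
from typing import Optional

# priority -> (recommended, runner-up), copied from the matrix above.
RECOMMENDATIONS = {
    "best quality": ("Claude 3.5 Sonnet", "GPT-4o"),
    "best value": ("GPT-4o-mini", "Claude 3 Haiku"),
    "cheapest": ("Gemini 1.5 Flash", "GPT-4o-mini"),
    "data privacy": ("Llama 3.1 70B", "Qwen 2.5 32B"),
    "fastest": ("GPT-4o-mini", "Claude 3 Haiku"),
    "coding": ("GPT-4o", "Claude 3.5 Sonnet"),
    "creative writing": ("Claude 3.5 Sonnet", "GPT-4o"),
    "structured output": ("GPT-4o", "Mistral Large"),
}


def recommend(priority: str, available: set) -> Optional[str]:
    """Return the first of (recommended, runner-up) that is in `available`."""
    for candidate in RECOMMENDATIONS[priority]:
        if candidate in available:
            return candidate
    return None


# Hypothetical setup with access to OpenAI models only.
available = {"GPT-4o", "GPT-4o-mini"}
print(recommend("best quality", available))
print(recommend("data privacy", available))
```

Returning `None` when neither option is available makes the gap explicit instead of silently picking an unrelated model.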
