title: "Output Parsers — Validate and Structure Agent Outputs"
description: "Use Pydantic models to validate, structure, and parse CrewAI agent outputs for reliable downstream processing."

Output Parsers — Validate and Structure Agent Outputs

Why Output Parsers Matter

Raw LLM output is unstructured text. For production systems, you need structured, validated data with correct types, required fields, and valid values. Output parsers transform free-form agent responses into reliable data structures.

Why this matters for your career:

  • Structured output is essential for integrating agents into real applications
  • Pydantic validation catches errors before they reach downstream systems
  • Type-safe data eliminates entire categories of bugs
  • Structured outputs enable automated downstream processing
  • Output parsers are expected in production-grade CrewAI systems

The Problem: Unstructured Output

Without an output parser:

Agent says: "I recommend Sunset Ridge Campsite at 1200m elevation near Taipei. It costs $45 per night and has water and toilet facilities."

Downstream code must:
- Parse the text manually with regex
- Hope the format is consistent
- Handle missing fields with defaults
- Deal with typos and formatting changes

This is fragile and error-prone. A slight change in wording breaks everything.
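
To make the fragility concrete, here is a minimal sketch of regex-based parsing. The pattern and the reworded sentence are illustrative assumptions, not output from a real agent run; the point is only that a harmless rephrasing silently defeats the parser.

```python
import re

# Hypothetical pattern written against one specific phrasing of the agent's answer
PRICE_PATTERN = re.compile(r"costs \$(\d+) per night")


def extract_price(text: str):
    """Return the nightly price as an int, or None when the wording does not match."""
    match = PRICE_PATTERN.search(text)
    return int(match.group(1)) if match else None


original = "I recommend Sunset Ridge Campsite. It costs $45 per night and has water."
reworded = "I recommend Sunset Ridge Campsite. The nightly rate is $45, water included."

print(extract_price(original))  # the phrasing the pattern was written for
print(extract_price(reworded))  # same information, different wording
```

The second call finds no match even though the price is still present in the text, which is the failure mode a schema-based parser avoids.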

The Solution: Structured Output

With a Pydantic output parser:

from pydantic import BaseModel, Field
from typing import List, Optional

class CampsiteRecommendation(BaseModel):
    name: str = Field(..., min_length=2, max_length=100)
    elevation: int = Field(..., ge=0, le=10000)
    region: str = Field(..., pattern='^(northern|central|southern|eastern)$')
    amenities: List[str] = Field(..., min_length=1)
    price_per_night: float = Field(..., ge=0, le=1000)
    rating: float = Field(..., ge=1, le=5)
    best_season: str = Field(..., pattern='^(spring|summer|fall|winter|year-round)$')
    has_water: bool
    has_toilet: bool
    has_fire_pit: bool = Field(default=False)
    warnings: List[str] = Field(default_factory=list)

Now the agent produces validated JSON, and downstream code receives typed Python objects with zero parsing effort.
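
The sketch below shows what "validated JSON in, typed objects out" can look like with Pydantic v2 alone, without running an agent. `CampsiteSummary` is a hypothetical, trimmed-down variant of the model above, and both JSON payloads are invented for illustration.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class CampsiteSummary(BaseModel):
    """Hypothetical, shortened version of CampsiteRecommendation."""
    name: str = Field(..., min_length=2, max_length=100)
    elevation: int = Field(..., ge=0, le=10000)
    price_per_night: float = Field(..., ge=0, le=1000)
    amenities: List[str] = Field(..., min_length=1)
    has_water: bool


# Invented payloads standing in for agent output
valid_json = (
    '{"name": "Sunset Ridge", "elevation": 1200, "price_per_night": 45, '
    '"amenities": ["water", "toilet"], "has_water": true}'
)
invalid_json = (
    '{"name": "Sunset Ridge", "elevation": -50, "price_per_night": 45, '
    '"amenities": [], "has_water": true}'
)

# Valid input becomes a typed object; the JSON integer 45 is coerced to a float
site = CampsiteSummary.model_validate_json(valid_json)
print(type(site.price_per_night).__name__, site.price_per_night)

# Invalid input raises one ValidationError that lists every failing field
failed_fields = []
try:
    CampsiteSummary.model_validate_json(invalid_json)
except ValidationError as exc:
    failed_fields = sorted(str(err["loc"][0]) for err in exc.errors())
    print(failed_fields)
```

Because all violations are collected in a single error, a retry prompt can name every field that needs fixing rather than only the first one.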

Pydantic Field Types Reference

| Field Type | Validator | Use Case |
|------------|-----------|----------|
| str with min_length/max_length | Length check | Names, descriptions |
| str with pattern | Regex validation | Region codes, dates, IDs |
| int with ge/le | Range check | Elevation, counts, ratings |
| float with ge/le | Decimal range check | Prices, coordinates |
| List[str] | Type + min_length | Amenities, features |
| bool | True/False | Availability flags |
| Optional[str] | None allowed | Optional descriptions |
| Literal['a', 'b'] | Enum of values | Fixed categories |
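
The last two rows are not used in the campsite model above, so here is a small sketch of them. `RegionInfo` is a hypothetical model; it expresses the same region whitelist as the regex pattern, but as a `Literal` type.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError


class RegionInfo(BaseModel):
    """Hypothetical model showing Literal and Optional fields."""
    # Only these four strings are accepted
    region: Literal['northern', 'central', 'southern', 'eastern']
    # May be omitted or set to None
    description: Optional[str] = None


info = RegionInfo(region='northern')
print(info.description)  # the default applies when the field is omitted

try:
    RegionInfo(region='western')
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

A `Literal` also shows up as an explicit list of allowed values in the generated JSON schema, which can be easier for a model to follow than a regular expression.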

Custom Validators

from typing import Optional

from pydantic import BaseModel, Field, field_validator, model_validator

class TripPlan(BaseModel):
    campsite: str = Field(..., min_length=2)
    start_date: str = Field(..., pattern="^\\d{4}-\\d{2}-\\d{2}$")
    end_date: str = Field(..., pattern="^\\d{4}-\\d{2}-\\d{2}$")
    num_guests: int = Field(..., ge=1, le=50)
    total_cost: float = Field(..., ge=0)
    notes: Optional[str] = None

    @field_validator('campsite')
    @classmethod
    def check_known_campsite(cls, v):
        known = ['Sunset Ridge', 'Forest Creek', 'High Peak', 'Lakeside Haven']
        if v not in known:
            raise ValueError(f"Unknown campsite: {v}")
        return v

    @field_validator('start_date', 'end_date')
    @classmethod
    def check_date_format(cls, v):
        if not v: return v
        parts_list = v.split('-')
        if len(parts_list) != 3:
            raise ValueError(f"Invalid date format: {v}. Use YYYY-MM-DD")
        return v

    @model_validator(mode='after')
    def check_valid_trip(self):
        if self.start_date and self.end_date:
            if self.start_date >= self.end_date:
                raise ValueError("End date must be after start date")
        min_cost = self.num_guests * 500
        if self.total_cost < min_cost:
            raise ValueError(f"Total cost {self.total_cost} seems too low for {self.num_guests} guests")
        return self
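
One limitation of the string-based approach is that the pattern `^\d{4}-\d{2}-\d{2}$` also accepts impossible dates such as `2025-13-45`. A possible alternative, sketched here under the assumption that the agent emits ISO-formatted dates, is to declare the fields as `datetime.date` and let Pydantic parse them. `DatedTripPlan` is a hypothetical variant, not a replacement for the model above.

```python
from datetime import date

from pydantic import BaseModel, Field, ValidationError, model_validator


class DatedTripPlan(BaseModel):
    """Hypothetical TripPlan variant that stores real date objects."""
    start_date: date
    end_date: date
    num_guests: int = Field(..., ge=1, le=50)

    @model_validator(mode='after')
    def check_date_order(self):
        # Compares date objects rather than strings
        if self.start_date >= self.end_date:
            raise ValueError("End date must be after start date")
        return self


def is_valid(**kwargs) -> bool:
    """Return True when the keyword arguments form a valid DatedTripPlan."""
    try:
        DatedTripPlan(**kwargs)
        return True
    except ValidationError:
        return False


plan = DatedTripPlan(start_date="2025-05-01", end_date="2025-05-03", num_guests=4)
nights = (plan.end_date - plan.start_date).days
print(nights)

# Matches the regex shape but is not a calendar date
print(is_valid(start_date="2025-13-45", end_date="2025-14-01", num_guests=2))
# Well-formed dates in the wrong order
print(is_valid(start_date="2025-05-03", end_date="2025-05-01", num_guests=2))
```

With real dates, derived values such as the number of nights become simple arithmetic instead of extra string handling.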

Using Output Parsers in CrewAI

from crewai import Crew, Task

# Assumes camping_expert is an Agent defined elsewhere in your project

recommendation_task = Task(
    description='Recommend a campsite near Taipei with water and toilet.',
    agent=camping_expert,
    output_pydantic=CampsiteRecommendation,
    expected_output='A validated CampsiteRecommendation instance'
)

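crew = Crew(agents=[camping_expert], tasks=[recommendation_task])
result = crew.kickoff()

# kickoff() returns a CrewOutput; the validated model is on its .pydantic attribute
recommendation = result.pydantic
print(f"Name: {recommendation.name}")
print(f"Price: ${recommendation.price_per_night}/night")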

Validation in Action

try:
    result = crew.kickoff()
except Exception as e:
    print(f"Validation failed: {e}")
    # Fall back to re-running with clearer instructions
    # Or flag for human review
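
The retry-or-escalate idea in the comments can be sketched without CrewAI. In the example below, `fake_agent` is a stand-in for whatever produces raw JSON (an assumption for illustration, not a CrewAI API), and `PriceQuote` is a hypothetical model. Returning `None` after the last attempt is one way to signal that the result should go to human review.

```python
from typing import Callable, Optional

from pydantic import BaseModel, Field, ValidationError


class PriceQuote(BaseModel):
    """Hypothetical model used only for this sketch."""
    name: str = Field(..., min_length=2)
    price_per_night: float = Field(..., ge=0, le=1000)


def parse_with_retries(
    produce: Callable[[int], str], max_attempts: int = 3
) -> Optional[PriceQuote]:
    """Call produce(attempt) until its output validates, or give up with None."""
    for attempt in range(1, max_attempts + 1):
        raw = produce(attempt)
        try:
            return PriceQuote.model_validate_json(raw)
        except ValidationError as exc:
            print(f"Attempt {attempt} failed with {exc.error_count()} error(s)")
    return None  # caller can flag this case for human review


def fake_agent(attempt: int) -> str:
    """Stand-in producer: malformed on the first attempt, valid afterwards."""
    if attempt == 1:
        return '{"name": "Sunset Ridge", "price_per_night": "about 45"}'
    return '{"name": "Sunset Ridge", "price_per_night": 45}'


quote = parse_with_retries(fake_agent)
print(quote)
```

In a real crew, the producer would re-run the task, ideally with the validation error text appended to the instructions so the agent knows what to correct.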

Best Practices

| Practice | Reason |
|----------|--------|
| Use Pydantic models for all structured outputs | Guarantees type safety and field presence |
| Apply Field constraints | Catch invalid values at the boundary |
| Use field_validator for business logic | Validate against known data |
| Use model_validator for cross-field rules | Validate relationships between fields |
| Set output_pydantic on Tasks | CrewAI auto-validates against the schema |
| Handle ValidationError gracefully | Retry or fall back instead of crashing |
| Keep models focused | One model per output type |
| Document expected output format | Gives agents clear formatting guidance |
| Include Literal types for fixed categories | Enforces valid values |
| Set sensible defaults for optional fields | Reduces required fields without losing structure |
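
For the "Document expected output format" practice, one option is to generate the format description from the model itself so the two cannot drift apart. The sketch below uses Pydantic v2's `model_json_schema()`; `CampsiteBrief` and the wording of the `expected_output` string are assumptions for illustration.

```python
import json
from typing import List

from pydantic import BaseModel, Field


class CampsiteBrief(BaseModel):
    """Hypothetical model with field descriptions for the agent to read."""
    name: str = Field(..., min_length=2, description="Campsite name")
    price_per_night: float = Field(..., ge=0, le=1000, description="Nightly price")
    amenities: List[str] = Field(default_factory=list, description="Available amenities")


# JSON Schema derived from the model: types, constraints, and descriptions
schema = CampsiteBrief.model_json_schema()

# A string that could be used as a Task's expected_output
expected_output = (
    "A JSON object matching this schema:\n" + json.dumps(schema, indent=2)
)
print(expected_output)
```

Fields with defaults, such as `amenities` here, are left out of the schema's `required` list, so the agent is told which fields it may omit.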

Complete Production Example

from pydantic import BaseModel, Field, field_validator, model_validator
from typing import List, Optional
from crewai import Agent, Task, Crew, Process

class FullRecommendation(BaseModel):
    campsite_name: str = Field(..., min_length=2, max_length=100)
    elevation: int = Field(..., ge=0, le=10000)
    region: str = Field(..., pattern='^(northern|central|southern|eastern)$')
    distance_from_taipei_km: float = Field(..., ge=0, le=500)
    amenities: List[str] = Field(..., min_length=1)
    price_per_night: float = Field(..., ge=0, le=1000)
    rating: float = Field(..., ge=1, le=5)
    best_season: str = Field(..., pattern='^(spring|summer|fall|winter|year-round)$')
    has_water: bool
    has_toilet: bool
    has_fire_pit: bool = False
    has_parking: bool = False
    suitable_for_families: bool
    warnings: List[str] = Field(default_factory=list)
    notes: Optional[str] = None

    @field_validator('campsite_name')
    @classmethod
    def validate_name(cls, v):
        known_sites = ['Sunset Ridge', 'Forest Creek', 'High Peak', 'Lakeside Haven']
        if v not in known_sites:
            raise ValueError(f'Unknown campsite: {v}')
        return v

    @field_validator('amenities')
    @classmethod
    def validate_amenities(cls, v):
        valid_amenities = ['water', 'toilet', 'fire_pit', 'parking', 'shower', 'electric', 'fishing', 'hiking', 'bbq']
        for a in v:
            if a not in valid_amenities:
                raise ValueError(f'Invalid amenity: {a}')
        return v

    @model_validator(mode='after')
    def check_warnings(self):
        warnings = []
        if self.elevation > 2500:
            warnings.append('High elevation — prepare for cold nights')
        if not self.has_water:
            warnings.append('No water source — bring your own water')
        if not self.has_toilet:
            warnings.append('No toilet facilities')
        if self.price_per_night > 500:
            warnings.append('Premium pricing')
        self.warnings = warnings
        return self

# Use in a task (assumes camping_expert is an Agent defined elsewhere in your project)
recommendation_task = Task(
    description='Find a family-friendly campsite near Taipei with water and toilet. Provide a full recommendation.',
    agent=camping_expert,
    output_pydantic=FullRecommendation,
    expected_output='A validated FullRecommendation instance'
)

crew = Crew(
    agents=[camping_expert],
    tasks=[recommendation_task],
    verbose=True
)

result = crew.kickoff()

# kickoff() returns a CrewOutput; the validated model is on its .pydantic attribute
recommendation = result.pydantic
print(f"Recommendation: {recommendation.campsite_name}")
print(f"Elevation: {recommendation.elevation}m")
print(f"Region: {recommendation.region}")
print(f"Price: NT${recommendation.price_per_night}/night")
print(f"Rating: {recommendation.rating}/5")
print(f"Amenities: {', '.join(recommendation.amenities)}")
print(f"Suitable for families: {recommendation.suitable_for_families}")
if recommendation.warnings:
    print(f"Warnings: {', '.join(recommendation.warnings)}")
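
The warning logic can be checked without calling an LLM by constructing the model directly. The sketch below uses `SiteCheck`, a hypothetical cut-down version of `FullRecommendation` that keeps only two of the warning rules, and then converts the result to a plain dict for downstream code.

```python
from typing import List

from pydantic import BaseModel, Field, model_validator


class SiteCheck(BaseModel):
    """Hypothetical, reduced version of FullRecommendation."""
    elevation: int = Field(..., ge=0, le=10000)
    has_water: bool
    warnings: List[str] = Field(default_factory=list)

    @model_validator(mode='after')
    def derive_warnings(self):
        # Rebuild the list from the other fields, replacing any supplied value
        warnings = []
        if self.elevation > 2500:
            warnings.append('High elevation')
        if not self.has_water:
            warnings.append('No water source')
        self.warnings = warnings
        return self


high_dry = SiteCheck(elevation=3000, has_water=False)
low_wet = SiteCheck(elevation=800, has_water=True, warnings=['supplied by the agent'])

print(high_dry.warnings)
print(low_wet.warnings)

# Plain dict for storage, APIs, or further processing
payload = high_dry.model_dump()
print(payload)
```

Note the design consequence: because the validator overwrites `warnings`, anything the agent puts in that field is discarded. If agent-supplied warnings should be kept, the validator would need to extend the existing list instead of replacing it.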

Summary

Output parsers with Pydantic transform unstructured agent outputs into validated, type-safe data structures. They catch errors early, eliminate manual parsing, and enable reliable downstream processing in production systems.

Key takeaways:

  • Pydantic BaseModel defines the schema with types and constraints
  • Field() sets validation rules (min/max length, range, regex pattern)
  • field_validator applies custom logic to individual fields
  • model_validator handles cross-field validation and generates warnings
  • output_pydantic in Task auto-validates agent output against the model
  • Invalid outputs trigger ValidationError for graceful handling
  • Structured output eliminates manual parsing entirely
  • Type-safe data = fewer bugs, more reliable systems, easier maintenance

What's Next: Human-in-the-Loop

The next chapter covers human-in-the-loop workflows — adding approval gates, task callbacks, and human review for critical agent decisions.
