Handling Hallucinations — Ensuring Agent Accuracy
Why Hallucination Handling Matters
All LLMs sometimes generate plausible-sounding but incorrect information — this is called hallucination. In a multi-agent system, a hallucination in one agent's output can cascade through the entire crew, producing completely wrong results.
Why this matters for your career:
- Hallucination handling separates demo-quality from production-quality agents
- Building trust in AI systems requires managing errors transparently
- Regulated domains (for example, personal data under GDPR or health records under HIPAA) may require outputs that are accurate and auditable
- Clients and users expect reliable outputs from AI systems
What Are Hallucinations?
| Type | Description | Example |
|------|-------------|---------|
| Factual error | Incorrect fact presented as true | "Taiwan's highest peak is 2,500m" (correct: 3,952m) |
| Fabricated data | Making up entities or events | Describing a campsite that doesn't exist |
| Calculation error | Wrong arithmetic or logic | "3 nights × $45 = $120" (correct: $135) |
| False citation | Referencing non-existent sources | "According to a 2025 Taiwan Camping Survey..." |
| Location error | Incorrect geographic information | "Sun Moon Lake is in Taipei" (it's in Nantou) |
| Temporal error | Wrong dates, seasons, or hours | "Open in January" when actually closed in winter |
Why Agents Hallucinate
| Cause | Explanation | Mitigation |
|-------|-------------|------------|
| Ambiguous instructions | Agent doesn't know what's real vs. generated | Be specific about data sources |
| No data access | Agent invents facts when lacking information | Provide lookup tools for real data |
| Overconfidence | LLMs are trained to be helpful, not to say "I don't know" | Instruct agents to acknowledge uncertainty |
| Context limits | Agent forgets early parts of long conversations | Summarize and pass only relevant context |
| Model bias | Training data contains outdated or incorrect info | Use up-to-date models with retrieval augmentation |
Detection Strategies
1. Confidence Scoring
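Self-reported confidence is a rough signal, not a calibrated probability, so use it to route outputs for review rather than as proof of correctness. Ask agents to rate their own confidence after each response by adding instructions like these to the task description: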
After providing your answer, add a confidence score:
CONFIDENCE: HIGH / MEDIUM / LOW
- HIGH: You are certain — the information comes from your tools or reliable data sources
- MEDIUM: You are reasonably sure but some details might vary
- LOW: You are not sure — this may be incorrect or based on incomplete information
If your confidence is LOW, clearly state that the user should verify the information.
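A confidence label is only useful if downstream code reads it. The sketch below is one way to extract the label from a response and decide whether to route it to review. The function names `extract_confidence` and `needs_review` are hypothetical, and it assumes the agent puts the label at the start of a line, as the prompt above requests.

```python
import re
from typing import Optional

# Matches a line such as "CONFIDENCE: HIGH" anywhere in the response
CONFIDENCE_PATTERN = re.compile(
    r"^\s*CONFIDENCE:\s*(HIGH|MEDIUM|LOW)\b", re.IGNORECASE | re.MULTILINE
)


def extract_confidence(response_text: str) -> Optional[str]:
    """Return 'HIGH', 'MEDIUM', or 'LOW', or None if no valid label is present."""
    matches = CONFIDENCE_PATTERN.findall(response_text)
    if not matches:
        return None
    # Use the last label in case the model restates the instructions first
    return matches[-1].upper()


def needs_review(response_text: str) -> bool:
    """Route to human review when the label is LOW or missing."""
    return extract_confidence(response_text) in (None, "LOW")
```

Treating a missing label the same as LOW is a deliberate choice here: an agent that ignored the instruction is not a reason to trust its answer more.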
2. Tool-Based Fact-Checking
Require agents to use tools for factual claims instead of guessing:
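import requests
from crewai import Agent
from crewai.tools import tool  # older releases exposed this as `from crewai_tools import tool`


@tool("Lookup Campsite")
def lookup_campsite(name: str) -> str:
    """Look up a campsite by name in the database. Returns NOT_FOUND if it doesn't exist."""
    try:
        # Pass the name via `params` so requests URL-encodes spaces and special characters
        response = requests.get(
            "https://api.example.com/campsites", params={"name": name}, timeout=10
        )
    except requests.RequestException:
        return "NOT_FOUND"
    if response.status_code == 200:
        data = response.json()
        if data:
            return str(data[0])
    return "NOT_FOUND"


@tool("Get Elevation")
def get_elevation(lat: float, lng: float) -> str:
    """Get elevation in meters for a coordinate location. Returns UNAVAILABLE on failure."""
    try:
        response = requests.get(
            f"https://api.open-elevation.com/api/v1/lookup?locations={lat},{lng}",
            timeout=10,
        )
    except requests.RequestException:
        return "UNAVAILABLE"
    if response.status_code == 200:
        return str(response.json()['results'][0]['elevation'])
    return "UNAVAILABLE"


# Tools give the agent real data, but only its instructions tell it to rely on them
agent = Agent(
    role='Campsite Researcher',
    goal='Provide accurate campsite information using database lookups.',
    backstory='You report only facts returned by your tools and say so when a lookup fails.',
    tools=[lookup_campsite, get_elevation],
    verbose=True
)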
3. Cross-Verification
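Ask the same question in two independent calls and compare the answers. Agreement between two calls to the same model catches random errors but not mistakes the model makes consistently, so use a different model or a tool lookup for the second check where possible: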
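from crewai import Agent, Task, Crew

CLAIM = 'Yu Shan is 3,952 meters tall.'

checker_1 = Agent(
    role='Fact Checker 1',
    goal='Independently verify a factual claim.',
    backstory='You verify claims carefully and answer only with what you can support.',
    llm='gpt-4o',
    verbose=True
)
checker_2 = Agent(
    role='Fact Checker 2',
    goal='Independently verify the same factual claim.',
    backstory='You verify claims carefully and answer only with what you can support.',
    llm='gpt-4o',
    verbose=True
)
verify_task_1 = Task(
    description=f'Is this claim correct: "{CLAIM}" Answer only YES or NO.',
    agent=checker_1,
    expected_output='YES or NO'
)
verify_task_2 = Task(
    description=f'Is this claim correct: "{CLAIM}" Answer only YES or NO.',
    agent=checker_2,
    expected_output='YES or NO'
)

# Run each check in its own crew: in a single sequential crew, the second task
# receives the first task's output as context, so the checks would not be independent
answers = []
for checker, task in [(checker_1, verify_task_1), (checker_2, verify_task_2)]:
    crew = Crew(agents=[checker], tasks=[task])
    answers.append(crew.kickoff().raw.strip().upper())

if all(answer.startswith('YES') for answer in answers):
    print('Both checkers say YES: high confidence')
else:
    print(f'At least one checker did not say YES ({answers}): flag for human review')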
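The agreement rule for two or more checkers can be made explicit in plain Python, independent of any agent framework. This is a minimal sketch with hypothetical helper names (`normalize_verdict`, `combine_verdicts`). It assumes each checker returns a short free-text answer and treats anything other than a clean YES or NO as unclear.

```python
from typing import List


def normalize_verdict(raw_answer: str) -> str:
    """Map a free-text answer to 'YES', 'NO', or 'UNCLEAR'."""
    cleaned = raw_answer.strip().upper().rstrip(".!")
    if cleaned in ("YES", "NO"):
        return cleaned
    return "UNCLEAR"


def combine_verdicts(raw_answers: List[str]) -> str:
    """Return 'verified', 'rejected', or 'needs_review' for a list of checker answers."""
    verdicts = [normalize_verdict(answer) for answer in raw_answers]
    if verdicts and all(verdict == "YES" for verdict in verdicts):
        return "verified"
    if verdicts and all(verdict == "NO" for verdict in verdicts):
        return "rejected"
    # Disagreement, unclear answers, or no answers at all go to a human
    return "needs_review"
```

Requiring unanimity is conservative: a single dissenting or unclear answer sends the claim to review instead of letting a majority outvote it.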
4. Pydantic Validation
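Use Pydantic models to validate structured outputs. A schema catches malformed fields and out-of-range values, but not a plausible value that is simply wrong: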
from typing import Optional

from crewai import Agent, Task, Crew
from pydantic import BaseModel, Field, field_validator


class CampsiteInfo(BaseModel):
    name: str = Field(..., min_length=2, max_length=100)
    elevation: Optional[int] = Field(None, ge=0, le=10000)  # meters
    has_water: bool
    has_toilet: bool
    price_per_night: Optional[float] = Field(None, ge=0, le=10000)
    region: str = Field(..., pattern='^(northern|central|southern|eastern)$')

    # Pydantic v2 validator syntax (the `pattern=` argument above also requires v2)
    @field_validator('name')
    @classmethod
    def must_be_known_campsite(cls, v: str) -> str:
        known_campsites = ['Sunset Ridge', 'Forest Creek', 'High Peak', 'Lakeside Haven']
        if v not in known_campsites:
            raise ValueError(f'Unknown campsite: {v}')
        return v


camping_expert = Agent(
    role='Camping Expert',
    goal='Provide accurate campsite information.',
    backstory='You know the campsites in the database and never invent details.',
)

# Use the model in a task
task = Task(
    description='Provide information about the campsite: Sunset Ridge',
    expected_output='Campsite details matching the CampsiteInfo schema',
    agent=camping_expert,
    output_pydantic=CampsiteInfo  # CrewAI will validate against this schema
)
crew = Crew(
    agents=[camping_expert],
    tasks=[task]
)
result = crew.kickoff()
# kickoff() returns a CrewOutput; the validated CampsiteInfo instance is in result.pydantic
info = result.pydantic
print(f"Name: {info.name}")
print(f"Elevation: {info.elevation}m")
print(f"Has water: {info.has_water}")
5. Graceful Fallbacks
When an agent produces low-confidence output, fall back gracefully:
def safe_execute_crew(crew, inputs):
    """Execute a crew with hallucination detection and graceful fallback."""
    try:
        result = crew.kickoff(inputs=inputs)
        result_str = str(result).lower()
        # Check for low-confidence indicators (all lowercase, to match result_str)
        uncertainty_markers = ['confidence: low', 'i think', 'maybe', 'not sure', 'approximately', 'could be', 'possibly', 'might be']
        if any(marker in result_str for marker in uncertainty_markers):
            # Flag for human review
            print("Low confidence detected: flagging for review")
            return {
                'status': 'needs_review',
                'output': result,
                'message': 'The agent was uncertain. A human should verify this output.'
            }
        # Check for refusal indicators (all lowercase, to match result_str)
        refusal_markers = ['i cannot', "i can't", 'unable to', 'do not have enough']
        if any(marker in result_str for marker in refusal_markers):
            print("Agent could not complete the task")
            return {
                'status': 'incomplete',
                'output': result,
                'message': 'The agent could not fully complete this task.'
            }
        return {'status': 'success', 'output': result}
    except Exception as e:
        # Covers API failures as well as output that fails schema validation
        print(f"Crew execution failed: {e}")
        return {
            'status': 'error',
            'output': None,
            'error': str(e),
            'message': 'An error occurred. Please try again or contact support.'
        }
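Plain substring checks are easy to get wrong: a marker can match inside an unrelated word, and matching is sensitive to letter case. One possible refinement, sketched below with a hypothetical `classify_output` helper and illustrative phrase lists, is to compile the markers into case-insensitive regular expressions anchored at word boundaries.

```python
import re

# Illustrative phrase lists; tune them against your own agents' outputs
UNCERTAINTY_PHRASES = ["i think", "maybe", "not sure", "could be", "possibly", "might be"]
REFUSAL_PHRASES = ["i cannot", "i can't", "unable to", "do not have enough"]


def _compile(phrases):
    # \b anchors each phrase at word boundaries, so "i think" does not match
    # inside text such as "Wulai think"
    alternatives = "|".join(re.escape(phrase) for phrase in phrases)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


UNCERTAINTY_RE = _compile(UNCERTAINTY_PHRASES)
REFUSAL_RE = _compile(REFUSAL_PHRASES)


def classify_output(text: str) -> str:
    """Return 'needs_review', 'incomplete', or 'success' for an agent's raw text."""
    if UNCERTAINTY_RE.search(text):
        return "needs_review"
    if REFUSAL_RE.search(text):
        return "incomplete"
    return "success"
```

Keyword matching remains a heuristic either way. It flags hedged wording, not wrong facts, so it complements tool lookups and schema validation without replacing them.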
Putting It All Together
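# Complete production pattern:
# 1. Tools provide real data
# 2. Task asks for confidence score
# 3. Pydantic model validates output
# 4. Cross-verify critical claims (omitted here; see Cross-Verification)
# 5. Graceful fallback on failure
from crewai import Agent, Task, Crew, Process
from pydantic import BaseModel, Field


class Recommendation(BaseModel):
    campsite_name: str = Field(..., min_length=2)
    confidence: str = Field(..., pattern='^(HIGH|MEDIUM|LOW)$')
    reason: str = Field(..., min_length=10)


# lookup_campsite and safe_execute_crew are defined in the sections above
camping_expert = Agent(
    role='Camping Expert',
    goal='Recommend campsites that exist in the database.',
    backstory='You recommend only campsites you have verified with your tools.',
    tools=[lookup_campsite],
)
research_task = Task(
    description='''
    Recommend a campsite for this request: {query}
    Use the lookup tool to verify the campsite exists.
    Set confidence to HIGH, MEDIUM, or LOW.
    ''',
    agent=camping_expert,
    output_pydantic=Recommendation,
    # Do not ask for HIGH confidence here, or the agent will simply report HIGH
    expected_output='A Recommendation with campsite_name, confidence, and reason'
)
crew = Crew(
    agents=[camping_expert],
    tasks=[research_task],
    process=Process.sequential,
    verbose=True
)
result = safe_execute_crew(crew, inputs={'query': 'camping near Taipei with water and toilet'})
# The validated Recommendation (if any) is on the CrewOutput's .pydantic attribute
recommendation = result['output'].pydantic if result['output'] else None
if result['status'] == 'success' and recommendation and recommendation.confidence != 'LOW':
    print(f"✅ Recommendation: {recommendation.campsite_name} ({recommendation.confidence})")
elif result['status'] in ('success', 'needs_review'):
    print(f"⚠️ Needs human review: {result['output']}")
else:
    print(f"❌ Error: {result['message']}")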
Best Practices Summary
| Practice | Why |
|----------|-----|
| Always provide tools for real data | Agents should look up facts, never guess |
| Include confidence scoring in tasks | Users and downstream systems know how much to trust an output |
| Cross-verify critical claims | Two independent checks reduce hallucination risk |
| Validate outputs with Pydantic schemas | Catch malformed or invalid responses early |
| Implement graceful fallbacks | System should degrade gracefully, not crash |
| Log all agent outputs | Detect hallucination patterns over time |
| Add human review gates for critical decisions | Some decisions need human judgment |
| Use up-to-date models | Newer models tend to hallucinate less, though none are immune |
| Keep prompts specific | Vague prompts increase hallucination probability |
| Limit each agent's scope | Narrower expertise means fewer opportunities to hallucinate |
| Include context from previous tasks | Agents stay consistent with what earlier tasks established |
| Test with adversarial inputs | Verify your hallucination detection works |
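Logging is the practice in this list that most directly benefits from a concrete shape. The sketch below appends each agent output to a JSON Lines file and counts outcomes per agent, so that an agent whose outputs are often flagged stands out. The function names, record fields, and file layout are assumptions for illustration, not part of CrewAI.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def log_agent_output(log_path: Path, agent_role: str, output: str, status: str) -> None:
    """Append one agent output as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_role": agent_role,
        "status": status,
        "output": output,
    }
    with log_path.open("a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record, ensure_ascii=False) + "\n")


def count_statuses(log_path: Path) -> Counter:
    """Count logged outputs per (agent_role, status) pair."""
    counts = Counter()
    with log_path.open(encoding="utf-8") as log_file:
        for line in log_file:
            record = json.loads(line)
            counts[(record["agent_role"], record["status"])] += 1
    return counts
```

The `status` value could come from a fallback wrapper such as `safe_execute_crew`. Keeping the raw output in the log is what allows flagged cases to be reviewed later.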
Summary
Hallucinations are a fundamental challenge in LLM-based systems. Mitigate them by providing real data tools, requiring confidence scores, cross-verifying claims, validating with schemas, and implementing graceful fallbacks. No single check is sufficient, so production systems should combine automated validation with human oversight.
Key takeaways:
- Hallucinations = plausible-sounding false information generated by LLMs
- Causes: ambiguous instructions, no data access, overconfidence, context limits, model bias
- Detection strategies: confidence scoring, tool-based fact-checking, cross-verification
- Validation: Pydantic schemas catch structural errors in agent outputs
- Fallbacks: degrade gracefully when confidence is low or execution fails
- Always provide real data tools — never let agents guess facts
- Log all outputs to detect hallucination patterns over time
- Combine automated checks with human review for critical decisions
- Use up-to-date models and keep prompts specific
- Limit each agent's scope to reduce hallucination opportunities
What's Next: Output Parsers
The next chapter covers output parsers — using Pydantic models to validate, structure, and parse agent outputs for reliable downstream processing.