Chapter 2: Arming AI with Tools and Building an Invincible Crawler Agent

In the previous chapter, we successfully made two Agents (Researcher and Writer) converse and collaboratively write an article. But if you carefully examine their conversation logs, you'll discover a fatal flaw: They can only "rely on their imagination"! If I ask them, "What was TSMC's closing price today?", they might fabricate a number or tell you "As an AI model, I don't have today's data."

In the world of Multi-Agent systems, if a brain has no hands or feet, it's merely a chatty toy. This chapter will teach you how to equip Agents with Tools, granting them abilities like Google search, web page reading, and even operating external APIs! This is where truly disruptive commercial value begins.

🎯 Chapter Objectives

Understand CrewAI's core Tool mechanism and operational principles.
Learn how to enable AI to use the built-in search tool (Search Tool).
Use Vibe Coding to have AI help us write a "Custom Crawler Tool" to extract financial report content from specific web pages.
Observe how Agents autonomously decide "when to use which tool."

🛠️ Step 1: Issuing Built-in Weapons (Google Search)

CrewAI's official package is extremely considerate, providing a crewai_tools module packed with numerous plug-and-play tools. For example, the SerperDevTool allows Agents to perform Google searches.

🔥【Vibe Prompt Practical Incantation】
I'm using CrewAI. Please teach me how to add "SerperDevTool" to my Researcher Agent to give it Google search capabilities.
1. Tell me which website to register on and which environment variable to set (e.g., SERPER_API_KEY).
2. Provide complete Agent definition code demonstrating how to instantiate this tool and include it in the tools=[] list.
3. Add detailed comments.

The AI will guide you to register on serper.dev (a site specializing in providing Google search APIs for AI) for a free account to get an API Key, then modify the code like this:

import os
from crewai import Agent
from crewai_tools import SerperDevTool

# 1. Insert the search engine key (usually written in .env, shown here for demonstration)
os.environ["SERPER_API_KEY"] = "your_Serper_key"

# 2. Create an instance of the search weapon
search_tool = SerperDevTool()

# 3. Create the employee and issue the weapon
researcher = Agent(
    role='Senior Technology Market Researcher',
    goal='Identify the latest trends',
    backstory='You excel at filtering the most valuable business intelligence from Google search results.',
    verbose=True,
    tools=[search_tool] # Key! Add the weapon to the list
)
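
Hard-coding the key is fine for a demo, but in a real project a missing key usually surfaces later as a confusing search failure. The following is a minimal sketch of a fail-fast check, assuming the key has already been exported in the shell or loaded from a .env file; `require_env` is a hypothetical helper name, not part of CrewAI.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Export it in your shell or load it from a .env file before creating tools."
        )
    return value


try:
    require_env("SERPER_API_KEY")
    print("SERPER_API_KEY found, safe to create SerperDevTool()")
except RuntimeError as err:
    print(err)
```

Calling this once at startup, before the Agent is built, keeps configuration errors separate from genuine tool failures.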

Now, when this Researcher encounters an unknown question, it will print an impressive thought process in the terminal:

Thought: I need to know Tesla's stock price today, but I don't have that information. I should use the search tool.
Action: Search the internet
Action Input: "2026 Tesla latest stock price news"

It will then use the search results to continue its reasoning! This is an Agent with true "action capability"!
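
To make the Thought / Action / Action Input pattern less magical, here is a toy sketch of what a framework has to do with such text: find the requested action, look up the matching tool, and feed the result back as an observation. This is an illustration under simplifying assumptions, not CrewAI's actual implementation; `fake_search`, `TOOLS`, and `run_one_step` are hypothetical names.

```python
import re
from typing import Callable, Dict


def fake_search(query: str) -> str:
    """Hypothetical stand-in for a real search tool; returns canned text."""
    return f"(pretend search results for: {query})"


# Registry mapping the tool name the model writes to the function to call
TOOLS: Dict[str, Callable[[str], str]] = {"Search the internet": fake_search}

ACTION_RE = re.compile(
    r'Action:\s*(?P<name>.+?)\s*\nAction Input:\s*"?(?P<arg>.*?)"?\s*$',
    re.MULTILINE,
)


def run_one_step(llm_output: str) -> str:
    """Find the Action the model asked for, run that tool, return an Observation."""
    match = ACTION_RE.search(llm_output)
    if match is None:
        return "Observation: no tool requested"
    name, arg = match.group("name"), match.group("arg")
    tool = TOOLS.get(name)
    if tool is None:
        return f"Observation: unknown tool '{name}'"
    return f"Observation: {tool(arg)}"


log = (
    "Thought: I should use the search tool.\n"
    "Action: Search the internet\n"
    'Action Input: "2026 Tesla latest stock price news"'
)
print(run_one_step(log))
```

The observation string is appended to the conversation, and the model then decides whether to call another tool or give its final answer.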

🕷️ Step 2: Crafting Your Custom Weapon (Custom Tool)

While built-in tools are convenient, they're insufficient for real-world business scenarios. Suppose your client is in finance and requires the Agent to crawl "specific financial report URLs" for the latest shareholder meeting records and summarize the key points. Here, we must become blacksmiths and forge a Custom Tool for the Agent.

This would normally be extremely challenging for beginners, as it involves Python web scraping and class encapsulation. But in the Vibe Coding era, it takes just one sentence!

🔥【Vibe Prompt Practical Incantation】
Please help me write a CrewAI Custom Tool in Python.
1. Tool name: StockReportScraperTool.
2. Core functionality: Accept a url (string), use requests and BeautifulSoup to scrape all <p> tag text from the webpage, and return it as one long string.
3. Use CrewAI's recommended @tool decorator syntax (function style).
4. 【Critical】Include an extremely detailed docstring (triple-quoted) explaining when the Agent should use this tool and what the url format should look like.

The AI will write this powerful custom crawler tool:

from crewai.tools import tool
import requests
from bs4 import BeautifulSoup

@tool("StockReportScraperTool")
def scrape_report(url: str) -> str:
    """
    A tool for scraping content from specific financial report or news URLs.
    When a task requires you to "read content from a specific URL," always call this tool.
    Provide a valid URL string (e.g., https://example.com/report),
    and it will automatically scrape the page's content and return it as a string.
    """
    try:
        # Disguise as a browser to avoid simple blocking
        headers = {'User-Agent': 'Mozilla/5.0'}
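        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()  # Treat 4xx/5xx responses as failures instead of scraping an error page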
        
        # Parse HTML with BS4
        soup = BeautifulSoup(response.text, 'html.parser')
        paragraphs = soup.find_all('p')
        
        # Concatenate all paragraphs for return
        return "\n".join([p.text for p in paragraphs])
    except Exception as e:
        return f"Scraping failed, error: {str(e)}"

Then, simply add the scrape_report function to the Agent's tools=[], and your Agent instantly gains web scraping skills!
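
Before handing a scraper to an Agent, it helps to check the "keep only paragraph text" logic offline, without touching the network. The sketch below does that with Python's standard-library `html.parser` instead of BeautifulSoup, purely so it runs with no extra packages; `ParagraphExtractor` and `extract_paragraphs` are hypothetical names, and the sample HTML is made up.

```python
from html.parser import HTMLParser
from typing import List


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements and ignore everything else."""

    def __init__(self) -> None:
        super().__init__()
        self._in_p = False
        self._buffer: List[str] = []
        self.paragraphs: List[str] = []

    def flush(self) -> None:
        """Store the text gathered so far as one paragraph."""
        text = "".join(self._buffer).strip()
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush()  # an unclosed <p> ends when the next one starts
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def extract_paragraphs(html: str) -> str:
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()   # forces any trailing text to be delivered
    parser.flush()   # keep a final paragraph that was never closed
    return "\n".join(parser.paragraphs)


sample = (
    "<html><body><h1>Q3 Report</h1><p>Revenue grew.</p>"
    "<div>menu</div><p>Margin was <b>stable</b>.</p></body></html>"
)
print(extract_paragraphs(sample))
```

Keeping the parsing step separate from the HTTP request means the part most likely to break on a new page layout can be tested with saved HTML files.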

💼 [Business Application] Why Docstrings Are Life-or-Death Decisions

In the Custom Tool code above, the most critical part isn't the requests scraper but the docstring enclosed in triple quotes.

This is a common pitfall for beginners. When you issue multiple tools to an Agent (e.g., search tool, scraper tool, email tool), how does the Agent know which one to use? The Agent actually "reads" the Docstrings you write for each tool!

Without clear instructions, the Agent becomes confused, potentially using an email tool to attempt data searches, leading to task failure. Thus, when pitching to clients, you can say: "Our AI employees' core technology lies in 'optimized tool decision logic,' ensuring the AI won't misuse tools and crash the system." The technical foundation behind this claim is precise Docstring writing.
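
To see why wording matters, it helps to look at what the model actually receives. The sketch below builds a plain-text tool list from function names and docstrings, which is roughly the kind of text a framework places in the prompt. The exact prompt format CrewAI uses is not shown in this chapter, so treat the layout here as an assumption; `describe_tools` and the two sample tools are hypothetical.

```python
import inspect
from typing import Callable, List


def send_email(to: str) -> str:
    """Send an email."""
    return "sent"


def scrape_report(url: str) -> str:
    """
    Read the text of a specific financial report or news URL.
    Use this whenever the task gives you an exact URL to read.
    Input: a full URL string such as https://example.com/report
    """
    return "page text"


def describe_tools(tools: List[Callable]) -> str:
    """Build the tool list a model would read when choosing which tool to call."""
    lines = []
    for fn in tools:
        # inspect.getdoc strips the docstring's indentation; None means no docstring
        doc = inspect.getdoc(fn) or "(no description)"
        lines.append(f"Tool name: {fn.__name__}\nTool description: {doc}")
    return "\n\n".join(lines)


print(describe_tools([send_email, scrape_report]))
```

Reading the printed list from the model's point of view is a quick review technique: if you could not pick the right tool from these descriptions alone, the Agent probably cannot either.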

✅ Chapter Summary

This is the most fascinating aspect of creating "digital employees": you're not just coding—you're designing how brains interact with external world tools. With tools, your AI gains hands and feet. In the next chapter, we'll explore how they communicate and delegate tasks within teams and uncover the secrets of the Memory system!

CrewAI Tool Categories

| Category | Description | Example |
|----------|-------------|---------|
| Built-in tools | Tools included with CrewAI | FileReadTool, DirectoryReadTool |
| LangChain tools | Wrappers around LangChain tools | CalculatorTool, WikipediaTool |
| Custom @tool | Decorator-based custom tools | Any Python function with @tool |
| External API tools | Tools that call external services | Weather API, Maps API, Search API |
| Database tools | Tools for database access | Supabase query tool, PostgreSQL tool |
| File processing tools | Read, write, transform files | CSV parser, Markdown generator |

Creating Custom Tools

Method 1: @tool Decorator

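import json

import requests
from crewai.tools import tool

@tool("Search Campsites")
def search_campsites(query: str) -> str:
    """Search for campsites matching the query. Returns JSON results."""
    # Pass the query via params so requests URL-encodes spaces and special characters
    response = requests.get(
        "https://api.myapp.com/campsites/search",
        params={"q": query, "limit": 5},
        timeout=10,  # Avoid hanging the agent on a slow API
    )
    if response.status_code == 200:
        return json.dumps(response.json())
    return f"Search failed with status {response.status_code}"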
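
Agents pass free-text queries, so anything placed in a URL needs encoding; a query containing `&` or spaces would otherwise corrupt the request. A minimal sketch using only the standard library, with the same placeholder host as the campsite example (`build_search_url` is a hypothetical helper):

```python
from urllib.parse import urlencode

BASE_URL = "https://api.myapp.com/campsites/search"  # placeholder host, not a real service


def build_search_url(query: str, limit: int = 5) -> str:
    """Return the search URL with the query safely percent-encoded."""
    return f"{BASE_URL}?{urlencode({'q': query, 'limit': limit})}"


print(build_search_url("river view & hot spring"))
# https://api.myapp.com/campsites/search?q=river+view+%26+hot+spring&limit=5
```

Passing a `params` dictionary to `requests.get` performs the same encoding automatically.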

Method 2: Class-Based Tool

import requests
from crewai.tools import BaseTool

class WeatherTool(BaseTool):
    name: str = "Get Weather"
    description: str = "Get current weather and forecast for a location."

    def _run(self, latitude: float, longitude: float) -> str:
        response = requests.get(
            f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m&hourly=temperature_2m&timezone=auto",
            timeout=10,  # Avoid hanging the agent on a slow API
        )
        data = response.json()
        current = data['current']
        return (
            f"Current temperature: {current['temperature_2m']}°C\n"
            f"Hourly forecast available for next 7 days"
        )

    async def _arun(self, latitude: float, longitude: float) -> str:
        """Async version for concurrent execution."""
        import aiohttp
        async with aiohttp.ClientSession() as session:
            async with session.get(
                f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m&hourly=temperature_2m&timezone=auto"
            ) as response:
                data = await response.json()
                current = data['current']
                return f"Current temperature: {current['temperature_2m']}°C"
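
The benefit of an async version shows up when several lookups run at once. The following sketch demonstrates the idea with `asyncio.gather` and a fake lookup that only sleeps, so it runs without network access; `fake_weather_lookup` and `lookup_all` are hypothetical, and how a given CrewAI version schedules async tools is not covered here.

```python
import asyncio
import time
from typing import List


async def fake_weather_lookup(city: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for an async HTTP call; it only sleeps."""
    await asyncio.sleep(delay)
    return f"{city}: ok"


async def lookup_all(cities: List[str]) -> List[str]:
    # gather() starts every coroutine and waits for all of them together
    return await asyncio.gather(*(fake_weather_lookup(c) for c in cities))


start = time.perf_counter()
results = asyncio.run(lookup_all(["Taipei", "Taichung", "Hualien"]))
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Three sequential lookups would take about 0.6 seconds here; run together they finish in roughly the time of one.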

Method 3: Tool from LangChain

from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

wikipedia = WikipediaQueryRun(
    api_wrapper=WikipediaAPIWrapper()
)

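# Use in CrewAI by wrapping the LangChain tool in a @tool function
from crewai.tools import tool

@tool("Wikipedia Search")
def wikipedia_search(query: str) -> str:
    """Look up a topic on Wikipedia and return a short summary."""
    return wikipedia.run(query)

crewai_tool = wikipedia_search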

Validation and Error Handling

import json

import requests
from crewai.tools import tool

@tool("Get Campsite Details")
def get_campsite_details(campsite_id: str) -> str:
    """Get detailed information about a campsite by ID."""
    if not campsite_id or len(campsite_id) < 5:
        return json.dumps({"error": "Invalid campsite ID"})

    try:
        response = requests.get(
            f"https://api.myapp.com/campsites/{campsite_id}",
            timeout=10
        )
        response.raise_for_status()
        data = response.json()
        return json.dumps(data, indent=2)
    except requests.Timeout:
        return json.dumps({"error": "Request timed out. Please try again."})
    except requests.HTTPError as e:
        return json.dumps({"error": f"API error: {e.response.status_code}"})
    except Exception as e:
        return json.dumps({"error": f"Unexpected error: {str(e)}"})
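
When many tools share the same "never raise, always return an error JSON" rule, the try/except can be factored into a small decorator. This is a sketch of one possible design, not a CrewAI feature; `json_errors` and `parse_campsite_id` are hypothetical names. Stacking it underneath `@tool` is an assumption that would need checking against your CrewAI version.

```python
import functools
import json
from typing import Callable


def json_errors(fn: Callable[..., str]) -> Callable[..., str]:
    """Wrap a tool function so any exception comes back as a JSON error string."""

    @functools.wraps(fn)  # keeps the original name and docstring
    def wrapper(*args, **kwargs) -> str:
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            return json.dumps({"error": f"{type(exc).__name__}: {exc}"})

    return wrapper


@json_errors
def parse_campsite_id(raw: str) -> str:
    """Hypothetical helper: normalise an ID such as ' camp-00123 '."""
    cleaned = raw.strip()
    if len(cleaned) < 5:
        raise ValueError("Invalid campsite ID")
    return json.dumps({"campsite_id": cleaned})


print(parse_campsite_id(" camp-00123 "))
print(parse_campsite_id("x"))
```

`functools.wraps` matters here because the docstring is what the Agent reads; without it the wrapped function would lose its description.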

Using Tools in Agents

from crewai import Agent

camping_expert = Agent(
    role='Taiwan Camping Specialist',
    goal='Find and recommend the best campsites.',
    backstory='You are a camping expert with deep knowledge of Taiwan campgrounds.',
    tools=[
        search_campsites,
        WeatherTool(),
        get_campsite_details
    ],
    verbose=True
)

Tool Best Practices

| Practice | Reason |
|----------|--------|
| Give tools clear, descriptive names | The agent understands what the tool does |
| Write good docstrings | The agent reads the docstring to know when to use the tool |
| Handle errors gracefully | Return error JSON, don't raise exceptions |
| Set reasonable timeouts | Prevent agents from hanging on slow APIs |
| Validate inputs | Check IDs, formats, ranges before making API calls |
| Return structured data | JSON makes it easy for the agent to read |
| Cache responses when appropriate | Reduce API calls and speed up agent execution |
| Provide async versions (_arun) | Enable parallel tool execution |
| Keep tools focused | One tool = one responsibility |
| Test tools independently | Verify each tool works before adding it to an agent |
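
As an illustration of the caching practice, the standard library's `functools.lru_cache` is often enough for lookups whose answers do not change during a run. The sketch below counts how often the underlying function really executes; `get_campsite_name` is a hypothetical lookup standing in for an API call.

```python
from functools import lru_cache

call_count = 0  # counts how often the "slow" lookup really runs


@lru_cache(maxsize=128)
def get_campsite_name(campsite_id: str) -> str:
    """Hypothetical slow lookup; a real tool would call an API here."""
    global call_count
    call_count += 1
    return f"Campsite {campsite_id}"


get_campsite_name("camp-00123")
get_campsite_name("camp-00123")  # served from the cache
get_campsite_name("camp-00456")
print(call_count)  # 2
```

Note that `lru_cache` never expires entries on its own, so it suits stable data such as campsite details better than fast-changing data such as stock prices or weather.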

Summary

Custom tools extend what your CrewAI agents can do. Use @tool for simple functions, BaseTool for complex tools with sync and async support, and adapt LangChain tools for existing integrations. Always handle errors gracefully and return structured data.

Key takeaways:

@tool decorator: simple, fast, works for most cases
BaseTool class: full control, supports async (_arun)
Good docstrings help agents understand when to use tools
Always handle API errors, timeouts, and invalid inputs
Return structured JSON for easy agent consumption
Test each tool in isolation before adding to an agent
Keep tools focused on one task each
Provide both sync and async versions for flexibility

What's Next: Memory and Delegation

The next chapter covers memory and delegation — enabling agents to remember context across conversations and delegate tasks to other agents.