CrewAI
Deepchecks integrates with CrewAI, providing automated tracing, evaluation, and observability for multi-agent workflows and tool-assisted pipelines.
The integration lets you upload and evaluate your CrewAI workflows: traces are captured from CrewAI runs using OpenTelemetry (OTEL) and OpenInference, and sent automatically to Deepchecks for observability, evaluation, and monitoring.
How it works
Data upload and evaluation
Capture traces from your CrewAI runs and send them to Deepchecks for evaluation.
Instrumentation
We use OTEL + OpenInference to automatically instrument CrewAI (and LiteLLM if applicable). This gives you rich traces, including LLM calls, tool invocations, and agent-level spans.
Registering with Deepchecks
Traces are uploaded through a simple register_dc_exporter call, where you provide your Deepchecks API key, application, version, and environment.
Viewing results
Once uploaded, you’ll see your traces in the Deepchecks UI, complete with spans, properties, and auto-annotations. Refer to the documentation on multi-agent use-case properties for more information.
Package installation
# Install the OpenTelemetry API and SDK
pip install opentelemetry-api opentelemetry-sdk
# Install OpenInference instrumentor for CrewAI
pip install openinference-instrumentation-crewai
# If your CrewAI setup routes LLM calls through LiteLLM (the default):
pip install openinference-instrumentation-litellm
# Install Deepchecks client
pip install deepchecks-llm-client
Instrumenting CrewAI
from openinference.instrumentation.crewai import CrewAIInstrumentor
from openinference.instrumentation.litellm import LiteLLMInstrumentor
from deepchecks_llm_client.otel import register_dc_exporter
# 1. Register the Deepchecks exporter
tracer_provider = register_dc_exporter(
host="https://app.llm.deepchecks.com/", # Deepchecks endpoint
api_key="YOUR_API_KEY", # API key from your Deepchecks workspace
app_name="app_name", # Application name in Deepchecks
version_name="version_name", # Version name for this run
env_type="EVAL", # Environment: EVAL, PROD, etc.
log_to_console=True, # Optional: also log spans to console
)
# 2. Instrument CrewAI and LiteLLM
CrewAIInstrumentor().instrument(tracer_provider=tracer_provider)
LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)
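Run this registration and instrumentation code once, at process startup and before constructing any agents or crews, so that every subsequent LLM call and tool invocation is captured in the trace.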
Example
This is a simple agentic workflow for a research assistant that retrieves accurate information on various topics by searching Wikipedia and verifying facts in academic papers. It runs as a CrewAI pipeline and includes the tracing and Deepchecks registration code:
import os

import requests
import wikipedia
from crewai import Agent, Crew, Process, Task
from crewai.tools import tool
from openinference.instrumentation.crewai import CrewAIInstrumentor
from openinference.instrumentation.litellm import LiteLLMInstrumentor
from deepchecks_llm_client.otel import register_dc_exporter

os.environ["OPENAI_API_KEY"] = "Your LLM API key"
SEMANTIC_SCHOLAR_API_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
# 1. Register the Deepchecks exporter
tracer_provider = register_dc_exporter(
host="https://app.llm.deepchecks.com/", # Deepchecks endpoint
api_key="DC_API_KEY", # API key from your Deepchecks workspace
app_name="DC_APP_NAME", # Application name in Deepchecks
version_name="DC_VERSION_NAME", # Version name for this run
env_type="EVAL", # Environment: EVAL, PROD, etc.
log_to_console=True, # Optional: also log spans to console
)
# 2. Instrument CrewAI and LiteLLM
CrewAIInstrumentor().instrument(tracer_provider=tracer_provider)
LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)
@tool("academic_papers_lookup")
def search_academic_papers(query: str) -> str:
"""Search for academic papers related to the given query and return the titles and abstracts from Semantic Scholar."""
params = {
"query": query,
"fields": "title,abstract",
"limit": 3 # Get top 3 papers
}
try:
response = requests.get(SEMANTIC_SCHOLAR_API_URL, params=params)
response.raise_for_status()
data = response.json()
if "data" not in data or not data["data"]:
return "No relevant academic papers found."
results = []
for paper in data["data"]:
title = paper.get("title", "No title available")
abstract = paper.get("abstract", "No abstract available")
results.append(f"Title: {title}\nAbstract: {abstract}")
return "\n\n".join(results)
except requests.exceptions.RequestException as e:
return f"Error fetching papers: {e}"
@tool("wikipedia_lookup")
def wikipedia_lookup(query: str) -> str:
"""Search Wikipedia for the given query and return the summary."""
try:
summary = wikipedia.summary(query, sentences=2)
return summary
except wikipedia.exceptions.DisambiguationError as e:
return f"Disambiguation error: {e.options}"
except wikipedia.exceptions.PageError:
return "Page not found."
except Exception as e:
return f"An error occurred: {e}"
# Define the agent
agent = Agent(
role='Research Assistant',
goal='Provide accurate information on various topics by searching Wikipedia and verifying in academic papers.',
backstory='Proficient in retrieving and summarizing information from Wikipedia and Academic resources.',
tools=[wikipedia_lookup, search_academic_papers],
llm='gpt-4.1-mini',
verbose=True # Enable detailed logging
)
# Define a task for the agent
task1 = Task(
description='How old was George Washington when he died, and when did he become a general?',
expected_output="George Washington's age at death and when he became a general.",
agent=agent
)
# Create a crew with the agent and the task
crew = Crew(
agents=[agent],
tasks=[task1],
verbose=True, # Enable detailed logging for the crew
process=Process.sequential, # Use sequential process for simplicity
)
if __name__ == "__main__":
# Run your CrewAI crew
result = crew.kickoff()
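    # Optional: flush before exit so any batched spans are exported.
    # (Assumes register_dc_exporter returns a standard OTEL SDK
    # TracerProvider, which exposes force_flush().)
    tracer_provider.force_flush()
    print(result)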
The following shows how a single execution of this example appears in the Deepchecks platform, including the full workflow from input to output, the captured spans, and the registered evaluation metrics for detailed analysis: