Multi-Agent Demo: Content Creator Crew

Evaluating and debugging a multi-agent application, step by step.

The Use Case

This demo evaluates a CrewAI-powered multi-agent content creation workflow that produces a blog post from a given topic, target audience, and optional context. The workflow is traced and evaluated with Deepchecks to surface performance issues and their root causes.

The Workflow

The system consists of three sequential agents that collaborate to create blog posts. Each agent has specific tools and capabilities to perform its role.
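
For concreteness, here is a minimal sketch of how such a crew might be wired up with CrewAI. The roles, goals, and tool choices below are illustrative assumptions, not the demo's exact configuration.

```python
# A three-agent sequential crew for blog-post creation. Roles, goals, and
# tools here are illustrative, not the demo's exact configuration.
from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool  # web search tool; needs SERPER_API_KEY set

search = SerperDevTool()

researcher = Agent(
    role="Research Specialist",
    goal="Gather accurate, sourced facts about {topic} for {audience}",
    backstory="An analyst who grounds every claim in a citable source.",
    tools=[search],
)
writer = Agent(
    role="Content Writer",
    goal="Draft an engaging blog post about {topic} for {audience}",
    backstory="A writer who turns research notes into clear prose.",
)
editor = Agent(
    role="Editor",
    goal="Polish drafts for tone, structure, and accuracy",
    backstory="A meticulous editor with a strict style guide.",
)

research = Task(
    description="Research {topic}. Optional extra context: {context}",
    expected_output="A bullet list of key facts with sources.",
    agent=researcher,
)
draft = Task(
    description="Write a blog post about {topic} aimed at {audience}.",
    expected_output="A complete blog post draft in markdown.",
    agent=writer,
)
edit = Task(
    description="Edit the draft for clarity, flow, and correctness.",
    expected_output="The final, publication-ready blog post.",
    agent=editor,
)

# Sequential process: each task's output is passed along to the next agent.
crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research, draft, edit],
    process=Process.sequential,
)
```

Calling crew.kickoff(inputs={"topic": ..., "audience": ..., "context": ...}) interpolates the placeholders and runs the agents in order, so the same crew can be driven over many test cases, which is exactly what the evaluation steps below do.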

The Problem

The agentic workflow produces generic content despite having enhancement capabilities. We use Deepchecks to trace the multi-agent workflow and identify where and why it underperforms, turning a black-box system into an observable, debuggable process.

Evaluation Workflow

This demo follows a systematic evaluation process:

  1. Log evaluation data
    Create the crew configuration and run test cases through the baseline workflow.
    Log the resulting traces to Deepchecks (see the first sketch after this list).

  2. Analyze the Base Version
    Examine the baseline version to identify performance issues through automated properties and auto-annotation.

  3. Root Cause Analysis
    Use the Score Breakdown, Property Failure Analysis, and Insights to pinpoint the underlying issue and guide the fix.

  4. Compare Versions
    Test the new configuration and confirm the issue is resolved. Try multiple models to balance result quality against cost (see the second sketch after this list).
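
A hedged sketch of step 1, continuing from the crew defined above: run a handful of test cases through the baseline workflow and log each interaction. The log_interaction helper below is a hypothetical stand-in for the Deepchecks SDK's logging call; the real client, function name, and signature should be taken from the SDK documentation.

```python
import json

def log_interaction(input: str, output: str, version_name: str) -> None:
    """Hypothetical stand-in for the Deepchecks SDK logging call. The name
    and signature are assumptions for illustration -- see the SDK docs."""
    print(json.dumps({"version": version_name, "input": input, "output": output}))

# Illustrative test cases; the demo's actual evaluation set may differ.
test_cases = [
    {"topic": "vector databases", "audience": "backend engineers", "context": ""},
    {"topic": "prompt caching", "audience": "ML practitioners", "context": ""},
]

# Baseline run: every trace is logged under the "baseline" version so it can
# be analyzed and later compared against improved configurations.
for case in test_cases:
    result = crew.kickoff(inputs=case)
    log_interaction(input=json.dumps(case), output=str(result), version_name="baseline")
```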
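And a sketch of step 4, building on the code above: re-run the same test cases with different models for one agent and log each run under its own version name, so the versions can be compared side by side in Deepchecks. The model names and the llm reassignment are illustrative; in practice you may prefer to rebuild the crew per configuration.

```python
from crewai import LLM

for model in ["gpt-4o", "gpt-4o-mini"]:  # illustrative candidate models
    # Swap only the writer's model between runs; rebuilding the crew per
    # configuration is an equally valid (and arguably cleaner) approach.
    writer.llm = LLM(model=model)
    for case in test_cases:
        result = crew.kickoff(inputs=case)
        log_interaction(
            input=json.dumps(case),
            output=str(result),
            version_name=f"writer-{model}",
        )
```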