Agents Demo: Investment Agent Data
Evaluating and debugging an Agent application, step by step
Jump right in by Creating an Application and Uploading the Data
Use Case Background
This demo shows how to evaluate and monitor a multi-tool financial AI agent using Deepchecks. Our example features a CrewAI-powered investment advisor that handles complex, multi-step financial workflows with real financial tools.
Agent Capabilities:
- Market Data: Stock prices, financial fundamentals, technical analysis, and historical data
- Research: Company news, information, and analyst recommendations
- Portfolio Management: Trading execution and portfolio information retrieval
- Currency Services: Real-time currency conversion
Interaction Types
The agent processes user queries through two distinct interaction types:
- Tool Use - Evaluates how the agent plans and executes tool calls to gather information
- Generation - Evaluates the quality of the final response the agent returns to the user
Interaction Input and Output Example
Tool Use:

- Interaction Input - User's query
- Interaction Output - Agent's thought process
- Action - Agent's tool call
- Tool Response - The response from calling the tool
Generation:

- Interaction Input - User's query
- Interaction Output - Agent's final response
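The field layout above can be sketched as a small Python record. This is only an illustration under stated assumptions: the `Interaction` class, its field names, the `validate` helper, and the sample query and tool call are hypothetical and are not part of the Deepchecks SDK or the demo dataset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interaction:
    """One logged agent step (hypothetical schema, not the Deepchecks SDK)."""
    interaction_type: str                # "Tool Use" or "Generation"
    input: str                           # the user's query
    output: str                          # thought process (Tool Use) or final response (Generation)
    action: Optional[str] = None         # the agent's tool call, Tool Use only
    tool_response: Optional[str] = None  # the response from the tool, Tool Use only

    def validate(self) -> None:
        """Check that the populated fields match the interaction type."""
        if self.interaction_type == "Tool Use":
            if self.action is None or self.tool_response is None:
                raise ValueError("Tool Use needs both action and tool_response")
        elif self.interaction_type == "Generation":
            if self.action is not None or self.tool_response is not None:
                raise ValueError("Generation carries only input and output")
        else:
            raise ValueError(f"unknown interaction type: {self.interaction_type}")


# Illustrative values only; the query, tool name, and placeholders are invented.
query = "What is the current price of AAPL?"

tool_use = Interaction(
    interaction_type="Tool Use",
    input=query,
    output="I should look up the latest quote before answering.",
    action='get_stock_price(ticker="AAPL")',
    tool_response='{"ticker": "AAPL", "price": "<placeholder>"}',
)

generation = Interaction(
    interaction_type="Generation",
    input=query,
    output="AAPL is currently trading at <placeholder>.",
)

for step in (tool_use, generation):
    step.validate()
```

Both interaction types share the same input (the user's query), so a single record shape with two optional fields is enough to cover them in this sketch.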
Demo Structure
This tutorial walks you through a complete agent evaluation workflow:
- Upload Your Data - Learn about different agent configurations and data formats
- Analyze Performance - Understand sessions, tool use, and generation metrics
- Root Cause Analysis - Add custom properties and refine evaluation criteria
- Compare Versions - Optimize agent architecture and model selection
- Monitor Production - Track real-world performance over time
Let's begin by setting up your data and understanding the different agent configurations we'll be evaluating.