Supported Use Cases
Deepchecks supports a wide range of LLM applications - from RAG and Q&A to multi-agent workflows. Each use case maps to one or more interaction types that configure properties, annotations, and evaluation automatically.
Deepchecks works with any LLM-powered application. Each application is composed of one or more sub-tasks, and each sub-task maps to an Interaction Type - a named configuration that determines which built-in properties are enabled, how auto-annotation rules are set up, and how interactions are displayed in the UI.
You can use one interaction type per application (for simple pipelines) or combine several (for complex agent workflows where retrieval, LLM calls, and tool use each need independent evaluation).
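As a rough illustration of the single-type versus combined-type choice, the sketch below writes each application as a mapping from sub-task to interaction type. It is plain Python data, not the Deepchecks SDK: the `check_plan` helper and the sub-task names are hypothetical, and only the interaction type names are taken from this page.

```python
# Illustrative sketch only: plain Python, not the Deepchecks SDK.
# Interaction type names are taken from this page; everything else is hypothetical.

CLASSIC_TYPES = {
    "Q&A", "Summarization", "Generation", "Classification",
    "Feature Extraction", "Chat", "Retrieval", "Other", "Custom",
}
AGENTIC_TYPES = {"Root", "Agent", "Chain", "Tool", "LLM", "Retrieval"}
KNOWN_TYPES = CLASSIC_TYPES | AGENTIC_TYPES


def check_plan(plan: dict) -> dict:
    """Return the plan unchanged if every sub-task uses a known interaction type."""
    unknown = {task: t for task, t in plan.items() if t not in KNOWN_TYPES}
    if unknown:
        raise ValueError(f"Unknown interaction types: {unknown}")
    return plan


# A simple pipeline: one sub-task, one interaction type.
simple_app = check_plan({"answer_question": "Q&A"})

# A more complex workflow: each sub-task gets its own type,
# so retrieval, LLM calls, and tool use can be evaluated independently.
support_agent = check_plan({
    "fetch_documents": "Retrieval",
    "draft_reply": "LLM",
    "call_crm_api": "Tool",
})
```

Writing the mapping down before uploading data makes it easier to see which sub-tasks will be evaluated separately.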
Not sure which type to choose? Start with the closest match - you can always change it later. The Deepchecks team is also happy to advise.
Classic Interaction Types
These types cover single-step or simple multi-step LLM tasks.
| Interaction Type | Use case | Key properties |
|---|---|---|
| Q&A | Question answering and RAG pipelines | Grounded in Context, Retrieval Relevance, Completeness, Avoided Answer |
| Summarization | Condensing documents or transcripts | Coverage, Grounded in Context, Conciseness |
| Generation | Creative or structured content generation | Instruction Fulfillment, Fluency, Toxicity |
| Classification | Multi-class or multi-label LLM classification | Sentiment, Fluency, expected output comparison |
| Feature Extraction | Extracting structured data (e.g., JSON) from free text | Extraction Groundedness, Extraction Coverage, Structural Validity |
| Chat | Multi-turn conversational assistants | Instruction Fulfillment, Intent Fulfillment, User Satisfaction |
| Retrieval | Evaluating the retrieval step of a RAG pipeline independently | Retrieval Coverage, nDCG, Retrieval Precision |
| Other | General-purpose fallback for anything not covered above | General-purpose built-in properties |
| Custom | User-defined type built from scratch | You configure everything |
Agentic Interaction Types
These types are used when evaluating multi-agent and agentic workflows. Each span in a trace is captured as a separate interaction, allowing Deepchecks to evaluate every component independently.
| Interaction Type | Use case | Key properties |
|---|---|---|
| Root | The top-level span of a trace - the full end-to-end execution | Aggregated system metrics, overall quality |
| Agent | An AI agent that plans and delegates to tools or sub-agents | Plan Efficiency, Tool Coverage, Instruction Following, Tool Abuse |
| Chain | A multi-step chain that is not the root span | Aggregated child metrics |
| Tool | A tool called by an agent (web search, code execution, API call) | Tool Completeness |
| LLM | A direct LLM call within an agentic workflow | Reasoning Integrity, Instruction Following |
| Retrieval | A retrieval step within an agentic trace | Retrieval Coverage, Retrieval Precision |
When using a supported framework (LangGraph, CrewAI, Google ADK, LangChain), each span is automatically captured and assigned the correct interaction type. You can also upload traces manually via the SDK.
See Uploading Agentic Data and Auto-Instrumentation for setup details.
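To make the span-to-type mapping concrete, here is a minimal sketch of the rule implied by the table above: the span with no parent becomes Root, and every other span takes the type matching its kind. The `Span` class, the `kind` labels, and the fallback to Other are assumptions for illustration. They do not describe how the supported frameworks or the Deepchecks SDK assign types internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical, simplified span; real trace formats carry more metadata."""
    span_id: str
    kind: str                      # assumed labels: agent, chain, tool, llm, retrieval
    parent_id: Optional[str] = None


# Assumed mapping from span kind to the agentic interaction types listed above.
KIND_TO_TYPE = {
    "agent": "Agent",
    "chain": "Chain",
    "tool": "Tool",
    "llm": "LLM",
    "retrieval": "Retrieval",
}


def interaction_type(span: Span) -> str:
    """Top-level span -> Root; otherwise map by kind (assumed fallback: Other)."""
    if span.parent_id is None:
        return "Root"
    return KIND_TO_TYPE.get(span.kind, "Other")


trace = [
    Span("s1", "chain"),                    # top-level span, so it becomes Root
    Span("s2", "agent", parent_id="s1"),
    Span("s3", "retrieval", parent_id="s2"),
    Span("s4", "llm", parent_id="s2"),
    Span("s5", "tool", parent_id="s2"),
]

# One interaction per span, each with its own type.
assigned = {span.span_id: interaction_type(span) for span in trace}
```

Note that the top-level span is typed Root even though its kind is `chain`, which matches the table's description of Chain as a chain that is not the root span.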
Data Fields
Each interaction type uses a different combination of content fields. The core fields (input, output) are needed for most property calculations. Additional fields unlock more evaluation capabilities:
| Field | Requirement | What it contains |
|---|---|---|
| input | Required | The user's question, request, or source text |
| output | Required | The model's response or generated content |
| full_prompt | Recommended | The full prompt sent to the LLM, including system instructions. Required for Instruction Following and Instruction Fulfillment properties. |
| information_retrieval | Optional | Retrieved documents used as context (for RAG) |
| history | Optional | Conversation history or prior context |
| expected_output | Optional | Ground truth label or reference output |
| steps | Optional | Intermediate steps, tool calls, or span metadata (for agentic traces) |
See Data Fields Reference for the full specification.
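The field table can be sketched as a small record type with a pre-upload check. This is a hypothetical illustration rather than the SDK's data model: the field names come from the table, but the Python types, the `InteractionRecord` class, and the `field_warnings` helper are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InteractionRecord:
    """Hypothetical record; field names follow the table, types are assumed."""
    input: str                                   # required
    output: str                                  # required
    full_prompt: Optional[str] = None            # recommended
    information_retrieval: List[str] = field(default_factory=list)  # optional
    history: List[str] = field(default_factory=list)                # optional
    expected_output: Optional[str] = None        # optional
    steps: List[dict] = field(default_factory=list)                 # optional


def field_warnings(record: InteractionRecord) -> List[str]:
    """List missing required fields and the recommended full_prompt, if absent."""
    warnings = []
    if not record.input.strip():
        warnings.append("input is required")
    if not record.output.strip():
        warnings.append("output is required")
    if not record.full_prompt:
        warnings.append(
            "full_prompt is missing: Instruction Following and "
            "Instruction Fulfillment need it"
        )
    return warnings


# Example RAG-style record with retrieved context but no full prompt.
rag_record = InteractionRecord(
    input="What is the refund window?",
    output="Refunds are accepted within 30 days.",
    information_retrieval=["Policy: refunds are accepted within 30 days of purchase."],
)
print(field_warnings(rag_record))
```

Running a check like this before upload shows which properties a record cannot support, here the two that depend on `full_prompt`.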