
Supported Use Cases

Deepchecks supports a wide range of LLM applications - from RAG and Q&A to multi-agent workflows. Each use case maps to one or more interaction types that configure properties, annotations, and evaluation automatically.

Deepchecks works with any LLM-powered application. Each application is composed of one or more sub-tasks, and each sub-task maps to an Interaction Type - a named configuration that determines which built-in properties are enabled, how auto-annotation rules are set up, and how interactions are displayed in the UI.

You can use one interaction type per application (for simple pipelines) or combine several (for complex agent workflows where retrieval, LLM calls, and tool use each need independent evaluation).
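As a rough planning aid, the choice between one type and several can be written down as a mapping from sub-task to interaction type. The sketch below is plain Python, not the Deepchecks SDK: the sub-task names, the two dictionaries, and the helper function are hypothetical, and only the interaction type names come from this page.

```python
# Hypothetical planning sketch in plain Python (not the Deepchecks SDK).
# Keys are made-up sub-task names; values are interaction type names
# listed on this page.

# A simple pipeline: one sub-task, one interaction type.
SIMPLE_PIPELINE = {"answer_question": "Q&A"}

# A complex agent workflow: each sub-task is evaluated under its own type.
AGENT_WORKFLOW = {
    "retrieve_documents": "Retrieval",
    "call_model": "LLM",
    "run_web_search": "Tool",
}


def interaction_types_used(app: dict[str, str]) -> list[str]:
    """Return the distinct interaction types an application uses, sorted."""
    return sorted(set(app.values()))


print(interaction_types_used(SIMPLE_PIPELINE))  # ['Q&A']
print(interaction_types_used(AGENT_WORKFLOW))   # ['LLM', 'Retrieval', 'Tool']
```

Listing the sub-tasks this way makes it easier to see which steps need independent evaluation before any data is uploaded.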

Not sure which type to choose? Start with the closest match - you can always change it later. Deepchecks' team is also happy to advise.


Classic Interaction Types

These types cover single-step or simple multi-step LLM tasks.

| Interaction Type | Use case | Key properties |
| --- | --- | --- |
| Q&A | Question answering and RAG pipelines | Grounded in Context, Retrieval Relevance, Completeness, Avoided Answer |
| Summarization | Condensing documents or transcripts | Coverage, Grounded in Context, Conciseness |
| Generation | Creative or structured content generation | Instruction Fulfillment, Fluency, Toxicity |
| Classification | Multi-class or multi-label LLM classification | Sentiment, Fluency, expected output comparison |
| Feature Extraction | Extracting structured data (e.g., JSON) from free text | Extraction Groundedness, Extraction Coverage, Structural Validity |
| Chat | Multi-turn conversational assistants | Instruction Fulfillment, Intent Fulfillment, User Satisfaction |
| Retrieval | Evaluating the retrieval step of a RAG pipeline independently | Retrieval Coverage, nDCG, Retrieval Precision |
| Other | General-purpose fallback for anything not covered above | General-purpose built-in properties |
| Custom | User-defined type built from scratch | You configure everything |

Agentic Interaction Types

These types are used when evaluating multi-agent and agentic workflows. Each span in a trace is captured as a separate interaction, allowing Deepchecks to evaluate every component independently.

| Interaction Type | Use case | Key properties |
| --- | --- | --- |
| Root | The top-level span of a trace - the full end-to-end execution | Aggregated system metrics, overall quality |
| Agent | An AI agent that plans and delegates to tools or sub-agents | Plan Efficiency, Tool Coverage, Instruction Following, Tool Abuse |
| Chain | A multi-step chain that is not the root span | Aggregated child metrics |
| Tool | A tool called by an agent (web search, code execution, API call) | Tool Completeness |
| LLM | A direct LLM call within an agentic workflow | Reasoning Integrity, Instruction Following |
| Retrieval | A retrieval step within an agentic trace | Retrieval Coverage, Retrieval Precision |

When using a supported framework (LangGraph, CrewAI, Google ADK, LangChain), each span is automatically captured and assigned the correct interaction type. You can also upload traces manually via the SDK.
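To picture how one trace becomes several interactions, the sketch below models a trace as a tree of spans and flattens it into one row per span. It is a conceptual illustration only: the `Span` class, the `flatten` helper, and the span names are hypothetical and are not part of the Deepchecks SDK or of any supported framework. Only the interaction type names come from the agentic types listed on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical stand-in for one span in an agentic trace."""
    name: str
    interaction_type: str  # Root, Agent, Chain, Tool, LLM, or Retrieval
    children: list["Span"] = field(default_factory=list)


def flatten(span: Span) -> list[tuple[str, str]]:
    """Depth-first walk that yields one (name, interaction_type) pair per span."""
    rows = [(span.name, span.interaction_type)]
    for child in span.children:
        rows.extend(flatten(child))
    return rows


# A made-up trace: a root span wrapping one agent that uses three components.
trace = Span("handle_request", "Root", [
    Span("research_agent", "Agent", [
        Span("search_docs", "Retrieval"),
        Span("web_search", "Tool"),
        Span("draft_answer", "LLM"),
    ]),
])

# Each row corresponds to one separately evaluated interaction.
for name, kind in flatten(trace):
    print(f"{kind:<10} {name}")
```

In this example the single trace produces five interactions, one per span, so the retrieval step, the tool call, and the LLM call can each be scored on their own properties.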

See Uploading Agentic Data and Auto-Instrumentation for setup details.


Data fields

Each interaction type uses a different combination of content fields. The core fields (input, output) are needed for most property calculations. Additional fields unlock more evaluation capabilities:

| Field | Required | What it contains |
| --- | --- | --- |
| input | Required | The user's question, request, or source text |
| output | Required | The model's response or generated content |
| full_prompt | Recommended | The full prompt sent to the LLM, including system instructions. Required for Instruction Following and Instruction Fulfillment properties. |
| information_retrieval | Optional | Retrieved documents used as context (for RAG) |
| history | Optional | Conversation history or prior context |
| expected_output | Optional | Ground truth label or reference output |
| steps | Optional | Intermediate steps, tool calls, or span metadata (for agentic traces) |
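A quick pre-upload check along these lines can catch records that lack the core fields. The sketch below uses a plain dictionary and a hypothetical `check_fields` helper; it is not the Deepchecks SDK. The field names come from the list above, while the value types (for example, a list of strings for `information_retrieval`) and the sample text are assumptions made for illustration.

```python
# Hypothetical pre-upload check in plain Python (not the Deepchecks SDK).
# Field names come from this page; value types are assumed for illustration.

REQUIRED_FIELDS = ("input", "output")


def check_fields(record: dict) -> list[str]:
    """Return a list of problems; an empty list means no issues were found."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if not record.get(name)
    ]
    # full_prompt is only recommended, but this page notes that the
    # Instruction Following and Instruction Fulfillment properties need it.
    if not record.get("full_prompt"):
        problems.append("no full_prompt: instruction properties unavailable")
    return problems


rag_record = {
    "input": "What is the refund window?",
    "output": "Refunds are accepted within 30 days of purchase.",
    "full_prompt": "You are a support assistant. Answer using only the context.",
    "information_retrieval": ["Policy: refunds are accepted within 30 days."],
}

print(check_fields(rag_record))       # []
print(check_fields({"input": "Hi"}))
```

The second call reports both the missing `output` and the absent `full_prompt`, which mirrors the distinction above between required and recommended fields.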

See Data Fields Reference for the full specification.