0.40.0 Release Notes
by Yaron Friedman
Deepchecks LLM Evaluation 0.40.0 Release
We’re excited to announce version 0.40.0 of Deepchecks LLM Evaluation. This release strengthens self-hosted deployments, improves how agentic systems are evaluated end-to-end, and continues to simplify platform governance. Highlights include a new self-hosted deployment guide, children-based annotation aggregation for agentic workflows, and centralized model management for self-hosted environments.
Deepchecks LLM Evaluation 0.40.0 Release:
🏗️ Self-Hosted Deployment Guide
🧠 Model Management for Self-Hosted Deployments
🧩 Children Annotation Aggregation for Agentic Workflows
🔐 RBAC Safeguards for Owners
⚠️ Tool Use Interaction Deprecation
What’s New and Improved?
Self-Hosted Deployment Guide
Deepchecks now includes a dedicated deployment guide for Self-Hosted Enterprise, making it easier than ever to run Deepchecks entirely in your own infrastructure.
What this gives you:
- Deploy Anywhere - Run Deepchecks inside your own environment with full control over networking, security, and data boundaries
- Production-Ready by Design - Built for Kubernetes to support reliability, scalability, and real-world workloads
- Clear Deployment Path - A structured walkthrough of the required components and how they fit together
This guide focuses on helping teams understand what’s required and why, without forcing them to become Deepchecks experts on day one. Whether you’re deploying on AWS or adapting to another environment, the core idea is simple: Deepchecks is designed to fit cleanly into your existing infrastructure.
Model Management for Self-Hosted Deployments
Self-hosted Deepchecks deployments now include first-class model management.
Unlike SaaS deployments, self-hosted environments don’t rely on Deepchecks-managed models. Instead, you explicitly configure the models you own and operate - and Deepchecks makes them available everywhere they’re needed.
What this enables:
- Centralized Configuration - Manage all models in one place at the organization level
- Broad Provider Support - OpenAI, Azure OpenAI, AWS Bedrock, and self-hosted endpoints via LiteLLM
- Immediate Availability - Once added, models appear instantly across evaluations, applications, and preferences
Key details:
- Models are validated and connectivity-tested before being saved
- No models are configured by default - you stay fully in control
This ensures self-hosted users get the same smooth evaluation experience, without compromising ownership or security.
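The validate-then-save behavior described above can be pictured with a minimal Python sketch. This is not Deepchecks code: `ModelConfig`, `ModelRegistry`, and the injected `probe` callback are hypothetical names chosen for illustration, and the provider identifiers simply mirror the providers named in these notes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Providers named in these release notes; the identifier strings are illustrative.
SUPPORTED_PROVIDERS = {"openai", "azure_openai", "aws_bedrock", "litellm"}


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical description of one model owned by the organization."""
    name: str
    provider: str
    endpoint: str


class ModelRegistry:
    """Hypothetical organization-level registry; nothing is configured by default."""

    def __init__(self, probe: Callable[[ModelConfig], bool]) -> None:
        # The connectivity check is injected so it can be stubbed in tests.
        self._probe = probe
        self._models: Dict[str, ModelConfig] = {}

    def add(self, config: ModelConfig) -> None:
        # Step 1: static validation of the configuration.
        if config.provider not in SUPPORTED_PROVIDERS:
            raise ValueError(f"unsupported provider: {config.provider}")
        if not config.name or not config.endpoint:
            raise ValueError("name and endpoint are required")
        # Step 2: connectivity test; the model is saved only if it passes.
        if not self._probe(config):
            raise ConnectionError(f"could not reach model: {config.name}")
        self._models[config.name] = config

    def available(self) -> List[str]:
        # Every consumer reads from the same registry, so a saved model
        # is visible everywhere as soon as add() returns.
        return sorted(self._models)
```

In this sketch a model that fails either check is never stored, so anything returned by `available()` has already passed both validation and the connectivity probe.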
Children Annotation Aggregation (Agentic Evaluation)
Evaluating agentic systems requires more than looking at a single interaction in isolation. With Children Annotation Aggregation, parent interactions are now automatically annotated based on the quality of their child interactions.
Why this matters:
In agentic workflows, structure is hierarchical:
- An Agent may invoke multiple tools and LLM calls
- A Chain coordinates a sequence of steps
- A Root represents the full end-to-end execution
A parent interaction might look fine on its own - but if its children fail, hallucinate, or misuse tools, the overall outcome isn’t truly successful.
What’s new:
- Automatic Upward Propagation - Parent interactions inherit annotations based on their children
- Configurable Rules - Define thresholds (e.g. “mark bad if any child is bad” or “mark bad if more than 50% of children are bad”)
- Type Filtering - Apply aggregation only to specific child interaction types (LLM, tool, etc.)
The result is a more honest, system-level view of agentic behavior - one that reflects what actually happened beneath the surface.

Example of children annotation aggregation: the interaction gets a Bad auto-annotation if any of its direct Tool or LLM children is annotated Bad.
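To make the aggregation rules concrete, here is a minimal Python sketch of one possible rule evaluator. It is an illustration under stated assumptions, not the platform's implementation: `Child`, `aggregate_parent`, and `bad_fraction_threshold` are hypothetical names, unannotated children are assumed to be ignored, and the sketch leaves the parent unannotated when the rule does not fire, since these notes do not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Child:
    """Hypothetical view of a direct child interaction."""
    interaction_type: str       # e.g. "LLM", "Tool", "Agent", "Chain"
    annotation: Optional[str]   # "Good", "Bad", or None when unannotated


def aggregate_parent(
    children: Iterable[Child],
    child_types: Set[str],
    bad_fraction_threshold: float = 0.0,
) -> Optional[str]:
    """Return "Bad" when the share of bad children exceeds the threshold.

    A threshold of 0.0 means "bad if any child is bad";
    a threshold of 0.5 means "bad if more than 50% are bad".
    """
    # Type filtering: only the selected child types take part in the rule.
    # Assumption: children without an annotation are skipped.
    relevant = [
        c for c in children
        if c.interaction_type in child_types and c.annotation is not None
    ]
    if not relevant:
        return None  # nothing to aggregate

    bad_count = sum(1 for c in relevant if c.annotation == "Bad")
    bad_fraction = bad_count / len(relevant)

    # Upward propagation: the parent inherits "Bad" when the rule fires.
    return "Bad" if bad_fraction > bad_fraction_threshold else None
```

With the default threshold and `child_types={"Tool", "LLM"}`, this reproduces the rule in the example above: one Bad Tool or LLM child is enough to mark the parent Bad, while Bad children of other types are ignored.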
RBAC Safeguards for Owners
To protect organizational stability, we’ve added an important safeguard to role management.
What’s changed:
- Every organization must always have at least one Owner
- An Owner can't downgrade their own role unless another Owner already exists
This prevents accidental lockouts and ensures there’s always someone with the permissions required to manage configuration, users, and critical settings.
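The safeguard amounts to an invariant check on every role change. The Python sketch below shows one way to express it; `change_role`, `LastOwnerError`, and the role strings are hypothetical and are not taken from the Deepchecks API.

```python
from typing import Dict

OWNER = "owner"  # illustrative role identifier


class LastOwnerError(Exception):
    """Raised when a change would leave the organization without an Owner."""


def change_role(roles: Dict[str, str], user: str, new_role: str) -> None:
    """Update a user's role in place, refusing to demote the last Owner."""
    if user not in roles:
        raise KeyError(user)

    is_demotion = roles[user] == OWNER and new_role != OWNER
    if is_demotion:
        # Count the Owners that would remain after this change.
        other_owners = [u for u, r in roles.items() if r == OWNER and u != user]
        if not other_owners:
            raise LastOwnerError(
                f"{user} is the only Owner; assign another Owner first"
            )

    roles[user] = new_role
```

Because the check runs before the assignment, a rejected change leaves the role mapping untouched, and promoting a second Owner first makes the demotion succeed.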
Tool Use Interaction Deprecation
The legacy Tool Use interaction type has been deprecated.
What to know:
- Tool Use is no longer a standalone interaction type
- All agentic workflows now use the unified agentic interaction model: Root, Agent, Chain, Tool, and LLM
This change simplifies the interaction model and better reflects how modern agentic systems actually operate, while enabling more consistent evaluation and aggregation across hierarchies.