LLMs


This section describes how to integrate Deepchecks with common LLM providers:

  • OpenAI
  • Azure OpenAI
  • Vertex AI
  • Anthropic
  • Nvidia NIM
  • Oracle Cloud (OCI)
  • AWS Bedrock
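Whichever provider you use, the integration pattern is the same: call the model, then capture the input/output pair as a record you can upload to Deepchecks for evaluation. The sketch below is illustrative only: `call_llm` is a hypothetical stand-in for any provider client (OpenAI, Anthropic, Bedrock, ...), and the record's field names are an assumption for this example, not the actual SDK schema — see the provider-specific pages above and the Data Upload guide for the real calls.

```python
# Hypothetical stand-in for a real provider call
# (e.g. an OpenAI chat completion or a Bedrock InvokeModel request).
def call_llm(prompt: str) -> str:
    return f"(model answer to: {prompt})"

def build_interaction(user_input: str) -> dict:
    """Capture one input/output pair as an upload-ready record.

    Field names here are illustrative, not Deepchecks' SDK schema.
    """
    output = call_llm(user_input)
    return {
        "input": user_input,  # what the user asked
        "output": output,     # what the model returned
    }

record = build_interaction("Summarize this product review.")
print(record["input"])
```

In a real integration you would replace `call_llm` with your provider's client call and batch such records for upload via the Deepchecks SDK.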

Updated about 1 year ago