Dataset Management

Overview

Datasets are curated collections of test samples used to systematically evaluate your LLM application's performance. Unlike production interactions that reflect real user behavior, datasets provide controlled, reproducible test scenarios that help you measure quality improvements across versions, catch regressions, and validate changes before deployment. Each dataset contains input samples (and optionally expected outputs) that you can run against your application to generate evaluation scores.

What are Datasets?

A dataset is a named collection of test samples within an application. Each sample consists of:

  • Input (required) - The test input to send to your application (can be a string, JSON object, or array)
  • Reference Output (optional) - Expected or reference output for comparison
  • Metadata (optional) - Additional context like test category, difficulty level, or sample tags
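As a rough illustration of the sample structure described above, the sketch below builds one sample as a plain Python dict. The key names (input, output, sample_metadata) follow the SDK example on this page; the make_sample helper itself is hypothetical and not part of the SDK.

```python
import json
from typing import Any, Optional


def make_sample(input_value: Any, output: Any = None,
                metadata: Optional[dict] = None) -> dict:
    """Build one dataset sample as a plain dict (hypothetical helper, not an SDK call)."""
    # The input is required and may be a string, a JSON object, or an array.
    if not isinstance(input_value, (str, dict, list)):
        raise TypeError("input must be a string, a JSON object (dict), or an array (list)")
    if isinstance(input_value, str) and not input_value.strip():
        raise ValueError("input is required and must not be empty")

    sample = {"input": input_value}
    # Reference output and metadata are optional, so they are only added when given.
    if output is not None:
        sample["output"] = output
    if metadata:
        sample["sample_metadata"] = dict(metadata)

    # Fail early if any value cannot be serialized to JSON.
    json.dumps(sample)
    return sample


sample = make_sample(
    {"prompt": "What is machine learning?"},
    metadata={"category": "definitions"},
)
print(sample)
```

Building samples through one small function like this keeps optional fields out of the payload when they are not provided, which makes it easier to spot samples that lack a reference output.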

Datasets serve multiple purposes: regression testing across versions, benchmarking performance improvements, evaluating model changes, and validating prompt modifications before production rollout.

Creating Datasets

Via SDK

from deepchecks_llm_client import DeepchecksLLMClient

client = DeepchecksLLMClient(api_token="your-token", host="your-host")

# Create a new dataset
dataset = client.create_dataset(
    app_name="my-app",
    dataset_name="regression-tests-v1"
)

# Add samples
samples = [
    {
        "input": {"prompt": "What is machine learning?"},
        "output": {"expected": "ML is..."},
        "sample_metadata": {"category": "definitions"}
    },
    {
        "input": {"prompt": "Explain neural networks"},
        "output": {"expected": "Neural networks are..."},
        "sample_metadata": {"category": "concepts"}
    }
]

client.add_dataset_samples(
    app_name="my-app",
    dataset_name="regression-tests-v1",
    samples=samples
)

Via UI

  1. Navigate to your application's Datasets page
  2. Click Create Dataset or the Generate Data button
  3. Provide a descriptive dataset name
  4. Add samples manually, upload a CSV, or use AI generation (see AI Data Generation)
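If your test cases already live in a CSV file, you may prefer to convert them to the sample format used by the SDK instead of uploading through the UI. The sketch below is one way to do that with the standard library. The column names (input, output, category) are assumptions for this example and are not a documented CSV schema.

```python
import csv
import io


def rows_to_samples(csv_text: str) -> list:
    """Convert CSV text into a list of sample dicts (column names are assumed)."""
    samples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        text = (row.get("input") or "").strip()
        if not text:
            continue  # input is required, so rows without one are skipped

        sample = {"input": text}
        expected = (row.get("output") or "").strip()
        if expected:
            sample["output"] = expected
        category = (row.get("category") or "").strip()
        if category:
            sample["sample_metadata"] = {"category": category}
        samples.append(sample)
    return samples


csv_text = (
    "input,output,category\n"
    "What is machine learning?,ML is...,definitions\n"
    "Explain neural networks,,concepts\n"
)
samples = rows_to_samples(csv_text)
print(samples)
```

The resulting list has the same shape as the samples list in the SDK example, so it could be passed to client.add_dataset_samples in the same way.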

Managing Dataset Samples

Viewing Samples

The dataset details page displays all samples within a dataset. Each row shows:

  • Input preview
  • Output preview (if provided)
  • Metadata tags (if provided)
  • Actions (edit, delete)

Adding Samples

Add samples manually, either one at a time or in batches:

Single Sample (UI):

  1. Open the dataset
  2. Click Add Sample at the bottom of the screen
  3. Enter input (required) and output (optional)
  4. Add metadata as key-value pairs (optional)
  5. Save

Batch Upload (SDK):

# Add up to 55000 samples per API call
client.add_dataset_samples(app_name="my-app", dataset_name="regression-tests-v1", samples=samples)

In addition, the Add Samples button in the top-right corner of the screen lets you add samples by uploading a CSV or JSON file, or via AI generation.
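When a list of samples is larger than you want to send in one request, it can be split into smaller batches before uploading. The following is a minimal sketch of that idea. The chunked helper and the batch size of 3 are illustrative assumptions, so choose a size that fits the per-call limit of your deployment.

```python
from typing import Iterator, List


def chunked(samples: List[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `samples`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = [{"input": f"question {i}"} for i in range(7)]
batches = list(chunked(samples, batch_size=3))

for batch in batches:
    # In a real upload, each batch would be passed as the `samples` argument
    # of client.add_dataset_samples, as in the SDK example on this page.
    print(len(batch), "samples in this batch")
```

Order is preserved across batches, so uploading them in sequence would add the samples in their original order.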

Editing Samples

Update existing samples by clicking the edit icon:

  • Modify input or output content
  • Update metadata tags
  • Changes are saved immediately

Deleting Samples

Remove individual samples using the delete icon, or delete the entire dataset from the dataset table. Warning: Dataset deletion is permanent and removes all associated samples.

Dataset Limits

To ensure optimal performance and manageability:

  • Maximum samples per dataset: 500
  • Maximum tokens per field: 100,000 tokens (~400,000 characters)
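Checking samples against these limits before uploading can save a failed request. The sketch below is a hypothetical pre-flight check, not an SDK feature. It uses the character estimate given above as a stand-in for token counting, and it assumes the per-field limit applies to the input and output fields.

```python
import json

MAX_SAMPLES_PER_DATASET = 500
# Rough proxy for the 100,000-token limit (about 4 characters per token).
MAX_CHARS_PER_FIELD = 400_000


def check_limits(samples, existing_count=0):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []

    total = existing_count + len(samples)
    if total > MAX_SAMPLES_PER_DATASET:
        problems.append(
            f"dataset would hold {total} samples (limit {MAX_SAMPLES_PER_DATASET})"
        )

    for index, sample in enumerate(samples):
        for field in ("input", "output"):  # assumed to be the size-limited fields
            value = sample.get(field)
            if value is None:
                continue
            # Non-string values are measured by their JSON serialization.
            text = value if isinstance(value, str) else json.dumps(value)
            if len(text) > MAX_CHARS_PER_FIELD:
                problems.append(
                    f"sample {index}: '{field}' has {len(text)} characters "
                    f"(limit ~{MAX_CHARS_PER_FIELD})"
                )
    return problems


print(check_limits([{"input": "What is machine learning?"}]))
```

Because the character count is only an approximation of the token count, a sample close to the threshold may still be rejected or accepted by the service regardless of what this check reports.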