Dataset Management

Overview

Datasets are curated collections of test samples used to systematically evaluate your LLM application's performance. Unlike production interactions that reflect real user behavior, datasets provide controlled, reproducible test scenarios that help you measure quality improvements across versions, catch regressions, and validate changes before deployment. Each dataset contains input samples (and optionally expected outputs) that you can run against your application to generate evaluation scores.

What are Datasets?

A dataset is a named collection of test samples within an application. Each sample consists of:

Input (required) - The test input to send to your application (can be a string, JSON object, or array)
Reference Output (optional) - Expected or reference output for comparison
Metadata (optional) - Additional context like test category, difficulty level, or sample tags

Datasets serve multiple purposes: regression testing across versions, benchmarking performance improvements, evaluating model changes, and validating prompt modifications before production rollout.
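
The sample structure described above can be sketched as a small helper. This is a minimal illustration, not part of the SDK: make_sample is a hypothetical name, and the dictionary keys (input, output, sample_metadata) are taken from the SDK example later on this page.

```python
from typing import Any, Optional


def make_sample(
    input_value: Any,
    output: Optional[Any] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Build one sample dict. Hypothetical helper; only the input is required."""
    # The input may be a string, a JSON object (dict), or an array (list).
    if not isinstance(input_value, (str, dict, list)):
        raise TypeError("input must be a string, dict, or list")
    sample = {"input": input_value}
    # Optional parts are included only when provided.
    if output is not None:
        sample["output"] = output
    if metadata is not None:
        sample["sample_metadata"] = metadata
    return sample


sample = make_sample(
    {"prompt": "What is machine learning?"},
    output={"expected": "ML is..."},
    metadata={"category": "definitions"},
)
```

Omitting the optional arguments yields a sample that contains only the input key.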

Creating Datasets

Via SDK

from deepchecks_llm_client import DeepchecksLLMClient

client = DeepchecksLLMClient(api_token="your-token", host="your-host")

# Create a new dataset
dataset = client.create_dataset(
    app_name="my-app",
    dataset_name="regression-tests-v1"
)

# Add samples
samples = [
    {
        "input": {"prompt": "What is machine learning?"},
        "output": {"expected": "ML is..."},
        "sample_metadata": {"category": "definitions"}
    },
    {
        "input": {"prompt": "Explain neural networks"},
        "output": {"expected": "Neural networks are..."},
        "sample_metadata": {"category": "concepts"}
    }
]

client.add_dataset_samples(
    app_name="my-app",
    dataset_name="regression-tests-v1",
    samples=samples
)

Via UI

Navigate to your application's Datasets page
Click Create Dataset or the Generate Data button
Provide a descriptive dataset name
Add samples manually, upload a CSV, or use AI generation (see AI Data Generation)
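
If your test cases already live in a CSV file and you prefer the SDK route, the rows can be converted into sample dictionaries first. The sketch below is an assumption-laden illustration: the column names (input, output, category) and the samples_from_csv helper are hypothetical, and the CSV layout expected by the UI upload may differ.

```python
import csv
import io


def samples_from_csv(csv_text: str) -> list:
    """Convert CSV text into sample dicts. Column names are assumed, not official."""
    samples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        sample = {"input": {"prompt": row["input"]}}
        # Empty cells are treated as "not provided".
        if row.get("output"):
            sample["output"] = {"expected": row["output"]}
        if row.get("category"):
            sample["sample_metadata"] = {"category": row["category"]}
        samples.append(sample)
    return samples


csv_text = (
    "input,output,category\n"
    "What is machine learning?,ML is...,definitions\n"
    "Explain neural networks,,concepts\n"
)
samples = samples_from_csv(csv_text)
```

The resulting list has the same shape as the samples list in the SDK example, so it could be passed to client.add_dataset_samples in the same way.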

Managing Dataset Samples

Viewing Samples

The dataset details page displays all samples within a dataset. Each row shows:

Input preview
Output preview (if provided)
Metadata tags (if provided)
Actions (edit, delete)

Adding Samples

Add samples manually, either one at a time or in batches:

Single Sample (UI):

Open the dataset
Click Add Sample at the bottom of the screen
Enter input (required) and output (optional)
Add metadata as key-value pairs (optional)
Save

Batch Upload (SDK):

# Add up to 55000 samples per API call
client.add_dataset_samples(app_name="my-app", dataset_name="regression-tests-v1", samples=samples)

In addition, the Add Samples button in the top-right corner of the screen lets you add samples to the dataset by uploading a CSV or JSON file, or by using AI generation.
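
When a sample list is larger than what you want to send in one request, it can be split into batches before upload. The following is a minimal sketch: chunked is a hypothetical helper, and the batch size of 3 is chosen only to keep the example small.

```python
from typing import Iterator


def chunked(samples: list, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of at most batch_size samples (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = [{"input": f"question {i}"} for i in range(7)]
batches = list(chunked(samples, batch_size=3))

# Each batch would then be sent in its own call, for example:
# for batch in batches:
#     client.add_dataset_samples(app_name="my-app",
#                                dataset_name="regression-tests-v1",
#                                samples=batch)
```

Batching keeps each request small and makes it easier to retry a single failed call without resending everything.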

Editing Samples

Update existing samples by clicking the edit icon:

Modify input or output content
Update metadata tags
Changes are saved immediately

Deleting Samples

Remove individual samples using the delete icon, or delete the entire dataset from the dataset table. Warning: Dataset deletion is permanent and removes all associated samples.

Dataset Limits

To keep datasets performant and manageable, the following limits apply:

Maximum samples per dataset: 500
Maximum tokens per field: 100,000 tokens (~400,000 characters)
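
These limits can be checked locally before uploading. The sketch below is an approximation under stated assumptions: check_limits is a hypothetical helper, the fields checked (input and output) are a guess at what counts as a field, and character length is used only as a rough proxy for the token limit, based on the ~400,000 character figure above.

```python
import json

MAX_SAMPLES_PER_DATASET = 500
APPROX_MAX_CHARS_PER_FIELD = 400_000  # rough stand-in for 100,000 tokens


def check_limits(samples: list) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if len(samples) > MAX_SAMPLES_PER_DATASET:
        problems.append(
            f"{len(samples)} samples exceeds the limit of {MAX_SAMPLES_PER_DATASET}"
        )
    for index, sample in enumerate(samples):
        for field in ("input", "output"):  # assumed set of size-limited fields
            if field not in sample:
                continue
            value = sample[field]
            # Non-string values are measured by their JSON serialization.
            text = value if isinstance(value, str) else json.dumps(value)
            if len(text) > APPROX_MAX_CHARS_PER_FIELD:
                problems.append(
                    f"sample {index}: '{field}' has {len(text)} characters"
                )
    return problems


problems = check_limits([{"input": "What is machine learning?"}])
```

Because the real limit is expressed in tokens, a sample that passes this character check may still be rejected, and vice versa.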