NVIDIA NIM

This guide outlines how to integrate Deepchecks LLM Evaluation with your NVIDIA NIM models to monitor and analyze their performance.

Prerequisites

Before you begin, ensure you have the following:

A Deepchecks LLM Evaluation account.
A NVIDIA NIM deployment that is set up and running. For more information, see the NVIDIA NIM documentation.
A Python environment with the deepchecks-llm-client and requests packages installed (pip install deepchecks-llm-client requests).

Integration Steps

Initialize Deepchecks Client

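from deepchecks_llm_client.client import DeepchecksLLMClient

# Create the client used to send interactions to Deepchecks
dc_client = DeepchecksLLMClient(
  api_token="YOUR_API_KEY"  # Replace with your Deepchecks API token
)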
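
To keep the token out of source code, it can be read from the environment before the client is created. The following is a minimal sketch, not part of the Deepchecks client: the helper read_api_token and the variable name DEEPCHECKS_API_TOKEN are assumptions chosen for illustration.

```python
import os


def read_api_token(env_var: str = "DEEPCHECKS_API_TOKEN") -> str:
    """Return the token stored in `env_var`, or raise if it is unset or empty.

    Both the helper and the default variable name are hypothetical.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Deepchecks API token."
        )
    return token


# Usage with the client shown above:
# dc_client = DeepchecksLLMClient(api_token=read_api_token())
```

Failing early with a clear message avoids sending requests with an empty token and debugging an authentication error later.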

Log Interaction with NIM Models

Here's an example of how to log interactions with a model deployed on NIM:

from deepchecks_llm_client.data_types import LogInteraction, AnnotationType, EnvType
import requests

# Configure NIM endpoint
endpoint = "http://localhost:9999/v1/completions"
headers = {"accept": "application/json", "Content-Type": "application/json"}

def log_nim_interaction(user_input):
    # Make prediction using NIM model
    data = {
        "model": "llama-2-7b",  # Replace with your model name
        "prompt": user_input,
        "max_tokens": 100,
        # ... other parameters
    }
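    response = requests.post(endpoint, headers=headers, json=data, timeout=30)
    response.raise_for_status()  # Stop here on an HTTP error instead of logging a bad output
    # Extract the generated text; assumes an OpenAI-compatible completions response
    prediction = response.json()["choices"][0]["text"]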

    # Log interaction to Deepchecks
    dc_client.log_interaction(
      app_name="YOUR APP NAME",
      version_name="YOUR VERSION NUMBER",
      env_type=EnvType.EVAL,
      interaction=LogInteraction(
        input=user_input,
        output=prediction,
        annotation=AnnotationType.UNKNOWN  # Add annotation if available
      )
    )

# Example usage
user_input = "Translate 'Hello world' to French."
log_nim_interaction(user_input)

This code snippet demonstrates how to:

Send a completion request to the NIM endpoint with the requests library.
Read the model's prediction from the response.
Log the interaction (input and output) to Deepchecks using the log_interaction method.
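
The logged output is most useful as plain text rather than a full JSON payload. The sketch below shows one way to pull the generated text out of a completions response. It assumes the endpoint returns an OpenAI-style payload with a "choices" list whose first entry has a "text" field; the helper name extract_completion_text is hypothetical.

```python
from typing import Any, Mapping


def extract_completion_text(payload: Mapping[str, Any]) -> str:
    """Return the generated text from a completions-style response payload.

    Assumes the shape {"choices": [{"text": "..."}]}; raises ValueError otherwise.
    """
    choices = payload.get("choices") or []
    if not choices:
        raise ValueError("Response contains no 'choices'; cannot extract text.")
    text = choices[0].get("text")
    if not isinstance(text, str):
        raise ValueError("First choice has no string 'text' field.")
    return text.strip()


# Usage inside a logging function such as the one above:
# prediction = extract_completion_text(response.json())
```

Raising on an unexpected payload keeps malformed responses from being logged as if they were model outputs.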

View Insights in the Deepchecks Dashboard

Once you've logged interactions, head over to the Deepchecks LLM Evaluation dashboard to analyze your model's performance. You can explore various insights, compare versions, and monitor production data.