NVIDIA NIM
This guide outlines how to integrate Deepchecks LLM Evaluation with your NVIDIA NIM models to monitor and analyze their performance.
Prerequisites
Before you begin, ensure you have the following:
- A Deepchecks LLM Evaluation account.
- The NVIDIA NIM framework set up and running. For more information, see the NVIDIA NIM docs.
- A Python environment with the deepchecks-llm-client and requests packages installed (pip install deepchecks-llm-client requests).
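Before moving on, it can help to confirm that both packages are importable from the interpreter you plan to use. The following is a minimal sketch; the helper name `missing_modules` is hypothetical and not part of either library.

```python
import importlib.util

# Import names of the two packages listed in the prerequisites.
REQUIRED_MODULES = ["deepchecks_llm_client", "requests"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names of top-level modules that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Prints the list of missing modules, or a confirmation if none are missing.
print(missing_modules() or "All required packages are importable.")
```

If the list is not empty, rerunning the pip command above in the same environment should resolve it.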
Integration Steps
- Initialize Deepchecks Client
from deepchecks_llm_client.client import DeepchecksLLMClient

dc_client = DeepchecksLLMClient(
    api_token="YOUR_API_KEY"
)
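Hard-coding the API key is fine for a quick test, but you may prefer to read it from the environment. The sketch below assumes an environment variable named `DEEPCHECKS_API_TOKEN`; that name and the helper `load_api_token` are illustrative choices, not something the Deepchecks client requires.

```python
import os


def load_api_token(env_var: str = "DEEPCHECKS_API_TOKEN") -> str:
    """Read the API token from an environment variable (name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Deepchecks API token."
        )
    return token
```

The result can then be passed to the client in place of the literal string, for example DeepchecksLLMClient(api_token=load_api_token()).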
- Log Interaction with NIM Models
Here's an example of how to log interactions with a model deployed on NIM:
from deepchecks_llm_client.data_types import AnnotationType, EnvType
import requests

# Configure NIM endpoint
endpoint = "http://localhost:9999/v1/completions"
headers = {"accept": "application/json", "Content-Type": "application/json"}

def log_nim_interaction(user_input):
    # Make prediction using NIM model
    data = {
        "model": "llama-2-7b",  # Replace with your model name
        "prompt": user_input,
        "max_tokens": 100,
        # ... other parameters
    }
    response = requests.post(endpoint, headers=headers, json=data)
    response.raise_for_status()  # Fail early on HTTP errors
    # Assumes an OpenAI-style completions response; adjust if your endpoint returns a different shape
    prediction = response.json()["choices"][0]["text"]

    # Log interaction to Deepchecks
    dc_client.log_interaction(
        app_name="YOUR APP NAME",
        version_name="YOUR VERSION NUMBER",
        env_type=EnvType.EVAL,
        input=user_input,
        output=prediction,
        annotation=AnnotationType.UNKNOWN  # Add annotation if available
    )

# Example usage
user_input = "Translate 'Hello world' to French."
log_nim_interaction(user_input)
This code snippet demonstrates how to:
- Use the requests library to interact with the NIM endpoint.
- Make predictions using the endpoint.
- Log the interaction data (input, output) to Deepchecks using the log_interaction method.
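The output logged to Deepchecks is most useful as plain text rather than a raw JSON payload. The sketch below shows one way to pull the generated text out of the response defensively. It assumes the endpoint returns an OpenAI-style completions body (a `choices` list whose items carry a `text` field); the helper name `extract_completion_text` is hypothetical, and the sample payload is hand-written for illustration, not a captured response.

```python
from typing import Any, Mapping


def extract_completion_text(payload: Mapping[str, Any]) -> str:
    """Return the text of the first completion choice, or raise if it is absent."""
    choices = payload.get("choices") or []
    if not choices:
        raise ValueError(f"No completion choices in response: {payload!r}")
    text = choices[0].get("text")
    if not isinstance(text, str):
        raise ValueError("First choice has no 'text' field")
    return text.strip()


# Hand-written sample in the assumed response shape.
sample = {"choices": [{"index": 0, "text": " Bonjour le monde"}]}
print(extract_completion_text(sample))  # prints "Bonjour le monde"
```

Raising on an unexpected shape keeps malformed responses from being logged as if they were model output.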
- View Insights in Deepchecks Dashboard
Once you've logged interactions, head over to the Deepchecks LLM Evaluation dashboard to analyze your model's performance. You can explore various insights, compare versions, and monitor production data.