
This guide outlines how to integrate Deepchecks LLM Evaluation with your Vertex AI models to monitor and analyze their performance. Deepchecks provides comprehensive tools for evaluating LLM-based applications, including:

  • Data logging and enrichment: Capture interactions with your Vertex AI models, including inputs, outputs, and annotations. Deepchecks automatically enriches this data with valuable insights like topics, properties, and estimated annotations.
  • Performance comparison: Compare different versions of your LLM pipeline side-by-side to track improvements and identify regressions.
  • Golden set testing: Evaluate your models on a curated set of examples to ensure consistent performance across versions.
  • Production monitoring: Monitor your models in production to detect issues and ensure they are performing as expected.

Prerequisites

Before you begin, ensure you have the following:

  • A Deepchecks LLM Evaluation account.
  • A Google Cloud project with Vertex AI enabled.
  • A Python environment with the deepchecks-llm-client and google-cloud-aiplatform packages installed (pip install deepchecks-llm-client google-cloud-aiplatform).
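
If you have not yet authenticated and initialized the Vertex AI SDK, a minimal setup might look like the following. The project ID and region are placeholders; aiplatform.init is the standard entry point of the google-cloud-aiplatform package.

from google.cloud import aiplatform

# Authenticate first, e.g. with `gcloud auth application-default login`.
# Substitute your own project ID and the region where your endpoint is deployed.
aiplatform.init(
    project="YOUR_GCP_PROJECT_ID",
    location="us-central1",
)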

Integration Steps

  1. Initialize Deepchecks client:
from deepchecks_llm_client.client import DeepchecksLLMClient  

dc_client = DeepchecksLLMClient(
  api_token="YOUR_API_KEY"
)

Replace the placeholder with your actual Deepchecks API key. The application name and version name are provided later, when logging each interaction.

  2. Log Interactions with Vertex AI:

Here's an example of how to log interactions with a Vertex AI text generation model:

from deepchecks_llm_client.data_types import AnnotationType, EnvType
from google.cloud import aiplatform

# Configure the Vertex AI endpoint client. Use the full resource name,
# e.g. "projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID",
# or just the endpoint ID if aiplatform.init() was called with your
# project and location.
endpoint_name = "YOUR_ENDPOINT_NAME"
endpoint = aiplatform.Endpoint(endpoint_name)

def log_vertex_ai_interaction(user_input):
    # Make a prediction using the Vertex AI endpoint. The instance and
    # prediction schema ("content" here) depends on the deployed model.
    response = endpoint.predict(instances=[{"content": user_input}])
    prediction = response.predictions[0]["content"]

    # Log the interaction to Deepchecks
    dc_client.log_interaction(
        app_name="YOUR_APP_NAME",
        version_name="YOUR_VERSION_NAME",
        env_type=EnvType.EVAL,
        input=user_input,
        output=prediction,
        annotation=AnnotationType.UNKNOWN,  # pass a real annotation if available
    )

# Example usage
user_input = "Write a poem about the ocean."
log_vertex_ai_interaction(user_input)

This code snippet demonstrates how to:

  • Use the google.cloud.aiplatform library to interact with your Vertex AI endpoint.
  • Make predictions using the endpoint.
  • Log the interaction data (input, output) to Deepchecks using the log_interaction method.
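
If user feedback is available, you can pass a concrete annotation instead of AnnotationType.UNKNOWN. The helper below is a minimal, hypothetical sketch that maps thumbs-up/thumbs-down feedback onto the AnnotationType enum; falling back to UNKNOWN lets Deepchecks estimate the annotation automatically.

from typing import Optional

from deepchecks_llm_client.data_types import AnnotationType

def annotation_from_feedback(thumbs_up: Optional[bool]) -> AnnotationType:
    # Hypothetical helper: map UI feedback onto Deepchecks annotations.
    if thumbs_up is None:
        return AnnotationType.UNKNOWN
    return AnnotationType.GOOD if thumbs_up else AnnotationType.BAD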
  3. View Insights in the Deepchecks Dashboard:
    Once you've logged interactions, head over to the Deepchecks LLM Evaluation dashboard to analyze your model's performance. You can explore various insights, compare versions, and monitor production data.

Advanced Options

Deepchecks offers several advanced features for fine-grained control and analysis:

  • Updating Annotations and Custom Properties: You can update annotations and custom properties for logged interactions.
  • Logging Steps: For complex LLM pipelines, you can log individual steps with their inputs and outputs.
  • Additional Interaction Data: Log additional data like timestamps, user IDs, and custom properties for richer analysis (see the sketch following this list).
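
As an illustration, the sketch below shows how such additional data might be attached to a single logged interaction. The custom_props, user_interaction_id, started_at, and finished_at parameters are assumptions based on common Deepchecks SDK usage rather than a verified signature; check the SDK reference for the exact options supported by your client version.

from datetime import datetime, timezone

from deepchecks_llm_client.data_types import AnnotationType, EnvType

# Assumed parameters below (custom_props, user_interaction_id,
# started_at, finished_at) -- verify against your SDK version.
dc_client.log_interaction(
    app_name="YOUR_APP_NAME",
    version_name="YOUR_VERSION_NAME",
    env_type=EnvType.PROD,                    # log against production, not EVAL
    input="Write a poem about the ocean.",
    output="The waves roll in...",
    annotation=AnnotationType.GOOD,           # a known-good response
    user_interaction_id="session-42-turn-1",  # assumed: stable ID for later updates
    started_at=datetime.now(timezone.utc),    # assumed: timing metadata
    finished_at=datetime.now(timezone.utc),
    custom_props={"user_tier": "free"},       # assumed: custom properties
)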