AWS Bedrock

This guide outlines how to integrate Deepchecks LLM Evaluation with your AWS Bedrock models to monitor and analyze their performance.


Before you begin, ensure you have the following:

  • A Deepchecks LLM Evaluation account.
  • An AWS account with Bedrock enabled.
  • Python environment with the deepchecks-llm-client and boto3 packages installed (pip install deepchecks-llm-client boto3).

Integration Steps

  1. Initialize Deepchecks Client:
from deepchecks_llm_client.client import dc_client
from deepchecks_llm_client.data_types import EnvType

    env_type=EnvType.EVAL,  # Change to EnvType.PROD for production monitoring

Replace the placeholders with your actual API key, application name, and version name.

  1. Log Interaction with AWS Bedrock Models:

Here's an example of how to log interactions with a Bedrock model using boto3:

from deepchecks_llm_client.data_types import LogInteractionType, AnnotationType
import boto3

# Configure Bedrock runtime client
bedrock_runtime = boto3.client("bedrock-runtime")

def log_bedrock_interaction(user_input, model_id):
    # Make prediction using Bedrock model
    response = bedrock_runtime.invoke_model(
        body=json.dumps({"inputText": user_input}),  # Adjust body for different models
    response_body = json.loads(response.get("body").read())
    prediction = response_body.get("results")[0].get("outputText")  # Adjust for different models

    # Log interaction to Deepchecks
        annotation=AnnotationType.UNKNOWN,  # Add annotation if available

# Example usage
user_input = "Write a poem about the beauty of nature."
model_id = "amazon.titan-tg1-large"  # Replace with your desired model ID
log_bedrock_interaction(user_input, model_id)

This code snippet demonstrates how to:

  • Use the boto3 library to interact with Bedrock models.
  • Make predictions using the invoke_model method.
  • Log the interaction data (input, output) to Deepchecks using the log_interaction method.
  1. View Insights in Deepchecks Dashboard:

Once you've logged interactions, head over to the Deepchecks LLM Evaluation dashboard to analyze your model's performance. You can explore various insights, compare versions, and monitor production data.

Note: This example provides a basic integration approach. You might need to adjust the prediction and logging logic based on your specific Bedrock model and use case.