Sending Evaluation Data via Deepchecks SDK

To add interactions to an evaluation set, follow the steps below. They work both for a version name that already exists in the system and for a new version (with a new version name):

Read the interactions into your Python code (how you do this depends on how you saved them):

import pandas as pd
interactions_df = pd.read_csv('new_interactions_to_add.csv')
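
Before going further, it can help to confirm that the file contains the mandatory 'input' and 'output' columns used in the steps below. The following is a minimal sketch, not part of the Deepchecks SDK: `missing_columns` is a hypothetical helper that accepts any collection of column names, so it could be called with `interactions_df.columns`.

```python
def missing_columns(available, required=("input", "output")):
    """Return the required column names that are absent from `available`."""
    present = set(available)
    return [name for name in required if name not in present]


# Example with a plain list of column names; with pandas, pass interactions_df.columns
missing = missing_columns(["input", "full_prompt"])
if missing:
    print(f"Missing mandatory columns: {missing}")
```

Failing early here avoids a KeyError later, when each row is converted into an interaction.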

Set up the Deepchecks Python client:

from deepchecks_llm_client.client import DeepchecksLLMClient

dc_client = DeepchecksLLMClient(
    api_token="Your Deepchecks API Token Here (get it from within the app)"
)

Now convert each DataFrame row into a LogInteraction object, keeping only the columns that need to be sent:

from deepchecks_llm_client.data_types import AnnotationType, LogInteraction
# Mandatory columns
mandatory_columns = ['input', 'output']
# Optional columns (highly recommended to include any of these whenever possible)
optional_columns = ['user_interaction_id', 'information_retrieval', 'annotation', 'full_prompt']
# Custom properties
custom_properties_columns = ['My Custom Property']

interactions = []
for index, row in interactions_df.iterrows():
  # Prepare arguments for LogInteraction object, only include optional columns if they exist
  interaction_args = {
    'input': row['input'],
    'output': row['output'],
    'custom_props': {key: row[key] for key in custom_properties_columns if key in row and not pd.isnull(row[key])}
  }

  # Add optional arguments if the column is present and the value is not NaN
  for column in optional_columns:
    if column in row and not pd.isnull(row[column]):
      if column == 'annotation':
        interaction_args[column] = AnnotationType(row[column])
      else:
        interaction_args[column] = row[column]

  # Create a LogInteraction object from the prepared arguments
  interaction = LogInteraction(**interaction_args)
  interactions.append(interaction)

Now log the interactions:

from deepchecks_llm_client.data_types import ApplicationType, EnvType

# Only needed if the application does not exist yet
dc_client.create_application("GVHD-demo", ApplicationType.QA)

dc_client.log_batch_interactions(
  app_name="GVHD-demo", 
  version_name="baseline",
  env_type=EnvType.EVAL,
  interactions=interactions
)

Notes:

For existing versions, this only works for interactions whose 'user_interaction_id' doesn't exist yet in the evaluation set. If some ids may already exist there, it's recommended to fetch each evaluation version's data as a DataFrame via the SDK (using the 'get_data' function) and then check that each new id is absent from its 'user_interaction_id' column before logging.
If there is data from multiple versions (for example, with different outputs for each version), a similar workflow can log the new interactions to multiple versions in parallel. Advanced users can take inspiration from the 'Cloning Interactions from Production into the Evaluation Set' section above.
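
The id check described in the first note can be sketched as follows. This is an illustration under stated assumptions, not the SDK's own mechanism: `keep_new_interactions` is a hypothetical helper, the rows are plain dicts, and the existing ids are assumed to have been collected beforehand (for example from the 'user_interaction_id' column of the DataFrame fetched via 'get_data'). Rows without an id are kept, on the assumption that there is nothing to collide with.

```python
def keep_new_interactions(rows, existing_ids, id_key="user_interaction_id"):
    """Return the rows whose id is not already present in `existing_ids`."""
    seen = set(existing_ids)
    fresh = []
    for row in rows:
        row_id = row.get(id_key)
        if row_id is None:
            # No id to compare against; assumed safe to send
            fresh.append(row)
            continue
        if row_id in seen:
            # Already in the evaluation set (or repeated earlier in this batch)
            continue
        seen.add(row_id)
        fresh.append(row)
    return fresh


# Hypothetical ids already present in the evaluation set
existing = ["a1", "a2"]
candidates = [
    {"user_interaction_id": "a2", "input": "q1", "output": "r1"},
    {"user_interaction_id": "a3", "input": "q2", "output": "r2"},
    {"input": "q3", "output": "r3"},
]
new_rows = keep_new_interactions(candidates, existing)
print([row.get("user_interaction_id") for row in new_rows])
```

Only the filtered rows would then be converted to interactions and logged. Adding each accepted id to `seen` also drops ids that are repeated within the new batch itself.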