
Use Custom Properties and LLM Properties

Using Custom Categorical Properties

A powerful way to monitor the performance of your LLM-based application is to categorize its interactions by criteria you choose. This approach helps identify which types of input questions your application handles well and which categories are underperforming.

📘

Custom Properties in Deepchecks

Custom Properties are user-defined. To display a Custom Property in the Deepchecks system, users must upload interactions with the specified Custom Property value. Interactions lacking a value in the Custom Property field will be automatically assigned an N/A value.

In our GVHD use case, we introduced a categorical property named “Question Type,” which classifies input questions into the following five categories:

  1. General GVHD Question
  2. Treatments & Medications
  3. Emergency / Urgent
  4. Appointments & Services
  5. Other
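As a minimal sketch of how such a property might be prepared before upload, the plain-Python snippet below attaches a "Question Type" value to each interaction and mirrors the N/A fallback described above. The record layout, the `custom_props` key, and the `with_question_type` helper are assumptions made for illustration; they are not the Deepchecks SDK API, so refer to the SDK reference for the actual upload call.

```python
from typing import Optional

# The five categories used in the GVHD example.
QUESTION_TYPES = (
    "General GVHD Question",
    "Treatments & Medications",
    "Emergency / Urgent",
    "Appointments & Services",
    "Other",
)


def with_question_type(interaction: dict, question_type: Optional[str]) -> dict:
    """Return a copy of the interaction with a "Question Type" custom property.

    Hypothetical helper: a missing value becomes "N/A", mirroring what the
    system displays for interactions uploaded without the property.
    """
    if question_type is not None and question_type not in QUESTION_TYPES:
        raise ValueError(f"Unknown question type: {question_type!r}")
    record = dict(interaction)  # shallow copy so the caller's dict is untouched
    custom_props = dict(record.get("custom_props", {}))
    custom_props["Question Type"] = (
        question_type if question_type is not None else "N/A"
    )
    record["custom_props"] = custom_props
    return record


# Placeholder interactions; the second one has no category assigned.
interactions = [
    with_question_type({"input": "What is GVHD?", "output": "..."}, "General GVHD Question"),
    with_question_type({"input": "Hello", "output": "..."}, None),
]
print([i["custom_props"]["Question Type"] for i in interactions])
```

Validating against a fixed tuple of category names keeps typos from creating unintended extra categories in the property view.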

To filter by a specific category on the Data page:

  1. Select the category within the property view on the Overview page and click the “See Interactions” button.

  2. Alternatively, use the “Filter Categories” option in the property filters section on the Data page, and select the desired categories to be shown.

    Results in version v2_improved_IR: Switching between versions while a categorical property filter is active makes it easy to compare them. You’ll observe changes in the relative annotation scores and property scores for these specific interaction subgroups. The larger information pool in version v2_improved_IR contributes to improved responses, particularly for questions in the “Emergency / Urgent” and “Appointments & Services” categories, which version v1_gpt3.5 struggled with.
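The comparison that the category filter enables in the UI can be sketched offline as a group-by over exported interactions. This is only an illustration: the row fields (`version`, `question_type`, `score`) are an assumed export layout, and the scores are made-up placeholders rather than real results from the GVHD data.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_category(rows):
    """Average the score of each (version, question type) subgroup."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["version"], row["question_type"])].append(row["score"])
    return {key: mean(scores) for key, scores in grouped.items()}


# Placeholder rows for illustration only; not real evaluation results.
rows = [
    {"version": "v1_gpt3.5", "question_type": "Emergency / Urgent", "score": 0.0},
    {"version": "v1_gpt3.5", "question_type": "Emergency / Urgent", "score": 1.0},
    {"version": "v2_improved_IR", "question_type": "Emergency / Urgent", "score": 1.0},
    {"version": "v2_improved_IR", "question_type": "Emergency / Urgent", "score": 1.0},
]

summary = mean_score_by_category(rows)
for (version, category), avg in sorted(summary.items()):
    print(f"{version} | {category} | mean score {avg:.2f}")
```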

Using LLM Properties

Another valuable way to evaluate a version’s performance is to add LLM Properties to it. Refer to our LLM Properties documentation page for more information.

For instance, in our GVHD use case, we observed a recurring output pattern where many answers begin with phrases like “Based on the provided context …”. Other interactions contain similar phrases that imply the output was generated by an LLM using a given context to answer the input question.
This pattern may not be desirable in a Q&A type application.
To highlight this observation, we can create a Custom LLM Property as follows:

New LLM Property Content

Property Name: Based on context
Description: Highlight outputs mentioning it is "based on the context" it was provided with.
System Message:

  1. Read the entire output answer and analyze if it implies that it was based on a given context. If so, give it a high score. Otherwise, give it a low score.
  2. For example, outputs containing "Based on the provided context..", "Based solely on the context..", "The provided context.." or any similar phrase to these should get a high score.

Interaction Steps for Property: output
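For a quick sanity check of the scores this property produces, a simple regular-expression heuristic can flag the same phrases locally. Note that this swaps in keyword matching for the LLM judgement the property actually uses, so it will miss paraphrases that an LLM would catch. The phrase list comes from the System Message examples above; the `mentions_context` helper and the extra tolerance for "only" and "given" are assumptions.

```python
import re

# Matches "Based on the provided context", "Based solely on the context",
# "The provided context" and close variants, case-insensitively.
_CONTEXT_PHRASE = re.compile(
    r"\b(?:based\s+(?:solely\s+|only\s+)?on\s+the\s+(?:provided\s+|given\s+)?context"
    r"|the\s+provided\s+context)\b",
    re.IGNORECASE,
)


def mentions_context(output: str) -> bool:
    """Return True if the output explicitly refers to its provided context."""
    return _CONTEXT_PHRASE.search(output) is not None


# Placeholder outputs for illustration.
samples = [
    "Based on the provided context, GVHD can occur after a transplant.",
    "GVHD can occur after a transplant.",
]
for text in samples:
    print(mentions_context(text), "|", text)
```

Outputs flagged by the heuristic should generally be among those that receive a high score from the LLM property; large disagreements may suggest the System Message needs refining.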

Recalculate LLM Property

By default, LLM properties are calculated only for data uploaded after they are defined, so recalculation is needed to get results for data that is already in the system. After saving the property's definition, recalculate it with the following settings:
Versions: Select All (2)
Environment: Both

After a brief calculation period, the new property will appear on the Overview page. When sorting or filtering by this property, you’ll see that outputs implying they are using a given context in their answer indeed receive higher scores.

To explore another relevant LLM Property for our GVHD use case, return to the Properties page and add the built-in “Document Relevancy” LLM property from the property templates. As stated in its description, this property evaluates “How relevant the retrieved documents (information_retrieval) are to the user input”.
Applying this property to our GVHD application reveals significant score differences between the two versions, highlighting the gaps in the information pool each version relies on.

You can see another example of LLM Custom Property usage on the Monitor Production Data page.