What properties are in Deepchecks LLM Evaluation, what kinds of properties there are, and how they are used.

Properties are one-dimensional values that are calculated on each text sample. A property can be a simple text characteristic, such as the number of words in the text, or something more complex, such as whether the text contains toxic language or whether a given summary captures the key points of the original article.
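To make the idea concrete, here is a minimal sketch of the simplest kind of property mentioned above, a word count computed per text sample. The function name `word_count` and the sample texts are illustrative assumptions, not part of Deepchecks; the system's own implementation may tokenize differently.

```python
def word_count(text: str) -> int:
    """Hypothetical property: number of whitespace-separated words in the text."""
    return len(text.split())


# Illustrative samples (assumed data, not taken from any real task).
samples = ["The cat sat on the mat.", "Hello"]

# One value per text sample, which is what makes this a property.
values = [word_count(sample) for sample in samples]
```

Each sample maps to exactly one number, so the result can be averaged, sorted, or filtered like any other property.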

What are properties used for?

Properties measure various aspects of our LLM interactions that we may be interested in. They are used in the following ways:

  • Properties are used to create estimated annotations. By defining rules on the calculated properties, you can create a flow that automatically estimates the quality of each LLM interaction. For example, by default, summarization interactions with low Conciseness are deemed to be bad interactions.
  • Average values of calculated properties are shown in the Dashboard screen. You can then dive into a specific property and see interactions with extreme values, such as extremely irrelevant answers.
  • Properties are shown in the data page, where they can be used to sort and filter the viewed interactions. This is useful, for example, if you wish to see only the interactions with Toxicity > 0.5, and perhaps combine that filter with additional ones (e.g., a specific topic).
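The three uses above can be sketched in plain Python. This is an illustration of the logic only, not the Deepchecks API: the interaction records, the `topic` field, the 0-to-1 score scale, and the `CONCISENESS_THRESHOLD` value are all assumptions made for the example.

```python
from statistics import mean

# Assumed data: each interaction carries already-calculated property values.
interactions = [
    {"id": 1, "Conciseness": 0.9, "Toxicity": 0.1, "topic": "billing"},
    {"id": 2, "Conciseness": 0.2, "Toxicity": 0.7, "topic": "billing"},
    {"id": 3, "Conciseness": 0.8, "Toxicity": 0.6, "topic": "shipping"},
]

CONCISENESS_THRESHOLD = 0.5  # hypothetical cutoff for "low Conciseness"


def estimate_annotation(interaction: dict) -> str:
    """Rule on a property: low Conciseness is estimated as a bad interaction."""
    return "bad" if interaction["Conciseness"] < CONCISENESS_THRESHOLD else "good"


# 1. Estimated annotations derived from a rule on a property.
annotations = {i["id"]: estimate_annotation(i) for i in interactions}

# 2. Average property value, as a dashboard would summarize it.
average_toxicity = mean(i["Toxicity"] for i in interactions)

# 3. Filtering (Toxicity > 0.5 combined with a topic) and sorting by a property.
toxic_billing = [
    i["id"] for i in interactions if i["Toxicity"] > 0.5 and i["topic"] == "billing"
]
most_toxic_first = sorted(interactions, key=lambda i: i["Toxicity"], reverse=True)
```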

What kinds of properties are there?

Deepchecks LLM Evaluation has three types of properties: Built-in, Custom, and LLM.

Built-in properties

You can read more about our Built-in properties in the dedicated section.

Custom properties

Custom properties are values that the user passes alongside the interaction fields, such as the LLM input and output. For example, you may want to know from what device a specific question was sent to your system. You can then define "Device" as a custom categorical property in the Custom Properties screen. If you then upload your data to the system as a csv file and the csv contains a Device column, its values are automatically added to the system.
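As a rough illustration of what such a file might look like, the sketch below parses a small csv with Python's standard `csv` module and reads the Device column. Only the "Device" column name comes from the example above; the `input` and `output` column names and the row contents are assumptions, so check the upload format your system expects.

```python
import csv
import io

# Assumed csv layout: interaction fields plus one custom property column.
csv_text = (
    "input,output,Device\n"
    "How do I reset my password?,Open Settings and choose Reset.,mobile\n"
    "What is your refund policy?,Refunds are available for 30 days.,desktop\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))

# The Device value travels with each interaction as a categorical property.
devices = [row["Device"] for row in rows]
categories = sorted(set(devices))
```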

LLM properties

LLM properties are properties evaluated by LLMs, and are used to evaluate the more nuanced qualities of your LLM interactions. These are calculated by asking an LLM to grade your interaction according to given steps using a score of 1-5 (with higher being better).
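The general pattern of grading with an LLM can be sketched as follows. This is not how Deepchecks builds its prompts: the prompt wording, the function names `build_grading_prompt` and `parse_score`, and the example steps are hypothetical, and the actual call to an LLM is left out entirely.

```python
def build_grading_prompt(steps: list[str], components: dict[str, str]) -> str:
    """Assemble a grading prompt from numbered steps and interaction components."""
    numbered_steps = "\n".join(f"{n}. {step}" for n, step in enumerate(steps, start=1))
    fields = "\n".join(f"{name}: {text}" for name, text in components.items())
    return (
        "Grade the interaction below by following these steps:\n"
        f"{numbered_steps}\n\n"
        f"{fields}\n\n"
        "Reply with a single integer from 1 to 5, where higher is better."
    )


def parse_score(reply: str) -> int:
    """Validate that the model's reply is an integer score between 1 and 5."""
    score = int(reply.strip())  # raises ValueError if the reply is not an integer
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    return score


# Hypothetical steps and components, for illustration only.
prompt = build_grading_prompt(
    steps=["Read the summary.", "Judge whether its sentences flow logically."],
    components={"output": "The report covers Q3 revenue and next steps."},
)
```

Validating the reply matters because a model may return text outside the requested range or format; rejecting such replies keeps the property values on the 1-5 scale.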

When you create a task in Deepchecks LLM Evaluation, the task is initialized by default with a set of LLM properties appropriate for its task type. You can then add new LLM properties by defining the steps the LLM will follow when grading your interaction and the components of the interaction it will use. For example, the default LLM property "Coherence", used for summarization tasks, is defined as follows: