
0.17.0 Release Notes

This version includes a new view of property values and scores over time for production data, property-related improvements, and further features plus stability and performance improvements, all part of our 0.17.0 release.

Deepchecks LLM Evaluation 0.17.0 Release

  • 📉 Overtime Production View for Monitoring
  • 💬 New Property: Information Density
  • 🏃🏼‍♀️‍➡️ Ability to Rerun Annotation Pipeline on Multiple Versions in Application
  • 🔤 OpenAI Support for LLM Properties
  • 🥑 Improvements to Relevance and Grounded in Context Properties

What’s New and Improved?

  • Overtime Production View for Monitoring

    • In the Production environment, annotation scores and property scores are now displayed over time.

    • Timestamps are taken from the "started_at" field of each interaction. If no timestamp was given, the time of upload is used as the interaction time.
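
The timestamp rule above can be sketched roughly as follows. This is an illustration only, not the product's implementation: the helper name `resolve_interaction_time`, the dictionary representation of an interaction, and the use of timezone-aware `datetime` values are all assumptions made for the example. Only the "started_at" field name comes from the release notes.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_interaction_time(
    interaction: dict, upload_time: Optional[datetime] = None
) -> datetime:
    """Pick the timestamp used to place an interaction on the over-time view.

    Hypothetical helper that mirrors the documented rule.
    """
    started_at = interaction.get("started_at")
    if started_at is not None:
        # A timestamp was supplied with the interaction: use it as-is.
        return started_at
    # No timestamp supplied: fall back to the upload time
    # (assumed here to default to the current time in UTC).
    return upload_time if upload_time is not None else datetime.now(timezone.utc)


upload = datetime(2024, 9, 1, 12, 0, tzinfo=timezone.utc)
with_ts = {"started_at": datetime(2024, 8, 30, 9, 15, tzinfo=timezone.utc)}
without_ts = {"input": "hello"}  # no "started_at" field

print(resolve_interaction_time(with_ts, upload))     # uses started_at
print(resolve_interaction_time(without_ts, upload))  # falls back to upload time
```

In practice this means that interactions uploaded in bulk without "started_at" will all appear at the upload time, so supplying the field is worthwhile when historical data is uploaded.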

  • New Property: Information Density

    • Information Density is a score between 0 and 1 that measures the proportion of statements in the output that convey information (e.g. facts, suggestions). Read more about it in the Information Density Property documentation.
    • It helps find places where the outputs aren’t actually useful, whether the desired information is missing (e.g. the answer is incomplete), avoided, or stated so generally that it does not directly address the user’s request.
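
As a rough illustration of the ratio described above, the score can be thought of as the number of informative statements divided by the total number of statements. The sketch below assumes the statements have already been split and labeled. How the property actually segments and classifies statements is not covered here, and the function name `information_density` and the error raised on empty input are choices made for this example.

```python
from typing import Sequence


def information_density(is_informative: Sequence[bool]) -> float:
    """Fraction of statements labeled as informative, in the range [0, 1]."""
    if not is_informative:
        # Assumption for this sketch: an output with no statements has no score.
        raise ValueError("at least one statement is required")
    return sum(is_informative) / len(is_informative)


# Hypothetical labels for a four-statement output:
# three statements convey information, one is filler.
labels = [True, True, False, True]
score = information_density(labels)
print(score)  # 0.75
```

A low score on an otherwise fluent answer is the signal of interest: most of its statements are filler, hedging, or generalities rather than facts or suggestions.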
  • Rerun Annotation Pipeline on Multiple Versions

    • After uploading a new customized “Auto-Annotation YAML” in the Annotation Config screen, the annotation can now be rerun on all versions and environments in the application, using the “Run Annotation Pipeline” button at the top right.

  • OpenAI Support for LLM Properties

    • LLM Properties can now run using OpenAI (instead of Azure OpenAI). This setting is organization-wide and can be enabled on request.
  • Improvements to Relevance and Grounded in Context Properties

    • Recall and precision are improved on a wide set of benchmarks. The accuracy of the Grounded in Context score is also improved when the grounding information is distributed across different documents.