0.7.0 Release Notes

This page covers the updates in our 0.7.0 release, which includes new features as well as stability and performance improvements.

Deepchecks LLM Evaluation 0.7.0 Release

  • 👍👎 Annotation capabilities and filtering
  • 🤓 Improved view for interaction
  • 👯‍♂️ Similarity based comparison on user-given IDs
  • ⌛️ Pending status for samples that are in the auto-annotation pipeline
  • 𝍈 Dark Mode in Beta

🚧 Note: Breaking Changes

This release includes several breaking changes in the SDK and REST API integrations with Deepchecks LLM Evaluation. Please consult the 0.7.0 API Changes (breaking 0.6.0) guide for more information.

What's New and Improved

  • Annotation capabilities and filtering
    • Download and Annotate Flow: Deepchecks now supports updating annotations in a CSV file and re-logging the samples to the system. If identical samples are uploaded, they are not duplicated; instead, their annotation values are overridden. This lets the user download all samples or any selected subset, annotate them offline, and upload the updated results to the same application version.
    • Annotations logged by users within the system can now include a textual reason.
    • Annotations and estimated annotations can be filtered by reason type, using the new "Reason" column on the Data page.
    • Annotations can be viewed and updated from both the Data page and the individual interaction view.
  • Improved view for interaction
    • Interaction Input and Output can be seen together in the interaction view, enabling more efficient annotation.
    • The annotation and annotation reason were added to the interaction view.
    • Improved property values view within the interaction view: the viewable properties are consistent throughout all phases and include the properties selected on the Data page, followed by the properties selected on the Overview page.
  • Similarity based comparison on user-given IDs
    • For similarity-based annotation, user_interaction_id is used to identify similar samples across versions. If that ID is not supplied, Deepchecks auto-generates IDs and uses the input field to identify similarity.
  • Pending status
    • Interactions with an "Unknown" annotation are fed into the auto-annotation pipeline. The new "Pending" status marks interactions whose calculation is still in progress and for which the pipeline has not yet returned a result. Possible results are "Estimated Bad", "Estimated Good", or "Unknown" when the pipeline cannot produce an estimate with the given auto-annotation configuration.
  • Dark Mode in Beta
    • Currently configurable within the "Workspace Settings" page.
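
The Download and Annotate Flow described above can be pictured with a small, self-contained sketch. It does not use the Deepchecks SDK: the column names (`annotation`, `annotation_reason`), the helper functions, and the in-memory `store` are all hypothetical stand-ins, and treating `user_interaction_id` as the key that makes two samples "identical" is an assumption made only for illustration. The sketch shows the intended behavior: annotations are filled in offline, and re-logging the same samples overrides their annotation values without creating duplicates.

```python
import csv
import io

# Hypothetical column names; the real CSV export may use different headers.
ID_COL = "user_interaction_id"
ANNOTATION_COL = "annotation"
REASON_COL = "annotation_reason"


def annotate_offline(csv_text, updates):
    """Return the rows of a downloaded CSV with annotations filled in.

    `updates` maps an interaction ID to an (annotation, reason) pair.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        if row[ID_COL] in updates:
            row[ANNOTATION_COL], row[REASON_COL] = updates[row[ID_COL]]
    return rows


def relog(store, rows):
    """Toy model of re-logging samples to the same application version.

    Assumption: samples sharing an ID count as identical, so they are not
    duplicated; only their annotation values are overridden.
    """
    for row in rows:
        existing = store.get(row[ID_COL])
        if existing is None:
            store[row[ID_COL]] = dict(row)
        else:
            existing[ANNOTATION_COL] = row[ANNOTATION_COL]
            existing[REASON_COL] = row[REASON_COL]
    return store


# A made-up export of two samples that have not been annotated yet.
downloaded = (
    "user_interaction_id,input,output,annotation,annotation_reason\n"
    "a1,What is 2+2?,4,unknown,\n"
    "a2,Capital of France?,Lyon,unknown,\n"
)

# The stand-in for what the system already holds for this version.
store = {row[ID_COL]: row for row in csv.DictReader(io.StringIO(downloaded))}

# Annotate one sample offline, then re-log the whole file.
rows = annotate_offline(downloaded, {"a2": ("bad", "Wrong city")})
relog(store, rows)
```

After the re-log, `store` still holds two samples; only the annotation and reason of `a2` have changed. In a real integration, the upload step would go through the SDK or REST API as documented in the 0.7.0 API Changes guide.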