0.13.0 Release Notes
This version includes enhancements to version comparison, two additional similarity metrics (ROUGE and BLEU), expansions to the insights mechanism, and further feature, stability, and performance improvements as part of our 0.13.0 release.
Deepchecks LLM Evaluation 0.13.0 Release
- 👀 Similarity additions: added ROUGE and BLEU, and sorting by similarity is now supported
- 👠 Version comparison improvements: version metadata, updated Versions screen and comparison dialog
- 💡 Expanded insights mechanism
- 📂 Applications can now be created with the SDK
What's New and Improved
- Similarity Additions
  - ROUGE and BLEU metrics are now calculated between the outputs of every two similar interactions across versions (marked by having the same user_interaction_id), in addition to the existing Deepchecks similarity.
  - In the Data screen and in the Versions screen, the Similarity column can now be used for sorting.
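To make the cross-version ROUGE and BLEU comparison concrete, here is a minimal sketch of how such scores could be computed for outputs that share a user_interaction_id. It is not the Deepchecks implementation, which these notes do not describe. The function names (`rouge_n`, `bleu`, `compare_versions`), whitespace tokenization, the ROUGE-1 F1 variant, and BLEU limited to unigrams and bigrams are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """ROUGE-N F1 score: n-gram overlap balanced between recall and precision."""
    ref, cand = ngrams(reference.split(), n), ngrams(candidate.split(), n)
    overlap = sum((ref & cand).values())  # Counter & Counter keeps minimum counts
    if overlap == 0:
        return 0.0
    recall = overlap / sum(ref.values())
    precision = overlap / sum(cand.values())
    return 2 * precision * recall / (precision + recall)


def bleu(reference, candidate, max_n=2):
    """Sentence-level BLEU with uniform weights up to max_n and a brevity penalty."""
    ref_tokens, cand_tokens = reference.split(), candidate.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        ref, cand = ngrams(ref_tokens, n), ngrams(cand_tokens, n)
        total = sum(cand.values())
        clipped = sum((ref & cand).values())  # clipped n-gram matches
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(clipped / total))
    if len(cand_tokens) >= len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref_tokens) / len(cand_tokens))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


def compare_versions(version_a, version_b):
    """Score outputs that share a user_interaction_id across two versions.

    Both arguments are assumed to be dicts mapping user_interaction_id -> output text.
    """
    scores = {}
    for uid in version_a.keys() & version_b.keys():
        a, b = version_a[uid], version_b[uid]
        scores[uid] = {"rouge1": rouge_n(a, b), "bleu": bleu(a, b)}
    return scores


if __name__ == "__main__":
    v1 = {"q1": "the cat sat on the mat", "q2": "hello there"}
    v2 = {"q1": "the cat sat on a mat", "q3": "unmatched interaction"}
    # Only "q1" appears in both versions, so only it is scored.
    print(compare_versions(v1, v2))
```

Interactions present in only one version are skipped, mirroring the idea that similarity is defined only for pairs with a matching user_interaction_id.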
- Version Comparison Improvements
  - New design for the Versions screen: you can switch between environments, or click a single version to expand it and see all of its environments. Versions can now be compared across multiple properties.
  - Version metadata can be added when creating or editing a version. The version description is shown on hover in the comparison screen; the other fields are visible in edit mode.
  - When viewing different interactions across versions, the interactions can now be browsed together (when scrolling and exploring the different interaction sections).
- Expanded Insights Mechanism
  - Insights for weak segment detection now run on all characteristics, including topics, custom properties, and LLM properties, in addition to the built-in properties as before.
- Create Application via SDK
  - Applications can be created either as part of the init of the Deepchecks Client (see the relevant section in SDK Quickstart) or with the create_application function in the SDK.