0.13.0 Release Notes

This release improves version comparison, adds two similarity metrics (ROUGE and BLEU), expands the insights mechanism, and includes further features along with stability and performance improvements.

Deepchecks LLM Evaluation 0.13.0 Release

👀 Similarity additions: added ROUGE and BLEU and allow sorting by similarity
👭 Version comparison improvements: version metadata, updated Versions screen and comparison dialog
💡 Expanded insights mechanism
📂 Application can now be created with SDK

What's New and Improved

Similarity Additions
- ROUGE and BLEU are now calculated between the outputs of each pair of matching interactions across versions (interactions that share the same user_interaction_id), in addition to the existing Deepchecks similarity.
- In the Data screen and in the Versions screen, the Similarity column can now be used for sorting.
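To illustrate the idea of comparing outputs that share a user_interaction_id, here is a minimal sketch in plain Python. It is not the Deepchecks implementation: it assumes each version is a dict mapping user_interaction_id to output text, uses whitespace tokenization, and computes only a simplified ROUGE-1 F1 (unigram overlap). The function names are hypothetical and exist only for this example.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection clips each token count to the smaller of the two.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def similarity_table(version_a: dict, version_b: dict) -> list:
    """Score every interaction present in both versions, lowest similarity first.

    Both arguments map user_interaction_id -> output text (an assumed layout).
    Interactions that appear in only one version are skipped.
    """
    shared_ids = version_a.keys() & version_b.keys()
    rows = [
        {
            "user_interaction_id": uid,
            "rouge1_f1": rouge1_f1(version_b[uid], version_a[uid]),
        }
        for uid in shared_ids
    ]
    # Sort by score, then by id so ties come out in a stable order.
    rows.sort(key=lambda row: (row["rouge1_f1"], row["user_interaction_id"]))
    return rows


if __name__ == "__main__":
    v1 = {"q1": "the cat sat on the mat", "q2": "hello world"}
    v2 = {"q1": "the cat sat on the mat", "q2": "goodbye world", "q3": "new"}
    for row in similarity_table(v1, v2):
        print(row)
```

Sorting with the lowest score first surfaces the interactions whose outputs changed most between versions, which is the typical reason to sort by a similarity column.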
Version Comparison Improvements
- The Versions screen has a new design: you can switch between environments, or click a single version to expand it and see all of its environments. Versions can now be compared across multiple properties.
- Version metadata can be added when creating or editing a version. The version description is shown on hover in the comparison screen; other fields are visible in edit mode.
- When viewing different interactions across versions, the interactions can now be browsed together (when scrolling and exploring the different interaction sections).
Expanded Insights Mechanism
- Insights for weak segment detection now run on all characteristics, including topics, custom properties, and LLM properties, in addition to the built-in properties covered previously.
Create Application via SDK
- An application can now be created either as part of initializing the Deepchecks Client (see the relevant section in the SDK Quickstart) or with the create_application function in the SDK.