July 2025 Release Notes
by Pranav Shikarpur
New Features:
- Added support for multimodal CV evals, with metrics and inference visualization in the Arthur Platform.
- Users can now optionally configure attributes to segment over when defining metrics.
- Support for consuming OpenTelemetry (OTEL) traces emitted from LLM and agentic applications.
- Support for segmenting metrics on values (including model version ID, prompt version ID, etc.).
- Support for experiment tracking in the Arthur Dashboard, which now offers the ability to segment metrics (e.g., by prompt version or model version).
- New navigation bar to improve usability and discoverability of platform functionality.
- Support for non-docker installation methods for the Arthur Engine.
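
To illustrate what segmenting a metric by attribute values means, here is a minimal sketch in plain Python. It is not the Arthur API: the `segment_metric` helper, the record fields (`prompt_version`, `model_version`, `hallucination_rate`), and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def segment_metric(records, segment_keys, value_key):
    """Group records by the given attribute names and average one metric per group.

    Hypothetical helper for illustration; not part of any Arthur SDK.
    """
    groups = defaultdict(list)
    for record in records:
        # The segment is the tuple of attribute values for this record.
        segment = tuple(record[key] for key in segment_keys)
        groups[segment].append(record[value_key])
    return {segment: mean(values) for segment, values in groups.items()}


# Hypothetical inference records; field names and values are made up.
records = [
    {"prompt_version": "v1", "model_version": "m1", "hallucination_rate": 0.25},
    {"prompt_version": "v1", "model_version": "m1", "hallucination_rate": 0.75},
    {"prompt_version": "v2", "model_version": "m1", "hallucination_rate": 0.125},
]

# One averaged metric value per prompt version.
by_prompt = segment_metric(records, ["prompt_version"], "hallucination_rate")

# Segmenting over two attributes at once yields one value per combination.
by_prompt_and_model = segment_metric(
    records, ["prompt_version", "model_version"], "hallucination_rate"
)
```

Passing more than one attribute name produces one metric value per combination of values, which is the general idea behind comparing prompt versions or model versions side by side.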
Enhancements:
- Made significant performance improvements to the PII detection model, resulting in fewer false positives.
- The inference deep dive table now returns up to 50 rows per page.
- Improved hallucination detection for numbered lists and other structured formats.
- Introduced a configurable max-token limit for hallucination checks, helping users fine-tune thresholds for context.
- The metrics task ID is now exposed in the GenAI model UX.
- Fixed the filters UI for Inference Deep Dive.