August 2025 Release Notes

by Pranav Shikarpur

New Platform Features

  • Sneak Peek: Support for Agentic AI is now available in the Arthur Platform.

Engine Release Notes

New Features

  • Agentic monitoring is now supported in the GenAI Engine: Building on the recently added /traces/ API, this release introduces support for monitoring agentic behavior:
    • Tasks now include an is_agentic flag to enable targeted analysis and evaluation.
    • Metrics and traces APIs have been upgraded to support structured outputs, trace reconstruction, and intelligent defaults.
    • The engine selectively computes metrics for agentic tasks, improving the precision of evaluations.
  • Added support for a new database connector: We’ve introduced a new ODBC-based database connector with support for MSSQL, PostgreSQL, Oracle, and MySQL. It includes enhanced configuration options (e.g., table name, dialect) and standardized field naming for easier integration and future extensibility.
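
As an illustration of the connector options mentioned above (table name, dialect), here is a minimal sketch of how such a configuration might be validated before use. The class name, field names, and dialect identifiers are assumptions made for this example; they are not the connector's actual schema.

```python
from dataclasses import dataclass

# Dialects named in the release note. The lowercase identifiers are assumed.
SUPPORTED_DIALECTS = {"mssql", "postgresql", "oracle", "mysql"}


@dataclass(frozen=True)
class OdbcConnectorConfig:
    """Hypothetical connector settings; field names are illustrative only."""

    dialect: str
    host: str
    port: int
    database: str
    table_name: str

    def __post_init__(self) -> None:
        # Reject dialects outside the four listed in the release note.
        if self.dialect not in SUPPORTED_DIALECTS:
            raise ValueError(f"unsupported dialect: {self.dialect!r}")
        # A blank table name cannot identify a data source.
        if not self.table_name.strip():
            raise ValueError("table_name must not be empty")


def describe(config: OdbcConnectorConfig) -> str:
    """Return a short human-readable summary of the configured table."""
    return (
        f"{config.dialect}://{config.host}:{config.port}"
        f"/{config.database}#{config.table_name}"
    )
```

Validating at construction time means a misspelled dialect fails immediately rather than at the first query.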

Enhancements/Bug Fixes

  • Added a CloudFormation launch button with a pre-populated client ID.

  • Addressed API key validation latencies for users with large numbers of API keys.

  • Converted the hallucination LLM call to structured output to improve accuracy.

  • Added possible_segmentation tag to improve model segmentation diagnostics.

  • Addressed a bug related to incorrect function renaming after a refactor.

  • Guardrails Enterprise

    • Vulnerability Fix: Patched Pillow vulnerability CVE-2025-48379.
    • Enhancement: Introduced a feature flag for the PII rule that lets administrators toggle between standard and "strict" mode.
    • Bug Fix: Fixed an issue where PII detection missed email addresses in a certain format.

July 2025 Release Notes

by Pranav Shikarpur

New Features:

  • Added support for multimodal CV evals, with metrics and inference visualization, in the Arthur Platform.
  • Users can now optionally configure attributes to segment over when defining metrics.
  • The Engine installation flow now supports non-Docker installation methods.
  • Support for consuming OTEL traces emitted from LLM and agentic applications.
  • Support for segmenting metrics on values (including model version ID, prompt version ID, etc.).
  • Support for experiment tracking in the Arthur Dashboard, which now offers the ability to segment metrics (e.g., by prompt version or model version).
  • New navigation bar to improve usability and discoverability of platform functionality.
  • Support for non-Docker installation methods for the Arthur Engine.
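
To make the idea of segmenting a metric concrete, the sketch below groups per-inference results by an attribute such as prompt version and averages a metric within each group. The record layout and the names `segment_metric`, `prompt_version`, and `hallucinated` are hypothetical; the platform computes segmented metrics itself, and this only illustrates the concept.

```python
from collections import defaultdict
from statistics import mean


def segment_metric(records, metric, segment_by):
    """Average `metric` within each distinct value of `segment_by`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[segment_by]].append(record[metric])
    return {segment: mean(values) for segment, values in groups.items()}


# Hypothetical inference records: 1 means the response was flagged.
records = [
    {"prompt_version": "v1", "hallucinated": 1},
    {"prompt_version": "v1", "hallucinated": 0},
    {"prompt_version": "v1", "hallucinated": 0},
    {"prompt_version": "v1", "hallucinated": 0},
    {"prompt_version": "v2", "hallucinated": 0},
    {"prompt_version": "v2", "hallucinated": 0},
]

rates = segment_metric(records, metric="hallucinated", segment_by="prompt_version")
```

Comparing the per-segment values side by side is what makes this useful for experiment tracking: a change in prompt or model version shows up as a difference between segments rather than being averaged away.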

Enhancements:

  • Made significant performance improvements to the PII detection model, resulting in fewer false positives.
  • The inference deep dive table now returns up to 50 rows per page.
  • Improved hallucination detection for numbered lists and other structured formats.
  • Introduced configurable max-token limit for hallucination checks, helping users fine-tune thresholds for context.
  • The metrics task ID is now exposed in the GenAI model UX.
  • Filters UI fixes for Inference Deep Dive.
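
For readers working with paged results, the following sketch shows how a 50-row page limit translates into page counts and page slices. It is a generic client-side illustration; `page_count` and `paginate` are hypothetical helpers, not part of any Arthur API.

```python
import math
from typing import Iterator, List, Sequence

PAGE_SIZE = 50  # upper bound per page stated in the release note


def page_count(total_rows: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed to show `total_rows` rows."""
    return math.ceil(total_rows / page_size)


def paginate(rows: Sequence, page_size: int = PAGE_SIZE) -> Iterator[List]:
    """Yield consecutive pages holding at most `page_size` rows each."""
    for start in range(0, len(rows), page_size):
        yield list(rows[start:start + page_size])
```

For example, 120 rows split into two full pages and a final page with the remaining 20 rows.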