April 2026 Release Notes

Arthur Engine & Toolkit

Evaluation & Continuous Monitoring

  • Unified Evaluators interface. A consolidated two-tab layout combines evaluator and continuous evaluation management, with inline controls and staleness warnings.
  • Bulk evaluation testing. Run continuous evaluations against multiple trace IDs simultaneously with dedicated test run history tracking.
  • Automated compliance scheduling. Models with compliance policies are now checked automatically every 24 hours, and on-demand tests can be run independently of that schedule.
  • Policy violation metrics. The new policy_alert_rule_check_count metric tracks individual rule violations, with detailed dimensions for historical trend analysis.
  • Direct trace evaluation. Select specific traces from the traces table and launch targeted evaluation test runs through picker dialogs.
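
As an illustration of what a per-rule violation count involves, here is a minimal Python sketch. It is not the Arthur implementation: the RuleCheck record and its fields (model_id, rule_id, violated) are hypothetical stand-ins, since these notes do not list the actual dimensions of policy_alert_rule_check_count.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class RuleCheck(NamedTuple):
    # Hypothetical record shape; the real metric dimensions are not listed here.
    model_id: str
    rule_id: str
    violated: bool


def count_violations(checks: Iterable[RuleCheck]) -> Counter:
    """Count violations per (model_id, rule_id) pair."""
    counts: Counter = Counter()
    for check in checks:
        if check.violated:
            counts[(check.model_id, check.rule_id)] += 1
    return counts


checks = [
    RuleCheck("model-a", "rule-1", True),
    RuleCheck("model-a", "rule-1", True),
    RuleCheck("model-a", "rule-2", False),
    RuleCheck("model-b", "rule-1", True),
]
violation_counts = count_violations(checks)
```

Keying the counter by the dimension tuple keeps each rule's count separate, so repeated violations of the same rule accumulate rather than collapsing into a single flag.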

Trace Management & Observability

  • Trace retention policies. Configure organization-wide automatic deletion of expired traces and spans. Deletion runs in the background in batches, with circuit breaker protection.
  • Enhanced trace filtering. User ID filters now support substring matching, and token counts are more accurate for multi-entry API responses.
  • GenAI Engine task ID propagation. OTLP spans now include task ID attributes for better correlation in external observability platforms.
  • Improved agent instrumentation. Better visibility into agents, skills, and subagent context propagation across tool spawns.
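
The combination of batch processing and a circuit breaker mentioned for trace retention can be sketched as follows. This is a simplified Python illustration under assumed names, not the engine's code: purge_expired, delete_batch, batch_size, and max_consecutive_failures are all hypothetical, and an in-memory set stands in for the trace store.

```python
from typing import Callable, Sequence


def purge_expired(
    trace_ids: Sequence[str],
    delete_batch: Callable[[Sequence[str]], None],
    batch_size: int = 100,
    max_consecutive_failures: int = 3,
) -> int:
    """Delete traces in batches and return how many were deleted.

    Stops early once too many batches fail in a row (the circuit "opens").
    """
    deleted = 0
    failures = 0
    for start in range(0, len(trace_ids), batch_size):
        batch = trace_ids[start:start + batch_size]
        try:
            delete_batch(batch)
        except Exception:
            failures += 1
            if failures >= max_consecutive_failures:
                break  # circuit opens: leave the rest for a later run
            continue
        failures = 0  # a success resets the failure streak
        deleted += len(batch)
    return deleted


# Healthy store: every batch succeeds.
store = {f"trace-{i}" for i in range(250)}
deleted_ok = purge_expired(sorted(store), store.difference_update)

# Unhealthy store: every batch fails, so the breaker trips after 3 attempts.
attempts = []


def failing_delete(batch: Sequence[str]) -> None:
    attempts.append(len(batch))
    raise RuntimeError("store unavailable")


deleted_failed = purge_expired([str(i) for i in range(1000)], failing_delete)
```

Bounding the number of consecutive failures keeps a background job from hammering a store that is already struggling; the remaining traces are simply picked up on the next run.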

User Experience & Interface

  • Interactive AI assistant. Engine Chatbot with intelligent query capabilities, automatic model provider detection, and natural language resource management commands.
  • Enhanced navigation. Infinite scroll on the All Tasks page removes the previous 50-task display limit.
  • Tag-based prompt filtering. Platform Management now displays tags on prompts and supports multi-select, server-side filtering for large prompt libraries.
  • Form protection dialogs. Confirmation prompts prevent accidental loss of transform builder configurations and dataset mappings when leaving a form.

Developer Experience & SDK

  • Configurable deployment options. Support for running multiple genai-engine stacks on a single machine, with customizable ports and CORS policies.
  • Graceful model compatibility. API calls for models with unknown pricing now succeed, with the cost defaulting to $0.00 instead of raising an exception.
  • Apple Silicon support. Resolved MPS device compatibility issues for SentenceTransformer inference on Apple Silicon Macs.
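
The pricing fallback described above can be pictured with a short Python sketch. The price table, the model names, and the estimate_cost helper are hypothetical and only illustrate the behavior: a model missing from the table yields a cost of 0.0 rather than an error.

```python
from typing import Mapping

# Hypothetical price table in USD per 1,000 tokens; the value is illustrative.
PRICE_PER_1K_TOKENS: Mapping[str, float] = {"example-model": 0.50}


def estimate_cost(
    model: str,
    tokens: int,
    prices: Mapping[str, float] = PRICE_PER_1K_TOKENS,
) -> float:
    """Return the estimated cost in USD, or 0.0 when the model has no known price."""
    rate = prices.get(model)
    if rate is None:
        return 0.0  # unknown pricing: fall back to zero instead of raising
    return rate * tokens / 1000


known_cost = estimate_cost("example-model", 2000)
unknown_cost = estimate_cost("brand-new-model", 2000)
```

One consequence of this design is that a $0.00 cost can mean either a free call or an unpriced model, so cost reports for newly released models are best read with that in mind.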

Security & Infrastructure

  • Enhanced dependency security. Pinned LiteLLM to version 1.80.0 and set a 3-day minimum release age for automated upgrades.
  • Standards-compliant HTTP. Chunked transfer encoding is now used by default, with header handling for 1xx and 204 status codes that follows the relevant RFCs.
  • Helm chart distribution. Enabled publishing to Docker Hub for open-source self-hosted deployments.
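
For context on the HTTP change: HTTP/1.1 (RFC 9112) does not allow a server to send a Transfer-Encoding header on 1xx or 204 responses. The Python sketch below shows one way to express that rule. The function names are hypothetical and this is not the engine's actual code.

```python
from typing import Dict


def may_send_transfer_encoding(status: int) -> bool:
    """Return False for 1xx and 204 responses, which must not carry Transfer-Encoding."""
    return not (100 <= status < 200 or status == 204)


def framing_headers(status: int) -> Dict[str, str]:
    """Default to chunked framing, except where the status code forbids it."""
    if may_send_transfer_encoding(status):
        return {"Transfer-Encoding": "chunked"}
    return {}


ok_headers = framing_headers(200)
no_content_headers = framing_headers(204)
switching_headers = framing_headers(101)
```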

Bug Fixes

  • Fixed continuous evaluation results pagination, which showed only a single page instead of the full record count.
  • Resolved system task bootstrap race conditions in multi-worker environments.
  • Corrected URL parameter handling by adding proper FastAPI path validation, which prevents parameters from being silently swapped.
  • Fixed the policy alert rule check metric so that it reports accurate violation counts instead of always reporting 1.0.