March 2026 Release Notes

Arthur Platform

Custom aggregation precision. Tests now return exact float values instead of rounded integers, preserving full numerical precision in model performance data.
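As an illustration of why unrounded values matter, here is a minimal Python sketch. It is not Arthur's implementation: the function name `mean_score` and the sample values are hypothetical. It shows two runs whose aggregates look identical once rounded to an integer but differ as floats.

```python
from statistics import fmean

def mean_score(values):
    """Return the exact float mean of a list of scores (hypothetical helper)."""
    return fmean(values)

# Hypothetical sample data for two test runs.
run_a = [0.50, 0.75, 1.00]
run_b = [1.00, 1.25, 1.00]

exact_a = mean_score(run_a)
exact_b = mean_score(run_b)

# Rounded to integers, the two runs are indistinguishable;
# the float values keep the difference visible.
print(round(exact_a), round(exact_b))
print(exact_a, exact_b)
```

With rounded integers, a threshold check on either run would see the same value; with floats, the gap between the runs is preserved.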

Project-based access enforcement. Users can now view only the models and alerts in projects they have permission to access, which prevents unauthorized data exposure across workspaces.
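The general idea of project-scoped visibility can be sketched as follows. This is an assumption-laden illustration, not the platform's actual enforcement logic: the `Model` class, the `visible_models` function, and the project IDs are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Model:
    # Hypothetical record: a model and the project that owns it.
    name: str
    project_id: str

def visible_models(models, allowed_project_ids):
    """Keep only models whose project the user may access."""
    allowed = set(allowed_project_ids)
    return [m for m in models if m.project_id in allowed]

# Hypothetical data: two models in two different projects.
models = [Model("churn", "proj-a"), Model("fraud", "proj-b")]

# A user with access to proj-a only sees the model in that project.
visible = visible_models(models, ["proj-a"])
print([m.name for m in visible])
```

Filtering by an allow-list of project IDs means a user with no project permissions sees nothing, rather than everything.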

Enhanced AWS integration. Updated boto3 dependencies and authentication libraries improve cloud service compatibility and security.
Component synchronization. Frontend and application plane components are now properly aligned, giving a consistent user experience across all platform interfaces.

Unified navigation system. All major product areas now use tabbed interfaces, replacing scattered navigation with a single entry point each for the RAG, Prompt, Evaluate, and Test features.
Dark mode improvements. Fixed contrast issues and converted all components to MUI theme colors for automatic dark mode support and better accessibility.
Settings reorganization. Moved global settings such as Model Providers and API Keys from the task sidebar to a dedicated settings gear menu.

Visual span selector. Users can now select data directly from the trace viewer when creating continuous evals, instead of typing it manually.
Inline eval creation. Create evaluations directly from the trace viewer, with side-by-side span inspection and a streamlined workflow.
Clickable trace links. Experiment test cases now include trace ID links that open the detailed trace viewer in a new tab.

Advanced task filtering. The All Tasks page now includes filter, sort, and visibility controls, including activity window filters.
Task archival system. Users can archive and unarchive tasks, with rules and metrics handled correctly, for better organization.
Enriched task metadata. Agent information now includes tools, sub-agents, models, and infrastructure details for better visibility.

Enhanced dataset search. Search now queries the full dataset instead of only the current page of results.
Wildcard transform support. Both the UI and the backend now support wildcard transforms, with visual configuration options.
Functional table sorting. Sort arrows now work properly across traces, spans, sessions, and users tables.

Fixed GCP span kind storage issues in the span_kind column.
The prompt playground now fetches all paginated prompts instead of only the first page.
Enhanced error handling for Anthropic API compatibility requirements.
Updated pypdf, NLTK, and Flask to address critical security vulnerabilities.
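
The prompt playground fix above concerns reading every page of a paginated list rather than stopping at the first. A minimal sketch of that pattern follows. It is not the platform's code: `fetch_all`, `fake_fetch_page`, the page size, and the sample data are hypothetical stand-ins for a real paginated API.

```python
def fetch_all(fetch_page, page_size=50):
    """Collect items from every page, not just the first.

    fetch_page(page, page_size) is assumed to return a list that is
    shorter than page_size (possibly empty) once the data runs out.
    """
    items = []
    page = 0
    while True:
        batch = fetch_page(page, page_size)
        items.extend(batch)
        if len(batch) < page_size:
            break  # a short or empty page means there is nothing left
        page += 1
    return items

# Stand-in for a paginated API: 120 fake prompts served in slices.
PROMPTS = [f"prompt-{i}" for i in range(120)]

def fake_fetch_page(page, page_size):
    start = page * page_size
    return PROMPTS[start:start + page_size]

all_prompts = fetch_all(fake_fetch_page)
print(len(all_prompts))
```

Stopping on a short page also handles totals that are an exact multiple of the page size, at the cost of one extra request that returns an empty page.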