June 2026 Release Notes
about 3 hours ago by Ashley Nader
Arthur Platform
Governance & Compliance
- Organization-level governance views. New org-scoped endpoints aggregate policy assignments, compliance status, and unregistered agent discovery across all workspaces, eliminating workspace-by-workspace navigation.
- Organization-level agent governance. New endpoints surface agent tools and LLM models across an entire organization in a single request, with full org-level permission gating.
- Per-action RBAC gating. Governance policy management now enforces fine-grained access control across create, edit, delete, assign, compliance check, and attestation flows.
- Role-based governance personas. Support for Governance Admin, Workspace Policy Manager, Policy Viewer, and Workspace Policy Assignment Manager ensures users only see and perform role-appropriate actions.
- Reduced-scope governance view. Separate org-level and workspace-level governance routing lets administrators view governance data at both global and workspace scopes.
- Granular permissions gating. Action buttons for alert rules and attestation rules are now hidden from users without the appropriate permissions.
Alerts & Monitoring
- Alert rule logging. A new alert logs table and API routes provide historical audit trails of alert rule executions, tracking invocation outcomes and status over configurable intervals.
- Alert rule drawer. A new side panel opens from the alerts timeline to display detailed rule information, performance metrics, and alert history without navigating away.
- Alert rule access from violations. Users can open the alert rule drawer directly from the Current Violations tab to inspect or modify rules in context.
Analytics & Dashboards
- Dashboard import/export. New functionality lets users export and import dashboards.
- TimescaleDB automated upgrade. An automated upgrade process supports the 2.14.2 → 2.27.2 migration path with lock-safe mechanisms, safety checks, and supervised reconciliation.
Cloud Storage & Integrations
- GCS image support. Connector listing permissions now enable extracting and working with images stored in Google Cloud Storage.
Prompts Playground
- Floating labels and matching animations. The highlighted text box component now matches other prompt message boxes with consistent labels and animations.
Platform Security
- reCAPTCHA protection. reCAPTCHA is now configured for the production environment, adding bot protection for production users.
- Starlette upgrade. Updated to 1.3.1 in the app-plane service to remediate CVE-2026-48710 (missing Host header validation), with a minimum secure version enforced.
Frontend Architecture
- Pro component sub-bundle. MUI X Pro components are split into a dedicated
@arthur/shared-components/proentry point, allowing unlicensed consumers to build and install the core package without an MUI X Pro license. - Trace drawer customization. A new
renderBelowAnnotationBarslot on TraceDrawerBody enables injecting custom content below the annotation bar.
Bug Fixes
- The application dashboard no longer crashes or freezes when rendering large alert datasets, via lazy-loading of the AlertsTimeline component.
- The workspace homepage no longer freezes from a 422 retry storm caused by a stale permission enum reference.
- Console errors, prop warnings, and React key warnings during normal app usage have been resolved.
- Users with workspace reader, workspace policy manager, and engine workspace roles can now access the list models endpoint without governance admin privileges.
- Permission checks on the applications page now apply correctly within individual project contexts.
- Editing cloned or template-based dashboards no longer modifies the original dashboard.
- Configuring model steps no longer crashes when an engine lacks an internal connector.
Arthur Engine & Toolkit
Multi-Tenancy
- Multi-tenancy support. First-class multi-tenancy enables organizations to securely isolate workloads and data across distinct tenants within a single deployment, with tenant-scoped data and access controls.
- Tenant access restrictions. Admin-only endpoints and the create task button are now disabled for tenant users, enforcing role-based access on settings and configuration.
- Organization-level token limits. Deployments can enforce per-org lifetime LLM token caps via an opt-in environment variable; exhausted organizations see an out-of-credits dialog and receive a
429 TOKEN_LIMIT_EXCEEDEDresponse.
Guardrails
- Guardrails management UI. A comprehensive UI for creating rules, viewing rule cards, listing all rules, and testing prompts against guardrail behavior.
- Persistent guardrail states. Saved guardrail configurations now persist on tasks and carry across sessions.
- Guardrail trace spans. Stateful validation now emits trace spans capturing execution time and outcomes for prompt and response validation flows.
- Guardrail invocation visualization. The trace viewer now shows which guardrails fired, their results, and timing via a summary bar and individual invocation rows in the trace drawer.
Evaluators & Validation
- First-class ML evaluator support. Built-in model-based scorers for PII detection (GLiNER + Presidio), toxicity classification, and prompt injection detection are unified alongside LLM evaluators with CRUD management, version control, and continuous eval integration.
- Prompt injection validation endpoint. A new
/validateendpoint enables standalone, lightweight prompt injection checks without full pipeline configuration. - PII detection improvements. PERSON detections containing digits are dropped in V1, V2 name validation is tightened to reduce false positives, and passport entity classification now consistently processes through GLiNER for migration consistency.
Onboarding & Guided Tours
- Config-driven tour engine. An interactive tour system with persistence, analytics tracking, and overlays guides users through platform workflows, including a comprehensive "Evals 101" task tour.
- Engine onboarding flow. A guided setup experience walks new users through initial configuration with contextual onboarding prompts and self-service skill management.
- Improved tour layout and reliability. A draggable side-panel checklist with persisted collapsed state, auto-scroll for highlighted elements and active checklist steps, scrollable modals with fixed footers, and accessible color contrast.
- Demo completion certificates. Completion certificates are persisted in Postgres and served via new API endpoints, enabling users to generate and share stable certificate links for the Intro to Evals walkthrough.
Datasets & Observability
- GCS and S3 image support. Datasets can reference images stored in Google Cloud Storage and Amazon S3, with automatic conversion for inline display.
- Span error visibility. A new SpanErrorPanel displays parsed error messages from trace spans in a dedicated panel within the traces UI.
- Alert rule status logging. Check results (okay, fired, or no data) are recorded at regular intervals, enabling historical tracking of alert performance per model and rule.
- Kubernetes audit logging. New PVC-backed persistent audit log storage with Helm templates and shared-filesystem provisioner documentation.
- Search in tasks list. Task search with memoized TaskCard components improves rendering performance.
- Task analytics aggregation. New backend aggregation routes move task overview and analyze computation to the database for faster loads, with success rate using CE pass rate and corrected time bucketing.
Integrations & Deployment
- Azure integrations. A new Azure provider icon in the model providers UI and an Azure Blob Storage connector for ml-engine enable native storage connectivity.
- Truefoundry integration. Documentation and assets enable connecting and monitoring Truefoundry deployments.
- TLS private-cert support for LLM endpoints. New
GENAI_ENGINE_OPENAI_PRIVATE_CERT_DOWNLOAD_URLandGENAI_ENGINE_OPENAI_VERIFY_SSLsettings enable connections to LLM endpoints secured with private CA or self-signed certificates. - Flexible deployment configuration. New options including
SCOPE_FE_INGRESS_URI,FETCH_RAW_DATA_ENABLEDin CloudFormation, EFS-backed PVC instructions, and optional Amplitude analytics and reCAPTCHA Enterprise support. - Configurable session replay sampling. A new
VITE_AMPLITUDE_REPLAY_SAMPLE_RATEbuild argument controls Amplitude session replay sampling rates during deployment.
Prompts Playground
- Formatted agentic messages. The playground displays formatted assistant and tool messages with support for tool calls, variables in arguments, and complete agentic workflow simulation.
Security & Dependency Updates
- Critical CVE patches. Patched over a dozen high-severity vulnerabilities across LangChain, starlette, cryptography, pyjwt, form-data, urllib3, PyArrow, pypdf, python-multipart, pydantic-settings, and SQLAlchemy, and replaced the unmaintained gray-matter library.
Bug Fixes
- Closed multiple multi-tenant security gaps including reCAPTCHA fail-open rejection, notebook ownership validation, org-scoped session trace pagination, and token-count credential handling.
- Restored VALIDATION-USER and CHAT-USER access on trace ingestion and default-validation endpoints after a multitenancy regression.
- Feedback for task-less inference now lands in the correct system org.
- Claude Code sub-agent span hierarchy now correctly represents nested agent invocations, and tool execution spans are restored in demo agent traces.
- Restored the
definitionfield onTraceTransformResponsefor backward compatibility. - Fixed task list sorting mismatches for consistent ordering across views.
- Corrected trace-to-dataset column schema display when a transform is selected.
- Fixed the Eval Name field on the New Continuous Eval page showing premature validation errors.
- Fixed broken documentation, privacy policy, and terms links on the welcome page, and corrected 404 page hard refresh behavior.
- Resolved numerous tour interaction issues including disabled submit buttons, anchored highlights, dropdown layering, target self-healing, premature section-complete dialogs, occluder registration, and debounced "target lost" hints.
- Fixed the Variables panel being obscured by floating input labels and improved the "Open in Playground" button loading state.
- Improved dark mode contrast and readability.