June 2026 Release Notes

Arthur Platform

Governance & Access Control

Organization-level governance views. New org-scoped endpoints aggregate policy assignments, compliance status, unregistered agents, agent tools, and LLM models across all workspaces, eliminating workspace-by-workspace navigation for admins.
Per-action RBAC gating for governance. Fine-grained permissions now cover create, edit, delete, assign, compliance, and attestation actions across Governance Admin, Workspace Policy Manager, Policy Viewer, and Assignment Manager personas.
Alert rule drawer. A new side panel opens from the alerts timeline and from the Current Violations tab, displaying rule details, performance metrics, and alert history without navigating away.
Permissions gating across governance features. Alert rules, attestation rules, and governance action buttons are now hidden from users who lack the required permissions; a RolesDebugPanel and enhanced PermissionDebugWidget support role validation.

Security & Account Protection

reCAPTCHA Enterprise for sign-up. Token verification is now applied to POST /api/v1/sign-up, protecting account registration against automated abuse.
Starlette upgraded to 1.3.1. Remediates CVE-2026-48710 (missing Host header validation) in the app-plane service.
arthur-client dependency patches. urllib3 bumped to 2.7.0 (CVE-2026-44431, CVE-2026-44432) and cryptography to 49.0.0 (GHSA-537c-gmf6-5ccf).
Renovate lockFileMaintenance enabled. Weekly automated lockfile refreshes keep transitive dependencies patched without manual intervention.

Analytics & Dashboards

Dashboard import/export. Users can now import and export dashboards via the updated Upsolve SDK.
TimescaleDB automated upgrade. A new migration path supports the 2.14.2 → 2.27.2 upgrade with lock-safe mechanisms, safety checks, and supervised reconciliation.
AlertsTimeline lazy loading. The application dashboard no longer crashes or freezes when rendering large alert datasets; the component now loads on viewport entry.
GCS image support. Connector permissions are now granted, allowing users to extract and display images stored in Google Cloud Storage in their workflows.

Trace Visualization & UI

renderBelowAnnotationBar slot. A new extensibility point in the TraceDrawerBody component lets developers inject custom content below the annotation bar.
Prompts Playground visual consistency. Floating labels and matching animations have been added to highlighted text fields, aligning them with assistant message components.
Onboarding tour moved to shared components. Tour engine, intro dialog, side panel, and spotlight widgets are now packaged as a reusable shared library for use across Arthur applications.

Bug Fixes

Fixed workspace list-models permission so reader, policy manager, and engine workspace roles can access the endpoint without governance admin privileges.
Fixed permission checks on the applications page for individual project contexts.
Fixed an issue where editing a cloned or template-based dashboard inadvertently modified the original dashboard.
Fixed crashes during model step configuration for engines without an internal connector by adding a graceful fallback.
Fixed a permissions check bug causing a 422 retry storm on the workspace homepage due to a stale WorkspaceListAlertRules enum reference.
Fixed "Connector ID not found" errors in the task-creation wizard by adding an idempotent post-bootstrap migration that backfills the missing engine_internal connector for existing data plane associations.
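
The engine_internal backfill above depends on the migration being idempotent, so that re-running it changes nothing. The sketch below illustrates that property with an INSERT guarded by NOT EXISTS against an in-memory SQLite database. The table and column names (data_plane_associations, connectors, association_id, kind) are hypothetical and chosen only for illustration; the actual schema and migration code are not described in these notes.

```python
import sqlite3


def backfill_engine_internal(conn: sqlite3.Connection) -> None:
    """Add an engine_internal connector to every association that lacks one.

    The NOT EXISTS guard makes the statement safe to run repeatedly.
    Table and column names are hypothetical.
    """
    conn.execute(
        """
        INSERT INTO connectors (association_id, kind)
        SELECT a.id, 'engine_internal'
        FROM data_plane_associations AS a
        WHERE NOT EXISTS (
            SELECT 1 FROM connectors AS c
            WHERE c.association_id = a.id AND c.kind = 'engine_internal'
        )
        """
    )
    conn.commit()


def count_connectors(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM connectors").fetchone()[0]


def demo() -> tuple:
    """Run the backfill twice and return the connector count after each run."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE data_plane_associations (id INTEGER PRIMARY KEY);
        CREATE TABLE connectors (
            id INTEGER PRIMARY KEY,
            association_id INTEGER NOT NULL,
            kind TEXT NOT NULL
        );
        -- Three associations, only the first already has its connector.
        INSERT INTO data_plane_associations (id) VALUES (1), (2), (3);
        INSERT INTO connectors (association_id, kind) VALUES (1, 'engine_internal');
        """
    )
    backfill_engine_internal(conn)
    after_first = count_connectors(conn)
    backfill_engine_internal(conn)  # second run should be a no-op
    after_second = count_connectors(conn)
    conn.close()
    return after_first, after_second
```

Calling demo() runs the backfill twice against the same data; the connector count should be the same after both runs, which is the behavior an idempotent migration needs.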

Arthur Engine & Toolkit

Onboarding & Guided Tours

"Evals 101" in-app task tour. A config-driven, step-by-step guided tour walks new users through experiment creation, dataset review, prompt engineering, and evaluation workflows, with a draggable checklist, persistent progress, and auto-scroll to active steps.
Side panel checklist layout. The onboarding checklist was refactored into a side panel with collapsed-state persistence in local storage, preventing the widget from re-expanding between tour sections.
Auto-scroll for highlighted elements. Action elements automatically scroll into view during task tours, accounting for the side panel width so targets remain visible with the drawer open.
Tour UI state management. Setup and cleanup are now enforced for every tour step, preventing state leakage and visual glitches between steps via occlusion management and step-prerequisite validation.
Tour analytics instrumentation. Certificate actions (download, share, dialog interactions), time-to-complete, and abandonment metrics are now tracked per tour step.

Guardrails

Guardrails management UI. Users can create rules, view rule cards, list all rules, and test prompts against guardrail behavior from a new end-to-end UI.
Guardrail trace spans. Execution time and outcomes for prompt and response validation are now emitted as trace spans, with a new summary bar and invocation rows visible directly in the trace viewer.
Guardrail state persistence on tasks. Saved guardrail configurations carry across sessions via new draft management hooks.

Security & Validation

Dedicated /validate endpoint. Standalone prompt injection checks are now available without full pipeline configuration, enabling lightweight guardrail integration.
Private CA / self-signed TLS support for LLM endpoints. Two new settings (GENAI_ENGINE_OPENAI_PRIVATE_CERT_DOWNLOAD_URL and GENAI_ENGINE_OPENAI_VERIFY_SSL) allow the engine to trust private certificates, eliminating SSL errors when using LLM proxies behind self-signed certs.
reCAPTCHA Enterprise support. Operators can now optionally enable reCAPTCHA protection for deployments via environment variables; existing deployments without these values remain unaffected.
PII detection improvements. PERSON detections containing digits are now correctly dropped in V1, and V2 name validation is tightened to reduce false positives. Passport entity classification is now consistently routed through GLiNER, aligning with shield's implementation.
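
For the private CA / self-signed TLS settings above, the following is a minimal sketch of how a verify flag and a private CA file are commonly mapped onto a Python SSL context. It is not the engine's actual code. The accepted values for GENAI_ENGINE_OPENAI_VERIFY_SSL and the ca_file parameter (a hypothetical local path where a certificate fetched from GENAI_ENGINE_OPENAI_PRIVATE_CERT_DOWNLOAD_URL has been saved) are assumptions.

```python
import ssl
from typing import Mapping, Optional

# Assumption: these strings mean "do not verify"; the notes do not list accepted values.
_FALSE_VALUES = {"0", "false", "no", "off"}


def build_ssl_context(
    env: Mapping[str, str], ca_file: Optional[str] = None
) -> ssl.SSLContext:
    """Build an SSL context from the verify flag and an optional private CA file.

    ca_file is a hypothetical local path to a previously downloaded certificate.
    """
    raw = env.get("GENAI_ENGINE_OPENAI_VERIFY_SSL", "true")
    verify = raw.strip().lower() not in _FALSE_VALUES

    context = ssl.create_default_context()
    if ca_file:
        # Trust the private CA in addition to the system trust store.
        context.load_verify_locations(cafile=ca_file)
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Trusting the private CA keeps certificate verification on, so it is generally preferable to disabling verification outright.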

Multi-Tenancy & Access Control

Multi-tenancy support. Tenant-scoped data and access controls are now supported end-to-end, allowing organizations to securely isolate workloads within a single deployment.
Organization-level token limits. Per-org lifetime LLM token caps can be enforced via environment variable; exhausted orgs receive an "out of credits" dialog and subsequent LLM calls return a structured 429 TOKEN_LIMIT_EXCEEDED response.
Admin-only endpoint enforcement. Settings menu, application configuration, and task creation are now restricted to admins and unavailable to tenant users.
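
Clients that call LLM-backed endpoints may want to tell the organization-level token cap apart from ordinary rate limiting. The sketch below is one way to do that. These notes name only the 429 status and the TOKEN_LIMIT_EXCEEDED code, so the helper searches the decoded JSON body for that string instead of assuming a particular field name. The function name and the example payloads are hypothetical.

```python
import json
from typing import Any

TOKEN_LIMIT_CODE = "TOKEN_LIMIT_EXCEEDED"


def _contains_code(node: Any) -> bool:
    # Walk the decoded JSON, since the field that carries the code is not documented here.
    if isinstance(node, str):
        return node == TOKEN_LIMIT_CODE
    if isinstance(node, dict):
        return any(_contains_code(value) for value in node.values())
    if isinstance(node, list):
        return any(_contains_code(value) for value in node)
    return False


def is_token_limit_exceeded(status_code: int, body_text: str) -> bool:
    """Return True when a response looks like the org token-cap rejection."""
    if status_code != 429:
        return False
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return False
    return _contains_code(payload)
```

A caller would typically stop retrying when this returns True, since a lifetime cap does not reset the way a short-term rate limit does.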

Evaluators & ML Scoring

First-class ML evaluator support. Model-based scorers for PII detection (GLiNER + Presidio), toxicity, and prompt injection are now unified alongside LLM evaluators with CRUD management, version control, and continuous eval integration.
Backend aggregation for task analytics. New database-side aggregation routes power the task overview and analyze pages, with corrected success rate calculations (CE pass rate) and time bucketing (start_time).
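
To make the aggregation item above concrete, here is a small sketch of the arithmetic: results are bucketed by the hour of their start_time, and a pass rate is computed per bucket. The release performs this work in the database; the in-memory Python below only illustrates the calculation. The hourly bucket size, the (start_time, passed) record shape, and the definition of pass rate as passed divided by total are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple


def pass_rate_by_hour(
    results: Iterable[Tuple[datetime, bool]]
) -> Dict[datetime, float]:
    """Group (start_time, passed) pairs into hourly buckets and return the pass rate per bucket."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [passed, total]
    for start_time, passed in results:
        # Truncate start_time to the top of its hour to get the bucket key.
        bucket = start_time.replace(minute=0, second=0, microsecond=0)
        totals[bucket][1] += 1
        if passed:
            totals[bucket][0] += 1
    return {bucket: p / t for bucket, (p, t) in sorted(totals.items())}
```

Bucketing on start_time keeps a long-running evaluation in the bucket where it began, regardless of when it finished.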

Data & Storage

S3 and GCS image support for datasets. Images stored in S3 or Google Cloud Storage buckets are automatically extracted and converted to base64 for inline display alongside other image formats.
Azure Blob Storage connector. Direct data ingestion and artifact storage from Azure environments is now supported in ml-engine.
Demo completion certificates. Users who complete the introductory Evals walkthrough can generate and share a permanent certificate link (PNG) to LinkedIn or X, stored in Postgres and served via two new API endpoints.
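
For the S3 and GCS image support item above, the sketch below shows the base64 step in isolation: raw image bytes are encoded and wrapped so they can be displayed inline. Wrapping the result in a data URI is an assumption about how inline display is done, and the function name is hypothetical. Fetching the bytes from a bucket is out of scope here.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a base64 data URI suitable for inline display."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Base64 inflates payload size by roughly a third, which is worth keeping in mind for datasets with many large images.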

Observability

Guardrail invocation visualization in the trace viewer. A summary bar and per-invocation rows show which guardrails fired, their results, and timing directly within the trace drawer.
SpanErrorPanel component. Parsed error messages from trace spans are now displayed in a structured panel within the traces UI.
Alert rule status logging. Check results (okay, fired, no data) are recorded at regular intervals, enabling historical tracking of alert performance per model and rule.
Kubernetes audit logging. A new PVC for persistent audit log storage and Helm templates are included, with documentation for shared-filesystem provisioner configuration.
Advisory container vulnerability scanning. Docker Scout and Trivy scan all customer-facing images on every build and on a daily schedule, with SARIF reports, SBOMs, and VEX justification documents published automatically.

Security & Dependency Updates

LangChain upgraded to 1.x / 1.3.11. Patches CVE-2026-34070 (path traversal in load_prompt) and GHSA-gr75-jv2w-4656 (sandbox escape in file-search middleware).
transformers upgraded to v5.0.0. Addresses the critical code-execution vulnerabilities CVE-2025-14926 and CVE-2025-14927, which can be triggered when converting malicious model checkpoints.
authlib upgraded to v1.6.12. Fixes critical OAuth 2.0 open redirect vulnerability CVE-2026-41479.
litellm upgraded to v1.89.3. Remediates CVE-2026-40217, a guardrail sandbox-escape RCE in the LiteLLM proxy.
Container base-OS hardening. Fixable HIGH/CRITICAL CVEs in libssl3, libc6, and libexpat1 patched; unused Python 3.11 runtime removed; model-upload images migrated to Python 3.12.
Additional CVE patches. starlette, cryptography, pyjwt, form-data (CVE-2026-12143), urllib3 (CVE-2026-44431), PyArrow (CVE-2026-25087), pypdf (CVE-2026-48156, CVE-2026-48155), python-multipart (CVE-2026-42561), requests (CVE-2026-25645), python-dotenv (CVE-2026-28684), pydantic-settings, hf-xet (RUSTSEC-2026-0104), and multiple frontend advisories (axios, vite, vitest, react-router-dom).

Bug Fixes

Fixed Claude Code sub-agent span hierarchy so nested agent invocations are correctly represented in trace views.
Restored the definition field on TraceTransformResponse to maintain backward compatibility.
Fixed missing tool execution spans in demo agent traces.
Fixed feedback for task-less inferences landing in the wrong org.
Restored VALIDATION-USER and CHAT-USER access on trace ingestion and default-validation endpoints, fixing a regression from a multitenancy refactor.
Fixed the Eval Name field on the New Continuous Eval page showing premature validation errors on load.
Fixed trace-to-dataset column schema display to correctly reflect dataset columns when a transform is selected.
Fixed a version regression where a merge conflict reverted the dev branch and caused duplicate Docker image overwrites.
Fixed URL validators not running in the GenAI Engine due to incorrect decorator order.
Fixed the guided tour popover overlapping the docked checklist panel.
Fixed premature "Section complete" dialog in the task tour by adding async-aware review gates.
Fixed the Variables panel in the prompts playground being obscured by floating input labels.
Fixed the red "target lost" hint flashing on nearly every click during task tour Step 3 by adding a debounced gate.