What Is Arthur AI?
Arthur AI is a platform for monitoring, evaluating, and governing AI systems at scale. As organizations move from experimenting with AI to running it in production, they face a new class of operational problems: models drift, agents behave unexpectedly, outputs violate policy, and compliance teams have no audit trail. Arthur addresses these problems by sitting between your AI systems and your stakeholders: it collects inference data, evaluates outputs against configurable metrics and policies, surfaces alerts when something goes wrong, and provides governance workflows that keep humans in the loop.
Whether you're running a single predictive model or a fleet of LLM-powered agents, Arthur gives you the observability infrastructure to operate AI with confidence.
Key Capabilities
Arthur is organized around four core capability areas:
🔍 Model & Agent Observability
Track the real-time and historical behavior of every application in your workspace. Arthur ingests inference data, computes performance and data quality metrics, and surfaces anomalies automatically.
- Monitor all models across a workspace from a single dashboard
- Discover and inventory agentic systems, including their tools, sub-agents, LLMs, and data sources
- Drill into individual application behavior over time
🚨 Alerting
Define alert rules on any metric or model behavior, and get notified the moment a threshold is crossed. Alerts can be scoped to individual models or aggregated across a workspace.
- Create and manage alert rules per model
- View grouped alerts across an entire workspace
- Validate alert rule queries before deploying them
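To make the idea of a threshold-based alert rule concrete, here is a minimal Python sketch. It does not call Arthur's API: `AlertRule`, `validate`, `fires`, `evaluate`, and the `error_rate` metric name are all hypothetical names chosen for illustration, and the comparator set is an assumption.

```python
from dataclasses import dataclass
import operator

# Hypothetical illustration only: none of these names come from Arthur's API.
_COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class AlertRule:
    metric: str       # name of the metric the rule watches
    comparator: str   # one of ">", ">=", "<", "<="
    threshold: float  # value the metric is compared against

    def validate(self) -> None:
        """Reject a malformed rule before it is deployed."""
        if self.comparator not in _COMPARATORS:
            raise ValueError(f"unsupported comparator: {self.comparator!r}")

    def fires(self, value: float) -> bool:
        """Return True when the observed value crosses the threshold."""
        return _COMPARATORS[self.comparator](value, self.threshold)


def evaluate(rules, metrics):
    """Return the rules that fire for a dict of observed metric values."""
    return [r for r in rules if r.metric in metrics and r.fires(metrics[r.metric])]


if __name__ == "__main__":
    rule = AlertRule(metric="error_rate", comparator=">", threshold=0.05)
    rule.validate()
    fired = evaluate([rule], {"error_rate": 0.08})
    print([r.metric for r in fired])
```

Keeping validation separate from evaluation mirrors the "validate before deploying" step above: a rule that cannot be parsed is rejected up front rather than failing silently at evaluation time.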
📋 Governance & Policy
Arthur's governance layer lets you define organization-wide policies, attach them to applications, and run automated compliance checks. Attestation rules ensure that human reviewers sign off on AI behavior at configurable intervals.
- Create multi-step policies with alert rules and attestation requirements
- Track compliance status across all applications
- Manage role-based access control at the organization, workspace, and project levels
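As a rough illustration of how an attestation requirement with a configurable interval might be checked, consider the Python sketch below. It is an assumption-laden model, not Arthur's implementation: `AttestationRule`, `is_compliant`, and the 90-day interval are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


# Hypothetical illustration only: not part of Arthur's API.
@dataclass(frozen=True)
class AttestationRule:
    name: str
    interval: timedelta  # how often a human reviewer must sign off


def is_compliant(
    rule: AttestationRule,
    last_signed_off: Optional[datetime],
    now: datetime,
) -> bool:
    """An application is compliant if a sign-off exists and is recent enough."""
    if last_signed_off is None:
        return False  # never attested
    return now - last_signed_off <= rule.interval


if __name__ == "__main__":
    rule = AttestationRule(name="quarterly review", interval=timedelta(days=90))
    now = datetime(2024, 6, 1)
    print(is_compliant(rule, datetime(2024, 4, 1), now))
    print(is_compliant(rule, None, now))
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the check deterministic and easy to test.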
📊 Analytics & Reporting
Aggregate metrics across projects and workspaces to understand AI performance trends, policy compliance rates, and operational health at a glance.
Architecture Overview
Arthur is organized into a hierarchy of resources. Understanding this structure helps you navigate the platform and the API.
```mermaid
graph TD
    ORG["Organization"]
    WS["Workspace"]
    PROJ["Project"]
    APP["Application"]
    POLICY["Policy"]
    ALERT["Alert Rule"]
    METRIC["Custom Metric"]
    ORG --> WS
    ORG --> POLICY
    WS --> PROJ
    PROJ --> APP
    POLICY --> ALERT
    APP --> ALERT
    APP --> METRIC
```
| Layer | Description |
|---|---|
| Organization | The top-level tenant. Roles, policies, and role bindings can be defined here and cascade downward. |
| Workspace | A logical grouping of projects and applications. Most observability dashboards are scoped to a workspace. |
| Project | A collection of related applications, useful for organizing by team, use case, or product area. |
| Application | A deployed AI model or agent. Arthur collects traces, computes metrics, and enforces policies at this level. |
| Policy | An organization-level governance object that defines alert rules, attestation requirements, and compliance checks. |
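The hierarchy can also be pictured as nested data. The Python sketch below models it with plain dataclasses. The class and field names are hypothetical and chosen only to mirror the layers in the table; they are not Arthur's API objects or schema.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


# Hypothetical illustration only: a toy model of the resource hierarchy.
@dataclass
class Application:
    name: str
    alert_rules: List[str] = field(default_factory=list)
    custom_metrics: List[str] = field(default_factory=list)


@dataclass
class Project:
    name: str
    applications: List[Application] = field(default_factory=list)


@dataclass
class Workspace:
    name: str
    projects: List[Project] = field(default_factory=list)


@dataclass
class Organization:
    name: str
    workspaces: List[Workspace] = field(default_factory=list)
    policies: List[str] = field(default_factory=list)  # organization-level

    def applications(self) -> Iterator[Tuple[str, str, Application]]:
        """Walk Organization -> Workspace -> Project -> Application."""
        for ws in self.workspaces:
            for proj in ws.projects:
                for app in proj.applications:
                    yield ws.name, proj.name, app


if __name__ == "__main__":
    org = Organization(
        name="acme",
        policies=["model-risk-policy"],
        workspaces=[
            Workspace(
                name="prod",
                projects=[
                    Project(
                        name="support",
                        applications=[Application(name="chat-agent")],
                    )
                ],
            )
        ],
    )
    for ws_name, proj_name, app in org.applications():
        print(f"{org.name}/{ws_name}/{proj_name}/{app.name}")
```

Because policies live on the organization while alert rules and custom metrics live on each application, a walk like `applications()` is the natural place to check every application against organization-wide policies.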
Arthur connects to your existing infrastructure through Connectors—integrations with model serving platforms, data warehouses, and LLM providers. Once connected, Arthur automatically discovers agents and models running in your environment.
Who Is This For?
Arthur is designed for teams that are operating AI in production, not just experimenting with it. Here's how different roles use the platform:
| Role | How They Use Arthur |
|---|---|
| ML Engineers | Monitor model performance, set up drift alerts, and debug production regressions using inference data. |
| AI Platform / MLOps Teams | Manage the full inventory of models and agents, configure connectors, and maintain workspace organization. |
| Compliance & Risk Officers | Define governance policies, review attestation workflows, and audit AI behavior against organizational standards. |
| Engineering Managers | Use analytics dashboards to track AI health across projects and report on policy compliance to stakeholders. |
| Security Teams | Manage role-based access control, review permissions, and configure webhooks for integration with SIEM or ticketing systems. |
Arthur is particularly valuable for organizations that:
- Run multiple AI models or agents across different teams and need centralized visibility
- Operate in regulated industries (finance, healthcare, insurance) where AI decisions require audit trails
- Are scaling from prototype to production and need operational guardrails before broad deployment
- Have adopted agentic AI patterns (LLM agents with tools and sub-agents) and need to understand what those systems are actually doing
Next Steps
Now that you understand what Arthur AI is and what it can do, here's how to get started:
- Quickstart Guide — Connect your first model or agent to Arthur and see data flowing in minutes.
- Core Concepts — Deep-dive into workspaces, applications, models, and policies.
- Set Up Connectors — Learn how to connect Arthur to your model serving infrastructure.
- Create Your First Alert Rule — Configure alerting so you're notified when model behavior changes.
- Define a Governance Policy — Set up organization-level policies and attach them to your applications.
- Manage Users & Permissions — Invite your team and configure role-based access control.
New to Arthur? Start with the Quickstart Guide to get a working integration in under 15 minutes. You can explore governance and policy features once your first application is connected.