Self-Host the Arthur Engine
Overview
This page covers how to deploy the Arthur Engine, configure authentication, and create your first task.
The recommended path is to deploy through the Arthur Platform UI, which generates a pre-configured install command for your environment. If you need full manual control — for example, to customise environment variables before first boot — the Docker Compose path is covered below.
What Is the Arthur Engine?
The Arthur Engine is a self-hosted evaluation and guardrails engine for LLM applications. It supports real-time guardrails, async LLM-as-a-judge evaluators, continuous production monitoring, RAG evaluation, prompt management, transforms, and agent evaluations.
Key concepts for this guide:
| Concept | Description |
|---|---|
| Task | A named context that groups guardrails, evaluators, traces, and results for a specific LLM application. |
| Admin Key | A bootstrap credential (GENAI_ENGINE_ADMIN_KEY) set at deploy time. Used only for administrative operations like creating user API keys. |
| User API Key | A scoped credential issued by the engine. Used by your application for all runtime calls. |
Admin key vs. user API key: These are two distinct credential types and are not interchangeable. The admin key is a static secret you set in your environment; think of it as a root password. User API keys are dynamic, revocable tokens your application uses for day-to-day API calls.
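The split can also be made explicit in application code. The sketch below is illustrative only: `EngineCredentials` and its `headers` method are hypothetical names, not part of the Arthur SDK, and the `Bearer` header format is assumed from the request examples on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineCredentials:
    """Hypothetical holder that keeps the two credential types apart."""

    admin_key: str      # value of GENAI_ENGINE_ADMIN_KEY, set at deploy time
    user_api_key: str   # key issued by the engine for runtime calls

    def headers(self, *, admin: bool = False) -> dict[str, str]:
        """Build request headers; the admin key is used only when asked for."""
        token = self.admin_key if admin else self.user_api_key
        return {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        }


# Placeholder values for illustration; real keys should come from the environment.
creds = EngineCredentials(admin_key="admin-secret", user_api_key="ak_live_example")
runtime_headers = creds.headers()           # day-to-day API calls
admin_headers = creds.headers(admin=True)   # e.g. creating user API keys
```

Defaulting to the user API key means the admin key is sent only where a call site explicitly opts in.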
Prerequisites
Before you begin, confirm you have the following:
- curl or another HTTP client for testing API calls
- Python ≥ 3.11 (optional, for SDK examples)
- Outbound internet access if your evaluation rules call external model endpoints
Docker is only required if you choose the manual Docker Compose deployment path. The Arthur Platform UI supports other deployment targets (AWS, Kubernetes, GCP, Azure) that do not require Docker locally.
Install the Python SDK now if you plan to follow the SDK code examples:

    pip install "arthur-observability-sdk[openai]"

Deploy via the Arthur Platform
The fastest way to deploy the Arthur Engine is through the Platform UI. It walks you through configuration and generates a pre-configured install command tailored to your environment.
- Navigate to Engines Management in your workspace: https://platform.arthur.ai/workspaces/{your_workspace_id}/engines
- Click + ENGINE and step through the wizard.
- On the Select Install Method step, choose your target environment:

  | Method | Details |
  |---|---|
  | Docker | Ideal for local development or single-server deployments |
  | AWS | CloudFormation-based deploy (GPU or CPU stack). See AWS deployment guide |
  | Kubernetes | Helm chart deploy on any Kubernetes distribution. See Kubernetes deployment guide |
  | GCP | Deploy to Google Cloud Platform using Cloud Run |
  | Azure | Deploy to Azure using Container Instances |

- On the Install step, the platform provides your pre-configured install command (for Docker) or links to deployment documentation with your generated client secret (for AWS and Kubernetes). Run the command or follow the linked guide.
- Once the engine connects, click Continue to Project Setup.
Configure Authentication
Use the admin key to create a user API key. This is the credential your application uses for all subsequent calls. The raw key is returned once — store it immediately.
Python (SDK client setup; this snippet takes a user API key that already exists, so create one with either request below first):

    from arthur_observability_sdk import Arthur

    arthur = Arthur(
        api_key="<USER_API_KEY>",
        base_url="http://localhost:3030",
        task_id="<YOUR_TASK_ID>",
    )

JavaScript:

    const response = await fetch("http://localhost:3030/auth/api_keys/", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${process.env.GENAI_ENGINE_ADMIN_KEY}`,
      },
      body: JSON.stringify({ name: "my-app-key" }),
    });
    const { api_key } = await response.json();
    // Store this value; it is shown only once.

cURL:

    curl -X POST http://localhost:3030/auth/api_keys/ \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${GENAI_ENGINE_ADMIN_KEY}" \
      -d '{ "name": "my-app-key" }'

Store the key in an environment variable:

    export ARTHUR_API_KEY="ak_live_xxxxxxxxxxxxxxxxxxxx"

From this point forward, all API calls use ARTHUR_API_KEY, not GENAI_ENGINE_ADMIN_KEY.
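As a rough Python counterpart to the JavaScript and cURL requests above, the key-creation call might be sketched with the standard library alone. The endpoint, headers, request body, and the `api_key` response field are taken from those examples; `build_create_key_request` and `create_api_key` are hypothetical helper names, not SDK functions.

```python
import json
import urllib.request


def build_create_key_request(base_url: str, admin_key: str, name: str) -> urllib.request.Request:
    """Prepare POST /auth/api_keys/, authenticated with the admin key."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/auth/api_keys/",
        data=json.dumps({"name": name}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {admin_key}",
        },
        method="POST",
    )


def create_api_key(base_url: str, admin_key: str, name: str) -> str:
    """Send the request and return the raw key (assumed to be in the "api_key" field)."""
    request = build_create_key_request(base_url, admin_key, name)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return payload["api_key"]


# Usage (requires a running engine):
#   import os
#   key = create_api_key(
#       "http://localhost:3030", os.environ["GENAI_ENGINE_ADMIN_KEY"], "my-app-key"
#   )
```

Separating request construction from sending keeps the admin-key handling easy to inspect without a live engine.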
Create Your First Task
A task is the top-level organisational unit in the Arthur Engine. You must create one before attaching guardrails, evaluators, or sending traces.
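For readers working in Python without the SDK, task creation could look roughly like the sketch below, which uses only the standard library. The endpoint, request body, and `id` response field mirror the JavaScript and cURL examples in this section; `build_create_task_request`, `parse_task_id`, and `create_task` are hypothetical helper names, and the rest of the response shape is an assumption.

```python
import json
import urllib.request


def build_create_task_request(base_url: str, api_key: str, name: str) -> urllib.request.Request:
    """Prepare POST /api/v2/tasks, authenticated with a user API key."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v2/tasks",
        data=json.dumps({"name": name}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_task_id(body: bytes) -> str:
    """Extract the task ID, assuming a JSON object with an "id" field."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or "id" not in payload:
        raise ValueError("response does not contain a task id")
    return str(payload["id"])


def create_task(base_url: str, api_key: str, name: str) -> str:
    """Create a task and return its ID."""
    request = build_create_task_request(base_url, api_key, name)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_task_id(response.read())


# Usage (requires a running engine and ARTHUR_API_KEY in the environment):
#   import os
#   task_id = create_task("http://localhost:3030", os.environ["ARTHUR_API_KEY"], "my-first-task")
```

The returned ID is the value the SDK client examples refer to as `<YOUR_TASK_ID>`.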
Python (SDK client setup; set task_id to the ID returned by either request below):

    import os

    from arthur_observability_sdk import Arthur

    arthur = Arthur(
        api_key=os.environ["ARTHUR_API_KEY"],  # user API key, not the admin key
        base_url="http://localhost:3030",
        task_id="<YOUR_TASK_ID>",
    )

JavaScript:

    const response = await fetch("http://localhost:3030/api/v2/tasks", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${process.env.ARTHUR_API_KEY}`,
      },
      body: JSON.stringify({ name: "my-first-task" }),
    });
    const task = await response.json();
    console.log("Task ID:", task.id);

cURL:

    curl -X POST http://localhost:3030/api/v2/tasks \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${ARTHUR_API_KEY}" \
      -d '{ "name": "my-first-task" }'

Next Steps
With the engine running and a task created, here's where to go next:
- Quickstart: Evaluate Your First LLM Call. Instrument your application and send your first trace.
- Guardrails. Set up real-time rules to block or flag bad prompts and responses.
- Evaluators & Continuous Evals. Score production traces with LLM-as-a-judge evaluators.
- Connect Your Application. Instrument your LLM framework with the Arthur SDK.
Tip: For production deployments, set GENAI_ENGINE_ADMIN_KEY using your secrets manager (AWS Secrets Manager, HashiCorp Vault, etc.) rather than a plain .env file.
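One way to act on this tip in deployment or admin scripts is to read the key from the process environment, where a secrets manager would typically inject it, and fail fast when it is absent. `require_secret` below is a hypothetical helper, not an Arthur or secrets-manager API.

```python
import os
from typing import Mapping, Optional


def require_secret(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a non-empty secret from the environment, or raise RuntimeError."""
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; inject it from your secrets manager")
    return value


# Usage: admin_key = require_secret("GENAI_ENGINE_ADMIN_KEY")
```

Passing `env` explicitly is only a convenience for testing; in normal use the helper reads `os.environ`.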