Evals Engine Overview
What is the Arthur Evals Engine?
Federated Data Architecture
Strong data security and governance are a central tenet of the Arthur Platform, and we've designed the Platform with strong infrastructure controls around data access. These controls are implemented through a federated data architecture, which consists of a centralized Control Plane and federated Data Planes.

Security Guarantees with the Federated Data Plane
- Data does not leave the environment
- The Data Plane is not addressable: there are no inbound connections to Engines
- Engines can be configured to turn off Inference Deep Dive Jobs (jobs which allow users to view/interact with raw data)
- Only aggregations are stored - no raw data
- The Control Plane only keeps track of aggregated metric values + associated metadata
- Inference Deep Dive Jobs store raw data on demand in an ephemeral data store with a Time To Live (TTL) of 30 seconds (this is available only for Engines that are configured to send Raw Data)
- Metric Computation happens in the same environment where the data is governed
- Data and Information Security Analysts have complete control over where/how data is accessed by the Platform
- Administrators can manage data access in-environment, and at any point can revoke access to the Engine
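The ephemeral store described in the list above can be pictured with a small sketch. This is only an illustration of the TTL idea, not Arthur's implementation: the class name `EphemeralStore`, its methods, and the injectable `clock` parameter are all hypothetical, and the only detail taken from this page is the 30-second Time To Live.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class EphemeralStore:
    """Hypothetical in-memory store whose entries expire after a fixed TTL."""

    def __init__(
        self,
        ttl_seconds: float = 30.0,  # 30 seconds, as stated on this page
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry timestamp, stored value)
        self._items: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        """Store a value that becomes unreadable once the TTL has elapsed."""
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the value if it has not expired; otherwise drop it and return None."""
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]
            return None
        return value
```

In this sketch, expired entries are dropped lazily on read. A production store would more likely rely on a built-in expiry feature of the backing data store so that raw data is removed even if it is never read again.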
How the Engine works
The Arthur Engine is a distributed, highly available job runner that executes jobs created and managed within the Arthur Platform.
For example:
- A user starts a new Metrics Calculation Job in the Dashboard UI for a specific model
- The Arthur Platform APIs enqueue a job, keyed on the Engine Identifier, indicating that the Engine should compute new metrics for a specific model (defined by the Job Specification)
- The Engine polls the Platform multiple times per second to check for new jobs to execute, and finds the new Metrics Calculation Job
- The Engine runs the Metrics Calculation Job and uploads results from the job to the Arthur Platform
- The user can now see the newly computed metrics in their Model Dashboard UI
The Arthur Engine can be managed across multiple data centers and scales up to meet demand.
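The job flow described in this section can be sketched as a simple pull-based loop. This is a minimal illustration under stated assumptions, not the actual Engine or Platform API: `FakePlatform`, `Engine`, `Job`, and every method name below are hypothetical, in-memory stand-ins. The point it shows is that the Engine initiates every exchange (poll, then upload), so nothing needs to connect inbound to the Engine, and only aggregated values are sent back.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from statistics import mean
from typing import Deque, Dict, List, Optional


@dataclass
class Job:
    job_id: str
    model_id: str
    kind: str  # e.g. "metrics_calculation"


class FakePlatform:
    """Hypothetical in-memory stand-in for the Arthur Platform APIs."""

    def __init__(self) -> None:
        # Jobs are queued per Engine Identifier.
        self._queues: Dict[str, Deque[Job]] = defaultdict(deque)
        # Only aggregated metric values are kept here, never raw rows.
        self.results: Dict[str, Dict[str, float]] = {}

    def enqueue(self, engine_id: str, job: Job) -> None:
        self._queues[engine_id].append(job)

    def poll(self, engine_id: str) -> Optional[Job]:
        """Called by the Engine; returns the next job or None."""
        queue = self._queues[engine_id]
        return queue.popleft() if queue else None

    def upload_results(self, job_id: str, metrics: Dict[str, float]) -> None:
        self.results[job_id] = metrics


class Engine:
    """Hypothetical Engine that pulls jobs and computes metrics on local data."""

    def __init__(
        self, engine_id: str, platform: FakePlatform, local_data: Dict[str, List[float]]
    ) -> None:
        self.engine_id = engine_id
        self.platform = platform
        self.local_data = local_data  # raw data stays in the Engine's environment

    def run_once(self) -> bool:
        """Poll once; run and upload a job if one exists. Returns True if a job ran."""
        job = self.platform.poll(self.engine_id)
        if job is None:
            return False
        rows = self.local_data.get(job.model_id, [])
        metrics = {"count": float(len(rows)), "mean": mean(rows) if rows else 0.0}
        self.platform.upload_results(job.job_id, metrics)
        return True
```

A real Engine would call something like `run_once` in a loop with a short sleep between polls, and would make the poll and upload calls over the network rather than in memory.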