Evals Engine Overview

What is the Arthur Evals Engine?

Federated Data Architecture

Strong Data Security and Governance is a central tenet of the Arthur Platform, and we've designed the Platform to have strong infrastructure controls around data access. These controls are implemented using a federated data architecture, which consists of a centralized Control Plane and federated Data Planes.

Security Guarantees with the Federated Data Plane

  1. Data does not leave the environment
    1. The Data Plane is not addressable - there are no inbound connections to Engines
    2. Engines can be configured to turn off Inference Deep Dive Jobs (jobs which allow users to view/interact with raw data)
  2. Only aggregations are stored - no raw data
    1. The Control Plane only keeps track of aggregated metric values + associated metadata
    2. Inference Deep Dive Jobs store raw data on demand in an ephemeral data store with a Time To Live (TTL) of 30 seconds (this is only available for Engines that are configured to send raw data)
  3. Metric Computation happens in the same environment where the data is governed
    1. Data and Information Security Analysts have complete control over where/how data is accessed by the Platform
    2. Administrators can manage data access in-environment, and at any point can revoke access to the Engine
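
To make the 30-second TTL guarantee concrete, here is a minimal sketch of an ephemeral store whose entries stop being readable once their TTL elapses. It is an illustration only, not Arthur's implementation: the EphemeralStore class, its method names, and the injectable clock are all hypothetical.

```python
import time


class EphemeralStore:
    """Hypothetical in-memory store; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._items = {}      # key -> (expires_at, value)

    def put(self, key, value):
        # Record the value together with the moment it stops being readable.
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the raw data and behave as if it was never stored.
            del self._items[key]
            return None
        return value
```

This sketch evicts lazily on read; a production store would more likely rely on the expiry mechanism built into its backing data store.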

How the Engine works

The Arthur Engine is a distributed, highly available job runner that executes jobs created and managed within the Arthur Platform.

For example:

  • A user starts a new Metrics Calculation Job in the Dashboard UI for a specific model
  • The Arthur Platform APIs enqueue a job, keyed on the Engine Identifier, indicating that the Engine should compute new metrics for a specific model (defined by the Job Specification)
  • The Engine polls the Platform multiple times per second to determine if there are new jobs to execute - it determines that there is a new Metrics Calculation Job
  • The Engine runs the Metrics Calculation Job and uploads results from the job to the Arthur Platform
  • The user can now see the newly computed metrics in their Model Dashboard UI
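
The steps above can be sketched as a pull-based loop in which the Engine initiates every exchange and only aggregates are uploaded. This is a simplified, in-memory illustration and not the actual Arthur API: FakePlatform, Job, compute_metrics, run_engine_once, and the "score" field are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    engine_id: str   # jobs are keyed on the Engine Identifier
    model_id: str


class FakePlatform:
    """Hypothetical in-memory stand-in for the Platform's job APIs."""

    def __init__(self):
        self._queues = {}   # engine_id -> list of pending jobs
        self.results = {}   # job_id -> aggregated metric values

    def enqueue(self, job):
        self._queues.setdefault(job.engine_id, []).append(job)

    def poll(self, engine_id):
        # Called BY the Engine; the Platform never connects to the Engine.
        pending = self._queues.get(engine_id, [])
        return pending.pop(0) if pending else None

    def upload_results(self, job_id, metrics):
        self.results[job_id] = metrics


def compute_metrics(rows):
    """Reduce raw rows to aggregates; raw rows are never uploaded."""
    scores = [row["score"] for row in rows]
    mean = sum(scores) / len(scores) if scores else None
    return {"count": len(scores), "mean_score": mean}


def run_engine_once(platform, engine_id, load_rows):
    """One polling iteration. Returns True if a job was executed."""
    job = platform.poll(engine_id)
    if job is None:
        return False
    rows = load_rows(job.model_id)   # data is read inside the Engine's environment
    platform.upload_results(job.job_id, compute_metrics(rows))
    return True
```

In a real deployment this iteration would run in a loop with a short sleep between polls, and load_rows would read from the governed data source rather than from a callback.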

The Arthur Engine can be managed across multiple data centers and scales up to meet demand.