
User-Defined Metrics

Track and communicate the unique ways your internal team or external stakeholders define performance

Beyond its UI and APIs, Arthur's Python SDK includes built-in query functionality. This SQL-like query structure lets teams turn any function they create into an alertable metric within the Arthur platform.

Pieces of User-Defined Metrics

There are three pieces to define when creating user-defined metrics:

Metric Name: The name refers to how the user identifies or calls the metric.

Metric Type: The Arthur platform has four metric types: Performance, DataDrift, DataBound, and ModelOutput.

Arthur typically infers the metric type from the function provided. For more advanced metrics whose type Arthur cannot infer, or when a team wants to guarantee a specific type, specify the metric type explicitly.

Arthur Query Function: The most important part of defining a metric is the mathematical function the metric will track. These functions are built with the Arthur Query structure, which is described in the next section.

Building an Arthur Query Function

A more in-depth querying guide can be found in the query section. However, there are some key things to keep in mind:

  1. The query must be built with the information contained within Arthur (per inference). Additional information beyond model inputs (features) and outputs (predictions) can be added to Arthur as Non-Input Attributes.
  2. The query should return a single value. For example, it should not return a row per inference, time-series data, or scores for multiple attributes.
  3. The query should generally not include filters, unless a filter defines the specific segment the metric is meant to track. Filters can easily be applied when a metric is evaluated, so keeping the definition general lets different segmentations be applied later.
  4. The query may include parameters, which are denoted by {{ param_name }} template values in the query definition and have corresponding entries in the parameters field on the metric definition.
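To illustrate how {{ param_name }} template values line up with entries in the parameters field, here is a minimal sketch in plain Python. The field names (name, type, query, parameters), the query shape, and the render_query helper are illustrative assumptions, not confirmed Arthur syntax or part of the SDK; the helper only shows the correspondence locally.

```python
import re
from typing import Any

# Hypothetical metric definition. The keys and the query shape are
# assumptions made for illustration, not Arthur's documented schema.
metric_definition = {
    "name": "high_score_rate",
    "type": "ModelOutput",
    "query": {
        "select": [
            {
                "function": "rate",  # placeholder function name
                "alias": "high_score_rate",
                "parameters": {
                    "property": "{{ score_attribute }}",
                    "comparator": "gt",
                    "value": "{{ threshold }}",
                },
            }
        ]
    },
    # Each template value in the query has a matching entry here.
    "parameters": {"score_attribute": "prediction_score", "threshold": 0.9},
}

# Matches a string that consists solely of one {{ name }} template value.
TEMPLATE = re.compile(r"^\{\{\s*(\w+)\s*\}\}$")


def render_query(node: Any, params: dict) -> Any:
    """Return a copy of the query with template values replaced by params."""
    if isinstance(node, dict):
        return {key: render_query(value, params) for key, value in node.items()}
    if isinstance(node, list):
        return [render_query(item, params) for item in node]
    if isinstance(node, str):
        match = TEMPLATE.match(node)
        if match:
            name = match.group(1)
            if name not in params:
                raise KeyError(f"no parameter entry for template value {name!r}")
            return params[name]
    return node


rendered = render_query(metric_definition["query"], metric_definition["parameters"])
```

The sketch also follows the other guidelines in the list: the query selects a single aggregate value and contains no filters, so segmentation can be applied when the metric is evaluated.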

Custom Data Drift Metrics

Data drift uses a special query structure, so custom data drift metrics must be defined with that structure. Teams also need to add is_data_drift = True to the metric definition.

Defining Metrics Examples

Using Python SDK

We typically recommend that teams create custom metrics within a notebook with our Python SDK. This is because teams can easily craft and validate their queries in a notebook before onboarding them to Arthur.

Follow this notebook example to learn more: Coming Soon

Using API

You can define a custom metric for your model by sending a POST request to the dedicated metrics endpoint at /models/{model_id}/metrics. Check out the API Reference guide for the full specification.
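As a rough sketch of that request, the snippet below builds (but does not send) a POST to /models/{model_id}/metrics using only the Python standard library. The host, the Authorization header format, the model ID, and the payload fields are all placeholders assumed for illustration; the API Reference is the authority on the real request body and authentication.

```python
import json
import urllib.request


def build_metric_request(
    base_url: str, model_id: str, api_key: str, definition: dict
) -> urllib.request.Request:
    """Build a POST request for the metrics endpoint without sending it."""
    url = f"{base_url.rstrip('/')}/models/{model_id}/metrics"
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check the API Reference for the real one.
            "Authorization": api_key,
        },
    )


# Placeholder values: replace with your deployment's URL (including any API
# path prefix), a real model ID, and a payload that matches the API Reference.
request = build_metric_request(
    base_url="https://arthur.example.com",
    model_id="my-model-id",
    api_key="YOUR_API_KEY",
    definition={"name": "my_custom_metric", "query": {}},
)

# To actually send it:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it makes it easy to inspect the URL and body in a notebook before anything reaches Arthur.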