An alert is a message notifying you that something has occurred with your model. With alerts, Arthur makes it easy to provide a continuous view of your model by highlighting important changes in model performance. When defining alerts in Arthur, there are a few things you need to consider:
Arthur offers two alert severities: Warning and Critical.
Teams choose the severity of each alert. We typically recommend setting two different thresholds on the same metric, marking the less severe one as Warning and the more severe one as Critical.
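The two-threshold recommendation can be sketched in plain Python. This is an illustration only, not the Arthur SDK: the function name `classify_severity` and the example threshold values are hypothetical, and it assumes a metric such as accuracy where lower values are worse.

```python
from typing import Optional


def classify_severity(
    value: float, warning_below: float, critical_below: float
) -> Optional[str]:
    """Return the most severe level triggered by a metric value.

    Hypothetical helper: assumes lower metric values are worse, so the
    Critical threshold must not sit above the Warning threshold.
    """
    if critical_below > warning_below:
        raise ValueError("critical_below must be <= warning_below")
    if value < critical_below:
        return "critical"
    if value < warning_below:
        return "warning"
    return None  # neither threshold crossed, so no alert
```

With a Warning threshold of 0.85 and a Critical threshold of 0.75, an accuracy of 0.80 would raise only the Warning, while 0.70 would raise the Critical.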
An alert is triggered based on an alert rule, which you define using a metric and a threshold. So to create an alert in Arthur, you need to choose a metric and set a threshold for it.
A metric in Arthur is a function for evaluating model performance. These can be common functions that data scientists or ML teams are familiar with, such as accuracy or F1 Score. Or they can be functions specific to a model's use case, such as Fairness Metrics or User Defined Metrics. This means that any metric you can create in Arthur (including segmentations, filters, or logical functions) can be transformed into an alert.
For use-case-specific metrics such as these, it is important to follow the proper steps to create the metric within Arthur for your model, because a metric must exist for your model before you can alert on it.
After creating your metric, decide what level of underperformance you would like to be alerted to. This numeric value is called the alert threshold. In Arthur, alert threshold values must be set manually. You must also define whether you want to be alerted when the metric rises above the threshold or falls below it. This is the upper or lower bound of the alert.
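To make the relationship between metric, threshold, and bound concrete, here is a minimal sketch in plain Python. It does not use the Arthur SDK; the `AlertRule` and `Bound` names are hypothetical, and it assumes that an upper bound fires when the metric rises above the threshold and a lower bound fires when it falls below.

```python
from dataclasses import dataclass
from enum import Enum


class Bound(Enum):
    UPPER = "upper"  # alert when the metric goes above the threshold
    LOWER = "lower"  # alert when the metric goes below the threshold


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical stand-in for an alert rule: metric + threshold + bound."""

    metric_name: str
    threshold: float
    bound: Bound
    severity: str = "warning"

    def is_triggered(self, metric_value: float) -> bool:
        """Check one computed metric value against the rule."""
        if self.bound is Bound.UPPER:
            return metric_value > self.threshold
        return metric_value < self.threshold


# Accuracy should stay high, so the rule uses a lower bound.
accuracy_rule = AlertRule("accuracy", threshold=0.80, bound=Bound.LOWER)
```

A metric where high values are bad, such as an error rate, would use `Bound.UPPER` instead.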
For batch models, alerts are calculated per batch of data. However, teams that are running streaming models need to decide how often they would like alerts to be calculated and over how much data. To make these decisions, they have to clarify two values:
Lookback Period: How much data do they want to aggregate in their function? (Do they want to be alerted when the average of just one minute of data has passed the metric threshold, or do they only care if the average over a day or a week is affected?)
Alert Wait Time: How often do they want to be alerted that something is happening? This is how often the alert is calculated (and triggered if the function threshold is met).
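The interaction between the two values can be sketched as a simple simulation. This is plain Python under stated assumptions, not how Arthur computes alerts internally: the function name, the use of a plain average as the metric, and the lower-bound comparison are all hypothetical choices for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def evaluate_streaming_alert(
    inferences: List[Tuple[datetime, float]],
    start: datetime,
    end: datetime,
    lookback_period: timedelta,
    alert_wait_time: timedelta,
    lower_threshold: float,
) -> List[datetime]:
    """Return the evaluation times at which the windowed average was too low.

    The rule is recalculated every `alert_wait_time`; each calculation
    averages only the values from the preceding `lookback_period`.
    """
    triggered = []
    now = start
    while now <= end:
        # Keep values whose timestamps fall in (now - lookback_period, now].
        window = [v for ts, v in inferences if now - lookback_period < ts <= now]
        if window and sum(window) / len(window) < lower_threshold:
            triggered.append(now)
        now += alert_wait_time
    return triggered
```

A short lookback period reacts quickly to a brief dip, while a long one smooths it out; a short wait time means the rule is checked, and can fire, more often.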
Default data drift alerts for Feature and Prediction Drift are automatically created for every feature once reference and inference data are sent to Arthur. These alerts are created with dynamic threshold values, specific to your reference dataset, that come from the “data_drift” endpoint with “metric”: “Thresholds”.
The UI provides a clickable walk-through guide for teams to create common performance, drift, and data-bound alerts. The common practice of alerting on segments (or filters) of your data when calculating the metric is included as an optional step.
The example below defines, in the UI, an accuracy alert rule for women that is evaluated every hour over the last 24 hours of inferences.
Our predefined alerting structure is not the only way teams can create alerts. Alerts can be made for any customizable User Defined Metric teams create for their model in Arthur. Teams must first create the user-defined metric, and then they can easily set the alert from their Python SDK notebook.
The latest alerts in your organization are shown on the homepage of your Arthur organization. However, beyond being highlighted in the online Arthur UI, alerts can be delivered to teams via email and/or via integrations such as PagerDuty and Slack. You can learn more about setting up those integrations here.
The dedicated alerts endpoint is