Querying Data Drift
Querying Drift in Python
The basic format of a drift query using the Python SDK involves setting the query_type parameter to 'drift':
query = {...}
arthur_model.query(query, query_type='drift')
Data Drift Endpoint
Data drift has a dedicated endpoint at Query Data Drift.
Returns the data drift metric between a base dataset and a target dataset. This endpoint supports up to 100 properties in one request.
num_bins - Specifies the granularity of bucketing for continuous distributions. It is ignored if the attribute is categorical.
metric - Specifies one metric among the data drift metrics Arthur offers (see the Glossary).
filter - Optional blocks specific to either reference or inference set to specify which data should be used in the data drift calculation.
group_by - Global; applies to both the base and target data.
rollup - Optional parameter that aggregates the calculated data drift value by the supported time dimension.
For HypothesisTest, the returned value is transformed as -log_10(P_value) to maintain directional parity with the other data drift metrics. A lower P_value is more significant and implies data drift, which is reflected in a higher -log_10(P_value). Further mathematical details are in the Glossary.
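The transformation described above can be sketched in a few lines of Python. This is only an illustration of the -log_10(P_value) arithmetic; the function names are hypothetical and are not part of the Arthur SDK.

```python
import math


def hypothesis_test_score(p_value: float) -> float:
    """Map a p-value to the -log10 scale, so that larger means more drift."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in the interval (0, 1]")
    return -math.log10(p_value)


def p_value_from_score(score: float) -> float:
    """Invert the transformation to recover the p-value."""
    return 10.0 ** (-score)


# Smaller p-values (stronger evidence of drift) give larger scores.
for p in (0.5, 0.05, 0.001):
    print(f"p={p}: score={hypothesis_test_score(p):.3f}")
```

Because the mapping is monotonically decreasing in the p-value, a HypothesisTest result can be read in the same direction as the other metrics: higher means more drift.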
Query Request:
{
"properties": [
"<attribute1_name> [string]",
"<attribute2_name> [string]",
"<attribute3_name> [string]"
],
"num_bins": "<num_bins> [int]",
"metric": "[PSI|KLDivergence|JSDivergence|HellingerDistance|HypothesisTest]",
"base": {
"source": "[inference|reference]",
"filter [Optional]": [
{
"property": "<filter_attribute_name> [string]",
"comparator": "<comparator> [string]",
"value": "<filter_threshold_value> [string|int|float]"
}
]
},
"target": {
"source": "[inference|reference|ground_truth]",
"filter [Optional]": [
{
"property": "<filter_attribute_name> [string]",
"comparator": "<comparator> [string]",
"value": "<filter_threshold_value> [string|int|float]"
}
]
},
"group_by [Optional]": [
{
"property": "<group_by_attribute_name> [string]"
}
],
"rollup [Optional]": "minute|hour|day|month|year|batch_id"
}
Query Response:
{
"query_result": [
{
"<attribute1_name>": "<attribute1_data_drift> [float]",
"<attribute2_name>": "<attribute2_data_drift> [float]",
"<attribute3_name>": "<attribute3_data_drift> [float]",
"<group_by_attribute_name>": "<group_by_attribute_value> [string|int|null]",
"rollup": "<rollup_attribute_value> [string|null]"
}
]
}
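Because the endpoint accepts up to 100 properties per request, a model with more attributes needs several queries. The following is a minimal sketch of one way to split the work; the helper name and the attribute names are hypothetical, and each resulting dictionary would still need to be submitted separately.

```python
from typing import Any, Dict, Iterator, List

MAX_PROPERTIES_PER_REQUEST = 100  # limit stated for the data drift endpoint


def chunked_drift_queries(
    properties: List[str],
    template: Dict[str, Any],
    chunk_size: int = MAX_PROPERTIES_PER_REQUEST,
) -> Iterator[Dict[str, Any]]:
    """Yield one query per chunk of at most `chunk_size` properties."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(properties), chunk_size):
        query = dict(template)  # shallow copy; the template itself is not modified
        query["properties"] = properties[start:start + chunk_size]
        yield query


template = {
    "num_bins": 10,
    "metric": "PSI",
    "base": {"source": "reference"},
    "target": {"source": "inference"},
}
all_properties = [f"feature_{i}" for i in range(250)]  # hypothetical attribute names
queries = list(chunked_drift_queries(all_properties, template))
print([len(q["properties"]) for q in queries])
```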
Example: Reference vs. Inference
Sample Request: Calculate data drift for males, grouped by country, rolled up by hour.
{
"properties": [
"age"
],
"num_bins": 10,
"metric": "PSI",
"base": {
"source": "reference",
"filter": [
{
"property": "gender",
"comparator": "eq",
"value": "male"
}
]
},
"target": {
"source": "inference",
"filter": [
{
"property": "gender",
"comparator": "eq",
"value": "male"
},
{
"property": "inference_timestamp",
"comparator": "gte",
"value": "2020-07-22T10:00:00Z"
},
{
"property": "inference_timestamp",
"comparator": "lt",
"value": "2020-07-23T10:00:00Z"
}
]
},
"group_by": [
{
"property": "country"
}
],
"rollup": "hour"
}
Sample Response:
{
"query_result": [
{
"age": 2.3,
"country": "Canada",
"rollup": "2020-07-22T10:00:00Z"
},
{
"age": 2.4,
"country": "United States",
"rollup": "2020-07-22T10:00:00Z"
}
]
}
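The same request can be assembled in Python before passing it to the SDK. The sketch below only builds the dictionary; the helper time_window_filters is hypothetical, and the final call is left as a comment because it requires an Arthur model object that is not constructed here.

```python
from typing import Any, Dict, List


def time_window_filters(
    start: str, end: str, property_name: str = "inference_timestamp"
) -> List[Dict[str, Any]]:
    """Build a half-open [start, end) time filter pair."""
    return [
        {"property": property_name, "comparator": "gte", "value": start},
        {"property": property_name, "comparator": "lt", "value": end},
    ]


def gender_filter() -> Dict[str, Any]:
    """Return a fresh copy of the filter shared by base and target."""
    return {"property": "gender", "comparator": "eq", "value": "male"}


query = {
    "properties": ["age"],
    "num_bins": 10,
    "metric": "PSI",
    "base": {"source": "reference", "filter": [gender_filter()]},
    "target": {
        "source": "inference",
        "filter": [gender_filter()]
        + time_window_filters("2020-07-22T10:00:00Z", "2020-07-23T10:00:00Z"),
    },
    "group_by": [{"property": "country"}],
    "rollup": "hour",
}

# With an Arthur model object from the SDK, the query would be submitted as:
# result = arthur_model.query(query, query_type='drift')
```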
Example: Inference vs. Inference
Sample Request: Compare data drift between two batches, each selected with a batch_id filter, with no grouping or rollup.
{
"properties": [
"age"
],
"num_bins": 10,
"metric": "PSI",
"base": {
"source": "inference",
"filter": [
{
"property": "batch_id",
"comparator": "eq",
"value": "5"
}
]
},
"target": {
"source": "inference",
"filter": [
{
"property": "batch_id",
"comparator": "eq",
"value": "6"
}
]
}
}
Sample Response:
{
"query_result": [
{
"age": 2.3
}
]
}
Example: Reference vs. Ground Truth
Sample Request: Calculate data drift for individual ground truth class prediction probabilities, rolled up by hour.
{
"properties": [
"gt_1"
],
"num_bins": 10,
"metric": "PSI",
"base": {
"source": "reference"
},
"target": {
"source": "ground_truth",
"filter": [
{
"property": "ground_truth_timestamp",
"comparator": "gte",
"value": "2020-07-22T10:00:00Z"
},
{
"property": "ground_truth_timestamp",
"comparator": "lt",
"value": "2020-07-23T10:00:00Z"
}
]
},
"rollup": "hour"
}
Sample Response:
{
"query_result": [
{
"gt_1": 0.03,
"rollup": "2020-07-22T10:00:00Z"
},
{
"gt_1": 0.4,
"rollup": "2020-07-22T11:00:00Z"
}
]
}
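A rolled-up response contains one row per time window, so it is often convenient to reshape it into a per-property time series. The following is a rough sketch using the sample rows above; the function name is hypothetical, and it assumes the rollup values are ISO-8601 timestamps in a single format, so that sorting them as strings is chronological.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def drift_time_series(
    rows: List[Dict[str, Any]], properties: List[str]
) -> Dict[str, List[Tuple[str, float]]]:
    """Collect (rollup, drift) pairs per property, ordered by rollup."""
    series: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    for row in rows:
        rollup = row.get("rollup")
        if rollup is None:  # rows without a rollup value are skipped
            continue
        for name in properties:
            if name in row:
                series[name].append((rollup, row[name]))
    return {name: sorted(points) for name, points in series.items()}


rows = [
    {"gt_1": 0.4, "rollup": "2020-07-22T11:00:00Z"},
    {"gt_1": 0.03, "rollup": "2020-07-22T10:00:00Z"},
]
series = drift_time_series(rows, ["gt_1"])
print(series)
```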
Data Drift PSI Bucket Table Values
This metric has a dedicated endpoint at Query PSI Bucket Table.
Returns the PSI scores by bucket using the reference set data. The query for this endpoint does not take a metric and takes a single property, but is otherwise identical to a query to the data drift endpoint.
Note that when using this endpoint with categorical features, the bucket_min and bucket_max fields will not be returned in the response. Instead, the bucket field will contain the category name.
Query Request:
{
"property": "<attribute_name> [string]",
"num_bins": "<num_bins> [int]",
"base": {
"source": "[inference|reference]",
"filter [Optional]": [
{
"property": "<filter_attribute_name> [string]",
"comparator": "<comparator> [string]",
"value": "<filter_threshold_value> [string|int|float]"
}
]
},
"target": {
"source": "[inference|reference]",
"filter [Optional]": [
{
"property": "<filter_attribute_name> [string]",
"comparator": "<comparator> [string]",
"value": "<filter_threshold_value> [string|int|float]"
}
]
},
"group_by [Optional]": [
{
"property": "<group_by_attribute_name> [string]"
}
],
"rollup [Optional]": "minute|hour|day|month|year|batch_id"
}
Query Response:
{
"query_result": [
{
"bucket": "string",
"rollup": "string|null",
"group_by_property_1": "string|null",
"base_bucket_max": "number",
"base_bucket_min": "number",
"base_count_per_bucket": "number",
"base_ln_probability_per_bucket": "number",
"base_probability_per_bucket": "number",
"base_total": "number",
"target_bucket_max": "number",
"target_bucket_min": "number",
"target_count_per_bucket": "number",
"target_ln_probability_per_bucket": "number",
"target_probability_per_bucket": "number",
"target_total": "number",
"probability_difference": "number",
"ln_probability_difference": "number",
"psi": "number"
}
]
}
Sample Request: Calculate data drift bucket components for males, grouped by country, rolled up by hour.
{
"property": "age",
"num_bins": 2,
"base": {
"source": "reference",
"filter": [
{
"property": "gender",
"comparator": "eq",
"value": "male"
}
]
},
"target": {
"source": "inference",
"filter": [
{
"property": "gender",
"comparator": "eq",
"value": "male"
},
{
"property": "inference_timestamp",
"comparator": "gte",
"value": "2020-07-22T10:00:00Z"
},
{
"property": "inference_timestamp",
"comparator": "lt",
"value": "2020-07-23T10:00:00Z"
}
]
},
"group_by": [
{
"property": "country"
}
],
"rollup": "hour"
}
Sample Response:
{
"query_result": [
{
"bucket": "bucket_1",
"rollup": "2020-01-01T00:00:00Z",
"country": "Canada",
"base_bucket_max": 0.9999971182990177,
"base_bucket_min": 0.5009102069226075,
"base_count_per_bucket": 4988,
"base_ln_probability_per_bucket": -0.6955500651756032,
"base_probability_per_bucket": 0.4988,
"base_total": 10000,
"target_bucket_max": 0.9999971182990177,
"target_bucket_min": 0.5009102069226075,
"target_count_per_bucket": 2487,
"target_ln_probability_per_bucket": -0.6701670131762315,
"target_probability_per_bucket": 0.5116231228142357,
"target_total": 4861,
"probability_difference": -0.012823122814235699,
"ln_probability_difference": -0.025383051999371742,
"psi": 0.00032548999318807485
},
{
"bucket": "bucket_2",
"rollup": "2020-01-01T00:00:00Z",
"country": "United States",
"base_bucket_max": 0.9999971182990177,
"base_bucket_min": 0.5009102069226075,
"base_count_per_bucket": 4988,
"base_ln_probability_per_bucket": -0.6955500651756032,
"base_probability_per_bucket": 0.4988,
"base_total": 10000,
"target_bucket_max": 0.9999971182990177,
"target_bucket_min": 0.5009102069226075,
"target_count_per_bucket": 2487,
"target_ln_probability_per_bucket": -0.6701670131762315,
"target_probability_per_bucket": 0.5116231228142357,
"target_total": 4861,
"probability_difference": -0.012823122814235699,
"ln_probability_difference": -0.025383051999371742,
"psi": 0.00032548999318807485
},
{
"bucket": "bucket_1",
"rollup": "2020-01-01T01:00:00Z",
"country": "Canada",
"base_bucket_max": 0.9999971182990177,
"base_bucket_min": 0.5009102069226075,
"base_count_per_bucket": 4988,
"base_ln_probability_per_bucket": -0.6955500651756032,
"base_probability_per_bucket": 0.4988,
"base_total": 10000,
"target_bucket_max": 0.9999971182990177,
"target_bucket_min": 0.5009102069226075,
"target_count_per_bucket": 2487,
"target_ln_probability_per_bucket": -0.6701670131762315,
"target_probability_per_bucket": 0.5116231228142357,
"target_total": 4861,
"probability_difference": -0.012823122814235699,
"ln_probability_difference": -0.025383051999371742,
"psi": 0.00032548999318807485
},
{
"bucket": "bucket_2",
"rollup": "2020-01-01T01:00:00Z",
"country": "United States",
"base_bucket_max": 0.9999971182990177,
"base_bucket_min": 0.5009102069226075,
"base_count_per_bucket": 4988,
"base_ln_probability_per_bucket": -0.6955500651756032,
"base_probability_per_bucket": 0.4988,
"base_total": 10000,
"target_bucket_max": 0.9999971182990177,
"target_bucket_min": 0.5009102069226075,
"target_count_per_bucket": 2487,
"target_ln_probability_per_bucket": -0.6701670131762315,
"target_probability_per_bucket": 0.5116231228142357,
"target_total": 4861,
"probability_difference": -0.012823122814235699,
"ln_probability_difference": -0.025383051999371742,
"psi": 0.00032548999318807485
}
]
}
Sample Request: Compare data drift bucket components between two batches, each selected with a batch_id filter, with no grouping or rollup.
{
"property": "age",
"num_bins": 10,
"base": {
"source": "inference",
"filter": [
{
"property": "batch_id",
"comparator": "eq",
"value": "5"
}
]
},
"target": {
"source": "inference",
"filter": [
{
"property": "batch_id",
"comparator": "eq",
"value": "6"
}
]
}
}
Sample Response:
{
"query_result": [
{
"bucket": "bucket_1",
"base_bucket_max": 0.9999971182990177,
"base_bucket_min": 0.5009102069226075,
"base_count_per_bucket": 4988,
"base_ln_probability_per_bucket": -0.6955500651756032,
"base_probability_per_bucket": 0.4988,
"base_total": 10000,
"target_bucket_max": 0.9999971182990177,
"target_bucket_min": 0.5009102069226075,
"target_count_per_bucket": 2487,
"target_ln_probability_per_bucket": -0.6701670131762315,
"target_probability_per_bucket": 0.5116231228142357,
"target_total": 4861,
"probability_difference": -0.012823122814235699,
"ln_probability_difference": -0.025383051999371742,
"psi": 0.00032548999318807485
},
{
"bucket": "bucket_2",
"base_bucket_max": 0.9999971182990177,
"base_bucket_min": 0.5009102069226075,
"base_count_per_bucket": 4988,
"base_ln_probability_per_bucket": -0.6955500651756032,
"base_probability_per_bucket": 0.4988,
"base_total": 10000,
"target_bucket_max": 0.9999971182990177,
"target_bucket_min": 0.5009102069226075,
"target_count_per_bucket": 2487,
"target_ln_probability_per_bucket": -0.6701670131762315,
"target_probability_per_bucket": 0.5116231228142357,
"target_total": 4861,
"probability_difference": -0.012823122814235699,
"ln_probability_difference": -0.025383051999371742,
"psi": 0.00032548999318807485
}
]
}
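The fields in the bucket table relate to one another arithmetically. The sketch below reproduces the sample values above from the raw counts. The formulas are inferred from those sample numbers (differences taken as base minus target, natural logarithms) rather than from a published specification, and the function name is hypothetical.

```python
import math
from typing import Dict


def psi_bucket_components(
    base_count: int, base_total: int, target_count: int, target_total: int
) -> Dict[str, float]:
    """Recompute the per-bucket PSI fields from counts.

    Assumes both counts are non-zero; a zero count would make the log undefined.
    """
    base_p = base_count / base_total
    target_p = target_count / target_total
    probability_difference = base_p - target_p
    ln_probability_difference = math.log(base_p) - math.log(target_p)
    return {
        "base_probability_per_bucket": base_p,
        "target_probability_per_bucket": target_p,
        "probability_difference": probability_difference,
        "ln_probability_difference": ln_probability_difference,
        "psi": probability_difference * ln_probability_difference,
    }


# Counts taken from the sample response above.
components = psi_bucket_components(4988, 10000, 2487, 4861)
for field, value in components.items():
    print(f"{field}: {value}")
```

Under this reading, each bucket's psi value is the product of the probability difference and the log-probability difference, which is never negative because both factors always share the same sign.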
Data Drift for Classification Outputs
For classification outputs, one may want to examine drift among a collection of different classes, i.e., the system of outputs, instead of the drift of the probability predictions of a single class. The query uses one of "predicted_classes": ["*"] or "ground_truth_classes": ["*"] but is otherwise identical to a standard data drift query. Rather than using the star operator to select all prediction or ground truth classes in a model, a list of class names (strings) can be provided to look at the drift of a subset of multiclass outputs.
predicted_classes - Specifies which prediction classes to use for predictedClass data drift.
ground_truth_classes - Specifies which ground truth classes to use for groundTruthClass data drift.
properties can be included in the same query as long as the target source corresponds to the classification output tag. For example, one can query drift on input attributes and predictedClass in the same query with a target source of inference; one can query drift on individual ground truth labels and groundTruthClass in the same query with a target source of ground_truth.
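The pairing between the class selector and the target source can be captured in a small helper. This is a sketch only: the function name and defaults are hypothetical, and the mapping is inferred from the description above rather than taken from the SDK.

```python
from typing import Any, Dict, List, Optional

# Each class selector is paired with the target source it corresponds to.
REQUIRED_TARGET_SOURCE = {
    "predicted_classes": "inference",
    "ground_truth_classes": "ground_truth",
}


def build_class_drift_query(
    selector: str,
    classes: Optional[List[str]] = None,
    properties: Optional[List[str]] = None,
    metric: str = "PSI",
    num_bins: int = 20,
    base_source: str = "reference",
) -> Dict[str, Any]:
    """Build a classification-output drift query with a matching target source."""
    if selector not in REQUIRED_TARGET_SOURCE:
        raise ValueError(f"selector must be one of {sorted(REQUIRED_TARGET_SOURCE)}")
    query: Dict[str, Any] = {
        selector: list(classes) if classes else ["*"],  # "*" selects all classes
        "num_bins": num_bins,
        "metric": metric,
        "base": {"source": base_source},
        "target": {"source": REQUIRED_TARGET_SOURCE[selector]},
    }
    if properties:
        query["properties"] = list(properties)
    return query


print(build_class_drift_query("predicted_classes"))
print(build_class_drift_query("ground_truth_classes", classes=["gt_1", "gt_3"]))
```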
Query Request:
{
"properties [Optional]": [
"<attribute1_name> [string]",
"<attribute2_name> [string]",
"<attribute3_name> [string]"
],
"[predicted_classes|ground_truth_classes]": [
"<class0_name> [string]",
"<class1_name> [string]"
],
"num_bins": "<num_bins> [int]",
"metric": "[PSI|KLDivergence|JSDivergence|HellingerDistance|HypothesisTest]",
"base": {
"source": "[inference|reference]",
"filter [Optional]": [
{
"property": "<filter_attribute_name> [string]",
"comparator": "<comparator> [string]",
"value": "<filter_threshold_value> [string|int|float]"
}
]
},
"target": {
"source": "[inference|reference|ground_truth]",
"filter [Optional]": [
{
"property": "<filter_attribute_name> [string]",
"comparator": "<comparator> [string]",
"value": "<filter_threshold_value> [string|int|float]"
}
]
},
"group_by [Optional]": [
{
"property": "<group_by_attribute_name> [string]"
}
],
"rollup [Optional]": "minute|hour|day|month|year|batch_id"
}
Query Response:
{
"query_result": [
{
"<attribute1_name>": "<attribute1_data_drift> [float]",
"<attribute2_name>": "<attribute2_data_drift> [float]",
"<attribute3_name>": "<attribute3_data_drift> [float]",
"[predictedClass|groundTruthClass]": "<classification_data_drift> [float]",
"<group_by_attribute_name>": "<group_by_attribute_value> [string|int|null]",
"rollup": "<rollup_attribute_value> [string|null]"
}
]
}
Sample Request: Calculate data drift on all prediction classes.
{
"predicted_classes": [
"*"
],
"num_bins": 20,
"base": {
"source": "reference"
},
"target": {
"source": "inference"
},
"metric": "PSI"
}
Sample Response:
{
"query_result": [
{
"predictedClass": 0.021
}
]
}
Sample Request: Calculate data drift on ground truth using the first and third ground truth classes.
{
"ground_truth_classes": [
"gt_1",
"gt_3"
],
"num_bins": 20,
"base": {
"source": "reference"
},
"target": {
"source": "ground_truth"
},
"metric": "PSI"
}
Sample Response:
{
"query_result": [
{
"groundTruthClass": 0.021
}
]
}
Automated Data Drift Thresholds
What is a sufficiently high data drift value to suggest that the target data has actually drifted from the base data? For HypothesisTest, we can reverse engineer -log_10(P_value) and plug in the conventional .05 alpha level to establish a lower bound of -log_10(.05), which is approximately 1.3.
For the other data drift metrics, pinning a constant is insufficient. We abstract this away for the user and allow queries to obtain automatically generated data drift thresholds (lower bounds) based on a model's data. These thresholds can be used in alerting. For more information, see: Automating Data Drift Thresholding in Machine Learning Systems.
The query uses "metric": "Thresholds" and neither requires nor uses the "target" and "rollup" fields, but is otherwise identical to a standard data drift query.
Query Request:
{
"properties": [
"<attribute1_name> [string]",
"<attribute2_name> [string]",
"<attribute3_name> [string]"
],
"num_bins": "<num_bins> [int]",
"metric": "Thresholds",
"base": {
"source": "reference",
"filter [Optional]": [
{
"property": "<filter_attribute_name> [string]",
"comparator": "<comparator> [string]",
"value": "<filter_threshold_value> [string|int|float]"
}
]
},
"group_by [Optional]": [
{
"property": "<group_by_attribute_name> [string]"
}
]
}
Query Response:
{
"query_result": [
{
"<attribute1_name>": {
"HellingerDistance": "<threshold> [float]",
"JSDivergence": "<threshold> [float]",
"KLDivergence": "<threshold> [float]",
"PSI": "<threshold> [float]"
},
"<attribute2_name>": {
"HellingerDistance": "<threshold> [float]",
"JSDivergence": "<threshold> [float]",
"KLDivergence": "<threshold> [float]",
"PSI": "<threshold> [float]"
}
}
]
}
Sample Request:
{
"properties": [
"AGE"
],
"num_bins": 20,
"base": {
"source": "reference"
},
"metric": "Thresholds"
}
Sample Response:
{
"query_result": [
{
"AGE": {
"HellingerDistance": 0.00041737395239735647,
"JSDivergence": 2.959228131592643,
"KLDivergence": 0.001893866910388703,
"PSI": 0.0018945640055550161
}
}
]
}
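Since the thresholds are lower bounds, one way to use them is to compare them with the values returned by a regular drift query. The sketch below assumes both rows are keyed by the same property names; the function name and the drift value are hypothetical, while the thresholds are copied from the sample response above.

```python
from typing import Any, Dict, List


def properties_over_threshold(
    drift_row: Dict[str, Any], thresholds_row: Dict[str, Any], metric: str
) -> List[str]:
    """Return the properties whose drift value exceeds the threshold for `metric`."""
    flagged = []
    for name, per_metric in thresholds_row.items():
        if not isinstance(per_metric, dict):  # skip non-threshold columns such as group_by values
            continue
        if name in drift_row and metric in per_metric:
            if drift_row[name] > per_metric[metric]:
                flagged.append(name)
    return flagged


thresholds_row = {
    "AGE": {
        "HellingerDistance": 0.00041737395239735647,
        "JSDivergence": 2.959228131592643,
        "KLDivergence": 0.001893866910388703,
        "PSI": 0.0018945640055550161,
    }
}
drift_row = {"AGE": 0.021}  # hypothetical PSI value from a separate drift query

print(properties_over_threshold(drift_row, thresholds_row, "PSI"))
```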