Querying Drift in Python

To run a drift query with the Python SDK, set the query_type parameter to 'drift':

query = {...}

arthur_model.query(query, query_type='drift')
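As a minimal sketch of what the query body might contain, the helper below assembles a dictionary using the field names from the request schema in the Data Drift Endpoint section. The function build_drift_query and its default values are hypothetical and not part of the Arthur SDK; only the arthur_model.query(query, query_type='drift') call comes from this document.

```python
from typing import Any, Dict, List, Optional


def build_drift_query(
    properties: List[str],
    metric: str = "PSI",
    num_bins: int = 10,
    base_source: str = "reference",
    target_source: str = "inference",
    target_filters: Optional[List[Dict[str, Any]]] = None,
    rollup: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a drift query body (hypothetical helper, not an SDK function)."""
    query: Dict[str, Any] = {
        "properties": list(properties),
        "num_bins": num_bins,
        "metric": metric,
        "base": {"source": base_source},
        "target": {"source": target_source},
    }
    # Optional fields are only added when the caller supplies them.
    if target_filters:
        query["target"]["filter"] = list(target_filters)
    if rollup is not None:
        query["rollup"] = rollup
    return query


query = build_drift_query(["age"], rollup="hour")
# With a connected model object, this would be sent as:
# arthur_model.query(query, query_type='drift')
```

Keeping the body construction in a plain function makes it easy to inspect the dictionary before sending it.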

Data Drift Endpoint

Data drift has a dedicated endpoint at Query Data Drift.

Returns the data drift metric between a base dataset and a target dataset. This endpoint supports up to 100 properties in one request.

num_bins - Specifies the granularity of bucketing for continuous distributions; ignored if the attribute is categorical.
metric - Specifies one of the data drift metrics Arthur offers (see the Glossary).
filter - Optional blocks, specific to either the base or the target data, that specify which data should be used in the data drift calculation.
group_by - Optional. Global: applies to both the base and target data.
rollup - Optional parameter that aggregates the calculated data drift value by the supported time dimension.

For HypothesisTest, the returned value is transformed as -log_10(P_value) to maintain directional parity with the other data drift metrics. A lower P_value is more significant and implies data drift, which is reflected in a higher -log_10(P_value). Further mathematical details are in the Glossary.
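The transform can be sketched in a few lines of Python. The function names below are illustrative only and are not part of any Arthur API; the sketch simply applies -log_10 to a p-value and inverts it again.

```python
import math


def hypothesis_test_score(p_value: float) -> float:
    """Map a p-value to -log10(p_value); smaller p-values give larger scores."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    return -math.log10(p_value)


def p_value_from_score(score: float) -> float:
    """Invert the transform to recover the p-value from a returned score."""
    return 10.0 ** (-score)


# A more significant (smaller) p-value yields a larger drift score.
score_weak = hypothesis_test_score(0.05)
score_strong = hypothesis_test_score(0.01)
```

Here score_strong is larger than score_weak, which matches the direction of the other metrics: a higher value suggests more drift.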

Query Request:

{
    "properties": [
        "<attribute1_name> [string]",
        "<attribute2_name> [string]",
        "<attribute3_name> [string]"
    ],
    "num_bins": "<num_bins> [int]",
    "metric": "[PSI|KLDivergence|JSDivergence|HellingerDistance|HypothesisTest]",
    "base": {
        "source": "[inference|reference]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "target": {
        "source": "[inference|reference|ground_truth]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "group_by [Optional]": [
        {
            "property": "<group_by_attribute_name> [string]"
        }
    ],
    "rollup [Optional]": "minute|hour|day|month|year|batch_id"
}

Query Response:

{
    "query_result": [
        {
            "<attribute1_name>": "<attribute1_data_drift> [float]",
            "<attribute2_name>": "<attribute2_data_drift> [float]",
            "<attribute3_name>": "<attribute3_data_drift> [float]",
            "<group_by_attribute_name>": "<group_by_attribute_value> [string|int|null]",
            "rollup": "<rollup_attribute_value> [string|null]"
        }
    ]
}

Example: Reference vs. Inference

Sample Request: Calculate data drift for males, grouped by country, rolled up by hour.

{
    "properties": [
        "age"
    ],
    "num_bins": 10,
    "metric": "PSI",
    "base": {
        "source": "reference",
        "filter": [
            {
                "property": "gender",
                "comparator": "eq",
                "value": "male"
            }
        ]
    },
    "target": {
        "source": "inference",
        "filter": [
            {
                "property": "gender",
                "comparator": "eq",
                "value": "male"
            },
            {
                "property": "inference_timestamp",
                "comparator": "gte",
                "value": "2020-07-22T10:00:00Z"
            },
            {
                "property": "inference_timestamp",
                "comparator": "lt",
                "value": "2020-07-23T10:00:00Z"
            }
        ]
    },
    "group_by": [
        {
            "property": "country"
        }
    ],
    "rollup": "hour"
}

Sample Response:

{
    "query_result": [
        {
            "age": 2.3,
            "country": "Canada",
            "rollup": "2020-07-22T10:00:00Z"
        },
        {
            "age": 2.4,
            "country": "United States",
            "rollup": "2020-07-22T10:00:00Z"
        }
    ]
}
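When a query uses both group_by and rollup, each row of query_result carries one group value and one rollup value. A minimal sketch for reshaping such a response into one time-ordered series per group follows; drift_by_group is a hypothetical helper, and the response literal is copied from the sample above.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def drift_by_group(
    query_result: List[Dict[str, Any]], prop: str, group_key: str
) -> Dict[Any, List[Tuple[str, float]]]:
    """Collect (rollup, drift) pairs per group value, sorted by rollup."""
    series: Dict[Any, List[Tuple[str, float]]] = defaultdict(list)
    for row in query_result:
        series[row[group_key]].append((row["rollup"], row[prop]))
    for points in series.values():
        # ISO 8601 timestamps in one format sort correctly as strings.
        points.sort()
    return dict(series)


response = {
    "query_result": [
        {"age": 2.3, "country": "Canada", "rollup": "2020-07-22T10:00:00Z"},
        {"age": 2.4, "country": "United States", "rollup": "2020-07-22T10:00:00Z"},
    ]
}
by_country = drift_by_group(response["query_result"], "age", "country")
```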

Example: Inference vs. Inference

Sample Request: Compare data drift between two batches, with no grouping or rollups. The only filters are the batch_id filters that select each batch.

{
    "properties": [
        "age"
    ],
    "num_bins": 10,
    "metric": "PSI",
    "base": {
        "source": "inference",
        "filter": [
            {
                "property": "batch_id",
                "comparator": "eq",
                "value": "5"
            }
        ]
    },
    "target": {
        "source": "inference",
        "filter": [
            {
                "property": "batch_id",
                "comparator": "eq",
                "value": "6"
            }
        ]
    }
}

Sample Response:

{
    "query_result": [
        {
            "age": 2.3
        }
    ]
}

Example: Reference vs. Ground Truth

Sample Request: Calculate data drift for individual ground truth class prediction probabilities, rolled up by hour.

{
    "properties": [
        "gt_1"
    ],
    "num_bins": 10,
    "metric": "PSI",
    "base": {
        "source": "reference"
    },
    "target": {
        "source": "ground_truth",
        "filter": [
            {
                "property": "ground_truth_timestamp",
                "comparator": "gte",
                "value": "2020-07-22T10:00:00Z"
            },
            {
                "property": "ground_truth_timestamp",
                "comparator": "lt",
                "value": "2020-07-23T10:00:00Z"
            }
        ]
    },
    "rollup": "hour"
}

Sample Response:

{
    "query_result": [
        {
            "gt_1": 0.03,
            "rollup": "2020-07-22T10:00:00Z"
        },
        {
            "gt_1": 0.4,
            "rollup": "2020-07-22T11:00:00Z"
        }
    ]
}

Data Drift PSI Bucket Table Values

This metric has a dedicated endpoint at Query PSI Bucket Table.

Returns the PSI scores by bucket using the reference set data. The query for this endpoint omits metric and takes a single property, but is otherwise identical to a query for the data drift endpoint.

Note that when this endpoint is used with categorical features, the bucket_min and bucket_max fields are not returned in the response. Instead, the bucket field contains the category name.

Query Request:

{
    "property": "<attribute_name> [string]",
    "num_bins": "<num_bins> [int]",
    "base": {
        "source": "[inference|reference]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "target": {
        "source": "[inference|reference]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "group_by [Optional]": [
        {
            "property": "<group_by_attribute_name> [string]"
        }
    ],
    "rollup [Optional]": "minute|hour|day|month|year|batch_id"
}

Query Response:

{
    "query_result": [
        {
            "bucket": "string",
            "rollup": "string|null",
            "group_by_property_1": "string|null",
            "base_bucket_max": "number",
            "base_bucket_min": "number",
            "base_count_per_bucket": "number",
            "base_ln_probability_per_bucket": "number",
            "base_probability_per_bucket": "number",
            "base_total": "number",
            "target_bucket_max": "number",
            "target_bucket_min": "number",
            "target_count_per_bucket": "number",
            "target_ln_probability_per_bucket": "number",
            "target_probability_per_bucket": "number",
            "target_total": "number",
            "probability_difference": "number",
            "ln_probability_difference": "number",
            "psi": "number"
        }
    ]
}

Sample Request: Calculate data drift bucket components for males, grouped by country, rolled up by hour.

{
    "property": "age",
    "num_bins": 2,
    "base": {
        "source": "reference",
        "filter": [
            {
                "property": "gender",
                "comparator": "eq",
                "value": "male"
            }
        ]
    },
    "target": {
        "source": "inference",
        "filter": [
            {
                "property": "gender",
                "comparator": "eq",
                "value": "male"
            },
            {
                "property": "inference_timestamp",
                "comparator": "gte",
                "value": "2020-07-22T10:00:00Z"
            },
            {
                "property": "inference_timestamp",
                "comparator": "lt",
                "value": "2020-07-23T10:00:00Z"
            }
        ]
    },
    "group_by": [
        {
            "property": "country"
        }
    ],
    "rollup": "hour"
}

Sample Response:

{
    "query_result": [
        {
            "bucket": "bucket_1",
            "rollup": "2020-01-01T00:00:00Z",
            "country": "Canada",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        },
        {
            "bucket": "bucket_2",
            "rollup": "2020-01-01T00:00:00Z",
            "country": "United States",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        },
        {
            "bucket": "bucket_1",
            "rollup": "2020-01-01T01:00:00Z",
            "country": "Canada",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        },
        {
            "bucket": "bucket_2",
            "rollup": "2020-01-01T01:00:00Z",
            "country": "United States",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        }
    ]
}
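The fields in each bucket row relate to one another in the way the standard PSI definition suggests. The sketch below assumes that probability_difference is the base probability minus the target probability, that ln_probability_difference is the difference of their natural logs, and that psi is the product of the two; these assumptions reproduce the sample values above but are not confirmed as the service's exact implementation. The function name bucket_psi is hypothetical.

```python
import math


def bucket_psi(
    base_count: int, base_total: int, target_count: int, target_total: int
) -> float:
    """Per-bucket PSI term: (p_base - p_target) * (ln p_base - ln p_target).

    Raises ValueError from math.log if either bucket count is zero.
    """
    base_p = base_count / base_total
    target_p = target_count / target_total
    probability_difference = base_p - target_p
    ln_probability_difference = math.log(base_p) - math.log(target_p)
    return probability_difference * ln_probability_difference


# Counts and totals taken from the sample response above.
psi = bucket_psi(4988, 10000, 2487, 4861)
```

Under the standard definition, the overall PSI for a property is the sum of these per-bucket terms.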

Sample Request: Compare data drift bucket components between two batches, with no grouping or rollups. The only filters are the batch_id filters that select each batch.

{
    "property": "age",
    "num_bins": 10,
    "base": {
        "source": "inference",
        "filter": [
            {
                "property": "batch_id",
                "comparator": "eq",
                "value": "5"
            }
        ]
    },
    "target": {
        "source": "inference",
        "filter": [
            {
                "property": "batch_id",
                "comparator": "eq",
                "value": "6"
            }
        ]
    }
}

Sample Response:

{
    "query_result": [
        {
            "bucket": "bucket_1",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        },
        {
            "bucket": "bucket_2",
            "base_bucket_max": 0.9999971182990177,
            "base_bucket_min": 0.5009102069226075,
            "base_count_per_bucket": 4988,
            "base_ln_probability_per_bucket": -0.6955500651756032,
            "base_probability_per_bucket": 0.4988,
            "base_total": 10000,
            "target_bucket_max": 0.9999971182990177,
            "target_bucket_min": 0.5009102069226075,
            "target_count_per_bucket": 2487,
            "target_ln_probability_per_bucket": -0.6701670131762315,
            "target_probability_per_bucket": 0.5116231228142357,
            "target_total": 4861,
            "probability_difference": -0.012823122814235699,
            "ln_probability_difference": -0.025383051999371742,
            "psi": 0.00032548999318807485
        }
    ]
}

Data Drift for Classification Outputs

For classification outputs, one may want to examine drift among a collection of different classes, i.e., the system of outputs, instead of the drift of the probability predictions of a single class. The query uses one of "predicted_classes": ["*"] or "ground_truth_classes": ["*"] but otherwise is identical to a standard data drift query. Rather than using the star operator to select all prediction or ground truth classes, respectively, in a model, a list of string classes can be provided for looking at the drift of a subset of multiclass outputs.

predicted_classes - Specifies which prediction classes to use for predictedClass data drift.
ground_truth_classes - Specifies which ground truth classes to use for groundTruthClass data drift.

properties can be included in the same query as long as the target source corresponds to the classification output tag. For example, one can query drift on input attributes and predictedClass in the same query with a target source of inference, or query drift on individual ground truth labels and groundTruthClass in the same query with a target source of ground_truth.
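The pairing rule above can be checked before a query is sent. The sketch below assumes that predicted_classes pairs with a target source of inference and ground_truth_classes with a target source of ground_truth, as the examples in this section suggest; check_classification_query is a hypothetical helper and not part of the Arthur SDK.

```python
from typing import Any, Dict, List

# Assumed pairing between class selector and target source (read off the examples).
EXPECTED_TARGET_SOURCE = {
    "predicted_classes": "inference",
    "ground_truth_classes": "ground_truth",
}


def check_classification_query(query: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a classification drift query body."""
    problems: List[str] = []
    present = [key for key in EXPECTED_TARGET_SOURCE if key in query]
    if len(present) != 1:
        problems.append(
            "expected exactly one of predicted_classes or ground_truth_classes"
        )
        return problems
    key = present[0]
    if not query[key]:
        problems.append(f"{key} must list at least one class or '*'")
    target_source = query.get("target", {}).get("source")
    if target_source != EXPECTED_TARGET_SOURCE[key]:
        problems.append(
            f"{key} expects target source {EXPECTED_TARGET_SOURCE[key]!r}, "
            f"got {target_source!r}"
        )
    return problems


ok_query = {
    "predicted_classes": ["*"],
    "num_bins": 20,
    "metric": "PSI",
    "base": {"source": "reference"},
    "target": {"source": "inference"},
}
problems = check_classification_query(ok_query)
```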

Query Request:

{
    "properties [Optional]": [
        "<attribute1_name> [string]",
        "<attribute2_name> [string]",
        "<attribute3_name> [string]"
    ],
    "[predicted_classes|ground_truth_classes]": [
        "<class0_name> [string]",
        "<class1_name> [string]"
    ],
    "num_bins": "<num_bins> [int]",
    "metric": "[PSI|KLDivergence|JSDivergence|HellingerDistance|HypothesisTest]",
    "base": {
        "source": "[inference|reference]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "target": {
        "source": "[inference|reference|ground_truth]",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "group_by [Optional]": [
        {
            "property": "<group_by_attribute_name> [string]"
        }
    ],
    "rollup [Optional]": "minute|hour|day|month|year|batch_id"
}

Query Response:

{
    "query_result": [
        {
            "<attribute1_name>": "<attribute1_data_drift> [float]",
            "<attribute2_name>": "<attribute2_data_drift> [float]",
            "<attribute3_name>": "<attribute3_data_drift> [float]",
            "[predictedClass|groundTruthClass]": "<classification_data_drift> [float]",
            "<group_by_attribute_name>": "<group_by_attribute_value> [string|int|null]",
            "rollup": "<rollup_attribute_value> [string|null]"
        }
    ]
}

Sample Request: Calculate data drift on all prediction classes.

{
    "predicted_classes": [
        "*"
    ],
    "num_bins": 20,
    "base": {
        "source": "reference"
    },
    "target": {
        "source": "inference"
    },
    "metric": "PSI"
}

Sample Response:

{
    "query_result": [
        {
            "predictedClass": 0.021
        }
    ]
}

Sample Request: Calculate data drift on ground truth using the first and third ground truth classes.

{
    "ground_truth_classes": [
        "gt_1",
        "gt_3"
    ],
    "num_bins": 20,
    "base": {
        "source": "reference"
    },
    "target": {
        "source": "ground_truth"
    },
    "metric": "PSI"
}

Sample Response:

{
    "query_result": [
        {
            "groundTruthClass": 0.021
        }
    ]
}

Automated Data Drift Thresholds

What is a sufficiently high data drift value to suggest that the target data has actually drifted from the base data? For HypothesisTest, we can reverse engineer -log_10(P_value) and plug in the conventional .05 alpha level to establish a lower bound of -log_10(.05).

For the other data drift metrics, pinning a constant is insufficient. We abstract this away for the user and allow queries to obtain automatically generated data drift thresholds (lower bounds) based on a model's data. These thresholds can be used in alerting. For more information, see: Automating Data Drift Thresholding in Machine Learning Systems.

The query uses "metric": "Thresholds" and neither requires nor uses the "target" and "rollup" fields, but is otherwise identical to a standard data drift query.

Query Request:

{
    "properties": [
        "<attribute1_name> [string]",
        "<attribute2_name> [string]",
        "<attribute3_name> [string]"
    ],
    "num_bins": "<num_bins> [int]",
    "metric": "Thresholds",
    "base": {
        "source": "reference",
        "filter [Optional]": [
            {
                "property": "<filter_attribute_name> [string]",
                "comparator": "<comparator> [string]",
                "value": "<filter_threshold_value> [string|int|float]"
            }
        ]
    },
    "group_by [Optional]": [
        {
            "property": "<group_by_attribute_name> [string]"
        }
    ]
}

Query Response:

{
    "query_result": [
        {
            "<attribute1_name>": {
                "HellingerDistance": "<threshold> [float]",
                "JSDivergence": "<threshold> [float]",
                "KLDivergence": "<threshold> [float]",
                "PSI": "<threshold> [float]"
            },
            "<attribute2_name>": {
                "HellingerDistance": "<threshold> [float]",
                "JSDivergence": "<threshold> [float]",
                "KLDivergence": "<threshold> [float]",
                "PSI": "<threshold> [float]"
            }
            
        }
    ]
}

Sample Request:

{
    "properties": [
        "AGE"
    ],
    "num_bins": 20,
    "base": {
        "source": "reference"
    },
    "metric": "Thresholds"
}

Sample Response:

{
    "query_result": [
        {
            "AGE": {
                "HellingerDistance": 0.00041737395239735647,
                "JSDivergence": 2.959228131592643,
                "KLDivergence": 0.001893866910388703,
                "PSI": 0.0018945640055550161
            }
        }
    ]
}
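One way to use these thresholds is to compare a drift value from a standard query against the threshold for the same metric. The sketch below is illustrative: the function drifted is hypothetical, it treats a value strictly above the lower bound as drift (an assumption), and for HypothesisTest it falls back to the -log_10(.05) bound described in this section.

```python
import math
from typing import Dict

# Lower bound for HypothesisTest from the conventional .05 alpha level (about 1.301).
HYPOTHESIS_TEST_THRESHOLD = -math.log10(0.05)


def drifted(value: float, metric: str, thresholds: Dict[str, float]) -> bool:
    """Return True when a drift value exceeds the lower bound for its metric."""
    if metric == "HypothesisTest":
        return value > HYPOTHESIS_TEST_THRESHOLD
    return value > thresholds[metric]


# Thresholds for AGE copied from the sample response above.
age_thresholds = {
    "HellingerDistance": 0.00041737395239735647,
    "JSDivergence": 2.959228131592643,
    "KLDivergence": 0.001893866910388703,
    "PSI": 0.0018945640055550161,
}
psi_flag = drifted(2.3, "PSI", age_thresholds)
```

The same check could feed an alerting rule, with the thresholds refreshed whenever the reference data changes.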