Enrichments

In addition to tracking and aggregating user-supplied information (inferences, ground truth, predicted values, etc.), Arthur can also enrich data by computing additional metrics. Examples of enrichments include anomaly detection, which generates multivariate anomaly scores, and explainability, which generates feature importance scores.

This guide will outline how to enable, disable, and configure Enrichments.

For a list of all available enrichments, their configuration options, and example usage, see Enrichment List.

General Usage

Every enrichment can be enabled or disabled independently, and each may expose configuration options, some of which are required.

Viewing Current Enrichments

You can use the SDK to fetch current enrichment settings.

model = connection.get_model("credit_risk", id_type="partner_model_id")
model.get_enrichments()

This will return a dictionary containing the configuration for all available enrichments:

{
    "anomaly_detection": {
        "enabled": True,
        "config": {}
    },
    "explainability": {
        "enabled": False,
        "config": {}
    }
}
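Since this is a plain dictionary, you can inspect it programmatically, for example to check whether a particular enrichment is currently turned on:

# check whether anomaly detection is currently enabled
enrichments = model.get_enrichments()
if enrichments["anomaly_detection"]["enabled"]:
    print("anomaly detection is on")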

You can also fetch the configuration for just a single enrichment at a time.

from arthurai.common.constants import Enrichment

model.get_enrichment(Enrichment.AnomalyDetection)

Returns:

{
    "enabled": True,
    "config": {}
}

Updating Enrichments

You can configure multiple enrichments at once:

enrichment_configs = {
    Enrichment.Explainability: {'enabled': False, 'config': {}},
    Enrichment.AnomalyDetection: {'enabled': True, 'config': {}}
}
model.update_enrichments(enrichment_configs)

Or you can edit the configuration for a single enrichment:

ad_config = {}
enabled = True
model.update_enrichment(Enrichment.AnomalyDetection, enabled, ad_config)

Some enrichments can be configured using specialized helper functions. See the next section of this guide for specifics on configuring each enrichment.

Enrichment List

The table below outlines all currently available enrichments.

| Enrichment | Constant | Description |
| --- | --- | --- |
| Anomaly Detection | Enrichment.AnomalyDetection | Calculates a multivariate anomaly score for each inference. Requires a reference set to be uploaded. |
| Bias Mitigation | Enrichment.BiasMitigation | Calculates possible sets of group-conditional thresholds that may be used to produce fairer classifications. |
| Explainability | Enrichment.Explainability | Generates feature importance scores for inferences. Requires the user to provide model files. |

Anomaly Detection

Anomaly detection requires a reference set to be uploaded: Arthur trains a model on the reference set and then uses that model to score new inferences. See the explanation of our anomaly detection functionality from an algorithms perspective here. The reference set can be a subset of the model's training data, or a dataset that was used during model testing. If anomaly detection is enabled but no reference set has been uploaded, anomaly scores will not be generated for the inferences you send to Arthur. However, once a reference set has been uploaded, scores will automatically start to be calculated if the enrichment is already enabled.
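As a rough end-to-end sketch (assuming your reference data is in a pandas DataFrame named reference_df, and that your SDK version exposes model.set_reference_data()):

# upload a reference set for Arthur to train the anomaly detection model on
model.set_reference_data(data=reference_df)

# enable anomaly detection; scoring begins once the reference set is processed
model.update_enrichment(Enrichment.AnomalyDetection, True, {})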

Compatibility

Anomaly Detection can be enabled for models with Tabular or Image input types that have a reference set uploaded to Arthur.

Usage

# view current configuration
model.get_enrichment(Enrichment.AnomalyDetection)

# enable
model.update_enrichment(Enrichment.AnomalyDetection, True, {})

# disable
model.update_enrichment(Enrichment.AnomalyDetection, False, {})

Configuration

There is currently no additional configuration for Anomaly Detection.


Bias Mitigation

Once bias has been detected in your model, either pre- or post-deployment, you may be interested in mitigating that bias to improve your model in the future. Bias Mitigation requires a reference set to be uploaded. See the explanation of our current mitigation methods from an algorithms perspective here.

Compatibility

Bias Mitigation can be enabled for binary classification models of any input type, as long as at least one attribute is marked with monitor_for_bias=True and a reference set has been uploaded to Arthur.

Usage

# view current configuration
model.get_enrichment(Enrichment.BiasMitigation)

# enable
model.update_enrichment(Enrichment.BiasMitigation, True, {})
# or
model.enable_bias_mitigation()

Enabling Bias Mitigation will automatically train a mitigation model for every attribute marked with monitor_for_bias=True, for each of the constraints demographic parity, equalized odds, and equal opportunity.
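For example, a minimal sketch that enables the enrichment and then confirms the change took effect, using only the calls shown above:

# enable bias mitigation via the helper function
model.enable_bias_mitigation()

# confirm the enrichment is now reported as enabled
status = model.get_enrichment(Enrichment.BiasMitigation)
print(status["enabled"])  # expected: True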

Configuration

There is currently no additional configuration for Bias Mitigation.


Explainability

The Explainability enrichment will generate explanations (feature importance scores) for inferences. This requires providing model files for Arthur to run. See the required setup here.

The Explainability enrichment exposes some configuration options which are outlined below.

Compatibility

Explainability is supported for all models except object detection.

Usage

To enable the enrichment, we advise using the helper function model.enable_explainability(), which provides named parameters and automatically specifies some required settings, such as sdk_version and python_version. Once enabled, you can use the generic functions (model.update_enrichment() or model.update_enrichments()) to update the configuration or to disable explainability.

# view configuration
model.get_enrichment(Enrichment.Explainability)

# enable
model.enable_explainability(
    df=X_train.head(50),
    project_directory="/path/to/model_code/",
    requirements_file="example_requirements.txt",
    user_predict_function_import_path="example_entrypoint"
)

# update configuration
config_to_update = {
    'explanation_algo': 'shap',
    'streaming_explainability_enabled': False
}
model.update_enrichment(Enrichment.Explainability, True, config_to_update)

# disable
model.update_enrichment(Enrichment.Explainability, False, {})

When To Provide Required Settings

When going from disabled to enabled, you will need to include the required configuration settings. Once the enrichment has been enabled, you can update the non-required configuration settings without re-supplying required fields. When disabling the enrichment, you are not required to pass in any config settings.
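For example (a minimal sketch; explanation_nsamples is one of the optional settings listed below):

# after the first enable, optional settings can be updated without
# re-supplying df, project_directory, or the other required fields
model.update_enrichment(Enrichment.Explainability, True, {'explanation_nsamples': 500})

# disabling requires no config settings at all
model.update_enrichment(Enrichment.Explainability, False, {})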

Configuration

| Setting | Required | Description |
| --- | --- | --- |
| df | X | The dataframe passed to the explainer. Should be similar to, or a subset of, the training data. Typically small, ~50-100 rows. |
| project_directory | X | The path to the directory containing your predict function, requirements file, model file, and any other resources needed to support the predict function. |
| user_predict_function_import_path | X | The name of the file containing the predict function, without the .py extension. Used to import the predict function. |
| requirements_file | X | The name of the file containing the pip requirements for the predict function. |
| python_version | X | The Python version to use when executing the predict function. Automatically set to the current Python version when using model.enable_explainability(). |
| sdk_version | X | The arthurai version used to make the enable request. Automatically set to the currently installed SDK version when using model.enable_explainability(). |
| explanation_algo |  | The explanation algorithm to use. Valid options are 'lime' or 'shap'. Defaults to 'lime'. |
| explanation_nsamples |  | The number of perturbed samples used to generate the explanation. Fewer samples produce results more quickly but may be less robust; at least 100 samples are recommended. Defaults to 2000. |
| inference_consumer_score_percent |  | A number between 0.0 and 1.0 setting the percent of inferences to compute an explanation score for. Only applicable when streaming_explainability_enabled is true. Defaults to 1.0 (all inferences explained). |
| streaming_explainability_enabled |  | If true, every inference will have an explanation generated for it. If false, explanations are available on-demand only. |
| ignore_dirs |  | A list of paths to directories within project_directory that will not be bundled with the predict function. Use this to avoid including irrelevant code or files from larger directories. |
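The optional settings can also be passed directly to the helper when first enabling. A sketch, assuming your SDK version's enable_explainability() accepts these keyword arguments (the directory name in ignore_dirs is a hypothetical example):

model.enable_explainability(
    df=X_train.head(50),
    project_directory="/path/to/model_code/",
    requirements_file="example_requirements.txt",
    user_predict_function_import_path="example_entrypoint",
    explanation_algo="shap",                # use SHAP instead of the default LIME
    explanation_nsamples=1000,              # faster, but potentially less robust
    inference_consumer_score_percent=0.5,   # explain roughly half of streaming inferences
    ignore_dirs=["notebooks"]               # exclude a directory from the bundle
)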