Enrichments

In addition to tracking and aggregating user-supplied information (inferences, ground truth, predicted values, etc.), Arthur can also enrich data by computing additional metrics. Examples of enrichments include anomaly detection, which generates multivariate anomaly scores, and explainability, which generates feature importance scores.

This guide will outline how to enable, disable, and configure Enrichments.

For a list of all available enrichments, their configuration options, and example usage, see the Enrichment List section below.

General Usage

Every enrichment can be enabled or disabled independently, and may also expose or require configuration options.

Viewing Current Enrichments

You can use the SDK to fetch current enrichment settings.

model = connection.get_model("credit_risk", id_type="partner_model_id")
model.get_enrichments()

This will return a dictionary containing the configuration for all available enrichments:

{
    "anomaly_detection": {
        "enabled": true,
        "config": {}
    },
    "explainability": {
        "enabled": false,
        "config": {}
    }
}
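
Because the settings come back as a plain dictionary, they are easy to inspect programmatically. As a quick sketch using only the structure shown above, you could print whether each enrichment is enabled:

# `model` fetched as shown above
enrichments = model.get_enrichments()
for name, settings in enrichments.items():
    status = "enabled" if settings["enabled"] else "disabled"
    print(f"{name}: {status}")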

You can also fetch the configuration for just a single enrichment at a time.

from arthurai.common.constants import Enrichment

model.get_enrichment(Enrichment.AnomalyDetection)

Returns:

{
    "enabled": true,
    "config": {}
}

Updating Enrichments

You can configure multiple enrichments at once.

enrichment_configs = {
    Enrichment.Explainability: {'enabled': False, 'config': {}},
    Enrichment.AnomalyDetection: {'enabled': True, 'config': {}}
}
model.update_enrichments(enrichment_configs)

Or you can edit the configuration for a single enrichment.

ad_config = {}
enabled = True
model.update_enrichment(Enrichment.AnomalyDetection, enabled, ad_config)

Some enrichments can be configured using specialized helper functions. See the next section of this guide for specifics on configuring each enrichment.

Enrichment List

This table outlines all enrichments currently available.

| Enrichment | Constant | Description |
| --- | --- | --- |
| Anomaly Detection | Enrichment.AnomalyDetection | Calculates a multivariate anomaly score for each inference. Requires a reference set to be uploaded. |
| Explainability | Enrichment.Explainability | Generates feature importance scores for inferences. Requires the user to provide model files. |


Anomaly Detection

Anomaly detection requires a reference set to be uploaded. Arthur trains a model on the reference set and then uses that model to score new inferences. The reference set can be a subset of the model’s training data, or a dataset that was used during model testing. If anomaly detection is enabled but no reference set has been uploaded, anomaly scores will not be generated for the inferences you send to Arthur. Once a reference set has been uploaded, and anomaly detection is already enabled, anomaly scores will automatically start to be calculated.
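
A reference set can be uploaded through the SDK before or after enabling the enrichment. A minimal sketch, assuming ref_df is a pandas DataFrame shaped like the model’s input and that your SDK version supports passing it to model.set_reference_data() (check the SDK reference for the exact signature):

# upload a sample of the training data as the reference set
# (ref_df is assumed to already exist as a pandas DataFrame)
model.set_reference_data(data=ref_df)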

Compatibility

Anomaly Detection can be enabled for models with Tabular or Image input types that have a reference set uploaded to Arthur.

Usage

# view current configuration
model.get_enrichment(Enrichment.AnomalyDetection)

# enable
model.update_enrichment(Enrichment.AnomalyDetection, True, {})

# disable
model.update_enrichment(Enrichment.AnomalyDetection, False, {})

Configuration

There is currently no additional configuration for Anomaly Detection.


Explainability

The Explainability enrichment generates explanations (feature importance scores) for inferences. This requires providing model files for Arthur to run. Explainability can be configured to automatically explain every inference, or to calculate explanations only on demand. On-demand explanations can be generated in the UI by selecting an inference and clicking the “Generate Explanation” button on the Inference tab.

“What-If” explanations are on-demand explanations with altered inference data. For example, if an inference was uploaded with a value of 10 for feature A, that value can be temporarily changed to 15 to see how the explainability scores change. You can generate “what-if” explanations in the UI by selecting an inference, clicking one of the attribute values, editing it, and pressing enter. An explanation will be generated for the new values.

The Explainability enrichment exposes some configuration options which are outlined below.

Compatibility

Explainability is supported for all models. However, “what-if” functionality is only available for Tabular models.

Usage

To enable, we advise using the helper function model.enable_explainability(), which provides named parameters and automatically specifies some required settings, such as sdk_version and python_version. Once enabled, you can use the generic functions (model.update_enrichment() or model.update_enrichments()) to update the configuration or disable explainability.

# view configuration
model.get_enrichment(Enrichment.Explainability)

# enable
model.enable_explainability(
    df=X_train.head(50),
    project_directory="/path/to/model_code/",
    requirements_file="example_requirements.txt",
    user_predict_function_import_path="example_entrypoint"
)

# update configuration
config_to_update = {
    'explanation_algo': 'shap',
    'streaming_explainability_enabled': False
}
model.update_enrichment(Enrichment.Explainability, True, config_to_update)

# disable
model.update_enrichment(Enrichment.Explainability, False, {})
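
The enable call above points project_directory at a folder containing your predict function and its dependencies, and user_predict_function_import_path at the module that defines it. The sketch below is illustrative only: the file and model names are hypothetical, and it assumes a scikit-learn classifier saved with joblib.

# example_entrypoint.py (hypothetical contents)
import os
import joblib
import numpy as np

# load the serialized model once, relative to this file
_model_path = os.path.join(os.path.dirname(__file__), "model.joblib")
_sk_model = joblib.load(_model_path)

def predict(input_data):
    # called by Arthur with rows of feature data; returns predicted probabilities
    return _sk_model.predict_proba(np.array(input_data))

The accompanying example_requirements.txt would then list the packages this module imports (for example scikit-learn, joblib, and numpy) so Arthur can recreate the environment.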

When To Provide Required Settings

When first enabling the enrichment (going from disabled to enabled), you will need to include the required configuration settings. Once the enrichment has been enabled, you can update the non-required configuration settings without re-supplying the required fields. When disabling the enrichment, you do not need to pass in any config settings.

Configuration

| Setting | Required | Description |
| --- | --- | --- |
| df | Yes | The dataframe passed to the explainer. Should be similar to, or a subset of, the training data. Typically small, ~50-100 rows. |
| project_directory | Yes | The path to the directory containing your predict function, requirements file, model file, and any other resources needed to support the predict function. |
| user_predict_function_import_path | Yes | The name of the file containing the predict function, without the .py extension. Used to import the predict function. |
| requirements_file | Yes | The name of the file containing the pip requirements for the predict function. |
| python_version | Yes | The Python version to use when executing the predict function. Automatically set to the current Python version when using model.enable_explainability(). |
| sdk_version | Yes | The arthurai version used to make the enable request. Automatically set to the currently installed SDK version when using model.enable_explainability(). |
| explanation_algo | No | The explanation algorithm to use. Valid options are 'lime' or 'shap'. Default value of 'lime'. |
| explanation_nsamples | No | The number of perturbed samples used to generate the explanation. With fewer samples the result is calculated more quickly but may be less robust. It is recommended to use at least 100 samples. Default value of 2000. |
| inference_consumer_score_percent | No | A number between 0.0 and 1.0 that sets the percentage of inferences to compute an explanation score for. Only applicable when streaming_explainability_enabled is set to true. Default value of 1.0 (all inferences explained). |
| streaming_explainability_enabled | No | If true, every inference will have an explanation generated for it. If false, explanations are available on demand only. |
| ignore_dirs | No | List of paths to directories within project_directory that will not be bundled and included with the predict function. Use to prevent including irrelevant code or files in larger directories. |
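
For example, once explainability has been enabled, the optional settings above can be adjusted through the generic update call. A minimal sketch with illustrative values:

# switch to SHAP, use fewer perturbation samples, and explain 20% of inferences
config_to_update = {
    'explanation_algo': 'shap',
    'explanation_nsamples': 1000,
    'streaming_explainability_enabled': True,
    'inference_consumer_score_percent': 0.2
}
model.update_enrichment(Enrichment.Explainability, True, config_to_update)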