Arthur Attribute

Arthur Attribute Types

An Arthur Attribute is configured primarily by setting its stage and its categorical flag. For a more in-depth look at the parameters that can be set on an Arthur Attribute, see the ArthurAttribute class.

Stage

An attribute's Stage refers to where the attribute is used in the model inference process. The possible values for an attribute's stage are defined in arthurai.common.constants.Stage.

  • ModelPipelineInput - Input to the entire model pipeline. This is the Stage most commonly used to represent all model inputs. It contains the base input features that are familiar to the data scientist: the categorical and continuous columns of a tabular dataset.

  • NonInputData - Ancillary data that can be associated with each inference but is not necessarily a direct input to the model. For example, sensitive attributes like age, sex, or race might not be direct model inputs, but will be useful to associate with each prediction.

  • PredictedValue - The predictions produced by the model.

  • GroundTruth - The ground truth (or target) attribute for a model.
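As a rough illustration of how the columns of a dataset map onto the stages listed above, the sketch below uses a local stand-in enum, not the real arthurai.common.constants.Stage, so that it runs without the SDK. The column names and the columns_in_stage helper are hypothetical.

```python
from enum import Enum, auto


class Stage(Enum):
    """Local stand-in mirroring the stage names listed above (not the arthurai enum)."""
    ModelPipelineInput = auto()
    NonInputData = auto()
    PredictedValue = auto()
    GroundTruth = auto()


# Hypothetical columns of a tabular model, each assigned to exactly one stage.
column_stages = {
    "feature_1": Stage.ModelPipelineInput,
    "feature_2": Stage.ModelPipelineInput,
    "age": Stage.NonInputData,  # sensitive attribute, not a direct model input
    "pred_temperature": Stage.PredictedValue,
    "gt_temperature": Stage.GroundTruth,
}


def columns_in_stage(stage):
    """Return the column names assigned to the given stage, in insertion order."""
    return [name for name, s in column_stages.items() if s is stage]


model_inputs = columns_in_stage(Stage.ModelPipelineInput)
```

With the real SDK, the same assignment is expressed through the stage argument when attributes are added to the model.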

Categorical Attributes

Categorical attributes are marked by ArthurAttribute.categorical = True. Attributes in any stage can be marked as categorical. Optionally, a domain of categories (the set of values the attribute can take) can be associated with the ArthurAttribute.
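To make the idea of a category domain concrete, here is a minimal sketch in plain Python that checks observed values against a declared domain. The values_outside_domain helper and the example categories are hypothetical and are not part of the Arthur SDK.

```python
def values_outside_domain(values, categories):
    """Return the distinct values that are not in the declared categories, sorted."""
    allowed = set(categories)
    return sorted({v for v in values if v not in allowed})


# Hypothetical domain for a categorical attribute and some observed values.
categories = ["low", "medium", "high"]
observed = ["low", "high", "high", "unknown", "medium"]

unexpected = values_outside_domain(observed, categories)
```

A check like this can be run on a training set before declaring the domain, so that no legitimate category is left out.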

Continuous Attributes

Continuous attributes are marked by setting the categorical flag to False (ArthurAttribute.categorical = False). Continuous attributes can set explicit minimum and maximum range values via ArthurAttribute.min_range and ArthurAttribute.max_range.
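One way to choose explicit range values is to compute them from a reference dataset. The sketch below does this in plain Python; the observed_range helper and the sample data are hypothetical, and the resulting pair is only a candidate for min_range and max_range.

```python
def observed_range(values):
    """Return (min, max) over the non-missing values, or (None, None) if there are none."""
    present = [v for v in values if v is not None]
    if not present:
        return (None, None)
    return (min(present), max(present))


# Hypothetical continuous column with one missing value.
temperatures = [12.5, None, 30.1, -4.0, 18.2]

min_range, max_range = observed_range(temperatures)
```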

Unique Attributes

Attributes can also be marked as unique. A unique attribute is a special case of a categorical attribute in which each inference has a unique value for the attribute. An example is an ID attribute. These attributes are most commonly in the NonInputData stage.
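Before marking an attribute as unique, it can help to confirm that no value repeats across inferences. The following is a minimal sketch in plain Python; the duplicated_values helper and the sample IDs are hypothetical and not part of the Arthur SDK.

```python
from collections import Counter


def duplicated_values(values):
    """Return the values that occur more than once, sorted."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n > 1)


# Hypothetical ID column; "b2" appears twice, so it is not a valid unique attribute.
inference_ids = ["a1", "b2", "c3", "b2"]

duplicates = duplicated_values(inference_ids)
is_unique_candidate = not duplicates
```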

Creating Arthur Attributes

Setting Output Attributes

Regression Models:

from arthurai.common.constants import ValueType

# map each predicted attribute name to its ground truth attribute name
prediction_to_ground_truth_attribute_map = {
    "pred_temperature": "gt_temperature",
}

# add the ground truth and predicted attributes to the model
arthur_model.add_regression_output_attributes(
    pred_to_ground_truth_map = prediction_to_ground_truth_attribute_map,
    value_type = ValueType.Float
)

Binary Classification Models:

prediction_to_ground_truth_attribute_map = {
    "pred_high_utilization": "gt_high_utilization",
    "pred_low_utilization": "gt_low_utilization"
}

# add the ground truth and predicted attributes to the model
arthur_model.add_binary_classifier_output_attributes(
    positive_predicted_attr = 'pred_high_utilization',
    pred_to_ground_truth_map = prediction_to_ground_truth_attribute_map,
    threshold = 0.5
)

Multiclass Classification Models:

prediction_to_ground_truth_attribute_map = {
    "dog": "dog_gt",
    "cat": "cat_gt",
    "horse": "horse_gt"
}

# add the ground truth and predicted attributes to the model
arthur_model.add_multiclass_classifier_output_attributes(
    pred_to_ground_truth_map = prediction_to_ground_truth_attribute_map
)

Setting Input Attributes

Adding attributes directly to the model:

from arthurai.common.constants import Stage, ValueType

arthur_model.add_attribute(
    name = "attribute_a",
    value_type = ValueType.Float,
    categorical = False,
    stage = Stage.ModelPipelineInput
)

Inferring input attributes from a dataset:

from arthurai.common.constants import Stage
import pandas as pd

training_data = pd.read_csv('./training_data.csv')

# infer the input attributes from the selected feature columns
input_feature_columns = ['feature_1', 'feature_2', 'feature_3', 'feature_4', 'feature_5']
arthur_model.from_dataframe(training_data[input_feature_columns], Stage.ModelPipelineInput)

Updating Arthur Attributes

Arthur attributes can only be updated on a model before the model is saved with ArthurModel.save().

# ArthurAttribute.set() can set one or more of the properties exposed on the ArthurAttribute object
arthur_model.get_attribute(name="attribute_a").set(categorical=False, bins=[None, 18, 55, 65, None])

# or set each property individually

arthur_model.get_attribute(name="attribute_a").categorical = False
arthur_model.get_attribute(name="attribute_a").bins = [None, 18, 55, 65, None]
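The bins value in the snippet above is a list of bucket edges. The sketch below shows one plausible reading of such a list, under two assumptions that this page does not confirm: a None at either end marks an open-ended bucket, and each bucket includes its lower edge. The bin_index helper is hypothetical and not part of the Arthur SDK.

```python
from bisect import bisect_right


def bin_index(value, bins):
    """Return the index of the bucket that contains value.

    Assumes bins starts and ends with None (open-ended buckets) and that
    each bucket includes its lower edge and excludes its upper edge.
    """
    inner_edges = [edge for edge in bins if edge is not None]
    return bisect_right(inner_edges, value)


bins = [None, 18, 55, 65, None]

# Under the assumptions above the buckets are:
#   0: value < 18, 1: 18 <= value < 55, 2: 55 <= value < 65, 3: value >= 65
indices = [bin_index(age, bins) for age in (17, 18, 54, 55, 65, 80)]
```

If the SDK treats edge values differently, only the comparison in bin_index would need to change (for example, bisect_left for buckets that include their upper edge).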