arthurai.core package

Submodules

arthurai.core.alerts module

class arthurai.core.alerts.Alert(id: str, timestamp: str, metric_value: float, message: str, model_id: str, status: str, alert_rule: arthurai.core.alerts.AlertRule, window_start: Optional[str] = None, window_end: Optional[str] = None, batch_id: Optional[str] = None)

Bases: arthurai.core.base.ArthurBaseJsonDataclass

alert_rule: arthurai.core.alerts.AlertRule
batch_id: Optional[str] = None
id: str
message: str
metric_value: float
model_id: str
status: str
timestamp: str
window_end: Optional[str] = None
window_start: Optional[str] = None
class arthurai.core.alerts.AlertRule(bound: arthurai.core.alerts.AlertRuleBound, threshold: Union[int, float], metric_id: str, severity: arthurai.core.alerts.AlertRuleSeverity, name: Optional[str] = None, lookback_period: Optional[Union[int, float]] = None, subsequent_alert_wait_time: Optional[Union[int, float]] = None, enabled: bool = True, id: Optional[str] = None, metric_name: Optional[str] = None)

Bases: arthurai.core.base.ArthurBaseJsonDataclass

bound: arthurai.core.alerts.AlertRuleBound
enabled: bool = True
id: Optional[str] = None
lookback_period: Optional[Union[int, float]] = None
metric_id: str
metric_name: Optional[str] = None
name: Optional[str] = None
severity: arthurai.core.alerts.AlertRuleSeverity
subsequent_alert_wait_time: Optional[Union[int, float]] = None
threshold: Union[int, float]
class arthurai.core.alerts.AlertRuleBound

Bases: arthurai.common.constants.ListableStrEnum

Lower = 'lower'
Upper = 'upper'
class arthurai.core.alerts.AlertRuleSeverity

Bases: arthurai.common.constants.ListableStrEnum

Critical = 'critical'
Warning = 'warning'
class arthurai.core.alerts.AlertStatus

Bases: arthurai.common.constants.ListableStrEnum

Acknowledged = 'acknowledged'
Resolved = 'resolved'
class arthurai.core.alerts.Metric(id: str, name: str, query: Dict[str, Any], is_default: bool, type: Optional[arthurai.core.alerts.MetricType] = None, attribute: Optional[str] = None)

Bases: arthurai.core.base.ArthurBaseJsonDataclass

attribute: Optional[str] = None
id: str
is_default: bool
name: str
query: Dict[str, Any]
type: Optional[arthurai.core.alerts.MetricType] = None
class arthurai.core.alerts.MetricType

Bases: arthurai.common.constants.ListableStrEnum

ModelDataDriftMetric = 'model_data_drift_metric'
ModelInputDataMetric = 'model_input_data_metric'
ModelOutputMetric = 'model_output_metric'
ModelPerformanceMetric = 'model_performance_metric'

arthurai.core.attributes module

class arthurai.core.attributes.ArthurAttribute(name: str, value_type: arthurai.common.constants.ValueType, stage: arthurai.common.constants.Stage, id: Optional[str] = None, label: Optional[str] = None, position: Optional[int] = None, categorical: Optional[bool] = False, min_range: Optional[Union[int, float]] = None, max_range: Optional[Union[int, float]] = None, monitor_for_bias: bool = False, categories: Optional[List[arthurai.core.attributes.AttributeCategory]] = None, bins: Optional[List[arthurai.core.attributes.AttributeBin]] = None, is_unique: bool = False, is_positive_predicted_attribute: bool = False, attribute_link: Optional[str] = None)

Bases: arthurai.core.base.ArthurBaseJsonDataclass

ArthurAttribute Object encapsulates data associated with a model attribute

Parameters:
  • name (str) – Name of the attribute. Attribute names can only contain alpha-numeric characters and underscores, and cannot start with a number.
  • value_type (ValueType) – arthurai.common.constants.ValueType associated with this attribute’s values
  • stage (Stage) – arthurai.common.constants.Stage of this attribute in the model pipeline
  • label (Optional[str]) – Label for the attribute. If the attribute has an encoded name, a more readable label can be set.
  • position (Optional[int]) – The array position of the attribute within its stage. Required in the PREDICT_FUNCTION_INPUT stage.
  • categorical (Optional[bool]) – Boolean value set to True if the attribute has categorical values.
  • min_range (Union[int, float, None]) – Min value for a continuous attribute
  • max_range (Union[int, float, None]) – Max value for a continuous attribute
  • monitor_for_bias (bool) – boolean value set to True if the attribute should be monitored for bias
  • categories (Optional[List[AttributeCategory]]) – [Only for Categorical Attributes] If the attribute is categorical, this will contain the attribute’s categories. It is required only if the categorical flag is set to True.
  • bins (Optional[List[AttributeBin]]) – List of bin cut-offs used to discretize continuous attributes. Use None as an open-ended value. [None, 18, 65, None] represents the three following bins: value ≤ 18, 18 < value ≤ 65, value > 65.
  • is_unique (bool) – Boolean value used to signal whether the values of this attribute are unique.
  • is_positive_predicted_attribute (bool) – Only applicable for PredictedValue attributes on a Binary Classification model. Should be set to True on the positive predicted value attribute.
  • attribute_link (Optional[str]) – Only applicable for GroundTruth or PredictedValue staged attributes. If stage is equal to GroundTruth, this represents the associated PredictedValue attribute, and vice versa.

bins: Optional[List[arthurai.core.attributes.AttributeBin]] = None
categorical: Optional[bool] = False
categories: Optional[List[arthurai.core.attributes.AttributeCategory]] = None
id: Optional[str] = None
is_positive_predicted_attribute: bool = False
is_unique: bool = False
label: Optional[str] = None
max_range: Optional[Union[int, float]] = None
min_range: Optional[Union[int, float]] = None
monitor_for_bias: bool = False
name: str
position: Optional[int] = None
set(**kwargs)

Set one or many of the available properties of the ArthurAttribute class

stage: arthurai.common.constants.Stage
value_type: arthurai.common.constants.ValueType
class arthurai.core.attributes.AttributeBin(continuous_start: Optional[float] = None, continuous_end: Optional[float] = None)

Bases: arthurai.core.base.ArthurBaseJsonDataclass

A bin for a continuous attribute. An attribute will only have bins if it is not categorical. The bin start is
exclusive and the end is inclusive, (continuous_start, continuous_end]. Use None to represent an open end of a bin.
continuous_end: Optional[float] = None
continuous_start: Optional[float] = None
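
For example, the open-ended cut-offs [None, 18, 65, None] described above could be expressed as explicit bins:

from arthurai.core.attributes import AttributeBin

bins = [
    AttributeBin(continuous_start=None, continuous_end=18),  # value ≤ 18
    AttributeBin(continuous_start=18, continuous_end=65),    # 18 < value ≤ 65
    AttributeBin(continuous_start=65, continuous_end=None),  # value > 65
]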
class arthurai.core.attributes.AttributeCategory(value: str, label: Optional[str] = None)

Bases: arthurai.core.base.ArthurBaseJsonDataclass

A category for a categorical attribute. An attribute will only have categories if it is marked as categorical.

label: Optional[str] = None
value: str

arthurai.core.base module

class arthurai.core.base.ArthurBaseJsonDataclass

Bases: dataclasses_json.api.DataClassJsonMixin

static clean_nones(d)
to_dict(skip_none=True)

Creates a dictionary representation of the attribute object

Return type:dict
Returns:Dictionary object of attribute data
to_json(skip_none=True)
Return type:str
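
For instance, a minimal sketch of these serialization helpers (assuming an Alert object named alert, e.g. retrieved via get_alerts()):

alert_dict = alert.to_dict()                  # None-valued fields are dropped by default
alert_json = alert.to_json(skip_none=False)   # keep None-valued fields in the JSON string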

arthurai.core.data_service module

class arthurai.core.data_service.DatasetService

Bases: object

COUNTS = 'counts'
DEFAULT_MAX_IMAGE_DATA_BYTES = 300000000
FAILURE = 'failure'
FAILURES = 'failures'
SUCCESS = 'success'
TOTAL = 'total'
static chunk_parquet_image_set(directory_path, image_attribute, max_image_data_bytes=300000000)

Takes in a directory path containing parquet files with image attributes. Divides the images into chunks of at most max_image_data_bytes (300MB by default), zips each chunk, and splits the parquet files to match. Parquet files are given random filenames, and the image zips receive matching names.

Return type:str
static convert_dataframe(model_id, stage, df, max_rows_per_file=500000)

Convert a dataframe to parquet named {model.id}-{stage}.parquet in the system tempdir

Parameters:
  • model_id (str) – a model id
  • stage (Optional[Stage]) – the Stage
  • df (DataFrame) – the dataframe to convert
  • max_rows_per_file – the maximum number of rows per parquet file
Returns:
The filename of the parquet file that was created
Return type:str
static files_size(parquet_files, model_input_type)
Return type:int
static send_parquet_files_from_dir_iteratively(model, directory_path, endpoint, upload_file_param_name, additional_form_params=None, retries=0)

Sends parquet files iteratively from a specified directory to a specified url for a given model

Parameters:
  • retries (int) – Number of times to retry the request if it results in a 400 or higher response code
  • model (ArthurModel) – the arthurai.client.apiv2.model.ArthurModel
  • directory_path (str) – local path containing parquet files to send
  • endpoint (str) – POST url endpoint to send files to
  • upload_file_param_name (str) – name to use in body with attached files
  • additional_form_params (Optional[Dict[str, Any]]) – dictionary of additional form file params to send along with parquet file
Raises:

MissingParameterError – the request failed

Returns:A list of files which failed to upload

class arthurai.core.data_service.ImageZipper

Bases: object

add_file(path)
get_zip()

arthurai.core.decorators module

arthurai.core.decorators.log_prediction(arthur_model)

Decorator to log the inputs and prediction of a model to Arthur.

Parameters:arthur_model (ArthurModel) – A previously-saved ArthurModel object

Note that the prediction function to be decorated can take optional arguments for logging; these should be passed as kwargs when calling the decorated function.

Parameters:
  • inference_timestamp – A timestamp in ISO 8601 format
  • partner_inference_id – A unique identifier for an inference
Return type:

Callable[[Any], Any]

Returns:

Tuple of (model_prediction, inference_id)
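
A minimal sketch of decorating a predict function (assuming a saved ArthurModel named arthur_model and a scikit-learn-style classifier sk_model; both names are illustrative):

from arthurai.core.decorators import log_prediction

@log_prediction(arthur_model)
def predict(input_vec):
    # The inputs and the returned prediction are logged to Arthur
    return sk_model.predict_proba([input_vec])[0]

# Optional logging kwargs are passed into the decorated function;
# the call returns a tuple of (model_prediction, inference_id)
prediction, inference_id = predict(
    [1.0, 2.0, 3.0],
    inference_timestamp="2021-06-16T16:52:11Z",
    partner_inference_id="inf-1",
)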

arthurai.core.inferences module

arthurai.core.inferences.add_inference_metadata_to_dataframe(df, model_attributes, ignore_join_errors=False)

Adds timestamp and/or partner_inference_id fields to the DataFrame.

Parameters:
  • df (DataFrame) – the DataFrame to update
  • model_attributes (Collection[ArthurAttribute]) – the model’s attributes
  • ignore_join_errors (bool) – if True, allow inference data without partner_inference_ids or ground truth data
Return type:DataFrame
Returns:the input DataFrame if no updates are needed, otherwise a shallow copy with the new columns
Raise:UserValueError: if inference data is supplied without partner_inference_ids or ground truth data, and ignore_join_errors is False
arthurai.core.inferences.add_predictions_or_ground_truth(inference_data, new_data, attributes, stage)

Add prediction or ground truth data to inference data in place.

Parameters:
  • inference_data (List[Dict[str, Any]]) – the inference data, as a List of Dicts as expected by the Arthur API
  • new_data (Union[List[Dict[str, Any]], Dict[str, List[Any]], DataFrame, Iterable[Any]]) – the new data to add in, as a List of Dicts, Dict of Lists, DataFrame, or Sequence (if a single column)
  • attributes (List[ArthurAttribute]) – the model’s attributes
  • stage (Stage) – the Stage of the new data, either PredictedValue or GroundTruth
Returns:None (modifies inference_data in place)

arthurai.core.inferences.nest_inference_and_ground_truth_data(data, attributes)

Reformat List of Dicts inference data to nest inference and ground truth data as expected by the Arthur API. For example:

[
    {
        "input_attr_1": 1.0,
        "prediction_1": 0.95,
        "inference_timestamp": "2021-06-03T19:44:33.169334+00:00",
        "ground_truth_1": 1,
        "ground_truth_timestamp": "2021-06-03T19:44:56.892019+00:00"
    }
]

Will become:

[
    {
        "inference_data":
        {
            "input_attr_1": 1.0,
            "prediction_1": 0.95
        },
        "ground_truth_data":
        {
            "ground_truth_1": 1
        },
        "inference_timestamp": "2021-06-03T19:44:33.169334+00:00",
        "ground_truth_timestamp": "2021-06-03T19:44:56.892019+00:00"
    }
]
Parameters:
  • data (List[Dict[str, Any]]) – the input data to reformat, either already nested or flat
  • attributes (List[ArthurAttribute]) – the model’s attributes
Return type:

List[Dict[str, Any]]

Returns:

the nested data

arthurai.core.inferences.parse_stage_attributes(data, attributes, stage)

Parses data for a single stage into the standard List of Dicts format. If the stage contains only a single attribute, data can be a single list-like column.

See also

Similar to dataframe_like_to_list_of_dicts, but aware of the expected columns (attributes) and supports single-column inputs.

Parameters:
  • data (Union[List[Dict[str, Any]], Dict[str, List[Any]], DataFrame, Iterable[Any]]) –
  • attributes (List[ArthurAttribute]) –
  • stage (Stage) –
Returns:

parsed data in List of Dicts format

arthurai.core.model_utils module

arthurai.core.model_utils.check_attr_is_bias(model, attr_name)
arthurai.core.model_utils.check_has_bias_attrs(model)
arthurai.core.model_utils.get_positive_predicted_class(model)

Checks whether the model is a binary classifier. Returns False if the model is multiclass; otherwise returns the name of the positive predicted attribute.

arthurai.core.models module

class arthurai.core.models.ArthurModel(partner_model_id: str, input_type: arthurai.common.constants.InputType, output_type: arthurai.common.constants.OutputType, client: dataclasses.InitVar = None, explainability: Optional[arthurai.core.models.ExplainabilityParameters] = None, id: Optional[str] = None, display_name: Optional[str] = None, description: Optional[str] = None, is_batch: bool = False, archived: bool = False, created_at: Optional[str] = None, updated_at: Optional[str] = None, attributes: Optional[List[arthurai.core.attributes.ArthurAttribute]] = None, tags: Optional[List[str]] = None, classifier_threshold: Optional[float] = None, text_delimiter: Optional[arthurai.common.constants.TextDelimiter] = None, expected_throughput_gb_per_day: Optional[int] = None, pixel_height: Optional[int] = None, pixel_width: Optional[int] = None, image_class_labels: Optional[List[str]] = None, reference_dataframe: Optional[pandas.core.frame.DataFrame] = None)

Bases: arthurai.core.base.ArthurBaseJsonDataclass

Arthur Model class represents the metadata which represents a registered model in the application

Parameters:
  • client (Optional[InitVar]) – arthurai.client.Client object which manages data storage
  • partner_model_id (str) – Client-provided unique id to associate with the model. This field must be unique across all active models and cannot be changed once set.
  • input_type (InputType) – arthurai.common.constants.InputType representing the model’s input data type.
  • output_type (OutputType) – arthurai.common.constants.OutputType representing the model’s output data format.
  • explainability (Optional[ExplainabilityParameters]) – arthurai.core.models.ExplainabilityParameters object representing parameters that will be used to create inference explanations.
  • id (Optional[str]) – The auto-generated unique UUID for the model. Will be overwritten if set by the user.
  • display_name (Optional[str]) – An optional display name for the model.
  • description (Optional[str]) – Optional description of the model.
  • is_batch (bool) – Boolean value to determine whether the model sends inferences in batch or streaming format. Defaults to False.
  • archived (bool) – Indicates whether or not a model has been archived, defaults to false.
  • created_at (Optional[str]) – UTC timestamp in ISO8601 format of when the model was created. Will be overwritten if set by the user.
  • updated_at (Optional[str]) – UTC timestamp in ISO8601 format of when the model was last updated. Will be overwritten if set by the user.
  • attributes (Optional[List[ArthurAttribute]]) – List of arthurai.core.attributes.ArthurAttribute objects registered to the model
  • tags (Optional[List[str]]) – List of string keywords to associate with the model.
  • classifier_threshold – Threshold value for classification models, default is 0.5.
  • text_delimiter (Optional[TextDelimiter]) – Only valid for models with input_type equal to arthurai.common.constants.InputType.NLP. Represents the text delimiter to divide input strings.
  • expected_throughput_gb_per_day (Optional[int]) – Expected amount of throughput.
  • pixel_height (Optional[int]) – Only valid for models with input_type equal to arthurai.common.constants.InputType.Image. Expected image height in pixels.
  • pixel_width (Optional[int]) – Only valid for models with input_type equal to arthurai.common.constants.InputType.Image. Expected image width in pixels.
add_attribute(name=None, value_type=None, stage=None, label=None, position=None, categorical=False, min_range=None, max_range=None, monitor_for_bias=False, categories=None, bins=None, is_unique=False, is_positive_predicted_attribute=False, attribute_link=None, arthur_attribute=None, gt_pred_attrs_map=None)

Adds a new attribute to the model and returns the attribute.

Also validates that Stage is not PredictedValue or GroundTruth. Additionally, attribute names must contain only letters, numbers, and underscores, and cannot begin with a number.

Parameters:
  • name (Optional[str]) – Name of the attribute. Attribute names can only contain alpha-numeric characters and underscores, and cannot start with a number.
  • value_type (Optional[ValueType]) – arthurai.common.constants.ValueType associated with this attribute’s values
  • stage (Optional[Stage]) – arthurai.common.constants.Stage of this attribute in the model pipeline
  • label (Optional[str]) – Label for the attribute. If the attribute has an encoded name, a more readable label can be set.
  • position (Optional[int]) – The array position of the attribute within its stage. Required in the PREDICT_FUNCTION_INPUT stage.
  • categorical (bool) – Boolean value set to True if the attribute has categorical values.
  • min_range (Union[int, float, None]) – Min value for a continuous attribute
  • max_range (Union[int, float, None]) – Max value for a continuous attribute
  • monitor_for_bias (bool) – boolean value set to True if the attribute should be monitored for bias
  • categories (Optional[List[Union[str, AttributeCategory]]]) – [Only for Categorical Attributes] If the attribute is categorical, this will contain the attribute’s categories. It is required only if the categorical flag is set to True.
  • bins (Optional[List[Union[int, float, AttributeBin]]]) – List of bin cut-offs used to discretize continuous attributes. Use None as an open-ended value. [None, 18, 65, None] represents the three following bins: value ≤ 18, 18 < value ≤ 65, value > 65.
  • is_unique (bool) – Boolean value used to signal whether the values of this attribute are unique.
  • is_positive_predicted_attribute (bool) – Only applicable for PredictedValue attributes on a Binary Classification model. Should be set to True on the positive predicted value attribute.
  • attribute_link (Optional[str]) – Only applicable for GroundTruth or PredictedValue staged attributes. If stage is equal to GroundTruth, this represents the associated PredictedValue attribute, and vice versa.
  • arthur_attribute (Optional[ArthurAttribute]) – Optional ArthurAttribute to add to the model
  • gt_pred_attrs_map (Optional[Dict]) – Deprecated since version 2.0.0: Use ArthurModel.add_[model_type]_output_attributes() instead
Return type:List[ArthurAttribute]
Returns:ArthurAttribute Object
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
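
For example, a minimal sketch of registering an input attribute (assuming an existing ArthurModel named arthur_model; the attribute name and ranges are illustrative):

from arthurai.common.constants import Stage, ValueType

arthur_model.add_attribute(
    name="age",
    value_type=ValueType.Integer,
    stage=Stage.ModelPipelineInput,
    categorical=False,
    min_range=18,
    max_range=95,
    monitor_for_bias=True,
)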
add_binary_classifier_output_attributes(positive_predicted_attr, pred_to_ground_truth_map, threshold=0.5)

Registers ground truth and predicted attribute parameters and their thresholds.

This function will create a predicted value and ground truth attribute for each mapping specified in pred_to_ground_truth_map.

For binary models, GroundTruth is always an integer, and PredictedAttribute is always a float. Additionally, PredictedAttribute is expected to be a probability (e.g. the output of a scikit-learn model’s predict_proba method), rather than a classification to 0/1.

This assumes that separate columns for predicted values and ground truth values have already been created, and that they have both been broken into two separate (pseudo-onehot) columns: for example, the column ground_truth_label becomes ground_truth_label=0 and ground_truth_label=1, and the column pred_prob becomes pred_prob=0 and pred_prob=1. The pandas function pd.get_dummies() can be useful for reformatting the ground truth column, but be sure that the datatype is specified correctly as an int.

Parameters:
  • positive_predicted_attr (str) – string name of the predicted attribute to register as the positive predicted attribute
  • pred_to_ground_truth_map (Dict[str, str]) – Map of predicted value attributes to their corresponding ground truth attribute names. The names provided in the dictionary will be used to register the one-hot encoded versions of the attributes, for example: {‘pred_0’: ‘gt_0’, ‘pred_1’: ‘gt_1’}. Ensure the ordering of items in this dictionary is an accurate representation of how model predictions (probability vectors) will be generated.
  • threshold (float) – Threshold to use for the classifier model, defaults to 0.5
Return type:

Dict[str, ArthurAttribute]

Returns:

Mapping of added attributes string name -> ArthurAttribute Object

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error
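
For example, a minimal sketch (assuming one-hot predicted and ground truth columns as described above, e.g. created with pd.get_dummies()):

arthur_model.add_binary_classifier_output_attributes(
    positive_predicted_attr="pred_1",
    pred_to_ground_truth_map={"pred_0": "gt_0", "pred_1": "gt_1"},
    threshold=0.5,
)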

add_image_attribute(name=None)

Wraps add_attribute for images

Return type:List[ArthurAttribute]
Returns:ArthurAttribute Object
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
add_multiclass_classifier_output_attributes(pred_to_ground_truth_map)

Registers ground truth and predicted attribute parameters. This function will create a predicted value and ground truth attribute for each mapping specified in pred_to_ground_truth_map.

Parameters:pred_to_ground_truth_map (Dict[str, str]) – Map of predicted value attributes to their corresponding ground truth attribute names. The names provided in the dictionary will be used to register the one-hot encoded version of the attributes. Ensure the ordering of items in this dictionary is an accurate representation of how model predictions (probability vectors) will be generated.
Return type:Dict[str, ArthurAttribute]
Returns:Mapping of added attributes string name -> ArthurAttribute Object
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
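
For example, a sketch with three illustrative class columns:

arthur_model.add_multiclass_classifier_output_attributes(
    pred_to_ground_truth_map={
        "prob_cat": "gt_cat",
        "prob_dog": "gt_dog",
        "prob_bird": "gt_bird",
    }
)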
add_object_detection_output_attributes(predicted_attr_name, gt_attr_name, image_class_labels)

Registers ground truth and predicted value attributes for an object detection model, as well as setting the image class labels.

This function will create a predicted value attribute and ground truth attribute using the names provided, giving each a value type of Bounding Box. Image class labels are also set on the model object. The index of each label in the list should correspond to a class_id the model outputs.

For example, with image_class_labels = [‘cat’, ‘dog’, ‘person’], a bounding box with class_id 0 would have the label ‘cat’, and one with class_id 2 would have the label ‘person’.

Parameters:
  • predicted_attr_name (str) – The name of the predicted value attribute
  • gt_attr_name (str) – The name of the ground truth attribute
  • image_class_labels (List[str]) – The labels for each class the model can predict, ordered by their class_id
Return type:

Dict[str, ArthurAttribute]

Returns:

Mapping of added attributes string name -> ArthurAttribute Object

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error
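
Continuing the example above, a sketch of the corresponding call (attribute names are illustrative):

arthur_model.add_object_detection_output_attributes(
    predicted_attr_name="detected_objects",
    gt_attr_name="label",
    image_class_labels=["cat", "dog", "person"],
)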

add_regression_output_attributes(pred_to_ground_truth_map, value_type)

Registers ground truth and predicted attribute parameters for regression models. This function will register two ArthurAttribute objects to the model, a predicted value attribute and ground truth attribute.

Parameters:
  • pred_to_ground_truth_map (Dict[str, str]) – Map of predicted value attributes to their corresponding ground truth attribute names. The names provided in the dictionary will be used to register the attributes.
  • value_type (ValueType) – Value type of regression model output attribute (usually either ValueType.Integer or ValueType.Float)
Return type:

Dict[str, ArthurAttribute]

Returns:

Mapping of added attributes string name -> ArthurAttribute Object

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error
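
For example, a minimal sketch (column names are illustrative):

from arthurai.common.constants import ValueType

arthur_model.add_regression_output_attributes(
    pred_to_ground_truth_map={"predicted_value": "ground_truth_value"},
    value_type=ValueType.Float,
)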

archive()

Archives the model with a DELETE request.

Returns:the server response
Raise:Exception: the model has no ID, or the model has not been archived
archived: bool = False
attributes: Optional[List[arthurai.core.attributes.ArthurAttribute]] = None
binarize(attribute_value)

Creates a binary class probability based on classes defined in a ModelType.Multiclass model.

Parameters:attribute_value – a mapping of the name of a predicted value attribute to its value
Returns:A two-value dictionary with probabilities for both predicted classes.
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
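
For example, a sketch (assuming a multiclass model with predicted attributes "dog" and "cat"):

# Returns a two-class probability dict, e.g. {"dog": 0.3, "cat": 0.7}
arthur_model.binarize({"dog": 0.3})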
build(data, pred_to_ground_truth_map, positive_predicted_attr=None, non_input_columns=None, set_reference_data=True)

Build a model from a reference DataFrame, inferring the attribute metadata and registering the reference data to be stored with Arthur. Note that this will remove any previously existing attributes.

Combines calls to ArthurModel.infer_schema() and (if set_reference_data is True) ArthurModel.set_reference_data().

Parameters:
  • data (DataFrame) – a reference DataFrame to build the model from
  • pred_to_ground_truth_map (Dict[str, str]) – a mapping from predicted column names to their corresponding ground truth column names
  • positive_predicted_attr (Optional[str]) – name of the predicted attribute to register as the positive predicted attribute
  • non_input_columns (Optional[List[str]]) – list of columns that contain auxiliary data not directly passed into the model
  • set_reference_data – if True, register the provided DataFrame as the model’s reference dataset
Return type:

DataFrame

Returns:

a DataFrame summarizing the inferred types
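
A minimal sketch (assuming a reference DataFrame ref_df whose prediction column is "prediction" and whose label column is "label"; both names are illustrative):

summary_df = arthur_model.build(
    data=ref_df,
    pred_to_ground_truth_map={"prediction": "label"},
    positive_predicted_attr="prediction",
)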

check_attr_is_bias(attr_name)
check_has_bias_attrs()
classifier_threshold: Optional[float] = None
client: dataclasses.InitVar = None
close_batch(batch_id, num_inferences=None)

Closes the specified batch; the number of inferences contained in the batch can optionally be supplied.

Parameters:
  • batch_id (str) – String batch_id associated with the batch that will be closed
  • num_inferences (Optional[int]) – Optional number of inferences that are contained in the batch
Return type:

Dict

Returns:

Response of the batch close rest call

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

create_alert_rule(metric_id, bound, threshold, severity, name=None, lookback_period=None, subsequent_alert_wait_time=None)

Creates an alert rule for the current model.

Parameters:
  • metric_id (str) – UUID of the metric to use to create the alert rule.
  • name (Optional[str]) – A name for the alert rule; a default will be generated if this is not supplied.
  • bound (AlertRuleBound) – Either AlertRuleBound.Upper or AlertRuleBound.Lower
  • threshold (Union[int, float]) – Threshold of the alert rule
  • severity (AlertRuleSeverity) – AlertRuleSeverity of the alert which gets triggered when the metric violates the threshold of the alert rule.
  • lookback_period (Union[int, float, None]) – The lookback time or “window length” in minutes to use when calculating the alert rule metric. For example, a lookback period of 5 minutes for an alert rule on average prediction will calculate the average prediction for the past 5 minutes in a rolling window format. This defaults to 5 minutes.
  • subsequent_alert_wait_time (Union[int, float, None]) – If the metric continues to pass the threshold, this is the time in minutes to wait before triggering another alert. This defaults to 0. This does not need to be set for batch alerts.
Return type:

AlertRule

Returns:

the created alert rule

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error
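
For example, a sketch (assuming a metric UUID obtained from create_metric() or get_metrics()):

from arthurai.core.alerts import AlertRuleBound, AlertRuleSeverity

alert_rule = arthur_model.create_alert_rule(
    metric_id=metric_id,
    bound=AlertRuleBound.Upper,
    threshold=0.75,
    severity=AlertRuleSeverity.Warning,
    name="positive_rate_upper_bound",
)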

create_metric(name, query, is_data_drift=False)

Creates a metric registered to this model and returns the UUID assigned to the newly created metric. This metric can be used to create alert rules on.

Parameters:
  • name (str) – Name of the metric to create.
  • query (Dict[str, Any]) – Query which makes up the metric
  • is_data_drift (bool) – Boolean to signal whether this query is a data drift metric or not.
Return type:str
Returns:UUID of the newly created metric
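
A sketch of creating a metric (the query format follows the examples shown under query() below; the metric name and attribute are illustrative):

query = {
    "select": [
        {
            "function": "rate",
            "alias": "positive_rate",
            "parameters": {
                "property": "predicted_1",
                "comparator": "gt",
                "value": 0.75
            }
        }
    ]
}
metric_id = arthur_model.create_metric(name="positive_rate_gt_0.75", query=query)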

created_at: Optional[str] = None
delete_explainer()

Spin down the model explainability server.

Return type:None
Returns:the server response
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
description: Optional[str] = None
display_name: Optional[str] = None
enable_bias_mitigation()
enable_explainability(df=None, project_directory=None, user_predict_function_import_path=None, streaming_explainability_enabled=True, requirements_file='requirements.txt', python_version=None, sdk_version='3.14.1', model_server_num_cpu=None, model_server_memory=None, model_server_max_replicas=None, inference_consumer_num_cpu=None, inference_consumer_memory=None, inference_consumer_thread_pool_size=None, inference_consumer_score_percent=None, explanation_nsamples=None, explanation_algo=None, ignore_dirs=None)

Enable explainability for this model.

Parameters:
  • df (Optional[DataFrame]) – a dataframe containing the Stage.ModelPipelineInput values for this model. Required for non-image models.
  • project_directory (Optional[str]) – the name of the directory containing the model source code. Required.
  • user_predict_function_import_path (Optional[str]) – the name of the file that implements or wraps the predict function. Required.
  • streaming_explainability_enabled (Optional[bool]) – Defaults to True. Flag to turn on streaming explanations, which will explain every inference sent to the platform. If False, explanations will need to be manually generated for each inference via the Arthur API. Set to False if worried about compute cost.
  • requirements_file (str) – the name of the file that contains the pip requirements (default: requirements.txt)
  • python_version (Optional[str]) – the python version (default: sys.version). Should be in the form of <major>.<minor>
  • sdk_version (str) – the version of the sdk to initialize the model server with
  • model_server_num_cpu (Optional[str]) – string number of CPUs to provide to model server docker container. If not provided, 1 CPU is used. Specified in the format of Kubernetes CPU resources. ‘1’, ‘1.5’, ‘100m’, etc. (default: None)
  • model_server_memory (Optional[str]) – The amount of memory to allocate to the model server docker container. Provided in the format of Kubernetes memory resources, “1Gi” or “500Mi” (default: None).
  • model_server_max_replicas – The max number of model servers to create
  • inference_consumer_num_cpu (Optional[str]) – string number of CPUs to provide to the inference consumer docker container. If not provided, 1 CPU is used. Specified in the format of Kubernetes CPU resources: ‘1’, ‘1.5’, ‘100m’, etc. (default: ‘1’)
  • inference_consumer_memory (Optional[str]) – The amount of memory to allocate to the inference consumer docker container. Provided in the format of Kubernetes memory resources, “1Gi” or “500Mi” (default: ‘1G’).
  • inference_consumer_thread_pool_size – The number of inference consumer workers; this determines how many requests to the model server can be made in parallel. Defaults to 5. If increasing, CPU should be increased as well.
  • inference_consumer_score_percent – What percent of inferences should get scored. Should be a value between 0.0 and 1.0. Default 1.0 (everything is scored).
  • explanation_nsamples (Optional[int]) – number of predictions to use in the explanation. For SHAP the default is 2048 + 2(num features). For LIME, the default is 5000. (default: None)
  • explanation_algo (Optional[str]) – the algorithm to use for explaining inferences. Valid values are ‘lime’ and ‘shap’. Defaults to ‘lime’.
  • ignore_dirs (Optional[List]) – a list of directories within the project_directory that you do not want to include when uploading the model. Paths are relative to project_directory.
Return type:None
Returns:None
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
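
A minimal sketch (assuming a project directory ./model_project containing entrypoint.py that wraps the model’s predict function, a requirements.txt, and a training DataFrame X_train; all names are illustrative):

arthur_model.enable_explainability(
    df=X_train.head(50),
    project_directory="./model_project",
    user_predict_function_import_path="entrypoint",
    requirements_file="requirements.txt",
)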

enable_hotspots()
expected_throughput_gb_per_day: Optional[int] = None
explainability: Optional[arthurai.core.models.ExplainabilityParameters] = None
find_hotspots(metric='accuracy', threshold=0.5, batch_id=None, date=None, ref_set_id=None)

Retrieve hotspots from the model.

Parameters:
  • metric (AccuracyMetric) – accuracy metric used to filter the hotspots tree by, defaults to “accuracy”
  • threshold (float) – threshold of the performance metric used for filtering hotspots, defaults to 0.5
  • batch_id (Optional[str]) – string id for the batch to find hotspots in, defaults to None
  • date (Optional[str]) – string used to define the date, defaults to None
  • ref_set_id (Optional[str]) – string id for the reference set to find hotspots in, defaults to None
Return type:Dict[str, Any]
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
from_dataframe(data, stage)

Auto-generate attributes based on input data.

Deprecated since version 3.12.0: Please use ArthurModel.infer_schema() to add fields from a DataFrame to a model.

Note that this does not automatically set reference data; this method only reads the passed-in data, and then infers attribute names, types, etc. and sets them up within the ArthurModel.

See also

To also set your data as reference data, see ArthurModel.build()

For PredictedValue and GroundTruth stages, use the correct add_<modeltype>_output_attributes() method instead.

Parameters:
  • data (Union[DataFrame, Series]) – the data to infer attribute metadata from
  • stage (Stage) – Stage of the data
Return type:

None

Returns:

a DataFrame summarizing the inferred types

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

get_alert_rules(page=1, page_size=20)

Returns a paginated list of alert rules registered to this model

Parameters:
  • page (int) – page of alert rules to retrieve, defaults to 1
  • page_size (int) – number of alert rules to return per page, defaults to 20
Return type:

List[AlertRule]

Returns:

List of arthurai.client.apiv3.AlertRule objects

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

get_alerts(page=1, page_size=500, status=None, alert_rule_id=None, batch_id=None, start_time=None, end_time=None)

Returns a paginated list of alerts registered to this model.

Parameters:
  • page (int) – page of alerts to retrieve, defaults to 1
  • page_size (int) – number of alerts to return per page, defaults to 500
  • status (Optional[str]) – status of the alerts to retrieve
  • alert_rule_id (Optional[str]) – constrain returned alerts to this alert rule id
  • batch_id (Optional[str]) – constrain returned alerts to this batch id
  • start_time (Optional[str]) – constrain returned alerts to after this time
  • end_time (Optional[str]) – constrain returned alerts to before this time
Return type:

List[Alert]

Returns:

List of arthurai.client.apiv3.Alert objects

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

get_attribute(name, stage=None)

Retrieves an attribute by name and stage

Parameters:
  • name (str) – string name of the attribute to retrieve
  • stage (Optional[Stage]) – Optional Stage of attribute to retrieve
Return type:

ArthurAttribute

Returns:

ArthurAttribute Object

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

get_attribute_names(stage)

Returns a list of all attribute names. If stage is supplied it will only return the attribute names which are in the specified stage.

Parameters:stage (Optional[Stage]) – arthurai.common.constants.Stage to filter by
Return type:List[str]
Returns:List of string attribute names
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
get_attributes(stage)

Returns a list of attributes for the specified stage

Parameters:stage (Optional[Stage]) – arthurai.common.constants.Stage to filter by
Return type:Optional[List[ArthurAttribute]]
Returns:List of arthurai.attributes.ArthurAttribute
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
get_enrichment(enrichment)

Returns configuration for the specified enrichment.

Parameters:enrichment (Enrichment) – Enrichment constant
Return type:Dict[str, Any]
Returns:Enrichment config
{
    "enabled": true,
    "config": {
        "python_version": "3.7",
        "sdk_version": "3.0.11",
        "streaming_explainability_enabled": false,
        "user_predict_function_import_path": "entrypoint",
        "shap_expected_values": "[0.7674405187893311, 0.23255948121066888]",
        "model_server_cpu": "2",
        "model_server_memory": "1Gi",
        "model_server_max_replicas": "5",
        "explanation_nsamples": 1000,
        "explanation_algo": "lime",
        "inference_consumer_cpu": "100m",
        "inference_consumer_memory": "512Mi",
        "inference_consumer_score_percent": "1.0",
        "inference_consumer_thread_pool_size": "1",
        "service_account_id": "8231affb-c107-478e-a1b4-e24e7f1f6619"
    }
}
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
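
For example, a sketch of fetching the explainability config (assuming the Enrichment constants live in arthurai.common.constants):

from arthurai.common.constants import Enrichment

explainability_config = arthur_model.get_enrichment(Enrichment.Explainability)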
get_enrichments()

Returns configuration for all enrichments.

Return type:Dict[str, Any]
Returns:Enrichment configurations in the following format:
{
    "anomaly_detection": {
        "enabled": false
    },
    "explainability": {
        "config": {
            "python_version": "3.7",
            "sdk_version": "3.0.11",
            "streaming_explainability_enabled": false,
            "user_predict_function_import_path": "entrypoint",
            "shap_expected_values": "[0.7674405187893311, 0.23255948121066888]",
            "model_server_cpu": "2",
            "model_server_memory": "1Gi",
            "model_server_max_replicas": "5",
            "explanation_nsamples": 1000,
            "explanation_algo": "lime",
            "inference_consumer_cpu": "100m",
            "inference_consumer_memory": "512Mi",
            "inference_consumer_score_percent": "1.0",
            "inference_consumer_thread_pool_size": "1",
            "service_account_id": "8231affb-c107-478e-a1b4-e24e7f1f6619"
        },
        "enabled": true
    }
}
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
get_image(image_id, save_path, type='raw_image')

Saves the image specified by image_id to a file

Parameters:
  • image_id (str) – id of image in model
  • save_path (str) – directory to save the downloaded image to
  • type (ImageResponseType) – type of response data
Return type:

str

Returns:

location of downloaded image file

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

get_image_attribute()

Returns the attribute with value_type=Image for input_type=Image models

Return type:ArthurAttribute
Returns:ArthurAttribute Object
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
get_metrics(default_metrics=False, metric_type=None, metric_id=None, metric_name=None, attribute_name=None)

Retrieves metrics associated with the current model. Optional filters can be applied via the function parameters.

Parameters:
  • default_metrics (bool) – If set to True, will return only metrics that are automatically created by default for your model
  • metric_type (Optional[MetricType]) – MetricType to filter the metric query with
  • metric_id (Optional[str]) – Metric UUID to use in filtering the metric search
  • metric_name (Optional[str]) – Metric name filter to use in the metric search
  • attribute_name (Optional[str]) – Attribute name filter to use in the metric search
Return type:List[Metric]
Returns:list of metrics returned from the metric search

get_positive_predicted_class()

Checks whether the model is a binary classifier. Returns False if the model is multiclass; otherwise returns the name of the positive predicted attribute.

id: Optional[str] = None
image_class_labels: Optional[List[str]] = None
infer_schema(data, stage)

Auto-generate attributes based on input data.

Note that this does not automatically set reference data; this method only reads the passed-in data, and then infers attribute names, types, etc. and sets them up within the ArthurModel.

See also

To also set your data as reference data, see ArthurModel.set_reference_data().

To infer schemas for all stages and set reference data in a single call, see ArthurModel.build().

For PredictedValue and GroundTruth stages, use the correct add_<modeltype>_output_attributes() method instead.

Parameters:
  • data (Union[DataFrame, Series]) – the data to infer attribute metadata from
  • stage (Stage) – Stage of the data
Return type:

None

Returns:

a DataFrame summarizing the inferred types

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

input_type: arthurai.common.constants.InputType
is_batch: bool = False
model_is_saved()
Return type:bool
one_hot_encode(value)

Creates a one hot encoding of a class label based on classes defined in a ModelType.Multiclass model.

Parameters:value – the ground truth value
Returns:A dictionary with a one hot encoding of possible ground truth values.
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
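
For example, a sketch (assuming a multiclass model with ground truth classes "dog" and "cat"):

# Returns a one-hot dict, e.g. {"dog": 1, "cat": 0}
arthur_model.one_hot_encode("dog")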
output_type: arthurai.common.constants.OutputType
partner_model_id: str
pixel_height: Optional[int] = None
pixel_width: Optional[int] = None
query(body, query_type='base')

Execute query against the model’s inferences. For full description of possible functions, aggregations, and transformations, see https://docs.arthur.ai/api-query-guide/ For queries pertaining to datadrift metrics (‘drift’ or ‘drift_psi_bucket_table’ query types), please see https://docs.arthur.ai/api-query-guide/data_drift.html

Parameters:
  • body (Dict[str, Any]) – dict containing the query request body
  • query_type (str) – Can be either ‘base’, ‘drift’, or ‘drift_psi_bucket_table’
body = {
           "select":[
              {"property":"batch_id"},
              {
                 "function":"count",
                 "alias":"inference_count"
              }
           ],
           "group_by":[
              {"property":"batch_id"}
           ]
        }
body = {
           "select":[,
              {"property":"batch_id"},
              {
                 "function":"rate",
                 "alias":"positive_rate",
                 "parameters":{
                    "property":"predicted_1",
                    "comparator":"gt",
                    "value":0.75
                 }
              }
           ],
           "group_by":[
              {"property":"batch_id"}
           ]
        }
Returns:the query response as documented in https://docs.arthur.ai/api-query-guide/
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
reference_dataframe: Optional[pandas.core.frame.DataFrame] = None
rename_attribute(old_name, new_name, stage)

Renames an attribute by name and stage

Parameters:
  • old_name (str) – string name of the attribute to rename
  • new_name (str) – string new name of the attribute
  • stage (Stage) – Stage of attribute
Return type:

ArthurAttribute

Returns:

ArthurAttribute Object

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

review(stage=None, props=None, print_df=False)

Prints a summary of the properties of all attributes in the model.

Parameters:
  • stage (Optional[Stage]) – restrict the output to a particular Stage (defaults to all stages)
  • props (Optional[List[str]]) – a list of properties to display in the summary; valid properties are data_type, categorical, is_unique, categories, cutoffs, range, monitor_for_bias, position (defaults to data_type, categorical, is_unique)
  • print_df – boolean value whether to print df or return it, defaults to False
Return type:

Optional[DataFrame]

Returns:

a DataFrame summarizing the inferred types; or None if print_df is True

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

save()

Check and save this model. Sets and returns the ID on successful upload.

Return type:str
Returns:The model id
Raise:Exception: the model has already been saved
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
send_batch_ground_truths(directory_path=None, data=None)

Deprecated since version 3.10.0: Please use ArthurModel.send_bulk_ground_truths() for both streaming and batch data.

Parameters:
  • directory_path (Optional[str]) – file path to a directory of parquet files containing ground truth data
  • data (Union[DataFrame, Series, None]) – a DataFrame or Series containing the ground truth data
Returns:

Upload status response in the following format:

{
    "counts": {
        "success": 1,
        "failure": 0,
        "total": 1
    },
    "results": [
        {
            "message": "success",
            "status": 200
        }
    ]
}

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

send_batch_inferences(batch_id, directory_path=None, data=None, complete_batch=True)

Deprecated since version 3.10.0: Use ArthurModel.send_inferences() to send batch or streaming data synchronously (recommended fewer than 100,000 rows), or ArthurModel.send_bulk_inferences() to send many inferences or Parquet files.

Parameters:
  • batch_id (Optional[str]) – string id for the batch to upload; if supplied, this will override any batch_id column specified in the provided dataset
  • data (Optional[DataFrame]) – a DataFrame containing the inference data.
  • directory_path (Optional[str]) – file path to a directory of parquet files containing inference data
  • complete_batch (bool) – Defaults to true and will automatically close a batch once it is sent
Returns:

A tuple of the batch upload response and the close batch response.

The batch upload response is in the following format:

{
    "counts": {
        "success": 0,
        "failure": 0,
        "total": 0
    },
    "failures": []
}
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
send_bulk_ground_truths(directory_path=None, data=None)

Uploads a DataFrame or a directory of parquet files to the Arthur bulk ground truth ingestion endpoint.

Parameters:
  • directory_path (Optional[str]) – file path to a directory of parquet files containing ground truth data. Required if data is not provided, and cannot be populated if data is provided.
  • data (Union[DataFrame, Series, None]) – a DataFrame or Series containing the ground truth data. Required if directory_path is not provided, and cannot be populated if directory_path is provided.
Returns:

Upload status response in the following format:

{
    "counts": {
        "success": 0,
        "failure": 0,
        "total": 0
    },
    "results": [
        {
            "message": "success",
            "status": 200
        }
    ]
}

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

send_bulk_inferences(batch_id=None, directory_path=None, data=None, complete_batch=True, ignore_join_errors=False)

Validates and uploads parquet files containing columns for inference data, partner_inference_id, inference_timestamp, and optionally a batch_id. Either directory_path or data must be specified.

See also

To send ground truth for your inferences, see ArthurModel.send_bulk_ground_truths()

The columns for predicted attributes should follow the column format specified in add_<modeltype>_classifier_output_attributes(). Additionally, partner_inference_id must be specified for all inferences unless ignore_join_errors is True.

Parameters:
  • batch_id (Optional[str]) – string id for the batch to upload; if supplied, this will override any batch_id column specified in the provided dataset
  • directory_path (Optional[str]) – file path to a directory of parquet files containing inference data. Required if data is not provided, and cannot be populated if data is provided.
  • data (Optional[DataFrame]) – a DataFrame or Series containing the inference data. Required if directory_path is not provided, and cannot be populated if directory_path is provided.
  • complete_batch (bool) – Defaults to true and will automatically close a batch once it is sent
  • ignore_join_errors (bool) – if True, allow inference data without `partner_inference_id`s or ground truth data
Returns:

A tuple of the batch upload response and the close batch response.

The batch upload response is in the following format:

{
    "counts": {
        "success": 1,
        "failure": 0,
        "total": 1
    },
    "results": [
        {
            "message": "success",
            "status": 200
        }
    ]
}
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
send_inference(inference_timestamp, partner_inference_id='', model_pipeline_input=None, non_input_data=None, predicted_value=None, ground_truth=None)

Uploads an inference with or without ground truth; this sends a single inference.

All inferences should follow the column format specified in add_<modeltype>_classifier_output_attributes(). Additionally, partner_inference_id and inference_timestamp must be provided.

Parameters:
  • inference_timestamp (Union[str, datetime]) – timestamp for inference to send; generated by external partner (not Arthur)
  • partner_inference_id (str) – an external id (partner_inference_id) to assign to the inferences
  • model_pipeline_input – a mapping of the name of pipeline input attributes to their value
  • non_input_data – a mapping of the name of non-input data attributes to their value
  • predicted_value – a mapping of the name of predicted value attributes to their value
  • ground_truth – a mapping of the name of ground truth attributes to their value
Returns:

Upload status response in the following format:

{
    "counts": {
        "success": 0,
        "failure": 0,
        "total": 0
    },
    "results": [
        {
            "message": "success",
            "status": 200
        }
    ]
}

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

send_inferences(inferences, predictions=None, inference_timestamps=None, ground_truths=None, ground_truth_timestamps=None, partner_inference_ids=None, batch_id=None, fail_silently=False, complete_batch=True)

Send inferences to the Arthur API. The inferences parameter may contain all the inference data, or only the input data if predictions and metadata are supplied separately. At a minimum, input data and predictions should be passed in: partner_inference_id, inference_timestamp, and (if ground truth data is supplied) ground_truth_timestamp fields are required by the Arthur API, but these will be generated if not supplied.

See also

To send large amounts of data or Parquet files, see ArthurModel.send_bulk_inferences()

Examples:

An input dataframe and predicted probabilities array, leaving the partner inference IDs and timestamps to be auto-generated:

input_df = pd.DataFrame({"input_attr": [2]})
pred_array = my_sklearn_model.predict_proba(input_df)
arthur_model.send_inferences(input_df, predictions=pred_array, batch_id='batch1')

All data in the inferences parameter in the format expected by the Arthur API:

inference_data = [
    {
        "inference_timestamp": "2021-06-16T16:52:11Z",
        "partner_inference_id": "inf1",
        "batch_id": "batch1",
        "inference_data": {
            "input_attr": 2,
            "predicted_attr": 0.6
        },
        "ground_truth_timestamp": "2021-06-16T16:53:45Z",
        "ground_truth_data": {
            "ground_truth_attr": 1
        }
    }
]
arthur_model.send_inferences(inference_data)

A list of dicts without nested inference_data or ground_truth fields:

inference_data = [
    {
        "inference_timestamp": "2021-06-16T16:52:11Z",
        "partner_inference_id": "inf1",
        "batch_id": "batch1",
        "input_attr": 2,
        "predicted_attr": 0.6,
        "ground_truth_timestamp": "2021-06-16T16:53:45Z",
        "ground_truth_attr": 1
    }
]
arthur_model.send_inferences(inference_data)
Parameters:
  • inferences (Union[List[Dict[str, Any]], Dict[str, List[Any]], DataFrame]) – inference data to send, containing at least input values and optionally predictions, ground truth, timestamps, partner inference ids, and batch IDs.
  • predictions (Union[List[Dict[str, Any]], Dict[str, List[Any]], DataFrame, Sequence[Any], None]) – the model predictions, in a table-like format for one or more columns or a list-like format if the model has only one predicted column. overwrites any predictions supplied in the inferences parameter
  • inference_timestamps (Optional[Sequence[Union[datetime, str]]]) – the inference timestamps, in a list-like format as ISO-8601 strings or datetime objects. if no timestamps are supplied in inferences or this parameter, they will be generated from the current time. overwrites any timestamps in the inferences parameter.
  • ground_truths (Union[List[Dict[str, Any]], Dict[str, List[Any]], DataFrame, Sequence[Any], None]) – the optional ground truth data (true labels), in a table-like format for one or more columns or a list-like format if the model has only one ground truth column. overwrites any ground truth values supplied in the inferences parameter
  • ground_truth_timestamps (Optional[Sequence[Union[datetime, str]]]) – the ground truth timestamps, in a list-like format as ISO-8601 strings or datetime objects. if no ground truth timestamps are supplied in inferences or this parameter but ground truth data is supplied, they will be generated from the current time. overwrites any timestamps in the inferences parameter.
  • partner_inference_ids (Optional[Sequence[str]]) – partner_inference_ids to be attached to these inferences, which can be used to send ground truth data later or retrieve specific inferences, in a list-like format containing strings. if no partner_inference_ids are supplied in inference or this parameter, they will be auto-generated.
  • batch_id (Optional[str]) – a single batch ID to use for all inferences supplied. overwrites any batch IDs in the inferences parameter
  • fail_silently (bool) – if True, log failed inferences but do not raise an exception
  • complete_batch (bool) – if True, mark all batches in this dataset as completed
Returns:

Upload status response in the following format:

{
  "counts": {
    "success": 1,
    "failure": 0,
    "total": 1
  },
  "results": [
    {
      "partner_inference_id" "inf-id",
      "message": "success",
      "status": 200
    }
  ]
}

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

set_attribute_as_sensitive(attribute_name, attribute_stage=None)

Sets the passed-in attribute to be sensitive by setting attr.monitor_for_bias = True.

You will need to call self.save() or self.update() after this method; we do not automatically call the API in this method.

Parameters:
  • attribute_name (str) – Name of attribute to set as sensitive.
  • attribute_stage (Optional[Stage]) – Stage of attribute to set as sensitive.
Return type:

None

Returns:

None

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error
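
For example, a minimal sketch (the attribute name "age" is hypothetical; substitute one of your model's input attributes):

from arthurai.common.constants import Stage

# mark the attribute as sensitive locally, then persist the change with the API
arthur_model.set_attribute_as_sensitive("age", attribute_stage=Stage.ModelPipelineInput)
arthur_model.update()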

set_attribute_labels(attribute_name, labels, attribute_stage=None)

Sets labels for individual categories of a specific attribute.

Parameters:
  • attribute_name (str) – name of the attribute whose category labels to set
  • labels (Dict[Union[int, str], str]) – dictionary where the key is the categorical value and the value is its string label
  • attribute_stage (Optional[Stage]) – optional stage of the attribute being updated
Returns:None
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
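
For example, a minimal sketch (the attribute name and labels are hypothetical):

# map each categorical value to a human-readable label
arthur_model.set_attribute_labels(
    "education_level",
    labels={0: "High School", 1: "Undergraduate", 2: "Graduate"},
)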

set_predict_function_input_order(attributes)

Sets the expected order of attributes used by the prediction function.

Parameters:attributes (List[str]) – a list of attribute names
Return type:None
Returns:None
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
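
For example, a minimal sketch (the attribute names are hypothetical):

# the order must match the positional inputs of your predict function
arthur_model.set_predict_function_input_order(["age", "income", "credit_score"])
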
set_reference_data(directory_path=None, data=None)

Validates and sets the reference data for the given stage to the provided data.

Either directory_path or data must be provided. Additionally, there must be one column per ModelPipelineInput and NonInput attribute.

For Image models, the image file path should be included as the image attribute value, in either the parquet files specified by directory_path or the DataFrame provided.

Parameters:
  • directory_path (Optional[str]) – file path to a directory of parquet files to upload for batch data
  • data (Union[DataFrame, Series, None]) – a DataFrame or Series containing the reference data
Returns:

A tuple where the first element is the response from sending the reference set and the second is the response from closing the dataset.

Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
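
For example, a minimal sketch (the DataFrame contents are hypothetical; one column per ModelPipelineInput and NonInput attribute):

import pandas as pd

# hypothetical reference set drawn from training data
X_train = pd.DataFrame({"age": [25, 40, 31], "income": [40000, 85000, 62000]})
# returns the tuple described above: (send response, close response)
responses = arthur_model.set_reference_data(data=X_train)
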
tags: Optional[List[str]] = None
text_delimiter: Optional[arthurai.common.constants.TextDelimiter] = None
update()

Update the current model object

Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
update_alert(status, alert_id)

Updates an alert to have a particular status.

Parameters:
  • status (AlertStatus) – one of “resolved” or “acknowledged”
  • alert_id (str) – alert id
Return type:

Alert

Returns:

updated alert object

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error
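
For example, a minimal sketch (the alert ID is hypothetical; real IDs come from fetched Alert objects):

from arthurai.core.alerts import AlertStatus

# resolve an alert by ID; the updated Alert object is returned
updated_alert = arthur_model.update_alert(AlertStatus.Resolved, alert_id="alert-123")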

update_alert_rule(alert_rule, alert_rule_id=None)

Updates the fields included in the alert_rule object for the specified alert rule. If the id field is present in the alert_rule parameter, it is used to identify the rule; otherwise alert_rule_id must be supplied.

Parameters:
  • alert_rule (AlertRule) – Object which contains fields to update on the specified alert rule
  • alert_rule_id (Optional[str]) – If the alert rule id is not specified in the alert_rule object, this must be provided to determine which alert rule to update.
Returns:

The updated alert rule object

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error
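
For example, a minimal sketch (the metric and alert rule IDs are hypothetical):

from arthurai.core.alerts import AlertRule, AlertRuleBound, AlertRuleSeverity

# new field values for the existing rule identified by alert_rule_id
rule_update = AlertRule(
    bound=AlertRuleBound.Upper,
    threshold=0.2,
    metric_id="metric-abc",
    severity=AlertRuleSeverity.Warning,
)
arthur_model.update_alert_rule(rule_update, alert_rule_id="rule-123")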

update_enrichment(enrichment, enabled=None, config=None)

Update the configuration for a single enrichment.

Parameters:
  • enrichment (Enrichment) – the enrichment to update
  • enabled (Optional[bool]) – whether the enrichment should be enabled or disabled
  • config (Optional[Dict[str, Any]]) – the configuration for the enrichment, None by default
Return type:Dict[str, Any]
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
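
For example, a minimal sketch (assuming the Enrichment.AnomalyDetection member is available in your installed SDK version):

from arthurai.common.constants import Enrichment

# enable anomaly detection with its default configuration
arthur_model.update_enrichment(Enrichment.AnomalyDetection, enabled=True)
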
update_enrichments(enrichment_configs)

Update the configuration for 1 or more enrichments. See the enrichments guide at http://docs.arthur.ai/guides/enrichments.html.

Parameters:enrichment_configs (Dict[Union[str, Enrichment], Any]) –

Dict containing the configuration for each enrichment, for example:

{
    "anomaly_detection": {
        "enabled": false
    },
    "explainability": {
        "config": {
            "streaming_explainability_enabled": false,
            "explanation_nsamples": 1000,
            "explanation_algo": "lime",
            "inference_consumer_score_percent": "1.0"
        },
        "enabled": true
    },
    "hotspots": {
        "enabled": false
    }
}

Return type:Dict[str, Any]
Returns:the resulting enrichments configuration
Raise:ArthurUserError: failed due to user error
Raise:ArthurInternalError: failed due to an internal error
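
As a Python call, the configuration above might look like the following sketch:

# Python-dict form of the JSON configuration shown above
enrichment_configs = {
    "anomaly_detection": {"enabled": False},
    "explainability": {
        "config": {
            "streaming_explainability_enabled": False,
            "explanation_nsamples": 1000,
            "explanation_algo": "lime",
            "inference_consumer_score_percent": "1.0",
        },
        "enabled": True,
    },
    "hotspots": {"enabled": False},
}
result = arthur_model.update_enrichments(enrichment_configs)
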
update_inference_ground_truths(ground_truths, partner_inference_ids=None, ground_truth_timestamps=None, fail_silently=False)

Updates inferences with the supplied ground truth values.

The ground_truths parameter may contain all the required data, or only the data for the attributes from Stage.GroundTruth if metadata is supplied separately. At a minimum, Stage.GroundTruth attribute data and a partner_inference_id should be passed in: the IDs may accompany the attribute data in the ground_truths parameter, or be supplied via the partner_inference_ids parameter. Additionally, a ground_truth_timestamp field is required by the Arthur API, but it will be generated if not supplied.

See also

To send large amounts of data or Parquet files, see ArthurModel.send_bulk_ground_truths()

Examples:

A DataFrame containing all required values:

y_test = [1, 0, 1]
existing_inference_ids = [f"batch_1-inf_{i}" for i in len(y_test)]
ground_truth_df = pd.DataFrame({"ground_truth_positive_labels": y_test,
                                "ground_truth_negative_labels": 1 - y_test,
                                "partner_inference_id": existing_inference_ids})
arthur_model.update_inference_ground_truths(ground_truth_df)

A single list of values, with partner_inference_ids supplied separately:

y_test = [14.3, 19.6, 15.7]
existing_inference_ids = [f"batch_1-inf_{i}" for i in len(y_test)]
arthur_model.update_inference_ground_truths(y_test, partner_inference_ids=existing_inference_ids)

All data in the ground_truths parameter in the format expected by the Arthur API:

ground_truth_data = [
    {
        "partner_inference_id": "inf1",
        "ground_truth_timestamp": "2021-06-16T16:53:45Z",
        "ground_truth_data": {
            "ground_truth_attr": 1
        }
    }
]
arthur_model.update_inference_ground_truths(ground_truth_data)

A list of dicts without nested ground_truth fields:

ground_truth_data = [
    {
        "partner_inference_id": "inf1",
        "ground_truth_timestamp": "2021-06-16T16:53:45Z",
        "ground_truth_attr": 1
    }
]
arthur_model.update_inference_ground_truths(ground_truth_data)
Parameters:
  • ground_truths (Union[List[Dict[str, Any]], Dict[str, List[Any]], DataFrame, Sequence[Any]]) – ground truth data to send, containing at least values for the ground truth attributes, and optionally `ground_truth_timestamp`s and `partner_inference_id`s.
  • partner_inference_ids (Optional[Sequence[str]]) – partner_inference_ids for the existing inferences to be updated, in a list-like format as strings. Required if not a field in ground_truths.
  • ground_truth_timestamps (Optional[Sequence[Union[datetime, str]]]) – the ground truth timestamps, in a list-like format as ISO-8601 strings or datetime objects. If no ground truth timestamps are supplied in ground_truths or this parameter, they will be generated from the current time. Overwrites any timestamps in the ground_truths parameter.
  • fail_silently (bool) – if True, log failed inferences but do not raise an exception.
Returns:

Upload status response in the following format:

{
    "counts": {
        "success": 1,
        "failure": 0,
        "total": 1
    },
    "results": [
        {
            "message": "success",
            "status": 200
        }
    ]
}

Raise:

ArthurUserError: failed due to user error

Raise:

ArthurInternalError: failed due to an internal error

updated_at: Optional[str] = None
class arthurai.core.models.ExplainabilityParameters(enabled: bool, explanation_algo: Union[str, NoneType] = None, model_server_cpu: Union[str, NoneType] = None, model_server_memory: Union[str, NoneType] = None, explanation_nsamples: Union[int, NoneType] = None)

Bases: arthurai.core.base.ArthurBaseJsonDataclass

enabled: bool
explanation_algo: Optional[str] = None
explanation_nsamples: Optional[int] = None
model_server_cpu: Optional[str] = None
model_server_memory: Optional[str] = None

arthurai.core.util module

class arthurai.core.util.NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: json.encoder.JSONEncoder

Special json encoder for numpy types

static convert_value(obj)

Converts the given object from a numpy data type to a python data type; if the object is already a python data type, it is returned unchanged

Parameters:obj – object to convert
Returns:python data type version of the object
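
For example, a minimal sketch (assuming the encoder handles the numpy scalar types shown):

import json
import numpy as np
from arthurai.core.util import NumpyEncoder

# numpy scalars such as np.int64 are not JSON-serializable with the default encoder
payload = {"score": np.float32(0.87), "label": np.int64(1)}
print(json.dumps(payload, cls=NumpyEncoder))
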
arthurai.core.util.dataframe_like_to_list_of_dicts(data)

Standardize data in a List of Dicts format (e.g. [{‘a’: 1, ‘b’: 2}, {‘a’: 3, ‘b’: 4}]). Input can be formatted as a List of Dicts, Dict of Lists (e.g. {‘a’: [1, 3], ‘b’: [2, 4]}), or a Pandas DataFrame. May return the same object as input if it already matches the correct format.

Parameters:data (Union[List[Dict[str, Any]], Dict[str, List[Any]], DataFrame]) – the input data to format
Returns:the data restructured as a List of Dicts
Raise:UserTypeError: if the data is in an unexpected format
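
For example (input and output mirror the formats described above):

from arthurai.core.util import dataframe_like_to_list_of_dicts

# a Dict of Lists is restructured into a List of Dicts
rows = dataframe_like_to_list_of_dicts({"a": [1, 3], "b": [2, 4]})
print(rows)  # [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]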

arthurai.core.util.intersection_is_non_empty(iter1, iter2)

Returns True if the two iterables share at least one element.

Parameters:
  • iter1 (Iterable[Any]) – the first iterable
  • iter2 (Iterable[Any]) – the second iterable
Returns:True if the iterables share at least one element, otherwise False
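
For example:

from arthurai.core.util import intersection_is_non_empty

print(intersection_is_non_empty({"a", "b"}, {"b", "c"}))  # True
print(intersection_is_non_empty([1, 2], [3, 4]))          # False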

arthurai.core.util.retrieve_parquet_files(directory_path)

Checks whether a given directory and its subdirectories contain parquet files; if so, returns a list of those files

Parameters:directory_path (str) – local path to check for parquet files
Return type:List[Path]
Returns:List of paths for parquet files that are found
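
For example, a minimal sketch (the directory path is hypothetical):

from arthurai.core.util import retrieve_parquet_files

# recursively collect parquet file paths under a local directory
parquet_paths = retrieve_parquet_files("./reference_data")
print([p.name for p in parquet_paths])
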
arthurai.core.util.standardize_pd_obj(data, dropna, replacedatetime, attributes=None)

Standardize pandas object for nans and datetimes.

Standardization includes casting int columns that are float due to nans back to the correct type, and converting datetime objects into isoformatted strings.

Parameters:
  • data (Union[DataFrame, Series]) – the pandas data to standardize
  • dropna (bool) – if True, drop nans from numeric data columns
  • replacedatetime (bool) – if True, replace timestamps with isoformatted strings
  • attributes (Optional[Dict[str, Union[str, ValueType]]]) – if used for sending inferences, will handle column type conversions for columns with any nulls
Return type:

Union[DataFrame, Series]

Returns:

the standardized pandas data

Raise:

TypeError: timestamp is not of type datetime.datetime

Raise:

ValueError: timestamp is not timezone aware and no location data is provided to remedy
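
For example, a minimal sketch (the expected result is inferred from the description above, not verified output):

import pandas as pd
from arthurai.core.util import standardize_pd_obj

# a column stored as float only because of the nan
series = pd.Series([1.0, 2.0, None])
standardized = standardize_pd_obj(series, dropna=True, replacedatetime=False)
# expected: the nan is dropped and the remaining values are cast back to int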

arthurai.core.util.update_column_in_list_of_dicts(data, target_column, column_values)

Adds column_values to target_column in a list of dicts in place. If values are present for target_column, they are overwritten.

Parameters:
  • data (List[Dict[str, Any]]) – the List of Dict data to modify
  • target_column (str) – the name of the column to write values into
  • column_values (Sequence[Any]) – the values to write
Return type:None
Returns:None (in place)
Raise:UserValueError: if the lengths don’t match or aren’t retrievable
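
For example (mirroring the List of Dicts format used above):

from arthurai.core.util import update_column_in_list_of_dicts

data = [{"a": 1}, {"a": 3}]
# writes one value per dict into column "b", in place
update_column_in_list_of_dicts(data, "b", [2, 4])
print(data)  # [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]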

Module contents