Product DocumentationAPI and Python SDK ReferenceRelease Notes
Schedule a Demo
Product Documentation
Schedule a Demo

Object Detection

Image Object Detection Models within Arthur

Object detection models analyze and classify objects within an image by placing a bounding box over the objects. In Arthur, these models are listed under the object detection model type.

Some common examples of Object Detection are:

  • Animal detection with wildlife cameras
  • Quality control in manufacturing, detecting defective pieces on the factory line

Formatted Data in Arthur

Object detection models require two columns: image input and bounding box output. When onboarding a reference dataset (and setting a model schema), you need to specify the relationship between your prediction and ground truth bounding box columns. Many teams also choose to onboard metadata for the model (i.e. any information you want to track about your inferences) as non-input attributes.

Formatting Bounding Boxes

When using an Object Detection model, bounding boxes should be formatted as lists in the form:

[class_id, confidence, top_left_x, top_left_y, width, height]

The first two components of the bounding box list represent the classification being made within the bounding box. The class_id represents the ID of the class detected within the bounding box, and the confidence represents the % confidence the model has in this prediction (0.0 for completely unconfident and 1.0 for completely confident).

The next four components of the bounding box list represent the location of the bounding box within the image: the top_left_x and top_left_y represent the X and Y pixel coordinates of the top-left corner of the bounding box. These pixel coordinates are calculated from the origin, which is in the top left corner of the image. This means that each coordinate is calculated by counting pixels from the image's left or the top, respectively. The width represents the number of pixels the bounding box covers from left to right, and the height represents the number of pixels the bounding box covers from top to bottom.

Attribute (Image Input)Prediction (bounding boxes)Ground Truth (bounding boxes)Non-Input Attribute (numeric or categorical)
image_1.jpg45.3462.42High School Education
image_2.jpg55.153.2Graduate Degree

Predict Function and Mapping

Teams must provide the relationship between the prediction and ground truth column to onboard their object detection models. This is defined to help set up your Arthur environment to calculate default performance metrics.

## Single Column Ground Truth 
output_mapping = {
  'pred_bounding_box_column':'gt_bounding_box_column'} 

# Build Function for this technique
arthur_model.build(reference_data,
                   pred_to_ground_truth_map=output_mapping 
                   )

Available Metrics

When onboarding object detection models, several default metrics are available to you within the UI. You can learn more about each specific metric in the metrics section of the documentation.

Out-of-the-Box Metrics

The following metrics are automatically available in the UI (out-of-the-box) per class when teams onboard a object detection model. Learn more about these metrics in the Performance Metrics section.

MetricMetric Type
MAPEPerformance
Inference CountIngestion
Average ConfidencePerformance

Drift Metrics

In the platform, drift metrics are calculated compared to a reference dataset. So, once a reference dataset is onboarded for your model, these metrics are available out of the box for comparison. Learn more about these metrics in the Drift and Anomaly section.

Of note, for unstructured data types (like text and image), feature drift is calculated for non-input attributes. The actual input to the model (in this case, image) drift is calculated with multivariate drift to accommodate the multivariate nature/relationships within the data type.

PSIFeature Drift
KL DivergenceFeature Drift
JS DivergenceFeature Drift
Hellinger DistanceFeature Drift
Hypothesis TestFeature Drift
Prediction DriftPrediction Drift
Multivariate DriftMultivariate Drift
Average Raw Anomaly ScoreMultivariate Drift

Note: Teams can evaluate drift for inference data at different intervals with our Python SDK and query service (for example, data coming into the model now compared to a month ago).

User-Defined Metrics

Whether your team uses a different performance metric, wants to track defined data segments, or needs logical functions to create a metric for external stakeholders (like product or business metrics). Learn more about creating metrics with data in Arthur in the User-Defined Metrics section.

Available Enrichments

The following enrichments can be enabled for this model type:

Anomaly DetectionHot SpotsExplainabilityBias Mitigation
X