Object Detection
Image Object Detection Models within Arthur
Object detection models locate and classify objects within an image by placing a bounding box around each detected object. In Arthur, these models are listed under the object detection model type.
Some common examples of Object Detection are:
- Animal detection with wildlife cameras
- Quality control in manufacturing, detecting defective pieces on the factory line
Formatted Data in Arthur
Object detection models require two columns: image input and bounding box output. When onboarding a reference dataset (and setting a model schema), you need to specify the relationship between your prediction and ground truth bounding box columns. Many teams also choose to onboard metadata for the model (i.e. any information you want to track about your inferences) as non-input attributes.
Formatting Bounding Boxes
When using an Object Detection model, bounding boxes should be formatted as lists in the form:
[class_id, confidence, top_left_x, top_left_y, width, height]
The first two components of the bounding box list represent the classification being made within the bounding box. The class_id is the ID of the class detected within the bounding box, and the confidence is the model's confidence in that prediction, from 0.0 (completely unconfident) to 1.0 (completely confident).
The next four components of the bounding box list represent the location of the bounding box within the image: top_left_x and top_left_y are the X and Y pixel coordinates of the bounding box's top-left corner. These coordinates are measured from the origin at the top-left corner of the image, so each coordinate counts pixels from the image's left or top edge, respectively. The width is the number of pixels the bounding box covers from left to right, and the height is the number of pixels it covers from top to bottom.
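For example, a detection of class 2 with confidence 0.88 whose box starts 15 pixels from the left and 30 pixels from the top, spanning 100 pixels across and 50 pixels down, looks like this (a minimal sketch with illustrative values):

# One detected object: [class_id, confidence, top_left_x, top_left_y, width, height]
bounding_box = [2, 0.88, 15, 30, 100, 50]
class_id, confidence, x, y, w, h = bounding_box

# The origin is the image's top-left corner, so the box's bottom-right
# corner sits width pixels right and height pixels down from (x, y)
bottom_right = (x + w, y + h)  # (115, 80)

# A bounding box column holds a list of such boxes, one per detected object
predicted_boxes = [
    [2, 0.88, 15, 30, 100, 50],
    [0, 0.61, 210, 44, 67, 90],
]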
Attribute (Image Input) | Prediction (bounding boxes) | Ground Truth (bounding boxes) | Non-Input Attribute (numeric or categorical) |
---|---|---|---|
image_1.jpg | [[2, 0.88, 15, 30, 100, 50]] | [[2, 1.0, 12, 28, 104, 56]] | High School Education |
image_2.jpg | [[0, 0.61, 210, 44, 67, 90]] | [[0, 1.0, 205, 40, 70, 95]] | Graduate Degree |
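As a rough sketch, a reference dataset in this shape could be assembled with pandas before onboarding; the column names here are illustrative (chosen to match the mapping example below), not required by Arthur:

import pandas as pd

# Each bounding box cell holds a list of boxes:
# [class_id, confidence, top_left_x, top_left_y, width, height]
reference_data = pd.DataFrame({
    "image": ["image_1.jpg", "image_2.jpg"],
    "pred_bounding_box_column": [
        [[2, 0.88, 15, 30, 100, 50]],
        [[0, 0.61, 210, 44, 67, 90]],
    ],
    "gt_bounding_box_column": [
        [[2, 1.0, 12, 28, 104, 56]],
        [[0, 1.0, 205, 40, 70, 95]],
    ],
    # Non-input attribute: metadata tracked alongside each inference
    "education_level": ["High School Education", "Graduate Degree"],
})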
Predict Function and Mapping
Teams must provide the relationship between the prediction and ground truth columns to onboard their object detection models. This mapping lets Arthur set up your environment to calculate default performance metrics.
Single Column Ground Truth
# Map the prediction column to its corresponding ground truth column
output_mapping = {'pred_bounding_box_column': 'gt_bounding_box_column'}

# Register the reference dataset and mapping with the model
arthur_model.build(reference_data,
                   pred_to_ground_truth_map=output_mapping)
Available Metrics
When onboarding object detection models, several default metrics are available to you within the UI. You can learn more about each specific metric in the metrics section of the documentation.
Out-of-the-Box Metrics
The following metrics are automatically available in the UI (out of the box), per class, when teams onboard an object detection model. Learn more about these metrics in the Performance Metrics section.
Metric | Metric Type |
---|---|
Mean Average Precision (mAP) | Performance |
Inference Count | Ingestion |
Average Confidence | Performance |
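Mean average precision is built on matching predicted boxes to ground truth boxes by overlap. As an illustrative sketch (not Arthur's internal implementation), intersection over union (IoU) for two boxes in the format above can be computed like this:

def iou(box_a, box_b):
    """Intersection over union of two [class_id, confidence, x, y, w, h] boxes."""
    _, _, ax, ay, aw, ah = box_a
    _, _, bx, by, bw, bh = box_b

    # Overlap rectangle, clamped to zero when the boxes do not intersect
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter_area = inter_w * inter_h

    union_area = aw * ah + bw * bh - inter_area
    return inter_area / union_area if union_area > 0 else 0.0

# A prediction typically counts as a true positive when its IoU with a
# same-class ground truth box clears a threshold such as 0.5
print(iou([2, 0.88, 15, 30, 100, 50], [2, 1.0, 12, 28, 104, 56]))  # ~0.86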
Drift Metrics
In the platform, drift metrics are calculated against a reference dataset, so once a reference dataset is onboarded for your model, these metrics are available out of the box. Learn more about these metrics in the Drift and Anomaly section.
Of note, for unstructured data types (like text and image), feature drift is calculated for non-input attributes only. Drift for the model's actual input (in this case, the image) is calculated as multivariate drift, to accommodate the multivariate nature of and relationships within the data type.
Metric | Metric Type |
---|---|
PSI | Feature Drift |
KL Divergence | Feature Drift |
JS Divergence | Feature Drift |
Hellinger Distance | Feature Drift |
Hypothesis Test | Feature Drift |
Prediction Drift | Prediction Drift |
Multivariate Drift | Multivariate Drift |
Average Raw Anomaly Score | Multivariate Drift |
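To make the feature drift scores above concrete, population stability index (PSI) compares the binned distribution of a feature in the reference set against recent inferences. A minimal sketch of the standard formula, assuming simple binning derived from the reference data (illustrative only, not Arthur's implementation):

import numpy as np

def psi(reference, inference, bins=10):
    """Population stability index between two samples of one feature."""
    # Bin edges are derived from the reference distribution
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_counts, _ = np.histogram(reference, bins=edges)
    inf_counts, _ = np.histogram(inference, bins=edges)

    # Convert counts to proportions; epsilon avoids log(0) and division by zero
    eps = 1e-6
    p = ref_counts / ref_counts.sum() + eps
    q = inf_counts / inf_counts.sum() + eps
    return float(np.sum((p - q) * np.log(p / q)))

# Higher scores mean the inference data has drifted further from the reference
rng = np.random.default_rng(0)
print(psi(rng.normal(0, 1, 1000), rng.normal(0.5, 1, 1000)))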
Note: Teams can evaluate drift for inference data over different intervals with our Python SDK and query service (for example, comparing data coming into the model now against data from a month ago).
User-Defined Metrics
Whether your team uses a different performance metric, wants to track defined data segments, or needs logical functions to build metrics for external stakeholders (like product or business metrics), you can create your own metrics in Arthur. Learn more about creating metrics with data in Arthur in the User-Defined Metrics section.
Available Enrichments
The following enrichments can be enabled for this model type:
Anomaly Detection | Hot Spots | Explainability | Bias Mitigation |
---|---|---|---|
X | | | |