Product DocumentationAPI and Python SDK ReferenceRelease Notes
Schedule a Demo
Schedule a Demo

CV Onboarding

This page shows the basics of setting up computer vision (CV) models and onboarding Arthur Scope to monitor vision-specific performance.

Getting Started

The first step is to import functions from the arthurai package and establish a connection with Arthur.

# Arthur imports
from arthurai import ArthurAI
from arthurai.common.constants import InputType, OutputType, Stage

arthur = ArthurAI(url="", 

Registering a CV Model

Each computer vision model is created with input_type = InputType.Image and with specified width and height
dimensions for the processed images. Here, we register a classification model on 1024x1024 images:

arthur_cv_model = arthur.model(name="ImageQuickstart",
as a resized copy in the model dimensions. If you enable explainability for your model, the resized versions will be 
passed to it to generate explanations.

The different OutputType values currently supported for computer vision models are classification, regression, and object detection.

Formatting Data

Computer vision models require the same structure as Tabular and NLP models. However, the attribute value for Image attributes should be a valid path to the image file for that inference.

Here is an example of a valid reference_data data frame to build an ArthurModel with:

    image_attr             pred_value   ground_truth    non_input_1   
0   'img_path/img_0.png'   0.1          0               0.2  
1   'img_path/img_1.png'   0.05         0              -0.3 
2   'img_path/img_2.png'   0.02         1               0.7     ...
3   'img_path/img_3.png'   0.8          1               1.2 
4   'img_path/img_4.png'   0.4          0              -0.5  

Non-Input Attributes

Any non-pixel features to be tracked in images for performance comparison or bias detection should be added as
non-input attributes. For example, metadata about people's identities captured in images for a CV model should be included as non-input attributes.

Reviewing the Model Schema

Before you call can call the model schema to check that your data is parsed correctly.

For an image model, the model schema should look like this:

     name           stage                 value_type    categorical   is_unique  
0    image_attr     PIPELINE_INPUT        IMAGE         False         True   
1    pred_value     PREDICTED_VALUE       FLOAT         False         False      ...
2    ground_truth   GROUND_TRUTH          INTEGER       True          False   
3    non_input_1    NON_INPUT_DATA        FLOAT         False         False   

Object Detection

Formatting Bounding Boxes

If using an Object Detection model, bounding boxes should be formatted as lists in the form:

[class_id, confidence, top_left_x, top_left_y, width, height]

The first two components of the bounding box list represent the classification being made within the bounding box. Theclass_id represents the ID of the class detected within the bounding box, and the confidence represents the % confidence the model has in this prediction (0.0 for completely unconfident and 1.0 for completely confident).

The next four components of the bounding box list represent the location of the bounding box within the image: the
top_left_x and top_left_y represent the X and Y pixel coordinates of the top-left corner of the bounding box. These pixel coordinates are calculated from the origin, which is in the top left corner of the image. This means that each coordinate is calculated by counting pixels from the image's left or the top, respectively. The width represents the number of pixels the bounding box covers from left to right and the height represents the number of pixels the bounding box covers from top to bottom.

So using the following model schema as an example:

	name	                stage	           value_type	
0	image_attr              PIPELINE_INPUT	   IMAGE	
1	label	                GROUND_TRUTH	   BOUNDING_BOX
2	objects_detected        PREDICTED_VALUE	   BOUNDING_BOX	

a valid dataset would look like

#    image_attr              objects_detected              ground_truth             non_input_1   
0,   'img_path/img_0.png',   [[0, 0.98, 12, 20, 50, 25],   [0, 1, 14, 22, 48, 29],  0.2
                             [1, 0.47, 92, 140, 80, 36]]     
1,   'img_path/img_1.png',   [[1, 0.22, 4, 5, 14, 32]]     [1, 1, 25, 43, 49, 25]   -0.3     #...
#                                ...

Finishing Onboarding

Once you have finished formatting your reference data and your model schema looks correct using, you are finished locally configuring your model and its attributes - so you are ready to complete onboarding your model.

To finish onboarding your CV model, the following steps apply, which is the same for CV models as it is for models
of any InputType and OutputType:


For an overview of configuring enrichments for image models, see the Enabling Enrichments section.

For a step-by-step walkthrough of setting up the explainability Enrichment for image models, see the Assets Required For Explainability section.