Getting Started

This section provides a Quick Start for our Python SDK and a description of core Arthur concepts. For more in-depth documentation on advanced features, see the User Guide.

Install the SDK

Install the SDK via pip:

pip install arthurai
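
To confirm that the install succeeded before moving on, a check along these lines can help. This is a minimal sketch using only the Python standard library; the helper name `installed_version` is illustrative and not part of the SDK.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("arthurai")
if version is None:
    print("arthurai is not installed; run `pip install arthurai`")
else:
    print(f"arthurai {version} is installed")
```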

Quick Start

This Quick Start will walk through setting up a sample Titanic Survivorship model.

  1. Create a client connection to authenticate and interact with the Arthur API:

    from arthurai import ArthurAI
    access_key = "<YOUR_API_KEY>"
    arthur = ArthurAI(url="https://app.arthur.ai", access_key=access_key)
    

    Replace "<YOUR_API_KEY>" with an API key obtained from Organization Menu > Manage API Keys in the Arthur UI. For details, see API Keys.
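
Rather than pasting the key into source code, you may prefer to read it from the environment. The sketch below assumes the key has been exported in a variable named `ARTHUR_API_KEY`; both that name and the `load_api_key` helper are hypothetical and not defined by the SDK.

```python
import os


def load_api_key(env_var: str = "ARTHUR_API_KEY") -> str:
    """Read the API key from an environment variable (the variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Arthur API key")
    return key


# Usage, assuming the variable is set in your shell:
# access_key = load_api_key()
# arthur = ArthurAI(url="https://app.arthur.ai", access_key=access_key)
```

Failing early with a clear message avoids sending an empty key to the API.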

  2. Create an Arthur Model object. Our Titanic Survivorship model will use tabular input data and generate a binary class prediction of whether a passenger survived or perished:

    from arthurai.common.constants import InputType, OutputType
    arthur_model = arthur.model(partner_model_id="Titanic Survivorship Quick Start",
                                input_type=InputType.Tabular,
                                output_type=OutputType.Multiclass,
                                is_batch=True)
    
  3. Load our dataset and register the inputs and outputs with the model object we just created:

    import pandas as pd 
    from arthurai.common.constants import Stage
    
    # Define sample data
    titanic_df = pd.DataFrame({
        "pass_class": [2, 3, 2, 3, 3, 3, 1, 2, 3, 1],
        "sex": ["M", "M", "F", "F", "M", "M", "M", "M", "F", "F"],
        "age": [16, 19, 40, 18, 4, 33, 21, 32.5, 18, 24],
        "fare": [26.0, 7.775, 13.0, 9.8417, 27.9, 7.8958, 77.2875, 30.0708, 17.8, 263.0],
        "survived": [0, 0, 1, 1, 0, 0, 0, 0, 0, 1]})
    
    # Define input attributes and register them with the Arthur model
    input_feature_columns = ["pass_class", "sex", "age", "fare"]
    arthur_model.from_dataframe(titanic_df[input_feature_columns], Stage.ModelPipelineInput)
    
    # Define output attributes as a map from each predicted class attribute to its ground truth attribute
    prediction_to_ground_truth_attribute_map = {
        "pred_survived": "gt_survived",
        "pred_perished": "gt_perished"
    }
    # Register the output attributes with the Arthur model, noting the positive class
    arthur_model.add_binary_classifier_output_attributes(
       pred_to_ground_truth_map = prediction_to_ground_truth_attribute_map,
       positive_predicted_attr = 'pred_survived',
    )
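
Before registering the output attributes, it can be worth checking the mapping locally. The helper below is a hypothetical sketch, not an SDK function, and the SDK may perform its own validation. It only checks that the map describes exactly two classes and that the positive class is one of the predicted attributes.

```python
from typing import Dict


def check_binary_output_map(pred_to_gt: Dict[str, str], positive_predicted_attr: str) -> None:
    """Raise ValueError if the mapping does not describe a binary classifier."""
    if len(pred_to_gt) != 2:
        raise ValueError("expected exactly two predicted attributes")
    if len(set(pred_to_gt.values())) != 2:
        raise ValueError("each predicted attribute needs its own ground truth attribute")
    if positive_predicted_attr not in pred_to_gt:
        raise ValueError(f"{positive_predicted_attr!r} is not a predicted attribute")


# The mapping used in this Quick Start passes the check without raising.
check_binary_output_map(
    {"pred_survived": "gt_survived", "pred_perished": "gt_perished"},
    positive_predicted_attr="pred_survived",
)
```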
    
  4. The SDK will infer metadata about your model attributes from the sample data provided. Review the model attributes to make sure that everything looks right:

    arthur_model.review()
    

    If some attributes do not have the correct metadata, they can be updated; see the Arthur Attributes section for more information on how to do this. Once the model schema is correct, call ArthurModel.save() to save the model to your dashboard:

    arthur_model.save()
    
  5. Send inferences: The model registered in this example is a batch model, so we’ll send three batches of data to Arthur. To learn more about the different types of models, see Arthur Model Types.

    from random import random, randint
    for batch in range(1, 4):
        # Sample the dataset and update columns to match the names we registered with Arthur
        batch_sample = titanic_df.sample(randint(2, 4))
        inferences = batch_sample.rename(columns={"survived": "gt_survived"})
        inferences["gt_perished"] = 1 - inferences["gt_survived"]
        # 
        # Make random predictions. Our model probably won't be very good!
        inferences['pred_survived'] = [random() for _ in range(len(batch_sample))]
        inferences['pred_perished'] = 1 - inferences['pred_survived']     
        # 
        # Send the inferences to Arthur
        arthur_model.send_inferences(inferences, batch_id=f"batch_{batch}")
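
A batch can be sanity-checked locally before it is sent. The sketch below is not part of the SDK: `find_batch_problems` and `REQUIRED_COLUMNS` are hypothetical names, and the column list simply mirrors the attributes registered in this Quick Start. It works on plain dictionaries so that it needs only the standard library.

```python
import math
from typing import Any, Dict, Iterable, List

# Column names registered with the model earlier in this Quick Start.
REQUIRED_COLUMNS = [
    "pass_class", "sex", "age", "fare",
    "pred_survived", "pred_perished",
    "gt_survived", "gt_perished",
]


def find_batch_problems(rows: Iterable[Dict[str, Any]]) -> List[str]:
    """Return a description of each problem found; an empty list means none were found."""
    problems = []
    for i, row in enumerate(rows):
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
            continue
        # The two predicted probabilities should sum to 1.
        if not math.isclose(row["pred_survived"] + row["pred_perished"], 1.0, abs_tol=1e-9):
            problems.append(f"row {i}: predicted probabilities do not sum to 1")
        # Exactly one ground truth label should be set.
        if row["gt_survived"] + row["gt_perished"] != 1:
            problems.append(f"row {i}: ground truth labels are not one-hot")
    return problems


example_row = {
    "pass_class": 2, "sex": "F", "age": 40, "fare": 13.0,
    "pred_survived": 0.25, "pred_perished": 0.75,
    "gt_survived": 1, "gt_perished": 0,
}
problems = find_batch_problems([example_row])
```

With pandas, the rows for a batch could be obtained with `inferences.to_dict("records")` and passed to the helper inside the loop, before the call to `send_inferences`.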
    

Next Steps

For more examples, see the Arthur Sandbox repository.

Or continue with the Getting Started section: