Getting Started

This section provides a Quick Start for the Arthur Python SDK and describes core Arthur concepts. For in-depth documentation on Arthur's advanced features, see the User Guide. For detailed explanations of the algorithms underlying Arthur features, see Algorithms.

Install the SDK

Install the SDK via pip:

pip install arthurai

Quick Start

This Quick Start will walk through setting up a sample Titanic Survivorship model.

  1. Create a client connection to authenticate and interact with the Arthur API:

    from arthurai import ArthurAI
    access_key = "<YOUR_API_KEY>"
    arthur = ArthurAI(url="https://app.arthur.ai", access_key=access_key)
    

    Replace "<YOUR_API_KEY>" with an API key obtained from Organization Menu > Manage API Keys in the Arthur UI. For details, see API Keys. You can also pass login and password parameters in place of access_key.

  2. Create an Arthur Model object. Our Titanic Survivorship model will use tabular input data and generate a binary class prediction of whether a passenger survived or perished:

    from arthurai.common.constants import InputType, OutputType
    arthur_model = arthur.model(partner_model_id="Titanic Survivorship Quick Start",
                                input_type=InputType.Tabular,
                                output_type=OutputType.Multiclass,
                                is_batch=True)
    
  3. Load our dataset and build a simple prediction model:

    import pandas as pd
    import numpy as np
    
    # Define sample data
    titanic_df = pd.DataFrame({
        'pass_class': [1, 1, 3, 1, 3, 3, 3, 1, 3, 3],
        'sex': ['F', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'M', 'M'],
        'age': [16.0, 24.0, 19.0, 58.0, 30.0, 22.0, 40.0, 37.0, 65.0, 32.0],
        'fare': [86.5, 49.5042, 8.05, 153.4625, 7.8958, 7.75, 7.8958, 29.7, 7.75, 7.8958],
        'survived': [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]})
    
    # Predict the probability of survival as the inverse percentile of the passenger's age
    def predict(age):
        age_index = np.array(np.where(np.sort(np.array(titanic_df['age'])) == age)).mean()
        return 1 - (age_index / (len(titanic_df) - 1))
    
    titanic_df['pred_survived'] = titanic_df['age'].apply(predict)
    titanic_df['pred_perished'] = 1 - titanic_df['pred_survived']
    
    # Update ground truth labels to include the negative class as well, to match predictions
    titanic_df.rename(columns={"survived": "gt_survived"}, inplace=True)
    titanic_df["gt_perished"] = 1 - titanic_df["gt_survived"]
    
  4. Register the input and output attributes with the Arthur Model object:

    # Define a map from the predicted to corresponding ground truth attributes
    prediction_to_ground_truth_attribute_map = {
        "pred_survived": "gt_survived",
        "pred_perished": "gt_perished"
    }
    
    # Register model attributes with Arthur
    arthur_model.build(titanic_df,
                       pred_to_ground_truth_map = prediction_to_ground_truth_attribute_map,
                       positive_predicted_attr = 'pred_survived')
    
  5. The SDK infers metadata about your model attributes from the sample data provided and returns a summary of the inferred attributes. If any attribute has incorrect metadata, you can update it; see the Arthur Attributes section for details. Once the model schema is correct, call ArthurModel.save() to save the model to your dashboard:

    arthur_model.save()
    
  6. Send inferences. The model registered in this example is a batch model, so we’ll send three batches of data to Arthur. To learn more about the different types of models, see Arthur Model Types.

    from random import randint
    for batch in range(1, 4):
        # Sample the dataset with predictions
        inferences = titanic_df.sample(randint(2, 5))
        
        # Send the inferences to Arthur
        arthur_model.send_inferences(inferences, batch_id=f"batch_{batch}")
    
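The prediction rule in step 3 can be easier to follow with the arithmetic written out. The following is a minimal, dependency-free sketch of the same inverse-percentile idea, using only the ages from the sample data. The function name inverse_percentile is an illustrative assumption and is not part of the Arthur SDK.

```python
# Ages copied from the sample data in step 3.
ages = [16.0, 24.0, 19.0, 58.0, 30.0, 22.0, 40.0, 37.0, 65.0, 32.0]


def inverse_percentile(age, population):
    """Score an age by its position in the sorted population.

    The youngest age maps to 1.0 and the oldest to 0.0. Only ages that
    appear in the population are supported; tied ages share the mean of
    their positions, as in the quick start's predict() function.
    """
    ordered = sorted(population)
    positions = [i for i, value in enumerate(ordered) if value == age]
    mean_position = sum(positions) / len(positions)
    return 1 - mean_position / (len(ordered) - 1)


# One score per passenger, plus the complementary "perished" score.
pred_survived = [inverse_percentile(age, ages) for age in ages]
pred_perished = [1 - p for p in pred_survived]
```

With these ten ages, the youngest passenger (16.0) scores 1.0 and the oldest (65.0) scores 0.0, and each pair of survived and perished scores sums to 1.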

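Before registering attributes as in step 4, it can help to confirm that every column named in the prediction-to-ground-truth map actually exists in the data. The sketch below is a hypothetical pre-flight check written in plain Python. The helper check_attribute_map is an assumption for illustration and is not provided by the Arthur SDK.

```python
# Hypothetical pre-flight check; not part of the arthurai SDK.
def check_attribute_map(columns, pred_to_gt):
    """Return a list of problems found in a prediction-to-ground-truth map."""
    problems = []
    for pred_col, gt_col in pred_to_gt.items():
        if pred_col not in columns:
            problems.append(f"missing prediction column: {pred_col}")
        if gt_col not in columns:
            problems.append(f"missing ground truth column: {gt_col}")
    return problems


# Column names as they stand at the end of step 3.
columns = ["pass_class", "sex", "age", "fare",
           "gt_survived", "gt_perished",
           "pred_survived", "pred_perished"]

attribute_map = {
    "pred_survived": "gt_survived",
    "pred_perished": "gt_perished",
}

problems = check_attribute_map(columns, attribute_map)
```

An empty problems list means every mapped column is present. If the rename in step 3 were skipped, the check would report the missing gt_survived column instead.
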
Next Steps

For more examples, see the Arthur Sandbox repository.

Or continue with the Getting Started section: