
Installing Arthur Prerequisites

This is a guide to help you prepare your existing Kubernetes cluster for installing the Arthur platform.
The examples use Helm 3.

Make sure you're in the correct kubectl environment context before running the installer.
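
For example, you can check which context is active and switch to the right one with kubectl (the context name below is a placeholder):

# List available contexts; the current one is marked with an asterisk
kubectl config get-contexts

# Switch to your cluster's context (replace the placeholder name)
kubectl config use-context my-arthur-cluster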

Install Ingress

Nginx

Nginx is an enterprise-grade, cloud-agnostic, open-source ingress controller that can be used to expose the Arthur application.

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm upgrade --install -n ingress-system \
  --create-namespace \
  ingress-nginx \
  ingress-nginx/ingress-nginx
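
After the chart installs, confirm the controller pod is running before proceeding, for example:

kubectl get pods -n ingress-system -l app.kubernetes.io/name=ingress-nginx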

[Optional] To monitor nginx using Prometheus and add an AWS-managed SSL certificate, create a values.yaml file with the following contents:

controller:
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
      additionalLabels:
        release: "kube-prometheus-stack"
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: <ACM certificate ARN>
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
      service.beta.kubernetes.io/aws-load-balancer-type: nlb # optional annotation that creates a Network Load Balancer. Defaults to elb (Classic Load Balancer)
      service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: ELBSecurityPolicy-TLS-1-2-2017-01
      service.beta.kubernetes.io/aws-load-balancer-internal: "true" # optional annotation that creates a non-internet-facing load balancer. Defaults to false
    targetPorts:
      http: "tohttps"

  allowSnippetAnnotations: "true"
  config:
    http-snippet: |
      server {
        listen 2443;
        return 308 https://$host$request_uri;
      }
    use-forwarded-headers: "true"
  ingressClassResource:
    name: nginx
    enabled: true
    default: false
    controllerValue: "k8s.io/internal-ingress-nginx" # default: k8s.io/ingress-nginx

  containerPort:
    http: 8080
    tohttps: 2443
    https: 80

Upgrade or install the helm chart with the values.yaml you created.

helm upgrade --install -n ingress-system \
  --create-namespace \
  ingress-nginx \
  ingress-nginx/ingress-nginx \
  -f values.yaml

If you need to install Nginx in the same namespace as Arthur (not recommended) and want to use our network-policy to restrict ingress to the Arthur application, use the below command to add labels to the pods and services. The network-policy allows traffic between pods and services that have these labels.

helm upgrade --install -n arthur \
  --set controller.podLabels.network-app=arthurai \
  --set controller.service.labels.network-app=arthurai \
  --set defaultBackend.podLabels.network-app=arthurai \
  --set defaultBackend.service.labels.network-app=arthurai \
  ingress-nginx \
  ingress-nginx/ingress-nginx
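
If you used the label overrides above, you can confirm they were applied to the pods and services, for example:

kubectl get pods,svc -n arthur -l network-app=arthurai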

Look up the hostname for the Ingress and configure it in your DNS (e.g. arthur.mydomain.com).

kubectl get svc -n ingress-system ingress-nginx-controller -ojsonpath='{.status.loadBalancer.ingress[*].hostname}'
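
Once the DNS record is published, you can confirm the hostname resolves to the load balancer, for example:

# Should return the load balancer address(es); replace with your hostname
nslookup arthur.mydomain.com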

Install Prometheus

Installing the Chart

helm repo add \
  prometheus-community \
  https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install -n monitoring \
  --create-namespace \
  kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  -f /path/to/values.yaml # see below for contents of this file
helm upgrade --install -n monitoring \
  --create-namespace \
  prometheus-adapter \
  prometheus-community/prometheus-adapter

Note: values.yaml is not applied incrementally. Helm uses the single values.yaml file you pass it, so all configuration must be present in that one file. If you are following this guide step by step, carry forward the changes from earlier steps each time you edit values.yaml.
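
If you are unsure which values were applied previously, you can dump the user-supplied values for the release and merge your new settings into that file, for example:

helm get values -n monitoring kube-prometheus-stack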

Setting up retention for Grafana and Prometheus

By default, Prometheus and Grafana use local pod storage for metrics and dashboards, which are lost if the pod restarts for any reason. To retain them across restarts and for a longer period of time, add the following to your values.yaml to use a persistent volume store:

prometheus:
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    retention: 30d # metrics rolled over every 30 days
    retentionSize: 49GiB # size of metrics retained before they are rolled over
    storageSpec:
     volumeClaimTemplate:
       spec:
         storageClassName: gp2
         accessModes: ["ReadWriteOnce"]
         resources:
           requests:
             storage: 50Gi # size of disk for metrics
grafana:
  persistence:
    type: pvc
    enabled: true
    storageClassName: gp2
    accessModes:
      - ReadWriteOnce
    size: 1Gi # size of disk for dashboards

Run the following command to apply the updated configurations (replace the path to the values.yaml file):

helm upgrade --install -n monitoring \
  kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  -f /path/to/values.yaml
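
To confirm that persistent storage is in use, check that the persistent volume claims were created and bound, for example:

# Expect a Bound PVC for Prometheus (50Gi) and one for Grafana (1Gi)
kubectl get pvc -n monitoring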

Setting up ingress for Prometheus, AlertManager and Grafana

Grafana and Prometheus are useful to have exposed on an ingress route so that cluster administrators can access real-time telemetry and observe the behavior of the Arthur Platform.

Please note that Grafana ships with a default username and password, which should be changed immediately after installation. We also highly recommend installing Prometheus and Grafana within a VPC so that the domains are not exposed to the public internet.
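
As a minimal sketch (assuming the release name kube-prometheus-stack used above and the monitoring namespace), the generated Grafana admin password can be read from the secret the chart creates, and then changed from the Grafana UI:

kubectl get secret -n monitoring kube-prometheus-stack-grafana \
  -o jsonpath='{.data.admin-password}' | base64 --decode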

The steps to enable ingress are:

  1. Copy the values.yaml file below with the ingress configuration
  2. Make the following edits to the yaml file that describe your environment:
    1. The ingressClassName
      1. If you installed using the nginx chart above, this should be nginx.
      2. If you are using a custom nginx ingressClass, this will be the name of that ingress class
      3. If you are unsure what your ingressClass is called, run kubectl get ingressclass
    2. The URL hostnames that you want to expose for these services
      1. Note - these URL hostnames will need to be published DNS entries
  3. Run the following command to deploy (replace the path to the values.yaml file):
helm upgrade --install -n monitoring \
  kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  -f /path/to/values.yaml

To confirm this is working, navigate to the URL hostnames defined in the values.yaml; you should be taken to the front page of Grafana or Prometheus.

Once you have confirmed access, change the default password for Grafana.

Here is the values.yaml configuration that enables ingress for Prometheus, Alertmanager, and Grafana:

prometheus:
  ingress:
    enabled: true
    ingressClassName: nginx  # Confirm this is correct or replace me
    hosts:
      - prometheus.mydomain.com  # Replace me
alertmanager:
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - alertmanager.mydomain.com  # Replace me
grafana:
  ingress:
    enabled: true
    ingressClassName: nginx  # Confirm this is correct or replace me
    hosts:
      - grafana.mydomain.com  # Replace me
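
After deploying, you can list the ingress resources that were created and check that each shows the expected host and an address, for example:

kubectl get ingress -n monitoring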

Verifying the install

Verify that Prometheus CRDs are installed:

kubectl api-resources | grep monitoring

Verify that Prometheus is up and running:

kubectl --namespace monitoring get pods -l "release=kube-prometheus-stack"

If everything is installed correctly, the following command should not return "ServiceUnavailable":

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

Monitoring and Alerting for the Arthur Platform

When you are ready to set up monitoring and alerting, please reach out to your Arthur support representative, who can share additional details.

Prometheus alerts can be configured to fire when rules that track Prometheus metrics are violated for a customizable period of time. For more information, see the Prometheus alerting documentation. These rules can then be routed to a notification channel (e.g. email, Slack) so that someone is notified; this is the job of Alertmanager.
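
As an illustrative sketch (the rule name, expression, and threshold below are placeholders, not Arthur-provided defaults), custom alerting rules can be added to the kube-prometheus-stack values.yaml via additionalPrometheusRulesMap and re-applied with helm upgrade:

additionalPrometheusRulesMap:
  example-rules:
    groups:
      - name: example.rules
        rules:
          - alert: ArthurPodHighMemory # placeholder alert name
            # Placeholder expression: per-pod working-set memory in the arthur namespace above ~4GB
            expr: sum by (pod) (container_memory_working_set_bytes{namespace="arthur"}) > 4e9
            for: 10m
            labels:
              severity: warning

Notification routing (email, Slack, etc.) is typically configured under the alertmanager.config section of the same values.yaml.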

Install Metrics Server

Example:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm upgrade --install -n monitoring \
  --create-namespace \
  metrics-server \
  bitnami/metrics-server \
  --set apiService.create=true \
  --set extraArgs.kubelet-preferred-address-types=InternalIP

Verify that you can retrieve metric snapshots.

kubectl top node

Configure the cluster-autoscaler

In a production environment, it is vital to ensure that there are enough resources (memory and CPU) available for pods to be scheduled on the Kubernetes cluster. Please follow the instructions for your cloud provider to install the cluster-autoscaler on your cluster.

Verify that the cluster-autoscaler is successfully installed.

kubectl get deployments -n kube-system | grep -i cluster-autoscaler
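
Assuming the deployment is named cluster-autoscaler (the exact name can vary by provider and install method), you can also check its logs for scaling activity:

kubectl logs -n kube-system deployment/cluster-autoscaler --tail=50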

Cloud Provider-specific Configuration

If you are installing on an existing Amazon AWS EKS cluster, follow the additional steps in Deploying on Amazon AWS EKS.