Deployment with AWS
This is the recommended option for scalable, production-grade deployments in AWS environments.
Prerequisites
-
Engine Version:
Look up an engine version to use from the Releases.
-
OpenAI GPT Model:
- An OpenAI GPT model with at least one endpoint (both Azure OpenAI and OpenAI endpoints are supported).
- A secure network route between AWS and the OpenAI endpoint(s).
- Token limits configured as needed.
-
DNS:
- DNS URL for the GenAI Engine with an SSL certificate.
-
AWS Environment:
- AWS credentials with permissions to manage IAM, security groups, Secrets Manager, load balancer, RDS, ECS, CloudWatch.
- VPC with 3 private subnets and 2-3 public subnets.
- ARN of the TLS certificate from AWS Certificate Manager for the application DNS.
-
Container Image Repository:
- Network route to Docker Hub OR access to your private registry.
-
Arthur Platform Engine Credentials:
- If using Arthur Platform, obtain your Client ID and Client Secret.
- Otherwise, proceed with Engine-only (guardrails) deployment.
-
GPU Recommendation:
Arthur recommends using GPUs for production deployments for optimal latency and scalability.
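Before opening the console, it can help to sanity-check the AWS inputs listed above. The following is a minimal sketch, not part of the Arthur tooling: the function name `check_prerequisites` and the subnet IDs in the usage note are hypothetical, and treating "3 private subnets" as exactly three is an assumption drawn from the wording of the prerequisites.

```python
import re

# General shape of an ACM certificate ARN:
#   arn:<partition>:acm:<region>:<12-digit account id>:certificate/<certificate id>
ACM_ARN_RE = re.compile(
    r"^arn:aws[a-z-]*:acm:[a-z0-9-]+:\d{12}:certificate/[0-9a-fA-F-]+$"
)


def check_prerequisites(private_subnets, public_subnets, certificate_arn):
    """Return a list of problems; an empty list means the inputs look plausible.

    Assumption: the prerequisites are read as exactly 3 private subnets
    and 2 or 3 public subnets.
    """
    problems = []
    private = set(private_subnets)
    public = set(public_subnets)

    if len(private) != 3:
        problems.append(f"expected 3 private subnets, got {len(private)}")
    if not 2 <= len(public) <= 3:
        problems.append(f"expected 2-3 public subnets, got {len(public)}")
    if private & public:
        problems.append("a subnet is listed as both private and public")
    if not ACM_ARN_RE.match(certificate_arn):
        problems.append("certificate ARN does not look like an ACM certificate ARN")
    return problems
```

Calling `check_prerequisites(["subnet-a", "subnet-b", "subnet-c"], ["subnet-d", "subnet-e"], arn)` with a well-formed ACM ARN returns an empty list. The check only inspects the values you pass in; it does not query AWS or verify routing between the subnets.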
Installation Steps
-
Log in to your AWS account (with your target VPC/subnets).
-
Navigate to CloudFormation in the AWS Console.
-
On the "Stacks" page, select Create stack > With new resources (standard).
-
On the "Create stack" page:
-
Choose "Template is ready" and "Amazon S3 URL".
-
Paste the S3 URL for the CloudFormation template for your desired deployment (GPU/CPU, GenAI Engine, or GenAI Engine-only guardrails). Example:
https://arthur-cft.s3.us-east-2.amazonaws.com/arthur-engine/templates/<version_number>/root-arthur-engine-gpu.yml
-
Replace <version_number> with your desired version from Releases.
-
Populate the stack details and click "Next".
-
Configure stack options as needed, then click "Next".
-
Review and create the stack.
- Under the stack failure options, set the behavior on provisioning failure to "Roll back all stack resources."
- For newly created resources during a rollback, choose "Use deletion policy."
-
Once GenaiEngineLBStack is complete, create an A record (unless handled by Route 53) that routes the application DNS URL to the ALB.
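The console steps above can also be scripted. The sketch below shows one possible way to do that with boto3's CloudFormation client; it is an illustration under stated assumptions rather than a supported installer. The helper names, the stack name, and the parameter keys you pass in are hypothetical (the template's real parameter names are not listed here), and the `Capabilities` values are an assumption about what the template needs.

```python
TEMPLATE_BASE = "https://arthur-cft.s3.us-east-2.amazonaws.com/arthur-engine/templates"


def template_url(version, template_file="root-arthur-engine-gpu.yml"):
    """Build the S3 template URL by substituting <version_number>.

    The default file name is the GPU example from the steps above; pass a
    different file name for another deployment variant.
    """
    if not version or "<" in version:
        raise ValueError("replace <version_number> with a real release version")
    return f"{TEMPLATE_BASE}/{version}/{template_file}"


def create_stack_kwargs(stack_name, version, parameters):
    """Assemble keyword arguments for CloudFormation's CreateStack call.

    `parameters` maps template parameter names to string values; the names
    themselves come from the template and are not defined in this sketch.
    """
    return {
        "StackName": stack_name,
        "TemplateURL": template_url(version),
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in sorted(parameters.items())
        ],
        # Assumption: the template creates IAM resources and nested stacks,
        # so these acknowledgements are likely required.
        "Capabilities": ["CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND"],
        # API counterpart of "Roll back all stack resources" in the console.
        "OnFailure": "ROLLBACK",
    }


def create_stack(stack_name, version, parameters):
    """Submit the stack. Requires boto3 and AWS credentials in the environment."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("cloudformation")
    return client.create_stack(**create_stack_kwargs(stack_name, version, parameters))
```

Keeping the argument assembly separate from the API call makes it possible to print and review the request before anything is created. After the call returns, stack progress can be followed in the CloudFormation console exactly as in the manual flow.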
Architecture Diagram

Frequently Asked Questions (FAQs)
- Subnet requirements:
- Private subnets: Host application and database, not accessible from the internet.
- Public subnets: Entry point for client LLM applications (via ALB), typically routed through an IGW or VPN.
- Ensure proper routes exist between public and private subnets.
- IAM & Security Groups:
- Refer to IAM/security group CloudFormation templates for customization.
- Azure OpenAI Quota:
- Use multiple endpoints or request a quota increase from Azure.