Arthur + AWS Bedrock

How does a developer instrument AWS Bedrock API calls with Arthur in under 10 minutes? Install the Arthur Observability SDK with the Bedrock extra, initialize the Arthur client, and call arthur.instrument_bedrock(). Every subsequent boto3 Bedrock call, whether it uses converse or invoke_model, is then traced automatically with its model ID, token usage, and latency. No per-model instrumentation is needed: Claude, Titan, Llama, and any other Bedrock-hosted model are all captured through a single integration point.

Overview

AWS Bedrock gives you a unified API to call models from Anthropic, Amazon, Meta, and others. Arthur's Bedrock instrumentation wraps the boto3 Bedrock Runtime client using OpenInference so that every model invocation is captured as an OpenTelemetry span and exported to the Arthur GenAI Engine. Once instrumented, you get full visibility into:

Model ID — which Bedrock model handled each request
Prompts and completions — input messages and output content
Token usage — input and output token counts per call
Latency and errors — end-to-end duration and failure tracking
Multi-model visibility — one instrumentation call covers all Bedrock models
Session and user context — group traces by conversation or end-user

sequenceDiagram
    participant App as Your Application
    participant SDK as Arthur SDK
    participant Bedrock as AWS Bedrock Runtime
    participant Engine as Arthur GenAI Engine

    App->>SDK: arthur.instrument_bedrock()
    Note over SDK: Auto-instrumentation enabled
    App->>Bedrock: client.converse() / invoke_model()
    Bedrock-->>App: Model response
    SDK->>Engine: Trace (spans, attributes)
    Note over Engine: Traces visible in dashboard

Prerequisites:

Python 3.10+
An Arthur GenAI Engine instance (cloud or local)
An Arthur API key — see API Keys to create one

Installation

Install the Arthur Observability SDK with the bedrock extra:

pip install "arthur-observability-sdk[bedrock]"

This pulls in the openinference-instrumentation-bedrock package and its dependencies.

Initialize Arthur

Create a single Arthur instance at application startup.

from arthur_observability_sdk import Arthur

arthur = Arthur(
    api_key="your-api-key",        # or set ARTHUR_API_KEY env var
    base_url="https://your-arthur-engine-instance",  # or set ARTHUR_BASE_URL env var
    task_id="<your-task-uuid>",    # Arthur task UUID
    service_name="my-bedrock-app",
)

Parameter	Description
`api_key`	Your Arthur Engine API key. Falls back to `ARTHUR_API_KEY` env var.
`base_url`	Base URL of your Arthur GenAI Engine. Falls back to `ARTHUR_BASE_URL` env var, then `http://localhost:3030`.
`task_id`	Arthur task UUID for associating traces with a specific task.
`service_name`	OpenTelemetry `service.name` resource attribute. Used to identify your application in the Arthur dashboard. Creates a new task based on `service_name` if `task_id` isn't specified.

📘 At least one of task_id or service_name must be provided. A new task with the service_name will be created if task_id is not specified.

⚠️ Use environment variables for secrets. Set ARTHUR_API_KEY and ARTHUR_BASE_URL as environment variables (e.g., in a .env file) rather than hardcoding them in your application.
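The two notes above can be enforced at startup with a small helper. This is a minimal sketch, not part of the SDK: `arthur_kwargs_from_env` is a hypothetical function, and `ARTHUR_TASK_ID` and `ARTHUR_SERVICE_NAME` are variable names assumed here for illustration (only `ARTHUR_API_KEY` and `ARTHUR_BASE_URL` are described above as SDK fallbacks).

```python
import os
from typing import Dict, Mapping, Optional


def arthur_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for Arthur(...) from environment variables.

    api_key and base_url are deliberately left out so that the SDK's own
    ARTHUR_API_KEY / ARTHUR_BASE_URL fallbacks apply.
    """
    env = os.environ if env is None else env

    # Fail early with a clear message instead of exporting nothing later on.
    if not env.get("ARTHUR_API_KEY"):
        raise RuntimeError("ARTHUR_API_KEY is not set")

    # Assumed variable names, read only by this helper.
    task_id = env.get("ARTHUR_TASK_ID")
    service_name = env.get("ARTHUR_SERVICE_NAME")
    if not task_id and not service_name:
        raise ValueError("Set ARTHUR_TASK_ID or ARTHUR_SERVICE_NAME (at least one is required)")

    kwargs: Dict[str, str] = {}
    if task_id:
        kwargs["task_id"] = task_id
    if service_name:
        kwargs["service_name"] = service_name
    return kwargs


# Example with an explicit mapping so the sketch runs without real settings.
example = arthur_kwargs_from_env(
    {"ARTHUR_API_KEY": "your-api-key", "ARTHUR_SERVICE_NAME": "my-bedrock-app"}
)
print(example)
```

In an application you would then write `arthur = Arthur(**arthur_kwargs_from_env())`, keeping every secret out of the source file.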

Instrument Bedrock

A single call to instrument_bedrock() patches the boto3 Bedrock Runtime client. After this, both converse and invoke_model calls are traced automatically.

import boto3
from arthur_observability_sdk import Arthur

arthur = Arthur(
    api_key="your-api-key",
    base_url="https://your-arthur-engine-instance",
    task_id="<your-task-uuid>",
    service_name="my-bedrock-app",
)
arthur.instrument_bedrock()

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Using the Converse API — traced automatically
response = client.converse(
    modelId="anthropic.claude-4-6-sonnet",  # swap in a model ID enabled for your AWS account and region
    messages=[
        {"role": "user", "content": [{"text": "Hello, Claude on Bedrock!"}]},
    ],
)
print(response["output"]["message"]["content"][0]["text"])

arthur.shutdown()

You can also use the lower-level invoke_model API — it is traced the same way:

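import json

# Reuses the instrumented client created in the previous example.
# The request body follows the Anthropic Messages format used by Claude models on Bedrock.
body = json.dumps({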
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}],
})

response = client.invoke_model(
    modelId="anthropic.claude-4-6-sonnet",
    body=body,
)
result = json.loads(response["body"].read())
print(result["content"][0]["text"])
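If you want to compare the token counts in a trace with what the model itself reported, you can read them from the same response body. The sketch below assumes an Anthropic Messages style payload (`content`, `usage.input_tokens`, `usage.output_tokens`, `stop_reason`); other model families return different body shapes. `parse_anthropic_body` is a hypothetical helper, and the response is a stand-in so the example runs without AWS access.

```python
import io
import json


def parse_anthropic_body(response: dict) -> dict:
    """Extract text and token usage from an invoke_model response."""
    # response["body"] is a stream, so read it once and keep the parsed payload.
    payload = json.loads(response["body"].read())
    text = "".join(
        block["text"]
        for block in payload.get("content", [])
        if block.get("type") == "text"
    )
    usage = payload.get("usage", {})
    return {
        "text": text,
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
        "stop_reason": payload.get("stop_reason"),
    }


# Stand-in for a real invoke_model response.
fake_response = {
    "body": io.BytesIO(
        json.dumps(
            {
                "content": [{"type": "text", "text": "Hi there"}],
                "usage": {"input_tokens": 9, "output_tokens": 3},
                "stop_reason": "end_turn",
            }
        ).encode("utf-8")
    )
}

parsed = parse_anthropic_body(fake_response)
print(parsed["text"], parsed["input_tokens"], parsed["output_tokens"])  # Hi there 9 3
```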

Key points:

instrument_bedrock() patches the boto3 Bedrock Runtime client globally — you do not need to wrap individual calls.
Both converse and invoke_model APIs are traced with model ID, token usage, and latency.
Call arthur.shutdown() when your application exits to flush any remaining traces.

Trace Multi-Model Calls

Because instrument_bedrock() patches the boto3 client globally, you can call different Bedrock models in the same application and every call is traced — no additional configuration per model.

import boto3
from arthur_observability_sdk import Arthur

arthur = Arthur(
    api_key="your-api-key",
    base_url="https://your-arthur-engine-instance",
    task_id="<your-task-uuid>",
    service_name="multi-model-app",
)
arthur.instrument_bedrock()

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Call Claude on Bedrock
claude_response = client.converse(
    modelId="anthropic.claude-4-6-sonnet",
    messages=[
        {"role": "user", "content": [{"text": "Summarize quantum computing in one sentence."}]},
    ],
)

# Call Amazon Titan
titan_response = client.converse(
    modelId="amazon.titan-text-express-v1",
    messages=[
        {"role": "user", "content": [{"text": "Summarize quantum computing in one sentence."}]},
    ],
)

# Call Llama via Bedrock
llama_response = client.converse(
    modelId="meta.llama3-8b-instruct-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Summarize quantum computing in one sentence."}]},
    ],
)

# All three calls are traced with their respective model IDs
arthur.shutdown()

flowchart LR
    A[Your App] -->|converse - Claude| B[AWS Bedrock]
    A -->|converse - Titan| B
    A -->|converse - Llama| B
    B --> C[Responses]
    A -->|OpenTelemetry spans| D[Arthur GenAI Engine]
    D --> E[Unified Dashboard]
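Since no model needs its own setup, the three calls above can be collapsed into a loop. The sketch below is illustrative: `ask_each_model` is a hypothetical helper, and `_FakeBedrockClient` replaces the boto3 client so the example runs offline. The `usage` keys (`inputTokens`, `outputTokens`, `totalTokens`) follow the Converse API response shape.

```python
from typing import Dict, Iterable


def ask_each_model(client, model_ids: Iterable[str], prompt: str) -> Dict[str, dict]:
    """Send one prompt to several Bedrock models through the Converse API."""
    messages = [{"role": "user", "content": [{"text": prompt}]}]
    results: Dict[str, dict] = {}
    for model_id in model_ids:
        response = client.converse(modelId=model_id, messages=messages)
        results[model_id] = {
            "text": response["output"]["message"]["content"][0]["text"],
            "usage": response.get("usage", {}),
        }
    return results


class _FakeBedrockClient:
    """Offline stand-in that mimics the Converse response shape."""

    def converse(self, modelId, messages):
        return {
            "output": {
                "message": {
                    "role": "assistant",
                    "content": [{"text": f"reply from {modelId}"}],
                }
            },
            "usage": {"inputTokens": 8, "outputTokens": 4, "totalTokens": 12},
        }


results = ask_each_model(
    _FakeBedrockClient(),
    ["amazon.titan-text-express-v1", "meta.llama3-8b-instruct-v1:0"],
    "Summarize quantum computing in one sentence.",
)
for model_id, result in results.items():
    print(model_id, result["usage"]["totalTokens"])
```

With a real instrumented client in place of the fake, each loop iteration would produce its own trace carrying the corresponding model ID.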

Add Session and User Context

Use arthur.attributes() as a context manager to attach session and user metadata to all spans created within the block.

with arthur.attributes(session_id="sess-1", user_id="user-42"):
    response = client.converse(
        modelId="anthropic.claude-4-6-sonnet",
        messages=[{"role": "user", "content": [{"text": "Hello!"}]}],
    )

This is especially useful for:

Multi-turn conversations — trace an entire chat session end-to-end
Per-user analytics — understand how individual users interact with your application
Debugging — filter traces in the Arthur dashboard by session or user
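For the multi-turn case, every turn of a conversation can run inside one arthur.attributes() block so that all spans share the same session and user. This is a sketch under assumptions: `run_chat_session` is a hypothetical helper, and the two fake classes stand in for the SDK and the boto3 client so the example runs offline. Only `arthur.attributes(session_id=..., user_id=...)` and `client.converse(...)` come from the text above.

```python
from contextlib import contextmanager
from typing import List


def run_chat_session(arthur, client, model_id: str, session_id: str,
                     user_id: str, user_turns: List[str]) -> List[str]:
    """Run several conversation turns under one session/user context."""
    messages = []
    replies = []
    with arthur.attributes(session_id=session_id, user_id=user_id):
        for turn in user_turns:
            messages.append({"role": "user", "content": [{"text": turn}]})
            response = client.converse(modelId=model_id, messages=messages)
            assistant_message = response["output"]["message"]
            # Keep the assistant reply so the next turn has the full history.
            messages.append(assistant_message)
            replies.append(assistant_message["content"][0]["text"])
    return replies


class _FakeArthur:
    """Stand-in that records the attributes passed to each context block."""

    def __init__(self):
        self.seen = []

    @contextmanager
    def attributes(self, **attrs):
        self.seen.append(attrs)
        yield


class _FakeClient:
    """Stand-in that answers with the number of user turns received so far."""

    def converse(self, modelId, messages):
        user_turns = sum(1 for m in messages if m["role"] == "user")
        return {
            "output": {
                "message": {
                    "role": "assistant",
                    "content": [{"text": f"reply {user_turns}"}],
                }
            }
        }


fake_arthur = _FakeArthur()
replies = run_chat_session(
    fake_arthur, _FakeClient(), "amazon.titan-text-express-v1",
    "sess-1", "user-42", ["Hello!", "Tell me more."],
)
print(replies)
```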

Verify in Arthur

After running your instrumented application, traces appear in the Arthur GenAI Engine within seconds.

What to look for in the dashboard:

Trace list — each Bedrock call appears as a trace with the model ID, input messages, output, and latency
Session grouping — if you used arthur.attributes(session_id=...), traces are grouped by session
User filtering — filter by user_id to see a specific user's interactions
Token usage — input and output token counts are captured automatically

You can also query traces programmatically:

curl -X GET "${ARTHUR_BASE_URL}/api/v1/traces?task_ids=${ARTHUR_TASK_ID}" \
  -H "Authorization: Bearer ${ARTHUR_API_KEY}"
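The same request can be made from Python using only the standard library. The sketch below mirrors the curl command above (same path, `task_ids` query parameter, and Bearer header). `build_traces_request` and `fetch_traces` are hypothetical helpers, and the assumption that the endpoint returns JSON is not confirmed here, so inspect the response before relying on any particular field.

```python
import json
import urllib.parse
import urllib.request


def build_traces_request(base_url: str, api_key: str, task_id: str) -> urllib.request.Request:
    """Build the GET request for the traces endpoint without sending it."""
    query = urllib.parse.urlencode({"task_ids": task_id})
    url = f"{base_url.rstrip('/')}/api/v1/traces?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_traces(base_url: str, api_key: str, task_id: str, timeout: float = 10.0):
    """Send the request and decode the body, assuming it is JSON."""
    request = build_traces_request(base_url, api_key, task_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request needs no network access, so this line is safe to run anywhere.
request = build_traces_request("http://localhost:3030/", "your-api-key", "task-123")
print(request.full_url)
```

Calling `fetch_traces(os.environ["ARTHUR_BASE_URL"], os.environ["ARTHUR_API_KEY"], task_id)` against a reachable engine would then return the decoded response.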

Troubleshooting

Symptom	Fix
No traces appearing	Verify `ARTHUR_API_KEY` and `ARTHUR_BASE_URL` are correct and your Arthur Engine is reachable from your application.
Missing Bedrock spans	Call `arthur.instrument_bedrock()` before creating your `boto3.client("bedrock-runtime")` client.
AWS authentication errors	Ensure your AWS credentials and region grant access to the Bedrock model you're invoking.
Traces delayed	Traces are exported asynchronously via `BatchSpanProcessor`; allow a few seconds, or call `arthur.shutdown()` to flush.
`ImportError` on instrument	Run `pip install "arthur-observability-sdk[bedrock]"` to install the required extra.
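Before digging into the rows above, a quick check that the required packages are importable can rule out the installation case. This is a small sketch: `module_available` is a hypothetical helper, and `openinference.instrumentation.bedrock` is assumed to be the import path provided by the openinference-instrumentation-bedrock package.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package of a dotted name is missing.
        return False


for name in ("boto3", "openinference.instrumentation.bedrock"):
    status = "ok" if module_available(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports `missing`, reinstalling with the bedrock extra as described in the table is the likely fix.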

Next Steps

Now that you have unified observability across all your AWS Bedrock model calls, explore these capabilities:

Continuous Evaluations — automatically score Bedrock responses for quality, safety, and relevance
Prompt Management — version and A/B test the prompts you send to Bedrock models via arthur.get_prompt()
Agentic Experiments — if you're building agents that call Bedrock as one step in a chain, run end-to-end evaluations
Read our Best Practices for Building Agents Blog Series — observability and tracing fundamentals for building production agents
Other Integrations — if you also call OpenAI, Anthropic direct, or other providers, add arthur.instrument_openai() or arthur.instrument_langchain() to get a single pane of glass across all your LLM calls

flowchart LR
    A[Instrument Bedrock] --> B[View Traces]
    B --> C[Add Evaluations]
    B --> D[Manage Prompts]
    C --> E[Set Up Continuous Evals]
    D --> F[A/B Test Prompts]
    E --> G[Production Monitoring]
    F --> G
