Arthur + OpenAI Agents SDK

How does a developer instrument an OpenAI Agents SDK application with Arthur in under 10 minutes? You install the Arthur Observability SDK with the openai-agents extra, initialize the Arthur client with your credentials and task ID, call arthur.instrument_openai_agents(), and run your agents as usual. Arthur automatically captures the full agent execution graph — including agent handoffs, tool calls, multi-turn conversations, and LLM completions — as OpenTelemetry traces and sends them to your Arthur GenAI Engine instance.

Overview

The OpenAI Agents SDK (the production successor to Swarm) introduces first-class primitives for multi-agent orchestration: Agent, Runner, handoffs, and tool definitions. Arthur's instrumentation hooks into these primitives at the framework level, so you get complete traces without modifying your agent logic. Once instrumented, you can inspect:

Agent invocations — each agent run with name, instructions, and model
Handoffs — when one agent hands off to another, with parent-child span relationships preserved
Tool calls — input arguments and return values for every tool invocation
LLM completions — underlying OpenAI API calls with token usage, model, and latency
Session and user context — group traces by conversation or end-user

sequenceDiagram
    participant App as Your Application
    participant SDK as Arthur SDK
    participant Agents as OpenAI Agents SDK
    participant Engine as Arthur GenAI Engine

    App->>SDK: arthur.instrument_openai_agents()
    Note over SDK: Auto-instrumentation enabled
    App->>Agents: Runner.run(agent, ...)
    Agents-->>App: Result
    SDK->>Engine: Trace (spans, attributes)
    Note over Engine: Traces visible in dashboard

Prerequisites:

Python 3.10+
An Arthur GenAI Engine instance (cloud or local)
An Arthur API key — see API Keys to create one

Installation

Install the Arthur Observability SDK with the openai-agents extra:

pip install "arthur-observability-sdk[openai-agents]"

Initialize Arthur

Create an Arthur instance to configure telemetry export and connect to your Arthur GenAI Engine.

from arthur_observability_sdk import Arthur

arthur = Arthur(
    api_key="your-api-key",        # or set ARTHUR_API_KEY env var
    base_url="https://your-arthur-engine-instance",  # or set ARTHUR_BASE_URL env var
    task_id="<your-task-uuid>",    # Arthur task UUID
    service_name="my-agents-app",
)

Parameter	Description
`api_key`	Your Arthur Engine API key. Falls back to `ARTHUR_API_KEY` env var.
`base_url`	Base URL of your Arthur GenAI Engine. Falls back to `ARTHUR_BASE_URL` env var, then `http://localhost:3030`.
`task_id`	Arthur task UUID for associating traces with a specific task.
`service_name`	OpenTelemetry `service.name` resource attribute. Used to identify your application in the Arthur dashboard. Creates a new task based on `service_name` if `task_id` isn't specified.

📘
At least one of task_id or service_name must be provided. A new task with the service_name will be created if task_id is not specified.

⚠️
Use environment variables for secrets. Set ARTHUR_API_KEY and ARTHUR_BASE_URL as environment variables (e.g., in a .env file) rather than hardcoding them in your application.
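As a minimal sketch of the environment-variable approach, the helper below collects the constructor arguments from the environment instead of hardcoding them. The helper name `arthur_kwargs_from_env` is hypothetical and not part of the Arthur SDK, and `ARTHUR_TASK_ID` is an assumed variable name (it also appears in the curl example later in this guide). The fallback URL mirrors the default listed in the parameter table.

```python
import os
from typing import Mapping, Optional


def arthur_kwargs_from_env(
    service_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Build keyword arguments for Arthur(...) from environment variables.

    Hypothetical helper for illustration; ARTHUR_TASK_ID is an assumed
    variable name, not one the SDK is documented to read.
    """
    env = os.environ if env is None else env

    api_key = env.get("ARTHUR_API_KEY")
    if not api_key:
        # Fail early rather than exporting traces that will be rejected.
        raise RuntimeError("ARTHUR_API_KEY is not set")

    kwargs = {
        "api_key": api_key,
        "base_url": env.get("ARTHUR_BASE_URL", "http://localhost:3030"),
        "service_name": service_name,
    }
    task_id = env.get("ARTHUR_TASK_ID")
    if task_id:
        # Optional: without a task_id, the task is derived from service_name.
        kwargs["task_id"] = task_id
    return kwargs
```

With this in place, the client could be created as `Arthur(**arthur_kwargs_from_env("my-agents-app"))`, which keeps credentials out of source control.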

Instrument OpenAI Agents

A single method call enables automatic instrumentation of the entire OpenAI Agents SDK. Call this before any Runner.run() calls.

import asyncio
from agents import Agent, Runner
from arthur_observability_sdk import Arthur

arthur = Arthur(
    api_key="your-api-key",
    base_url="https://your-arthur-engine-instance",
    task_id="<your-task-uuid>",
    service_name="my-agents-app",
)
arthur.instrument_openai_agents()

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
    model="gpt-4o-mini",
)

async def main():
    result = await Runner.run(agent, "What is observability?")
    print(result.final_output)

asyncio.run(main())

arthur.shutdown()

Key points:

instrument_openai_agents() must be called before any Runner.run(), Runner.run_sync(), or Runner.run_streamed() calls.
All agent invocations, handoffs, tool calls, and LLM completions are automatically traced.
Call arthur.shutdown() when your application exits to flush any remaining traces.
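The ordering and shutdown rules above can be sketched as a small wrapper. `run_instrumented` is a hypothetical helper, not an SDK function; it relies only on the two methods named in this guide (`instrument_openai_agents()` and `shutdown()`) and takes the client as a parameter so the pattern can be exercised with a stand-in object.

```python
import asyncio
from typing import Any, Callable, Coroutine


def run_instrumented(
    client: Any,
    main: Callable[[], Coroutine[Any, Any, Any]],
) -> Any:
    """Instrument first, run the async entry point, always flush on exit.

    `client` is expected to expose instrument_openai_agents() and
    shutdown(), as the Arthur instance in this guide does.
    """
    # Instrumentation must be in place before the first Runner call.
    client.instrument_openai_agents()
    try:
        return asyncio.run(main())
    finally:
        # Runs even if main() raises, so buffered traces are still flushed.
        client.shutdown()
```

In the example above, `run_instrumented(arthur, main)` would stand in for the separate `arthur.instrument_openai_agents()`, `asyncio.run(main())`, and `arthur.shutdown()` calls, with the added guarantee that shutdown still runs when the agent raises.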

Trace Agent Handoffs and Tool Calls

The real power of the OpenAI Agents SDK is multi-agent orchestration — agents that hand off to other agents and invoke tools. Arthur captures this entire execution graph automatically.

Multi-agent handoffs

When you define agents that hand off to each other, Arthur records the full chain of agent invocations as nested spans:

from agents import Agent, Runner
import asyncio

# Define the specialists first so the triage agent can reference
# the actual Agent objects in its handoffs list.
billing_agent = Agent(
    name="Billing",
    instructions="You handle billing and payment questions.",
    model="gpt-4o-mini",
)

technical_agent = Agent(
    name="Technical",
    instructions="You handle technical support questions.",
    model="gpt-4o-mini",
)

triage_agent = Agent(
    name="Triage",
    instructions="You route questions to the right specialist.",
    handoffs=[billing_agent, technical_agent],
    model="gpt-4o-mini",
)

async def main():
    result = await Runner.run(triage_agent, "I need help with my invoice.")
    print(result.final_output)

asyncio.run(main())

Arthur captures this as a trace with the following span hierarchy:

flowchart TD
    R["Runner.run() — CHAIN"] --> T["Triage Agent — AGENT"]
    T --> LLM1["OpenAI gpt-4o-mini — LLM"]
    T -->|handoff| B["Billing Agent — AGENT"]
    B --> LLM2["OpenAI gpt-4o-mini — LLM"]

Tool calls

When agents invoke tools, each tool execution is captured as a TOOL span with input arguments and return values:

from agents import Agent, Runner, function_tool
import asyncio

@function_tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is 72°F and sunny."

weather_agent = Agent(
    name="WeatherBot",
    instructions="You help users check the weather.",
    tools=[get_weather],
    model="gpt-4o-mini",
)

async def main():
    result = await Runner.run(weather_agent, "What's the weather in San Francisco?")
    print(result.final_output)

asyncio.run(main())

Add Session and User Context

Use arthur.attributes() as a context manager to attach session IDs, user IDs, and custom metadata to all traces within the block.

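from agents import Runner

# Assumes `arthur` and `agent` are set up as in the earlier sections.
async def handle_request(user_id: str, session_id: str, message: str):
    # Every span created inside this block carries the session and user IDs.
    with arthur.attributes(session_id=session_id, user_id=user_id):
        result = await Runner.run(agent, message)
        return result.final_output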

This is especially useful for:

Multi-turn conversations — trace an entire chat session end-to-end
Per-user analytics — understand how individual users interact with your application
Debugging — filter traces in the Arthur dashboard by session or user

The attributes() context manager uses OpenInference context propagation, so all spans created within the block — including nested agent handoffs and tool calls — inherit the session and user attributes.
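To illustrate why nested spans inherit these attributes, here is a minimal sketch of context propagation built on Python's standard `contextvars` module. It is a conceptual stand-in, not Arthur's or OpenInference's actual implementation; `attributes`, `current_attributes`, `fake_tool_call`, and `handle` are hypothetical names used only in this example.

```python
import asyncio
import contextvars
from contextlib import contextmanager

# Holds the attributes that apply to the currently executing context.
_attrs = contextvars.ContextVar("attrs", default={})


@contextmanager
def attributes(**kwargs):
    """Attach attributes for the duration of the with-block."""
    token = _attrs.set({**_attrs.get(), **kwargs})
    try:
        yield
    finally:
        # Restore the previous attributes when the block exits.
        _attrs.reset(token)


def current_attributes() -> dict:
    """Return a copy of the attributes visible in the current context."""
    return dict(_attrs.get())


async def fake_tool_call() -> dict:
    # A nested call needs no extra arguments to see the attributes.
    return current_attributes()


async def handle(session_id: str, user_id: str) -> dict:
    with attributes(session_id=session_id, user_id=user_id):
        return await fake_tool_call()


seen = asyncio.run(handle("session-1", "user-42"))
print(seen)
```

The nested coroutine never receives the session or user ID as a parameter, yet it observes both. The same idea lets spans created deep inside a handoff or tool call pick up the values set at the top of the request.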

Verify in Arthur

After running your instrumented application, traces appear in the Arthur GenAI Engine within seconds.

What to look for in the dashboard:

Trace list — each Runner.run() call appears as a trace with the full agent/handoff/tool span tree
Session grouping — if you used arthur.attributes(session_id=...), traces are grouped by session
User filtering — filter by user_id to see a specific user's interactions
Token usage — prompt and completion token counts are captured automatically

You can also query traces programmatically:

curl -X GET "${ARTHUR_BASE_URL}/api/v1/traces?task_ids=${ARTHUR_TASK_ID}" \
  -H "Authorization: Bearer ${ARTHUR_API_KEY}"
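The same query can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above; the helper names are hypothetical, and because the response schema is not described here, `fetch_traces` returns the decoded JSON as-is rather than assuming any fields.

```python
import json
import urllib.parse
import urllib.request


def build_traces_request(
    base_url: str, task_id: str, api_key: str
) -> urllib.request.Request:
    """Build the GET request from the curl example (no network I/O)."""
    query = urllib.parse.urlencode({"task_ids": task_id})
    url = f"{base_url.rstrip('/')}/api/v1/traces?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_traces(base_url: str, task_id: str, api_key: str, timeout: float = 10.0):
    """Send the request and return the parsed JSON body unchanged."""
    request = build_traces_request(base_url, task_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from the network call keeps the URL and header logic easy to check without a running Arthur GenAI Engine instance.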

Troubleshooting

Symptom	Fix
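No traces appear	Verify that `ARTHUR_API_KEY` is set in your environment and that the key is still valid in the Arthur UI.
No traces appear	Ensure `ARTHUR_BASE_URL` is reachable and includes the protocol (`https://`, or `http://` for a local instance such as `http://localhost:3030`).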
Traces appear but no agent spans	Call `instrument_openai_agents()` before any `Runner.run()` calls.
Missing tool spans	Decorate tool functions with `@function_tool` — the Agents SDK requires this for proper registration.
`ImportError` on startup	Run `pip install "arthur-observability-sdk[openai-agents]"` to install the required extra.
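Several of these checks can be run before starting the application. The sketch below is a hypothetical preflight helper, not part of the Arthur SDK; the module names `arthur_observability_sdk` and `agents` are taken from the import statements in this guide, and the checks cover only the environment and installation rows of the table.

```python
import importlib.util
import os
from typing import List, Mapping, Optional
from urllib.parse import urlparse


def preflight_problems(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return human-readable problems found in the local setup."""
    env = os.environ if env is None else env
    problems = []

    if not env.get("ARTHUR_API_KEY"):
        problems.append("ARTHUR_API_KEY is not set")

    base_url = env.get("ARTHUR_BASE_URL", "")
    if base_url:
        parsed = urlparse(base_url)
        # A usable base URL needs an explicit protocol and a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("ARTHUR_BASE_URL must include http:// or https://")

    # Module names as imported elsewhere in this guide.
    for module in ("arthur_observability_sdk", "agents"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"Python module '{module}' is not installed")

    return problems
```

An empty list means none of these particular problems were detected; it does not confirm that the API key is valid or that the engine is reachable.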

Next Steps

Now that you have OpenAI Agents SDK traces flowing into Arthur, explore these capabilities:

Continuous Evaluations — automatically score agent responses for quality, safety, and relevance
Agentic Experiments — compare agent configurations side-by-side
Prompt Management — fetch and version prompts from Arthur Engine using arthur.get_prompt() to keep your agent instructions centrally managed
Read our Best Practices for Building Agents Blog Series — observability and tracing fundamentals for building production agents
Other Integrations — combine instrument_openai_agents() with instrument_openai() for even deeper LLM-level traces

flowchart LR
    A[Instrument OpenAI Agents] --> B[View Traces]
    B --> C[Add Evaluations]
    B --> D[Manage Prompts]
    C --> E[Set Up Continuous Evals]
    D --> F[A/B Test Prompts]
    E --> G[Production Monitoring]
    F --> G
