Arthur + OpenAI Agents SDK

How does a developer instrument an OpenAI Agents SDK application with Arthur in under 10 minutes? You install the Arthur Observability SDK with the openai-agents extra, initialize the Arthur client with your credentials and task ID, call arthur.instrument_openai_agents(), and run your agents as usual. Arthur automatically captures the full agent execution graph — including agent handoffs, tool calls, multi-turn conversations, and LLM completions — as OpenTelemetry traces and sends them to your Arthur GenAI Engine instance.


Overview

The OpenAI Agents SDK (the production successor to Swarm) introduces first-class primitives for multi-agent orchestration: Agent, Runner, handoffs, and tool definitions. Arthur's instrumentation hooks into these primitives at the framework level, so you do not need to modify your agent logic. Once instrumented, traces cover:

  • Agent invocations — each agent run with name, instructions, and model
  • Handoffs — when one agent hands off to another, with parent-child span relationships preserved
  • Tool calls — input arguments and return values for every tool invocation
  • LLM completions — underlying OpenAI API calls with token usage, model, and latency
  • Session and user context — group traces by conversation or end-user

sequenceDiagram
    participant App as Your Application
    participant SDK as Arthur SDK
    participant Agents as OpenAI Agents SDK
    participant Engine as Arthur GenAI Engine

    App->>SDK: arthur.instrument_openai_agents()
    Note over SDK: Auto-instrumentation enabled
    App->>Agents: Runner.run(agent, ...)
    Agents-->>App: Result
    SDK->>Engine: Trace (spans, attributes)
    Note over Engine: Traces visible in dashboard

Prerequisites:

  • Python 3.10+
  • An Arthur GenAI Engine instance (cloud or local)
  • An Arthur API key — see API Keys to create one

Installation

Install the Arthur Observability SDK with the openai-agents extra:

pip install "arthur-observability-sdk[openai-agents]"

Initialize Arthur

Create an Arthur instance to configure telemetry export and connect to your Arthur GenAI Engine.

from arthur_observability_sdk import Arthur

arthur = Arthur(
    api_key="your-api-key",        # or set ARTHUR_API_KEY env var
    base_url="https://your-arthur-engine-instance",  # or set ARTHUR_BASE_URL env var
    task_id="<your-task-uuid>",    # Arthur task UUID
    service_name="my-agents-app",
)

| Parameter | Description |
| --- | --- |
| api_key | Your Arthur Engine API key. Falls back to the ARTHUR_API_KEY env var. |
| base_url | Base URL of your Arthur GenAI Engine. Falls back to the ARTHUR_BASE_URL env var, then http://localhost:3030. |
| task_id | Arthur task UUID for associating traces with a specific task. |
| service_name | OpenTelemetry service.name resource attribute. Used to identify your application in the Arthur dashboard. If task_id isn't specified, a new task is created based on service_name. |

Note: At least one of task_id or service_name must be provided. If task_id is not specified, a new task named after service_name is created.

Warning: Use environment variables for secrets. Set ARTHUR_API_KEY and ARTHUR_BASE_URL as environment variables (e.g., in a .env file) rather than hardcoding them in your application.
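
As one way to follow this advice, the sketch below gathers the constructor arguments from the environment and enforces the "at least one of task_id or service_name" rule before the client is created. The helper name load_arthur_settings and the ARTHUR_TASK_ID and ARTHUR_SERVICE_NAME variables are assumptions made for this example. The SDK is only documented here as reading ARTHUR_API_KEY and ARTHUR_BASE_URL.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "http://localhost:3030"  # documented fallback for base_url


def load_arthur_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect keyword arguments for Arthur(...) from environment variables."""
    env = os.environ if env is None else env

    api_key = env.get("ARTHUR_API_KEY")
    if not api_key:
        raise RuntimeError("ARTHUR_API_KEY is not set")

    # ARTHUR_TASK_ID and ARTHUR_SERVICE_NAME are hypothetical names chosen for
    # this sketch; they are not documented SDK environment variables.
    task_id = env.get("ARTHUR_TASK_ID")
    service_name = env.get("ARTHUR_SERVICE_NAME")
    if not (task_id or service_name):
        raise RuntimeError("Provide at least one of task_id or service_name")

    settings = {
        "api_key": api_key,
        "base_url": env.get("ARTHUR_BASE_URL", DEFAULT_BASE_URL),
    }
    if task_id:
        settings["task_id"] = task_id
    if service_name:
        settings["service_name"] = service_name
    return settings
```

With this helper in place, the client could be created as arthur = Arthur(**load_arthur_settings()), which keeps credentials out of source control and fails early when a required value is missing.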


Instrument OpenAI Agents

A single method call enables automatic instrumentation of the entire OpenAI Agents SDK. Call this before any Runner.run() calls.

import asyncio
from agents import Agent, Runner
from arthur_observability_sdk import Arthur

arthur = Arthur(
    api_key="your-api-key",
    base_url="https://your-arthur-engine-instance",
    task_id="<your-task-uuid>",
    service_name="my-agents-app",
)
arthur.instrument_openai_agents()

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
    model="gpt-4o-mini",
)

async def main():
    result = await Runner.run(agent, "What is observability?")
    print(result.final_output)

asyncio.run(main())

arthur.shutdown()

Key points:

  • instrument_openai_agents() must be called before any Runner.run(), Runner.run_sync(), or Runner.run_streamed() calls.
  • All agent invocations, handoffs, tool calls, and LLM completions are automatically traced.
  • Call arthur.shutdown() when your application exits to flush any remaining traces.
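
If an agent run raises, a trailing arthur.shutdown() call at the bottom of the script is never reached. A minimal sketch of one way to guard against that is shown here. The flush_on_exit helper is hypothetical (it is not part of the Arthur SDK) and relies only on the shutdown() method described above.

```python
from contextlib import contextmanager


@contextmanager
def flush_on_exit(client):
    """Yield the client and always call client.shutdown() afterwards."""
    try:
        yield client
    finally:
        # Runs on normal exit and when the body raises, so buffered
        # traces get a chance to be flushed either way.
        client.shutdown()
```

Usage would look like "with flush_on_exit(arthur): asyncio.run(main())". A plain try/finally around asyncio.run(main()) achieves the same effect; the context manager just makes the intent reusable.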

Trace Agent Handoffs and Tool Calls

The real power of the OpenAI Agents SDK is multi-agent orchestration — agents that hand off to other agents and invoke tools. Arthur captures this entire execution graph automatically.

Multi-agent handoffs

When you define agents that hand off to each other, Arthur records the full chain of agent invocations as nested spans:

import asyncio
from agents import Agent, Runner

# Assumes arthur.instrument_openai_agents() has already been called.

# Define the specialists first so the triage agent can reference them directly.
billing_agent = Agent(
    name="Billing",
    instructions="You handle billing and payment questions.",
    model="gpt-4o-mini",
)

technical_agent = Agent(
    name="Technical",
    instructions="You handle technical support questions.",
    model="gpt-4o-mini",
)

# handoffs takes Agent objects, not agent-name strings.
triage_agent = Agent(
    name="Triage",
    instructions="You route questions to the right specialist.",
    handoffs=[billing_agent, technical_agent],
    model="gpt-4o-mini",
)

async def main():
    result = await Runner.run(triage_agent, "I need help with my invoice.")
    print(result.final_output)

asyncio.run(main())

Arthur captures this as a trace with the following span hierarchy:

flowchart TD
    R["Runner.run() — CHAIN"] --> T["Triage Agent — AGENT"]
    T --> LLM1["OpenAI gpt-4o-mini — LLM"]
    T -->|handoff| B["Billing Agent — AGENT"]
    B --> LLM2["OpenAI gpt-4o-mini — LLM"]

Tool calls

When agents invoke tools, each tool execution is captured as a TOOL span with input arguments and return values:

from agents import Agent, Runner, function_tool
import asyncio

@function_tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is 72°F and sunny."

weather_agent = Agent(
    name="WeatherBot",
    instructions="You help users check the weather.",
    tools=[get_weather],
    model="gpt-4o-mini",
)

async def main():
    result = await Runner.run(weather_agent, "What's the weather in San Francisco?")
    print(result.final_output)

asyncio.run(main())

Add Session and User Context

Use arthur.attributes() as a context manager to attach session IDs, user IDs, and custom metadata to all traces within the block.

async def handle_request(user_id: str, session_id: str, message: str):
    with arthur.attributes(session_id=session_id, user_id=user_id):
        result = await Runner.run(agent, message)
        return result.final_output

This is especially useful for:

  • Multi-turn conversations — trace an entire chat session end-to-end
  • Per-user analytics — understand how individual users interact with your application
  • Debugging — filter traces in the Arthur dashboard by session or user

The attributes() context manager uses OpenInference context propagation, so all spans created within the block — including nested agent handoffs and tool calls — inherit the session and user attributes.


Verify in Arthur

After running your instrumented application, traces appear in the Arthur GenAI Engine within seconds.

Figure: Traces viewed on the Arthur Engine UI.

What to look for in the dashboard:

  • Trace list — each Runner.run() call appears as a trace with the full agent/handoff/tool span tree
  • Session grouping — if you used arthur.attributes(session_id=...), traces are grouped by session
  • User filtering — filter by user_id to see a specific user's interactions
  • Token usage — prompt and completion token counts are captured automatically

You can also query traces programmatically:

curl -X GET "${ARTHUR_BASE_URL}/api/v1/traces?task_ids=${ARTHUR_TASK_ID}" \
  -H "Authorization: Bearer ${ARTHUR_API_KEY}"
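
The same query can be issued from Python. The sketch below mirrors the curl command using only the standard library. The function names are made up for this example, and treating the response body as JSON is an assumption, since the response schema is not described here.

```python
import json
import urllib.parse
import urllib.request


def build_traces_request(base_url: str, api_key: str, task_id: str) -> urllib.request.Request:
    """Build the GET request for /api/v1/traces filtered by task ID."""
    query = urllib.parse.urlencode({"task_ids": task_id})
    url = f"{base_url.rstrip('/')}/api/v1/traces?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_traces(base_url: str, api_key: str, task_id: str, timeout: float = 10.0):
    """Send the request and return the decoded body (assumed to be JSON)."""
    request = build_traces_request(base_url, api_key, task_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # The schema is not documented in this guide, so return it as-is.
        return json.load(response)
```

Splitting request construction from the network call keeps the URL and header logic easy to check without a running Arthur GenAI Engine instance.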

Troubleshooting

| Symptom | Fix |
| --- | --- |
| No traces appear | Verify that ARTHUR_API_KEY is set, and confirm in the Arthur UI that the key is valid. |
| No traces appear | Ensure ARTHUR_BASE_URL is reachable and includes the https:// protocol. |
| Traces appear but no agent spans | Call instrument_openai_agents() before any Runner.run() calls. |
| Missing tool spans | Decorate tool functions with @function_tool. The Agents SDK requires this for proper registration. |
| ImportError on startup | Run pip install "arthur-observability-sdk[openai-agents]" to install the required extra. |
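
Several of these checks can be automated before the application starts. The following is a rough preflight sketch, not an Arthur-provided tool. The function names are hypothetical, and it accepts http:// as well as https:// because the documented default base URL is http://localhost:3030.

```python
import importlib.util
from typing import Iterable, List, Mapping
from urllib.parse import urlparse


def check_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems found in the environment settings."""
    problems = []
    if not env.get("ARTHUR_API_KEY"):
        problems.append("ARTHUR_API_KEY is not set")
    base_url = env.get("ARTHUR_BASE_URL")
    if base_url:
        parsed = urlparse(base_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("ARTHUR_BASE_URL must include http:// or https://")
    return problems


def check_imports(
    modules: Iterable[str] = ("agents", "arthur_observability_sdk"),
) -> List[str]:
    """Return a problem entry for each top-level module that cannot be found."""
    return [
        f"module {name!r} is not installed"
        for name in modules
        if importlib.util.find_spec(name) is None
    ]
```

Calling check_env(os.environ) + check_imports() at startup and printing any returned messages covers the missing-key, malformed-URL, and missing-extra rows above. It does not verify that the key is valid or that the engine is reachable.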

Next Steps

Now that you have OpenAI Agents SDK traces flowing into Arthur, explore these capabilities:

flowchart LR
    A[Instrument OpenAI Agents] --> B[View Traces]
    B --> C[Add Evaluations]
    B --> D[Manage Prompts]
    C --> E[Set Up Continuous Evals]
    D --> F[A/B Test Prompts]
    E --> G[Production Monitoring]
    F --> G