Transforms

Overview

To run evaluations on your agent traces, you need to tell Arthur which span attributes feed into each evaluator input. Transforms are the mechanism for this: you define an extraction mapping once — specifying which spans and attribute paths map to which evaluator variables — and reuse that configuration consistently across every trace in a task. The same saved transform can also be applied when manually adding a span's data to a dataset row, pre-filling column mappings so you don't have to re-specify them each time.

This page explains how to create transforms to extract span attributes for evaluators, and how to reuse a saved transform when adding trace data to a dataset. If you're new to spans and trace attributes, see the Tracing Overview first.


How Transforms Work

A transform is a named, reusable extraction configuration scoped to a task. It defines a list of variables, each of which specifies:

  • Variable name — the named input your evaluator expects (e.g., query, response, context)
  • Span name — the name of the span to extract from (e.g., LLMChain, RetrievalQA)
  • Attribute path — the dot-notation path to the value within that span's attributes (e.g., input.value, retrieval.documents.0.document.content). Supports wildcard * for array elements (e.g., attributes.results.*.name)
  • Fallback value (optional) — a JSON value to use if the attribute path is absent on a span

When you attach a transform to an evaluator, Arthur uses that mapping to extract the right values from each incoming trace span and pass them to the evaluator at evaluation time. When you use a transform in the "Add to Dataset" flow, it pre-fills the column mapping UI so the same extraction logic applies to your dataset rows.
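To make the path semantics concrete, here is a minimal local sketch of how a dot-notation path with numeric indexes, a * wildcard, and a fallback could be resolved against a span's attributes. It is an illustration only, not Arthur's extraction code: the function name resolve_path, the nested dict/list representation of attributes, and the choice to return a list for wildcard matches are all assumptions.

```python
from typing import Any

_MISSING = object()  # sentinel: distinguishes "absent" from a stored None


def resolve_path(attributes: Any, path: str, fallback: Any = None) -> Any:
    """Resolve a dot-notation path; return `fallback` if the path is absent."""

    def walk(node: Any, parts: list) -> Any:
        if not parts:
            return node
        head, rest = parts[0], parts[1:]
        if head == "*":
            # Wildcard: apply the remaining path to every array element.
            if not isinstance(node, list):
                return _MISSING
            found = [walk(item, rest) for item in node]
            return [value for value in found if value is not _MISSING]
        if isinstance(node, list):
            # Numeric segment: zero-based index into an array.
            if head.isdigit() and int(head) < len(node):
                return walk(node[int(head)], rest)
            return _MISSING
        if isinstance(node, dict) and head in node:
            return walk(node[head], rest)
        return _MISSING

    value = walk(attributes, path.split("."))
    return fallback if value is _MISSING else value
```

Under these assumptions, resolving results.*.name on a span with two results returns both names as a list, while a path that does not exist on the span returns whatever fallback you pass.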

flowchart LR
    A[Trace Span\nAttributes] -->|Transform\nExtraction Mapping| B[Evaluator\nInput Variables]
    A -->|Same Transform| C[Dataset Row\nColumn Mapping]
    B --> D[Evaluation\nResult]
    C --> E[Dataset\nRow]

Transforms are:

  • Scoped to a task — created under a specific task and not shared across tasks
  • Reusable — one transform can be referenced by multiple evaluators or dataset operations
  • Independently managed — you can update or delete them, subject to dependency checks

Prerequisites

Before creating a transform, make sure you have:

  • An Arthur workspace and project with at least one task configured
  • Traces flowing into that task (spans must be present to test extraction)
  • Your task_id — visible in the platform URL or via the Tasks API
  • The Arthur Python SDK installed, or API credentials for direct HTTP calls

pip install arthur-observability-sdk

You'll also need to know which span names and attribute paths you want to extract. If you're unsure what spans and attributes your traces contain, inspect a recent trace in the platform before proceeding.


Evaluate Traces Like This (Guided Flow)

The fastest way to set up a transform and a continuous evaluation together is the Evaluate Traces Like This action, available directly from any trace. This flow is context-aware — it shows you the actual span data from the trace you're viewing, so you can point-and-click to select attribute paths rather than typing them manually. When you submit, Arthur auto-creates the transform and the continuous eval in a single operation.

When to use this flow

Use Evaluate Traces Like This when you want to:

  • Set up continuous evaluation for the first time with guidance
  • Create a transform and continuous eval together without switching pages
  • Visually explore span attributes and pick paths interactively

Use the manual Create a Transform flow when you want to:

  • Create a reusable transform independently of any evaluator
  • Define a transform that multiple evaluators will share

How to use it

  1. Open any trace in the Arthur platform.
  2. Click TRACE ACTIONS (top right) → Evaluate Traces Like This.
  3. A full-screen drawer opens with three panes:
    • Left — the span tree for the trace
    • Center — span details / interactive attribute picker
    • Right — a two-step configuration form

Step 1 — Select an evaluator

Choose the LLM evaluator or ML evaluator you want to run. The form shows the evaluator's required input variables so you know exactly what you need to map.

Step 2 — Map variables to trace attributes

Two modes are available:

  • Create New (default) — Define variable mappings inline. For each evaluator variable:

    • Span Name — which span to extract from (e.g., LLMCall)
    • Attribute Path — dot-notation path to the value (e.g., attributes.input.value)
    • Fallback Value (optional) — JSON default if the attribute is absent
    • Click Select in trace to activate the interactive attribute picker in the center pane, which highlights the actual value from the trace as you navigate the span's JSON tree
    • A live preview shows the extracted value so you can confirm it before saving
  • Select Existing — Choose a transform you already created. The form auto-populates the variable mappings and you can adjust them if needed. You can also create a new transform inline from this mode.

Submit

In Create New mode, Arthur automatically:

  1. Creates a transform named <eval-name>_transform with the description Auto-created transform for continuous eval "<eval-name>"
  2. Creates the continuous evaluation linked to that transform
  3. Redirects you to the continuous eval detail page

In Select Existing mode, Arthur creates the continuous evaluation linked to your chosen transform (no new transform is created).

Tip: Transforms created through this flow appear on the Transforms management page like any other transform. You can edit, reuse, or delete them there.


Navigate to Transforms in the UI

Transforms are managed on a dedicated page within each task:

  1. Open the Arthur platform and navigate to your Project.
  2. Select the Task you want to configure transforms for.
  3. In the task sidebar, click Transforms.

The Transforms page lists all transforms for the task in a sortable, paginated table (sortable by name, created date, and updated date). From here you can create, view, edit, and delete transforms.


Create a Transform

A transform requires a name and a list of variable definitions. Each variable maps an evaluator input name to a specific span and attribute path within that span.

Step 1 — Identify your spans and attribute paths

Open a recent trace in the Arthur platform and note the span names and attribute keys you need. Common OpenInference attribute paths include:

Span Attribute Path                       Typical Use
input.value                               User query or chain input
output.value                              Final agent response
llm.input_messages.0.message.content      First LLM prompt message
llm.output_messages.0.message.content     First LLM completion
retrieval.documents.0.document.content    First retrieved document chunk
attributes.session_id                     Session identifier

Wildcard notation is supported for array elements — for example, attributes.results.*.name extracts the name field from every element in a results array.

Step 2 — Create the transform

Platform UI

  1. On the Transforms page, click Create Transform.
  2. Fill in the Transform Name (required) and an optional Description.
  3. Under Variable Mappings, click Add Variable for each evaluator input you need:
    • Variable Name — the evaluator input key (e.g., query)
    • Span Name — the name of the span to extract from (e.g., LLMChain)
    • Attribute Path — dot-notation path to the value (e.g., input.value)
    • Fallback Value (optional) — a valid JSON value used when the attribute is absent
  4. To speed up form entry, use Copy from Existing Transform to pre-fill the form from a previously created transform, then adjust as needed.
  5. Click Create Transform.

API

Python

import requests

ARTHUR_API_URL = "https://your-arthur-instance.com"
API_KEY = "your-api-key"
TASK_ID = "your-task-id"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

payload = {
    "name": "rag-evaluator-inputs",
    "description": "Extracts query, response, and context for RAG quality evaluators",
    "definition": {
        "variables": [
            {
                "variable_name": "query",
                "span_name": "LLMChain",
                "attribute_path": "input.value"
            },
            {
                "variable_name": "response",
                "span_name": "LLMChain",
                "attribute_path": "output.value"
            },
            {
                "variable_name": "context",
                "span_name": "RetrievalQA",
                "attribute_path": "retrieval.documents.0.document.content"
            }
        ]
    }
}

response = requests.post(
    f"{ARTHUR_API_URL}/api/v1/tasks/{TASK_ID}/traces/transforms",
    headers=headers,
    json=payload,
)
response.raise_for_status()
transform = response.json()
print(f"Created transform: {transform['id']}")

JavaScript

const ARTHUR_API_URL = "https://your-arthur-instance.com";
const API_KEY = "your-api-key";
const TASK_ID = "your-task-id";

const payload = {
  name: "rag-evaluator-inputs",
  description: "Extracts query, response, and context for RAG quality evaluators",
  definition: {
    variables: [
      {
        variable_name: "query",
        span_name: "LLMChain",
        attribute_path: "input.value",
      },
      {
        variable_name: "response",
        span_name: "LLMChain",
        attribute_path: "output.value",
      },
      {
        variable_name: "context",
        span_name: "RetrievalQA",
        attribute_path: "retrieval.documents.0.document.content",
      },
    ],
  },
};

const response = await fetch(
  `${ARTHUR_API_URL}/api/v1/tasks/${TASK_ID}/traces/transforms`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  }
);

const transform = await response.json();
console.log("Created transform:", transform.id);

cURL

curl -X POST https://your-arthur-instance.com/api/v1/tasks/{task_id}/traces/transforms \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "rag-evaluator-inputs",
    "description": "Extracts query, response, and context for RAG quality evaluators",
    "definition": {
      "variables": [
        {
          "variable_name": "query",
          "span_name": "LLMChain",
          "attribute_path": "input.value"
        },
        {
          "variable_name": "response",
          "span_name": "LLMChain",
          "attribute_path": "output.value"
        },
        {
          "variable_name": "context",
          "span_name": "RetrievalQA",
          "attribute_path": "retrieval.documents.0.document.content"
        }
      ]
    }
  }'

The response includes the new transform's id. Save this — you'll reference it when configuring evaluators and dataset operations.
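If you create several transforms, a small helper can keep the request body consistent. The sketch below builds the same payload shape used in the examples above from a compact mapping. The helper name build_transform_payload and its mapping format are hypothetical conveniences, not part of the Arthur SDK.

```python
from typing import Dict, Optional, Tuple


def build_transform_payload(
    name: str,
    mappings: Dict[str, Tuple[str, str]],
    description: Optional[str] = None,
) -> dict:
    """Build a transform request body.

    `mappings` maps variable_name -> (span_name, attribute_path).
    """
    variables = [
        {"variable_name": var, "span_name": span, "attribute_path": path}
        for var, (span, path) in mappings.items()
    ]
    payload = {"name": name, "definition": {"variables": variables}}
    if description:
        payload["description"] = description
    return payload


payload = build_transform_payload(
    "rag-evaluator-inputs",
    {
        "query": ("LLMChain", "input.value"),
        "response": ("LLMChain", "output.value"),
        "context": ("RetrievalQA", "retrieval.documents.0.document.content"),
    },
    description="Extracts query, response, and context for RAG quality evaluators",
)
```

The resulting payload can be passed as the json argument of the POST request shown earlier.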

Step 3 — Test the extraction against a real trace

Before attaching the transform to an evaluator, verify it extracts the values you expect by running it against a known trace.

Python

TRACE_ID = "a-known-trace-id"
TRANSFORM_ID = "your-new-transform-id"

response = requests.post(
    f"{ARTHUR_API_URL}/api/v1/traces/{TRACE_ID}/transforms/{TRANSFORM_ID}/extractions",
    headers=headers,
)
response.raise_for_status()
extracted = response.json()
print("Extracted variables:", extracted)
# e.g. {"query": "What is the capital of France?", "response": "Paris.", "context": "France is a country..."}

JavaScript

const TRACE_ID = "a-known-trace-id";
const TRANSFORM_ID = "your-new-transform-id";

const response = await fetch(
  `${ARTHUR_API_URL}/api/v1/traces/${TRACE_ID}/transforms/${TRANSFORM_ID}/extractions`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
    },
  }
);

const extracted = await response.json();
console.log("Extracted variables:", extracted);

cURL

curl -X POST https://your-arthur-instance.com/api/v1/traces/{trace_id}/transforms/{transform_id}/extractions \
  -H "Authorization: Bearer your-api-key"

If any variable returns null or is missing, double-check the span name and attribute path against your actual span data.
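This check is easy to automate. The sketch below assumes the extraction response is a flat mapping of variable name to value, as in the example comment above. The real response may be shaped differently, and the helper name find_unresolved is hypothetical.

```python
from typing import Any, Iterable, List, Mapping


def find_unresolved(extracted: Mapping[str, Any], expected: Iterable[str]) -> List[str]:
    """Return the expected variable names that are missing or null."""
    return [name for name in expected if extracted.get(name) is None]


# Hypothetical extraction result in which the retrieval span was not found.
sample = {"query": "What is the capital of France?", "response": "Paris.", "context": None}
unresolved = find_unresolved(sample, ["query", "response", "context"])
```

Any name that ends up in unresolved points at a span name or attribute path worth re-checking.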


Use Transforms with Evaluators

Supplying evaluator inputs is the primary reason to create a transform. When you configure a continuous evaluation or a one-time eval run, you specify a transform to tell Arthur how to populate each evaluator's input variables from span attributes.

Attaching a transform to an evaluator (platform UI)

  1. Navigate to your task in the Arthur platform.
  2. Open Evaluations → Configure Evaluator.
  3. In the Input Mapping section, select an existing transform from the dropdown, or create a new one inline.
  4. Confirm the variable preview shows the expected values from a sample trace.
  5. Save the evaluator configuration.

Why this matters

Without a transform, Arthur cannot know which part of a multi-span trace to pass to an evaluator. A single trace may contain dozens of spans — LLM calls, tool invocations, retrieval steps — each with different attribute shapes. The transform pins the evaluator to the exact span and attributes that are meaningful for that evaluation criterion.

For example, a faithfulness evaluator needs response and context. A toxicity evaluator only needs response. You can create separate transforms for each evaluator type, or create one comprehensive transform that covers all variables and let each evaluator use only what it needs.
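When several evaluators share one comprehensive transform, it helps to confirm that the transform covers each evaluator's required inputs. The sketch below does that check locally against a transform definition in the shape used on this page. The helper name missing_variables and the evaluator requirement sets are illustrative assumptions.

```python
from typing import Iterable, Set


def missing_variables(definition: dict, required: Iterable[str]) -> Set[str]:
    """Return the required evaluator inputs that the transform does not define."""
    provided = {v["variable_name"] for v in definition["variables"]}
    return set(required) - provided


comprehensive = {
    "variables": [
        {"variable_name": "query", "span_name": "LLMChain", "attribute_path": "input.value"},
        {"variable_name": "response", "span_name": "LLMChain", "attribute_path": "output.value"},
        {"variable_name": "context", "span_name": "RetrievalQA",
         "attribute_path": "retrieval.documents.0.document.content"},
    ]
}

faithfulness_gap = missing_variables(comprehensive, {"response", "context"})
toxicity_gap = missing_variables(comprehensive, {"response"})
```

An empty set means the transform already supplies everything that evaluator needs.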

Listing transforms for a task

To see all transforms available for a task (useful when configuring evaluators programmatically):

Python

response = requests.get(
    f"{ARTHUR_API_URL}/api/v1/tasks/{TASK_ID}/traces/transforms",
    headers=headers,
    params={"page_size": 50, "sort": "asc"},
)
response.raise_for_status()
transforms = response.json()
for t in transforms.get("results", []):
    print(f"{t['id']}  {t['name']}")

JavaScript

const response = await fetch(
  `${ARTHUR_API_URL}/api/v1/tasks/${TASK_ID}/traces/transforms?page_size=50&sort=asc`,
  {
    headers: { Authorization: `Bearer ${API_KEY}` },
  }
);
const transforms = await response.json();
transforms.results.forEach((t) => console.log(t.id, t.name));

cURL

curl -X GET "https://your-arthur-instance.com/api/v1/tasks/{task_id}/traces/transforms?page_size=50&sort=asc" \
  -H "Authorization: Bearer your-api-key"

Add Trace Data to Datasets

The same saved transform serves a secondary purpose: when you manually add a span's data to a dataset row, you can select an existing transform to pre-fill the column mapping. This eliminates repetitive re-specification of the same attribute paths every time you curate examples.

How it works in the platform UI

  1. Open a trace in the Arthur platform and select a span you want to add to a dataset.
  2. Click Add to Dataset.
  3. In the column mapping step, choose Use existing transform and select the transform by name.
    • The dropdown shows a match indicator for each transform:
      • Green (full match) — all transform variables match existing dataset columns
      • Yellow (partial match) — some variables match; new columns will be added for the rest
      • Red (no match) — none of the variables match existing columns
  4. Arthur executes the transform against the span and pre-fills each dataset column with the extracted value.
  5. Review and adjust any individual mappings if needed, then confirm.
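The match indicator logic described above can be expressed as a simple set comparison. This is a sketch of the rule as stated on this page, assuming variables are matched to columns by exact name. The function name match_indicator is hypothetical and the platform's actual comparison may differ.

```python
from typing import Iterable


def match_indicator(transform_variables: Iterable[str], dataset_columns: Iterable[str]) -> str:
    """Classify how well a transform's variables line up with dataset columns."""
    variables = set(transform_variables)
    matched = variables & set(dataset_columns)
    if variables and matched == variables:
        return "green"   # full match: every variable has a column
    if matched:
        return "yellow"  # partial match: new columns would be added for the rest
    return "red"         # no match


columns = ["query", "response", "label"]
full = match_indicator(["query", "response"], columns)
partial = match_indicator(["query", "context"], columns)
none = match_indicator(["context"], columns)
```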

Save a manual extraction as a transform

If you mapped columns manually and want to reuse that mapping in the future:

  1. After mapping columns in the Add to Dataset flow, click Save Transform.
  2. Enter a name and optional description.
  3. Arthur converts your column mappings into a saved transform available for future use.

Why reuse the same transform

Using the same transform for both evaluators and dataset population ensures consistency: the query column in your dataset is populated from exactly the same span and attribute path that your evaluator reads at eval time. This alignment is important when you use dataset rows as ground-truth examples or few-shot references in evaluator prompts.


Manage Transforms

Retrieve a specific transform

Python

TRANSFORM_ID = "your-transform-id"

response = requests.get(
    f"{ARTHUR_API_URL}/api/v1/traces/transforms/{TRANSFORM_ID}",
    headers=headers,
)
response.raise_for_status()
print(response.json())

JavaScript

const response = await fetch(
  `${ARTHUR_API_URL}/api/v1/traces/transforms/${TRANSFORM_ID}`,
  { headers: { Authorization: `Bearer ${API_KEY}` } }
);
const transform = await response.json();
console.log(transform);

cURL

curl -X GET https://your-arthur-instance.com/api/v1/traces/transforms/{transform_id} \
  -H "Authorization: Bearer your-api-key"

Update a transform

Use PATCH to update the name, description, or variable definitions of an existing transform. Changes take effect for any subsequent evaluations or dataset operations that reference this transform.

Python

update_payload = {
    "name": "rag-evaluator-inputs-v2",
    "definition": {
        "variables": [
            {
                "variable_name": "query",
                "span_name": "LLMChain",
                "attribute_path": "input.value"
            },
            {
                "variable_name": "response",
                "span_name": "LLMChain",
                "attribute_path": "output.value"
            },
            {
                "variable_name": "context",
                "span_name": "RetrievalQA",
                "attribute_path": "retrieval.documents.0.document.content"
            },
            {
                "variable_name": "session_id",
                "span_name": "LLMChain",
                "attribute_path": "attributes.session_id"
            }
        ]
    }
}

response = requests.patch(
    f"{ARTHUR_API_URL}/api/v1/traces/transforms/{TRANSFORM_ID}",
    headers=headers,
    json=update_payload,
)
response.raise_for_status()
print("Updated:", response.json())

JavaScript

const updatePayload = {
  name: "rag-evaluator-inputs-v2",
  definition: {
    variables: [
      {
        variable_name: "query",
        span_name: "LLMChain",
        attribute_path: "input.value",
      },
      {
        variable_name: "response",
        span_name: "LLMChain",
        attribute_path: "output.value",
      },
      {
        variable_name: "context",
        span_name: "RetrievalQA",
        attribute_path: "retrieval.documents.0.document.content",
      },
      {
        variable_name: "session_id",
        span_name: "LLMChain",
        attribute_path: "attributes.session_id",
      },
    ],
  },
};

const response = await fetch(
  `${ARTHUR_API_URL}/api/v1/traces/transforms/${TRANSFORM_ID}`,
  {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(updatePayload),
  }
);
const updated = await response.json();
console.log("Updated:", updated);

cURL

curl -X PATCH https://your-arthur-instance.com/api/v1/traces/transforms/{transform_id} \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "rag-evaluator-inputs-v2",
    "definition": {
      "variables": [
        {
          "variable_name": "query",
          "span_name": "LLMChain",
          "attribute_path": "input.value"
        },
        {
          "variable_name": "response",
          "span_name": "LLMChain",
          "attribute_path": "output.value"
        },
        {
          "variable_name": "context",
          "span_name": "RetrievalQA",
          "attribute_path": "retrieval.documents.0.document.content"
        },
        {
          "variable_name": "session_id",
          "span_name": "LLMChain",
          "attribute_path": "attributes.session_id"
        }
      ]
    }
  }'
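The update examples above resend the complete variable list, which suggests (but does not confirm) that the definition in a PATCH body replaces the existing one. Under that assumption, a helper like the one below can derive the new definition from the current one while guarding against duplicate variable names. The name with_added_variable is hypothetical.

```python
import copy


def with_added_variable(definition: dict, variable: dict) -> dict:
    """Return a copy of `definition` with `variable` appended.

    Raises ValueError if the variable name is already defined, because
    variable names must be unique within a transform.
    """
    existing = {v["variable_name"] for v in definition["variables"]}
    if variable["variable_name"] in existing:
        raise ValueError(f"Duplicate variable_name: {variable['variable_name']!r}")
    updated = copy.deepcopy(definition)  # leave the caller's definition untouched
    updated["variables"].append(dict(variable))
    return updated


current = {
    "variables": [
        {"variable_name": "query", "span_name": "LLMChain", "attribute_path": "input.value"},
    ]
}
new_definition = with_added_variable(
    current,
    {"variable_name": "session_id", "span_name": "LLMChain",
     "attribute_path": "attributes.session_id"},
)
```

The returned definition can then be sent as the definition field of the update payload.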

Check dependents before deleting

Before deleting a transform, check whether any continuous evaluations, agentic experiments, or agentic notebooks depend on it. The API returns a 409 if you attempt to delete a transform that has active dependents.

Python

response = requests.get(
    f"{ARTHUR_API_URL}/api/v1/traces/transforms/{TRANSFORM_ID}/dependents",
    headers=headers,
)
response.raise_for_status()
dependents = response.json()
print("Dependents:", dependents)

JavaScript

const response = await fetch(
  `${ARTHUR_API_URL}/api/v1/traces/transforms/${TRANSFORM_ID}/dependents`,
  { headers: { Authorization: `Bearer ${API_KEY}` } }
);
const dependents = await response.json();
console.log("Dependents:", dependents);

cURL

curl -X GET https://your-arthur-instance.com/api/v1/traces/transforms/{transform_id}/dependents \
  -H "Authorization: Bearer your-api-key"

In the platform UI, the delete confirmation dialog lists all dependent continuous evals, agentic experiments, and agentic notebooks with links so you can navigate to them directly before proceeding.

Delete a transform

Python

response = requests.delete(
    f"{ARTHUR_API_URL}/api/v1/traces/transforms/{TRANSFORM_ID}",
    headers=headers,
)
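if response.status_code == 204:
    print("Transform deleted.")
elif response.status_code == 409:
    print("Cannot delete: transform has active dependents. Remove them first.")
else:
    # Surface any other error status instead of failing silently.
    response.raise_for_status()

JavaScript

const response = await fetch(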
  `${ARTHUR_API_URL}/api/v1/traces/transforms/${TRANSFORM_ID}`,
  {
    method: "DELETE",
    headers: { Authorization: `Bearer ${API_KEY}` },
  }
);

if (response.status === 204) {
  console.log("Transform deleted.");
} else if (response.status === 409) {
  console.log("Cannot delete: transform has active dependents.");
} else {
  // Report any other status instead of failing silently.
  console.error(`Unexpected status: ${response.status}`);
}

cURL

curl -X DELETE https://your-arthur-instance.com/api/v1/traces/transforms/{transform_id} \
  -H "Authorization: Bearer your-api-key"
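If you delete transforms from scripts, it can help to wrap the status handling so it is testable without network access. In the sketch below, send_delete is a hypothetical callable that issues the DELETE request and returns the HTTP status code. Only the 204 and 409 meanings come from this page.

```python
from typing import Callable


def delete_transform_safely(transform_id: str, send_delete: Callable[[str], int]) -> bool:
    """Return True if deleted, False if blocked by dependents.

    Raises RuntimeError for any status other than 204 or 409.
    """
    status = send_delete(transform_id)
    if status == 204:
        return True
    if status == 409:
        # The transform still has dependents; list them before retrying.
        return False
    raise RuntimeError(f"Unexpected status {status} deleting transform {transform_id}")


# Stand-in senders for illustration; a real one would call requests.delete(...)
# and return response.status_code.
deleted = delete_transform_safely("t-1", lambda _id: 204)
blocked = delete_transform_safely("t-2", lambda _id: 409)
```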

Troubleshooting

Extraction returns null for one or more variables

Cause: The span name or attribute path in your transform doesn't match the actual span or key in the trace.

Fix: Run the extraction test endpoint against a known trace and compare the returned keys against your configured paths. Span names are case-sensitive. Attribute paths use dot notation and zero-based index notation for arrays (e.g., llm.output_messages.0.message.content). Use wildcard * notation when the index is variable (e.g., retrieval.documents.*.document.content).
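Because span names are case-sensitive, a casing difference is a common cause of null results. The sketch below flags span names in a trace that differ from the configured name only by case. The helper name case_mismatches is hypothetical, and the list of span names is assumed to come from a trace you inspected.

```python
from typing import Iterable, List


def case_mismatches(configured: str, actual_span_names: Iterable[str]) -> List[str]:
    """Return span names that equal `configured` except for letter case.

    Returns an empty list when an exact match exists.
    """
    names = set(actual_span_names)
    if configured in names:
        return []
    return sorted(n for n in names if n.casefold() == configured.casefold())


suggestions = case_mismatches("llmchain", ["LLMChain", "RetrievalQA"])
```

A non-empty result suggests correcting the span name in the transform rather than the attribute path.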


409 Conflict when deleting a transform

Cause: One or more continuous evaluations, agentic experiments, or agentic notebooks reference this transform.

Fix: Call GET /api/v1/traces/transforms/{transform_id}/dependents to list all dependent resources. Remove or reconfigure those resources to use a different transform before retrying the delete.


Transform not appearing in the evaluator dropdown

Cause: The transform was created under a different task than the one your evaluator belongs to.

Fix: Transforms are task-scoped. Confirm the task_id used when creating the transform matches the task where you're configuring the evaluator. If needed, create a new transform under the correct task.


Extraction test passes but evaluator receives wrong values in production

Cause: Your spans may have variable structure depending on the execution path (e.g., some traces skip retrieval, so retrieval.documents.0.document.content is absent).

Fix: Review a sample of traces that represent different execution paths. Consider creating separate transforms for different span types (e.g., one for LLM-only spans, one for RAG spans) and configuring evaluators accordingly. Use the fallback value field on each variable to supply a default when an attribute is absent.
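One way to review a sample of traces is to run the extraction test on each and measure how often every variable comes back null. The sketch below assumes each extraction result is a flat mapping of variable name to value. The function name null_rates and the sample data are hypothetical.

```python
from typing import Any, Dict, List, Mapping


def null_rates(samples: List[Mapping[str, Any]]) -> Dict[str, float]:
    """Return, per variable, the fraction of samples where it is missing or null."""
    if not samples:
        return {}
    names = set().union(*(s.keys() for s in samples))
    return {
        name: sum(1 for s in samples if s.get(name) is None) / len(samples)
        for name in sorted(names)
    }


# Hypothetical extraction results from four traces; two of them skipped retrieval.
samples = [
    {"query": "a", "context": "doc 1"},
    {"query": "b", "context": None},
    {"query": "c"},
    {"query": "d", "context": "doc 2"},
]
rates = null_rates(samples)
```

A variable with a high null rate is a candidate for a fallback value or a separate transform for that execution path.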


422 Validation Error on transform creation

Cause: The request body is missing required fields or contains an invalid definition format.

Fix: Ensure your definition.variables array contains at least one entry, and that each variable has a non-empty variable_name, span_name, and attribute_path. Variable names must be unique within a transform. Check that name is provided and non-empty.
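You can catch these problems before sending the request. The sketch below checks a payload against the rules listed in this section. It is a local pre-flight check under those stated rules, not the server's validation logic, and the function name validate_transform_payload is hypothetical.

```python
from typing import List


def validate_transform_payload(payload: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    errors = []
    if not str(payload.get("name") or "").strip():
        errors.append("name is required and must be non-empty")

    variables = (payload.get("definition") or {}).get("variables") or []
    if not variables:
        errors.append("definition.variables must contain at least one entry")

    seen = set()
    for i, var in enumerate(variables):
        for field in ("variable_name", "span_name", "attribute_path"):
            if not str(var.get(field) or "").strip():
                errors.append(f"variables[{i}].{field} must be non-empty")
        name = var.get("variable_name")
        if name:
            if name in seen:
                errors.append(f"duplicate variable_name: {name!r}")
            seen.add(name)
    return errors


bad_payload = {
    "name": "",
    "definition": {
        "variables": [
            {"variable_name": "q", "span_name": "LLMChain", "attribute_path": "input.value"},
            {"variable_name": "q", "span_name": "", "attribute_path": "output.value"},
        ]
    },
}
problems = validate_transform_payload(bad_payload)
```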


Next Steps

Now that you have transforms configured, you're ready to:

  • Configure LLM Evaluators — Build LLM-as-a-judge evaluators that consume the variables your transforms extract
  • Set Up Continuous Evaluations — Attach evaluators to your task so every incoming trace is automatically scored
  • Manage Datasets — Build curated datasets from trace data using transforms to pre-fill column mappings
  • Run Agentic Experiments — Use transforms in experiment configurations to evaluate prompt and model variants systematically