Integrations Overview

Arthur AI supports a broad set of frameworks and tools for building, observing, and evaluating AI applications — including OpenAI, Anthropic, LangChain, LiteLLM, CrewAI, LlamaIndex, OpenAI Agents, Google ADK, AWS Bedrock, and Mastra (JavaScript/TypeScript). To pick the right integration, identify the language you develop in (Python or JavaScript/TypeScript), the LLM framework you already use, and whether you need tracing, evaluation, or both. This page gives you a single at-a-glance reference for every supported integration so you can confirm compatibility with your stack and jump straight to the framework-specific guide.


How Arthur Integrations Work

Every Arthur integration follows the same core pattern: your application sends OpenTelemetry-compatible traces to the Arthur GenAI Engine, where they are stored, scored, and surfaced in the Arthur platform. The integration layer handles span creation, context propagation, and attribute mapping automatically — you just initialize Arthur and instrument your framework.

flowchart LR
    A[Your Application] -->|instrument| B[Arthur SDK / Integration]
    B -->|OTLP traces| C[Arthur GenAI Engine]
    C --> D[Observability Dashboard]
    C --> E[Continuous Evals]
    C --> F[Agentic Experiments]
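
The three jobs the integration layer is described as handling (span creation, context propagation, and attribute mapping) can be sketched in plain Python. This is a minimal illustration using only the standard library, not the Arthur SDK or the OpenTelemetry API: `ToyTracer`, `Span`, `to_payload`, and the attribute names are all hypothetical, and the JSON shape is only loosely modeled on an OTLP export.

```python
import json
import secrets
import time
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Span:
    """Simplified stand-in for an OpenTelemetry span (illustrative only)."""
    name: str
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    attributes: dict = field(default_factory=dict)
    start_ns: int = 0
    end_ns: int = 0


class ToyTracer:
    """Hypothetical tracer that creates spans and propagates parent context."""

    def __init__(self):
        self._stack = []    # currently open spans, innermost last
        self.finished = []  # completed spans, in the order they ended

    def start(self, name, **attributes):
        parent = self._stack[-1] if self._stack else None
        span = Span(
            name=name,
            # A child inherits its parent's trace ID; a root span starts a new trace.
            trace_id=parent.trace_id if parent else secrets.token_hex(16),
            span_id=secrets.token_hex(8),
            parent_span_id=parent.span_id if parent else None,
            attributes=dict(attributes),  # attribute mapping: framework fields to span attributes
            start_ns=time.time_ns(),
        )
        self._stack.append(span)
        return span

    def end(self):
        span = self._stack.pop()
        span.end_ns = time.time_ns()
        self.finished.append(span)
        return span


def to_payload(spans):
    """Serialize spans to JSON (assumed shape, not the real OTLP schema)."""
    return json.dumps({"spans": [asdict(s) for s in spans]})


tracer = ToyTracer()
root = tracer.start("agent.run", framework="langchain")
child = tracer.start("llm.call", model="example-model")
tracer.end()  # closes llm.call
tracer.end()  # closes agent.run
payload = to_payload(tracer.finished)
```

In this sketch the nested `llm.call` span shares the root's trace ID and records the root's span ID as its parent, which is what lets a backend reassemble a single trace from individually exported spans.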

The table below summarizes every supported integration, the language it targets, and what it provides.

| Integration | Language | Install Extra | Capabilities |
| --- | --- | --- | --- |
| OpenAI | Python | arthur-observability-sdk[openai] | Auto-instrumented traces for Chat Completions, Embeddings, and Assistants API calls |
| Anthropic | Python | arthur-observability-sdk[anthropic] | Auto-instrumented traces for messages.create() and other Anthropic SDK calls |
| LangChain | Python | arthur-observability-sdk[langchain] | Auto-instrumented traces for chains, agents, tools, retrievers, and LLM calls |
| LiteLLM | Python | arthur-observability-sdk[litellm] | Auto-instrumented traces across 100+ LLM providers via LiteLLM's unified interface |
| CrewAI | Python | arthur-observability-sdk[crewai] | Multi-agent crew tracing including agent actions, tool calls, and LLM completions |
| LlamaIndex | Python | arthur-observability-sdk[llama-index] | RAG pipeline tracing: retrieval, embeddings, and LLM synthesis |
| OpenAI Agents | Python | arthur-observability-sdk[openai-agents] | Agent handoff and tool-call tracing for the OpenAI Agents SDK |
| Google ADK | Python | arthur-observability-sdk[google-adk] | Gemini-powered agent tracing including tool calls and conversation turns |
| AWS Bedrock | Python | arthur-observability-sdk[bedrock] | Auto-instrumented traces for Bedrock invoke_model and converse calls across all hosted models |
| Mastra | JavaScript / TypeScript | @mastra/arthur | Telemetry exporter for Mastra agents, tools, and workflows, plus remote prompt management |
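
As a quick way to turn the table above into an install command, the lookup below maps each integration to its package. The package names and extras are copied from the table; the use of `pip` and `npm` as installers is an assumption, and `install_command` is a hypothetical helper rather than part of any Arthur tooling.

```python
# Extras for the Python SDK, copied from the "Install Extra" column.
PYTHON_EXTRAS = {
    "OpenAI": "openai",
    "Anthropic": "anthropic",
    "LangChain": "langchain",
    "LiteLLM": "litellm",
    "CrewAI": "crewai",
    "LlamaIndex": "llama-index",
    "OpenAI Agents": "openai-agents",
    "Google ADK": "google-adk",
    "AWS Bedrock": "bedrock",
}

# JavaScript / TypeScript integrations ship as standalone packages.
JS_PACKAGES = {"Mastra": "@mastra/arthur"}


def install_command(integration: str) -> str:
    """Return an install command for an integration (assumes pip or npm)."""
    if integration in PYTHON_EXTRAS:
        extra = PYTHON_EXTRAS[integration]
        # Quote the requirement so shells do not treat the brackets as a glob.
        return f'pip install "arthur-observability-sdk[{extra}]"'
    if integration in JS_PACKAGES:
        return f"npm install {JS_PACKAGES[integration]}"
    raise KeyError(f"Unknown integration: {integration}")


print(install_command("LlamaIndex"))
# prints: pip install "arthur-observability-sdk[llama-index]"
```

Substitute your own package manager (for example `uv`, `poetry`, `pnpm`, or `yarn`) if your project does not use `pip` or `npm` directly.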

Next Steps

Once tracing is set up through the guide for your framework, you can explore these capabilities:

flowchart LR
    A[Instrument Your Framework] --> B[View Traces]
    B --> C[Add Evaluations]
    B --> D[Manage Prompts]
    C --> E[Set Up Continuous Evals]
    D --> F[A/B Test Prompts]
    E --> G[Production Monitoring]
    F --> G