Opening a trace
Open a trace with client.trace(name, ...) as a context manager. The trace stays active for the duration of the block; spans, messages, feedback, and signals you record inside it attach automatically. When the block exits, the trace flushes.
with client.trace( "support_agent", conversation_id="thread_123", user_id="cus_123",) as trace: trace.message(role="user", content="Where is my order?") ...The most useful arguments to trace():
| Argument | Type | Purpose |
|---|---|---|
| name | str | Trace name — usually the agent or workflow. |
| user_id | str | None | Associate the run with an end user. |
| session_id | str | None | Group several traces into one session. |
| conversation_id | str | None | Tie turns of a multi-turn conversation together. |
| external_trace_id | str | None | Correlate with an ID from your own system. |
| attributes | dict | None | Arbitrary metadata attached to the trace. |
| context | dict | None | Extra fields merged onto every event in the trace. |
Messages
Record the conversation with trace.message(...). Content can be a plain string or a structured content list (for multimodal turns). Set is_internal=True for messages the user never sees, such as a scratchpad or a system reflection.
trace.message(role="user", content="Where is my order?")trace.message(role="assistant", content="Your order has shipped.")# link an assistant tool message to the originating tool calltrace.message(role="tool", content=result_json, tool_call_id=tool_call.id)Spans
A span is a typed unit of work inside a trace. Open one with trace.span(type, ...) as a context manager and record its input and output. The span captures its own latency and marks itself failed if an exception propagates out of the block.
with trace.span("retrieval", name="search_docs") as span: span.record_input({"query": "refund policy"}) docs = search("refund policy") span.record_output({"hits": len(docs)})Span types are free-form strings. The SDK and dashboard understand a few conventional ones, and you can use your own for custom steps:
| Span type | Meaning |
|---|---|
llm | A model call. Use trace.llm(...) as a shorthand. |
tool | A tool invocation. Usually recorded via trace.tool(...). |
task | A multi-step unit of work — see the @task decorator. |
retrieval | A retrieval / RAG step. |
span | A generic span for anything else. |
LLM spans & usage
trace.llm(name, ...) is a shorthand for an llm span. Pass the model and provider so they are recorded as structured metadata, then use record_input / record_output and set_usage. record_outputaccepts pydantic models directly, so you don't serialize the response yourself.
from mv37.rollout import usage_from_openaiwith trace.llm("openai.responses", model="gpt-4.1-mini") as span: span.record_input({"messages": messages}) response = openai_client.responses.create(model="gpt-4.1-mini", input=messages) span.record_output(response) span.set_usage(**usage_from_openai(response))set_usage accepts the full set of token and cost fields — pass whichever your provider reports:
| Field | Meaning |
|---|---|
| input_tokens / output_tokens | Prompt and completion tokens. |
| cached_tokens | Prompt tokens served from cache. |
| reasoning_tokens | Reasoning / thinking tokens, when reported. |
| total_tokens | Total tokens for the call. |
| cost_usd | Cost of the call in USD, if you compute it. |
| context_window_tokens / context_used_tokens | Context window size and how much was used. |
Tip
usage_from_openai(response) and usage_from_anthropic(response) read these counts straight off the provider response (including cached and reasoning tokens) and return a dict ready to splat into set_usage.
Nesting spans
Spans nest naturally — open a span inside another with block and it records its parent automatically. This is how a planning step that makes several model and tool calls shows up as a tree rather than a flat list.
with trace.span("task", name="resolve_ticket") as task: task.record_input({"ticket_id": "T-42"}) with trace.llm("openai.responses", model="gpt-4.1-mini") as span: ... # this llm span is a child of the task span with trace.tool("issue_refund", arguments={"order": "4421"}) as call: call.record_output(run_refund("4421"))Async apps
Every context manager has an async form. Use async with for traces and spans in async code; the rest of the API is identical.
async with client.trace("support_agent") as trace: async with trace.llm("openai.responses", model="gpt-4.1-mini") as span: span.record_input({"messages": messages}) response = await openai_client.responses.create(...) span.record_output(response)Heads up
In long-running async services, also call await client.ashutdown()from your framework's shutdown hook so queued events are flushed. See Lifecycle & shutdown.