Beta
The Python package is published as mv37-rollout and imported as mv37.rollout. The TypeScript package is @mv37/rollout. The public API is in beta, but the surface documented here is stable; expect additive changes. If you only need to ship raw events, every method maps to a documented event on the ingest API.
Why instrument agents
AI agents fail differently from ordinary software. There is rarely a stack trace — instead a tool gets called with the wrong arguments, a model quietly drifts, a retrieval step returns nothing useful, or a user silently gives up. The Rollout SDK makes those runs visible:
- Traces & spans: group a whole agent run into a trace and record each step (LLM call, tool call, retrieval, task) as a typed span with input, output, latency, and errors.
- Token & cost usage: pulled straight off the provider response, including cached and reasoning tokens.
- Feedback & signals: attach explicit user feedback (a thumbs-up, a rating) and implicit behavioral outcomes (an order placed, a ticket reopened).
- Privacy by construction: every event passes through a scrubber and an optional before_send hook before it leaves the process; PII is never required.
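To make the scrubber-then-hook ordering concrete, here is a minimal, self-contained sketch in plain Python. It does not use the SDK: `scrub`, `prepare_event`, `drop_debug_events`, the event shape, and the convention that a hook returning `None` drops the event are all assumptions made for illustration. Only the `before_send` name comes from the text above.

```python
import re
from typing import Any, Callable, Optional

# Hypothetical event shape: a plain dict. The real SDK's event type may differ.
Event = dict[str, Any]

# Simple email pattern, for illustration only; not a complete PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub(value: Any) -> Any:
    """Recursively redact email-like substrings in strings, dicts, and lists."""
    if isinstance(value, str):
        return EMAIL_RE.sub("[redacted-email]", value)
    if isinstance(value, dict):
        return {key: scrub(item) for key, item in value.items()}
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def prepare_event(
    event: Event,
    before_send: Optional[Callable[[Event], Optional[Event]]] = None,
) -> Optional[Event]:
    """Scrub first, then hand the scrubbed event to the optional hook.

    Assumption: a hook that returns None means "drop this event".
    """
    cleaned = scrub(event)
    if before_send is None:
        return cleaned
    return before_send(cleaned)


def drop_debug_events(event: Event) -> Optional[Event]:
    """Example hook (hypothetical): discard events whose type is 'debug'."""
    return None if event.get("type") == "debug" else event
```

Running the scrubber before the hook means the hook only ever sees already-redacted data, which keeps custom hook code from accidentally forwarding raw values.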
Install
The base install is lightweight — only httpx and pydantic, with no provider SDKs pulled in. Provider integrations are optional extras.
    pip install mv37-rollout
    # or
    uv add mv37-rollout

    # optional provider extras
    uv add "mv37-rollout[openai]"
    uv add "mv37-rollout[anthropic]"
    uv add "mv37-rollout[openai-agents]"

    # TypeScript
    pnpm add @mv37/rollout

Requires Python 3.12 or newer. Set ROLLOUT_API_KEY in the environment. Instrumenting a whole hand-written agent loop then takes two calls, wrap() on the provider client and a decorator on the entry point:
    import mv37.rollout as rollout
    from openai import OpenAI

    rollout.init(api_key="...", agent_name="support_agent", environment="production")
    openai_client = rollout.wrap(OpenAI())

    @rollout.agent
    def run_agent(user_message: str) -> str:
        response = openai_client.responses.create(model="gpt-4.1-mini", input=user_message)
        return response.output_text

What a trace looks like
Everything the SDK records hangs off a trace — one agent run, one conversation turn, one request. Inside a trace you record messages (the conversation) and spans (units of work). An LLM call is an llm span; a tool call is a paired tool.call / tool.result; you can open your own task or retrieval spans too.
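As a rough mental model of that hierarchy, the sketch below represents a trace as an ordered list of messages and spans, with tool calls nested under the LLM span that issued them. The classes `Message`, `Span`, and `Trace`, their fields, and the helpers `iter_spans` and `example_trace` are hypothetical names chosen for illustration; they are not the SDK's actual types or wire format.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, Optional, Union


@dataclass
class Message:
    """One conversation message inside a trace (hypothetical shape)."""
    role: str
    content: str


@dataclass
class Span:
    """One unit of work: kind is e.g. 'llm', 'tool.call', 'tool.result'."""
    kind: str
    name: str
    input: Any = None
    output: Any = None
    latency_ms: Optional[float] = None
    error: Optional[str] = None
    children: list["Span"] = field(default_factory=list)


@dataclass
class Trace:
    """One agent run: ordered messages and spans, in recording order."""
    agent_name: str
    conversation_id: Optional[str] = None
    user_id: Optional[str] = None
    items: list[Union[Message, Span]] = field(default_factory=list)


def iter_spans(items: list) -> Iterator[Span]:
    """Yield every span depth-first, skipping messages."""
    for item in items:
        if isinstance(item, Span):
            yield item
            yield from iter_spans(item.children)


def example_trace() -> Trace:
    """Build the support_agent run used as the example on this page."""
    llm = Span(kind="llm", name="openai.responses", latency_ms=840)
    llm.children.append(Span("tool.call", "lookup_order", input={"id": "4421"}))
    llm.children.append(Span("tool.result", "lookup_order", output={"status": "shipped"}))
    trace = Trace("support_agent", conversation_id="thread_123", user_id="cus_123")
    trace.items = [
        Message("user", "Where is my order?"),
        llm,
        Message("assistant", "Your order has shipped."),
    ]
    return trace
```

Walking `iter_spans(example_trace().items)` visits the LLM span first and then its paired tool.call and tool.result children, which is the same order the steps were recorded in.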
    trace support_agent (conversation_id=thread_123, user_id=cus_123)
    ├─ message user "Where is my order?"
    ├─ span llm openai.responses · gpt-4.1-mini · 512→128 tok · 840ms
    │  ├─ tool.call lookup_order(id="4421")
    │  └─ tool.result { status: "shipped" }
    ├─ message assistant "Your order has shipped."
    └─ feedback thumbs_up true

Start here
@agent / @trace / @tool.