Rollout SDK

Rollout instruments your AI agents — capturing every trace, span, tool call, token count, and user reaction — and then helps you make them better. Use the Python or TypeScript SDK to observe, and the CLI to run GEPA optimization against your own datasets and verifiers.

Beta

The Python package is published as mv37-rollout and imported as mv37.rollout. The TypeScript package is @mv37/rollout. The public API is in beta, but the surface documented here is stable; expect additive changes. If you only need to ship raw events, every method maps to a documented event on the ingest API.

Why instrument agents

AI agents fail differently from ordinary software. There is rarely a stack trace — instead a tool gets called with the wrong arguments, a model quietly drifts, a retrieval step returns nothing useful, or a user silently gives up. The Rollout SDK makes those runs visible:

  • Traces & spans — group a whole agent run into a trace and record each step (LLM call, tool call, retrieval, task) as a typed span with input, output, latency, and errors.
  • Token & cost usage — pulled straight off the provider response, including cached and reasoning tokens.
  • Feedback & signals — attach explicit user feedback (a thumbs-up, a rating) and implicit behavioral outcomes (an order placed, a ticket reopened).
  • Privacy by construction — every event passes through a scrubber and an optional before_send hook before it leaves the process; PII is never required.

Install

The base install is lightweight — only httpx and pydantic, with no provider SDKs pulled in. Provider integrations are optional extras.

shell
pip install mv37-rollout
# or
uv add mv37-rollout
# optional provider extras
uv add "mv37-rollout[openai]"
uv add "mv37-rollout[anthropic]"
uv add "mv37-rollout[openai-agents]"
# TypeScript
pnpm add @mv37/rollout

Requires Python 3.12 or newer. Set ROLLOUT_API_KEY in the environment. Instrumenting a hand-written agent loop then takes two calls: wrap() the provider client and decorate the entry point with @rollout.agent:

agent.py
import mv37.rollout as rollout
from openai import OpenAI

rollout.init(api_key="...", agent_name="support_agent", environment="production")
openai_client = rollout.wrap(OpenAI())

@rollout.agent
def run_agent(user_message: str) -> str:
    response = openai_client.responses.create(model="gpt-4.1-mini", input=user_message)
    return response.output_text

What a trace looks like

Everything the SDK records hangs off a trace — one agent run, one conversation turn, one request. Inside a trace you record messages (the conversation) and spans (units of work). An LLM call is an llm span; a tool call is a paired tool.call / tool.result; you can open your own task or retrieval spans too.

trace.txt
trace  support_agent  (conversation_id=thread_123, user_id=cus_123)
├─ message   user       "Where is my order?"
├─ span      llm        openai.responses · gpt-4.1-mini · 512→128 tok · 840ms
│   ├─ tool.call    lookup_order(id="4421")
│   └─ tool.result  { status: "shipped" }
├─ message   assistant  "Your order has shipped."
└─ feedback  thumbs_up  true
