Rollout OS
v0.1.0
https://rollout.mv37.org
rollout · Beta · v0.1

Run rollouts. Verify outcomes.
Train better agents.

Rollout is a platform for evaluating and improving AI agents. Define task sets, sandbox the environment, capture every trace, and turn the results into training data — all in one place.

Python · TypeScript · CLI · Trace-first · Built in the open · GitHub

Datasets & traces

Build the dataset once. Use it forever.

Define tasks, attach verifiers, version them like code. The same dataset feeds today's eval and tomorrow's RL run.

Datasets

Task sets with deterministic, versioned exports.
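
As a rough illustration of what a deterministic, versioned export could look like, here is a minimal sketch in plain Python. It does not use the Rollout SDK: the task fields (`id`, `prompt`, `expected`), the `export_dataset` helper, and the idea of deriving the version from a content hash are all assumptions made for this example.

```python
import hashlib
import json


def export_dataset(tasks):
    """Serialize tasks canonically and derive a version from the content.

    Hypothetical helper, not part of any Rollout API.
    """
    # Sort tasks by id and sort keys within each task so the same task set
    # always produces the same bytes, regardless of insertion order.
    ordered = sorted(tasks, key=lambda t: t["id"])
    payload = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    # Use a short content hash as the version identifier (an assumption).
    version = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return payload, version


tasks_a = [
    {"id": "t2", "prompt": "Sum 2 and 3", "expected": "5"},
    {"id": "t1", "prompt": "Reverse 'abc'", "expected": "cba"},
]
tasks_b = list(reversed(tasks_a))  # same tasks, different order

payload_a, version_a = export_dataset(tasks_a)
payload_b, version_b = export_dataset(tasks_b)
```

Under these assumptions, reordering the tasks leaves both the payload and the version unchanged, while editing any task yields a new version.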

Verifiers

Deterministic checks, LLM judges, or your own scoring fn.
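
To make the verifier kinds concrete, the sketch below models a verifier as a plain function from a task and an agent output to a score between 0 and 1. This shape, along with the names `exact_match`, `token_overlap`, and `run_verifiers`, is hypothetical and is not taken from the Rollout SDK. An LLM judge would fit the same signature but is left out so the example stays self-contained.

```python
from typing import Callable, Dict

# Assumed shape: a verifier maps (task, agent_output) to a score in [0, 1].
Verifier = Callable[[dict, str], float]


def exact_match(task: dict, output: str) -> float:
    """Deterministic check: 1.0 if the output equals the expected answer."""
    return 1.0 if output.strip() == task["expected"] else 0.0


def token_overlap(task: dict, output: str) -> float:
    """Custom scoring function: fraction of expected tokens found in the output."""
    expected = set(task["expected"].lower().split())
    got = set(output.lower().split())
    return len(expected & got) / len(expected) if expected else 0.0


def run_verifiers(task: dict, output: str, verifiers: Dict[str, Verifier]) -> Dict[str, float]:
    """Apply every named verifier to one agent output."""
    return {name: fn(task, output) for name, fn in verifiers.items()}


task = {"id": "t1", "prompt": "Name two primary colors", "expected": "red blue"}
scores = run_verifiers(
    task,
    "blue and red",
    {"exact": exact_match, "overlap": token_overlap},
)
```

Here the exact-match check fails because the wording differs, while the overlap score credits the output for containing both expected tokens.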

Traces

Every step of every rollout — replayable and queryable.
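
One way to picture a replayable, queryable trace is as an ordered list of step records. The `Step` and `Trace` classes below, and the step kinds `message`, `tool_call`, and `observation`, are assumptions for illustration only and do not describe Rollout's actual trace format.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Step:
    index: int      # position of the step within the rollout
    kind: str       # assumed kinds: "message", "tool_call", "observation"
    payload: dict   # step-specific data


@dataclass
class Trace:
    rollout_id: str
    steps: List[Step] = field(default_factory=list)

    def record(self, kind: str, payload: dict) -> None:
        """Append a step, numbering it by its position in the trace."""
        self.steps.append(Step(len(self.steps), kind, payload))

    def query(self, kind: str) -> List[Step]:
        """Return every recorded step of the given kind."""
        return [s for s in self.steps if s.kind == kind]

    def replay(self) -> Iterator[Step]:
        """Yield steps in the order they were recorded."""
        yield from self.steps


trace = Trace("run-001")
trace.record("message", {"text": "list files"})
trace.record("tool_call", {"name": "ls", "args": ["."]})
trace.record("observation", {"stdout": "README.md"})

tool_calls = trace.query("tool_call")
replayed = [s.kind for s in trace.replay()]
```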

Files

Attach artifacts to tasks, runs, and exports.

Live preview

Datasets, traces, environments, and agents — under one workspace.

Roadmap.txt

What is shipping next.

Soon

  • Hosted RL training loops — point at a dataset, get a fine-tuned policy back
  • Semantic trace search and run-to-run diff
  • Verifier authoring from natural-language task descriptions

Later

  • Public gallery of community environments and datasets
  • Multi-agent rollouts with shared world state
  • On-prem and air-gapped deployments

$ rollout init

Ready to roll out your first agent?

Book a 20-minute demo, or jump straight in and try it yourself.
