Checkpoint & Replay

A durable run is deterministic by policy: the runtime pins Date, seeds Math.random, and routes every side effect in a Chidori agent — LLM calls, tools, HTTP, memory, human input — through a logged chidori host function. Combine those properties and you get free time travel: save a session's call log to disk, replay it later, get the same output with zero LLM round trips.

What a checkpoint looks like

A checkpoint is a JSON object containing the session's input and the full call log:

{
  "session_id": "c4cac6c7-4092-4df4-a0fe-c67dd3951792",
  "input": {"document": "Rust is great."},
  "call_log": [
    {
      "seq": 1,
      "function": "prompt",
      "args": {"text": "Summarize...", "model": "claude-sonnet"},
      "result": "Here are 3 bullets...",
      "duration_ms": 2024,
      "token_usage": {"input_tokens": 26, "output_tokens": 66},
      "timestamp": "2026-04-11T21:10:24.362337Z"
    },
    ...
  ]
}
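
Because a checkpoint is plain JSON, it can be inspected with ordinary tooling. The following is a minimal sketch, not part of the Chidori SDK, that totals the calls, recorded latency, and token usage in a checkpoint shaped like the example above. The helper name summarize_call_log is hypothetical, and the fallback for records without token_usage is an assumption about non-LLM calls.

```python
import json

# Same shape as the example checkpoint above, without the elided records.
CHECKPOINT_JSON = """
{
  "session_id": "c4cac6c7-4092-4df4-a0fe-c67dd3951792",
  "input": {"document": "Rust is great."},
  "call_log": [
    {
      "seq": 1,
      "function": "prompt",
      "args": {"text": "Summarize...", "model": "claude-sonnet"},
      "result": "Here are 3 bullets...",
      "duration_ms": 2024,
      "token_usage": {"input_tokens": 26, "output_tokens": 66},
      "timestamp": "2026-04-11T21:10:24.362337Z"
    }
  ]
}
"""


def summarize_call_log(checkpoint: dict) -> dict:
    """Total up the calls, recorded latency, and tokens in a checkpoint."""
    summary = {"calls": 0, "duration_ms": 0, "input_tokens": 0, "output_tokens": 0}
    for record in checkpoint["call_log"]:
        summary["calls"] += 1
        summary["duration_ms"] += record.get("duration_ms", 0)
        # Assumption: non-LLM calls may omit token_usage, so default to zero.
        usage = record.get("token_usage") or {}
        summary["input_tokens"] += usage.get("input_tokens", 0)
        summary["output_tokens"] += usage.get("output_tokens", 0)
    return summary


summary = summarize_call_log(json.loads(CHECKPOINT_JSON))
print(summary)
```

A summary like this gives a rough idea of how much time and token spend a replay avoids.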

Creating and replaying a session

Over HTTP:

# Create a session (live LLM calls)
curl -X POST http://localhost:8080/sessions \
  -H "Content-Type: application/json" \
  -d '{"input": {"document": "Rust is a systems language."}}'
# → {"id": "c4cac6c7-...", "status": "completed", "output": {...}}

# Fetch the checkpoint
curl http://localhost:8080/sessions/c4cac6c7-.../checkpoint > session.json

# Replay later — zero LLM calls, identical output
curl -X POST http://localhost:8080/sessions \
  -H "Content-Type: application/json" \
  -d @session.json

From the Python SDK:

from chidori import AgentClient, Checkpoint

client = AgentClient("http://localhost:8080")

# Live run
session = client.run({"document": "Rust is great."})
session.checkpoint().save("session.json")

# Later: replay
cp = Checkpoint.load("session.json")
replayed = client.replay(cp)
assert replayed.output == session.output

Or from the TypeScript SDK:

import { AgentClient, Checkpoint } from "chidori";

const client = new AgentClient("http://localhost:8080");

// Live run
const session = await client.run({ document: "Rust is great." });
await session.checkpoint().save("session.json");

// Later: replay
const cp = await Checkpoint.load("session.json");
const replayed = await client.replay(cp);

How replay works

  1. The runtime loads the checkpoint's call log into a RuntimeContext created with RuntimeContext::with_replay().
  2. The agent's .ts file is loaded and agent(input, chidori) is called with the original input.
  3. Each chidori.* host call assigns itself a sequence number as it executes.
  4. Before making the real call, each host function calls try_replay(seq) on the context; if there's a matching record, it returns the cached result immediately.
  5. If replay runs past the end of the log, execution continues normally with live calls, and new results are appended.
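
The steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the runtime's implementation: ReplayContext, host_call, and fake_llm are hypothetical names, and only try_replay(seq) mirrors a name used in the steps. It shows a full replay making zero live calls, and a truncated log (as after a crash) replaying what was recorded and then continuing live.

```python
from typing import Any, Callable, Optional


class ReplayContext:
    """Toy stand-in for the runtime's replay context (hypothetical)."""

    def __init__(self, call_log: Optional[list] = None):
        self.call_log = list(call_log or [])
        # Index recorded calls by sequence number for direct lookup.
        self.records = {r["seq"]: r for r in self.call_log}
        self.next_seq = 1
        self.live_calls = 0

    def try_replay(self, seq: int) -> Optional[dict]:
        """Return the recorded call for this sequence number, if any."""
        return self.records.get(seq)

    def host_call(self, function: str, args: dict, live: Callable[[dict], Any]) -> Any:
        # Step 3: each host call takes the next sequence number.
        seq = self.next_seq
        self.next_seq += 1
        # Step 4: return the cached result when a record exists.
        record = self.try_replay(seq)
        if record is not None:
            return record["result"]
        # Step 5: past the end of the log, run live and append the result.
        result = live(args)
        self.live_calls += 1
        new_record = {"seq": seq, "function": function, "args": args, "result": result}
        self.records[seq] = new_record
        self.call_log.append(new_record)
        return result


def fake_llm(args: dict) -> str:
    """Deterministic placeholder for a real LLM call."""
    return "LIVE:" + args["text"]


def agent(document: str, ctx: ReplayContext) -> str:
    summary = ctx.host_call("prompt", {"text": "Summarize " + document}, fake_llm)
    return ctx.host_call("prompt", {"text": "Translate " + summary}, fake_llm)


# Live run: both calls execute and are recorded.
live_ctx = ReplayContext()
live_output = agent("Rust is great.", live_ctx)

# Full replay: every call is served from the log.
replay_ctx = ReplayContext(live_ctx.call_log)
replayed_output = agent("Rust is great.", replay_ctx)

# Truncated log: the first call replays, the second runs live.
partial_ctx = ReplayContext(live_ctx.call_log[:1])
resumed_output = agent("Rust is great.", partial_ctx)
```

In this sketch the lookup is keyed only on the sequence number, which is why the agent must issue its host calls in the same order on every run.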

Because durable runs pin the clock and seed randomness, the sequence numbers always line up and the agent takes the exact same control-flow path it did originally.
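
To see why a pinned clock and seeded randomness keep control flow stable, consider this Python analogy. The runtime itself pins the JavaScript Date and seeds Math.random; DeterministicEnv and choose_path below are hypothetical names used only for illustration.

```python
import random
from datetime import datetime, timezone


class DeterministicEnv:
    """Hypothetical environment with a pinned clock and a seeded RNG."""

    def __init__(self, seed: int, pinned_now: datetime):
        self._rng = random.Random(seed)  # same seed, same sequence of values
        self._now = pinned_now           # the clock never advances

    def now(self) -> datetime:
        return self._now

    def random(self) -> float:
        return self._rng.random()


def choose_path(env: DeterministicEnv) -> str:
    # Control flow that depends on both randomness and the current time.
    branch = "retry" if env.random() < 0.5 else "proceed"
    return branch + "@" + env.now().isoformat()


pinned = datetime(2026, 4, 11, 21, 10, 24, tzinfo=timezone.utc)
first = choose_path(DeterministicEnv(seed=42, pinned_now=pinned))
second = choose_path(DeterministicEnv(seed=42, pinned_now=pinned))
print(first == second)
```

With an unpinned clock or an unseeded generator, the two runs could take different branches, and the host calls would no longer line up with the recorded sequence numbers.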

What replay guarantees

  • Exact — the same output every time.
  • Fast — no LLM round trips; replays complete in milliseconds.
  • Cheap — zero token cost.

What it's good for

  • Debugging — reproduce a failed production run locally without spending a cent on LLM calls.
  • Testing — check a checkpoint into git and assert agent() output hasn't regressed.
  • Crash recovery — if an agent crashes mid-execution, replay the calls recorded so far; execution then continues live from the point where the log ends.
  • Human-in-the-loop — chidori.input() saves a checkpoint and suspends the session; when the human responds, the agent resumes at the exact point it paused.
  • Incremental development — iterate on downstream logic without re-running expensive upstream LLM calls every time.

Branching

Open a checkpoint in the Chidori Debugger, edit the recorded result of any call, and the rest of the run takes a new path — a lightweight "what-if" that costs one fresh LLM call instead of the entire pipeline.
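
The same idea can be approximated by editing the checkpoint JSON directly. The sketch below is an assumption about how a branch could be built by hand, not a description of what the Chidori Debugger does internally: branch_checkpoint is a hypothetical helper, and dropping the records after the edited call (so that they run live on replay) is a design choice made for this illustration.

```python
import copy


def branch_checkpoint(checkpoint: dict, seq: int, new_result) -> dict:
    """Return a copy of the checkpoint with one recorded result replaced."""
    branched = copy.deepcopy(checkpoint)  # leave the original untouched
    kept = []
    found = False
    for record in branched["call_log"]:
        if record["seq"] < seq:
            kept.append(record)
        elif record["seq"] == seq:
            record["result"] = new_result
            kept.append(record)
            found = True
        # Assumption: later records are dropped so those calls execute live.
    if not found:
        raise KeyError(f"no call with seq={seq} in call_log")
    branched["call_log"] = kept
    return branched


original = {
    "input": {"document": "Rust is great."},
    "call_log": [
        {"seq": 1, "function": "prompt", "result": "draft A"},
        {"seq": 2, "function": "prompt", "result": "critique of A"},
        {"seq": 3, "function": "prompt", "result": "final A"},
    ],
}

branched = branch_checkpoint(original, seq=2, new_result="a harsher critique")
```

Replaying the branched checkpoint would serve calls 1 and 2 from the log and run call 3 fresh against the edited result.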

Constraints to be aware of

  • The agent code must be the same as when the checkpoint was created. If you change which host function is called at sequence N, replay will diverge.
  • Durable runs must be deterministic — which they are by policy: the runtime fixes Date, seeds Math.random, and routes every side effect through a chidori host function that is logged, cached, and replayed.
  • External tool definitions matter if tools are re-executed during replay. By default tool calls are cached along with LLM calls, so this usually doesn't come up.
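
A quick way to check the first constraint is to compare the host functions a checkpoint recorded with the ones the current agent code calls. The sketch below is illustrative only: first_divergence is a hypothetical helper, and how the runtime itself reacts to a mismatch is not covered here.

```python
from typing import Optional


def first_divergence(recorded_log: list, observed_functions: list) -> Optional[int]:
    """Return the first seq whose recorded function differs from the observed one."""
    for record, function in zip(recorded_log, observed_functions):
        if record["function"] != function:
            return record["seq"]
    return None  # no mismatch within the overlapping prefix


recorded = [
    {"seq": 1, "function": "prompt"},
    {"seq": 2, "function": "prompt"},
    {"seq": 3, "function": "input"},
]

# The edited agent now asks for human input before its second prompt.
diverged_at = first_divergence(recorded, ["prompt", "input", "prompt"])
unchanged = first_divergence(recorded, ["prompt", "prompt", "input"])
```

Here diverged_at is 2, the first sequence number where the recorded function no longer matches, and unchanged is None.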
