# Checkpoint & Replay
Starlark is deterministic. Every side effect in a Chidori agent — LLM calls, tools, HTTP, memory, human input — is a logged host function call. Combine those two properties and you get free time travel: save a session's call log to disk, replay it later, get byte-identical output with zero LLM round trips.
## What a checkpoint looks like
A checkpoint is a JSON object containing the session's input and the full call log:
```json
{
  "session_id": "c4cac6c7-4092-4df4-a0fe-c67dd3951792",
  "input": {"document": "Rust is great."},
  "call_log": [
    {
      "seq": 1,
      "function": "prompt",
      "args": {"text": "Summarize...", "model": "claude-sonnet"},
      "result": "Here are 3 bullets...",
      "duration_ms": 2024,
      "token_usage": {"input_tokens": 26, "output_tokens": 66},
      "timestamp": "2026-04-11T21:10:24.362337Z"
    },
    ...
  ]
}
```

## Creating and replaying a session
Over HTTP:
```bash
# Create a session (live LLM calls)
curl -X POST http://localhost:8080/sessions \
  -H "Content-Type: application/json" \
  -d '{"input": {"document": "Rust is a systems language."}}'
# → {"id": "c4cac6c7-...", "status": "completed", "output": {...}}

# Fetch the checkpoint
curl http://localhost:8080/sessions/c4cac6c7-.../checkpoint > session.json

# Replay later — zero LLM calls, identical output
curl -X POST http://localhost:8080/sessions \
  -H "Content-Type: application/json" \
  -d @session.json
```

From the Python SDK:
```python
from chidori import AgentClient, Checkpoint

client = AgentClient("http://localhost:8080")

# Live run
session = client.run({"document": "Rust is great."})
session.checkpoint().save("session.json")

# Later: replay
cp = Checkpoint.load("session.json")
replayed = client.replay(cp)
assert replayed.output == session.output
```

## How replay works
- The runtime loads the checkpoint's call log into a `RuntimeContext::with_replay()`.
- The agent's `.star` file is parsed and `agent()` is called with the original input.
- Each host function call assigns itself a sequence number as it executes.
- Before making the real call, each host function asks the context `try_replay(seq)` — if there's a matching record, it returns the cached result immediately.
- If replay runs past the end of the log, execution continues normally with live calls, and new results are appended.
Because Starlark is deterministic, the sequence numbers always line up and the agent takes the exact same control-flow path it did originally.
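The loop above can be sketched in Python. The names `ReplayContext`, `host_call`, and `ReplayDivergence` are illustrative stand-ins for the Rust runtime's `RuntimeContext::with_replay` / `try_replay`, not the actual API:

```python
class ReplayDivergence(Exception):
    """The agent made a different call at sequence N than the log recorded."""

class ReplayContext:
    """Minimal sketch of the record/replay loop described above."""

    def __init__(self, call_log=()):
        self.call_log = list(call_log)  # records loaded from a checkpoint
        self.seq = 0                    # sequence number of the next call

    def host_call(self, function, args, live):
        """Return the cached result for this call, or fall through to `live`."""
        self.seq += 1
        if self.seq <= len(self.call_log):            # still inside the log
            rec = self.call_log[self.seq - 1]
            if rec["function"] != function or rec["args"] != args:
                raise ReplayDivergence(
                    f"seq {self.seq}: log recorded {rec['function']!r}, "
                    f"agent called {function!r}")
            return rec["result"]                      # cached, no live call
        result = live(**args)                         # past the log: go live
        self.call_log.append({"seq": self.seq, "function": function,
                              "args": args, "result": result})
        return result

# A live run records; a replay with the same log never touches the "LLM".
llm_calls = []
def fake_llm(text):
    llm_calls.append(text)
    return f"3 bullets about {text!r}"

live_run = ReplayContext()
out1 = live_run.host_call("prompt", {"text": "Summarize."}, fake_llm)

replay_run = ReplayContext(live_run.call_log)
out2 = replay_run.host_call("prompt", {"text": "Summarize."}, fake_llm)
assert out1 == out2 and len(llm_calls) == 1  # replay made zero live calls
```

Because sequence numbers are assigned in execution order, a deterministic agent asks for the same `(seq, function, args)` triples on every run, which is what makes the lookup safe.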
## What replay guarantees
- Exact — same output, byte for byte.
- Fast — no LLM round trips; replays complete in milliseconds.
- Cheap — zero token cost.
## What it's good for
- Debugging — reproduce a failed production run locally without spending a cent on LLM calls.
- Testing — check a checkpoint into git and assert `agent()` output hasn't regressed.
- Crash recovery — if an agent crashes mid-execution, replay from the last recorded call to resume.
- Human-in-the-loop — `input()` saves a checkpoint and suspends the session; when the human responds, the agent resumes at the exact point it paused.
- Incremental development — iterate on downstream logic without re-running expensive upstream LLM calls every time.
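For the testing use case, a committed checkpoint can be sanity-checked before it is used as a replay fixture. A sketch with the checkpoint inlined rather than loaded from a file; the invariants asserted (contiguous `seq` numbers, complete records) are the ones replay relies on:

```python
# A committed fixture (in practice, json.load() from tests/checkpoints/).
CHECKPOINT = {
    "session_id": "c4cac6c7-4092-4df4-a0fe-c67dd3951792",
    "input": {"document": "Rust is great."},
    "call_log": [
        {"seq": 1, "function": "prompt",
         "args": {"text": "Summarize...", "model": "claude-sonnet"},
         "result": "Here are 3 bullets..."},
    ],
}

def validate_checkpoint(cp):
    """Fail fast if a committed fixture can't be replayed cleanly."""
    assert cp["input"] is not None
    seqs = [rec["seq"] for rec in cp["call_log"]]
    assert seqs == list(range(1, len(seqs) + 1)), "seq numbers must be contiguous"
    for rec in cp["call_log"]:
        assert {"function", "args", "result"} <= rec.keys(), "incomplete record"

validate_checkpoint(CHECKPOINT)
```

Run this in CI before the replay-based regression test itself, so a hand-edited fixture fails with a clear message rather than a mid-replay divergence.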
## Branching
Open a checkpoint in the Chidori Debugger, edit the recorded result of any call, and the rest of the run takes a new path — a lightweight "what-if" that costs one fresh LLM call instead of the entire pipeline.
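The same branching can be done programmatically by editing the checkpoint JSON before re-submitting it. A sketch: `branch` is a hypothetical helper, and dropping the entries after the edited call is an assumption (their inputs would change, so replay re-runs them live):

```python
import copy

def branch(checkpoint, seq, new_result):
    """Return a copy of `checkpoint` with the result at `seq` edited
    and all later entries dropped, so replay goes live after `seq`."""
    cp = copy.deepcopy(checkpoint)
    cp["call_log"] = [rec for rec in cp["call_log"] if rec["seq"] <= seq]
    assert cp["call_log"] and cp["call_log"][-1]["seq"] == seq
    cp["call_log"][-1]["result"] = new_result
    return cp

original = {
    "input": {"document": "Rust is great."},
    "call_log": [
        {"seq": 1, "function": "prompt", "args": {"text": "Summarize..."},
         "result": "Here are 3 bullets..."},
        {"seq": 2, "function": "prompt", "args": {"text": "Critique..."},
         "result": "Looks fine."},
    ],
}

# Edit call 1; call 2 is dropped and will re-run live on replay.
branched = branch(original, seq=1, new_result="Here are 5 bullets...")
assert len(branched["call_log"]) == 1
assert original["call_log"][0]["result"] == "Here are 3 bullets..."  # untouched
```

Submitting the branched checkpoint to `POST /sessions` then replays up to the edited call for free and pays for live calls only from that point on.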
## Constraints to be aware of
- The agent code must be the same as when the checkpoint was created. If you change which host function is called at sequence N, replay will diverge.
- Pure Starlark output must be deterministic — which, given Starlark's restrictions (no `random`, no `time`, no `while`, no I/O outside host functions), it is by construction.
- External tool definitions matter if tools are re-executed during replay. By default tool calls are cached along with LLM calls, so this usually doesn't come up.