Most agent frameworks ship a DSL or a YAML schema. Chidori ships a Python dialect with no I/O of its own, plus a fixed set of host functions. Because every side effect is a logged, deterministic call, any past run can be replayed byte-for-byte, for free.
Starlark is a deterministic Python dialect. Native control flow, list comprehensions, helper functions. No YAML, no template DSL, no graph builder.
def agent(question):
    qs = prompt(
        "3 queries for: " + question,
        format="json",
    )
    return parallel([
        lambda q=q: tool("search", q=q)
        for q in qs
    ])

Starlark has no random, no clocks, no I/O of its own. Every side effect goes through a fixed set of host functions the runtime logs, caches, and replays.
# no hidden state
config(model="claude-sonnet")
# every call logged:
# prompt(), tool(),
# parallel(), input()
#
# seq=0..N, content-addressed

Save a session's call log to disk and replay it later for byte-identical output — zero LLM spend. Debug production locally without burning a token.
$ chidori run agent.star \
    --replay session.json
[cache] prompt hit
[cache] tool hit
[cache] prompt hit
→ identical output · $0.00

input() suspends execution and writes a checkpoint. When the human responds, the agent resumes at the exact sequence number it paused on — no special state machine.
answer = input(
    "Approve resume from 0x4a1?",
    schema={"ok": "bool"},
)
if not answer["ok"]:
    return {"status": "rejected"}
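On the human's side, answering the pending input resumes the run. A minimal sketch, assuming a per-session input endpoint (this exact route is an assumption, not a documented API):

# hypothetical endpoint, shown only to illustrate resume
$ curl -X POST localhost:8080/sessions/<id>/input \
    -d '{"ok": true}'
# run resumes at the paused sequence number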
Any .star file is an agent, and any agent can call another by name. Each sub-agent runs in its own Starlark VM, inherits the session, and gets its own replayable span — so you can compose specialists without inventing a DAG.

# researcher.star
draft = agent("writer",
    topic=question,
    notes=results,
)
review = agent("critic",
    draft=draft,
    bar="rigorous",
)
return {"draft": draft, "review": review}

exec() runs agent-written code inside a WebAssembly sandbox — no filesystem, no network, strict memory and CPU limits. Let an LLM write and run Python without letting it touch your box.
# agent writes + runs code safely
result = exec(
    code=prompt("Write python to…"),
    runtime="python-wasm",
    limits={"mem": "64MB", "cpu": "2s"},
)
# WASI sandbox · no net · no fs

Every host function call — prompt, tool, agent, exec — is emitted as an OpenTelemetry span. Point it at Tael, Jaeger, Tempo, or Honeycomb by setting one env var.
OTEL_EXPORTER_OTLP_ENDPOINT=\
http://localhost:4317
# every prompt/tool/agent/exec
# arrives as a span.
# zero extra config.An agent is a .star file with a def agent(...) function. Parameters are inputs; the return value is JSON output. Prompts, tools, parallel fan-out, and human input are ordinary Starlark calling into host functions.
# agents/researcher.star
config(model = "claude-sonnet")

def agent(question, depth = "standard"):
    queries = prompt(
        "3 search queries for: " + question,
        format = "json",
    )
    results = parallel([
        lambda q=q: tool("web_search", query=q)
        for q in queries
    ])
    return {"answer": results, "queries": queries}

chidori run executes once from the CLI. chidori serve turns the same file into an HTTP server — every incoming request becomes an event dict passed to agent(event). Webhooks, chat, alerts — same primitive.
$ chidori run agents/researcher.star \
    --input question="OTel SDK choice"

$ chidori serve agents/researcher.star --port 8080
$ curl -X POST localhost:8080/sessions \
    -d '{"input": {"question": "..."}}'

Every run produces a session log — a content-addressed record of every host function call. Feed it back in and the agent re-runs deterministically, with cached results. Zero tokens, identical output.
# save a live session
$ curl localhost:8080/sessions/<id>/checkpoint \
    > session.json
# replay later — zero LLM spend
$ curl -X POST localhost:8080/sessions \
    -d @session.json
→ identical output · cost: $0.00
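What does session.json hold? The exact format isn't shown here; a plausible sketch, assuming one content-addressed entry per host call (all field names illustrative):

{
  "session": "<id>",
  "calls": [
    {"seq": 0, "fn": "prompt", "args_hash": "sha256:…", "result": "…"},
    {"seq": 1, "fn": "tool",   "args_hash": "sha256:…", "result": "…"}
  ]
}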
No dashboards. No browsers. Tael ingests OpenTelemetry over OTLP, stores it in an embedded columnar database, and returns JSON by default — so your agents can query, monitor, and annotate production telemetry themselves.

$ tael anomalies --since 1h --format json | jq
[
  {
    "service": "researcher-agent",
    "metric": "latency.p95",
    "baseline_ms": 1180,
    "current_ms": 4210,
    "severity": "high",
    "first_seen": "14:02:31Z",
    "trace_ids": ["0x4a1c2e", "0x4a2f11"]
  },
  {
    "service": "writer-agent",
    "metric": "errors.rate",
    "baseline": 0.003,
    "current": 0.041,
    "severity": "medium",
    "sample_trace": "0x5b8a03"
  }
]

$ tael correlate --trace 0x4a1c2e
→ 14 spans · 3 logs · 2 metric series
→ hand off to chidori agent? [y/N]
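Closing that loop is ordinary Starlark. A sketch, assuming a generic shell tool has been registered (the tool name and prompt are illustrative, not a documented API):

# ops.star (illustrative only)
def agent(event):
    # pull the last hour of anomalies from Tael as JSON
    raw = tool("shell",
        cmd="tael anomalies --since 1h --format json",
    )
    # triage with the model; the call is logged and replayable
    triage = prompt(
        "Rank these anomalies by impact: " + raw,
        format="json",
    )
    return {"triage": triage}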
You already have the hardest part — the goal, the context, the judgment. The bottleneck is how much you can hold at once. Agents extend your reach: delegate to AI that remembers its work, recovers from failures, and tells you exactly what it did.
Chidori and Tael are the building blocks. Open source. Deterministic. Built to be boring, so the agents running on top of them can afford to be interesting.