Agents as Starlark

A Chidori agent is a .star file with a def agent(...) function. The function's parameters are the agent's inputs. Its return value, serialized to JSON, is the agent's output. Everything inside the function is ordinary Starlark — the deterministic Python dialect from Bazel.

Anatomy of an agent

agents/summarizer.star

config(
    model = "claude-sonnet",     # default model for prompt() calls
    temperature = 0.7,           # default sampling temperature
    max_tokens = 4096,           # default max response tokens
    max_turns = 10,              # max tool-use loop iterations
    timeout = 300,               # agent timeout in seconds
)

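def agent(document, depth = "standard"):
    # depth is accepted as an input but unused in this minimal example.
    # Each prompt() call goes through the runtime as a host function.
    summary = prompt("Summarize in 3 bullets:\n" + document)
    actions = prompt("Extract action items:\n" + summary)
    # The returned dict is serialized to JSON as the agent's output.
    return {"summary": summary, "action_items": actions}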

The file has three parts:

  1. An optional config(...) call at module scope sets defaults.
  2. def agent(...) is the entry point; its parameters are the agent's inputs.
  3. The function's return value, serialized to JSON, becomes the agent's output.
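
To make the input and output mapping concrete, here is a minimal Python sketch of what a runner might do with the agent above. It is not Chidori's implementation: run_agent is a hypothetical helper, and prompt is a stub standing in for the real host function. The sketch only assumes what this page states, namely that inputs become keyword arguments and the return value is serialized to JSON.

```python
import json

def prompt(text):
    # Stub standing in for Chidori's prompt() host function (assumption):
    # it echoes the first line of the prompt instead of calling a model.
    return "stub reply to: " + text.split("\n")[0]

def agent(document, depth = "standard"):
    summary = prompt("Summarize in 3 bullets:\n" + document)
    actions = prompt("Extract action items:\n" + summary)
    return {"summary": summary, "action_items": actions}

def run_agent(inputs_json):
    # Hypothetical runner: JSON inputs map onto agent() parameters,
    # and the returned dict is serialized back to JSON.
    inputs = json.loads(inputs_json)
    return json.dumps(agent(**inputs), sort_keys=True)

output = run_agent('{"document": "Q3 planning notes"}')
print(output)
```

Because depth has a default, the caller may omit it; supplying an input name that agent() does not declare would fail in this sketch with a TypeError.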

Starlark: Python minus the chaos

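If you know Python, you can read any Chidori agent. Control flow, list comprehensions, string methods, dicts, and helper functions behave the way they do in Python. The fragments below are excerpts rather than a complete agent: names such as urls, results, and fetch_sources are assumed to be defined elsewhere.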

# Conditionals
if depth == "deep":
    sources = fetch_sources(urls)
else:
    sources = []

# For loops
facts = []
for url in urls:
    page = tool("fetch_url", url = url)
    facts.append(prompt("Extract facts:\n" + page))

# List comprehensions
top_urls = [r["url"] for batch in results for r in batch[:3]]

# Helper functions
def summarize_for(audience, doc):
    return prompt("Summarize for " + audience + ":\n" + doc)

Differences from Python

What Starlark leaves out are the features that make a run hard to bound, inspect, or replay: unbounded loops, hidden state, and sources of non-determinism.

| Removed | Why |
| --- | --- |
| while loops | Guarantees termination; use for ... in range(n) if you need a bounded loop. |
| import of external modules | All capabilities come from host functions, so there is no hidden I/O. |
| Classes | Use dicts and functions. Keeps the language small and data introspectable. |
| Exceptions | Use try_call(); errors are values, not control flow. |
| Global mutation from inside functions | Prevents hidden state; every result is derivable from inputs and the call log. |
| random, time, direct I/O | Non-determinism is the enemy of replay. Use env() and host functions instead. |
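
As an illustration of the first row, a loop that would normally be written with while can be rewritten with an explicit iteration budget. The sketch below is written in Python and uses only constructs that Starlark also supports; collatz_steps and max_steps are illustrative names, not part of Chidori.

```python
def collatz_steps(n, max_steps = 1000):
    # Count the steps needed to reach 1, giving up after max_steps iterations.
    # A `while n != 1` loop would have no termination guarantee; a for loop
    # over range(max_steps) always stops.
    steps = 0
    for _ in range(max_steps):
        if n == 1:
            return steps
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return None  # budget exhausted before reaching 1

print(collatz_steps(6))                  # 8
print(collatz_steps(27, max_steps = 10)) # None
```

Returning None when the budget runs out keeps the failure visible as a value, which matches the table's "errors are values" convention.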

The side-effect boundary

Everything Starlark cannot do itself — call an LLM, hit an HTTP endpoint, read memory, ask a human — goes through a host function:

answer = prompt("What is 2+2?")                # LLM call
page   = tool("fetch_url", url = some_url)     # registered tool
resp   = http("GET", "https://api.example.com") # raw HTTP
pref   = memory("get", key = "user_pref")      # persistent KV + vector store
ok     = input("Proceed?", context = plan)     # human-in-the-loop

Each call is a checkpoint boundary: the runtime logs the name, arguments, and result before returning. See the Host Functions page for the full API.
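
The logging step can be pictured with a small Python sketch. This is an assumption about the general shape of such a wrapper, not Chidori's code: logged, the record fields, and the stub tool implementation are all hypothetical.

```python
import json

def logged(name, impl, log):
    # Wrap a host function so that each call appends one record
    # (name, arguments, result) to the log before returning to the agent.
    def call(*args, **kwargs):
        result = impl(*args, **kwargs)
        log.append({"name": name, "args": list(args), "kwargs": kwargs, "result": result})
        return result
    return call

def stub_tool(tool_name, **kwargs):
    # Stand-in for a registered tool; no network access happens here.
    return "<html>stub page for %s</html>" % kwargs["url"]

log = []
tool = logged("tool", stub_tool, log)

page = tool("fetch_url", url = "https://example.com")
print(json.dumps(log[0], sort_keys=True))
```

The agent code calls tool(...) as usual; the record is produced by the wrapper, which is why no instrumentation is needed inside the agent.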

Why this design matters

Because the only way an agent touches the outside world is through host functions, and because Starlark itself is deterministic, Chidori can:

  1. Log every side effect with zero instrumentation from you.
  2. Replay any past run byte-for-byte by returning cached results for each logged call.
  3. Suspend and resume mid-execution: input() saves a checkpoint, and once the human responds the agent picks up exactly where it stopped.
  4. Branch alternative histories: edit a recorded result in the debugger, and the rest of the run takes a different path.
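
Replay and branching in the list above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Chidori's runtime: make_host, the log record layout, and the stub prompt are hypothetical, and the agent receives its host functions as a dict purely to keep the example self-contained. The host replays logged results in order and falls through to live calls once the log runs out.

```python
import copy

def make_host(impls, log):
    # Host functions that replay `log` in order, then make live calls
    # (appending each new result to the log) once the log is exhausted.
    cursor = [0]
    def bind(name):
        def call(*args, **kwargs):
            i = cursor[0]
            cursor[0] += 1
            if i < len(log):
                entry = log[i]
                if (entry["name"], entry["args"], entry["kwargs"]) != (name, args, kwargs):
                    raise ValueError("replay diverged at call %d" % i)
                return entry["result"]
            result = impls[name](*args, **kwargs)
            log.append({"name": name, "args": args, "kwargs": kwargs, "result": result})
            return result
        return call
    return {name: bind(name) for name in impls}

live_calls = []

def stub_prompt(text):
    # Stand-in for an LLM call; records that a live call happened.
    live_calls.append(text)
    return "reply#%d" % len(live_calls)

def agent(document, host):
    summary = host["prompt"]("Summarize in 3 bullets:\n" + document)
    actions = host["prompt"]("Extract action items:\n" + summary)
    return {"summary": summary, "action_items": actions}

impls = {"prompt": stub_prompt}

# 1. Record: an empty log means every call is live.
log = []
recorded = agent("Q3 planning notes", make_host(impls, log))

# 2. Replay: the full log answers every call, so nothing live runs.
replayed = agent("Q3 planning notes", make_host(impls, log))

# 3. Branch: keep the first record, edit its result, rerun the rest live.
branch_log = copy.deepcopy(log[:1])
branch_log[0]["result"] = "edited summary"
branched = agent("Q3 planning notes", make_host(impls, branch_log))

print(replayed == recorded, len(live_calls), branched)
```

The replay check compares name and arguments against the log, which only works because the agent body is deterministic: given the same logged results, it issues the same calls in the same order. Suspend and resume follows the same pattern, with the log truncated at the pending input() call.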

None of this requires a custom SDK or special framework magic. It's just "Python where the only I/O is a documented function call."
