Chatbot

A chatbot is an event-driven Chidori agent: it runs under chidori serve, every incoming message becomes an event, and conversation history is persisted with memory(). Each turn is its own session — and therefore its own replayable checkpoint.

The whole thing, front to back

agents/chatbot.star

config(model = "claude-sonnet", max_turns = 5)

def dict_get(d, key, default = None):
    if key in d:
        return d[key]
    return default

def load_history(user_id):
    stored = memory("get", key = "chat:" + user_id)
    if stored == None:
        return []
    return stored

def save_history(user_id, history):
    # Keep the last 20 turns to bound the prompt
    memory("store", key = "chat:" + user_id, value = history[-20:])

def agent(event):
    if event["path"] != "/chat" or event["method"] != "POST":
        return {"status": 404, "body": {"error": "POST /chat"}}

    body    = event["body"]
    user_id = dict_get(body, "user_id", "anonymous")
    message = dict_get(body, "message", "")

    history = load_history(user_id)
    history.append({"role": "user", "content": message})

    reply = prompt(
        template("prompts/chat.jinja", history = history),
        tools     = ["web_search", "calculator"],
        max_turns = 5,
    )

    history.append({"role": "assistant", "content": reply})
    save_history(user_id, history)

    return {"status": 200, "body": {"reply": reply, "user_id": user_id}}

prompts/chat.jinja

You are a helpful assistant. Keep answers concise unless the user asks for detail.

{% for turn in history %}
{{ turn.role | upper }}: {{ turn.content }}
{% endfor %}

ASSISTANT:

Run it

chidori serve agents/chatbot.star --port 8080

Send a message:

curl -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -d '{"user_id": "alice", "message": "What\u2019s 15% of 80?"}'

Response:

{ "reply": "15% of 80 is 12.", "user_id": "alice" }

The next request from alice will carry this full exchange in its history, thanks to memory().
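
A quick way to confirm persistence is a follow-up that only reads correctly with the earlier exchange in context (the reply below is illustrative, not captured output):

curl -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -d '{"user_id": "alice", "message": "And 20% of the same number?"}'

{ "reply": "20% of 80 is 16.", "user_id": "alice" }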

Adding tools

Expose tools by listing them in the prompt(..., tools=[...]) argument. Each turn's tool-use loop is capped by max_turns:

tools/calculator.star

def calculator(expression):
    """Evaluate an arithmetic expression. Supports + - * / and parentheses."""
    # Delegate to a sandboxed exec() so LLM-generated expressions can't escape.
    return exec("result = " + expression + "\nresult", lang = "python", timeout = 2)

The docstring becomes the LLM-facing tool description; the function's parameters and defaults are converted into a JSON schema for function-calling.
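
As a rough illustration (the exact wire format is provider-specific, so treat this as a sketch), the calculator tool above would serialize to something like:

{
  "name": "calculator",
  "description": "Evaluate an arithmetic expression. Supports + - * / and parentheses.",
  "parameters": {
    "type": "object",
    "properties": {
      "expression": { "type": "string" }
    },
    "required": ["expression"]
  }
}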

Per-turn checkpoints

Every request to /chat is a session with its own checkpoint. That means:

  • Debug a weird answer by replaying the exact session from the checkpoint — same history, same retrieved tool results, zero LLM spend.
  • A/B new system prompts by replaying old sessions against a new template and diffing the outputs.
  • Regression-test tool wiring by checking canonical session checkpoints into git.

# Grab the last session's checkpoint
curl http://localhost:8080/sessions | jq '.[0].id'
curl http://localhost:8080/sessions/<id>/checkpoint > session.json
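
The checkpoint's JSON layout is runtime-defined, so inspect one before scripting against it. Assuming the transcript sits under a top-level messages key (a hypothetical name here), diffing an A/B pair is plain shell:

# "messages" is an assumed key name; check your own checkpoint JSON first
diff <(jq '.messages' session-a.json) <(jq '.messages' session-b.json)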

Streaming

For token-by-token streaming, have the agent return a bare string: the runtime then streams tokens to the HTTP client as they arrive from the LLM. Dict (JSON) responses are buffered until the agent returns.
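
A minimal sketch of a streaming variant, relying only on the string-return behavior described above (history persistence is omitted to keep it short):

def agent(event):
    body    = event["body"]
    history = load_history(dict_get(body, "user_id", "anonymous"))
    history.append({"role": "user", "content": dict_get(body, "message", "")})
    # Returning the prompt result as a bare string, with no
    # {"status": ..., "body": ...} envelope, opts this response into streaming.
    return prompt(template("prompts/chat.jinja", history = history))

On the client side, curl -N disables curl's output buffering so tokens print as they arrive.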

Tips

  • Bound your history — unbounded chat history means unbounded prompt size and unbounded token cost.
  • Store only raw turns in memory() — regenerate any system prompt or summary in the agent. This makes history re-renderable under a new prompt template.
  • Use retry() around the prompt() call for transient provider errors in production (see the sketch after this list).
  • Keep dict_get near the top: indexing a missing key is an error in Starlark, and optional fields show up everywhere event payloads do.
