Autonomous Agent
An autonomous agent plans its own work, executes it, and critiques its results before responding. This guide builds a deep-research agent end-to-end: it generates search queries, fans them out in parallel, optionally fetches primary sources, renders a Jinja prompt over the aggregate, and then self-checks the final answer.
Because every step is ordinary Starlark and every side effect is a logged host function call, the entire run is a replayable checkpoint.
The full agent
agents/deep_researcher.star
config(model = "claude-sonnet", max_turns = 15)
def agent(question, depth = "standard"):
    # 1. Plan — ask the LLM what to search for
    queries = prompt(
        "Generate 3 search queries for: " + question +
        "\nReturn as a JSON array of strings.",
        format = "json",
    )
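    # format = "json" makes prompt() return the parsed value (here, a
    # list of query strings) rather than raw model text.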
    # 2. Execute — fan out searches in parallel
    all_results = parallel([
        lambda q = q: tool("web_search", query = q)
        for q in queries
    ])
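    # The `q = q` default pins each query at lambda-creation time; closures
    # capture variables by reference (as in Python), so without it every
    # branch would search for the last query only.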
    # 3. Optionally fetch primary sources for deep research
    sources = []
    if depth == "deep":
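        # batch[:2] keeps only the top two results per query, bounding fetch cost.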
        urls = [r["url"] for batch in all_results for r in batch[:2]]
        sources = parallel([
            lambda u = u: tool("fetch_url", url = u)
            for u in urls
        ])
    # 4. Synthesize — render a Jinja prompt over the gathered context
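    # model and temperature below override the agent-wide config() defaults
    # for this one call: synthesis gets the stronger model, kept at a low
    # temperature for factual writing.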
    answer = prompt(template("prompts/research.jinja",
        question = question,
        results = all_results,
        sources = sources,
    ), model = "claude-opus", temperature = 0.3)
    # 5. Self-critique — fact-check the answer before returning it
    final = prompt(
        "Fact-check this answer. Return the corrected version.\n\n" + answer,
    )
    return {
        "answer": final,
        "sources": all_results,
        "queries_used": queries,
    }
The supporting pieces
Tools
tools/search.star
def web_search(query, max_results = 5):
    """Search the web and return results."""
    resp = http("GET", "https://api.search.example/search",
        params = {"q": query, "limit": max_results},
    )
    return resp["results"]
def fetch_url(url):
    """Fetch a URL and return its text content."""
    resp = http("GET", url)
    return {"url": url, "text": resp["body"]}
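The Jinja template below truncates each source to 2,000 characters, but you can also cap payloads at the tool boundary so deep runs never carry huge bodies around. A minimal sketch, where max_chars is an illustrative parameter rather than part of any fixed tool contract:
def fetch_url(url, max_chars = 20000):
    """Fetch a URL and return its text content, capped to keep prompts small."""
    resp = http("GET", url)
    return {"url": url, "text": resp["body"][:max_chars]}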
Prompt template
prompts/research.jinja
You are a senior research analyst. Answer the question using the evidence below.
Cite specific sources with [batch.result] markers. If the evidence is insufficient, say so.
## Question
{{ question }}
## Search results
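{# batch_loop pins the outer loop: a bare loop.index restarts at 1 inside
   each batch, so markers would collide across batches without it. #}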
{% for batch in results %}
{% set batch_loop = loop %}
{% for r in batch %}
- [{{ batch_loop.index }}.{{ loop.index }}] {{ r.title }} ({{ r.url }}): {{ r.snippet }}
{% endfor %}
{% endfor %}
{% if sources %}
## Primary sources
{% for s in sources %}
### {{ s.url }}
{{ s.text[:2000] }}
{% endfor %}
{% endif %}
## Answer
Run it
chidori run agents/deep_researcher.star \
  --input question="How does Rust's borrow checker work?" \
  --input depth=deep \
  --trace
What's autonomous about this
- The LLM plans — queries aren't hardcoded; the agent asks the model what to search for.
- The agent routes its own work — depth="deep" branches into primary-source fetching; depth="standard" skips it.
- It critiques itself — the final step is a fact-check pass over its own draft.
- It explains its reasoning — queries_used is returned alongside the answer so a reviewer can audit the plan.
Iterating safely
The expensive parts of this agent are the LLM calls and the web searches. Thanks to checkpointing, you can iterate on the prompt template or the critique step without paying for those again:
# First run — real searches, real LLM calls
chidori run agents/deep_researcher.star --input question="..." --trace 2> run1.trace
# Pull the checkpoint out of the trace and replay while iterating
curl -X POST http://localhost:8080/sessions -d @run1.checkpoint.json
Where to go next
- Add a human-in-the-loop approval step with input() before returning the answer (see the sketch after this list).
- Expose the agent as a webhook with chidori serve so it can be triggered from Slack.
- Store question-answer pairs in memory() so future runs can reference prior research.
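As a sketch of the first item: assuming input() pauses the run and returns the human's reply as a string (this guide doesn't pin down its exact signature), an approval gate could replace the agent's final return:
    # Hypothetical approval gate. Assumes input() blocks the checkpointed run
    # until a human replies, then returns the reply as a string.
    verdict = input("Approve this answer? (yes/no)\n\n" + final)
    if not verdict.strip().lower().startswith("y"):
        fail("Answer rejected by reviewer: " + verdict)
    return {"answer": final, "sources": all_results, "queries_used": queries}
fail() is Starlark's built-in way to abort with an error, so a rejected draft stops the run loudly instead of slipping out.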