§ blog

From the engineering log.

Writing on deterministic agents, agent-native observability, and the patterns we're discovering as we build Thousand Birds. Releases, deep dives, and case studies — pulled from the same call logs we use to debug production.

A flock of birds at dusk.
RELEASE 2026.04.12

Chidori V3 — faster replay cache, Starlark stdlib, and a friendlier serve loop

A content-addressed replay cache, an expanded Starlark standard library, and a serve loop that finally treats backpressure and HITL resume the way we always wanted.

#chidori 4 min
A diving bird with sunlit wings.
DEEP DIVE 2026.03.28

Why determinism matters more than clever prompts

The hard part of shipping agents isn't getting them to work once. It's getting the exact same run back when something goes wrong.

#engineering 9 min
A bird in flight trailing afterimages of itself.
CASE STUDY 2026.03.14

Debugging a flaky research agent with checkpoint replay

A research agent passed a thousand evals and then started failing once a week in production. Here is how we found the bug without spending another token.

#tael 7 min