Shadow routing: the brain learns from a teacher without slowing down
OpenClawBrain now prioritizes one operating rule: keep query-time retrieval fast, then learn better routing asynchronously.
Hot path stays local:
embed(query) -> local traversal -> prompt_context -> answer
No teacher LLM call runs on this path.
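The hot path above can be sketched roughly as follows. This is a minimal illustration, not the actual OpenClawBrain implementation: `Node`, `embed`, `traverse`, and `prompt_context` are hypothetical names standing in for the real components, and the embedding is a trivial stand-in for a local model.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: str
    text: str
    edges: dict = field(default_factory=dict)  # neighbor id -> edge weight

def embed(query: str) -> list[float]:
    # Stand-in embedding; the real system would call a local embedding model.
    return [float(ord(c) % 7) for c in query[:8]]

def traverse(graph: dict, start: str, hops: int = 2) -> list[str]:
    # Greedy local traversal: follow the highest-weight edge at each hop.
    path, cur = [start], start
    for _ in range(hops):
        node = graph[cur]
        if not node.edges:
            break
        cur = max(node.edges, key=node.edges.get)
        path.append(cur)
    return path

def prompt_context(graph: dict, path: list[str]) -> str:
    # Concatenate the visited nodes' text into the answer prompt's context.
    return "\n".join(graph[n].text for n in path)
```

Note that nothing in this path waits on a remote model; the teacher only sees these route decisions later, in the background loop.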
Background learning loop:
- sample recent route decisions
- ask teacher (gpt-5-mini) what it would choose
- apply policy-gradient updates to edge weights + relevance metadata
- feed improved signals into split/merge/prune/connect
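One way the edge-weight update can work is a REINFORCE-style step where agreement with the teacher acts as the reward. The sketch below is an assumption about the mechanism, not the repo's code: `pg_update` and its signature are hypothetical, and the reward scheme (+1 on teacher agreement, -1 otherwise) is illustrative.

```python
import math

def softmax(weights: dict) -> dict:
    # Convert raw edge weights into a routing probability distribution.
    m = max(weights.values())
    exp = {k: math.exp(v - m) for k, v in weights.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}

def pg_update(edge_weights: dict, chosen: str, teacher_choice: str,
              lr: float = 0.1) -> dict:
    """REINFORCE-style update on one routing decision.

    reward = +1 if the local route matched the teacher's choice, -1 otherwise;
    each weight moves along reward * grad(log prob of the chosen edge).
    """
    probs = softmax(edge_weights)
    reward = 1.0 if chosen == teacher_choice else -1.0
    return {
        k: w + lr * reward * ((1.0 if k == chosen else 0.0) - probs[k])
        for k, w in edge_weights.items()
    }
```

Under this scheme a route the teacher disagrees with is gradually down-weighted rather than deleted, which is what lets the later split/merge/prune/connect passes act on smoother signals.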
Why this shift matters
- Better routing quality without adding hot-path latency.
- Fewer turns and fewer prompt tokens on repeated workflows.
- Cleaner context blocks because weak routes are gradually down-weighted.
- Human corrections remain high-authority; teacher labels are weak supervision.
Human vs teacher authority
Teacher labels are useful for coverage, but they are not the truth source. Explicit human feedback takes precedence and can override teacher-driven updates.
This is a design update with early operational behavior, not a claim of new benchmark wins beyond the artifacts already measured in the repo.
How to run
openclawbrain async-route-pg --state /tmp/brain/state.json --dry-run
openclawbrain async-route-pg --state /tmp/brain/state.json --apply
Dry-run first, inspect the proposed route updates, then apply.