Release · June 27, 2026

Junior adds memory and cuts one analytics task from 3m to 1m

Junior’s first memory system cut one analytics task from about 3 minutes to 1 minute in early tests, with tokens down two-thirds and tool calls down 60%. The feature moves persistent task learning into the agent loop, though the results are still internal.

TL;DR

zeeg's launch post says Junior has shipped its first memory system.
In early internal tests, according to zeeg's benchmark post, memory cut one slightly complex analytics task from about 3 minutes to 1 minute, with tokens down by two-thirds and tool calls down 60%.
The first version is simple. In his reply on implementation, zeeg said Junior currently uses embeddings for memory.
The rollout is still tentative. In a follow-up, zeeg called memory for agents a hard engineering problem and said he would not be surprised if things start breaking.

Junior's site links to the product itself. Separately, Teknium's shared diagram shows a Nous Research sketch about timestamping recalled state so that multi-agent workers stop treating stale context as current truth. That pairs well with zeeg's GitHub auth reply, which describes Junior mixing service-account reads with personal write scopes when attribution matters.

Memory

Junior's first memory pass is framed as persistent self-improvement inside the agent loop, not just a chat history cache.

According to zeeg's benchmark post, the immediate gains on one analytics workflow were:

runtime: about 3 minutes to 1 minute
tokens: down by roughly two-thirds
tool calls: down by 60%

zeeg's implementation reply adds one concrete design detail: the current memory layer uses simple embeddings.
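
To make "embeddings for memory" concrete, here is a minimal sketch of the general technique, not Junior's implementation. Everything in it is an assumption for illustration: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `MemoryStore`, `remember`, and `recall` are invented names. The idea it shows is only that notes are stored alongside vectors and retrieved by cosine similarity to the current task.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a sparse word-count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """Illustrative memory layer: store notes, recall the most similar ones."""

    def __init__(self) -> None:
        self.notes: list[tuple[str, Counter]] = []

    def remember(self, note: str) -> None:
        self.notes.append((note, toy_embed(note)))

    def recall(self, query: str, k: int = 1) -> list[str]:
        q = toy_embed(query)
        ranked = sorted(self.notes, key=lambda item: cosine(q, item[1]), reverse=True)
        return [note for note, _ in ranked[:k]]


store = MemoryStore()
store.remember("weekly signups live in the analytics events table")
store.remember("deploys go through the release checklist")
top = store.recall("where are weekly signups stored")
```

A note recalled this way could let an agent skip rediscovery steps on a repeated task, which is one plausible route to fewer tool calls and tokens; the source does not say how Junior's gains were actually achieved.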

Freshness

The strongest extra clue about how this kind of system has to behave comes from Teknium's shared diagram, which is a Nous Research schematic about stale handoffs in multi-agent work.

The diagram breaks the problem into six pieces:

sibling handoffs can be read as current truth even when they are old
bare text plus an absolute timestamp still looks like fact to the model
relative age labels like "completed 18h ago" push the model to re-check
recalled state should be framed as a point-in-time snapshot
the age stamp should cover parent results, comments, prior attempts, and role history
coarse buckets like "just now," "18h ago," and "3d ago" are easier for the model to read and safer under clock skew

That is not presented as Junior documentation, but it does show the class of failure mode memory systems run into once recalled state starts steering other workers.
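
The age-labeling idea in the diagram can be sketched in a few lines. This is an illustration under stated assumptions, not code from Nous Research or Junior: the function names `age_label` and `frame_snapshot`, the 5-minute "just now" cutoff, and the wording of the snapshot prefix are all invented here. It shows recalled text being wrapped as a point-in-time snapshot with a coarse relative age instead of a bare absolute timestamp.

```python
from datetime import datetime, timedelta, timezone


def age_label(recorded_at: datetime, now: datetime) -> str:
    """Return a coarse relative age such as 'just now', '18h ago', or '3d ago'."""
    age = now - recorded_at
    # Assumed cutoff: anything under 5 minutes, including slightly negative
    # ages caused by clock skew between workers, collapses to "just now".
    if age < timedelta(minutes=5):
        return "just now"
    if age < timedelta(hours=1):
        return f"{int(age.total_seconds() // 60)}m ago"
    if age < timedelta(days=1):
        return f"{int(age.total_seconds() // 3600)}h ago"
    return f"{age.days}d ago"


def frame_snapshot(text: str, recorded_at: datetime, now: datetime) -> str:
    """Wrap recalled state so it reads as a dated snapshot, not current truth."""
    return f"[snapshot from {age_label(recorded_at, now)}; re-check before relying on it] {text}"


now = datetime(2026, 6, 27, 12, 0, tzinfo=timezone.utc)
handoff = frame_snapshot("migration completed", now - timedelta(hours=18), now)
```

The same wrapper would apply to each kind of recalled item the diagram lists: parent results, comments, prior attempts, and role history.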

Auth paths

Junior is also exposing some of the plumbing around how an agent gets access to external systems.

In his GitHub auth reply, zeeg said Junior uses both service accounts and personal auth: service accounts handle broader read access, and personal auth is required to upgrade into write scopes where attribution matters. In zeeg's Slack reply, he said the OAuth link is sent into Slack but users still finish the flow in a browser.
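
One way to picture that split is a small credential picker. This is a hypothetical sketch, not Junior's code: `Credential`, `NeedsPersonalAuth`, and `pick_credential` are invented names, and treating every write as attribution-sensitive is a simplifying assumption. It shows reads going through a shared service account while writes require the user's own credential, with a missing personal credential signalling that an OAuth flow should start.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    kind: str   # "service_account" or "personal" (illustrative labels)
    token: str


class NeedsPersonalAuth(Exception):
    """Raised when a write is requested but the user has not authorized yet."""


def pick_credential(
    action: str, service: Credential, personal: Optional[Credential]
) -> Credential:
    """Choose which credential an agent should use for a given action."""
    if action == "read":
        # Broad read access goes through the shared service account.
        return service
    if personal is None:
        # Assumption: the caller reacts by sending the user an OAuth link.
        raise NeedsPersonalAuth("personal auth required for write scopes")
    # Writes are attributed to the person who asked for them.
    return personal


svc = Credential("service_account", "svc-token")
me = Credential("personal", "user-token")
read_cred = pick_credential("read", svc, None)
write_cred = pick_credential("write", svc, me)
```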

One more deployment detail comes from zeeg on Junior's stack, where he said Junior is built on top of Pi and Vercel, and that he mainly uses Pi as an SDK in Junior and Warden.
