Release: March 16, 2026

Hankweave adds runtime budgets for dollars, tokens, and wall-clock time

Hankweave shipped budget controls that cap spend, tokens, and elapsed time globally or per step, including loop budgets and shared pools. Use them to prototype or productionize long agent runs without hand-managing model switches and failure states.

TL;DR

Hankweave shipped budget controls that let runtime users cap three resources at once: dollars, elapsed time, and tokens, as described in the launch thread.
Builders can now shape budgets inside a workflow by assigning step percentages, setting loop ceilings, reserving minimum spend for critical steps, and using shared pools instead of “first-past-the-post” allocation, according to the follow-up thread.
The practical change is that a complex agent run can be trialed with a small cap like -m haiku --max-cost 0.50, while Hankweave handles model switches and cost management under the hood, per the launch thread.
Hankweave’s runtime now does a preflight budget calculation per codon and can warn when limits conflict or a run is likely to fail, as the follow-up thread explains.

What shipped

Hankweave added budget controls for long agentic runs across “the 3 primary costs involved in AI — time, dollars and tokens,” in the words of the launch thread. The new controls work at both runtime and build time: runtime users can set hard ceilings like “Don’t spend more than $100” or “Don’t take longer than 15 hours or 750K tokens,” while workflow authors can express per-step constraints inside the pipeline itself.
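
To make the three-resource ceiling concrete, here is a minimal sketch of how a run-level check over dollars, time, and tokens might look. It is not Hankweave's API: the `Ceiling`, `Usage`, and `exceeded` names are hypothetical, and combining the quoted limits into a single ceiling is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ceiling:
    """Hard limits for a whole run (hypothetical names, not Hankweave's API)."""
    max_dollars: float
    max_seconds: float
    max_tokens: int


@dataclass
class Usage:
    """Resources consumed so far."""
    dollars: float = 0.0
    seconds: float = 0.0
    tokens: int = 0


def exceeded(ceiling: Ceiling, usage: Usage) -> list[str]:
    """Return the name of every limit that usage has gone past."""
    hits = []
    if usage.dollars > ceiling.max_dollars:
        hits.append("dollars")
    if usage.seconds > ceiling.max_seconds:
        hits.append("time")
    if usage.tokens > ceiling.max_tokens:
        hits.append("tokens")
    return hits


# The limits quoted above, combined: $100, 15 hours, 750K tokens.
ceiling = Ceiling(max_dollars=100.0, max_seconds=15 * 3600, max_tokens=750_000)
over = exceeded(ceiling, Usage(dollars=42.0, seconds=3600.0, tokens=800_000))
print(over)  # ['tokens']
```

Checking all three limits on every update means a run can stop on whichever resource runs out first, even when the other two are well within bounds.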

The technical payload goes beyond a single global maximum. According to the follow-up thread, builders can give one step “20% of the budget,” loop steps until a dollar threshold is reached, reserve money for a later must-complete step, and “distribute token costs from a shared pool” across related steps. The budgets docs position this as a way to encode execution policy directly into the run graph rather than hand-tuning each model invocation.
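
The three builder-side ideas above (a percentage share, a loop with a dollar ceiling, and a shared token pool) can be sketched in a few lines. This is an illustrative model only: the function and class names are hypothetical, amounts are kept in integer cents to avoid floating-point drift, and the loop is assumed to stop before an iteration that would cross its ceiling, which may differ from how Hankweave handles the boundary.

```python
def step_share_cents(total_cents: int, percent: int) -> int:
    """Budget for a step that is given a percentage of the total."""
    return total_cents * percent // 100


def loop_until_ceiling(ceiling_cents: int, cost_per_iteration_cents: int) -> tuple[int, int]:
    """Run iterations while the next one still fits under the loop ceiling.

    Returns (iterations_run, cents_spent).
    """
    if cost_per_iteration_cents <= 0:
        raise ValueError("iteration cost must be positive")
    iterations = 0
    spent = 0
    while spent + cost_per_iteration_cents <= ceiling_cents:
        spent += cost_per_iteration_cents
        iterations += 1
    return iterations, spent


class SharedTokenPool:
    """A token pool that several related steps draw from."""

    def __init__(self, tokens: int) -> None:
        self.remaining = tokens

    def draw(self, requested: int) -> int:
        """Grant as many of the requested tokens as the pool still holds."""
        granted = min(requested, self.remaining)
        self.remaining -= granted
        return granted


total = 1_000                                  # a $10.00 run, in cents
research = step_share_cents(total, 20)         # "20% of the budget"
iterations, spent = loop_until_ceiling(research, 30)

pool = SharedTokenPool(50_000)
first = pool.draw(30_000)
second = pool.draw(30_000)                     # only what is left is granted
```

In this model, a step that asks for more than the pool holds receives the remainder rather than failing outright; a stricter policy could raise an error instead.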

How the runtime allocates and fails runs

The runtime computes budgets before execution. As the follow-up thread puts it, “the preflight check calculates the budget and shared pool for each codon ahead of time,” then warns if constraints are inconsistent or if “a hank is going to fail because of a conflict.” That matters for long multi-step jobs where a late-stage failure can waste both tokens and wall-clock time.
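
A preflight pass of this kind can be approximated as a pure function that computes every allocation up front and collects warnings before any model is called. The sketch below is an assumption-laden illustration, not Hankweave's implementation: the `Step` fields, the `preflight` function, and the specific conflict rules are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One step in a run (hypothetical shape, not Hankweave's schema)."""
    name: str
    percent: int = 0        # share of the global budget, 0 to 100
    reserve_cents: int = 0  # minimum spend set aside for this step


def preflight(total_cents: int, steps: list[Step]) -> tuple[dict[str, int], list[str]]:
    """Compute each step's allocation ahead of time and report conflicts."""
    allocations = {}
    for step in steps:
        share = total_cents * step.percent // 100
        # A reserved minimum overrides a smaller percentage share.
        allocations[step.name] = max(share, step.reserve_cents)

    warnings = []
    if sum(step.percent for step in steps) > 100:
        warnings.append("step percentages add up to more than 100%")
    if sum(step.reserve_cents for step in steps) > total_cents:
        warnings.append("reserved minimums exceed the global ceiling")
    if sum(allocations.values()) > total_cents:
        warnings.append("combined allocations exceed the global ceiling")
    return allocations, warnings


good_plan = [Step("research", 20), Step("revise", 50), Step("final_edit", 30, reserve_cents=300)]
bad_plan = [Step("research", 20), Step("revise", 50), Step("final_edit", 40, reserve_cents=500)]

good_alloc, good_warnings = preflight(1_000, good_plan)
bad_alloc, bad_warnings = preflight(1_000, bad_plan)
```

Because the check runs on the plan alone, a conflicting configuration like `bad_plan` is flagged before any tokens or wall-clock time are spent.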

The screenshots in the launch thread and the follow-up thread show the allocation model in more detail: a global ceiling and time limit, proportional allocation where “unspent flows to later codons,” loop-level budgets, per-step token caps, and explicit failure behavior on overrun. In one example, a final-edit step on Opus is marked “must finish or the run fails,” while earlier research and revision steps are allowed to complete within shared or capped pools. That makes the feature useful for both cheap proof-of-concept runs and production pipelines with critical terminal steps.
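
The "unspent flows to later codons" behavior can be illustrated with a simple rollover model: each step gets its proportional share plus whatever earlier steps left unspent, and the run fails if a step needs more than it has available. This is a sketch under those assumptions, with hypothetical names (`allocate_with_rollover`, `BudgetExceeded`) and amounts in integer cents; the actual allocation rules are defined by Hankweave's budgets docs.

```python
class BudgetExceeded(Exception):
    """Raised when a step needs more than its share plus carried-over budget."""


def allocate_with_rollover(
    total_cents: int, percents: list[int], costs_cents: list[int]
) -> list[int]:
    """Return the budget available to each step, carrying unspent cents forward."""
    carried = 0
    available_per_step = []
    for index, (percent, cost) in enumerate(zip(percents, costs_cents)):
        available = total_cents * percent // 100 + carried
        if cost > available:
            raise BudgetExceeded(
                f"step {index} needs {cost} cents but only {available} are available"
            )
        available_per_step.append(available)
        carried = available - cost  # unspent budget flows to the next step
    return available_per_step


# A $10.00 run split 20% / 50% / 30% across three steps.
available = allocate_with_rollover(1_000, [20, 50, 30], [100, 550, 300])
print(available)  # [200, 600, 350]
```

In the example, the second step costs more than its 50% share but still completes, because the first step left 100 cents unspent.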