AI Primer

Micropython-wasm, TakoVM, and FSM runtimes add WASM and gVisor guards for agents

New open-source runtime projects moved agent control away from unconstrained loops toward WebAssembly sandboxes, gVisor-backed filesystems, deterministic FSMs, and YAML state engines. Teams should use these guards to bound file access, orchestration, and history growth before a model can mutate full system state.

TL;DR

  • Simon Willison's announcement pointed to his post "Running Python code in a sandbox with MicroPython and WASM," which describes an alpha package, micropython-wasm, that runs MicroPython inside Wasmtime to block arbitrary file and network access and to cap CPU and memory use.
  • In the TakoVM post, the author pitched a Docker sandbox with optional gVisor, and the TakoVM README adds built-in job queues, retries, execution history, replay, and network isolation by default.
  • ale007xd's runtime post replaces model-owned loops with a deterministic FSM, adds AST-only conditions, hides governance state behind a ProjectionLayer, and flags unstable runs when transition entropy crosses 2.5 bits.
  • Mobile_Star3587's TrueNorth post takes the opposite path from heavy Python state graphs: agents are defined in YAML, run locally with Ollama, compress old chat into a fact sheet, and reportedly cut false claims from 18% to about 2% with a three-stage conflict detector.

You can read Simon Willison's full sandbox write-up, inspect the TakoVM repo, and compare that container-heavy approach with two Reddit-native runtime experiments, a deterministic FSM stack and a YAML-first intake engine. The interesting common move is not a new model. It is pushing orchestration, file access, and history management out of the model's hands.

MicroPython in WASM

Willison's blog post is the clearest statement of the new constraint-first mood. He wanted plugin-style Python execution without full host privileges, with strict file controls, no ambient network access, and explicit CPU and memory limits.

The implementation details are more concrete than the tweet. According to Willison's write-up, micropython-wasm packages a custom MicroPython WASI build for Wasmtime, keeps interpreter state alive through a thread plus a request queue, and uses a 362 KB WebAssembly blob plus 78 lines of C to expose selected host functions. The same post says Wasmtime memory limits work today, while CPU control currently relies on a default budget of 20 million units of "fuel," which he calls experimental.
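The "thread plus request queue" pattern can be illustrated with a minimal standard-library sketch. This is not code from micropython-wasm: the `PersistentWorker` class, its `set`/`get` operations, and the plain dict standing in for interpreter state are all hypothetical, chosen only to show how one long-lived thread can own state while callers submit requests and wait for replies.

```python
import queue
import threading


class PersistentWorker:
    """Owns long-lived state on a single thread; callers talk to it via a queue.

    Hypothetical illustration: a dict stands in for the interpreter state
    that a real sandbox would keep alive between calls.
    """

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        state = {}  # survives across requests because the thread never restarts
        while True:
            item = self._requests.get()
            if item is None:  # sentinel: shut the worker down
                break
            (op, key, value), reply = item
            if op == "set":
                state[key] = value
                reply.put(("ok", None))
            elif op == "get":
                reply.put(("ok", state.get(key)))
            else:
                reply.put(("error", f"unknown op {op!r}"))

    def call(self, op, key, value=None, timeout=1.0):
        """Send one request and block until the worker replies."""
        reply = queue.Queue(maxsize=1)
        self._requests.put(((op, key, value), reply))
        return reply.get(timeout=timeout)

    def close(self):
        self._requests.put(None)
        self._thread.join()
```

A caller would create one worker, issue `call("set", "x", 1)` followed by `call("get", "x")`, and see the value persist between the two requests because only the worker thread ever touches the state.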

The companion release note, micropython-wasm 0.1a2, says the package now has a CLI, and the PyPI page describes it as a one-shot sandbox for small MicroPython snippets. The quick try path is exactly what simonw's example shows: uvx micropython-wasm -c 'print("Hello world")'.

TakoVM

r/LLMDevs

TakoVM, an isolated/sandbox job execution for agents!

TakoVM takes the more familiar container route, but wraps a lot more runtime machinery around it than a thin sandbox service. The Reddit post says enterprises are already using it to run agents against a filesystem sandboxed with Docker plus gVisor, with optional dependency installation and read-only path injection for context.

The README fills in the missing pieces:

  • Docker isolation for each job, with seccomp filtering.
  • Optional gVisor sandboxing on top.
  • No network by default, with per-job allowlists.
  • Built-in job queue and worker pool, no Redis or Celery setup.
  • PostgreSQL-backed execution history with stdout, stderr, timing, and artifacts.
  • Rerun and fork APIs for replay and debugging.
  • Self-hosted deployment, described as offline-capable.
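The deny-by-default posture in the list above can be sketched as a small policy object. This is an assumption-laden illustration, not TakoVM's API: `JobPolicy`, `allows_host`, and `allows_write` are hypothetical names, and the path check is purely string-based, so it does not account for symlinks the way a real container mount would.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class JobPolicy:
    """Hypothetical per-job policy: no network and no read-only paths by default."""

    allowed_hosts: frozenset = frozenset()  # store hostnames in lowercase
    read_only_paths: tuple = ()

    def allows_host(self, host):
        # An empty allowlist means every outbound host is refused.
        return host.lower() in self.allowed_hosts

    def allows_write(self, path):
        # Normalize first so "/work/../context/a.txt" cannot dodge the check.
        target = posixpath.normpath(path)
        for root in self.read_only_paths:
            root = posixpath.normpath(root)
            if target == root or target.startswith(root.rstrip("/") + "/"):
                return False
        return True
```

With `JobPolicy()` every host is refused, while `JobPolicy(allowed_hosts=frozenset({"pypi.org"}), read_only_paths=("/context",))` permits one host and rejects writes under `/context`. The design point is that the job has to opt in to each capability rather than opt out.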

That makes TakoVM look less like a raw sandbox and more like an execution backend for code-writing agents. The interesting part is the package boundary: the model can generate code, but the runtime owns isolation, retries, history, and replay.

Deterministic FSMs

r/ArtificialInteligence

Why we locked an LLM inside a deterministic FSM (and built a failure laboratory around it)

The most opinionated design in the set is the FSM runtime that explicitly rejects model-owned orchestration. In the post, the author argues that autonomous loop selection and tool choice make systems hard to audit, replay, bound, or reason about in regulated environments, then proposes a deterministic runtime where "system decides, LLM computes."

The post breaks that runtime into six parts:

  1. A deterministic FSM where the runtime, not the model, controls transitions and topology.
  2. A ProjectionLayer that withholds governance metadata, rollback density, policy internals, and anomaly state from the model.
  3. A constrained AST engine for conditions, with no eval(), exec(), method calls, or unrestricted expressions.
  4. Transition entropy monitoring, where structural instability above 2.5 bits is treated as a warning signal.
  5. A failure lab with attack injectors for tool injection, policy bypass, step reordering, corrupted receipts, and GDPR erase simulation.
  6. Transactional code mutation, where stage_patch(), mypy validation, and pytest must pass before commit, otherwise the repo rolls back.
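Item 3 in the list above, AST-only conditions, can be approximated with Python's standard `ast` module. The sketch below is one possible reading of that idea rather than the author's engine: `eval_condition` is a hypothetical name, and the allowed node set (constants, variable names, comparisons, `and`/`or`/`not`) is an assumption. Anything else, including calls and attribute access, is rejected before any evaluation happens.

```python
import ast
import operator

# Comparison operators the evaluator is willing to apply.
_COMPARE = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
}


def eval_condition(expr, variables):
    """Evaluate a boolean condition by walking its AST; never calls eval()/exec()."""
    tree = ast.parse(expr, mode="eval")
    return bool(_eval(tree.body, variables))


def _eval(node, variables):
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (bool, int, float, str)):
            return node.value
        raise ValueError("disallowed constant")
    if isinstance(node, ast.Name):
        if node.id not in variables:
            raise ValueError(f"unknown variable {node.id!r}")
        return variables[node.id]
    if isinstance(node, ast.BoolOp):
        values = [_eval(v, variables) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not _eval(node.operand, variables)
    if isinstance(node, ast.Compare):
        left = _eval(node.left, variables)
        for op, comparator in zip(node.ops, node.comparators):
            fn = _COMPARE.get(type(op))
            if fn is None:
                raise ValueError(f"disallowed operator {type(op).__name__}")
            right = _eval(comparator, variables)
            if not fn(left, right):
                return False
            left = right  # supports chained comparisons such as 0 < x < 10
        return True
    # Calls, attribute access, subscripts, lambdas, etc. all land here.
    raise ValueError(f"disallowed syntax: {type(node).__name__}")
```

A condition such as `retries < 3 and status == 'ok'` evaluates against a supplied variable dict, while `__import__('os').system('id')` or `x.real > 0` raises `ValueError` because `Call` and `Attribute` nodes are not on the allowlist.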

The same post claims the current repo has 51 of 51 tests passing with zero mypy errors. It is a very different answer to agent safety than a sandbox. Instead of isolating bad code after generation, it tries to make the orchestration graph itself non-negotiable.
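The post does not define how transition entropy is computed, so the following is only one plausible interpretation: Shannon entropy, in bits, over the empirical distribution of observed (from, to) state transitions in a run. The function names are hypothetical; the 2.5-bit threshold is the figure quoted in the post.

```python
import math
from collections import Counter

ENTROPY_WARN_BITS = 2.5  # warning threshold quoted in the post


def transition_entropy_bits(states):
    """Shannon entropy (bits) of the observed (from, to) transition distribution.

    Assumption: the post's exact definition is not given; this is one reading.
    """
    transitions = Counter(zip(states, states[1:]))
    total = sum(transitions.values())
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log2(count / total)
        for count in transitions.values()
    )


def is_unstable(states, threshold=ENTROPY_WARN_BITS):
    """Flag a run whose transitions are spread too evenly across many edges."""
    return transition_entropy_bits(states) > threshold
```

Under this reading, a run that only bounces between two states has two equally likely transitions and scores 1 bit, well under the threshold. A run that takes eight distinct transitions once each scores 3 bits and would be flagged.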

YAML state engines

r/AI_Agents

I got tired of building heavy Python state machines, so I built a YAML-first agent framework for structured JSON extraction

TrueNorth lands in a middle zone between free-form agents and hard-coded FSMs. The pitch in the Reddit post is narrow on purpose: define required fields in YAML, let the engine manage conversational state, and end with a clean JSON object for intake-style workflows.

The mechanics called out in the post are unusually specific:

  • Local-first execution through Ollama.
  • A three-stage conflict detector that reportedly reduced false claims from 18% to about 2% in the author's tests.
  • Automatic token compression that turns older chat history into a dense fact sheet.
  • A schema where fields, types, required flags, prompts, and output format live in YAML instead of Python control flow.
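The schema-driven flow described in the list above can be sketched without any framework. Nothing here is TrueNorth's actual format or API: the `SCHEMA` dict stands in for what a YAML file might look like once loaded, and the field names and the `next_prompt`, `record`, and `finalize` helpers are hypothetical.

```python
import json

# Hypothetical schema, shown as the dict a YAML loader might produce.
SCHEMA = {
    "fields": [
        {"name": "full_name", "type": "str", "required": True,
         "prompt": "What is your full name?"},
        {"name": "age", "type": "int", "required": True,
         "prompt": "How old are you?"},
        {"name": "notes", "type": "str", "required": False,
         "prompt": "Anything else we should know?"},
    ]
}

_TYPES = {"str": str, "int": int}


def next_prompt(schema, collected):
    """Return the prompt for the first missing required field, or None if done."""
    for spec in schema["fields"]:
        if spec["required"] and spec["name"] not in collected:
            return spec["prompt"]
    return None


def record(schema, collected, name, raw_value):
    """Coerce a raw answer to the declared type and store it."""
    spec = next(f for f in schema["fields"] if f["name"] == name)
    collected[name] = _TYPES[spec["type"]](raw_value)


def finalize(schema, collected):
    """Emit the final JSON object once every required field is present."""
    if next_prompt(schema, collected) is not None:
        raise ValueError("required fields are still missing")
    return json.dumps(collected, sort_keys=True)
```

The engine, not the model, decides what to ask next: `next_prompt` walks the schema, `record` enforces the declared type, and `finalize` refuses to emit JSON until the required fields are filled. Changing the intake flow then means editing data rather than control flow.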

That is a quieter shift than the sandboxing work, but it points at the same target. If long-context chat agents are failing because state, memory, and control flow are too implicit, one obvious fix is to move those concerns into YAML schemas, bounded runtimes, or container policies before the model ever touches the next step.
