Tagged: update

4 posts

May 21, 2026

Sub-Agents and the Parent's Clean Context

Merlin now spawns sub-agents inside the agent loop. The honest pitch: not a cost-saver. A quality multiplier (each child gives one task its full attention), plus predictability when input sizes are unknown.

update, merlin, agent-loop

May 20, 2026

The one-shot arcade: making models perform in public

Nine classic games, written in a single prompt by an LLM, playable live on the public site. Pyodide runs the Python ones; sandboxed iframes run the HTML ones. A new tier of bench checks runs the programs and asserts the output. Models that scored 100% on the static checks turned out to ship chess boards that are upside down. The kind of failure you can only catch by playing.

update, merlin, benchmarks

May 17, 2026

Watching Merlin work: per-tool telemetry and the plugin-first push

A per-tool summary on every run. Tool-usage chips on the benchmarks page. Typed cargo / files / git / node plugins replacing common shell-exec calls. Granular approval flags. And the last three plugins of the cycle were written by Merlin itself.

update, merlin, plugins

May 15, 2026

Discord, Run-Anywhere CLI, and a Better Place to Start

Discord bridge, run-anywhere CLI, NDJSON streaming, a new specsync-create plugin. A tour of what landed in Merlin this cycle.

update, merlin