Five layers deep
We shipped a destructive-op gate and thought safety was done. Over the next six hours of red-teaming we found five more deletion paths, each surfaced while testing the previous fix. Then we ran a six-probe adversarial sweep to confirm the chain holds. Here's the full audit, the sharpening pass, the validation, and what we learned about how safety thinking generalizes.
safety, merlin, agent-loop, lessons, post-mortem