The load-bearing stage of an agentic audit swarm: hypothesis particles leave the contract and hold at the falsifier gate while a proof-of-concept runs. Hallucinated findings burst into debris; executable ones pass into the report.
We ran multi-agent LLM pipelines against historical exploit corpora and live audit engagements. The results reshape where AI fits in a security review — and where it absolutely doesn't.