What it is
Open-source autonomous pentest agent. The most architecturally bold project in the corpus — many of its choices look unconventional at first and turn out to be the right tradeoff for a long-running, multi-agent, security-sensitive workload.
What’s worth studying
Strix gets four things right that few other agents do:
- XML tool format with single-call-per-turn. The system prompt forbids more than one
<function>block, which lets the parser abort the stream the moment it sees</function>and stop paying for trailing hallucination. A free 10–20% per-step output saving. See streaming-early-stop. - Markdown-first skills directory. ~30 attack methodologies live as
.mdfiles instrix/skills/. New methodology = markdown PR. The agent’s loop almost never changes; the brain evolves weekly. See markdown-as-skills. - Authorized scope injected at render time. The agent’s allowed targets come from the platform DB, rendered into the system prompt by jinja, never from user chat. Defeats prompt injection by construction. See scope-injected-at-render.
- LLM-based dedupe of findings. Two sub-agents reporting “found SQL injection at /search” and “input validation missing in search.py” are the same bug. Hashing fails; LLM reasoning works. See llm-dedupe-root-cause.
A fifth choice that’s quieter but elegant: inter-agent messaging via plain Python module-level dicts. No Redis, no broker. One assumption (“one scan per process”), checked at boot, and you’re done. See module-level-mailbox.
Drill-down
The full per-doc analysis lives below — these are the original numbered analyses, rendered as styled HTML. Pick a section to study deeper.