Generic "summarize this conversation" loses the bits the agent needs to keep working. Mature systems enumerate preservation rules.
Insights (10)
One non-obvious trick per card. The clever bits that don't fit a textbook chapter: cache breakpoints, early-stop streaming, mailbox patterns, the small choices that shape an entire agent.
Come here when you want to be surprised. Filter by concept, project, difficulty, or tag — URL-driven so views are bookmarkable.
Decouples the agent's loop from its expertise. Domain experts contribute via PR; the loop almost never changes; the library evolves weekly.
Two pentest reports describing the same SQL injection with different payloads aren't textually similar — but they should dedupe. Hashing fails; LLM reasoning works.
Survives every refactor, no marker objects to remember to add. Lets dozens of contributors compose one prompt without breaking cross-org cache reuse.
A free 10-20% cost reduction per agent step. Compounds across hundreds of steps in a session.
A specific failure mode (empty response with temp=0) has a specific cheap fix. Worth knowing because it's not in any tutorial.
Lose this and the model re-thinks every turn (cost spike) or you crash on model switch with an opaque signature error.
User-toggleable flags that live in the system prompt would bust 50-70K cache tokens on every toggle. Latching trades UX flexibility for a 10× cost cut.
For one-process multi-agent coordination, plain Python dicts are the right answer. No Redis, no broker, no race conditions you need locks for.
A pentest agent that can be talked out of scope is dangerous. Putting scope in the locked system prompt — not the message log — defeats prompt injection.