Section 3 of 6

Insights (10)

One non-obvious trick per card. The clever bits that don't fit a textbook chapter: cache breakpoints, early-stop streaming, mailbox patterns, the small choices that shape an entire agent.

Come here when you want to be surprised. Filter by concept, project, difficulty, or tag — URL-driven so views are bookmarkable.

Search Concept Project Difficulty Tag

OpenHands (v0) ●●●

Memory compression preserves credentials, payloads, task IDs explicitly

Generic "summarize this conversation" loses the bits the agent needs to keep working. Mature systems enumerate preservation rules.

memory-compression

Strix ●●●

Markdown-as-prompt-library architecture

Decouples the agent's loop from its expertise. Domain experts contribute via PR; the loop almost never changes; the library evolves weekly.

skills-as-md agent-loop

Strix ●●●

LLM-based deduplication that reasons about root cause

Two pentest reports describing the same SQL injection with different payloads aren't textually similar — but they should dedupe. Hashing fails; LLM reasoning works.

memory-compression multi-agent-coordination

Claude Code ●●●

Cache boundary as a literal sentinel string

Survives every refactor, no marker objects to remember to add. Lets dozens of contributors compose one prompt without breaking cross-org cache reuse.

prompt-caching

Strix ●●●

Streaming early stop on </function>

A free 10-20% cost reduction per agent step. Compounds across hundreds of steps in a session.

streaming-tools tool-calling-formats agent-loop

OpenHands (v0) ●●●

Temperature perturbation as recovery from empty responses

A specific failure mode (empty response with temp=0) has a specific cheap fix. Worth knowing because it's not in any tutorial.

agent-loop guardrails

Claude Code ●●●

Thinking signature preservation across turns (and stripping on model switch)

Lose this and the model re-thinks every turn (cost spike) or you crash on model switch with an opaque signature error.

extended-thinking prompt-caching agent-loop

Claude Code ●●●

Latched sticky flags for cache coherence

User-toggleable flags that live in the system prompt would bust 50-70K cache tokens on every toggle. Latching trades UX flexibility for a 10× cost cut.

prompt-caching

Strix ●●●

Inter-agent messaging via module-level dicts

For one-process multi-agent coordination, plain Python dicts are the right answer. No Redis, no broker, no race conditions you need locks for.

multi-agent-coordination mailbox-patterns

Strix ●●●

Authorized scope injected into system prompt at render time

A pentest agent that can be talked out of scope is dangerous. Putting scope in the locked system prompt — not the message log — defeats prompt injection.

guardrails multi-agent-coordination