TLDR
The --exclude-dynamic-system-prompt-sections flag currently achieves ~82% cross-user cache sharing, but actually ~98% of the content is sharable. Currently, the flag moves a ~14K-char block from the system prompt into the user message, but ~12K of that block is shared boilerplate (auto-memory type definitions, examples, save/access instructions) identical for every user. Only ~550 chars actually vary per user. Keeping the shared boilerplate in the system prompt and only moving the truly per-user values would significantly improve cross-user cache hit rates.
Context
The --exclude-dynamic-system-prompt-sections flag (CLI) / exclude_dynamic_sections (SDK SystemPromptPreset) was a great addition for enabling cross-user prompt caching. It moves per-user dynamic sections from the system prompt into a <system-reminder> block in the first user message, so that the system prompt is identical across users and can be cached.
The flag works correctly — system prompts are byte-for-byte identical across different users/directories when it's enabled. However, there's room to further improve how much content stays in the cacheable system prompt.
Observation
When exclude_dynamic_sections: true is set, the following content gets moved from the system prompt to the first user message:
| Content |
Size |
Varies per user? |
| Auto-memory type definitions, examples, save/access instructions |
~12,000 chars |
No — identical for every user |
| Auto-memory header + memory storage path |
~740 chars |
Yes — the path varies, the instructions don't |
| Environment section (CWD, git status, platform, shell, OS, model info) |
~985 chars |
Partially — CWD, git status, platform, shell, OS vary; model info boilerplate doesn't |
| Date |
~40 chars |
Yes |
| CLAUDE.md file path and contents |
~100 chars |
Yes |
| Git recent commits |
~15 chars |
Yes |
The relocated block totals ~13.9K chars. About ~12K of that (~87%) is the auto-memory instructions template — identical for every user. The rest contains a mix of per-user values and shared boilerplate. Only ~550 chars actually vary between users/machines.
Caching impact
Measuring system prompt + first user message (excluding tools):
| Configuration |
Cacheable (system prompt) |
Per-request (user message) |
Cache sharing |
| Without flag (baseline) |
0% — system prompt differs per user |
100% |
0% |
| With flag (current) |
46% |
54% |
~46% |
| With proposed improvement |
93% |
7% |
~93% |
Reproducer
See this repro folder for:
reproduce.sh — Self-contained script that captures actual API request bodies with and without the flag. Requires the ANTHROPIC_API_KEY environment variable to be set.
proxy_server.py — Intercepting HTTP proxy that logs request bodies.
Diffs
Here's how the flag currently changes the outgoing API request: baseline vs. with flag enabled.
And here's the additional change we'd like to see — moving shared boilerplate back into the cacheable system prompt while keeping only per-user values in the user message: current behavior vs desired behavior.
Suggestion
A template-and-bind approach could work well here: keep the auto-memory instructions, environment template text, and other shared documentation in the system prompt with placeholders (e.g., {{MEMORY_PATH}}, {{CWD}}), and resolve them using per-user values provided in the first user message.
Everything that doesn't vary per user/machine (memory type definitions, save/access instructions, examples, model info boilerplate, environment template text) is identical for all users on the same model and could stay in the system prompt.
Environment
- Claude Code CLI:
2.1.98
- Claude Agent SDK (Python):
0.1.58
- Model tested:
haiku → claude-haiku-4-5-20251001
- OS: Linux arm64
Related
TLDR
The
--exclude-dynamic-system-prompt-sectionsflag currently achieves~82%cross-user cache sharing, but actually~98%of the content is sharable. Currently, the flag moves a~14K-char block from the system prompt into the user message, but~12Kof that block is shared boilerplate (auto-memory type definitions, examples, save/access instructions) identical for every user. Only~550chars actually vary per user. Keeping the shared boilerplate in the system prompt and only moving the truly per-user values would significantly improve cross-user cache hit rates.Context
The
--exclude-dynamic-system-prompt-sectionsflag (CLI) /exclude_dynamic_sections(SDKSystemPromptPreset) was a great addition for enabling cross-user prompt caching. It moves per-user dynamic sections from the system prompt into a<system-reminder>block in the first user message, so that the system prompt is identical across users and can be cached.The flag works correctly — system prompts are byte-for-byte identical across different users/directories when it's enabled. However, there's room to further improve how much content stays in the cacheable system prompt.
Observation
When
exclude_dynamic_sections: trueis set, the following content gets moved from the system prompt to the first user message:~12,000chars~740chars~985chars~40chars~100chars~15charsThe relocated block totals
~13.9Kchars. About~12Kof that (~87%) is the auto-memory instructions template — identical for every user. The rest contains a mix of per-user values and shared boilerplate. Only~550chars actually vary between users/machines.Caching impact
Measuring system prompt + first user message (excluding tools):
0%— system prompt differs per user100%0%46%54%~46%93%7%~93%Reproducer
See
this repro folderfor:reproduce.sh— Self-contained script that captures actual API request bodies with and without the flag. Requires theANTHROPIC_API_KEYenvironment variable to be set.proxy_server.py— Intercepting HTTP proxy that logs request bodies.Diffs
Here's how the flag currently changes the outgoing API request: baseline vs. with flag enabled.
And here's the additional change we'd like to see — moving shared boilerplate back into the cacheable system prompt while keeping only per-user values in the user message: current behavior vs desired behavior.
Suggestion
A template-and-bind approach could work well here: keep the auto-memory instructions, environment template text, and other shared documentation in the system prompt with placeholders (e.g.,
{{MEMORY_PATH}},{{CWD}}), and resolve them using per-user values provided in the first user message.Everything that doesn't vary per user/machine (memory type definitions, save/access instructions, examples, model info boilerplate, environment template text) is identical for all users on the same model and could stay in the system prompt.
Environment
2.1.980.1.58haiku→claude-haiku-4-5-20251001Related
--exclude-dynamic-system-prompt-sections: Preset system prompt contains per-user dynamic content, breaking cross-user prompt caching #784