Improve cross-user prompt cache sharing with `--exclude-dynamic-system-prompt-sections`

## TLDR

The `--exclude-dynamic-system-prompt-sections` flag currently achieves `~82%` cross-user cache sharing, but actually `~98%` of the content is sharable. Currently, the flag moves a `~14K`-char block from the system prompt into the user message, but `~12K` of that block is shared boilerplate (auto-memory type definitions, examples, save/access instructions) identical for every user. Only `~550` chars actually vary per user. Keeping the shared boilerplate in the system prompt and only moving the truly per-user values would significantly improve cross-user cache hit rates.

## Context

The `--exclude-dynamic-system-prompt-sections` flag (CLI) / `exclude_dynamic_sections` (SDK `SystemPromptPreset`) [was a great addition](https://github.com/anthropics/claude-agent-sdk-python/issues/784) for enabling cross-user prompt caching. It moves per-user dynamic sections from the system prompt into a `<system-reminder>` block in the first user message, so that the system prompt is identical across users and can be cached.

The flag works correctly — system prompts are byte-for-byte identical across different users/directories when it's enabled. However, there's room to further improve how much content stays in the cacheable system prompt.

## Observation

When `exclude_dynamic_sections: true` is set, the following content gets moved from the system prompt to the first user message:

| Content | Size | Varies per user? |
|---|---|---|
| Auto-memory type definitions, examples, save/access instructions | `~12,000` chars | **No** — identical for every user |
| Auto-memory header + memory storage path | `~740` chars | **Yes** — the path varies, the instructions don't |
| Environment section (CWD, git status, platform, shell, OS, model info) | `~985` chars | **Partially** — CWD, git status, platform, shell, OS vary; model info boilerplate doesn't |
| Date | `~40` chars | **Yes** |
| CLAUDE.md file path and contents | `~100` chars | **Yes** |
| Git recent commits | `~15` chars | **Yes** |

The relocated block totals `~13.9K` chars. About `~12K` of that (`~87%`) is the auto-memory instructions template — identical for every user. The rest contains a mix of per-user values and shared boilerplate. Only `~550` chars actually vary between users/machines.

### Caching impact

Measuring system prompt + first user message (excluding tools):

| Configuration | Cacheable (system prompt) | Per-request (user message) | Cache sharing |
|---|---|---|---|
| Without flag (baseline) | `0%` — system prompt differs per user | `100%` | `0%` |
| **With flag (current)** | **`46%`** | **`54%`** | **`~46%`** |
| **With proposed improvement** | **`93%`** | **`7%`** | **`~93%`** |

## Reproducer

See [`this repro folder`](https://github.com/alx32/share_files/tree/main/002_claude_exclude_dynamic) for:

- **`reproduce.sh`** — Self-contained script that captures actual API request bodies with and without the flag. Requires the `ANTHROPIC_API_KEY` environment variable to be set.
- **`proxy_server.py`** — Intercepting HTTP proxy that logs request bodies.

### Diffs

Here's how the flag currently changes the outgoing API request: [baseline vs. with flag enabled](https://github.com/alx32/share_files/commit/451bc70bfb934e9b087cf3f0d2e78fb7827295c9).

And here's the additional change we'd like to see — moving shared boilerplate back into the cacheable system prompt while keeping only per-user values in the user message: [current behavior vs desired behavior](https://github.com/alx32/share_files/commit/3d04af0baad9ef4de8c7867ae30d466aa4cc0305).

## Suggestion

A template-and-bind approach could work well here: keep the auto-memory instructions, environment template text, and other shared documentation in the system prompt with placeholders (e.g., `{{MEMORY_PATH}}`, `{{CWD}}`), and resolve them using per-user values provided in the first user message.

Everything that doesn't vary per user/machine (memory type definitions, save/access instructions, examples, model info boilerplate, environment template text) is identical for all users on the same model and could stay in the system prompt.

## Environment

- Claude Code CLI: `2.1.98`
- Claude Agent SDK (Python): `0.1.58`
- Model tested: `haiku` → `claude-haiku-4-5-20251001`
- OS: Linux arm64

## Related

- Original issue requesting `--exclude-dynamic-system-prompt-sections`: #784


Content	Size	Varies per user?
Auto-memory type definitions, examples, save/access instructions	`~12,000` chars	No — identical for every user
Auto-memory header + memory storage path	`~740` chars	Yes — the path varies, the instructions don't
Environment section (CWD, git status, platform, shell, OS, model info)	`~985` chars	Partially — CWD, git status, platform, shell, OS vary; model info boilerplate doesn't
Date	`~40` chars	Yes
CLAUDE.md file path and contents	`~100` chars	Yes
Git recent commits	`~15` chars	Yes

Configuration	Cacheable (system prompt)	Per-request (user message)	Cache sharing
Without flag (baseline)	`0%` — system prompt differs per user	`100%`	`0%`
With flag (current)	`46%`	`54%`	`~46%`
With proposed improvement	`93%`	`7%`	`~93%`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve cross-user prompt cache sharing with `--exclude-dynamic-system-prompt-sections` #807

TLDR

Context

Observation

Caching impact

Reproducer

Diffs

Suggestion

Environment

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Improve cross-user prompt cache sharing with --exclude-dynamic-system-prompt-sections #807

Description

TLDR

Context

Observation

Caching impact

Reproducer

Diffs

Suggestion

Environment

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Improve cross-user prompt cache sharing with `--exclude-dynamic-system-prompt-sections` #807