fix: prevent recovery cascade and improve Stop-scenario reasoning lookup#25
Conversation
Two fixes for the reasoning_content cache: 1. Preserve the original conversation scope in PreparedRequest before recovery strips messages, so record_response_reasoning stores reasoning with the same scope that normalize_message will later use for lookup. This stops the scope mismatch cascade where recovery -> stripped scope -> cache miss -> recovery again. 2. Add a tool_name fallback in lookup_for_message as the fourth lookup level (after message_signature, tool_call_id, and tool_call_signature). When the user presses Stop during streaming tool-call arguments, only the function name may be complete; this fallback matches on function name alone, ignoring incomplete arguments. Made-with: Cursor
There was a problem hiding this comment.
Pull request overview
Fixes reasoning-content cache behavior to avoid repeated recovery loops and improve cache hits when streaming tool-call arguments are interrupted (e.g., Stop mid-tool-call).
Changes:
- Capture and carry a pre-recovery conversation scope (
record_response_scope) so response reasoning is stored under the scope future normalization uses for lookup. - Thread the preserved scope through regular and streaming response paths when recording/storing reasoning.
- Add a tool-name-only fallback cache key/lookup level to handle incomplete tool-call payloads.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/deepseek_cursor_proxy/transform.py | Capture pre-recovery scope and allow record_response_reasoning/rewrite_response_body to store under that scope. |
| src/deepseek_cursor_proxy/server.py | Pass preserved scope into regular/streaming response rewriting and ensure streaming stores reasoning on early exits. |
| src/deepseek_cursor_proxy/reasoning_store.py | Add tool-name cache keys + lookup fallback for partial tool-call matching. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| keys.extend( | ||
| f"scope:{scope}:tool_name:{tool_name}" | ||
| for tool_name in tool_call_names(message) | ||
| ) |
There was a problem hiding this comment.
tool_call_names() can return duplicate names when a single assistant message includes multiple tool calls to the same function. That will add duplicate scope:{scope}:tool_name:... keys, causing redundant SQLite writes (and inflating the returned key count). Consider de-duping tool names (or de-duping keys overall) before calling put() to avoid unnecessary DB churn.
| ) | |
| ) | |
| keys = list(dict.fromkeys(keys)) |
| for tool_name in tool_call_names(message): | ||
| reasoning = self.get(f"scope:{scope}:tool_name:{tool_name}") | ||
| if reasoning is not None: | ||
| return reasoning |
There was a problem hiding this comment.
The new tool-name-only fallback in lookup_for_message() introduces new matching behavior that can materially affect reasoning restoration when tool-call arguments/IDs are incomplete (e.g., Stop during streaming). Please add/adjust unit tests to cover (1) storing an assistant tool-call message and restoring its reasoning when only the function name matches, and (2) ensuring this fallback is only used when the higher-specificity keys miss.
| @@ -517,6 +520,7 @@ def prepare_upstream_request( | |||
| recovered_reasoning_messages=recovered_count, | |||
| recovery_dropped_messages=recovery_dropped_messages, | |||
| recovery_notice=recovery_notice, | |||
| record_response_scope=record_response_scope, | |||
There was a problem hiding this comment.
record_response_scope changes how response reasoning is keyed in the cache (capturing scope before recovery mutates the message list). This is a behavior change that can prevent recovery cascades, but it’s easy to regress—please add/adjust tests to assert that after a recovery pass, response reasoning is recorded under the pre-recovery conversation scope and is subsequently discoverable during normalization without triggering another recovery.
|
Thanks for the PR @Anudorannador , I'll take a look at it. Potentially, it would be helpful if we first include some mechanism to produce the full trace of the call interactions between Cursor, the proxy, and the DeepSeek API. |
…smatch-and-stop-tool-name-fallback # Conflicts: # src/deepseek_cursor_proxy/server.py # src/deepseek_cursor_proxy/transform.py
|
Closing this in favor of #28 for a more disciplined fix with tracing and testing. Feel free to create new issues or PRs if you still encouters similar issues. Thanks again @Anudorannador . |
… keying Brings current main (post PR yxlao#28) onto PR yxlao#25's branch and reconciles the two designs: - Keep @Anudorannador's tool_name cache fallback (the second of two fixes in PR yxlao#25). It addresses the rare case where the upstream stream is cut off (user presses Stop) before any tool_call.id chunk arrives, so neither tool_call_id nor tool_call_signature lookups can match the assistant message Cursor synthesises in the next request. - Drop PR yxlao#25's first fix (preserve pre-recovery scope on response recording): main's response_recording_contexts mechanism (PR yxlao#28) now covers it more generally and has been verified in tests. - Integrate tool_name into the helper-function structure introduced by PR yxlao#28: scoped_reasoning_keys, portable_reasoning_keys, and the reasoning_lookup_keys diagnostic list now all carry strict and portable tool_name kinds. The unconditional tool_name lookup in ReasoningStore.lookup_for_message becomes redundant once the helpers produce the keys, so it's removed. - Preserve PR yxlao#25's try/finally around the streaming read loop. With main's early-return-on-disconnect logic, partial reasoning would be dropped on client disconnect; the finally block stores it across all recording contexts before exit. Tests: 90 unit tests pass (88 from main + 2 new regression tests in StopMidStreamingToolCallTests covering tool_name lookup and per-turn scope isolation).
|
Hi @Anudorannador — thanks for this contribution! I reopened the PR and pushed a merge onto your branch (
Your commit |
Merges 6 upstream commits from yxlao/deepseek-cursor-proxy: - feat(streaming): add collapsible reasoning display (yxlao#32) - refactor(proxy): audit thinking-mode protocol and refactor test suite (yxlao#33) - fix(server): honor missing-reasoning reject mode (yxlao#34) - fix: prevent recovery cascade and improve Stop-scenario reasoning lookup (yxlao#25) - feat(config): default reasoning effort to max (yxlao#36) - refactor(logging): simplified prints and add spinner (yxlao#37) Combines both branches' changes: - Concurrent request deduplication (ours) + dual-scope recording (upstream) - Broad namespace cache keys (ours) + portable tool_name keys (upstream) - Spinner, build_arg_parser, main() function (upstream) - Metrics tracking, schema migration, graceful shutdown (ours) Co-authored-by: Cursor <cursoragent@cursor.com>
Two fixes for the reasoning_content cache:
Preserve the original conversation scope in PreparedRequest before recovery strips messages, so record_response_reasoning stores reasoning with the same scope that normalize_message will later use for lookup. This stops the scope mismatch cascade where recovery -> stripped scope -> cache miss -> recovery again.
Add a tool_name fallback in lookup_for_message as the fourth lookup level (after message_signature, tool_call_id, and tool_call_signature). When the user presses Stop during streaming tool-call arguments, only the function name may be complete; this fallback matches on function name alone, ignoring incomplete arguments.
Made-with: Cursor