Skip to content

fix: prevent recovery cascade and improve Stop-scenario reasoning lookup#25

Merged
yxlao merged 4 commits intoyxlao:mainfrom
Anudorannador:fix/recovery-scope-mismatch-and-stop-tool-name-fallback
May 1, 2026
Merged

fix: prevent recovery cascade and improve Stop-scenario reasoning lookup#25
yxlao merged 4 commits intoyxlao:mainfrom
Anudorannador:fix/recovery-scope-mismatch-and-stop-tool-name-fallback

Conversation

@Anudorannador
Copy link
Copy Markdown
Contributor

Two fixes for the reasoning_content cache:

  1. Preserve the original conversation scope in PreparedRequest before recovery strips messages, so record_response_reasoning stores reasoning with the same scope that normalize_message will later use for lookup. This stops the scope mismatch cascade where recovery -> stripped scope -> cache miss -> recovery again.

  2. Add a tool_name fallback in lookup_for_message as the fourth lookup level (after message_signature, tool_call_id, and tool_call_signature). When the user presses Stop during streaming tool-call arguments, only the function name may be complete; this fallback matches on function name alone, ignoring incomplete arguments.

Made-with: Cursor

Two fixes for the reasoning_content cache:

1. Preserve the original conversation scope in PreparedRequest before
   recovery strips messages, so record_response_reasoning stores reasoning
   with the same scope that normalize_message will later use for lookup.
   This stops the scope mismatch cascade where recovery -> stripped scope
   -> cache miss -> recovery again.

2. Add a tool_name fallback in lookup_for_message as the fourth lookup
   level (after message_signature, tool_call_id, and tool_call_signature).
   When the user presses Stop during streaming tool-call arguments,
   only the function name may be complete; this fallback matches on
   function name alone, ignoring incomplete arguments.

Made-with: Cursor
Copilot AI review requested due to automatic review settings April 29, 2026 06:29
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes reasoning-content cache behavior to avoid repeated recovery loops and improve cache hits when streaming tool-call arguments are interrupted (e.g., Stop mid-tool-call).

Changes:

  • Capture and carry a pre-recovery conversation scope (record_response_scope) so response reasoning is stored under the scope future normalization uses for lookup.
  • Thread the preserved scope through regular and streaming response paths when recording/storing reasoning.
  • Add a tool-name-only fallback cache key/lookup level to handle incomplete tool-call payloads.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/deepseek_cursor_proxy/transform.py Capture pre-recovery scope and allow record_response_reasoning/rewrite_response_body to store under that scope.
src/deepseek_cursor_proxy/server.py Pass preserved scope into regular/streaming response rewriting and ensure streaming stores reasoning on early exits.
src/deepseek_cursor_proxy/reasoning_store.py Add tool-name cache keys + lookup fallback for partial tool-call matching.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

keys.extend(
f"scope:{scope}:tool_name:{tool_name}"
for tool_name in tool_call_names(message)
)
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tool_call_names() can return duplicate names when a single assistant message includes multiple tool calls to the same function. That will add duplicate scope:{scope}:tool_name:... keys, causing redundant SQLite writes (and inflating the returned key count). Consider de-duping tool names (or de-duping keys overall) before calling put() to avoid unnecessary DB churn.

Suggested change
)
)
keys = list(dict.fromkeys(keys))

Copilot uses AI. Check for mistakes.
Comment on lines +194 to +213
for tool_name in tool_call_names(message):
reasoning = self.get(f"scope:{scope}:tool_name:{tool_name}")
if reasoning is not None:
return reasoning
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tool-name-only fallback in lookup_for_message() introduces new matching behavior that can materially affect reasoning restoration when tool-call arguments/IDs are incomplete (e.g., Stop during streaming). Please add/adjust unit tests to cover (1) storing an assistant tool-call message and restoring its reasoning when only the function name matches, and (2) ensuring this fallback is only used when the higher-specificity keys miss.

Copilot uses AI. Check for mistakes.
Comment thread src/deepseek_cursor_proxy/transform.py Outdated
Comment on lines +489 to +523
@@ -517,6 +520,7 @@ def prepare_upstream_request(
recovered_reasoning_messages=recovered_count,
recovery_dropped_messages=recovery_dropped_messages,
recovery_notice=recovery_notice,
record_response_scope=record_response_scope,
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

record_response_scope changes how response reasoning is keyed in the cache (capturing scope before recovery mutates the message list). This is a behavior change that can prevent recovery cascades, but it’s easy to regress—please add/adjust tests to assert that after a recovery pass, response reasoning is recorded under the pre-recovery conversation scope and is subsequently discoverable during normalization without triggering another recovery.

Copilot uses AI. Check for mistakes.
@yxlao
Copy link
Copy Markdown
Owner

yxlao commented Apr 29, 2026

Thanks for the PR @Anudorannador , I'll take a look at it.

Potentially, it would be helpful if we first include some mechanism to produce the full trace of the call interactions between Cursor, the proxy, and the DeepSeek API.

…smatch-and-stop-tool-name-fallback

# Conflicts:
#	src/deepseek_cursor_proxy/server.py
#	src/deepseek_cursor_proxy/transform.py
@yxlao
Copy link
Copy Markdown
Owner

yxlao commented Apr 29, 2026

Closing this in favor of #28 for a more disciplined fix with tracing and testing. Feel free to create new issues or PRs if you still encouters similar issues. Thanks again @Anudorannador .

@yxlao yxlao closed this Apr 29, 2026
… keying

Brings current main (post PR yxlao#28) onto PR yxlao#25's branch and reconciles
the two designs:

- Keep @Anudorannador's tool_name cache fallback (the second of two
  fixes in PR yxlao#25). It addresses the rare case where the upstream
  stream is cut off (user presses Stop) before any tool_call.id chunk
  arrives, so neither tool_call_id nor tool_call_signature lookups can
  match the assistant message Cursor synthesises in the next request.
- Drop PR yxlao#25's first fix (preserve pre-recovery scope on response
  recording): main's response_recording_contexts mechanism (PR yxlao#28)
  now covers it more generally and has been verified in tests.
- Integrate tool_name into the helper-function structure introduced by
  PR yxlao#28: scoped_reasoning_keys, portable_reasoning_keys, and the
  reasoning_lookup_keys diagnostic list now all carry strict and
  portable tool_name kinds. The unconditional tool_name lookup in
  ReasoningStore.lookup_for_message becomes redundant once the helpers
  produce the keys, so it's removed.
- Preserve PR yxlao#25's try/finally around the streaming read loop. With
  main's early-return-on-disconnect logic, partial reasoning would be
  dropped on client disconnect; the finally block stores it across all
  recording contexts before exit.

Tests: 90 unit tests pass (88 from main + 2 new regression tests in
StopMidStreamingToolCallTests covering tool_name lookup and per-turn
scope isolation).
@yxlao yxlao reopened this May 1, 2026
@yxlao
Copy link
Copy Markdown
Owner

yxlao commented May 1, 2026

Hi @Anudorannador — thanks for this contribution! I reopened the PR and pushed a merge onto your branch (aa522c9) that brings it up to date with main and reconciles it with #28, which landed in the meantime.

  • Kept your tool_name fallback. It's the only thing that rescues the Stop-mid-stream-before-tool_call.id case, and fix(cache): fix reasoning recovery during mode/model switch #28 doesn't cover it.
  • Dropped your first fix (pre-recovery scope on response recording). fix(cache): fix reasoning recovery during mode/model switch #28's record_response_contexts records under both pre- and post-recovery scopes for every key kind, generalizing it.
  • Integrated tool_name as a first-class lookup tier instead of an inline branch: scoped_reasoning_keys, portable_reasoning_keys, and reasoning_lookup_keys all carry strict + portable tool_name kinds. The portable variant is keyed by turn_context_signature so it can't collide across turns or modes.
  • Kept your try/finally around the streaming read loop, adapted to iterate over record_response_contexts so partial reasoning is stored under every scope on disconnect.
  • Added 2 regression tests for the Stop scenario and per-turn scope isolation. Full suite (90 tests) passes.

Your commit c1161bf is preserved with original authorship.

@yxlao yxlao merged commit 4eebf78 into yxlao:main May 1, 2026
6 checks passed
maThiaslI152 added a commit to maThiaslI152/deepseek-cursor-proxy that referenced this pull request May 6, 2026
Merges 6 upstream commits from yxlao/deepseek-cursor-proxy:
- feat(streaming): add collapsible reasoning display (yxlao#32)
- refactor(proxy): audit thinking-mode protocol and refactor test suite (yxlao#33)
- fix(server): honor missing-reasoning reject mode (yxlao#34)
- fix: prevent recovery cascade and improve Stop-scenario reasoning lookup (yxlao#25)
- feat(config): default reasoning effort to max (yxlao#36)
- refactor(logging): simplified prints and add spinner (yxlao#37)

Combines both branches' changes:
- Concurrent request deduplication (ours) + dual-scope recording (upstream)
- Broad namespace cache keys (ours) + portable tool_name keys (upstream)
- Spinner, build_arg_parser, main() function (upstream)
- Metrics tracking, schema migration, graceful shutdown (ours)

Co-authored-by: Cursor <cursoragent@cursor.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants