Skip to content

Add more checks and add functional coverage for bifurcation_detector#28068

Merged
yuslepukhin merged 6 commits intomainfrom
yuslepukhin/bifurcation_use_after_free
Apr 15, 2026
Merged

Add more checks and add functional coverage for bifurcation_detector#28068
yuslepukhin merged 6 commits intomainfrom
yuslepukhin/bifurcation_use_after_free

Conversation

@yuslepukhin
Copy link
Copy Markdown
Member

This pull request strengthens input validation and significantly expands test coverage for the BifurcationDetector operator. The main logic change enforces that prev_suffix_match_idx must be non-negative, preventing invalid input. The test suite is enhanced with a wide range of scenarios, including edge cases, error handling, and attribute-based behaviors.

Input validation improvements:

  • Added a check to ensure prev_suffix_match_idx is non-negative in the BifurcationDetector kernel, improving robustness against invalid input values.

Test coverage enhancements:

  • Added comprehensive tests for various bifurcation and suffix matching scenarios, including:
    • Bifurcation at the first token, mid-sequence, and with non-zero prev_suffix_match_idx.
    • Suffix matching with unique and multiple n-gram occurrences, handling of n-gram size attributes, and output length edge cases.
    • Combined bifurcation and suffix match logic.
  • Added tests to verify correct handling of invalid or boundary inputs:
    • Negative and excessively negative prev_suffix_match_idx values are correctly rejected.
    • prev_suffix_match_idx exceeding the source token length is rejected.
    • Boundary case where prev_suffix_match_idx equals the source token length is handled.
  • Added tests for cases with no prediction tokens, ensuring output matches expectations.

@yuslepukhin yuslepukhin requested a review from Copilot April 14, 2026 21:35
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR tightens input validation for BifurcationDetector by rejecting negative prev_suffix_match_idx, and significantly increases functional test coverage across bifurcation and suffix-matching scenarios.

Changes:

  • Enforce prev_suffix_match_idx >= 0 in the CPU kernel.
  • Add broad functional tests for bifurcation detection and suffix matching (including attribute-driven paths).
  • Add negative/boundary input tests for prev_suffix_match_idx and “no pred tokens” scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
onnxruntime/contrib_ops/cpu/bert/bifurcation_detector.h Adds a non-negative validation check for prev_suffix_match_idx.
onnxruntime/test/contrib_ops/bifurcation_detector_op_test.cc Adds many new tests covering bifurcation positions, suffix-match cases, attributes, and invalid inputs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/test/contrib_ops/bifurcation_detector_op_test.cc Outdated
Comment thread onnxruntime/test/contrib_ops/bifurcation_detector_op_test.cc
Comment thread onnxruntime/test/contrib_ops/bifurcation_detector_op_test.cc
Comment thread onnxruntime/test/contrib_ops/bifurcation_detector_op_test.cc Outdated
Comment thread onnxruntime/test/contrib_ops/bifurcation_detector_op_test.cc
@yuslepukhin yuslepukhin requested a review from Copilot April 14, 2026 23:11
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/contrib_ops/cpu/bert/bifurcation_detector.h Outdated
Comment thread onnxruntime/test/contrib_ops/bifurcation_detector_op_test.cc Outdated
Comment thread onnxruntime/test/contrib_ops/bifurcation_detector_op_test.cc
@vraspar
Copy link
Copy Markdown
Contributor

vraspar commented Apr 15, 2026

Review generated with assistance from GitHub Copilot CLI.

Fix looks correct and complete for the targeted bug — the non-negative check on prev_suffix_match_idx closes the OOB read path before the buffer start. The three ORT_ENFORCE guards together guarantee safe loop bounds. Error messages with actual values are a nice improvement for debuggability.

Tests are solid — math verified, good coverage of bifurcation logic, suffix matching, boundary cases, and invalid inputs. A couple of minor coverage nits:

1. Missing "full match" test — The case where all predicted tokens match source tokens (pred_bifur_idx reaches the end of the loop without breaking) is the common speculative decoding success path and isn't covered. Would be good to have a test where no bifurcation occurs.

2. Missing pred_tokens length mismatch negative test — The pred_tokens_len == (src_tokens_len + 1 - prev_suffix_match_idx_data) invariant is enforced but not regression-tested with a deliberate mismatch. A test with AddShapeToTensorData(false) and a wrong-length pred_tokens asserting "pred_tokens length mismatch" would lock that down.

Neither is blocking — the security fix and core regression tests are good to go.

@yuslepukhin yuslepukhin enabled auto-merge (squash) April 15, 2026 21:06
tianleiwu
tianleiwu previously approved these changes Apr 15, 2026
Copy link
Copy Markdown
Contributor

@tianleiwu tianleiwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall: Solid defensive-hardening PR. The non-negative validation for prev_suffix_match_idx closes a real out-of-bounds read, and the test coverage (13 new tests) is thorough — covering happy paths, error rejection, boundary conditions, and attribute variations. The diagnostic messages with actual parameter values are a nice touch.

One suggestion below about ORT_ENFORCE vs ORT_RETURN_IF_NOT for input validation in Compute().

Comment thread onnxruntime/contrib_ops/cpu/bert/bifurcation_detector.h Outdated
@yuslepukhin yuslepukhin requested a review from tianleiwu April 15, 2026 21:23
@yuslepukhin
Copy link
Copy Markdown
Member Author

/azp run Windows ARM64 QNN CI Pipeline

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@yuslepukhin yuslepukhin merged commit 9e2364d into main Apr 15, 2026
105 of 108 checks passed
@yuslepukhin yuslepukhin deleted the yuslepukhin/bifurcation_use_after_free branch April 15, 2026 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants