enhance: shard cgo active future manager with 32-core groups#49047
CLiqing wants to merge 1 commit into milvus-io:master from
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: CLiqing. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files; approvers can indicate their approval by writing
@CLiqing Thanks for your contribution. Please submit with DCO; see the contributing guide https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-dco.
Invalid PR Title Format Detected Your PR submission does not adhere to our required standards. To ensure clarity and consistency, please meet the following criteria:
Please review and update your PR title to comply with these guidelines.
[ci-v2-notice] To rerun ci-v2 checks, comment with:
If you have any questions or requests, please contact @zhikunyao. |
Use a 256-slot register buffer and route future registration across `cpuNum/32` managers (minimum 1 shard) to reduce single-manager contention in the cancel/select loop. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: CLiqing <2208529306@qq.com>
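The shard-count formula and round-robin routing described above can be sketched as follows. This is a minimal illustration of the technique, not the actual Milvus code; `shardCount` and `router` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// shardCount mirrors the PR's cpuNum/32 grouping with a floor of one shard.
func shardCount(cpuNum int) int {
	n := cpuNum / 32
	if n < 1 {
		n = 1
	}
	return n
}

// router picks a shard via an atomic round-robin counter, as the PR
// description outlines.
type router struct {
	seq    atomic.Uint64
	shards int
}

func (r *router) next() int {
	// Add(1) returns the post-increment value, so the sequence starts at 1
	// and cycles uniformly over all shards.
	return int(r.seq.Add(1) % uint64(r.shards))
}

func main() {
	r := &router{shards: shardCount(128)} // 4 shards on a 128-core box
	counts := make([]int, r.shards)
	for i := 0; i < 4000; i++ {
		counts[r.next()]++
	}
	fmt.Println(counts) // [1000 1000 1000 1000]: uniform distribution
}
```

With this routing, each shard's select loop and register channel sees roughly 1/N of the futures, which is the contention reduction the PR targets.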
Force-pushed 03608b3 to e30d7ec
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##           master   #49047    +/-  ##
=========================================
- Coverage   78.02%   77.90%   -0.12%
=========================================
  Files        2167     2167
  Lines      356124   356265    +141
=========================================
- Hits       277868   277560    -308
- Misses      69753    70148    +395
- Partials     8503     8557     +54
```
foxspy
left a comment
Review: Shard CGO Active Future Manager
The sharding approach is sound — distributing futures across N managers (CPU/32, min 1) with atomic round-robin eliminates the single-goroutine bottleneck on the select loop and register channel. The mechanical changes are correct.
However, there is a critical bug in Prometheus metric reporting that must be fixed before merge.
🔴 Critical: ActiveFutureTotal gauge clobbered by multiple shards
File: internal/util/cgo/manager_active.go, lines 104-106 (unchanged code, but broken by this PR)

```go
metrics.ActiveFutureTotal.WithLabelValues(m.nodeID).Set(float64(activeTotal))
```

`ActiveFutureTotal` is a GaugeVec keyed only by `nodeID`. All shards share the same `nodeID`, so each shard calls `.Set()` on the exact same gauge series with its own local `activeTotal`. The last shard to call `doSelect()` wins — the gauge shows only that shard's count, not the sum.
Impact: On a 128-core machine (4 shards), the metric could report ~25% of the true active future count. This breaks production monitoring and any alerting built on this metric.
Suggested fix (pick one):
- Option A (simplest): switch from `.Set()` to `.Add(1)` on register and `.Sub(1)` on completion — this naturally aggregates across shards.
- Option B: add a shard-index label so each shard reports independently and Grafana queries use `sum()`.
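The clobbering and the Option A fix can be demonstrated with a plain atomic counter standing in for the shared Prometheus series (the `gauge` type and per-shard counts below are illustrative, not the real metrics code):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// gauge stands in for the single series behind
// ActiveFutureTotal.WithLabelValues(nodeID) that all shards share.
type gauge struct{ v atomic.Int64 }

func (g *gauge) Set(x int64) { g.v.Store(x) }
func (g *gauge) Add(x int64) { g.v.Add(x) }

func main() {
	perShard := []int64{30, 25, 20, 25} // local activeTotal of 4 shards

	// Broken pattern: each shard Set()s its own count on the shared
	// series; the last writer wins, so the gauge under-reports.
	var clobbered gauge
	for _, n := range perShard {
		clobbered.Set(n)
	}
	fmt.Println(clobbered.v.Load()) // 25, not the true total of 100

	// Option A: Add on register (and Sub on completion) aggregates
	// naturally across shards, so the shared series holds the sum.
	var summed gauge
	for _, n := range perShard {
		summed.Add(n) // n futures registered by this shard
	}
	fmt.Println(summed.v.Load()) // 100
}
```

Note that Option A changes the metric from a periodically reconciled snapshot to an incrementally maintained counter, so every register path must be matched by exactly one completion path.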
🟡 Medium: defaultRegisterBuf 256x increase lacks rationale
1 → 256 is a large jump. A buffer of 256 per shard means up to 1024 futures can queue (on a 4-shard system) before any caller blocks. This is probably fine but could mask overload. A brief comment explaining the sizing rationale would help future readers.
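The queuing behavior of a per-shard buffer can be shown with a plain buffered channel sized like the new `defaultRegisterBuf` (the `future` type and channel name here are stand-ins, not the PR's actual types):

```go
package main

import "fmt"

// future is a stand-in for the registered cgo future.
type future struct{}

func main() {
	// One shard's register channel with the PR's 256-slot buffer.
	register := make(chan future, 256)

	// 256 registrations queue without the select loop consuming any...
	for i := 0; i < 256; i++ {
		register <- future{}
	}
	fmt.Println(len(register)) // 256: buffer full

	// ...and the 257th caller would block until the loop drains one.
	select {
	case register <- future{}:
		fmt.Println("unreachable: buffer had room")
	default:
		fmt.Println("caller would block")
	}
}
```

This is why the reviewer notes that a multi-shard system can absorb up to shards × 256 queued futures before any caller blocks, which smooths bursts but can also delay backpressure under sustained overload.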
🟢 Verified — no issues found
- No stray references to the old `futureManager` singleton — grep confirms all references are in the three modified files. ✓
- `hardware.GetCPUNum()` uses `GOMAXPROCS(0)`, which respects cgroup limits after `automaxprocs` init. Correct for k8s. ✓
- Round-robin distribution via `registerSeq.Inc() % managerCount` is correct. `Inc()` returns the post-increment value (starts at 1), so shard 0 gets its first future when the counter wraps to a multiple of N — uniform distribution over time. ✓
- Test aggregation correctly iterates all shards and sums `ActiveCount`. ✓
- No shutdown/cleanup path existed before this PR and none is needed now (pre-existing design — managers run for process lifetime). ✓
- `maxSelectCase` is now per-shard — each shard sees fewer futures, which is strictly better for the select loop. ✓