ROX-33758: Fix quadratic runtime in endpointsStore.addToHistory#19566
ROX-33758: Fix quadratic runtime in endpointsStore.addToHistory#19566
endpointsStore.addToHistory#19566Conversation
|
This change is part of the following stack: Change managed by git-spice. |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
benchmarkSeedEndpointsStore,for i := range numEndpointsis invalid for anintand won't compile; switch to a standard counter loop (e.g.,for i := 0; i < numEndpoints; i++ { ... }) to correctly populate the benchmark data.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `benchmarkSeedEndpointsStore`, `for i := range numEndpoints` is invalid for an `int` and won't compile; switch to a standard counter loop (e.g., `for i := 0; i < numEndpoints; i++ { ... }`) to correctly populate the benchmark data.
## Individual Comments
### Comment 1
<location path="sensor/common/clusterentities/store_endpoints_benchmark_test.go" line_range="17" />
<code_context>
+ epSet := set.NewSet[net.NumericEndpoint]()
+
+ var firstEndpoint net.NumericEndpoint
+ for i := range numEndpoints {
+ ep := buildEndpoint(fmt.Sprintf("10.%d.%d.%d", (i/65536)%256, (i/256)%256, i%256), 80)
+ if i == 0 {
</code_context>
<issue_to_address>
**issue (bug_risk):** The `for i := range numEndpoints` loop is invalid and will prevent the benchmark from compiling
`numEndpoints` is an `int`, so `for i := range numEndpoints {` will not compile and will break `go test ./...`. Use a counted loop instead, e.g. `for i := 0; i < numEndpoints; i++ { ... }`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
de03fd3 to
8eb0a35
Compare
|
Sorry for the review request spam - I was changing base branch automatically and I didn't mean to ping you all. |
|
Images are ready for the commit at 8eb0a35. To use with deploy scripts, first |
|
/retest |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #19566 +/- ##
=======================================
Coverage 49.26% 49.27%
=======================================
Files 2727 2727
Lines 205821 205819 -2
=======================================
+ Hits 101396 101408 +12
+ Misses 96889 96879 -10
+ Partials 7536 7532 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Description
Fix quadratic runtime in
endpointsStore.addToHistorythat caused elevated mutex hold times duringApply()in large clusters.Problem: The legacy
addToHistoryiterates the entirereverseEndpointMap[deploymentID](all M endpoints of a deployment) to rebuild the reverse-history map on every call. DuringpurgeNoLock, this is invoked once per endpoint, resulting in O(M^2) total work. In clusters with many nodes and NodePort/LoadBalancer expansions, M grows large enough to cause noticeable event-processing latency.Fix: Replace the full deployment-endpoint scan with a targeted O(T) insert that records only the specific endpoint being moved to history. During a full purge of M endpoints, total work drops from O(sum(T_i) + M^2) to O(sum(T_i) + M).
AI note: Implementation was partially AI-assisted; all code was reviewed, corrected, and benchmarked by the author.
User-facing documentation
Testing and quality
Automated testing
How I validated my change
Benchmark results (Apple M3 Pro) show constant-time behavior for the fast path regardless of endpoint count, vs linear growth for legacy:
Allocations drop proportionally (12 allocs/op vs 1k-5k allocs/op).