ROX-33201: optimize applySingleNoLock to reduce allocations #19274
Conversation
User request: Reduce allocations in applySingleNoLock by applying pre-allocation and moving redundant assignments out of the loop.

Changes:
- Pre-allocate sets/maps with known capacity to avoid growth reallocations
- Move reverseEndpointMap assignment outside the loop (was N redundant writes per call)
- Add comprehensive benchmarks for performance tracking

Benchmark results (10 iterations, benchstat):
- ApplySingleNoLock/20Endpoints: -39.73% time (22.46µs → 13.54µs)
- ApplySingleNoLock/50Endpoints: -37.31% time (59.48µs → 37.29µs)
- ApplyRepeated (realistic usage): -40.45% time (2.489ms → 1.482ms)

Memory improvements:
- Medium workloads: -5.09% bytes/op (552.8Ki → 524.7Ki), -0.74% allocs/op
- Large workloads: -7.39% bytes/op (2.271Mi → 2.103Mi), -0.78% allocs/op

Note: Code partially generated with AI assistance.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove unconditional etiSet allocation on every loop iteration. Only allocate when actually needed based on map state.

Changes:
- Move etiSet allocation into conditional branches
- Retrieve etiSet from map after ensuring it exists
- Clearer logic flow with explicit cases

Performance improvement:
- Medium workloads: -5% memory (566KB → 537KB), -50 allocations
- Large workloads: -7.4% memory (2.38MB → 2.20MB), -300 allocations

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Pre-allocate historical maps with known capacity from current maps to avoid growth reallocations.

Changes:
- Pre-allocate historicalEndpoints[ep][deploymentID] with size from endpointMap
- Pre-allocate reverseHistoricalEndpoints[deploymentID] with size from reverseEndpointMap

This reduces allocations when moving endpoints to history during reset operations or deployment changes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Tomasz Janiszewski <tomek@redhat.com>
Hey - I've found 1 issue, and left some high level feedback:
- The benchmark helper loops like `for d := range numDeployments` / `for e := range numEndpoints` / `for t := range targetsPerEp` won't compile in Go; these should be standard counted loops (e.g. `for d := 0; d < numDeployments; d++ { ... }`).
- The benchmark bodies use `for b.Loop()`, which is not part of `testing.B`; replace this with the usual `for i := 0; i < b.N; i++ { ... }` (and consider using `b.ResetTimer()`/`b.ReportAllocs()` if you want cleaner measurements).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The benchmark helper loops like `for d := range numDeployments` / `for e := range numEndpoints` / `for t := range targetsPerEp` won't compile in Go; these should be standard counted loops (e.g. `for d := 0; d < numDeployments; d++ { ... }`).
- The benchmark bodies use `for b.Loop()` which is not part of `testing.B`; replace this with the usual `for i := 0; i < b.N; i++ { ... }` (and consider using `b.ResetTimer()`/`b.ReportAllocs()` if you want cleaner measurements).
## Individual Comments
### Comment 1
<location path="sensor/common/clusterentities/store_endpoints_bench_test.go" line_range="26-30" />
<code_context>
+ }
+
+ for _, bm := range benchmarks {
+ b.Run(bm.name, func(b *testing.B) {
+ // Create realistic test data
+ updates := generateTestUpdates(bm.numDeployments, bm.endpointsPerDep, bm.targetsPerEp)
+
+ for b.Loop() {
+ store := newEndpointsStoreWithMemory(100)
+ store.Apply(updates, bm.incremental)
</code_context>
<issue_to_address>
**issue (bug_risk):** Benchmarks use `b.Loop()` and `range` over ints, which will not compile and/or behave as intended
These benchmarks currently use `for b.Loop()` and `range` over `int` values (e.g. `for d := range numDeployments`), which is invalid Go and will prevent them from compiling/running.
Please switch to the standard benchmarking pattern and counting loops, e.g.:
```go
for i := 0; i < b.N; i++ {
store := newEndpointsStoreWithMemory(100)
store.Apply(updates, bm.incremental)
}
for d := 0; d < numDeployments; d++ { … }
for e := 0; e < numEndpoints; e++ { … }
for t := 0; t < targetsPerEp; t++ { … }
```
If you want to track allocation changes, also add `b.ReportAllocs()` to the relevant benchmarks.
</issue_to_address>
Images are ready for the commit at d7f678d. To use with deploy scripts, first
/retest
@janisz: The following test failed, say
Full PR test history. Your PR dashboard. Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
```go
reverseMap := e.reverseHistoricalEndpoints[deploymentID]
reverseMap[endpoint].recordTick()
```
We introduced a var `reverseMap` only to use it once. Is that helping performance in any way?
The change is good, but there are a few subtle logic changes in the code. I will need some more time to go over it, but I will post the details soon.
Edit: I added the comment about the changed logic. PTAL and let me know.
I want to tread carefully in this code, as it was the probable reason for several customer tickets in the past.
```go
dSet = make(set.Set[net.NumericEndpoint], len(data.endpoints))
e.reverseEndpointMap[deploymentID] = dSet
```
This is the logic change that I meant.
In the old code, if `data.endpoints` is empty, the loop body never executes and `reverseEndpointMap[deploymentID]` is never written. In the new code, an empty set is always stored.
This matters because `isDeleteOnly()` checks more than just endpoints:
```go
func (ed *EntityData) isDeleteOnly() bool {
	if ed == nil {
		return true
	}
	if len(ed.endpoints)+len(ed.containerIDs)+len(ed.ips) == 0 {
		return true
	}
	// ...
}
```

So `applySingleNoLock` can be called with `len(data.endpoints) == 0` when `containerIDs` or `ips` are non-empty. In that scenario:
- The new code creates an empty entry `reverseEndpointMap["depl1"] = {}` (empty set).
- On a subsequent call for the same deployment that now does have endpoints, `deploymentFound` will be `true`.
- This means the `else if !deploymentFound` branch (the endpoint takeover / `moveToHistory` logic) will be skipped.
In the old code, step 2 would have `deploymentFound = false`, and the takeover logic would correctly execute.
I estimate the impact as low in normal operation (deployments almost always arrive with endpoints), but it's a real behavioral difference that could cause missed history entries in edge cases.
What to do with that? There are a few options:
- I would love to see a unit test for this. I am not sure about the current quality and coverage of the unit tests for that (even though I most probably wrote or updated them myself).
- Unless we confirm this is safe, I would suggest restoring the previous behavior.
- We can change the code of `isDeleteOnly` (and maybe other functions that rely on this) to handle the cases produced by the new code correctly.
WDYT?
```go
updates := generateTestUpdates(bm.numDeployments, bm.endpointsPerDep, bm.targetsPerEp)

for b.Loop() {
	store := newEndpointsStoreWithMemory(100)
```
We usually set this to something between 5 and 20; 1 unit currently maps to one tick, which is 30s by default.
Description
Heatmap from the current situation:

User-facing documentation
Testing and quality
Automated testing
How I validated my change
CI + bench