chore(be): [WIP] Fast path test rigs #19427

clickboo wants to merge 1 commit into boo-adm-ctrl-cache-skip-metric from
Conversation
This change is part of the following stack; change managed by git-spice.

Skipping CI for Draft Pull Request.
Hey - I've found 2 issues, and left some high level feedback:

- The benchmark uses a hard-coded 100ms image-service latency with up to 1000 deployments per iteration, which can make some sub-benchmarks take tens of seconds or more; consider reducing the latency or tightening the max deployment counts so `go test -bench` remains reasonably fast to run locally and in CI.
- In `burst-test.sh`, `extract_counter` uses `grep "^${metric}"`, which will also match any metric that shares the same prefix (e.g., `rox_admission_control_image_fetch_total` and `rox_admission_control_image_fetch_total_extra`); tightening the pattern to match the metric name plus either `{` or a space (e.g., `^${metric}[{ ]`) would make the aggregation more robust.
- The README describes the default `IMAGES` as `nginx:1.25,redis:7.2,python:3.12`, but the script currently defaults to CentOS/Fedora images; aligning the documented defaults with the actual `IMAGES` default in `burst-test.sh` would avoid confusion when running the tool.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The benchmark uses a hard-coded 100ms image-service latency with up to 1000 deployments per iteration, which can make some sub-benchmarks take tens of seconds or more; consider reducing the latency or tightening the max deployment counts so `go test -bench` remains reasonably fast to run locally and in CI.
- In `burst-test.sh`, `extract_counter` uses `grep "^${metric}"` which will also match any metric that shares the same prefix (e.g., `rox_admission_control_image_fetch_total` and `rox_admission_control_image_fetch_total_extra`); tightening the pattern to match the metric name plus either `{` or a space (e.g., `^${metric}[{ ]`) would make the aggregation more robust.
- The README describes the default `IMAGES` as `nginx:1.25,redis:7.2,python:3.12`, but the script currently defaults to CentOS/Fedora images; aligning the documented defaults with the actual `IMAGES` default in `burst-test.sh` would avoid confusion when running the tool.
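The prefix-matching pitfall from the `extract_counter` comment can be demonstrated on a small sample of Prometheus exposition text. This is an illustrative sketch, not the actual script: the `_extra` metric comes from the example above, and the temp-file path and values are made up.

```shell
# Two counters sharing a name prefix; values are illustrative.
cat > /tmp/metrics.txt <<'EOF'
rox_admission_control_image_fetch_total{status="ok"} 5
rox_admission_control_image_fetch_total_extra{status="ok"} 7
EOF

metric="rox_admission_control_image_fetch_total"

# Loose pattern: matches both metrics, so the aggregated sum is inflated to 12.
loose=$(grep "^${metric}" /tmp/metrics.txt | awk '{s += $NF} END {print s}')

# Tightened pattern: the metric name must be followed by '{' or a space,
# so only the intended metric contributes (sum = 5).
tight=$(grep -E "^${metric}[{ ]" /tmp/metrics.txt | awk '{s += $NF} END {print s}')

echo "loose=${loose} tight=${tight}"   # loose=12 tight=5
```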
## Individual Comments
### Comment 1
<location path="sensor/admission-control/manager/fast_path_detection_bench_test.go" line_range="387" />
<code_context>
+
+ reqs := makeAdmissionRequests(depCount, vPct, benchImages)
+
+ for b.Loop() {
+ imgClient.Reset()
+ mgr.imageCache.Purge()
</code_context>
<issue_to_address>
**suggestion (testing):** Benchmark depends on real time and a fixed 100ms RPC latency, which may make it brittle or slow
In the loop you’re using a `benchImageServiceClient` with `latency: 100 * time.Millisecond` and `violatingImage` calls `time.Now()`. For larger `depCount` and high `b.N` this can make the benchmark unnecessarily slow and sensitive to environment. Consider reducing the latency or making it configurable, and/or adding a pure-CPU variant with a zero-latency client so the detection logic can be measured independently of I/O delay.
Suggested implementation:
```golang
// benchImageClientLatency controls the artificial latency injected into the
// benchImageServiceClient. It is intentionally small by default so the
// benchmark is not dominated by I/O delay and is less sensitive to the
// environment. Individual benchmarks can override this for specific
// scenarios (e.g. pure-CPU vs I/O-heavy).
if benchImageClientLatency == 0 {
	benchImageClientLatency = 10 * time.Millisecond
}
imgClient := &benchImageServiceClient{
	benchImages:  benchImages,
	violatingRef: violatingImage,
	latency:      benchImageClientLatency,
}
```
```golang
var benchImageClientLatency time.Duration
func BenchmarkFastPathDetection(b *testing.B) {
```
```golang
	mgr := NewManager("bench-ns", 20*size.MB, imgClient, &benchDeploymentServiceClient{})
	fp := fastPathPolicies(mix.fast)
	sp := slowPathPolicies(mix.slow)
	validatePoliciesCompile(b, fp, "fast-path")
	validatePoliciesCompile(b, sp, "slow-path")
	mgr.ProcessNewSettings(benchSettings(fp, sp))

	// Precompute requests once; the benchmark should focus on the
	// detection logic rather than setup cost.
	reqs := makeAdmissionRequests(depCount, vPct, benchImages)

	// Exclude setup (manager, policies, request generation) from the
	// benchmark timing so we primarily measure the validation path.
	b.ResetTimer()
	for b.Loop() {
		imgClient.Reset()
		mgr.imageCache.Purge()
		totalDenials := 0
		for _, req := range reqs {
			resp, err := mgr.HandleValidate(req)
			if err != nil {
				b.Fatalf("HandleValidate error: %v", err)
			}
			if resp != nil && !resp.Allowed {
				totalDenials++
			}
```golang
}
// BenchmarkFastPathDetectionCPUOnly runs the same detection benchmark but with a
// zero-latency image client so that the measurement is dominated by CPU work
// rather than the artificial RPC delay.
func BenchmarkFastPathDetectionCPUOnly(b *testing.B) {
// Save and restore the previous latency so this benchmark does not affect
// others.
prev := benchImageClientLatency
benchImageClientLatency = 0
defer func() { benchImageClientLatency = prev }()
BenchmarkFastPathDetection(b)
}
```
The edits above assume the following existing structure in `fast_path_detection_bench_test.go`:
1. An `imgClient` variable is initialized using a composite literal for `benchImageServiceClient` that includes a `latency: 100 * time.Millisecond` field.
2. A `func BenchmarkFastPathDetection(b *testing.B)` benchmark already exists.
3. The file already imports `"time"`.
If these assumptions differ from your actual code, you will need to:
1. Adjust the first `SEARCH`/`REPLACE` block so it matches the real `benchImageServiceClient` construction site and replace the hard-coded `100 * time.Millisecond` with `benchImageClientLatency`.
2. If `time` is not yet imported in this file, add it to the import block.
3. If `BenchmarkFastPathDetection` has a different name, update the `BenchmarkFastPathDetectionCPUOnly` implementation to call the correct function, or factor the common benchmark body into a helper (e.g. `runFastPathDetectionBenchmark(b *testing.B)`) and have both benchmarks call that helper.
</issue_to_address>
### Comment 2
<location path="tools/admission-control/README.md" line_range="5" />
<code_context>
+
+### `burst-test.sh`
+
+Simulates a burst of deployment creates (`--dry-run=server`) against a live
+cluster, scrapes admission controller Prometheus metrics before/after, and
+reports deltas with a correctness check.
</code_context>
<issue_to_address>
**suggestion (typo):** Consider rephrasing "deployment creates" for grammatical clarity.
You could use a phrase like "deployment creation requests" or "Deployment create requests" to better reflect that these are simulated create operations on Deployments while matching Kubernetes terminology.
```suggestion
Simulates a burst of deployment creation requests (`--dry-run=server`) against a live
```
</issue_to_address>
Images are ready for the commit at b5bcd9c.
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##        boo-adm-ctrl-cache-skip-metric   #19427   +/- ##
===============================================================
  Coverage      49.69%    49.69%
===============================================================
  Files           2702      2702
  Lines         203573    203573
===============================================================
+ Hits          101159    101165    +6
+ Misses         94887     94883    -4
+ Partials        7527      7525    -2

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
No description provided.