Skip to content

fix(risk): improve reprocess deployment risk#16987

Closed
janisz wants to merge 1 commit intomasterfrom
optimize_reprocess_deployment_risk
Closed

fix(risk): improve reprocess deployment risk#16987
janisz wants to merge 1 commit intomasterfrom
optimize_reprocess_deployment_risk

Conversation

@janisz
Copy link
Contributor

@janisz janisz commented Sep 23, 2025

SearchRawProcessIndicators calls are not optimal in our use case where we need to loop over elements of the process indicators return set. Instead let's use a callback approach. This will reduce memory and GC pressure as we do not need to load all objects into a slice but iterate one by one.

go test -tags sql_integration -run=NO -bench=BenchmarkEvaluateBaselinesAndPersistResult  ./central/processbaseline/evaluator/. -count 6 -benchmem -benchtime 3s 
                                                                              │ master-2.txt │               new.txt               │
                                                                              │    sec/op    │    sec/op     vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                  1.781m ± 32%   1.720m ± 55%        ~ (p=0.240 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                  4.968m ± 28%   4.865m ± 43%        ~ (p=0.485 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                 9.076m ± 46%   8.471m ± 32%        ~ (p=0.180 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8      11.86m ± 12%   11.64m ±  9%        ~ (p=0.485 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                15.86m ± 29%   15.19m ± 29%        ~ (p=0.240 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8               81.46m ± 44%   74.28m ± 17%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8               228.4m ± 14%   182.5m ± 12%  -20.12% (p=0.026 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8              483.2m ± 17%   346.9m ± 33%  -28.21% (p=0.009 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8   633.3m ± 24%   542.8m ± 13%  -14.30% (p=0.004 n=6)
geomean                                                                         35.34m         31.66m        -10.41%

                                                                              │  master-2.txt  │                new.txt                │
                                                                              │ max_heap_bytes │ max_heap_bytes  vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                    11.23M ±  9%     11.23M ± 17%        ~ (p=0.818 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                    13.56M ± 18%     12.79M ± 15%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                   15.10M ± 13%     13.50M ± 28%        ~ (p=0.240 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8        20.17M ± 35%     18.61M ± 33%   -7.73% (p=0.041 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                  16.49M ±  7%     15.79M ±  2%   -4.28% (p=0.004 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8                 38.74M ±  9%     32.44M ±  8%  -16.27% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                 76.33M ±  5%     39.86M ± 18%  -47.78% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8               137.26M ±  3%     41.16M ± 19%  -70.02% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8    362.78M ±  6%     86.84M ± 35%  -76.06% (p=0.002 n=6)
geomean                                                                           36.42M           23.99M        -34.12%

                                                                              │   master-2.txt   │                 new.txt                 │
                                                                              │ max_heap_objects │ max_heap_objects  vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                      11.23M ±  9%       11.23M ± 17%        ~ (p=0.818 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                      13.56M ± 18%       12.79M ± 15%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                     15.10M ± 13%       13.50M ± 28%        ~ (p=0.240 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8          20.17M ± 35%       18.61M ± 33%   -7.73% (p=0.041 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                    16.49M ±  7%       15.79M ±  2%   -4.28% (p=0.004 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8                   38.74M ±  9%       32.44M ±  8%  -16.27% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                   76.33M ±  5%       39.86M ± 18%  -47.78% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8                 137.26M ±  3%       41.16M ± 19%  -70.02% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8      362.78M ±  6%       86.84M ± 35%  -76.06% (p=0.002 n=6)
geomean                                                                             36.42M             23.99M        -34.12%

                                                                              │ master-2.txt │               new.txt               │
                                                                              │     B/op     │     B/op      vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                  678.9Ki ± 0%   660.3Ki ± 0%   -2.74% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                  2.116Mi ± 0%   2.029Mi ± 0%   -4.11% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                 3.774Mi ± 0%   3.609Mi ± 0%   -4.39% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8      8.955Mi ± 0%   8.790Mi ± 0%   -1.85% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                7.132Mi ± 0%   6.673Mi ± 0%   -6.43% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8               34.66Mi ± 0%   31.69Mi ± 0%   -8.55% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8               87.49Mi ± 0%   77.93Mi ± 0%  -10.93% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8              175.7Mi ± 1%   155.0Mi ± 0%  -11.80% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8   463.0Mi ± 0%   442.3Mi ± 0%   -4.47% (p=0.002 n=6)
geomean                                                                         16.35Mi        15.33Mi        -6.20%

                                                                              │ master-2.txt │              new.txt              │
                                                                              │  allocs/op   │  allocs/op   vs base              │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                   15.59k ± 0%   15.45k ± 0%  -0.90% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                   57.99k ± 0%   57.81k ± 0%  -0.31% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                  108.7k ± 0%   108.5k ± 0%  -0.18% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8       108.7k ± 0%   108.5k ± 0%  -0.18% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                 208.7k ± 0%   208.4k ± 0%  -0.11% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8                1.015M ± 0%   1.015M ± 0%  -0.03% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                2.515M ± 0%   2.515M ± 0%  -0.02% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8               5.015M ± 0%   5.015M ± 0%  -0.01% (p=0.002 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8    5.015M ± 0%   5.015M ± 0%  -0.01% (p=0.002 n=6)
geomean                                                                          374.0k        373.2k       -0.20%

refs:

@rhacs-bot
Copy link
Contributor

rhacs-bot commented Sep 23, 2025

Images are ready for the commit at 2078c69.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.9.x-877-g2078c690f5.

@rhybrillou
Copy link
Contributor

Rebase to integrate #16983

@janisz janisz requested a review from dashrews78 September 23, 2025 15:31
@codecov
Copy link

codecov bot commented Sep 23, 2025

Codecov Report

❌ Patch coverage is 38.46154% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.80%. Comparing base (6253b79) to head (2078c69).
⚠️ Report is 72 commits behind head on master.

Files with missing lines Patch % Lines
...entral/processbaseline/evaluator/evaluator_impl.go 45.45% 5 Missing and 1 partial ⚠️
...ntral/processindicator/datastore/datastore_impl.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #16987   +/-   ##
=======================================
  Coverage   48.80%   48.80%           
=======================================
  Files        2696     2696           
  Lines      201446   201452    +6     
=======================================
+ Hits        98310    98316    +6     
- Misses      95378    95380    +2     
+ Partials     7758     7756    -2     
Flag Coverage Δ
go-unit-tests 48.80% <38.46%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@janisz janisz force-pushed the optimize_reprocess_deployment_risk branch from 31d6324 to b515b59 Compare September 23, 2025 17:01
@janisz
Copy link
Contributor Author

janisz commented Sep 23, 2025

Signed-off-by: Tomasz Janiszewski <tomek@redhat.com>
@janisz janisz force-pushed the optimize_reprocess_deployment_risk branch from b515b59 to 2078c69 Compare September 24, 2025 11:20
@janisz janisz added the auto-retest PRs with this label will be automatically retested if prow checks fails label Sep 24, 2025
@rhacs-bot
Copy link
Contributor

/retest

@red-hat-konflux
Copy link
Contributor

Caution

There are some errors in your PipelineRun template.

PipelineRun Error
central-db-on-push CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
main-on-push CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
operator-on-push CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
operator-bundle-on-push CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
retag-collector CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
retag-scanner-db-slim CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
retag-scanner-db CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
retag-scanner-slim CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
retag-scanner CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
roxctl-on-push CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
scanner-v4-on-push CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request
scanner-v4-db-on-push CEL expression evaluation error: expression "(\n event == \"push\" && target_branch.matches(\"^(master|release-.*|refs/tags/.*)$\")\n) || (\n event == \"pull_request\" && (\n target_branch.startsWith(\"release-\") ||\n source_branch.matches(\"(konflux|renovate|appstudio|rhtap)\") ||\n (has(body.pull_request.labels) && body.pull_request.labels.exists(l, l.name == \"konflux-build\"))\n ) && body.action != \"ready_for_review\"\n)\n" failed to evaluate: no such key: pull_request

@JoukoVirtanen
Copy link
Contributor

You can merge this PR without my approval. I would like to see a more detailed explanation of how this PR was tested. What was the load used for the test. Is one screen shot with these changes and the other without? How were the screen shots created?

Copy link
Contributor

@dashrews78 dashrews78 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking so we don't merge this yet. I think we should go with my stack as the my changes to the query will help both central and central-db memory. My changes should also hold up better when auto locking is enabled. Or we can discuss that. I'm close on getting the migration done.

#16965 is my target implementation and a pattern for other places where we get large objects when we only need a few items.

@JoukoVirtanen
Copy link
Contributor

Blocking so we don't merge this yet. I think we should go with my stack as the my changes to the query will help both central and central-db memory. My changes should also hold up better when auto locking is enabled. Or we can discuss that. I'm close on getting the migration done.

#16965 is my target implementation and a pattern for other places where we get large objects when we only need a few items.

Perhaps the implementation in this PR and @dashrews78 idea are not mutually incompatible. Though, it might not be easy to do both. GetByQueryFn would have to be modified to allow handling of just the field of interest, rather than the whole object, which is outside the scope of this PR. Perhaps this PR could be merged and then David's idea could be implemented later.

@JoukoVirtanen
Copy link
Contributor

Unit tests could be added to improve code coverage.

@dashrews78
Copy link
Contributor

Blocking so we don't merge this yet. I think we should go with my stack as the my changes to the query will help both central and central-db memory. My changes should also hold up better when auto locking is enabled. Or we can discuss that. I'm close on getting the migration done.
#16965 is my target implementation and a pattern for other places where we get large objects when we only need a few items.

Perhaps the implementation in this PR and @dashrews78 idea are not mutually incompatible. Though, it might not be easy to do both. GetByQueryFn would have to be modified to allow handling of just the field of interest, rather than the whole object, which is outside the scope of this PR. Perhaps this PR could be merged and then David's idea could be implemented later.

The select framework would have to be updated. We probably should do that but that is larger.

The benefit of my approach is that it helps the database too which we've been seeing have some issues with buffers in the larger environments. So the less we pull in the query the better. Once auto locking goes in, the approach in this PR won't save too much on the database side.

@janisz
Copy link
Contributor Author

janisz commented Sep 30, 2025

I run benchmark that was in the package:

go test -tags sql_integration -run=NO -bench=BenchmarkEvaluateBaselinesAndPersistResult  ./central/processbaseline/evaluator/. -count 6 -benchmem -benchtime 3s -timeout 1h > new.txt \
&& git co - &&  \
go test -tags sql_integration -run=NO -bench=BenchmarkEvaluateBaselinesAndPersistResult  ./central/processbaseline/evaluator/. -count 6 -benchmem -benchtime 3s -timeout 1h > master.txt
                                                                              │   master.txt   │               new.txt               │
                                                                              │     sec/op     │    sec/op     vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                   1.946m ±  50%   1.811m ± 61%        ~ (p=0.589 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                   9.482m ±  55%   9.293m ± 51%        ~ (p=0.818 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                  27.63m ±  34%   26.20m ± 34%        ~ (p=0.589 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8       51.85m ±  24%   46.50m ± 19%        ~ (p=0.394 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                 79.86m ±  20%   68.29m ± 19%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8                180.4m ±  40%   146.0m ± 33%        ~ (p=0.240 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                543.8m ±  45%   349.4m ± 33%  -35.75% (p=0.041 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8              1369.5m ±  33%   845.9m ± 34%  -38.23% (p=0.015 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8     2.458 ± 129%    1.762 ± 24%  -28.32% (p=0.009 n=6)
geomean                                                                          96.07m          77.97m        -18.85%

                                                                              │   master.txt   │               new.txt                │
                                                                              │      B/op      │     B/op       vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                   1.213Mi ± 75%   1.091Mi ± 73%        ~ (p=0.699 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                   6.971Mi ± 52%   6.667Mi ± 52%        ~ (p=0.699 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                  20.72Mi ± 35%   19.89Mi ± 35%        ~ (p=0.589 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8       41.34Mi ± 27%   36.69Mi ± 20%        ~ (p=0.394 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                 69.30Mi ± 21%   58.76Mi ± 19%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8                164.8Mi ± 39%   134.7Mi ± 34%        ~ (p=0.310 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                482.1Mi ± 39%   343.7Mi ± 34%        ~ (p=0.093 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8              1174.7Mi ± 31%   855.8Mi ± 35%        ~ (p=0.065 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8    1.953Gi ± 20%   1.810Gi ± 25%        ~ (p=0.699 n=6)
geomean                                                                          76.86Mi         65.73Mi        -14.49%

                                                                              │  master.txt   │               new.txt               │
                                                                              │   allocs/op   │  allocs/op    vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                   9.092k ± 73%   8.391k ± 71%        ~ (p=0.818 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                   51.11k ± 51%   50.40k ± 52%        ~ (p=0.699 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                  150.9k ± 35%   150.2k ± 35%        ~ (p=0.699 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8       262.9k ± 20%   244.7k ± 14%        ~ (p=0.394 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                 441.4k ± 24%   391.7k ± 21%        ~ (p=0.310 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8               1127.4k ± 40%   965.7k ± 36%        ~ (p=0.485 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                3.420M ± 38%   2.541M ± 34%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8               8.407M ± 31%   6.391M ± 36%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8    12.78M ± 14%   11.82M ± 18%        ~ (p=0.240 n=6)
geomean                                                                          531.2k         470.4k        -11.46%

I'm surprised by low p value and huge deviation. Is it possible this benchmark does not cleanup the data and each run reads more and more data?

So the less we pull in the query the better

I totally agree although this PR is working and provides some relief on allocs and GC. In the long run we should avoid getting slices from database unless it's necessary. In this case it's not.

@dashrews78
Copy link
Contributor

I run benchmark that was in the package:

go test -tags sql_integration -run=NO -bench=BenchmarkEvaluateBaselinesAndPersistResult  ./central/processbaseline/evaluator/. -count 6 -benchmem -benchtime 3s -timeout 1h > new.txt \
&& git co - &&  \
go test -tags sql_integration -run=NO -bench=BenchmarkEvaluateBaselinesAndPersistResult  ./central/processbaseline/evaluator/. -count 6 -benchmem -benchtime 3s -timeout 1h > master.txt
                                                                              │   master.txt   │               new.txt               │
                                                                              │     sec/op     │    sec/op     vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                   1.946m ±  50%   1.811m ± 61%        ~ (p=0.589 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                   9.482m ±  55%   9.293m ± 51%        ~ (p=0.818 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                  27.63m ±  34%   26.20m ± 34%        ~ (p=0.589 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8       51.85m ±  24%   46.50m ± 19%        ~ (p=0.394 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                 79.86m ±  20%   68.29m ± 19%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8                180.4m ±  40%   146.0m ± 33%        ~ (p=0.240 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                543.8m ±  45%   349.4m ± 33%  -35.75% (p=0.041 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8              1369.5m ±  33%   845.9m ± 34%  -38.23% (p=0.015 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8     2.458 ± 129%    1.762 ± 24%  -28.32% (p=0.009 n=6)
geomean                                                                          96.07m          77.97m        -18.85%

                                                                              │   master.txt   │               new.txt                │
                                                                              │      B/op      │     B/op       vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                   1.213Mi ± 75%   1.091Mi ± 73%        ~ (p=0.699 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                   6.971Mi ± 52%   6.667Mi ± 52%        ~ (p=0.699 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                  20.72Mi ± 35%   19.89Mi ± 35%        ~ (p=0.589 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8       41.34Mi ± 27%   36.69Mi ± 20%        ~ (p=0.394 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                 69.30Mi ± 21%   58.76Mi ± 19%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8                164.8Mi ± 39%   134.7Mi ± 34%        ~ (p=0.310 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                482.1Mi ± 39%   343.7Mi ± 34%        ~ (p=0.093 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8              1174.7Mi ± 31%   855.8Mi ± 35%        ~ (p=0.065 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8    1.953Gi ± 20%   1.810Gi ± 25%        ~ (p=0.699 n=6)
geomean                                                                          76.86Mi         65.73Mi        -14.49%

                                                                              │  master.txt   │               new.txt               │
                                                                              │   allocs/op   │  allocs/op    vs base               │
EvaluateBaselinesAndPersistResult/100_processes_2_containers-8                   9.092k ± 73%   8.391k ± 71%        ~ (p=0.818 n=6)
EvaluateBaselinesAndPersistResult/500_processes_3_containers-8                   51.11k ± 51%   50.40k ± 52%        ~ (p=0.699 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers-8                  150.9k ± 35%   150.2k ± 35%        ~ (p=0.699 n=6)
EvaluateBaselinesAndPersistResult/1000_processes_5_containers_large_args-8       262.9k ± 20%   244.7k ± 14%        ~ (p=0.394 n=6)
EvaluateBaselinesAndPersistResult/2000_processes_10_containers-8                 441.4k ± 24%   391.7k ± 21%        ~ (p=0.310 n=6)
EvaluateBaselinesAndPersistResult/10000_processes_20_containers-8               1127.4k ± 40%   965.7k ± 36%        ~ (p=0.485 n=6)
EvaluateBaselinesAndPersistResult/25000_processes_50_containers-8                3.420M ± 38%   2.541M ± 34%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers-8               8.407M ± 31%   6.391M ± 36%        ~ (p=0.132 n=6)
EvaluateBaselinesAndPersistResult/50000_processes_100_containers_large_args-8    12.78M ± 14%   11.82M ± 18%        ~ (p=0.240 n=6)
geomean                                                                          531.2k         470.4k        -11.46%

I'm surprised by low p value and huge deviation. Is it possible this benchmark does not cleanup the data and each run reads more and more data?

So the less we pull in the query the better

I totally agree although this PR is working and provides some relief on allocs and GC. In the long run we should avoid getting slices from database unless it's necessary. In this case it's not.

My competitor PR is ready for review and also working. Long running cluster is running now. That's why I blocked this one because they work some what differently and would have conflicts. I think broad strokes it is one or the other and then we iterate on top.

#16965

@janisz
Copy link
Contributor Author

janisz commented Sep 30, 2025

I slightly changed benchmark to collect max heap objects and bytes as this will differ the most and allocs will remain the same.

@janisz
Copy link
Contributor Author

janisz commented Oct 2, 2025

Closing as it's needs to be implemented from scratch as in postgres we have no callback version of that query.

@janisz janisz closed this Oct 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/central auto-retest PRs with this label will be automatically retested if prow checks fails

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants