Skip to content

fix: do not run GC every update cycle#12695

Merged
RTann merged 2 commits intomasterfrom
ross-scanner-v4-gc-fix
Sep 20, 2024
Merged

fix: do not run GC every update cycle#12695
RTann merged 2 commits intomasterfrom
ross-scanner-v4-gc-fix

Conversation

@RTann
Copy link
Contributor

@RTann RTann commented Sep 12, 2024

Description

When testing VEX support, I noticed Scanner V4 DB's CPU was periodically active. Looking at the Scanner V4 Matcher logs, I realized that the GC is run upon every successful update cycle, even if that means there was nothing updated (success in this case just means there were no errors).

This PR now limits the GC to only run upon update cycles that not only don't error out, but also actually update something.

It's not perfect, as it's still possible a Matcher replica that does nothing towards updating the vulnerabilities runs the GC[1], but that's ok, as it's not nearly as common as this scenario.

[1] Imagine there are two Matcher replicas and there are new vulnerability updates. It is possible (highly unlikely, though) that only one of the Matchers handles all the updates when reading the "multi-bundle". But, if this were to happen and somehow both Matchers were able to get the GC lock without any kind of contention, then both Matchers will run a GC cycle. This is highly unlikely, so it's ok to ignore. We should care more about the way more common case that there are no vulnerability updates, and we still run the GC for no real reason.

I also found we needlessly require a pgxpool when creating a vuln updater, so I remove that here, too

User-facing documentation

  • CHANGELOG update is not needed
  • documentation PR is not needed

Testing and quality

  • the change is production ready: the change is GA or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • modified existing tests

How I validated my change

Here is what I saw in my VEX PR testing:

Screenshot 2024-09-12 at 2 41 28 PM

Here is what I see now:

Screenshot 2024-09-16 at 3 22 54 PM

Note the scale of the graph. This shows activity every few hours, instead of every few minutes

@openshift-ci
Copy link

openshift-ci bot commented Sep 12, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@rhacs-bot
Copy link
Contributor

rhacs-bot commented Sep 12, 2024

Images are ready for the commit at 74f35f2.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.6.x-543-g74f35f2695.

@RTann RTann force-pushed the ross-scanner-v4-gc-fix branch from 2c22af7 to a7f756f Compare September 13, 2024 20:49
@codecov
Copy link

codecov bot commented Sep 13, 2024

Codecov Report

Attention: Patch coverage is 65.71429% with 12 lines in your changes missing coverage. Please review.

Project coverage is 48.12%. Comparing base (7b70fa2) to head (74f35f2).
Report is 10 commits behind head on master.

Files with missing lines Patch % Lines
scanner/matcher/updater/vuln/updater.go 65.71% 12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12695      +/-   ##
==========================================
+ Coverage   48.10%   48.12%   +0.02%     
==========================================
  Files        2438     2438              
  Lines      174799   174807       +8     
==========================================
+ Hits        84086    84126      +40     
+ Misses      83915    83885      -30     
+ Partials     6798     6796       -2     
Flag Coverage Δ
go-unit-tests 48.12% <65.71%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@RTann RTann force-pushed the ross-scanner-v4-gc-fix branch from a7f756f to 9cee02a Compare September 16, 2024 22:26
@RTann RTann marked this pull request as ready for review September 16, 2024 22:28
@RTann RTann requested a review from a team as a code owner September 16, 2024 22:28
Copy link
Contributor

@dcaravel dcaravel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am curious what the GC is doing when nothing needs cleanup, are those cycles perhaps trying to determine what to clean? Any debt there? (unrelated to this PR)

@RTann
Copy link
Contributor Author

RTann commented Sep 19, 2024

Am curious what the GC is doing when nothing needs cleanup, are those cycles perhaps trying to determine what to clean? Any debt there? (unrelated to this PR)

Yes. We just call Claircore's GC method, so not much we can do from here aside from not calling it so often. The docs say:

// GC is split into two phases, first it will identify any update operations
// which are older then the provided keep value and delete these.
//
// Next it will perform updater based deletions of any vulns from the vuln table
// which are not longer referenced by update operations.
//
// The GC is throttled to not overload the database with cascade deletes.
// If a full GC is required run this method until the returned int64 value
// is 0.

https://github.com/quay/claircore/blob/v1.5.31/datastore/postgres/gc.go#L56

@RTann RTann requested a review from dcaravel September 20, 2024 00:27
@RTann RTann force-pushed the ross-scanner-v4-gc-fix branch from 9cee02a to 74f35f2 Compare September 20, 2024 00:30
@RTann RTann merged commit f2075d7 into master Sep 20, 2024
@RTann RTann deleted the ross-scanner-v4-gc-fix branch September 20, 2024 17:29
shireenf-ibm pushed a commit to shireenf-ibm/stackrox that referenced this pull request Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants