ROX-19064: Scanner V4 CI Wait for Vulns to Load #19836
Skipping CI for Draft Pull Request.
3c47842 to 3d0cf01
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##           master   #19836   +/-   ##
=======================================
  Coverage   49.60%   49.60%
=======================================
  Files        2763     2763
  Lines      208339   208339
=======================================
  Hits       103341   103341
  Misses      97331    97331
  Partials     7667     7667
🚀 Build Images Ready
Images are ready for commit 3d0cf01. To use with deploy scripts: export MAIN_IMAGE_TAG=4.11.x-561-g3d0cf018f5
🚀 Build Images Ready
Images are ready for commit 3c47842. To use with deploy scripts: export MAIN_IMAGE_TAG=4.11.x-561-g3c47842981
/test all
# (i.e. database connectivity). Call this separately in jobs that verify scan
# results, after deploy_stackrox has returned.
wait_for_scanner_v4_vuln_load() {
    local max_seconds="${SCANNER_V4_VULN_LOAD_TIMEOUT:-2400}"
Is this a 40-minute wait in CI? Can we add a tag for scanner v4 tests and run everything else instead of waiting?
Alternatively, load a smaller vulnerability list so it is ready in a second. We have a list of images in the prefetcher config, so let's load only what's really needed.
Please see #19835 - Scanner V4 isn't enabled yet - working on reducing this as much as possible while managing scope.
> Can we add a tag for scanner v4 tests and run everything else instead of waiting?
Possibly. Many tests (UI, compliance, deployment, policy, etc.) rely on the ability to scan images, so this may add complexity without buying much. In my testing, with the other optimizations in review, overall CI runs completed in a similar timeframe to today's. I will continue to look for optimizations. FWIW, CI today already waits for Scanner V2 vulns to load (via pod readiness), so this isn't a new concept.
Description
Adds the rails for CI jobs to wait for vuln loads to finish before starting tests.
This PR polls the Central API to determine if vulns are loaded (same API that is used by System Health).
Another option considered was to use the 'readiness' setting in Scanner V4 matcher so that the pod does not become ready until vulns are loaded. The polling approach was favored for two reasons. First, it does not require CI changes for each install type (manifest, helm, operator, etc.). Second, with the readiness approach the cause of timeouts would be less obvious when jobs fail, whereas with polling the failure reason is directly in the build logs (amongst other things).
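The polling approach can be illustrated with a minimal sketch. This is not the code from this PR: `poll_until`, its argument order, the 30-second interval, and the `check_vulns_loaded` check named in the usage comment are all hypothetical. Only the `SCANNER_V4_VULN_LOAD_TIMEOUT` variable and its 2400-second default come from the diff above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from this PR): retry a check command until it
# succeeds or a deadline passes, printing the outcome so it lands in the
# build log.
# Usage: poll_until <max_seconds> <interval_seconds> <command> [args...]
poll_until() {
    local max_seconds="$1" interval="$2"
    shift 2
    local start
    start="$(date +%s)"
    while true; do
        # The check runs in the current shell; exit status 0 means "loaded".
        if "$@"; then
            echo "check succeeded after $(( $(date +%s) - start ))s"
            return 0
        fi
        # Give up once the deadline has passed and say why on stderr.
        if [ $(( $(date +%s) - start )) -ge "${max_seconds}" ]; then
            echo "ERROR: check did not succeed within ${max_seconds}s: $*" >&2
            return 1
        fi
        sleep "${interval}"
    done
}

# Example call; check_vulns_loaded is a placeholder for whatever queries
# the Central API:
#   poll_until "${SCANNER_V4_VULN_LOAD_TIMEOUT:-2400}" 30 check_vulns_loaded
```

Because the helper returns non-zero with an explicit message, a timed-out job would fail at the wait step itself rather than later in an unrelated test.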
Before polling starts, the cluster's available storage classes are listed to help troubleshoot slow loads (for example, to verify whether the DB PVC is using an SSD). In addition, each poll dumps the current 'top pods' CPU/memory consumption to assist in troubleshooting and measuring as needed.
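The diagnostics described here could look roughly like the following sketch. The function names and the `stackrox` default namespace are assumptions for illustration, not taken from this PR. `kubectl get storageclass` and `kubectl top pods` are standard kubectl commands; the latter needs a metrics API to be available in the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostics helpers (names and namespace are assumptions).

# List the cluster's storage classes once, before polling starts, so slow
# loads can be correlated with the storage backing the DB PVC.
list_storage_classes() {
    echo "--- storage classes ---"
    kubectl get storageclass || echo "WARN: could not list storage classes" >&2
}

# Dump per-pod CPU/memory usage for a namespace; intended to be called on
# every poll iteration. A failure is only warned about so that diagnostics
# never fail the wait itself.
dump_pod_usage() {
    local namespace="${1:-stackrox}"
    echo "--- pod usage in ${namespace} ---"
    kubectl top pods -n "${namespace}" \
        || echo "WARN: kubectl top failed (metrics API unavailable?)" >&2
}
```

Both helpers swallow kubectl errors on purpose: they exist to add context to the build log, so a missing metrics API should not turn into a job failure.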
User-facing documentation
Testing and quality
Automated testing
The changes themselves are tests
How I validated my change
Against StackRox Scanner these changes will be tested by CI as part of this PR
Against Scanner V4 these changes were validated in #19236 and will be validated again in a future PR when Scanner V4 is officially turned on in CI.