ROX-29771: Use images from quay with prefetcher to avoid flakes#17327

Merged
vikin91 merged 8 commits into master from piotr/ROX-29771-prefetcher
Oct 20, 2025

@vikin91 vikin91 commented Oct 16, 2025

Description

Extracted from: #17216

Note that using prefetched images in the existing tests is not trivial (it opens a Pandora's box) and will be added in a follow-up: #17354.

This PR helps avoid Docker Hub rate limiting. In particular (in the context of #17216), the images used for TestPods and TestContainerInstances were not pulled from quay.io, which caused those tests to fail.
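
As an illustration of the problem described above, here is a minimal sketch of how one might flag test images that would still be pulled from outside quay.io. It is not part of this PR. The function names and the `quay.io/example/nginx:1.18` reference are hypothetical, and the registry detection follows the usual Docker convention (a first path component containing "." or ":", or equal to "localhost", is treated as the registry host).

```python
def registry_of(image):
    """Return the registry host of an image reference, defaulting to docker.io."""
    first, sep, _rest = image.partition("/")
    # The first path component is a registry host only if it looks like a
    # hostname (contains "." or ":") or is exactly "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def images_outside_quay(images):
    """Return the images that would not be pulled from quay.io."""
    return [img for img in images if registry_of(img) != "quay.io"]


if __name__ == "__main__":
    # The quay.io reference below is a made-up example, not a real mirror path.
    candidates = ["nginx:1.18", "debian:latest", "quay.io/example/nginx:1.18"]
    print(images_outside_quay(candidates))
```

A check like this could run over the image list used by the tests to catch references that are exposed to Docker Hub rate limits.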

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

rhacs-bot commented Oct 16, 2025

Images are ready for the commit at f64ea98.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.10.x-59-gf64ea98002.

codecov bot commented Oct 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.83%. Comparing base (2c41166) to head (f64ea98).
⚠️ Report is 13 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #17327      +/-   ##
==========================================
- Coverage   48.84%   48.83%   -0.01%     
==========================================
  Files        2720     2718       -2     
  Lines      203032   202948      -84     
==========================================
- Hits        99166    99117      -49     
+ Misses      96079    96047      -32     
+ Partials     7787     7784       -3     
Flag Coverage Δ
go-unit-tests 48.83% <ø> (-0.01%) ⬇️

vikin91 commented Oct 16, 2025

I shouldn't have touched those tests :(

=== RUN   TestExcludedScopes
INFO: Thu Oct 16 14:57:47 UTC 2025: Refreshing the GKE auth token
Fetching cluster endpoint and auth data.
kubeconfig entry generated for rox-ci-nongroovy-test-197881895396259020.
main: 2025/10/16 14:57:58.444886 excluded_scopes_test.go:53: Info: Received alerts:
    excluded_scopes_test.go:54: 
        	Error Trace:	/go/src/github.com/stackrox/stackrox/tests/excluded_scopes_test.go:54
        	            				/go/src/github.com/stackrox/stackrox/tests/excluded_scopes_test.go:127
        	            				/go/src/github.com/stackrox/stackrox/tests/excluded_scopes_test.go:32
        	Error:      	Failed to have 1 alerts, instead received 0 alerts
        	Test:       	TestExcludedScopes
--- FAIL: TestExcludedScopes (102.30s)

Should be fixed in 1a08d69

sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!


This ensures we can debug tests that deploy resources to the default namespace
(like TestExcludedScopes) by collecting pod logs from that namespace in CI artifacts.

Partially AI-generated.

For the prefetcher branch, we use standard docker.io images (nginx:1.18, debian:latest)
rather than quay.io mirrors. The image pull policy ensures these can be fetched from
the Internet when not present locally.

Partially AI-generated.

vikin91 force-pushed the piotr/ROX-29771-prefetcher branch from 12c343f to 8abc9e2 on October 17, 2025 10:48

Since multi-container-pod.yaml now uses debian:latest (docker.io image),
we need to expect both 'Latest tag' and 'No CPU request or memory limit specified'
policies to be violated.

Partially AI-generated.

The prefetcher requires this file to know which images to prefetch.
This version lists docker.io images (nginx:1.18, debian:latest) that
are used by the Go e2e tests, matching the images in multi-container-pod.yaml.

Partially AI-generated.

vikin91 added the auto-retest label (PRs with this label will be automatically retested if prow checks fail) Oct 17, 2025

rhacs-bot commented

/test gke-nongroovy-e2e-tests

vikin91 added the ci-all-qa-tests label (tells CI to run all API tests, not just BAT) Oct 17, 2025

vikin91 commented Oct 17, 2025

/test ocp-4-12-nongroovy-e2e-tests ocp-4-18-nongroovy-e2e-tests ocp-4-19-nongroovy-e2e-tests gke-nongroovy-e2e-tests

vikin91 merged commit 2ceac5b into master Oct 20, 2025
111 checks passed
vikin91 deleted the piotr/ROX-29771-prefetcher branch October 20, 2025 07:00

)
subprocess.run(
[
"scripts/ci/lib.sh", "image_prefetcher_prebuilt_await"
Contributor

@vikin91 I'm not sure this is the best place to do this. Note the calls to this function in operator/tests/run.sh and qa-tests-backend/scripts/run-part-1.sh - they are intentionally done as late as possible in order to reduce idle waiting time. Adding this call here, right after image_prefetcher_prebuilt_start effectively makes the job wait for the prefetcher while it could be doing other things in the meantime.

Also, I don't think image_prefetcher_prebuilt_await is idempotent w.r.t. metrics collection, so I'm afraid this addition might duplicate the metrics on the jobs which run this elsewhere 🤔
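
The two concerns above (await as late as possible, and avoid awaiting twice) can be sketched as follows. This is only an illustration under stated assumptions: `PrefetcherAwaiter` and `await_once` are hypothetical names that do not exist in the repository, and `check=True` is an assumed argument. Only the command `scripts/ci/lib.sh image_prefetcher_prebuilt_await` comes from the diff under review.

```python
import subprocess


class PrefetcherAwaiter:
    """Hypothetical helper: runs the prefetcher await step at most once."""

    def __init__(self, runner=subprocess.run):
        # The runner is injectable so the sketch can be exercised without
        # invoking the real CI script.
        self._runner = runner
        self._awaited = False

    def await_once(self):
        """Run the await step on the first call only; return True if it ran."""
        if self._awaited:
            return False
        self._runner(
            ["scripts/ci/lib.sh", "image_prefetcher_prebuilt_await"],
            check=True,
        )
        self._awaited = True
        return True


if __name__ == "__main__":
    calls = []
    awaiter = PrefetcherAwaiter(runner=lambda cmd, check: calls.append(cmd))
    # ... other setup work would run here while images are being prefetched ...
    awaiter.await_once()  # first call runs the await step
    awaiter.await_once()  # second call is a no-op
    print(len(calls))  # 1
```

Note that a guard like this only covers repeated calls within one Python process. The calls in operator/tests/run.sh and qa-tests-backend/scripts/run-part-1.sh run in separate shell scripts, so avoiding duplicated metrics there would need a different mechanism.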

Contributor Author

Thanks for looking into this! I wasn't sure whether races could happen, so I added the await. The total time the prefetcher needs to download all images (yes, there are only a few currently) was low, so I didn't see that as an issue.
I am open to suggestions on (1) whether to call it, and (2) where to call it.

I will also look into the metrics issue.

Contributor

There are few in the nongroovy suite, but plenty in the groovy one, and I think this class is used for both? 🤔
I'll try to take a look for a better place this week.

Labels

  • ai-review
  • area/ci
  • auto-retest: PRs with this label will be automatically retested if prow checks fail
  • ci-all-qa-tests: Tells CI to run all API tests (not just BAT).

4 participants