ROX-29771: Use images from quay with prefetcher to avoid flakes#17327
Images are ready for the commit at f64ea98. To use with deploy scripts, first
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #17327 +/- ##
==========================================
- Coverage 48.84% 48.83% -0.01%
==========================================
Files 2720 2718 -2
Lines 203032 202948 -84
==========================================
- Hits 99166 99117 -49
+ Misses 96079 96047 -32
+ Partials 7787 7784 -3
I shouldn't have touched those tests :( Should be fixed in 1a08d69
This ensures we can debug tests that deploy resources to the default namespace (like TestExcludedScopes) by collecting pod logs from that namespace in CI artifacts. Partially AI-generated.
For the prefetcher branch, we use standard docker.io images (nginx:1.18, debian:latest) rather than quay.io mirrors. The image pull policy ensures these can be fetched from the Internet when not present locally. Partially AI-generated.
12c343f to 8abc9e2
Since multi-container-pod.yaml now uses debian:latest (docker.io image), we need to expect both 'Latest tag' and 'No CPU request or memory limit specified' policies to be violated. Partially AI-generated.
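The expectation in the commit message above can be sketched as a set comparison. This is only an illustration in Python: the real checks live in the Go e2e tests, and the names `EXPECTED_VIOLATIONS` and `missing_violations` are hypothetical, not taken from the repository. Only the two policy names come from the commit message.

```python
from typing import Iterable, List

# Policy names quoted from the commit message; everything else is illustrative.
EXPECTED_VIOLATIONS = frozenset({
    "Latest tag",
    "No CPU request or memory limit specified",
})


def missing_violations(
    observed: Iterable[str],
    expected: Iterable[str] = EXPECTED_VIOLATIONS,
) -> List[str]:
    """Return the expected policy names that were not observed, sorted."""
    return sorted(set(expected) - set(observed))
```

A test would pass when `missing_violations(...)` returns an empty list. Extra observed violations are deliberately ignored in this sketch.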
The prefetcher requires this file to know which images to prefetch. This version lists docker.io images (nginx:1.18, debian:latest) that are used by the Go e2e tests, matching the images in multi-container-pod.yaml. Partially AI-generated.
/test gke-nongroovy-e2e-tests
/test ocp-4-12-nongroovy-e2e-tests ocp-4-18-nongroovy-e2e-tests ocp-4-19-nongroovy-e2e-tests gke-nongroovy-e2e-tests
    )
    subprocess.run(
        [
            "scripts/ci/lib.sh", "image_prefetcher_prebuilt_await"
@vikin91 I'm not sure this is the best place to do this. Note the calls to this function in operator/tests/run.sh and qa-tests-backend/scripts/run-part-1.sh: they are intentionally made as late as possible in order to reduce idle waiting time. Adding this call here, right after image_prefetcher_prebuilt_start, effectively makes the job wait for the prefetcher while it could be doing other things in the meantime.
Also, I don't think image_prefetcher_prebuilt_await is idempotent w.r.t. metrics collection, so I'm afraid this addition might duplicate the metrics on the jobs which run this elsewhere 🤔
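One way to picture both concerns (await as late as possible, and at most once) is the sketch below. It is an assumption-laden illustration, not the repository's code: the class `PrefetcherGate` and its `run` parameter are hypothetical, and only the `scripts/ci/lib.sh` path and the two function names come from this thread. The guard is in-process only, so it would not stop a separate shell script from awaiting again.

```python
import subprocess
from typing import Callable, List, Optional

LIB = "scripts/ci/lib.sh"  # path as it appears in the diff under review


class PrefetcherGate:
    """Hypothetical helper: start the prefetcher early, await it at most once."""

    def __init__(self, run: Optional[Callable[[List[str]], object]] = None):
        # The runner is injectable so the sketch can be exercised without a cluster.
        self._run = run or (lambda cmd: subprocess.run(cmd, check=True))
        self._awaited = False

    def start(self) -> None:
        # Kick off prefetching and return immediately.
        self._run([LIB, "image_prefetcher_prebuilt_start"])

    def await_once(self) -> bool:
        """Await the prefetcher; return False if it was already awaited."""
        if self._awaited:
            return False
        self._run([LIB, "image_prefetcher_prebuilt_await"])
        self._awaited = True
        return True
```

The intended usage would be to call `start()` during cluster setup, do other preparation, and call `await_once()` right before the first test that needs the images. Repeated calls then become no-ops within the same process.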
Thanks for looking into this! I wasn't sure whether races can happen, so I added the await. The total time required for the prefetcher to download all images (yes, there are only a few currently) was low, so I didn't see that as an issue.
I am open to suggestions on (1) whether to call it, and (2) where to call it.
I will also look into the metrics issue.
There are few in the nongroovy suite, but plenty in the groovy one, and I think this class is used for both? 🤔
I'll try to take a look for a better place this week.
Description
Extracted from: #17216
Note that using prefetched images in the existing tests is not trivial (it opens a Pandora's box) and will be added in a follow-up: #17354.
This PR helps avoid Docker Hub rate limiting. In particular (in the context of #17216), the images used by TestPods and TestContainerInstances were not taken from quay.io, which caused those tests to fail.
User-facing documentation
Testing and quality
Automated testing
How I validated my change
TestContainerInstances and TestPods are disabled ⛔ on this branch (they were the trigger for this work; they will be enabled in ROX-29771: Unflake TestPods and TestContainerInstances #17216)