Skip to content

ROX-29771: Use prefetched images in non-groovy tests#17354

Merged
vikin91 merged 8 commits intomasterfrom
piotr/ROX-29771-exculded-scopes-test
Oct 29, 2025
Merged

ROX-29771: Use prefetched images in non-groovy tests#17354
vikin91 merged 8 commits intomasterfrom
piotr/ROX-29771-exculded-scopes-test

Conversation

@vikin91
Copy link
Contributor

@vikin91 vikin91 commented Oct 17, 2025

Description

This commit addresses flakiness in e2e tests by improving deployment management and test reliability:

  1. Fixed TestExcludedScopes (excluded_scopes_test.go):

    • Changed image from 'nginx' to 'quay.io/rhacs-eng/qa-multi-arch-nginx:latest'
    • This ensures the 'Latest tag' policy is triggered (required for test logic)
  2. Updated related tests for consistency:

    • backup_test.go: Use prefetched image
    • active_vuln_test.go: Use prefetched images from quay.io to avoid rate limiting

All changes maintain backward compatibility while improving debuggability and reducing flakiness in CI environments.

Partially AI-generated.

For the tests/active_vuln_test.go, I have downloaded images from docker.io and uploaded them to quay.io as follows:

docker pull docker.io/library/nginx:1.14.0@sha256:8b600a4d029481cc5b459f1380b30ff6cb98e27544fc02370de836e397e34030

docker pull docker.io/library/nginx:1.18.0@sha256:e90ac5331fe095cea01b121a3627174b2e33e06e83720e9a934c7b8ccc9c55a0

docker pull docker.io/library/nginx:1.20.0@sha256:ea4560b87ff03479670d15df426f7d02e30cb6340dcd3004cdfc048d6a1d54b4


docker tag docker.io/library/nginx@sha256:8b600a4d029481cc5b459f1380b30ff6cb98e27544fc02370de836e397e34030 \
    quay.io/rhacs-eng/qa-multi-arch:nginx-1.14.0

docker tag docker.io/library/nginx@sha256:e90ac5331fe095cea01b121a3627174b2e33e06e83720e9a934c7b8ccc9c55a0 \
    quay.io/rhacs-eng/qa-multi-arch:nginx-1.18.0

docker tag docker.io/library/nginx@sha256:ea4560b87ff03479670d15df426f7d02e30cb6340dcd3004cdfc048d6a1d54b4 \
    quay.io/rhacs-eng/qa-multi-arch:nginx-1.20.0

docker push quay.io/rhacs-eng/qa-multi-arch:nginx-1.14.0
docker push quay.io/rhacs-eng/qa-multi-arch:nginx-1.18.0
docker push quay.io/rhacs-eng/qa-multi-arch:nginx-1.20.0

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

  • CI gke-nongroovy-e2e-tests will tell the truth
Screenshot 2025-10-23 at 10 20 47

@openshift-ci
Copy link

openshift-ci bot commented Oct 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@vikin91 vikin91 changed the title ROX-29771: Use prefetched images in nongroovy tests : ROX-29771: Use prefetched images in non-groovy tests Oct 17, 2025
@rhacs-bot
Copy link
Contributor

rhacs-bot commented Oct 17, 2025

Images are ready for the commit at 4d0bb44.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.10.x-65-g4d0bb445dc.

@codecov
Copy link

codecov bot commented Oct 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.72%. Comparing base (39ca19f) to head (f5043c0).
⚠️ Report is 6 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #17354   +/-   ##
=======================================
  Coverage   48.71%   48.72%           
=======================================
  Files        2724     2724           
  Lines      203031   202980   -51     
=======================================
- Hits        98899    98893    -6     
+ Misses      96367    96334   -33     
+ Partials     7765     7753   -12     
Flag Coverage Δ
go-unit-tests 48.72% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Consider extracting the default namespace string into a constant to avoid hardcoding "default" in all setupDeploymentInNamespace calls.
  • The createDeploymentViaAPI function is quite long—splitting it into smaller helpers (e.g. spec construction, error handling, logging) would improve readability.
  • The waitForDeployment* functions mix *testing.T and testutils.T; unifying on a single test interface would make the API more consistent.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider extracting the default namespace string into a constant to avoid hardcoding "default" in all setupDeploymentInNamespace calls.
- The createDeploymentViaAPI function is quite long—splitting it into smaller helpers (e.g. spec construction, error handling, logging) would improve readability.
- The waitForDeployment* functions mix *testing.T and testutils.T; unifying on a single test interface would make the API more consistent.

## Individual Comments

### Comment 1
<location> `scripts/ci/lib.sh:893-895` </location>
<code_context>
     qa-e2e)
         cp "$SCRIPTS_ROOT/qa-tests-backend/scripts/images-to-prefetch.txt" "$image_list"
         ;;
+    qa-nongroovy-e2e)
+        cp "$SCRIPTS_ROOT/tests/images-to-prefetch.txt" "$image_list"
+        ;;
     *)
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Check for existence of images-to-prefetch.txt before copying.

Add a check to verify the file exists before copying, or handle missing file errors to prevent runtime failures.

```suggestion
    qa-nongroovy-e2e)
        if [ -f "$SCRIPTS_ROOT/tests/images-to-prefetch.txt" ]; then
            cp "$SCRIPTS_ROOT/tests/images-to-prefetch.txt" "$image_list"
        else
            echo "WARNING: $SCRIPTS_ROOT/tests/images-to-prefetch.txt not found, skipping image prefetch for qa-nongroovy-e2e."
        fi
        ;;
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@vikin91 vikin91 force-pushed the piotr/ROX-29771-exculded-scopes-test branch 2 times, most recently from 9a43c43 to 5a405dd Compare October 20, 2025 09:06
@porridge
Copy link
Contributor

I ❤️ Refactored deployment creation (common.go): - would it be a big pain to extract it as a separate PR?

@vikin91
Copy link
Contributor Author

vikin91 commented Oct 20, 2025

I ❤️ Refactored deployment creation (common.go): - would it be a big pain to extract it as a separate PR?

This PR is already an extraction from #17216, which was planned to be a quick look into a flake 😂
Yes, I can extract it.

@vikin91 vikin91 force-pushed the piotr/ROX-29771-exculded-scopes-test branch from dc2675a to 4d0bb44 Compare October 20, 2025 11:13
@vikin91 vikin91 marked this pull request as ready for review October 20, 2025 13:26
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@vikin91 vikin91 added the auto-retest PRs with this label will be automatically retested if prow checks fails label Oct 21, 2025
@rhacs-bot
Copy link
Contributor

/retest

2 similar comments
@rhacs-bot
Copy link
Contributor

/retest

@rhacs-bot
Copy link
Contributor

/retest

@vikin91
Copy link
Contributor Author

vikin91 commented Oct 21, 2025

gke-nongroovy-e2e-tests failed with

Failed to pull image "quay.io/rhacs-eng/qa-multi-arch-nginx:latest": failed to pull and unpack image "quay.io/rhacs-eng/qa-multi-arch-nginx:latest": failed to resolve reference "quay.io/rhacs-eng/qa-multi-arch-nginx:latest": unexpected status from HEAD request to https://quay.io/v2/rhacs-eng/qa-multi-arch-nginx/manifests/latest: 401 UNAUTHORIZED

@rhacs-bot
Copy link
Contributor

rhacs-bot commented Oct 21, 2025

Images are ready for the commit at 3109e90.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.10.x-66-g3109e90b57.

@vikin91
Copy link
Contributor Author

vikin91 commented Oct 21, 2025

This is currently blocked by #17374
We cannot pull the prefetched images as we cannot utilize IMAGE_PULL_POLICY_FOR_QUAY_IO=Never for this. The #17374 changes the deployment creation to use the k8s API and that makes it easier to specify the pull policy.
One could have done it here, but that would be code written only for few hours and later deleted, so better to wait for #17374

@vikin91 vikin91 removed the auto-retest PRs with this label will be automatically retested if prow checks fails label Oct 21, 2025
@vikin91 vikin91 added the auto-retest PRs with this label will be automatically retested if prow checks fails label Oct 21, 2025
@vikin91
Copy link
Contributor Author

vikin91 commented Oct 21, 2025

/retest

@rhacs-bot
Copy link
Contributor

/test gke-nongroovy-e2e-tests

3 similar comments
@rhacs-bot
Copy link
Contributor

/test gke-nongroovy-e2e-tests

@rhacs-bot
Copy link
Contributor

/test gke-nongroovy-e2e-tests

@rhacs-bot
Copy link
Contributor

/test gke-nongroovy-e2e-tests

@rhacs-bot
Copy link
Contributor

rhacs-bot commented Oct 22, 2025

Images are ready for the commit at f5043c0.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.10.x-178-gf5043c0bdd.

@rhacs-bot
Copy link
Contributor

/test gke-nongroovy-e2e-tests

1 similar comment
@rhacs-bot
Copy link
Contributor

/test gke-nongroovy-e2e-tests

@vikin91
Copy link
Contributor Author

vikin91 commented Oct 23, 2025

I see that TestDelegatedScanning is failing sometimes on the X-YY-nongroovy-e2e-tests with the following failure:

{Failed      common.go:68: Oct 22 12:51:42.893 Num images deleted from query "Image Sha:sha256:68b418b74715000e41a894428bd787442945592486a08d4cbea89a9b4fa03302": 1
    common.go:68: Oct 22 12:51:43.297 Creating deployment "dele-scan-icsp" with image: "icsp.invalid/rhacs-eng/qa@sha256:68b418b74715000e41a894428bd787442945592486a08d4cbea89a9b4fa03302"
    common.go:68: Oct 22 12:51:43.305 Creating deployment "dele-scan-icsp" with image "icsp.invalid/rhacs-eng/qa@sha256:68b418b74715000e41a894428bd787442945592486a08d4cbea89a9b4fa03302" in namespace "default"
    common.go:432: [DEBUG] POST https://api.rox-ci-12586240.ocp.ci.rox.systems:6443/apis/apps/v1/namespaces/default/deployments?timeout=30s
I1022 12:51:46.281063  131750 warnings.go:110] "Warning: would violate PodSecurity \"restricted:latest\": runAsNonRoot != true (pod or container \"dele-scan-icsp\" must set securityContext.runAsNonRoot=true)"
    common.go:68: Oct 22 12:51:46.513 Found successful scan of "icsp.invalid/rhacs-eng/qa@sha256:68b418b74715000e41a894428bd787442945592486a08d4cbea89a9b4fa03302"
    common.go:68: Oct 22 12:51:46.514 Waiting until "app=sensor" pods logs contain the image scan: [contains line(s) matching "Image \"icsp.invalid/rhacs-eng/qa@sha256:68b418b74715000e41a894428bd787442945592486a08d4cbea89a9b4fa03302\".* enriched with metadata using pull source"]
    common.go:68: Oct 22 12:51:46.875 Found Sensor log entries indiciating successful scan of "icsp.invalid/rhacs-eng/qa@sha256:68b418b74715000e41a894428bd787442945592486a08d4cbea89a9b4fa03302" after byte 3016217
    common.go:68: Oct 22 12:51:46.875 Tearing down deployment "dele-scan-icsp"
    common.go:225: Timed out waiting for deployment dele-scan-icsp to stop}

I enabled log collection for the default namespace and the affected pod is doing well apparently:

Name:             dele-scan-icsp-7b8f94bc9d-vd6dl
Namespace:        default
Priority:         0
Service Account:  default
Node:             rox-ci-12586240-tx42b-worker-b-xxcd6/10.0.128.3
Start Time:       Wed, 22 Oct 2025 12:51:46 +0000

pod logs:

/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: /etc/nginx/conf.d/default.conf differs from the packaged version
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2025/10/22 12:51:52 [notice] 1#1: using the "epoll" event method
2025/10/22 12:51:52 [notice] 1#1: nginx/1.27.1
2025/10/22 12:51:52 [notice] 1#1: built by gcc 12.2.0 (Debian 12.2.0-14) 
2025/10/22 12:51:52 [notice] 1#1: OS: Linux 4.18.0-372.164.1.el8_6.x86_64
2025/10/22 12:51:52 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2025/10/22 12:51:52 [notice] 1#1: start worker processes
2025/10/22 12:51:52 [notice] 1#1: start worker process 28
2025/10/22 12:51:52 [notice] 1#1: start worker process 29
2025/10/22 12:51:52 [notice] 1#1: start worker process 30
2025/10/22 12:51:52 [notice] 1#1: start worker process 31
2025/10/22 12:51:52 [notice] 1#1: start worker process 32
2025/10/22 12:51:52 [notice] 1#1: start worker process 33
2025/10/22 12:51:52 [notice] 1#1: start worker process 34
2025/10/22 12:51:52 [notice] 1#1: start worker process 35

I think this is another issue unrelated to this work - the pod seem to start after the timeout for stopping it.

The log line

common.go:432: [DEBUG] POST https://api.rox-ci-12586240.ocp.ci.rox.systems:6443/apis/apps/v1/namespaces/default/deployments?timeout=30s
I1022 12:51:46.281063  131750 warnings.go:110] "Warning: would violate PodSecurity \"restricted:latest\": runAsNonRoot != true (pod or container \"dele-scan-icsp\" must set securityContext.runAsNonRoot=true)"

is caused by changes introduced here in this PR and will be fixed before merging.

@vikin91 vikin91 force-pushed the piotr/ROX-29771-exculded-scopes-test branch from 933d3fc to 360782e Compare October 23, 2025 14:54
@rhacs-bot
Copy link
Contributor

/test gke-nongroovy-e2e-tests

@vikin91 vikin91 force-pushed the piotr/ROX-29771-exculded-scopes-test branch from 360782e to db54343 Compare October 24, 2025 09:35
@rhacs-bot
Copy link
Contributor

/test gke-nongroovy-e2e-tests

…e2e tests

Use prefetched quay.io images instead of docker.io to avoid rate limits and
unauthorized access errors. Introduce createDeploymentViaAPI to create K8s
deployments via API client instead of kubectl, enabling proper imagePullPolicy
configuration for prefetched images.

Key changes:
- Add createDeploymentViaAPI with security context to satisfy PodSecurity
  policies and ensure graceful pod termination (fixes teardown timeouts)
- Apply IMAGE_PULL_POLICY_FOR_QUAY_IO only to actual quay.io/ images, not
  mirrored images (icsp.invalid, idms.invalid, itms.invalid) used in mirror
  tests
- Collect logs from "default" namespace in delegated scanning tests for
  debugging deployment issues
- Update test images to use quay.io mirrors and add them to prefetch list

Fixes TestDelegatedScanning teardown timeout caused by missing security
context fields preventing graceful pod termination.

AI-generated: Initial createDeploymentViaAPI structure, security context
fields, and log collection additions
User-corrected: imagePullPolicy logic to exclude mirrors, security context
values to match PodSecurity requirements, and verified all deployment
creation call paths
…iaAPI

Test images (nginx, etc.) run as root by default. The RunAsNonRoot constraint
was causing pod creation failures with "container has runAsNonRoot and image
will run as root" error in TestBackup and other tests.

Keep other security context fields (AllowPrivilegeEscalation=false,
drop all capabilities, seccomp profile) to help ensure graceful termination
without breaking root containers.

AI-generated: Initial approach with full security constraints
User-corrected: Removed RunAsNonRoot after identifying root container
requirement from test failure
@vikin91 vikin91 force-pushed the piotr/ROX-29771-exculded-scopes-test branch from db54343 to a689c34 Compare October 28, 2025 08:37
@openshift-ci
Copy link

openshift-ci bot commented Oct 28, 2025

@vikin91: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/gke-ui-e2e-tests f5043c0 link true /test gke-ui-e2e-tests

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@vikin91
Copy link
Contributor Author

vikin91 commented Oct 29, 2025

TestDelegatedScanning is passing on all 3 ocp nongroovy targets. I will merge this and if some corrections are needed, then I will cover in #17374

@vikin91 vikin91 merged commit fbae271 into master Oct 29, 2025
89 of 90 checks passed
@vikin91 vikin91 deleted the piotr/ROX-29771-exculded-scopes-test branch October 29, 2025 07:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-review area/ci auto-retest PRs with this label will be automatically retested if prow checks fails

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants