fix(ci): increase GKE disk to 120GB#19218

Open
davdhacs wants to merge 4 commits into master from davdhacs/rox-33305-gc-8h
@davdhacs
Contributor

@davdhacs davdhacs commented Feb 26, 2026

The default image garbage-collection expiration is 2 minutes (and cannot be increased). We're exceeding 85% disk usage, so GC was removing prefetched images.

default is 2 minutes; we're exceeding 85% disk
and so GC was removing prefetched images
@davdhacs
Contributor Author

/test gke-latest-qa-e2e-tests

@rhacs-bot
Contributor

rhacs-bot commented Feb 26, 2026

Images are ready for the commit at 9fb8f60.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-195-g9fb8f6037f.

node-accessible is disk minus OS/etc (~42GB gke-latest)
GC hits at 85% and evicts images not used in 2 minutes.
Neither the 85% nor the 2 minutes can be increased.
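
As a rough illustration of the behaviour described in this commit message, the sketch below models the two fixed parameters (the 85% threshold and the 2-minute window). It is a simplified model and not the kubelet's actual implementation; the `Image` class, the function names, and the usage figures are made up for the example.

```python
from dataclasses import dataclass

HIGH_THRESHOLD_PERCENT = 85  # from this PR: GC starts at 85% disk usage
MIN_AGE_SECONDS = 2 * 60     # from this PR: images not used in 2 minutes are eligible


@dataclass
class Image:
    name: str
    size_gb: float
    last_used_s: float  # hypothetical bookkeeping: time of last use, in seconds


def gc_triggered(used_gb: float, capacity_gb: float) -> bool:
    """Return True once disk usage reaches the high threshold."""
    return used_gb * 100 >= capacity_gb * HIGH_THRESHOLD_PERCENT


def eviction_candidates(images, now_s, used_gb, capacity_gb):
    """Return images this simplified model would evict, least recently used first."""
    if not gc_triggered(used_gb, capacity_gb):
        return []
    idle = [i for i in images if now_s - i.last_used_s > MIN_AGE_SECONDS]
    return sorted(idle, key=lambda i: i.last_used_s)
```

With an assumed 90GB in use, `gc_triggered(90, 100)` is true while `gc_triggered(90, 120)` is false, which is the reason for growing the disk instead of tuning GC. Note that `capacity_gb` should be the node-accessible space, not the raw disk size.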
@davdhacs
Contributor Author

/test gke-latest-qa-e2e-tests

@davdhacs
Contributor Author

/test gke-latest-qa-e2e-tests

@davdhacs davdhacs changed the title from "fix(ci): extend gke image GC time to 8h" to "fix(ci): increase GKE disk to 100GB" Feb 26, 2026
@davdhacs davdhacs changed the title from "fix(ci): increase GKE disk to 100GB" to "fix(ci): increase GKE disk to 120GB" Feb 26, 2026
The prefetcher pulls images by tag via the CRI API, which stores them
indexed by tag name. When tests reference the image as
tag@sha256:<manifest-list-digest>, containerd 2.x cannot resolve it
with imagePullPolicy: Never because the manifest list digest is not
indexed as a named image by the CRI pull-by-tag path.

This caused ErrImageNeverPull on every node regardless of disk size,
as the image was present on disk but not findable by digest. Images
referenced by tag only (busybox-1-33-1, nginx-1-12-1, etc.) worked
fine with the same Never pull policy.

Remove the @sha256: digest from TEST_IMAGE so it matches how the
prefetcher stores the image. Keep TEST_IMAGE_SHA available for API
queries that need the digest.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
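
For illustration only, here is a minimal sketch of the split this commit message describes: dropping the `@sha256:` part from a combined reference while keeping the digest for later use. The helper name `split_image_ref` and the example reference are hypothetical and not taken from the repository; the exact format of TEST_IMAGE_SHA in the test code is not shown in this PR.

```python
from typing import Optional, Tuple


def split_image_ref(ref: str) -> Tuple[str, Optional[str]]:
    """Split 'name:tag@sha256:<hex>' into ('name:tag', 'sha256:<hex>').

    A reference without '@' is returned unchanged with a digest of None.
    """
    name, sep, digest = ref.partition("@")
    return name, (digest if sep else None)


# Hypothetical reference; the registry path and digest are placeholders.
full_ref = "registry.example.com/qa/nginx:1-12-1@sha256:" + "0" * 64

# test_image is what a pod spec with imagePullPolicy: Never would use;
# test_image_sha stays available for API queries that need the digest.
test_image, test_image_sha = split_image_ref(full_ref)
```

The tag-only value matches how the prefetcher stored the image, so the lookup with the Never pull policy no longer depends on the digest being indexed.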
@davdhacs davdhacs requested a review from janisz as a code owner February 26, 2026 21:53
@davdhacs
Contributor Author

/test gke-latest-qa-e2e-tests

@openshift-ci

openshift-ci bot commented Feb 27, 2026

@davdhacs: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/gke-latest-qa-e2e-tests | 9fb8f60 | link | false | /test gke-latest-qa-e2e-tests |
| ci/prow/ocp-4-12-qa-e2e-tests | 9fb8f60 | link | false | /test ocp-4-12-qa-e2e-tests |
| ci/prow/ocp-4-12-scanner-v4-install-tests | 9fb8f60 | link | false | /test ocp-4-12-scanner-v4-install-tests |
| ci/prow/ocp-4-12-operator-e2e-tests | 9fb8f60 | link | false | /test ocp-4-12-operator-e2e-tests |
| ci/prow/ocp-4-20-qa-e2e-tests | 9fb8f60 | link | false | /test ocp-4-20-qa-e2e-tests |
| ci/prow/ocp-4-20-nongroovy-e2e-tests | 9fb8f60 | link | false | /test ocp-4-20-nongroovy-e2e-tests |
