experiment: busybox + parallel pre-build + docker layer caching #19831

Draft
davdhacs wants to merge 26 commits into ROX-33958/resue-components from davdhacs/busybox-parallel-build

davdhacs commented Apr 4, 2026

Description

Experiment: busybox + Docker layer caching with the existing parallel pre-build job structure. Compare with #19830 (the combined single-job approach).

This PR applies the same optimizations as #19830: stable ldflags, COPY --link, docker driver, inline cache, push:true, and provenance:false. Unlike #19830, it keeps the separate pre-build-go-binaries and pre-build-cli jobs running in parallel.

Expected: wall-clock time is longer than with the combined job because the build waits ~2m for pre-build-cli, but the build+push step should take ~30s on a warm cache.

🤖 Generated with Claude Code

davdhacs and others added 25 commits April 2, 2026 16:47
Enable GHA buildx cache for main, roxctl, and operator image builds.
Docker layers (base image pulls, package installs) are cached across
CI runs, avoiding redundant microdnf upgrade/install on every build.

Cache is opt-in via DOCKER_BUILDX_CACHE env var to avoid affecting
local builds. Scoped per image and architecture to prevent collisions.

Expected savings: ~60-80s off the 115s "Build main images" step on
warm runs (package install layers cached).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
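The opt-in behaviour described in this commit message could look roughly like the sketch below. It is not the repository's actual script: the function name, the `<image>-<arch>` scope format, `mode=max`, and treating any non-empty `DOCKER_BUILDX_CACHE` value as "enabled" are all assumptions made for illustration.

```shell
# Sketch only: print buildx cache flags when DOCKER_BUILDX_CACHE is set.
# Assumptions: function name, scope format "<image>-<arch>", and mode=max.
buildx_cache_args() {
  local image="$1" arch="$2"
  # Opt-in: with the variable unset or empty, print nothing so local builds are unaffected.
  if [ -z "${DOCKER_BUILDX_CACHE:-}" ]; then
    return 0
  fi
  # One scope per image and architecture keeps cache entries from colliding.
  local scope="${image}-${arch}"
  echo "--cache-from type=gha,scope=${scope} --cache-to type=gha,mode=max,scope=${scope}"
}
```

A caller would splice the output into the build command, for example `docker buildx build $(buildx_cache_args main amd64) ...`.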
Move OS package installation (microdnf upgrade, postgres RPMs,
util-linux) into a separate 'base-with-packages' stage. The final
stage uses FROM base-with-packages and only COPYs binaries.

Before: COPY binaries → RUN microdnf (rebuilds packages every commit)
After:  base-with-packages stage (cached) → COPY binaries (fast)

With Docker buildx GHA cache, the package install layer (~60s) is
cached across CI runs. Only the binary COPY steps rebuild per commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
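To make the stage ordering concrete, here is a small shell sketch that writes a minimal two-stage Dockerfile with the same shape. The base image, the package list, and the destination path are placeholders, not the contents of the real Dockerfile; only the stage name `base-with-packages` and `bin/central` come from the commit messages.

```shell
# Sketch only: write a minimal Dockerfile that mirrors the stage layout described above.
# The base image, packages, and /stackrox path are placeholders for illustration.
write_dockerfile_sketch() {
  cat > "$1" <<'EOF'
FROM registry.example.com/base:latest AS base-with-packages
# Package layers depend only on the base image, so they can stay cached across commits.
RUN microdnf upgrade -y && microdnf install -y util-linux && microdnf clean all

FROM base-with-packages
# Only the binary layers are expected to change per commit.
COPY bin/central /stackrox/central
EOF
}
```

Calling `write_dockerfile_sketch /tmp/Dockerfile.sketch` produces a file in which the package install precedes every binary COPY, which is the property the caching relies on.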
The RUN step calls save-dir-contents which is in static-bin/.
Copy just the helper scripts needed for package installation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace make docker-build-main-image with docker/build-push-action@v6
which handles GHA buildx cache natively. The action manages cache
tokens and builder configuration that our docker-build.sh wrapper
was missing.

The base-with-packages Dockerfile stage (package installs) should
now cache across CI runs, skipping microdnf upgrade/install when
only binaries change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each COPY --link creates an independent overlay layer. Changing one
binary (e.g. bin/central) doesn't invalidate other COPY layers
(ui, static-data, etc). Combined with GHA buildx cache, this means
only the changed binaries need to be re-copied on warm builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use push: true instead of load: true on docker/build-push-action.
This pushes layers directly to the registry from buildkit, avoiding
the slow --load export to local docker (~90s overhead even with all
layers cached).

The main image is now built and pushed in one step. roxctl and
central-db still use the separate push step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The container env vars removed by Tomecz's PR are no longer available.
Use secrets.* directly in docker/login-action instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The push-main-manifests step expects per-arch tags (e.g. main:tag-amd64)
to create multi-arch manifest lists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
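The per-arch tag convention mentioned above can be sketched as follows. How push-main-manifests actually builds the list is not shown in this PR text; `docker manifest create` is used here as one common way to do it, and the repository path, version, and arch list in the usage line are illustrative.

```shell
# Sketch only: derive per-arch tags and print a manifest-list command.
# Function names are hypothetical; the command is printed, not executed.
per_arch_tag() {
  # <repo>:<tag>-<arch>, e.g. main:tag-amd64
  echo "$1:$2-$3"
}

manifest_create_cmd() {
  local repo="$1" tag="$2"
  shift 2
  local cmd="docker manifest create ${repo}:${tag}"
  local arch
  for arch in "$@"; do
    cmd="${cmd} $(per_arch_tag "$repo" "$tag" "$arch")"
  done
  echo "$cmd"
}
```

For example, `manifest_create_cmd quay.io/rhacs-eng/main 4.8.0 amd64 arm64` prints a command that references one per-arch tag for each listed architecture.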
docker/login-action with registry: quay.io/org doesn't work for
quay.io. Use the existing registry_rw_login helper which handles
quay.io authentication correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…r stackrox-io

docker/login-action can only authenticate to one quay.io org at a time.
Push main to rhacs-eng via build-push-action (fast, cached layers).
Use existing push_main_image_set for stackrox-io and other images
(handles multi-org login correctly by re-authenticating per org).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
build-push-action pushes directly to registry without loading locally.
The existing push_main_image_set expects local images. Instead:
- Push roxctl/central-db from local docker (built by make)
- Copy main from rhacs-eng to stackrox-io via skopeo

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
push:true creates manifest lists instead of plain images, breaking
the multi-arch manifest creation step. Revert to load:true which
loads into local docker and uses the existing push_main_image_set
pipeline unchanged.

GHA layer cache still works with load:true (17 cached layers confirmed).
Build step: 105s warm vs 110s baseline (5% faster). The --load export
overhead limits the savings but the Dockerfile restructuring and
COPY --link provide the foundation for future improvements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use push:true + provenance:false to push main image directly from
buildx to the registry. provenance:false produces a plain image
manifest (not a manifest list), compatible with the downstream
push-main-manifests job that creates multi-arch manifest lists.

Login to both quay.io orgs before the build step so buildx can push
to both registries. roxctl and central-db still use docker push
(built locally by make).

Expected: Build+push main image ~55s (vs 105s with load:true).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
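The trade-off between the two output modes can be summarised in a short sketch. The function and mode names are assumptions; `--push`, `--load`, and `--provenance=false` are standard `docker buildx build` options that correspond to the action inputs named above.

```shell
# Sketch only: choose buildx output flags for the main image build.
# The function name and the "push"/"load" mode names are hypothetical.
buildx_output_args() {
  case "$1" in
    push)
      # Without a provenance attestation the pushed tag is a plain image manifest,
      # which a later multi-arch manifest step can reference directly.
      echo "--push --provenance=false"
      ;;
    load)
      # --load exports the result into the local Docker image store.
      echo "--load"
      ;;
    *)
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}
```

Usage would be along the lines of `docker buildx build $(buildx_output_args push) --tag "$IMAGE" .`, where `IMAGE` is whatever per-arch tag the workflow computes.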
…g copy

buildx's docker-container driver doesn't share host docker credentials.
Use docker/login-action (which injects creds into the buildx builder)
for rhacs-eng push. Copy main to stackrox-io via skopeo (lightweight,
blobs shared on quay.io).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docker login can only hold one quay.io credential at a time. Use
skopeo --src-creds and --dest-creds to authenticate to both orgs
simultaneously for the rhacs-eng → stackrox-io copy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
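A minimal sketch of the cross-org copy, assuming the image lives at `quay.io/<org>/main` and that credentials arrive in four environment variables. The variable names and function names are hypothetical; `--src-creds` and `--dest-creds` are the skopeo options named in the commit message.

```shell
# Sketch only: copy the main image between quay.io orgs with separate credentials per side.
# SRC_USER/SRC_PASS/DEST_USER/DEST_PASS and both function names are hypothetical.
main_image_ref() {
  # skopeo expects a transport prefix for registry images.
  echo "docker://quay.io/$1/main:$2"
}

copy_main_between_orgs() {
  local tag="$1"
  # Each side gets its own credentials, so no shared docker login state is needed.
  skopeo copy \
    --src-creds "${SRC_USER}:${SRC_PASS}" \
    --dest-creds "${DEST_USER}:${DEST_PASS}" \
    "$(main_image_ref rhacs-eng "$tag")" \
    "$(main_image_ref stackrox-io "$tag")"
}
```

Because the credentials are passed per invocation, the copy does not depend on which quay.io org the last `docker login` targeted.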
Use type=registry cache backed by GHCR instead of type=gha.
Benefits:
- No 10GB size limit (GHA cache is shared across all workflows)
- Buildx pulls only needed layers (content-addressable), not full blob
- Faster restore for multi-stage builds with many cached layers

Cache images stored at ghcr.io/stackrox/stackrox/cache/main-{arch}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch from docker-container to docker driver for buildx. The docker
driver uses Docker Engine's built-in buildkit — no separate container
to boot.

No cache for this run (baseline measurement). Will add inline cache
once driver change is validated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Docker driver (68s cold, no cache) is already faster than
docker-container + GHA cache (87s). Add inline cache on top:
- cache-to: type=inline embeds cache metadata in the pushed image
- cache-from pulls the previous build (cache-{arch} tag) as source
- With COPY --link, only changed layers need rebuilding

Push a stable cache-{arch} tag on every build so the next build
has a cache reference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
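The inline-cache setup described above amounts to three flags, sketched here. The function name and the repository argument are assumptions; the `cache-<arch>` tag follows the commit message, and `type=inline` and `type=registry,ref=...` are standard buildx cache options.

```shell
# Sketch only: flags for inline cache with a stable per-arch cache tag.
# The function name and repository argument are hypothetical.
inline_cache_args() {
  local repo="$1" arch="$2"
  local cache_ref="${repo}:cache-${arch}"
  # cache-to embeds cache metadata in the pushed image;
  # cache-from reads it back from the tag pushed by the previous build.
  echo "--cache-to type=inline --cache-from type=registry,ref=${cache_ref} --tag ${cache_ref}"
}
```

On the first run the `cache-<arch>` tag does not exist yet, so that build is expected to run cold and seed the tag for later runs.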
Same optimizations as the combined-job experiment but keeping the
existing parallel pre-build job structure:
- Busybox binary (1 binary instead of 8)
- Stable ldflags (BUILD_TAG=0.0.0)
- Docker driver + COPY --link + inline cache
- push:true + provenance:false
- CLI builds only host-arch roxctl

Compare wall-clock with combined-job approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
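One way to keep the stable tag from leaking into normal builds is to make it opt-in, as in this sketch. `STABLE_BUILD_TAG` and `resolve_build_tag` are illustrative names that do not appear in the PR, and `git describe` is only a stand-in for however the repository computes its real tag.

```shell
# Sketch only: pick a constant build tag when a hypothetical opt-in flag is set.
# STABLE_BUILD_TAG and resolve_build_tag are illustrative names, not part of the PR.
resolve_build_tag() {
  if [ "${STABLE_BUILD_TAG:-false}" = "true" ]; then
    # A constant tag keeps the ldflags identical across commits,
    # so unchanged code should produce unchanged binaries.
    echo "0.0.0"
  else
    # Stand-in for the repository's own tag computation.
    git describe --tags --always
  fi
}
```

A pre-build job could then run `BUILD_TAG="$(resolve_build_tag)"` and set the flag only in the experimental workflow.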

openshift-ci bot commented Apr 4, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

sourcery-ai bot left a comment

Hey - I've left some high level feedback:

  • Hard-coding BUILD_TAG=0.0.0 and SHORTCOMMIT="0000000" in the pre-build jobs changes behavior beyond pure performance tuning; consider guarding this behind a conditional (e.g. env flag or separate workflow) so normal builds still use the dynamically computed values.
  • In the Push remaining images step, registry_rw_login is called inside the loop for each image and registry; you can move the logins outside the loop to avoid repeated logins and slightly speed up the push phase.
  • The new explicit tag/push logic for main, roxctl, and central-db in the workflow diverges from the existing push_main_image_set helper; consider centralizing this again (or at least reusing shared helpers) to keep tag/push behavior consistent across workflows.
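The second suggestion, logging in once per registry instead of once per image, could be sketched as below. `registry_rw_login` and `push_image` are stubs here because the real helper's signature is not shown in this PR text, and the registry and image lists are taken from the commit messages. Since docker login can hold only one quay.io credential at a time, the sketch keeps the login in the outer loop rather than logging in to both orgs up front.

```shell
# Sketch only: log in once per registry, then push every image for that registry.
# Both helpers below are stubs standing in for the real login and tag+push logic.
registry_rw_login() {
  echo "login $1"
}

push_image() {
  echo "push $1/$2"
}

push_remaining_images() {
  local registries="quay.io/rhacs-eng quay.io/stackrox-io"
  local images="roxctl central-db"
  local registry image
  for registry in $registries; do
    # One login per registry, hoisted out of the image loop.
    registry_rw_login "$registry"
    for image in $images; do
      push_image "$registry" "$image"
    done
  done
}
```

With two registries and two images this performs two logins and four pushes, instead of four logins when the login sits inside the inner loop.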

codecov bot commented Apr 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 49.60%. Comparing base (5d9f83e) to head (04b23aa).

Additional details and impacted files
@@                     Coverage Diff                     @@
##           ROX-33958/resue-components   #19831   +/-   ##
===========================================================
  Coverage                       49.60%   49.60%           
===========================================================
  Files                            2763     2763           
  Lines                          208254   208254           
===========================================================
+ Hits                           103309   103313    +4     
+ Misses                          97278    97274    -4     
  Partials                         7667     7667           
Flag Coverage Δ
go-unit-tests 49.60% <ø> (+<0.01%) ⬆️
