Skip to content

perf(ci): shallow clone for CI jobs#19294

Draft
davdhacs wants to merge 18 commits intomasterfrom
davdhacs/gha-shallow-clone
Draft

perf(ci): shallow clone for CI jobs#19294
davdhacs wants to merge 18 commits intomasterfrom
davdhacs/gha-shallow-clone

Conversation

@davdhacs
Copy link
Contributor

@davdhacs davdhacs commented Mar 5, 2026

Summary

Remove fetch-depth: 0 from all checkout steps. Shallow clones (depth=1, the default) skip git history download, saving ~10s and ~220 MB per job (~405 MB → ~184 MB checkout).

How it works

The job-preamble action resolves version info for shallow clones in ~2s:

  1. git ls-remote --tags lists all .x version tags (~1s, no object download)
  2. Tries each in reverse version order with git fetch --shallow-exclude until one succeeds (= nearest ancestor .x tag). On master this is the 1st try; on release branches it skips newer non-ancestor tags.
  3. Counts commits via git rev-list --count and sets BUILD_TAG env var, which make/env.mk checks before git describe.

Version tag behavior

On master and PRs to master: BUILD_TAG is 4.11.x-270-g<sha>, matching what git describe produces on a full clone.

On release branches (e.g. release-4.9): BUILD_TAG is 4.9.x-1200-g<sha> — the nearest .x branch-point tag. This differs from what git describe would produce on a full clone (4.9.3-31-g<sha>, the nearest release tag). The .x suffix is functionally correct: it triggers QUAY_TAG_EXPIRATION=13w in make/env.mk (CI builds should expire), and nothing parses the commit count. The SHA is identical either way.

Changes

  • job-preamble (+22 lines): shallow clone detection, .x tag lookup via ls-remote, history fetch via --shallow-exclude, BUILD_TAG env var
  • 14 workflow files (-17 lines): removed fetch-depth: 0

No changes to make/env.mk, Makefiles, scripts, or the release process.

Exceptions

Kept fetch-depth: 0 for:

  • check-crd-compatibility: checks out previous released version by tag
  • local-roxctl-tests: bats tests use git describe for version comparison
  • misc-checks: needs full history for git diff --check
  • push-scanner-manifests: no job-preamble to resolve version tag
  • update_*_periodic workflows: need all branches/tags for version bumps

Test plan

  • Unit Tests pass (go, go-postgres, go-bench, sensor, shell, openshift-ci)
  • Style passes (golangci-lint, check-generated-files, style-check)
  • Build passes (pre-build-cli, pre-build-go-binaries, pre-build-docs, main images)
  • Scanner builds pass
  • Version strings correct (X.Y.x-N-g<sha> format)
  • check-crd-compatibility passes
  • local-roxctl-tests passes
  • Prow e2e tests pass

🤖 Generated with Claude Code

@openshift-ci
Copy link

openshift-ci bot commented Mar 5, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • In the make/env.mk fallback branch, git tag --list '[0-9]*.[0-9]*.x' can return nothing in a shallow clone, producing a version like -0-g<sha>; consider explicitly handling the “no tag found” case (e.g., defaulting to a clearer placeholder or skipping the tag- prefix when empty).
  • The job-preamble step relies on git ls-remote --tags origin | grep -oP 'refs/tags/\K[0-9]+\.[0-9]+\.x$' which assumes tags with a strict <major>.<minor>.x format and the availability of grep -P; consider making this more robust to different tag naming schemes (e.g., v1.2.x) and environments where grep -P may not be present.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In the `make/env.mk` fallback branch, `git tag --list '[0-9]*.[0-9]*.x'` can return nothing in a shallow clone, producing a version like `-0-g<sha>`; consider explicitly handling the “no tag found” case (e.g., defaulting to a clearer placeholder or skipping the `tag-` prefix when empty).
- The `job-preamble` step relies on `git ls-remote --tags origin | grep -oP 'refs/tags/\K[0-9]+\.[0-9]+\.x$'` which assumes tags with a strict `<major>.<minor>.x` format and the availability of `grep -P`; consider making this more robust to different tag naming schemes (e.g., `v1.2.x`) and environments where `grep -P` may not be present.

## Individual Comments

### Comment 1
<location path=".github/actions/job-preamble/action.yaml" line_range="156" />
<code_context>
+          echo "::warning::No version tag found via ls-remote"
+          exit 0
+        fi
+        git fetch --shallow-exclude="$tag" origin HEAD 2>/dev/null || true
+      shell: bash
+
</code_context>
<issue_to_address>
**question (bug_risk):** Clarify/verify that `--shallow-exclude` with `HEAD` actually yields enough history for `git describe` in all cases.

This approach assumes `HEAD` is on `origin/HEAD` and that the tag is on the same branch, so that `--shallow-exclude="$tag" origin HEAD` makes the tag an ancestor of `HEAD`.

If tags are created on other branches, this may either fetch much more history than intended or still leave the tag unreachable from `HEAD`. Consider explicitly fetching the expected branch (e.g. `origin/main`/`origin/master`) or resolving the tracked remote ref for `HEAD` and fetching that, to avoid surprises when default branches or tag placement change.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@davdhacs davdhacs force-pushed the davdhacs/gha-shallow-clone branch 3 times, most recently from 2227924 to 5bc30cc Compare March 5, 2026 02:36
@davdhacs davdhacs changed the title perf(ci): shallow clone for all CI jobs perf(ci): shallow clone for CI jobs Mar 5, 2026
@rhacs-bot
Copy link
Contributor

rhacs-bot commented Mar 5, 2026

Images are ready for the commit at 3c9324b.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-274-g3c9324b48a.

@codecov
Copy link

codecov bot commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 49.61%. Comparing base (1e56031) to head (3c9324b).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19294      +/-   ##
==========================================
- Coverage   49.65%   49.61%   -0.04%     
==========================================
  Files        2689     2689              
  Lines      202505   202505              
==========================================
- Hits       100552   100475      -77     
- Misses      94460    94524      +64     
- Partials     7493     7506      +13     
Flag Coverage Δ
go-unit-tests 49.61% <ø> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@davdhacs davdhacs force-pushed the davdhacs/gha-shallow-clone branch 12 times, most recently from a419396 to 1f6b3ad Compare March 5, 2026 05:40
Remove fetch-depth: 0 from all checkout steps. Shallow clones (depth=1,
the default) skip git history download, saving ~10s and ~220 MB per job.

The job-preamble action resolves version info for shallow clones in ~3s:
1. git ls-remote finds the nearest X.Y.x version tag (~0.4s)
2. git fetch --shallow-exclude deepens history to the tag (~2s)
3. A local tag at the boundary enables git describe and make tag
4. BUILD_TAG env var is set as a fallback for make/env.mk

Tries tags in reverse version order to handle both master and release
branches. Commit count is off by 1 (--shallow-exclude stops above the
tag commit) which is acceptable for version strings.

Kept fetch-depth: 0 for misc-checks (git diff --check needs full
history), push-scanner-manifests (no preamble), and update_*_periodic
workflows (need all branches/tags).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@davdhacs davdhacs force-pushed the davdhacs/gha-shallow-clone branch from 1f6b3ad to 5f5f03e Compare March 5, 2026 13:46
@davdhacs
Copy link
Contributor Author

davdhacs commented Mar 5, 2026

/test all

@davdhacs davdhacs force-pushed the davdhacs/gha-shallow-clone branch 3 times, most recently from ce30ad3 to f1051af Compare March 5, 2026 16:19
@davdhacs davdhacs force-pushed the davdhacs/gha-shallow-clone branch 3 times, most recently from e2c3843 to ef1c51f Compare March 5, 2026 17:08
Remove fetch-depth: 0 from all checkout steps. Shallow clones (depth=1,
the default) skip git history download, saving ~10s and ~220 MB per job.

The job-preamble action resolves version info for shallow clones in ~3s:
1. git ls-remote finds the nearest version tag (~0.4s)
2. git fetch --shallow-exclude deepens history to the tag (~2s)
3. A local tag at the boundary enables git describe and make tag
4. BUILD_TAG env var is set as a fallback for make/env.mk

On master: finds the latest X.Y.x tag.
On release branches: detects version from branch name and finds the
nearest X.Y.Z or RC tag (e.g. release-4.10 finds 4.10.0, not 4.11.x).

Kept fetch-depth: 0 for misc-checks (git diff --check needs full
history), push-scanner-manifests (no preamble), and update_*_periodic
workflows (need all branches/tags).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@davdhacs davdhacs force-pushed the davdhacs/gha-shallow-clone branch from ef1c51f to f9ac8d2 Compare March 5, 2026 17:15
@davdhacs davdhacs closed this Mar 5, 2026
@davdhacs davdhacs reopened this Mar 5, 2026
davdhacs and others added 15 commits March 5, 2026 10:18
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- check-crd-compatibility: checks out previous released version by tag
- local-roxctl-tests: bats tests use git describe for version comparison

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The broad version tag pattern matched a bogus 9.9.99-rc-9 tag,
causing the shallow-exclude to fetch 8673 commits (~97s). Restrict
to X.Y.x tags only (20 tags vs 1111) for fast, predictable matching.

On release branches this means the version string uses the .x tag
(branch point) rather than the nearest release/RC tag. The SHA is
still correct and unique.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Try the GraphQL compare API first to check tag ancestry and get the
commit count in a single lightweight call (~200 bytes response) without
modifying the shallow clone. Falls back to the git-based approach
(--shallow-exclude + rev-list) when the API is unavailable.

The GraphQL API avoids downloading 2+ MB from the REST compare endpoint
and doesn't require fetching additional git history, making it faster
(~0.5s vs ~3s) and cleaner (no side effects on the shallow clone).

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the git ls-remote + curl loop with a single GitHub GraphQL
query that checks all .x version tags (3.60-3.80, 4.0-4.30) for
ancestry and commit count in one call. Non-existent tags return null,
the nearest ancestor has behindBy=0 with the smallest aheadBy.

Separated into two steps: API (primary) and git (fallback). The git
fallback only runs when the API step fails (no token, network issues).

Benefits:
- No git ls-remote needed (saves ~1s)
- One API call checks 52 tags at once (~1.4 KB response)
- No git history modification (no --shallow-exclude side effects)
- Cleaner step separation for debugging

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the python-based GraphQL query builder with pure bash:
- Build aliased refs() queries with inline compare() in a bash loop
- JSON-encode with sed, parse response with grep+sed
- No python, no jq, no gh dependency — just curl+grep+sed

Uses refs(query:"M.N.x",first:1) which returns the exact .x tag
(alphabetically before any -nightly suffix) with inline compare
for ancestry check and commit count. Covers 4.x/5.x/6.x ranges.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use GitHub's latestRelease to discover the current version, then
check only ±3 minor versions across ±1 major (21 aliases instead
of 93). Fully self-adjusting — no hardcoded version ranges to
maintain when the project moves to 5.x or beyond.

Two GraphQL calls (~1.4s total):
  1. latestRelease { tagName } → e.g. "4.10.0" → major=4, minor=10
  2. refs+compare for 3.7.x–3.13.x, 4.7.x–4.13.x, 5.7.x–5.13.x

Extracted gql() helper to reduce curl boilerplate.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the 21-alias batched GraphQL query with direct tag derivation:
- Release branches: derive tag from branch name (release-4.10 → 4.10.x)
- Master: highest release version + 1 minor (4.10.0 → 4.11.x)

This reduces the API step to 1-2 targeted GraphQL calls instead of
a batched query with many aliases. Much easier to read and review.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On release branches, git describe finds the nearest release tag
(e.g. 4.9.3-31-gabcdef), not the .x branch point tag. Use the
releases API to find the latest M.N.Z release matching the branch
version. This shares the same releases query with the master path.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TEMPORARY: run branch-aware API, curl-loop API, and git-only
approaches side-by-side on every job. Only branch-aware sets
BUILD_TAG; the other two log their results via ::notice:: for
timing comparison. Remove after evaluation.

Note: git-only runs last since it modifies the shallow clone.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the three comparison steps with the final approach:
- Use release branch names for version discovery (GraphQL branch
  listing, ~400ms) instead of releases API
- Release branches: derive tag from branch name (0ms, no API call)
- Master: highest release branch + 1 minor (release-4.10 → 4.11.x)
- Background git verification runs in parallel with subsequent
  preamble steps; a verification step at the end fails on mismatch

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The API (aheadBy) and git (rev-list --count +1) commit counts
differ by 1 because --shallow-exclude excludes the tag commit
itself. Compare only the version prefix (e.g. 4.11.x) to verify
the correct tag was resolved.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The GraphQL approach saved ~1s per job but added significant
complexity (GITHUB_TOKEN, gql() helper, branch parsing, background
verification). With container init at 55s, the 1s difference is
negligible. The git approach is 22 lines, needs no API token, and
handles all branch types via the tag ancestry loop.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@davdhacs davdhacs force-pushed the davdhacs/gha-shallow-clone branch from f16125e to 4fd975e Compare March 6, 2026 06:35
This workflow checks out external repos (acs-fleet-manager) and
runs kind cluster tests that may need full history. Revert to
avoid unrelated failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@davdhacs davdhacs force-pushed the davdhacs/gha-shallow-clone branch from 4fd975e to 3c9324b Compare March 6, 2026 06:50
@davdhacs
Copy link
Contributor Author

davdhacs commented Mar 6, 2026

/test all

@openshift-ci
Copy link

openshift-ci bot commented Mar 6, 2026

@davdhacs: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/gke-qa-e2e-tests 3c9324b link false /test gke-qa-e2e-tests
ci/prow/gke-nongroovy-e2e-tests 3c9324b link true /test gke-nongroovy-e2e-tests
ci/prow/gke-upgrade-tests 3c9324b link false /test gke-upgrade-tests
ci/prow/gke-ui-e2e-tests 3c9324b link true /test gke-ui-e2e-tests
ci/prow/ocp-4-12-nongroovy-e2e-tests 3c9324b link false /test ocp-4-12-nongroovy-e2e-tests
ci/prow/ocp-4-20-nongroovy-e2e-tests 3c9324b link false /test ocp-4-20-nongroovy-e2e-tests

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants