perf(ci): shard go and go-postgres unit test jobs#19744
perf(ci): shard go and go-postgres unit test jobs#19744
Conversation
Fix tests and scripts to work on ubuntu-latest host runners (no container) in addition to CI container environments. - rate_limited_logger.go: add /home/runner/work/ path prefix (container used /__w/) - download_zip_test.go: accept ErrPermission for /root/ paths (non-root runner returns permission denied, not not-found) - detection_test.go: delete circular test - lib.sh: escape & in bash 5.2+ replacements (verified 4.4-5.3) - helpers.bash: add yq_multidoc helper (yq 4.x adds --- separators) - roxctl-netpol-generate-*.bats: use yq_multidoc All changes are backward-compatible with container environments. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add SKIP_DEPS guard to deps target: when set, skip go mod tidy for jobs that don't need it (e.g. test-only jobs). - Split go-postgres-unit-tests into main/migrator subtargets to enable independent sharding in CI. - Remove -count=1 from integration-unit-tests to allow Go test cache hits. All changes are backward-compatible: existing callers work unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
grep -v returns exit code 1 when no lines match (e.g. yq output is only --- separators), which would look like a yq failure. sed always returns 0 regardless of whether lines were deleted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run Go test jobs directly on ubuntu-latest instead of in the CI container image. This saves ~55s container init overhead per job and enables faster cache I/O (no overlay filesystem). Changes per job: - Remove container: block - Add actions/setup-go with GOTOOLCHAIN=auto override - Replace su postgres with docker run postgres - Add pg_isready retry loop (docker postgres needs warmup time) - go-bench: skip on PRs (if: github.event_name == 'push') Depends on #19712 (host-runner test fixes). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split the monolithic go job into 5 shards (pkg-helm, pkg-other, central-1, central-2, rest) and go-postgres into 2 shards (main, migrator) for parallel execution. Also split integration and operator tests into separate jobs (go-integration, go-operator-integration) so they run in parallel with unit tests instead of sequentially after them. Depends on #19743 (container removal) and #19742 (SKIP_DEPS + postgres subtargets). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- The
EXCLUDE/exclude-patternregexes used to buildPACKAGES(e.g./central/[o-z],/helm/) rely on substring matches ingo listoutput and could easily over-/under-match if paths change; consider anchoring these patterns to path segments or using more explicit regexes to make the sharding more robust. - In the
go-integrationjob,go list ./... | grep -E 'registries|scanners|notifiers'will select any package whose import path contains those substrings (including potential helpers or unrelated dirs); if you intend to restrict to specific packages/directories, tighten this filter to explicit path patterns.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `EXCLUDE`/`exclude-pattern` regexes used to build `PACKAGES` (e.g. `/central/[o-z]`, `/helm/`) rely on substring matches in `go list` output and could easily over-/under-match if paths change; consider anchoring these patterns to path segments or using more explicit regexes to make the sharding more robust.
- In the `go-integration` job, `go list ./... | grep -E 'registries|scanners|notifiers'` will select any package whose import path contains those substrings (including potential helpers or unrelated dirs); if you intend to restrict to specific packages/directories, tighten this filter to explicit path patterns.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Images are ready for the commit at 9c2e078. To use with deploy scripts, first |
📝 WalkthroughWalkthroughThis PR refactors the CI/CD workflow to use job sharding and setup-go actions, adds new integration test jobs, updates database setup patterns, fixes XML escaping in JUnit output, removes an obsolete test, adds GitHub Actions workspace path trimming, enhances error handling in path traversal tests, and introduces a YAML multi-document helper function for test assertions. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/unit-tests.yaml:
- Around line 185-186: The "Cache Go dependencies" step using
./.github/actions/cache-go-dependencies is missing a key-suffix to separate
caches for different matrix runs; update that step to include a key-suffix like
${{ matrix.gotags }} (or ${{ matrix.gotags }}-<other-identifiers>) so the cache
key differs per gotags matrix entry and prevents collisions between matrix
variations.
- Around line 83-87: The unit-tests job is always setting GOTAGS to "test" and
ignores the matrix dimension; update the GOTAGS environment passed to
scripts/go-test.sh so it includes the matrix value (`${{ matrix.gotags }}`) as a
prefix before the existing test tag (so the matrix variations actually run with
different GOTAGS), i.e. change the GOTAGS assignment used when invoking
go-test.sh (the GOTAGS=... before scripts/go-test.sh) to incorporate `${{
matrix.gotags }}` while preserving the existing ",test" behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: dffb98ea-2637-41a2-8e05-52bc5f3ec7e3
📒 Files selected for processing (8)
.github/workflows/unit-tests.yamlpkg/containers/detection_test.gopkg/logging/rate_limited_logger.goroxctl/common/zipdownload/download_zip_test.goscripts/ci/lib.shtests/roxctl/bats-tests/helpers.bashtests/roxctl/bats-tests/local/roxctl-netpol-generate-development.batstests/roxctl/bats-tests/local/roxctl-netpol-generate-release.bats
💤 Files with no reviewable changes (1)
- pkg/containers/detection_test.go
| GOTAGS="${GOTAGS:+$GOTAGS,}test" CGO_ENABLED=1 GOEXPERIMENT=cgocheck2 MUTEX_WATCHDOG_TIMEOUT_SECS=30 \ | ||
| scripts/go-test.sh -timeout 25m -race -cover \ | ||
| -coverprofile test-output/coverage.out -v \ | ||
| $PACKAGES \ | ||
| | tee test-output/test.log |
There was a problem hiding this comment.
Missing ${{ matrix.gotags }} causes both matrix variations to run identical tests.
The gotags matrix dimension is defined (line 24) but not used in this command. Both GOTAGS="" and GOTAGS=release matrix jobs will execute with identical GOTAGS=test, defeating the purpose of testing with release tags.
Compare with go-integration job (line 147) which correctly uses ${{ matrix.gotags }} prefix.
🐛 Proposed fix
# shellcheck disable=SC2086
- GOTAGS="${GOTAGS:+$GOTAGS,}test" CGO_ENABLED=1 GOEXPERIMENT=cgocheck2 MUTEX_WATCHDOG_TIMEOUT_SECS=30 \
+ ${{ matrix.gotags }} GOTAGS="${GOTAGS:+$GOTAGS,}test" CGO_ENABLED=1 GOEXPERIMENT=cgocheck2 MUTEX_WATCHDOG_TIMEOUT_SECS=30 \
scripts/go-test.sh -timeout 25m -race -cover \🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/unit-tests.yaml around lines 83 - 87, The unit-tests job
is always setting GOTAGS to "test" and ignores the matrix dimension; update the
GOTAGS environment passed to scripts/go-test.sh so it includes the matrix value
(`${{ matrix.gotags }}`) as a prefix before the existing test tag (so the matrix
variations actually run with different GOTAGS), i.e. change the GOTAGS
assignment used when invoking go-test.sh (the GOTAGS=... before
scripts/go-test.sh) to incorporate `${{ matrix.gotags }}` while preserving the
existing ",test" behavior.
| - name: Cache Go dependencies | ||
| uses: ./.github/actions/cache-go-dependencies |
There was a problem hiding this comment.
Missing key-suffix for cache differentiation between gotags matrix runs.
Other jobs include key-suffix: ${{ matrix.gotags }}-... to prevent cache collisions between matrix variations. This job omits it, which could cause cache conflicts.
🔧 Proposed fix
- name: Cache Go dependencies
uses: ./.github/actions/cache-go-dependencies
+ with:
+ key-suffix: ${{ matrix.gotags }}-operator-integration📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Cache Go dependencies | |
| uses: ./.github/actions/cache-go-dependencies | |
| - name: Cache Go dependencies | |
| uses: ./.github/actions/cache-go-dependencies | |
| with: | |
| key-suffix: ${{ matrix.gotags }}-operator-integration |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/unit-tests.yaml around lines 185 - 186, The "Cache Go
dependencies" step using ./.github/actions/cache-go-dependencies is missing a
key-suffix to separate caches for different matrix runs; update that step to
include a key-suffix like ${{ matrix.gotags }} (or ${{ matrix.gotags
}}-<other-identifiers>) so the cache key differs per gotags matrix entry and
prevents collisions between matrix variations.
…o davdhacs/shard-go-tests
…est shard Use ./... with exclude-pattern /pkg/|/central/ instead of listing every top-level directory. Automatically picks up new directories. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Match the other test jobs' stable ldflag env vars. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The operator Makefile needs the scanner module in GOMODCACHE for proto generation. Previously inherited from the go job's make deps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Operator tests use their own test runner, not scripts/go-test.sh, so test-output/test.log is never created. The generate-junit-reports make target fails when the log doesn't exist. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Explain why each job overrides GOTOOLCHAIN from local to auto: setup-go sets GOTOOLCHAIN=local, but sub-module tools with newer go directives need auto to download the required toolchain. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Explain that setup-go reads go.mod to install Go, then exports GOTOOLCHAIN=local to force that version. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clarify that setup-go exports GOTOOLCHAIN matching the go.mod version, not hardcoded to "local". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
07f23c0 to
b12da02
Compare
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The go-bench PR skip is handled separately in #19758. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #19744 +/- ##
===========================================
- Coverage 49.59% 49.02% -0.58%
===========================================
Files 2760 768 -1992
Lines 208144 74767 -133377
===========================================
- Hits 103239 36652 -66587
+ Misses 97239 34762 -62477
+ Partials 7666 3353 -4313
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Description
Split Go test jobs into parallel shards and add supporting Makefile changes:
Makefile changes:
SKIP_DEPS: wrapdepstarget inifdef SKIP_DEPSguard to skipgo mod tidyin test-only CI jobsgo-postgres-unit-testsintogo-postgres-unit-tests-main/go-postgres-unit-tests-migratorsubtargets for independent sharding-count=1fromintegration-unit-teststo allow Go test cache hitsWorkflow changes (unit-tests.yaml):
go job → 5 shards:
pkg-helm,pkg-other,central-1,central-2,restgo-postgres → 2 shards:
main(central/pkg/tools),migrator(with-p 1)Split out from go job:
go-integration: registries/scanners/notifiers integration testsgo-operator-integration: operator test-integration + test-helmAll Makefile changes are backward-compatible.
Includes #19712 (host-runner test fixes).
Split from #19678, replaces #19742.
User-facing documentation
Testing and quality
Automated testing
How I validated my change
Previously validated on #19678.
🤖 Generated with Claude Code