Skip to content

perf(ci): remove container from unit-tests jobs#19678

Draft
davdhacs wants to merge 53 commits intomasterfrom
davdhacs/matrix-unittests
Draft

perf(ci): remove container from unit-tests jobs#19678
davdhacs wants to merge 53 commits intomasterfrom
davdhacs/matrix-unittests

Conversation

@davdhacs
Copy link
Copy Markdown
Contributor

@davdhacs davdhacs commented Mar 29, 2026

Description

Comprehensive unit-tests workflow optimization: remove all CI containers, shard Go tests, skip GOMODCACHE, cache integration tests, skip benchmarks on PRs, and optimize Cypress.

Master: 33-36m / 188m compute → 3.2m / 54m compute. 10-11x faster, 71% less compute.

Changes

  1. Container removal: All 9 jobs run on ubuntu-latest. Saves 55-77s init per job.
  2. Go test sharding (5-way): pkg-helm, pkg-other, central-1, central-2, rest.
  3. Go-postgres sharding (2-way): main / migrator.
  4. Integration test split: go-integration (cacheable, removed -count=1) + go-operator-integration.
  5. Roxctl test split: dev / release variants in parallel.
  6. Benchmarks: master-only (skipped on PRs).
  7. Skip GOMODCACHE: Warm test shards don't need 2.9GB module cache. Cache step: 60s → 4s.
  8. SKIP_DEPS: Bypass go mod tidy in make targets (validated in style workflow).
  9. Shallow checkout: Test jobs don't need git history.
  10. Cypress optimization: cypress-io/github-action (caches binary), CYPRESS_VIDEO=false, artifacts upload only on failure.

Warm run results (run 23728217606)

Job Time
go (pkg-helm) 1.0m
go-postgres (migrator) 1.0m
shell-unit-tests 1.0m
openshift-ci 1.5m
go-integration 1.6-1.7m
go (central-2) 2.0-2.2m
go (pkg-other) 2.0-2.2m
go-postgres (main) 2.1m
go (rest) 2.4-2.7m
go (central-1) 2.8-3.2m
go-operator-integration 2.8-2.9m
local-roxctl (release) 2.9m
sensor-integration 3.0m
ui-component 3.1m
local-roxctl (dev) 3.2m
Metric Master This PR (warm) This PR (cold)
Wall-clock 33-36m 3.2m 11m
Total compute 188m 54m 70m
Speedup 10-11x 3x
Compute savings 71% 63%
Jobs 11 25 25

Key findings

  • GOMODCACHE (2.9GB) was the cache bottleneck, not GOCACHE. Skipping saves ~60s/job.
  • go mod tidy in make deps downloaded all modules even when tests were cached. SKIP_DEPS=1 bypasses this.
  • -count=1 on integration tests from 2019 was unnecessary — 38/50 packages use mocks.
  • 71% less total compute despite 2.3x more jobs — caching eliminates redundant work.
  • Smaller PRs = faster CI — test cache hits are per-package.

Test fixes for host-runner environment

TestContainerDetection, zip path perms, postgres locale, rate_limited_logger paths, bash 5.2+ XML escaping, GOTOOLCHAIN=auto, bats-action (SHA-pinned), yq v4 separators.

Based on #19618. Depends on #19617, #19395, #19580. Suggested by @janisz.

User-facing documentation

  • CHANGELOG.md update not needed
  • Documentation PR not needed

Testing and quality

  • Production ready
  • CI results inspected

Automated testing

  • modified existing tests

How I validated my change

  • 15+ CI runs across cold/warm cycles, all green
  • Cold: 11m wall-clock, 70m compute
  • Warm: 3.2m wall-clock, 54m compute

Partially generated by AI.

davdhacs and others added 12 commits March 27, 2026 22:59
Tests don't need real version info. Use fixed version strings
(0.0.0-test) instead of git-describe values for test ldflags.
This makes link ActionIDs stable across commits, enabling Go's
test result cache to hit.

Also adds -buildvcs=false to stop Go 1.18+'s VCS stamping which
independently changes link ActionIDs on every commit.

Builds keep the current XDef/status.sh mechanism unchanged.

Verified locally: tests cache across commits with fixed ldflags.

Combined with the mtime fix (#19395), this enables full warm
test caching in CI (2.5-8.2x faster test jobs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Go's test cache validates inputs by (path, size, mtime) — not content.
git checkout sets all file mtimes to the current time, which differs
between CI runs, causing test results to never be cached.

Fix: set all tracked file mtimes to a fixed date (2001-01-01) after
checkout and cache restore. This makes the test cache hit across runs
when file content hasn't changed.

Measured impact (from PR #19201):
- go unit tests: 35m → 8m (4.5x)
- go-postgres: 31m → 36s (52x)
- sensor-integration: 26m → 2.5m (11x)
- Test cache hit rate: 97.6% (681/698 packages)

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
git ls-files only outputs files, not directories. Tests that read
directories at runtime (os.ReadDir, filepath.Walk) check directory
mtimes for cache validation. Without stable dir mtimes, these tests
always miss the cache.

The helm chart tests (pkg/helm/charts/tests/*) use testdata directories
and were the largest uncached tests at 712s combined. This fix caches
them, dropping the Go Unit Tests step from 486s to 85s.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… entries

pre-build-go-binaries: default and prerelease (GOTAGS=release) share 93%
of cache entries (only 6 packages use release build tags). Separating them
doubles cache storage and causes the trim to delete shared entries. Keep
key-suffix only for race-condition-debug which uses -race/CGO_ENABLED=1
and has 39% unique entries.

build-and-push-operator: branding (RHACS vs STACKROX) is a runtime env
var, not a build tag. Compiled output is identical. Remove key-suffix
entirely.

Verified locally:
  default:    8911 entries (baseline)
  prerelease: +691 new (7% unique, 93% shared)
  race:       +6300 new (39% unique, 61% shared)
  branding:   +0 new (0% unique, 100% shared)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
unit-tests.yaml: go and go-postgres have GOTAGS matrix (""/ release)
sharing one cache key. With the GOCACHE trim, each variant deletes
the other's entries on every run, causing 3GB of churn. Add key-suffix
to give each variant its own cache.

scanner-build.yaml: pre-build-scanner-go-binary cross-compiles for
multiple architectures but doesn't set GOARCH on the cache step,
so all arches share one amd64-labeled cache key. Add env GOARCH.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address sourcery-ai review: make GHA expression grouping explicit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The operator branding key-suffix is unnecessary for GOCACHE (0% unique
entries), but PR #19417 will unify operator builds entirely. Removing
the key-suffix here would create a merge conflict with that PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address sourcery-ai feedback: clarify why only race-condition-debug
gets a separate cache key (39% unique entries from -race/CGO_ENABLED=1)
while default and prerelease share (93% overlap).

GOARCH is intentionally set only on the cache step, not job-level,
to avoid changing the default arch for other steps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the CI container image from go, go-postgres, go-bench,
local-roxctl-tests, shell-unit-tests, openshift-ci-unit-tests,
and ui jobs. Use actions/setup-go and actions/setup-node instead.
Saves ~55s container init per job.

For postgres jobs, use docker run instead of the built-in postgres
from the CI image. ui-component keeps its container (Cypress needs
system libraries not on ubuntu-latest).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 29, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path=".github/actions/cache-go-dependencies/action.yaml" line_range="99-108" />
<code_context>
       run: make deps --always-make
       shell: bash
+
+    - name: Stabilize file mtimes for Go test cache
+      run: |
+        # Go's test cache validates inputs by (path, size, mtime) — NOT content.
+        # git checkout sets all mtimes to "now", which differs between CI runs,
+        # causing every test to miss the cache. Setting a fixed mtime makes the
+        # test cache hit across runs when file content hasn't changed.
+        #
+        # Risk: if a non-.go testdata file changes content but not size between
+        # commits, the test cache could return a stale result. This is extremely
+        # rare and Go's own test cache has the same inherent limitation.
+        git ls-files | xargs touch -t 200101010000
+        # Also stabilize directory mtimes — tests using os.ReadDir or
+        # filepath.Walk check directory mtimes too. git ls-files only
</code_context>
<issue_to_address>
**issue (bug_risk):** Use NUL-delimited output from `git ls-files` to handle paths with spaces or special characters safely.

The current `git ls-files | xargs touch -t ...` pipeline will mis-handle paths containing spaces, quotes, or other shell-special characters because `xargs` splits on whitespace. Using NUL-delimited output avoids that:

```bash
git ls-files -z | xargs -0 touch -t 200101010000
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

- Add bats-core/bats-action@4.0.0 for shell-unit-tests and local-roxctl-tests
- Remove POSTGRES_USER="$USER" — tests expect default 'postgres' role
- Keep openshift-ci-unit-tests in container (needs pycodestyle/pylint)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@davdhacs davdhacs force-pushed the davdhacs/matrix-unittests branch from 4c72a6f to 0df8a9d Compare March 29, 2026 05:03
The test helpers (test_helpers.bats, helpers.bash) expect bats-support
and bats-assert at /usr/lib/node_modules/ (CI image layout). Configure
bats-action to install there. Skip detik and file libs (not used).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rhacs-bot
Copy link
Copy Markdown
Contributor

rhacs-bot commented Mar 29, 2026

Images are ready for the commit at e803987.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-538-g556daecc16.

- TestContainerDetection: remove container assumption, just log result
- TestExtractZipToFolder_PreventPathTraversal: accept any error (not just
  ErrNotExist) since non-root gets ErrPermission on /root/
- Postgres locale: use --locale=C to match CI container collation
  (default en_US.utf8 sorts differently, failing splunk tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
davdhacs and others added 2 commits March 29, 2026 07:54
getTrimmedFilePath only knew the container path (/__w/), not the host
runner path (/home/runner/work/). Without the container, file paths
in log mock expectations didn't match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In bash 5.2+, & in ${var//pattern/replacement} refers to the matched
pattern, not a literal &. This broke XML escaping in save_junit_failure
where &quot; became "quot; (& expanded to the matched ").

Prefix & with \ in all replacement strings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 28.87%. Comparing base (a90efa3) to head (556daec).

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #19678       +/-   ##
===========================================
- Coverage   49.64%   28.87%   -20.78%     
===========================================
  Files        2747     3989     +1242     
  Lines      207261   269256    +61995     
===========================================
- Hits       102905    77752    -25153     
- Misses      96702   186299    +89597     
+ Partials     7654     5205     -2449     
Flag Coverage Δ
go-unit-tests 28.87% <100.00%> (-20.78%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

davdhacs and others added 3 commits March 29, 2026 08:52
actions/setup-go sets GOTOOLCHAIN=local which blocks sub-module tools
needing newer Go (e.g. protoc-gen-go-grpc needs go 1.25.6). Override
with GOTOOLCHAIN=auto so they can auto-download.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
yq v4.x outputs --- separators between multi-document results.
Add yq_multidoc helper to helpers.bash that strips them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
davdhacs and others added 8 commits March 29, 2026 21:43
The CI container added 77s init overhead. Cypress installs its own
binary via npm ci. Ubuntu-latest has the required system libraries
for Cypress (libgtk, libnss, etc.) since ubuntu-22.04.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The cypress action caches the Cypress binary (~200MB) automatically,
avoiding re-download/verification on each run. Also handles npm ci
and test execution in one step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
make build-prep → deps → go mod tidy downloads ALL modules even when
tests are cached. Replace make targets with direct go-test.sh calls
for go shards, go-postgres, and go-integration. go mod tidy is already
validated in the style workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- go-postgres: restore make target call (bypassing deps caused foreign
  key constraint errors). Also restore GOMODCACHE (needed for make deps).
- ui-component: add continue-on-error to junit2jira (cypress action
  outputs to different path, tests themselves pass).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add SKIP_DEPS guard to Makefile deps target. When set, skips go.sum
dependency chain (go mod tidy) which downloads all modules. Test jobs
with warm GOCACHE don't need module downloads. go mod tidy validation
is handled by the style workflow.

Enable SKIP_DEPS on go-postgres and local-roxctl-tests, re-enable
skip-mod-cache on go-postgres.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CYPRESS_VIDEO=false: skip video recording (saves disk + upload time)
- CYPRESS_SCREENSHOT_ON_RUN_FAILURE=true: keep screenshots on failure
- Upload artifacts only on failure (was always)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Flipping the default would break build/style workflows. Keep default
false (safe for build/compile jobs), set true explicitly on test jobs
that don't need GOMODCACHE. Operator-integration keeps false (needs
protoc tools).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
davdhacs and others added 3 commits March 30, 2026 12:00
Replace skip-mod-cache with cache-mod (default false) from #19688.
GOMODCACHE disabled by default — Go lazy-downloads from proxy.
Remove make deps from action, go mod tidy from build.yaml.
Scanner proto fixes: targeted go mod download.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@davdhacs davdhacs closed this Mar 30, 2026
@davdhacs davdhacs reopened this Mar 30, 2026
Master merge re-added integration-unit-tests, operator test-integration,
and operator test-helm to the go sharded job. These have their own
dedicated jobs (go-integration, go-operator-integration). Also add
test-helm to go-operator-integration (new step from master).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Chores

    • Optimized CI: optional module caching, per-architecture cache keys, removed redundant pre-build steps, and moved many jobs to host-run sharded matrices for faster, more granular runs.
    • Makefile and build scripts updated to support shard-aware targets and conditional dependency handling.
    • Improved JUnit/XML escaping and log file-path trimming for clearer CI output.
  • Tests

    • Added and split integration test jobs (including operator/scanner areas) and updated test runners and artifacts.
    • Introduced a yq helper for multi-document YAML and adjusted several Bats tests.
    • Relaxed a few test assertions to be less environment-dependent.

Walkthrough

Split and host-run CI test jobs into sharded matrices, added conditional Go module cache control, removed some explicit module-resolution steps, adjusted Go/tooling scripts and Makefile targets, and updated tests/tools for multi-document YAML handling and CI path normalization.

Changes

Cohort / File(s) Summary
GitHub Actions: cache & build workflows
.github/actions/cache-go-dependencies/action.yaml, .github/workflows/build.yaml, .github/workflows/scanner-build.yaml
Added cache-mod input to control GOMODCACHE cache/restore; removed explicit go mod tidy steps from build workflows; added GOARCH env on scanner cache step for arch-scoped caching.
Unit test CI restructuring
.github/workflows/unit-tests.yaml
Rewrote go job into a host-run matrix with matrix.shard splits, added go-integration and go-operator-integration, removed containerized Postgres in favor of docker-run service, updated caching keys to include shard/gotags, moved/centralized junit->Jira reporting, switched to scripts/go-test.sh, and updated many test job behaviors (ui, bats, component tests).
Makefile & protogen
Makefile, make/protogen.mk
Added SKIP_DEPS handling to deps target, split go-postgres-unit-tests into main/migrator, adjusted test tee/overwrite behavior, removed -count=1 from integration invocation, and added go mod download pre-step when discovering SCANNER_DIR.
Tooling & CI scripts
scripts/go-tool.sh, scripts/ci/lib.sh, qa-tests-backend/scripts/copy_scanner_protos.sh
scripts/go-tool.sh skips status.sh parsing for TOOL=test and uses fixed ldflags for test builds; scripts/ci/lib.sh tightened JUnit XML escaping replacements; scanner proto copy now runs go mod download before locating module dir.
Tests: behavior & assertions
pkg/containers/detection_test.go, roxctl/common/zipdownload/download_zip_test.go
TestContainerDetection no longer asserts CI-local container expectations (only logs value); zip extraction test relaxed to assert any error from os.Stat rather than specifically fs.ErrNotExist.
Bats tests & helpers
tests/roxctl/bats-tests/helpers.bash, tests/roxctl/bats-tests/local/roxctl-netpol-generate-development.bats, tests/roxctl/bats-tests/local/roxctl-netpol-generate-release.bats
Added yq_multidoc() helper to strip --- separators; replaced yq e calls with yq_multidoc e across multi-document YAML tests and removed obsolete commented yq v4 blocks.
Logging path normalization
pkg/logging/rate_limited_logger.go
Added GitHub Actions runner path prefix trimming so CI file paths produce consistent log keys.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Runner
participant SetupGo as actions/setup-go
participant CacheAction as cache-go-dependencies
participant DockerPostgres as Docker Postgres
participant Tester as scripts/go-test.sh
participant JUnit2Jira as junit2jira

Runner->>SetupGo: setup Go (go-version-file / GOTOOLCHAIN)
Runner->>CacheAction: run cache action (cache-mod & GOARCH/gotags key)
alt Postgres needed
    Runner->>DockerPostgres: docker run postgres service
    DockerPostgres-->>Runner: pg_isready polling
end
Runner->>Tester: run tests (scripts/go-test.sh, shard-specific PACKAGES)
Tester-->>Runner: test logs/artifacts (tee -> test-output)
Runner->>JUnit2Jira: publish results (junit2jira step)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'perf(ci): remove container from unit-tests jobs' is specific and directly reflects the main change: removing container images from CI unit-test jobs.
Description check ✅ Passed The PR description is comprehensive and covers the core requirements: it details the changes, provides metrics validating the improvement claim, explains testing approach (modified existing tests, 15+ CI runs), and notes no documentation changes needed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch davdhacs/matrix-unittests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/scanner-build.yaml:
- Around line 104-107: The job using the composite action "Cache Go
dependencies" currently calls uses: ./.github/actions/cache-go-dependencies
without enabling the module cache; set the composite action input that enables
GOMODCACHE restore by adding a with: cache-mod: 'true' (alongside the existing
env: GOARCH: ${{ matrix.goarch }}), so the action restores the
architecture-independent module cache across matrix entries and avoids
re-downloading modules for each GOARCH.

In @.github/workflows/unit-tests.yaml:
- Around line 252-257: The readiness loops that run the shell snippet "for i in
$(seq 1 30); do docker exec postgres pg_isready -h 127.0.0.1 && break; sleep 1;
done" currently continue even on timeout; change each such loop so after it
finishes you test whether pg_isready succeeded (e.g., check the loop result or
call docker exec pg_isready once more), and if it failed print "docker logs
postgres" to stdout/stderr and exit with non-zero (exit 1) so the workflow step
fails fast on timeout. Update both occurrences of that loop to include the
post-loop failure check, logs dump, and exit.
- Around line 441-447: The job's matrix defines matrix.variant.env but the
"Local roxctl tests" step does not use it; update the "Local roxctl tests" step
to consume the matrix variable by either adding an env: block that sets
NO_BATS_ROXCTL_REBUILD using ${{ matrix.variant.env }} or by inlining the value
when running bats (for example prefix the command with ${{ matrix.variant.env
}}); target the "Local roxctl tests" step in the workflow and ensure it
references matrix.variant.env so the NO_BATS_ROXCTL_REBUILD=1 optimization is
actually applied.

In `@make/protogen.mk`:
- Line 37: SCANNER_DIR is using recursive expansion ("=") which runs the
embedded go mod download during Makefile parse and causes module fetch latency;
change its definition to an immediate/simple assignment using ":=" so the shell
command executes once and the value is cached, preventing repeated parse-time
downloads that also affect SCANNER_PROTO_BASE_PATH, ALL_SCANNER_PROTOS and
SCANNER_M_ARGS downstream references.

In `@Makefile`:
- Around line 560-576: The migrator subtarget is truncating the shared
GO_TEST_OUTPUT_PATH and overwriting the main shard's logs; update the
go-postgres-unit-tests-migrator recipe so its final pipe uses append mode
(change its "tee $(GO_TEST_OUTPUT_PATH)" to "tee -a $(GO_TEST_OUTPUT_PATH)") so
migrator results are appended instead of replacing the main shard output (leave
go-postgres-unit-tests-main behavior as the initial/overwrite writer).

In `@pkg/containers/detection_test.go`:
- Around line 7-10: TestContainerDetection is non-verifying because it only logs
IsRunningInContainer(); refactor the detection logic to be testable and add
deterministic unit tests that assert expected results using controlled
inputs/fixtures. Specifically, make the core detection functions accept
injectable sources (e.g., readers/paths or an interface used by
IsRunningInContainer) or extract parsing helpers (e.g., parseCgroup, parseProc1)
so you can write table-driven tests that call those helpers with fixture data;
then update TestContainerDetection to either assert the smoke behavior against a
mocked source or keep it as a simple runtime smoke test but add new unit tests
(e.g., TestParseCgroup_Docker, TestParseProc1_Systemd) that validate
deterministic outcomes. Ensure references: IsRunningInContainer,
TestContainerDetection, and any new helper names you create are used so
reviewers can locate and verify the changes.

In `@tests/roxctl/bats-tests/helpers.bash`:
- Around line 14-16: The helper yq_multidoc currently masks failures because the
pipeline returns grep's exit code; modify yq_multidoc so the pipeline preserves
yq's exit status (e.g., run the pipeline under pipefail within the function) and
ensure the function exits with yq's error code when yq fails while still
filtering out '---' lines with grep -v '^---$'; update the yq_multidoc function
to use that pattern so callers reliably see yq failures.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 5ba8def7-56ea-4d28-b90b-47cc33faaf6c

📥 Commits

Reviewing files that changed from the base of the PR and between a90efa3 and 8351f40.

📒 Files selected for processing (15)
  • .github/actions/cache-go-dependencies/action.yaml
  • .github/workflows/build.yaml
  • .github/workflows/scanner-build.yaml
  • .github/workflows/unit-tests.yaml
  • Makefile
  • make/protogen.mk
  • pkg/containers/detection_test.go
  • pkg/logging/rate_limited_logger.go
  • qa-tests-backend/scripts/copy_scanner_protos.sh
  • roxctl/common/zipdownload/download_zip_test.go
  • scripts/ci/lib.sh
  • scripts/go-tool.sh
  • tests/roxctl/bats-tests/helpers.bash
  • tests/roxctl/bats-tests/local/roxctl-netpol-generate-development.bats
  • tests/roxctl/bats-tests/local/roxctl-netpol-generate-release.bats
💤 Files with no reviewable changes (1)
  • .github/workflows/build.yaml

Comment on lines +252 to 257
run: |
for i in $(seq 1 30); do
docker exec postgres pg_isready -h 127.0.0.1 && break
sleep 1
done

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check if the workflow file exists and examine the structure
if [ -f .github/workflows/unit-tests.yaml ]; then
  echo "File exists. Checking lines 252-257:"
  sed -n '252,257p' .github/workflows/unit-tests.yaml
  echo ""
  echo "Checking lines 322-327:"
  sed -n '322,327p' .github/workflows/unit-tests.yaml
  echo ""
  echo "Total lines in file:"
  wc -l .github/workflows/unit-tests.yaml
else
  echo "File not found"
fi

Repository: stackrox/stackrox

Length of output: 487


🏁 Script executed:

# Search for all readiness check blocks in the file to see the pattern
rg -n "pg_isready" .github/workflows/unit-tests.yaml -B 3 -A 3

Repository: stackrox/stackrox

Length of output: 463


🏁 Script executed:

# Check the complete context around these lines to understand the shell script behavior
sed -n '248,260p' .github/workflows/unit-tests.yaml
echo "---"
sed -n '318,330p' .github/workflows/unit-tests.yaml

Repository: stackrox/stackrox

Length of output: 670


🏁 Script executed:

# Verify there are no additional pg_isready blocks
rg -c "pg_isready" .github/workflows/unit-tests.yaml
rg -n "pg_isready" .github/workflows/unit-tests.yaml

Repository: stackrox/stackrox

Length of output: 198


Make the readiness checks fail on timeout.

Both loops exit successfully after 30 sleeps even if pg_isready never returns 0, so the workflow only fails later with misleading database errors. The readiness step should exit non-zero itself and ideally dump docker logs postgres on timeout.

Suggested change
     - name: Is Postgres ready
       run: |
-        for i in $(seq 1 30); do
-          docker exec postgres pg_isready -h 127.0.0.1 && break
-          sleep 1
-        done
+        ready=false
+        for i in $(seq 1 30); do
+          if docker exec postgres pg_isready -h 127.0.0.1; then
+            ready=true
+            break
+          fi
+          sleep 1
+        done
+        $ready || { docker logs postgres; exit 1; }

Also applies to lines 307-310.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
run: |
for i in $(seq 1 30); do
docker exec postgres pg_isready -h 127.0.0.1 && break
sleep 1
done
run: |
ready=false
for i in $(seq 1 30); do
if docker exec postgres pg_isready -h 127.0.0.1; then
ready=true
break
fi
sleep 1
done
$ready || { docker logs postgres; exit 1; }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/unit-tests.yaml around lines 252 - 257, The readiness
loops that run the shell snippet "for i in $(seq 1 30); do docker exec postgres
pg_isready -h 127.0.0.1 && break; sleep 1; done" currently continue even on
timeout; change each such loop so after it finishes you test whether pg_isready
succeeded (e.g., check the loop result or call docker exec pg_isready once
more), and if it failed print "docker logs postgres" to stdout/stderr and exit
with non-zero (exit 1) so the workflow step fails fast on timeout. Update both
occurrences of that loop to include the post-loop failure check, logs dump, and
exit.

Comment on lines +441 to +447
variant:
- name: development
env: NO_BATS_ROXCTL_REBUILD=1
files: tests/roxctl/bats-tests/local/*development* tests/roxctl/bats-tests/local/central-generate-interactive-flavors.bats tests/roxctl/bats-tests/local/central-help-output.bats tests/roxctl/bats-tests/local/deployment-bundles-psps.bats tests/roxctl/bats-tests/local/expect
- name: release
env: NO_BATS_ROXCTL_REBUILD=1
files: tests/roxctl/bats-tests/local/*release*
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/unit-tests.yaml | sed -n '430,500p'

Repository: stackrox/stackrox

Length of output: 2807


🏁 Script executed:

cat -n .github/workflows/unit-tests.yaml | sed -n '410,480p'

Repository: stackrox/stackrox

Length of output: 3092


🏁 Script executed:

grep -n "matrix.variant" .github/workflows/unit-tests.yaml | head -20

Repository: stackrox/stackrox

Length of output: 160


🏁 Script executed:

grep -n "strategy:" .github/workflows/unit-tests.yaml

Repository: stackrox/stackrox

Length of output: 149


🏁 Script executed:

cat -n .github/workflows/unit-tests.yaml | sed -n '480,550p'

Repository: stackrox/stackrox

Length of output: 2682


🏁 Script executed:

cat -n .github/workflows/unit-tests.yaml | sed -n '185,250p'

Repository: stackrox/stackrox

Length of output: 2450


🏁 Script executed:

rg -B5 "^\s+env:" .github/workflows/unit-tests.yaml | head -40

Repository: stackrox/stackrox

Length of output: 1175


🏁 Script executed:

cat -n .github/workflows/unit-tests.yaml | sed -n '420,495p'

Repository: stackrox/stackrox

Length of output: 3187


🏁 Script executed:

grep -n "env:" .github/workflows/unit-tests.yaml | grep -A2 -B2 "matrix"

Repository: stackrox/stackrox

Length of output: 43


🏁 Script executed:

cat -n .github/workflows/unit-tests.yaml | wc -l

Repository: stackrox/stackrox

Length of output: 64


🏁 Script executed:

rg -A15 "matrix:" .github/workflows/unit-tests.yaml | grep -A15 "env:"

Repository: stackrox/stackrox

Length of output: 713


matrix.variant.env is not wired into the test step.

The matrix at lines 427 and 430 defines env: NO_BATS_ROXCTL_REBUILD=1, but the "Local roxctl tests" step (lines 470-477) never references ${{ matrix.variant.env }}. The variable is not exported or used inline, so both variants run with the default environment and this optimization is not applied. Wire it into the step's env: section or inline it before invoking bats.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/unit-tests.yaml around lines 441 - 447, The job's matrix
defines matrix.variant.env but the "Local roxctl tests" step does not use it;
update the "Local roxctl tests" step to consume the matrix variable by either
adding an env: block that sets NO_BATS_ROXCTL_REBUILD using ${{
matrix.variant.env }} or by inlining the value when running bats (for example
prefix the command with ${{ matrix.variant.env }}); target the "Local roxctl
tests" step in the workflow and ensure it references matrix.variant.env so the
NO_BATS_ROXCTL_REBUILD=1 optimization is actually applied.

Comment on lines 560 to +576
.PHONY: go-postgres-unit-tests
go-postgres-unit-tests: build-prep test-prep
go-postgres-unit-tests: go-postgres-unit-tests-main go-postgres-unit-tests-migrator

.PHONY: go-postgres-unit-tests-main
go-postgres-unit-tests-main: build-prep test-prep
set -o pipefail ; \
CGO_ENABLED=1 GOEXPERIMENT=cgocheck2 MUTEX_WATCHDOG_TIMEOUT_SECS=30 GOTAGS=$(GOTAGS),test,sql_integration scripts/go-test.sh -timeout 15m -race -cover -coverprofile test-output/coverage.out -v \
$(shell git grep -rl "//go:build sql_integration" central pkg tools | sed -e 's@^@./@g' | xargs -n 1 dirname | sort | uniq | xargs go list -tags sql_integration | grep -v '^github.com/stackrox/rox/tests$$' | grep -Ev $(UNIT_TEST_IGNORE)) \
| tee $(GO_TEST_OUTPUT_PATH)

.PHONY: go-postgres-unit-tests-migrator
go-postgres-unit-tests-migrator: build-prep test-prep
@# The -p 1 passed to go test is required to ensure that tests of different packages are not run in parallel, so as to avoid conflicts when interacting with the DB.
set -o pipefail ; \
CGO_ENABLED=1 GOEXPERIMENT=cgocheck2 MUTEX_WATCHDOG_TIMEOUT_SECS=30 GOTAGS=$(GOTAGS),test,sql_integration scripts/go-test.sh -p 1 -race -cover -coverprofile test-output/migrator-coverage.out -v \
$(shell git grep -rl "//go:build sql_integration" migrator | sed -e 's@^@./@g' | xargs -n 1 dirname | sort | uniq | xargs go list -tags sql_integration | grep -v '^github.com/stackrox/rox/tests$$' | grep -Ev $(UNIT_TEST_IGNORE)) \
| tee -a $(GO_TEST_OUTPUT_PATH)
| tee $(GO_TEST_OUTPUT_PATH)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check the specific lines mentioned (560-576)
sed -n '560,576p' Makefile

Repository: stackrox/stackrox

Length of output: 1496


🏁 Script executed:

# Check for GO_TEST_OUTPUT_PATH definition
rg "GO_TEST_OUTPUT_PATH\s*=" Makefile

Repository: stackrox/stackrox

Length of output: 111


🏁 Script executed:

# Check the context around the postgres test targets
sed -n '545,580p' Makefile

Repository: stackrox/stackrox

Length of output: 2425


🏁 Script executed:

# Look for generate-junit-reports target
rg -A 5 "generate-junit-reports" Makefile

Repository: stackrox/stackrox

Length of output: 335


go-postgres-unit-tests loses the main shard's log when both subtargets run.

When the parent target executes both go-postgres-unit-tests-main and go-postgres-unit-tests-migrator, the second subtarget's tee command truncates $(GO_TEST_OUTPUT_PATH) instead of appending. This overwrites the main shard's output, causing downstream consumers like generate-junit-reports to only see migrator results.

Suggested change
-		| tee $(GO_TEST_OUTPUT_PATH)
+		| tee -a $(GO_TEST_OUTPUT_PATH)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 560 - 576, The migrator subtarget is truncating the
shared GO_TEST_OUTPUT_PATH and overwriting the main shard's logs; update the
go-postgres-unit-tests-migrator recipe so its final pipe uses append mode
(change its "tee $(GO_TEST_OUTPUT_PATH)" to "tee -a $(GO_TEST_OUTPUT_PATH)") so
migrator results are appended instead of replacing the main shard output (leave
go-postgres-unit-tests-main behavior as the initial/overwrite writer).

Comment on lines +7 to +10
// Verify container detection runs without error. The result depends on the
// environment (container vs host runner) and both are valid CI configurations.
func TestContainerDetection(t *testing.T) {
if _, ok := os.LookupEnv("GITHUB_ACTIONS"); ok {
assert.True(t, IsRunningInContainer())
} else {
assert.False(t, IsRunningInContainer())
}
t.Logf("IsRunningInContainer() = %v", IsRunningInContainer())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

This test is non-verifying and now always passes.

TestContainerDetection currently only logs output, so regressions in IsRunningInContainer() won’t be caught. If CI can run in both host/container modes, keep this as a smoke test only alongside deterministic unit tests that assert behavior via controlled inputs (e.g., injectable detection sources/fixtures), instead of runtime-environment expectations.

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/containers/detection_test.go` around lines 7 - 10, TestContainerDetection
is non-verifying because it only logs IsRunningInContainer(); refactor the
detection logic to be testable and add deterministic unit tests that assert
expected results using controlled inputs/fixtures. Specifically, make the core
detection functions accept injectable sources (e.g., readers/paths or an
interface used by IsRunningInContainer) or extract parsing helpers (e.g.,
parseCgroup, parseProc1) so you can write table-driven tests that call those
helpers with fixture data; then update TestContainerDetection to either assert
the smoke behavior against a mocked source or keep it as a simple runtime smoke
test but add new unit tests (e.g., TestParseCgroup_Docker,
TestParseProc1_Systemd) that validate deterministic outcomes. Ensure references:
IsRunningInContainer, TestContainerDetection, and any new helper names you
create are used so reviewers can locate and verify the changes.

Comment on lines +14 to +16
yq_multidoc() {
yq "$@" | grep -v '^---$'
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== pipefail settings in roxctl bats tests =="
rg -n --iglob '*.{bash,bats}' 'set -o pipefail|set -euo pipefail' tests/roxctl/bats-tests

echo
echo "== yq_multidoc helper and call sites =="
rg -n --iglob '*.{bash,bats}' '\byq_multidoc\b|yq "\$@" \| grep -v' tests/roxctl/bats-tests

Repository: stackrox/stackrox

Length of output: 105


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== helpers.bash content (lines 10-25) ==="
head -25 tests/roxctl/bats-tests/helpers.bash | tail -16

echo
echo "=== Search for yq_multidoc usage (broader pattern) ==="
rg -n 'yq_multidoc' tests/roxctl/bats-tests/ || echo "No matches found"

echo
echo "=== Check for global pipefail in test setup/config ==="
rg -n 'pipefail' tests/roxctl/bats-tests/ || echo "No pipefail found"

echo
echo "=== Check shebang and initial lines of helpers.bash ==="
head -5 tests/roxctl/bats-tests/helpers.bash

Repository: stackrox/stackrox

Length of output: 2774


Preserve yq failure status in yq_multidoc.

Line 15 masks yq failures because the pipeline will return the exit code of grep only when set -o pipefail is not enabled. Since no pipefail setting exists in the test area and this helper is used across 11 test cases, a yq failure can produce a false-positive (exit 0) and hide errors.

🔧 Proposed fix
 yq_multidoc() {
-  yq "$@" | grep -v '^---$'
+  ( set -o pipefail; yq "$@" | grep -v '^---$' )
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
yq_multidoc() {
yq "$@" | grep -v '^---$'
}
yq_multidoc() {
( set -o pipefail; yq "$@" | grep -v '^---$' )
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/roxctl/bats-tests/helpers.bash` around lines 14 - 16, The helper
yq_multidoc currently masks failures because the pipeline returns grep's exit
code; modify yq_multidoc so the pipeline preserves yq's exit status (e.g., run
the pipeline under pipefail within the function) and ensure the function exits
with yq's error code when yq fails while still filtering out '---' lines with
grep -v '^---$'; update the yq_multidoc function to use that pattern so callers
reliably see yq failures.

- protogen.mk: use := (simple) instead of = (recursive) to avoid
  re-running go mod download on every make invocation
- unit-tests: remove NO_BATS_ROXCTL_REBUILD env from matrix (unused,
  each shard only builds one roxctl variant anyway)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • The new Go unit-test sharding logic (matrix.shard.packages + grep -Ev exclude-pattern) is fairly regex-heavy and implicit; consider extracting this into a small script or helper with tests so future package moves don’t silently drop or duplicate coverage across shards.
  • The Cache timing debug step in unit-tests.yaml will run on every CI execution and may add noise/overhead long-term; you might want to gate it behind a debug flag or remove it once the cache behavior is validated.
  • The go-tool.sh change relies on TOOL=test to select fixed ldflags for caching; it may be worth centralizing or validating how TOOL is set for all test entry points to avoid inconsistent behavior between different go test invocations.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new Go unit-test sharding logic (matrix.shard.packages + grep -Ev exclude-pattern) is fairly regex-heavy and implicit; consider extracting this into a small script or helper with tests so future package moves don’t silently drop or duplicate coverage across shards.
- The `Cache timing debug` step in `unit-tests.yaml` will run on every CI execution and may add noise/overhead long-term; you might want to gate it behind a debug flag or remove it once the cache behavior is validated.
- The `go-tool.sh` change relies on `TOOL=test` to select fixed ldflags for caching; it may be worth centralizing or validating how `TOOL` is set for all test entry points to avoid inconsistent behavior between different `go test` invocations.

## Individual Comments

### Comment 1
<location path="scripts/ci/lib.sh" line_range="2340-2344" />
<code_context>
-        description="${description//>/&gt;}"
+        # XML escape description (\ before & needed for bash 5.2+ where
+        # & in replacement refers to the matched pattern)
+        description="${description//&/\&amp;}"
+        description="${description//\"/\&quot;}"
+        description="${description//\'/\&#39;}"
+        description="${description//</\&lt;}"
+        description="${description//>/\&gt;}"

         cat << _EO_CASE_HEADER_ >> "${junit_file}"
</code_context>
<issue_to_address>
**issue (bug_risk):** The added backslashes in the XML-escape replacements will produce invalid escaped output.

In bash parameter expansion the replacement part is literal; `&` is not treated specially. Prefixing it with a backslash causes an actual backslash to appear in the output, e.g. `
&amp;` instead of `&amp;`, which corrupts the XML. The original replacements without backslashes (`"${description//&/&amp;}"`, etc.) are correct in bash 5.2+. If you saw an issue specific to 5.2, please reproduce it in a minimal script rather than changing the escaping logic here.
</issue_to_address>

### Comment 2
<location path="make/protogen.mk" line_range="37" />
<code_context>
 GENERATED_API_SWAGGER_SPECS_V2 = $(API_SERVICE_PROTOS_V2:%.proto=$(GENERATED_BASE_PATH)/%.swagger.json)

-SCANNER_DIR = $(shell go list -f '{{.Dir}}' -m github.com/stackrox/scanner)
+SCANNER_DIR := $(shell go mod download github.com/stackrox/scanner >/dev/null 2>&1; go list -f '{{.Dir}}' -m github.com/stackrox/scanner)
 ifneq ($(SCANNER_DIR),)
 SCANNER_PROTO_BASE_PATH = $(SCANNER_DIR)/proto
</code_context>
<issue_to_address>
**suggestion (performance):** Running `go mod download` unconditionally at variable expansion time may introduce unnecessary network and performance overhead.

Because `SCANNER_DIR` uses `:=` with an inline `go mod download`, this runs at Makefile evaluation for any invocation that includes this file, even when scanner protos aren’t used. That adds startup latency and can cause unexpected network access or offline failures. Instead, move the download into a dedicated target (e.g., as a dependency of the protogen targets) or guard it conditionally (e.g., only run if `go list` fails) so it doesn’t execute on every `make` run.

```suggestion
SCANNER_DIR := $(shell go list -f '{{.Dir}}' -m github.com/stackrox/scanner 2>/dev/null || (go mod download github.com/stackrox/scanner >/dev/null 2>&1 && go list -f '{{.Dir}}' -m github.com/stackrox/scanner))
```
</issue_to_address>

### Comment 3
<location path="roxctl/common/zipdownload/download_zip_test.go" line_range="210-211" />
<code_context>
 		_, err := os.Stat(path)
-		// Expect "no such file or directory" - meaning the file wasn't created
-		assert.ErrorIs(t, err, fs.ErrNotExist, "Malicious file should not exist at %s", path)
+		// Expect an error (not exist or permission denied) - meaning the file wasn't created
+		assert.Error(t, err, "Malicious file should not be accessible at %s", path)
 	}
 }
</code_context>
<issue_to_address>
**suggestion (testing):** Relaxing the assertion to any error weakens the guarantee that path traversal is prevented.

The previous `assert.ErrorIs(t, err, fs.ErrNotExist)` ensured we specifically failed when the file didn’t exist. Using only `assert.Error` now would also pass on unrelated errors (e.g. earlier I/O failures), which can mask regressions. Since we want to permit `ErrNotExist` or `ErrPermission`, please assert those explicitly, for example:

```go
require.Error(t, err, "Malicious file should not be accessible at %s", path)
require.True(t, errors.Is(err, fs.ErrNotExist) || errors.Is(err, fs.ErrPermission),
    "unexpected error for malicious path %s: %v", path, err)
```

This keeps the test environment-agnostic while still verifying the intended security behavior precisely.

Suggested implementation:

```golang
	"bytes"
	"errors"
	"io/fs"
	"os"
	"path/filepath"
	"strings"

```

```golang
	for _, path := range checkPaths {
		_, err := os.Stat(path)
		// Expect an error (not exist or permission denied) - meaning the file wasn't created
		require.Error(t, err, "Malicious file should not be accessible at %s", path)
		require.True(t, errors.Is(err, fs.ErrNotExist) || errors.Is(err, fs.ErrPermission),
			"unexpected error for malicious path %s: %v", path, err)
	}
}

```

1. Ensure `"github.com/stretchr/testify/require"` is imported in this file, similar to the existing `"github.com/stretchr/testify/assert"` import.
2. If `assert` is no longer used elsewhere in this test file, you can safely remove the `assert` import; otherwise, keep it.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +2340 to +2344
description="${description//&/\&amp;}"
description="${description//\"/\&quot;}"
description="${description//\'/\&#39;}"
description="${description//</\&lt;}"
description="${description//>/\&gt;}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): The added backslashes in the XML-escape replacements will produce invalid escaped output.

In bash parameter expansion the replacement part is literal; & is not treated specially. Prefixing it with a backslash causes an actual backslash to appear in the output, e.g. &amp; instead of &amp;, which corrupts the XML. The original replacements without backslashes ("${description//&/&amp;}", etc.) are correct in bash 5.2+. If you saw an issue specific to 5.2, please reproduce it in a minimal script rather than changing the escaping logic here.

make/protogen.mk Outdated
GENERATED_API_SWAGGER_SPECS_V2 = $(API_SERVICE_PROTOS_V2:%.proto=$(GENERATED_BASE_PATH)/%.swagger.json)

SCANNER_DIR = $(shell go list -f '{{.Dir}}' -m github.com/stackrox/scanner)
SCANNER_DIR := $(shell go mod download github.com/stackrox/scanner >/dev/null 2>&1; go list -f '{{.Dir}}' -m github.com/stackrox/scanner)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (performance): Running go mod download unconditionally at variable expansion time may introduce unnecessary network and performance overhead.

Because SCANNER_DIR uses := with an inline go mod download, this runs at Makefile evaluation for any invocation that includes this file, even when scanner protos aren’t used. That adds startup latency and can cause unexpected network access or offline failures. Instead, move the download into a dedicated target (e.g., as a dependency of the protogen targets) or guard it conditionally (e.g., only run if go list fails) so it doesn’t execute on every make run.

Suggested change
SCANNER_DIR := $(shell go mod download github.com/stackrox/scanner >/dev/null 2>&1; go list -f '{{.Dir}}' -m github.com/stackrox/scanner)
SCANNER_DIR := $(shell go list -f '{{.Dir}}' -m github.com/stackrox/scanner 2>/dev/null || (go mod download github.com/stackrox/scanner >/dev/null 2>&1 && go list -f '{{.Dir}}' -m github.com/stackrox/scanner))

Comment on lines +210 to +211
// Expect an error (not exist or permission denied) - meaning the file wasn't created
assert.Error(t, err, "Malicious file should not be accessible at %s", path)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Relaxing the assertion to any error weakens the guarantee that path traversal is prevented.

The previous assert.ErrorIs(t, err, fs.ErrNotExist) ensured we specifically failed when the file didn’t exist. Using only assert.Error now would also pass on unrelated errors (e.g. earlier I/O failures), which can mask regressions. Since we want to permit ErrNotExist or ErrPermission, please assert those explicitly, for example:

require.Error(t, err, "Malicious file should not be accessible at %s", path)
require.True(t, errors.Is(err, fs.ErrNotExist) || errors.Is(err, fs.ErrPermission),
    "unexpected error for malicious path %s: %v", path, err)

This keeps the test environment-agnostic while still verifying the intended security behavior precisely.

Suggested implementation:

	"bytes"
	"errors"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
	for _, path := range checkPaths {
		_, err := os.Stat(path)
		// Expect an error (not exist or permission denied) - meaning the file wasn't created
		require.Error(t, err, "Malicious file should not be accessible at %s", path)
		require.True(t, errors.Is(err, fs.ErrNotExist) || errors.Is(err, fs.ErrPermission),
			"unexpected error for malicious path %s: %v", path, err)
	}
}
  1. Ensure "github.com/stretchr/testify/require" is imported in this file, similar to the existing "github.com/stretchr/testify/assert" import.
  2. If assert is no longer used elsewhere in this test file, you can safely remove the assert import; otherwise, keep it.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/unit-tests.yaml (2)

645-654: ⚠️ Potential issue | 🟡 Minor

Missing new integration jobs in Slack notification needs.

The go-integration and go-operator-integration jobs are not included in the needs list. Failures in these new jobs won't trigger Slack notifications.

Proposed fix
     needs:
       - go
       - go-bench
+      - go-integration
+      - go-operator-integration
       - go-postgres
       - local-roxctl-tests
       - openshift-ci-unit-tests
       - sensor-integration-tests
       - shell-unit-tests
       - ui
       - ui-component
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/unit-tests.yaml around lines 645 - 654, The Slack
notification "needs" list is missing the new integration jobs; update the YAML
`needs` array used for Slack notifications to include the job names
"go-integration" and "go-operator-integration" so failures in those jobs will
trigger notifications; locate the `needs:` block that currently contains items
like "go", "go-bench", "go-postgres", etc., and append "go-integration" and
"go-operator-integration" to that list.

664-664: ⚠️ Potential issue | 🟡 Minor

Duplicate reference and missing ui-component in mention_author.

needs.go.outputs.new-jiras appears twice. The second occurrence should likely be needs.ui-component.outputs.new-jiras since that job also has a new-jiras output.

Proposed fix
-        mention_author: ${{ needs.go.outputs.new-jiras || needs.go-postgres.outputs.new-jiras || needs.local-roxctl-tests.outputs.new-jiras || needs.ui.outputs.new-jiras || needs.go.outputs.new-jiras || needs.shell-unit-tests.outputs.new-jiras || needs.sensor-integration-tests.outputs.new-jiras }}
+        mention_author: ${{ needs.go.outputs.new-jiras || needs.go-postgres.outputs.new-jiras || needs.local-roxctl-tests.outputs.new-jiras || needs.ui.outputs.new-jiras || needs.ui-component.outputs.new-jiras || needs.shell-unit-tests.outputs.new-jiras || needs.sensor-integration-tests.outputs.new-jiras }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/unit-tests.yaml at line 664, The mention_author expression
contains a duplicated needs.go.outputs.new-jiras and is missing the ui-component
job; update the list used for mention_author to replace the second
needs.go.outputs.new-jiras with needs.ui-component.outputs.new-jiras so the
expression reads ...needs.go.outputs.new-jiras ||
needs.go-postgres.outputs.new-jiras ||
needs.local-roxctl-tests.outputs.new-jiras || needs.ui.outputs.new-jiras ||
needs.ui-component.outputs.new-jiras || needs.shell-unit-tests.outputs.new-jiras
|| needs.sensor-integration-tests.outputs.new-jiras (referencing the
mention_author variable and the jobs needs.go, needs.ui, needs.ui-component,
needs.go-postgres, needs.local-roxctl-tests, needs.shell-unit-tests, and
needs.sensor-integration-tests).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/unit-tests.yaml:
- Around line 114-144: The go-integration job is missing the GOTOOLCHAIN=auto
override; update the actions/setup-go@v6 step (the step with "uses:
actions/setup-go@v6" inside the go-integration job) to set GOTOOLCHAIN=auto
(e.g., add an env: GOTOOLCHAIN: auto on that step or otherwise export
GOTOOLCHAIN=auto before running setup-go) so the job uses the same toolchain
override as the other Go jobs.

---

Outside diff comments:
In @.github/workflows/unit-tests.yaml:
- Around line 645-654: The Slack notification "needs" list is missing the new
integration jobs; update the YAML `needs` array used for Slack notifications to
include the job names "go-integration" and "go-operator-integration" so failures
in those jobs will trigger notifications; locate the `needs:` block that
currently contains items like "go", "go-bench", "go-postgres", etc., and append
"go-integration" and "go-operator-integration" to that list.
- Line 664: The mention_author expression contains a duplicated
needs.go.outputs.new-jiras and is missing the ui-component job; update the list
used for mention_author to replace the second needs.go.outputs.new-jiras with
needs.ui-component.outputs.new-jiras so the expression reads
...needs.go.outputs.new-jiras || needs.go-postgres.outputs.new-jiras ||
needs.local-roxctl-tests.outputs.new-jiras || needs.ui.outputs.new-jiras ||
needs.ui-component.outputs.new-jiras || needs.shell-unit-tests.outputs.new-jiras
|| needs.sensor-integration-tests.outputs.new-jiras (referencing the
mention_author variable and the jobs needs.go, needs.ui, needs.ui-component,
needs.go-postgres, needs.local-roxctl-tests, needs.shell-unit-tests, and
needs.sensor-integration-tests).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: d063e5d8-7b74-4569-ae22-176d3fd899fc

📥 Commits

Reviewing files that changed from the base of the PR and between 8351f40 and e803987.

📒 Files selected for processing (2)
  • .github/workflows/unit-tests.yaml
  • make/protogen.mk

Comment on lines +114 to +144
go-integration:
strategy:
fail-fast: false
matrix:
gotags: [ 'GOTAGS=""', 'GOTAGS=release' ]
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v6

- name: Go Operator Helm Tests
run: ${{ matrix.gotags }} make -C operator/ test-helm
- uses: actions/setup-go@v6
with:
go-version-file: go.mod
cache: false

- uses: ./.github/actions/job-preamble
with:
gcp-account: ${{ secrets.GCP_SERVICE_ACCOUNT_STACKROX_CI }}

- name: Cache Go dependencies
uses: ./.github/actions/cache-go-dependencies
with:
key-suffix: ${{ matrix.gotags }}-integration

- name: Go Integration Unit Tests
run: |
mkdir -p test-output
set -o pipefail
PACKAGES=$(go list ./... | grep -E 'registries|scanners|notifiers')
${{ matrix.gotags }} GOTAGS="${GOTAGS:+$GOTAGS,}test,integration" CGO_ENABLED=1 GOEXPERIMENT=cgocheck2 MUTEX_WATCHDOG_TIMEOUT_SECS=30 \
scripts/go-test.sh -v $PACKAGES | tee test-output/test.log
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing GOTOOLCHAIN=auto override.

The go-integration job uses actions/setup-go but lacks the GOTOOLCHAIN=auto override that other jobs (like go, go-postgres, go-bench) include. This could cause issues if sub-module tools require newer Go versions.

Proposed fix
     - uses: actions/setup-go@v6
       with:
         go-version-file: go.mod
         cache: false

+    # setup-go sets GOTOOLCHAIN=local; override so sub-module tools
+    # with newer go directives can auto-download.
+    - name: Override GOTOOLCHAIN
+      run: echo "GOTOOLCHAIN=auto" >> "$GITHUB_ENV"
+
     - uses: ./.github/actions/job-preamble
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/unit-tests.yaml around lines 114 - 144, The go-integration
job is missing the GOTOOLCHAIN=auto override; update the actions/setup-go@v6
step (the step with "uses: actions/setup-go@v6" inside the go-integration job)
to set GOTOOLCHAIN=auto (e.g., add an env: GOTOOLCHAIN: auto on that step or
otherwise export GOTOOLCHAIN=auto before running setup-go) so the job uses the
same toolchain override as the other Go jobs.

Changed SCANNER_DIR and M_ARGS_STR from := (immediate) to = (lazy).
With :=, go mod download ran at Makefile parse time for EVERY make
invocation, even unrelated targets like make test or make clean.
With =, the shell command only runs when a recipe expands the variable
(i.e., proto generation targets).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants