perf(ci): remove container from style.yaml jobs, run on host#19309
perf(ci): remove container from style.yaml jobs, run on host#19309
Conversation
Remove the container: directive from all 4 container-based jobs in the Style workflow (check-generated-files, misc-checks, style-check, openshift-ci-lint) and run them directly on the ubuntu-latest runner. Container initialization consistently takes ~55s per job. Replacing it with on-host tool setup (setup-go, setup-node, yq, xmlstarlet, pip) costs ~5-15s, saving ~40-50s per job. Most tools (Go, Node.js, Make, GCC, Git, Helm, shellcheck, jq) are already preinstalled on the runner. For style-check, the shell-based tool installations (yq, xmlstarlet, pip) run as parallel background subshells to minimize wall-clock time. Also updates cache-ui-dependencies to use portable ~/. paths instead of container-specific /github/home/ paths, with a cache key bump (v2->v3). Generated with the assistance of AI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The yq installation logic is duplicated across multiple jobs; consider extracting this into a small composite action or reusable workflow step so version bumps or install changes are centralized.
- The workflow uses a bare
pipcommand on the hosted runner; switching topython3 -m pip(or explicitly setting up a Python version) would make the linter install more deterministic and robust across runner images.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The yq installation logic is duplicated across multiple jobs; consider extracting this into a small composite action or reusable workflow step so version bumps or install changes are centralized.
- The workflow uses a bare `pip` command on the hosted runner; switching to `python3 -m pip` (or explicitly setting up a Python version) would make the linter install more deterministic and robust across runner images.
## Individual Comments
### Comment 1
<location path=".github/workflows/style.yaml" line_range="44" />
<code_context>
+
+ - name: Install yq
+ run: |
+ sudo wget -qO /usr/local/bin/yq "https://github.com/mikefarah/yq/releases/download/v4.44.6/yq_linux_amd64"
+ sudo chmod +x /usr/local/bin/yq
+
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Consider verifying the downloaded `yq` binary to reduce supply-chain risk
Since this downloads and executes a remote binary, please add an integrity check (e.g., verify the SHA256 checksum against the official release) before marking it executable to mitigate supply-chain risk.
</issue_to_address>
### Comment 2
<location path=".github/workflows/style.yaml" line_range="322-323" />
<code_context>
with:
gcp-account: ${{ secrets.GCP_SERVICE_ACCOUNT_STACKROX_CI }}
+ - name: Install Python linters
+ run: pip install -r .openshift-ci/dev-requirements.txt
+
- name: pycodestyle
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Use an explicit Python interpreter when installing linters for predictability
Calling `pip` directly is sensitive to PATH and default Python version changes on GitHub runners. Please invoke it via the interpreter you depend on, e.g. `python3 -m pip install -r .openshift-ci/dev-requirements.txt`, to make the setup more stable over time.
```suggestion
- name: Install Python linters
run: python3 -m pip install -r .openshift-ci/dev-requirements.txt
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Address sourcery-ai review feedback: - Add SHA256 checksum verification after downloading yq binary to mitigate supply-chain risk - Use `python3 -m pip` instead of bare `pip` for deterministic Python interpreter resolution across runner image updates Generated with the assistance of AI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
actions/setup-go sets GOTOOLCHAIN=local by default, which prevents Go from auto-downloading a newer toolchain when a dependency (e.g. buf) requires one. The CI container didn't have this restriction. Set GOTOOLCHAIN=auto at the workflow level so Go tools that require a newer patch version (e.g. buf needing go1.25.6 when go.mod says 1.25.0) can download the required toolchain automatically. Generated with the assistance of AI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Images are ready for the commit at 5bdb821. To use with deploy scripts, first |
actions/setup-go writes GOTOOLCHAIN=local to GITHUB_ENV, which overrides any workflow-level env setting. This prevents Go tools in sub-modules with newer go.mod directives (e.g. tools/proto requires go 1.25.6) from auto-downloading the required toolchain. Write GOTOOLCHAIN=auto to GITHUB_ENV in a step after setup-go so it takes effect in all subsequent steps. This matches the behavior in the CI container where GOTOOLCHAIN was unset (defaulting to auto). The project uses GOTOOLCHAIN=local only in targeted make targets (go mod tidy) as a compatibility guard — not as a global setting. Generated with the assistance of AI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #19309 +/- ##
==========================================
+ Coverage 49.62% 49.64% +0.01%
==========================================
Files 2680 2689 +9
Lines 202231 202505 +274
==========================================
+ Hits 100362 100537 +175
- Misses 94382 94472 +90
- Partials 7487 7496 +9
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@davdhacs: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
The operator's bundle_helpers virtualenv uses packages (pip, pytest) that rely on pkgutil.ImpImporter, which was removed in Python 3.12. The CI container had Python 3.9 (UBI8) where this worked. The ubuntu-latest runner has Python 3.12. Add actions/setup-python with python-version 3.9 to match the container's Python version until operator/bundle_helpers is updated for Python 3.12 compatibility. Generated with the assistance of AI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| run: | | ||
| # Override GOTOOLCHAIN=local set by setup-go. | ||
| echo "GOTOOLCHAIN=auto" >> "$GITHUB_ENV" | ||
| sudo wget -qO /usr/local/bin/yq "https://github.com/mikefarah/yq/releases/download/v4.44.6/yq_linux_amd64" |
There was a problem hiding this comment.
It's a go package so maybe we should have in tools module to easily track the version and updates on the other hand we will need to compile it
There was a problem hiding this comment.
It looks like yq should be there
https://hollycummins.com/using-yq-in-github-actions/
|
|
||
| failed=0 | ||
| for pid in $(jobs -p); do | ||
| wait "$pid" || failed=1 |
There was a problem hiding this comment.
How much time do we save on doing it in parallel. Is it worth complex code?
There was a problem hiding this comment.
good point. I think it is only seconds so I'll remove it. I was prematurely optimizing this because I didn't like the waiting step by step sequentially
|
|
||
| - name: Install yq | ||
| run: | | ||
| # Override GOTOOLCHAIN=local set by setup-go. |
There was a problem hiding this comment.
Why do we need to override it. It's based on go.mod so it should be the right version
There was a problem hiding this comment.
unfortunately it looked like setup-go exported GOTOOLCHAIN=local always. I'll test to see if it can be told/verified to use the go.mod value
Description
Remove the
container:directive from all 4 container-based jobs in the Style workflow and run them directly on theubuntu-latestrunner.Problem: Container initialization consistently takes ~55s per job (measured across recent master runs). The CI container image (
apollo-ci:stackrox-test-0.5.3) is ~3-4 GB and provides tools that are mostly already preinstalled on GitHub-hosted runners (Go, Node.js, Make, GCC, Git, Helm, shellcheck, jq, Docker).Approach: Replace container with targeted tool setup:
actions/setup-goandactions/setup-nodefor language runtimes (~1-3s each, tool-cache hit)yq,xmlstarlet, Python linters — run as parallel background subshells instyle-checkto minimize wall-clock timepip installfor Python linters (pycodestyle,pylint) needed byopenshift-ci-lintExpected savings per job:
Also updates
cache-ui-dependenciesto use portable~/paths instead of container-specific/github/home/paths, with a cache key bump (v2 -> v3) to avoid stale entries. The~paths resolve identically inside containers (/github/home), so other workflows using this action are unaffected.User-facing documentation
Testing and quality
Automated testing
How I validated my change
CI validation — this PR's own Style workflow run validates that all 4 migrated jobs pass without the container. Comparing step timings against recent master runs will confirm the expected savings.