ROX-30577: Add process baseline auto locking to cluster config#16669

Merged
JoukoVirtanen merged 16 commits into master from
jv-ROX-30577-add-process-baseline-auto-locking-to-cluster-config
Sep 11, 2025

Conversation

@JoukoVirtanen JoukoVirtanen commented Sep 4, 2025

Description

Adds an auto-locking field to the cluster protobuf and uses the cluster configuration to control auto-locking, so process baseline auto-locking can be controlled at the cluster level. The feature flag is still in place: to enable process baseline auto-locking for a cluster, the feature flag must be enabled and auto-locking must also be enabled for that cluster via the cluster config.
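
The two-level gating described above can be illustrated with a small shell sketch. This is not the Central implementation (which lives in Go); the function name `autolock_effective` is hypothetical, and it assumes the feature flag is the `ROX_AUTOLOCK_PROCESS_BASELINES` variable used in the validation steps and that the cluster setting is the `dynamicConfig.autoLockProcessBaselines.enabled` field.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): prints "true" only when BOTH
# switches are on, "false" otherwise.
#   $1: value of the feature flag (assumed: ROX_AUTOLOCK_PROCESS_BASELINES)
#   $2: value of the cluster's dynamicConfig.autoLockProcessBaselines.enabled
autolock_effective() {
  local feature_flag="${1:-false}"
  local cluster_enabled="${2:-false}"
  if [[ "$feature_flag" == "true" && "$cluster_enabled" == "true" ]]; then
    echo "true"
  else
    echo "false"
  fi
}

autolock_effective true true    # flag on, cluster on
autolock_effective true false   # flag on, cluster off
autolock_effective false true   # flag off, cluster on
```

Only the first call reports auto-locking as active, which matches the first validation observation below: with the flag set but the cluster field untouched, the baseline stayed unlocked.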

This change does not yet make it possible to control the new cluster field via Helm or the operator. That will be done in other PRs.

The PR to control process baseline auto-locking via Helm is #16462.

The PR to control process baseline auto-locking via the operator is #16697.

This PR replaces #16427

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

Set the following environment variables:

export ROX_BASELINE_GENERATION_DURATION=5m
export ROX_AUTOLOCK_PROCESS_BASELINES=true

Deployed ACS.

Created a pod that could be used to run some processes and entered it.

kubectl run ubuntu-pod --image=ubuntu --restart=Never --command -- sleep infinity
kubectl exec ubuntu-pod -it -- /bin/bash

Logged into the UI and checked "Risk".

The process baseline was still unlocked after more than five minutes.

Ran the following script to enable process baseline auto-locking for the cluster:

#!/usr/bin/env bash
# Enables process baseline auto-locking on the first cluster returned by Central.
# Requires curl and jq, and ROX_API_TOKEN set in the environment.
set -euo pipefail

ROX_ENDPOINT=${1:-https://localhost:8000}
auth_header="Authorization: Bearer ${ROX_API_TOKEN}"

# Fetch all clusters and pick the first one.
json_clusters="$(curl --location --silent --request GET "${ROX_ENDPOINT}/v1/clusters" -k -H "$auth_header")"
json_cluster="$(echo "$json_clusters" | jq '.clusters[0]')"
id="$(echo "$json_cluster" | jq -r .id)"

# Turn on auto-locking in the cluster's dynamic config and show the result.
json_cluster="$(echo "$json_cluster" | jq '.dynamicConfig.autoLockProcessBaselines.enabled = true')"
echo "$json_cluster" | jq .
echo

# Send the updated cluster back to Central; the response body is not needed.
curl --location --silent --request PUT "${ROX_ENDPOINT}/v1/clusters/${id}" -k -H "$auth_header" --data "$json_cluster" >/dev/null

# Re-read the clusters to confirm the change.
json_clusters="$(curl --location --silent --request GET "${ROX_ENDPOINT}/v1/clusters" -k -H "$auth_header")"
echo "$json_clusters" | jq .
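
Instead of reading the final JSON dump by eye, the setting could be checked with a small jq filter. The sketch below is an assumption-laden illustration: the function name `cluster_autolock_setting` is hypothetical, the sample response is made up, and treating a missing field as "false" is a guess about the default rather than documented behavior. It requires jq.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a /v1/clusters response on stdin and prints the
# auto-lock setting ("true" or "false") for the cluster whose id is $1.
# A missing field is reported as "false" (an assumption about the default).
cluster_autolock_setting() {
  jq -r --arg id "$1" '
    .clusters[]
    | select(.id == $id)
    | .dynamicConfig.autoLockProcessBaselines.enabled // false
  '
}

# Made-up response body, only for demonstrating the filter.
sample='{"clusters":[{"id":"abc","dynamicConfig":{"autoLockProcessBaselines":{"enabled":true}}},{"id":"def","dynamicConfig":{}}]}'
echo "$sample" | cluster_autolock_setting abc   # cluster with the field enabled
echo "$sample" | cluster_autolock_setting def   # cluster without the field
```

In the script above this would be used as `echo "$json_clusters" | cluster_autolock_setting "$id"` after the final GET.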

Created another pod, entered it, and ran a command:

kubectl run ubuntu-pod-2 --image=ubuntu --restart=Never --command -- sleep infinity
kubectl exec ubuntu-pod-2 -it -- /bin/bash
cat /proc/1/net/tcp

Initially the process baseline is unlocked.

After a little more than five minutes the baseline is locked.
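
The waiting step could be scripted rather than watched in the UI. The following is a minimal sketch of a generic polling helper; `wait_until` is a hypothetical name, and the actual check for the baseline's lock state is deliberately left as a placeholder because this PR does not show an API call for it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: re-runs a check command until it succeeds or a
# timeout expires. Returns 0 on success and 1 on timeout.
#   $1: timeout in seconds
#   $2: delay between attempts in seconds
#   remaining arguments: the check command and its arguments
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1   # gave up: the check never succeeded in time
    fi
    sleep "$interval"
  done
  return 0
}

# Usage sketch: baseline_is_locked is a placeholder for whatever check you use.
# With ROX_BASELINE_GENERATION_DURATION=5m, a timeout somewhat above 300
# seconds leaves room for the "little more than five minutes" seen here.
# wait_until 420 15 baseline_is_locked && echo "baseline locked"
```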

Running a new process results in a violation:

tac /proc/1/net/tcp

openshift-ci bot commented Sep 4, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

rhacs-bot commented Sep 4, 2025

Images are ready for the commit at f93a916.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.9.x-707-gf93a9162eb.

codecov bot commented Sep 4, 2025

Codecov Report

❌ Patch coverage is 18.75000% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.74%. Comparing base (977602c) to head (f93a916).
⚠️ Report is 45 commits behind head on master.

Files with missing lines                       Patch %   Lines
central/graphql/resolvers/generated.go         10.52%    34 Missing ⚠️
central/detection/lifecycle/manager_impl.go    28.20%    27 Missing and 1 partial ⚠️
central/detection/lifecycle/manager.go         0.00%     2 Missing ⚠️
central/detection/lifecycle/singleton.go       0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #16669      +/-   ##
==========================================
+ Coverage   48.62%   48.74%   +0.11%     
==========================================
  Files        2664     2677      +13     
  Lines      199343   200063     +720     
==========================================
+ Hits        96929    97517     +588     
- Misses      94818    94938     +120     
- Partials     7596     7608      +12     
Flag            Coverage Δ
go-unit-tests   48.74% <18.75%> (+0.11%) ⬆️

@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30577-add-process-baseline-auto-locking-to-cluster-config branch from b99522d to da86bf4 Compare September 8, 2025 00:01
@JoukoVirtanen

/test gke-nongroovy-e2e-tests

@clickboo clickboo left a comment

left comments

@dashrews78 dashrews78 left a comment

Left a nit; otherwise I'm OK with this as long as the config experts are.

@pedrottimark pedrottimark left a comment

Naming convention in storage/cluster.proto is solid from UI viewpoint and Jouko confirmed details via Slack for small sibling contribution to display on clusters page.

openshift-ci bot commented Sep 11, 2025

@JoukoVirtanen: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                       Commit    Details   Required   Rerun command
ci/prow/ocp-4-12-qa-e2e-tests   f93a916   link      false      /test ocp-4-12-qa-e2e-tests
ci/prow/ocp-4-19-qa-e2e-tests   f93a916   link      false      /test ocp-4-19-qa-e2e-tests

@clickboo clickboo left a comment

Reviewed the cluster config changes, lgtm. Thanks @JoukoVirtanen for making suggested changes.

@JoukoVirtanen JoukoVirtanen merged commit 97b7825 into master Sep 11, 2025
98 of 101 checks passed
@JoukoVirtanen JoukoVirtanen deleted the jv-ROX-30577-add-process-baseline-auto-locking-to-cluster-config branch September 11, 2025 22:39