ROX-30579: Configure process baseline auto locking via operator #16697

Merged
JoukoVirtanen merged 28 commits into master from
jv-ROX-30579-configure-process-baseline-auto-locking-via-operator
Oct 1, 2025

@JoukoVirtanen JoukoVirtanen commented Sep 6, 2025

Description

Allows process baseline auto-locking to be enabled or disabled per cluster via the operator, using the processBaselines.autoLock field of the SecuredCluster CR.

This PR is built on top of:

  • Configure process baseline auto locking via helm (#16462)
  • Add process baseline auto locking to cluster config (#16669)
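
As a rough illustration of the setting this PR exposes, the sketch below models the `processBaselines.autoLock` field of the SecuredCluster CR and how it could translate into the boolean `autoLockProcessBaselinesConfig.enabled` seen in the cluster config. The names `AutoLockMode`, `ProcessBaselinesSpec`, and `autoLockEnabled` are hypothetical and are not taken from the operator code. Only the field name, the `Enabled`/`Disabled` values, and the "unset means disabled" behavior come from the validation notes in this description.

```go
package main

import "fmt"

// AutoLockMode is a hypothetical name for the enum behind
// spec.processBaselines.autoLock.
type AutoLockMode string

const (
	AutoLockEnabled  AutoLockMode = "Enabled"
	AutoLockDisabled AutoLockMode = "Disabled"
)

// ProcessBaselinesSpec is a hypothetical stand-in for the CR section.
type ProcessBaselinesSpec struct {
	// AutoLock is a pointer so that "unset" can be told apart from an
	// explicit value.
	AutoLock *AutoLockMode `json:"autoLock,omitempty"`
}

// autoLockEnabled returns the boolean that would end up in
// autoLockProcessBaselinesConfig.enabled. An unset section or field maps
// to false, matching the state observed right after the upgrade.
func autoLockEnabled(spec *ProcessBaselinesSpec) bool {
	if spec == nil || spec.AutoLock == nil {
		return false
	}
	return *spec.AutoLock == AutoLockEnabled
}

func main() {
	enabled := AutoLockEnabled
	fmt.Println(autoLockEnabled(nil))
	fmt.Println(autoLockEnabled(&ProcessBaselinesSpec{AutoLock: &enabled}))
}
```

Using a pointer for the field is one common way to distinguish "not set" from an explicit choice; whether the operator does this is an assumption here.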

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

Performed an operator upgrade test using the instructions at https://spaces.redhat.com/pages/viewpage.action?spaceKey=StackRox&title=How+to+test+upstream+OLM+operator+upgrade

Summary

An openshift-4 cluster was created and the 4.8 operator was installed there. It was upgraded to this branch. The cluster config was checked and it was confirmed that process baseline auto-locking was disabled. The securedcluster CR was then edited to enable process baseline auto-locking and it was confirmed that process baseline auto-locking was enabled. The securedcluster CR was then edited to disable process baseline auto-locking and it was confirmed to be disabled.

Details

Created an openshift-4 cluster called jv-0915-ocp and downloaded the artifacts.

$ export INFRA_NAME=jv-0915-ocp
$ export KUBECONFIG=/tmp/artifacts-jv-0915-ocp/kubeconfig
$ export ROX_PRODUCT_BRANDING=RHACS_BRANDING

$ make -C operator deploy-previous-via-olm

Got the password and URL of the OpenShift console

$ cat /tmp/artifacts-jv-0915-ocp/kubeadmin-password
$ cat /tmp/artifacts-jv-0915-ocp/url

Created the stackrox namespace in the OpenShift console UI.

Created stackrox pull secrets

$ deploy/common/pull-secret.sh stackrox quay.io | kubectl -n stackrox apply -f -

Created central using the OpenShift console UI.

Got the password for central

$ oc -n stackrox get secret central-htpasswd -o go-template='{{index .data "password" | base64decode}}'

Created a port forward

$ kubectl -n stackrox port-forward deploy/central 8000:8443 > /dev/null 2>&1 &

Logged into localhost:8000 and created and downloaded the init bundle.

$ oc create --namespace stackrox --filename ProcessBaselineAutoLockingTest-Operator-secrets-cluster-init-bundle.yaml

Created the secured cluster in the OpenShift console UI.

Checked the state of the system

$ ks get pod
NAME                                  READY   STATUS    RESTARTS   AGE
admission-control-667d7f9cd9-cdzfb    1/1     Running   0          4m52s
admission-control-667d7f9cd9-h5rqg    1/1     Running   0          4m52s
admission-control-667d7f9cd9-rt4sg    1/1     Running   0          4m52s
central-c744c9568-wdh4w               1/1     Running   0          11m
central-db-78db77b7d-pjsmx            1/1     Running   0          11m
collector-8cvf6                       3/3     Running   0          4m52s
collector-fdg9l                       3/3     Running   0          4m52s
collector-j2tnn                       3/3     Running   0          4m52s
collector-l966h                       3/3     Running   0          4m52s
collector-s942d                       3/3     Running   0          4m52s
collector-vp85d                       3/3     Running   0          4m52s
config-controller-64bc9f4d4c-82ndn    1/1     Running   0          11m
scanner-84dbdb9848-b5tbc              1/1     Running   0          11m
scanner-84dbdb9848-kgn7z              1/1     Running   0          11m
scanner-db-6bfc887464-fb6cp           1/1     Running   0          11m
scanner-v4-db-57b59d78d6-ngdkr        1/1     Running   0          11m
scanner-v4-indexer-7b85889b5-k655k    1/1     Running   0          11m
scanner-v4-indexer-7b85889b5-np9wq    1/1     Running   0          11m
scanner-v4-matcher-74dcd45f84-5m75m   1/1     Running   0          11m
scanner-v4-matcher-74dcd45f84-rkb5w   1/1     Running   0          11m
scanner-v4-matcher-74dcd45f84-rrflw   1/1     Running   0          3m7s
sensor-668875b5d6-2zllm               1/1     Running   0          4m52s

Did the upgrade

$ make -C operator upgrade-via-olm

Checked the state of the system again

$ ks get pod
NAME                                 READY   STATUS    RESTARTS        AGE
admission-control-5b9c6d95c7-69zwg   1/1     Running   0               8m53s
admission-control-5b9c6d95c7-gq8vg   1/1     Running   0               8m26s
admission-control-5b9c6d95c7-wfn2h   1/1     Running   0               9m20s
central-65f998fcc5-fd2hx             1/1     Running   0               8m33s
central-db-5fc787d674-z6hz6          1/1     Running   0               8m33s
collector-62qg9                      3/3     Running   0               6m34s
collector-8frsq                      3/3     Running   0               5m43s
collector-96td4                      3/3     Running   0               5m11s
collector-ls7w6                      3/3     Running   0               8m47s
collector-xkhpt                      3/3     Running   0               7m52s
collector-xxdb9                      3/3     Running   0               9m13s
config-controller-67877c8f79-krtrb   1/1     Running   0               8m33s
scanner-79f89c69d8-gvxfb             1/1     Running   0               8m32s
scanner-79f89c69d8-klnf4             1/1     Running   0               8m32s
scanner-db-79596c846b-rpgkr          1/1     Running   0               8m32s
scanner-v4-db-7d9595c88d-z2gnf       1/1     Running   0               8m32s
scanner-v4-indexer-c5fcf6b54-7jsrt   1/1     Running   2 (8m23s ago)   8m33s
scanner-v4-indexer-c5fcf6b54-khzkl   1/1     Running   0               7m49s
scanner-v4-matcher-98b4f5c46-2jl4x   1/1     Running   2 (8m20s ago)   8m33s
scanner-v4-matcher-98b4f5c46-7zfzd   1/1     Running   0               7m46s
sensor-6f64bc95c8-gwqhz              1/1     Running   0               9m19s

Checked that the image in central had been updated

$ ks describe deployment central | grep -i image
    Image:       quay.io/rhacs-eng/main:4.9.x-805-g9ee7e41fcf
$ make tag
warning: tag '4.9' is externally known as '4.9.x'
warning: tag '4.9' is externally known as '4.9.x'
4.9.x-805-g9ee7e41fcf
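
The comparison above (the tag on the central image against the output of `make tag`) can be sketched as a small helper. The function name `imageTag` is hypothetical, and the parsing is a simplified assumption about image reference syntax rather than a full implementation of it.

```go
package main

import (
	"fmt"
	"strings"
)

// imageTag returns the tag part of an image reference such as
// "quay.io/rhacs-eng/main:4.9.x-805-g9ee7e41fcf".
// It returns "" when the reference carries no tag.
func imageTag(ref string) string {
	// Drop a digest suffix ("@sha256:...") if one is present.
	if i := strings.Index(ref, "@"); i >= 0 {
		ref = ref[:i]
	}
	slash := strings.LastIndex(ref, "/")
	colon := strings.LastIndex(ref, ":")
	// A colon before the last slash belongs to a registry host:port,
	// not to a tag.
	if colon <= slash {
		return ""
	}
	return ref[colon+1:]
}

func main() {
	deployed := "quay.io/rhacs-eng/main:4.9.x-805-g9ee7e41fcf"
	built := "4.9.x-805-g9ee7e41fcf" // output of `make tag` on this branch
	fmt.Println(imageTag(deployed) == built)
}
```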

Checked the securedcluster CR

$ ks get securedcluster stackrox-secured-cluster-services -o yaml
apiVersion: platform.stackrox.io/v1alpha1
kind: SecuredCluster
metadata:
  annotations:
    feature-defaults.platform.stackrox.io/admissionControllerEnforce: "true"
    feature-defaults.platform.stackrox.io/scannerV4: AutoSense
  creationTimestamp: "2025-09-16T21:32:47Z"
  finalizers:
  - uninstall-helm-release
  generation: 1
  name: stackrox-secured-cluster-services
  namespace: stackrox
  resourceVersion: "68801"
  uid: 2f1a890b-a996-446c-acbd-0bdc311bc2a4
spec:
  admissionControl:
    bypass: BreakGlassAnnotation
    contactImageScanners: DoNotScanInline
    listenOnCreates: true
    listenOnEvents: true
    listenOnUpdates: true
    replicas: 3
    timeoutSeconds: 10
  auditLogs:
    collection: Auto
  clusterName: my-cluster
  monitoring:
    openshift:
      enabled: true
  network:
    policies: Enabled
  perNode:
    collector:
      collection: CORE_BPF
      forceCollection: false
      imageFlavor: Regular
    taintToleration: TolerateTaints
  scanner:
    analyzer:
      scaling:
        autoScaling: Enabled
        maxReplicas: 5
        minReplicas: 2
        replicas: 3
    scannerComponent: AutoSense
  scannerV4:
    db:
      persistence:
        persistentVolumeClaim:
          claimName: scanner-v4-db
    indexer:
      scaling:
        autoScaling: Enabled
        maxReplicas: 5
        minReplicas: 2
        replicas: 3
status:
  clusterName: my-cluster
  conditions:
  - lastTransitionTime: "2025-09-16T21:33:07Z"
    message: |
      StackRox Secured Cluster Services 4.9.x-805-g9ee7e41fcf has been installed.



      Thank you for using StackRox!
    reason: UpgradeSuccessful
    status: "True"
    type: Deployed
  - lastTransitionTime: "2025-09-16T21:33:07Z"
    status: "True"
    type: Initialized
  - lastTransitionTime: "2025-09-16T21:33:07Z"
    status: "False"
    type: Irreconcilable
  - lastTransitionTime: "2025-09-16T21:33:07Z"
    message: Proxy configuration has been applied successfully
    reason: ProxyConfigApplied
    status: "False"
    type: ProxyConfigFailed
  - lastTransitionTime: "2025-09-16T21:33:07Z"
    status: "False"
    type: ReleaseFailed
  deployedRelease: {}
  productVersion: 4.9.x-805-g9ee7e41fcf
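
One way to read the status block above programmatically is sketched below. The `condition` type and the `healthy` function are hypothetical, and the rule used (Deployed is "True" and no failure condition is "True") is an assumption about what counts as a good state, not something defined by the operator.

```go
package main

import "fmt"

// condition mirrors the fields shown under status.conditions.
type condition struct {
	Type   string
	Status string
}

// healthy reports whether Deployed is "True" and none of the
// failure-style conditions is "True".
func healthy(conds []condition) bool {
	status := map[string]string{}
	for _, c := range conds {
		status[c.Type] = c.Status
	}
	if status["Deployed"] != "True" {
		return false
	}
	for _, t := range []string{"Irreconcilable", "ProxyConfigFailed", "ReleaseFailed"} {
		if status[t] == "True" {
			return false
		}
	}
	return true
}

func main() {
	// Values copied from the status block of the CR after the upgrade.
	conds := []condition{
		{Type: "Deployed", Status: "True"},
		{Type: "Initialized", Status: "True"},
		{Type: "Irreconcilable", Status: "False"},
		{Type: "ProxyConfigFailed", Status: "False"},
		{Type: "ReleaseFailed", Status: "False"},
	}
	fmt.Println(healthy(conds))
}
```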

The cluster config was then checked. It was the following:

{
  "clusters": [
    {
      "id": "dd5595b1-db5e-4da5-b1c0-f27ed2de465a",
      "name": "my-cluster",
      "type": "OPENSHIFT4_CLUSTER",
      "labels": {},
      "mainImage": "quay.io/rhacs-eng/main",
      "collectorImage": "quay.io/rhacs-eng/collector",
      "centralApiEndpoint": "central.stackrox.svc:443",
      "runtimeSupport": true,
      "collectionMethod": "CORE_BPF",
      "admissionController": true,
      "admissionControllerUpdates": true,
      "admissionControllerEvents": true,

...

      "dynamicConfig": {
        "admissionControllerConfig": {
          "enabled": true,
          "timeoutSeconds": 10,
          "scanInline": true,
          "disableBypass": false,
          "enforceOnUpdates": true
        },
        "registryOverride": "",
        "disableAuditLogs": false,
        "autoLockProcessBaselinesConfig": {
          "enabled": false
        }
      },

...

      "helmConfig": {
        "dynamicConfig": {
          "admissionControllerConfig": {
            "enabled": true,
            "timeoutSeconds": 10,
            "scanInline": true,
            "disableBypass": false,
            "enforceOnUpdates": true
          },
          "registryOverride": "",
          "disableAuditLogs": false,
          "autoLockProcessBaselinesConfig": {
            "enabled": false
          }
        },

...

Note that process baseline auto-locking is disabled.
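
The manual check above can be sketched in code. The example below parses a cluster config response shaped like the excerpt and reports the `autoLockProcessBaselinesConfig.enabled` flag in both places it appears (`dynamicConfig` and `helmConfig.dynamicConfig`). The Go type names and the `autoLockState` function are hypothetical; the JSON field names are taken from the output above, and the sample is trimmed to the relevant fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type autoLockConfig struct {
	Enabled bool `json:"enabled"`
}

type dynamicConfig struct {
	AutoLock autoLockConfig `json:"autoLockProcessBaselinesConfig"`
}

type cluster struct {
	Name          string        `json:"name"`
	DynamicConfig dynamicConfig `json:"dynamicConfig"`
	HelmConfig    struct {
		DynamicConfig dynamicConfig `json:"dynamicConfig"`
	} `json:"helmConfig"`
}

type clustersResponse struct {
	Clusters []cluster `json:"clusters"`
}

// autoLockState returns the auto-lock flag from dynamicConfig and from
// helmConfig.dynamicConfig for the named cluster.
func autoLockState(raw []byte, name string) (bool, bool, error) {
	var resp clustersResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return false, false, err
	}
	for _, c := range resp.Clusters {
		if c.Name == name {
			return c.DynamicConfig.AutoLock.Enabled,
				c.HelmConfig.DynamicConfig.AutoLock.Enabled, nil
		}
	}
	return false, false, fmt.Errorf("cluster %q not found", name)
}

// sample is a trimmed version of the response shown above.
const sample = `{
  "clusters": [
    {
      "name": "my-cluster",
      "dynamicConfig": {
        "autoLockProcessBaselinesConfig": {"enabled": false}
      },
      "helmConfig": {
        "dynamicConfig": {
          "autoLockProcessBaselinesConfig": {"enabled": false}
        }
      }
    }
  ]
}`

func main() {
	dynamic, helm, err := autoLockState([]byte(sample), "my-cluster")
	if err != nil {
		panic(err)
	}
	fmt.Println("dynamicConfig:", dynamic, "helmConfig:", helm)
}
```

Checking both locations matters because the excerpt shows the flag in each, and the two are expected to agree after the operator reconciles.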

The spec section of the securedcluster CR was changed to the following

spec:
  admissionControl:
    bypass: BreakGlassAnnotation
    contactImageScanners: DoNotScanInline
    listenOnCreates: true
    listenOnEvents: true
    listenOnUpdates: true
    replicas: 3
    timeoutSeconds: 10
  auditLogs:
    collection: Auto
  processBaselines:
    autoLock: Enabled
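
The description does not say how the CR was edited. As one possible alternative to editing the full spec, the sketch below builds a JSON merge patch that touches only `spec.processBaselines.autoLock`. The function name `autoLockPatch` is hypothetical; the field path and the two accepted values are the ones shown in this description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// autoLockPatch builds a JSON merge patch that sets only
// spec.processBaselines.autoLock and leaves the rest of the spec alone.
func autoLockPatch(mode string) (string, error) {
	if mode != "Enabled" && mode != "Disabled" {
		return "", fmt.Errorf("unsupported autoLock value %q", mode)
	}
	patch := map[string]interface{}{
		"spec": map[string]interface{}{
			"processBaselines": map[string]interface{}{
				"autoLock": mode,
			},
		},
	}
	b, err := json.Marshal(patch)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	p, err := autoLockPatch("Enabled")
	if err != nil {
		panic(err)
	}
	// The printed patch could be passed to, for example:
	//   kubectl -n stackrox patch securedcluster \
	//     stackrox-secured-cluster-services --type merge -p '<patch>'
	fmt.Println(p)
}
```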

Afterwards, sensor restarted

$ ks get pod
NAME                                 READY   STATUS    RESTARTS      AGE
admission-control-5b9c6d95c7-69zwg   1/1     Running   0             20m
admission-control-5b9c6d95c7-gq8vg   1/1     Running   0             20m
admission-control-5b9c6d95c7-wfn2h   1/1     Running   0             21m
central-65f998fcc5-fd2hx             1/1     Running   0             20m
central-db-5fc787d674-z6hz6          1/1     Running   0             20m
collector-62qg9                      3/3     Running   0             18m
collector-8frsq                      3/3     Running   0             17m
collector-96td4                      3/3     Running   0             16m
collector-ls7w6                      3/3     Running   0             20m
collector-xkhpt                      3/3     Running   0             19m
collector-xxdb9                      3/3     Running   0             20m
config-controller-67877c8f79-krtrb   1/1     Running   0             20m
scanner-79f89c69d8-gvxfb             1/1     Running   0             20m
scanner-79f89c69d8-klnf4             1/1     Running   0             20m
scanner-db-79596c846b-rpgkr          1/1     Running   0             20m
scanner-v4-db-7d9595c88d-z2gnf       1/1     Running   0             20m
scanner-v4-indexer-c5fcf6b54-7jsrt   1/1     Running   2 (20m ago)   20m
scanner-v4-indexer-c5fcf6b54-khzkl   1/1     Running   0             19m
scanner-v4-matcher-98b4f5c46-2jl4x   1/1     Running   2 (20m ago)   20m
scanner-v4-matcher-98b4f5c46-7zfzd   1/1     Running   0             19m
sensor-78cfc54c9b-wtz7l              1/1     Running   0             21s

The cluster config was checked again

{
  "clusters": [
    {
      "id": "dd5595b1-db5e-4da5-b1c0-f27ed2de465a",
      "name": "my-cluster",
      "type": "OPENSHIFT4_CLUSTER",
      "labels": {},
      "mainImage": "quay.io/rhacs-eng/main",
      "collectorImage": "quay.io/rhacs-eng/collector",
      "centralApiEndpoint": "central.stackrox.svc:443",
      "runtimeSupport": true,
      "collectionMethod": "CORE_BPF",
      "admissionController": true,
      "admissionControllerUpdates": true,
      "admissionControllerEvents": true,

...

      },
      "dynamicConfig": {
        "admissionControllerConfig": {
          "enabled": true,
          "timeoutSeconds": 10,
          "scanInline": true,
          "disableBypass": false,
          "enforceOnUpdates": true
        },
        "registryOverride": "",
        "disableAuditLogs": false,
        "autoLockProcessBaselinesConfig": {
          "enabled": true
        }
      },

...

      "helmConfig": {
        "dynamicConfig": {
          "admissionControllerConfig": {
            "enabled": true,
            "timeoutSeconds": 10,
            "scanInline": true,
            "disableBypass": false,
            "enforceOnUpdates": true
          },
          "registryOverride": "",
          "disableAuditLogs": false,
          "autoLockProcessBaselinesConfig": {
            "enabled": true
          }
        },

...

Note that process baseline auto-locking is now enabled.

The spec section of the securedcluster CR was then changed to the following

spec:
  admissionControl:
    bypass: BreakGlassAnnotation
    contactImageScanners: DoNotScanInline
    listenOnCreates: true
    listenOnEvents: true
    listenOnUpdates: true
    replicas: 3
    timeoutSeconds: 10
  auditLogs:
    collection: Auto
  clusterName: my-cluster
  monitoring:
    openshift:
      enabled: true
  network:
    policies: Enabled
  perNode:
    collector:
      collection: CORE_BPF
      forceCollection: false
      imageFlavor: Regular
    taintToleration: TolerateTaints
  processBaselines:
    autoLock: Disabled

Again sensor restarted. The cluster config was checked again and process baseline auto-locking had been disabled.

Fresh install

Followed the directions here https://github.com/stackrox/stackrox/blob/master/operator/README.md#installing-operator-via-olm

$ ROX_PRODUCT_BRANDING=RHACS_BRANDING make deploy-via-olm

Found the URL and password for the OpenShift console UI

$ cat /tmp/artifacts-jv-0915-ocp/kubeadmin-password
$ cat /tmp/artifacts-jv-0915-ocp/url

Created central

Got the password for central

$ oc -n stackrox get secret central-htpasswd -o go-template='{{index .data "password" | base64decode}}'

Created a port forward

$ kubectl -n stackrox port-forward deploy/central 8000:8443 > /dev/null 2>&1 &

Logged into localhost:8000 and created and downloaded the init bundle.

$ oc create --namespace stackrox --filename ProcessBaselineAutoLockingTest_FreshInstall-Operator-secrets-cluster-init-bundle.yaml 

Created the secured cluster in the OpenShift console UI.

Enabled process baseline auto-locking

The cluster config was checked and process baseline auto-locking was enabled.

{
  "clusters": [
    {
      "id": "52b3616d-4078-4c83-a37b-cbf1e46726ac",
      "name": "my-cluster",
      "type": "OPENSHIFT4_CLUSTER",
      "labels": {},
      "mainImage": "quay.io/rhacs-eng/main",
      "collectorImage": "quay.io/rhacs-eng/collector",
      "centralApiEndpoint": "central.stackrox.svc:443",
      "runtimeSupport": true,
      "collectionMethod": "CORE_BPF",
      "admissionController": true,
      "admissionControllerUpdates": true,
      "admissionControllerEvents": true,

...

      "dynamicConfig": {
        "admissionControllerConfig": {
          "enabled": true,
          "timeoutSeconds": 10,
          "scanInline": true,
          "disableBypass": false,
          "enforceOnUpdates": true
        },
        "registryOverride": "",
        "disableAuditLogs": false,
        "autoLockProcessBaselinesConfig": {
          "enabled": true
        }
      },
      "tolerationsConfig": {
        "disabled": false
      },

...

      "helmConfig": {
        "dynamicConfig": {
          "admissionControllerConfig": {
            "enabled": true,
            "timeoutSeconds": 10,
            "scanInline": true,
            "disableBypass": false,
            "enforceOnUpdates": true
          },
          "registryOverride": "",
          "disableAuditLogs": false,
          "autoLockProcessBaselinesConfig": {
            "enabled": true
          }
        },

Process baseline auto-locking was then disabled by editing the secured cluster and the API was checked again to confirm that it was disabled.

openshift-ci bot commented Sep 6, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

rhacs-bot commented Sep 6, 2025

Images are ready for the commit at c0b1eaa.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.9.x-970-gc0b1eaa351.

codecov bot commented Sep 6, 2025

Codecov Report

❌ Patch coverage is 32.25806% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.78%. Comparing base (0c136c8) to head (c0b1eaa).
⚠️ Report is 3 commits behind head on master.

Files with missing lines                          Patch %   Lines
operator/api/v1alpha1/zz_generated.deepcopy.go    0.00%     19 Missing ⚠️
operator/api/v1alpha1/securedcluster_types.go     0.00%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #16697      +/-   ##
==========================================
- Coverage   48.79%   48.78%   -0.01%     
==========================================
  Files        2712     2712              
  Lines      202362   202393      +31     
==========================================
  Hits        98736    98736              
- Misses      95844    95870      +26     
- Partials     7782     7787       +5     
Flag            Coverage Δ
go-unit-tests   48.78% <32.25%> (-0.01%) ⬇️

@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30578-configure-process-baseline-auto-locking-via-helm branch 2 times, most recently from a1d98bd to 8bd4d50 Compare September 8, 2025 14:19
@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30579-configure-process-baseline-auto-locking-via-operator branch from 44a4b01 to 978f69e Compare September 8, 2025 14:34
@JoukoVirtanen JoukoVirtanen marked this pull request as ready for review September 8, 2025 22:25
@JoukoVirtanen JoukoVirtanen requested review from a team as code owners September 8, 2025 22:25
@JoukoVirtanen JoukoVirtanen requested review from mclasmeier and removed request for a team September 8, 2025 22:25
@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30578-configure-process-baseline-auto-locking-via-helm branch 2 times, most recently from d874377 to f0ef02f Compare September 10, 2025 01:30
@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30579-configure-process-baseline-auto-locking-via-operator branch 2 times, most recently from e3aa5fa to c9ca7ab Compare September 10, 2025 05:14

@porridge porridge left a comment

You might want to take a fresh look at https://github.com/stackrox/stackrox/blob/master/operator/EXTENDING_CRDS.md. We've recently improved it to describe the necessary steps in a more streamlined way, and it also has some new important info on defaults.

More comments inline.

@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30578-configure-process-baseline-auto-locking-via-helm branch from d3b715e to 0dd72f0 Compare September 11, 2025 02:14
@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30579-configure-process-baseline-auto-locking-via-operator branch from 3dfe3ba to 581c5e7 Compare September 11, 2025 02:16

@porridge porridge left a comment

Some comments inline. Also:

  • lint is failing due to formatting
  • operator build is failing due to a dirty repo, not sure why, perhaps one of the base PRs is not clean?
  • please follow the EXTENDING_CRDS checklist, in particular the PR prep step

@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30578-configure-process-baseline-auto-locking-via-helm branch from 0dd72f0 to 7f31d68 Compare September 11, 2025 22:43
@JoukoVirtanen JoukoVirtanen requested a review from a team as a code owner September 11, 2025 22:43
@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30579-configure-process-baseline-auto-locking-via-operator branch 2 times, most recently from 95e779e to 9ee7e41 Compare September 16, 2025 03:32
@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-30579-configure-process-baseline-auto-locking-via-operator branch from 06153ae to c0b1eaa Compare October 1, 2025 17:38
@JoukoVirtanen commented

/test gke-upgrade-tests
/test gke-scanner-v4-install-tests
/test gke-qa-e2e-tests
/test gke-nongroovy-e2e-tests
/test gke-operator-e2e-tests
/test gke-ui-e2e-tests

@JoukoVirtanen JoukoVirtanen merged commit 75675ec into master Oct 1, 2025
141 of 146 checks passed
@JoukoVirtanen JoukoVirtanen deleted the jv-ROX-30579-configure-process-baseline-auto-locking-via-operator branch October 1, 2025 23:06