Skip to content

ROX-30390: Add indicator key functions#16621

Merged
vikin91 merged 9 commits intomasterfrom
piotr/ROX-30390-indicator-key
Sep 3, 2025
Merged

ROX-30390: Add indicator key functions#16621
vikin91 merged 9 commits intomasterfrom
piotr/ROX-30390-indicator-key

Conversation

@vikin91
Copy link
Contributor

@vikin91 vikin91 commented Sep 2, 2025

Description

The goal of this PR is to add ID functions for all indicators. There are two functions:

  • keyString() that consumes more memory, but is CPU-optimized and can be used for clusters with little netFlow events,
  • keyHash() that optimizes memory usage but consumes more CPU. It should be used as default, especially in clusters with many open endpoints, processes and network connections.

The produced keys are not used in this PR. They will be used as deduper key in the next PR.

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

  • Unit tests.
  • I benchmarked both variants and looked at the CPU profiles and memory profiles.
  • I calculated the collision probability for selected hash functions and chose a balance between memory consumption, CPU performance and collision probability (key data is documented in the comments).
    • I chose a 64 bit hash. If the current collision probability (2.7% for 100M items) is too high, we may switch to a 128-bit hashing function, however that would consume more memory for the deduper.
    • The consequence of hash collision will be: not sending an update to Central when an update should have been sent (not in this PR).

@openshift-ci
Copy link

openshift-ci bot commented Sep 2, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Consider centralizing the repeated Key(h) switch logic into a helper or interface to reduce duplication across NetworkConn, ContainerEndpoint, and ProcessListening.
  • The hashToHexString function can be simplified to fmt.Sprintf("%016x", h.Sum64()) instead of manually building a byte slice, which will improve readability.
  • Rather than silently defaulting unknown HashingAlgo values to the hash implementation, consider validating the input and handling invalid algos explicitly to avoid unintentional fallbacks.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider centralizing the repeated Key(h) switch logic into a helper or interface to reduce duplication across NetworkConn, ContainerEndpoint, and ProcessListening.
- The hashToHexString function can be simplified to fmt.Sprintf("%016x", h.Sum64()) instead of manually building a byte slice, which will improve readability.
- Rather than silently defaulting unknown HashingAlgo values to the hash implementation, consider validating the input and handling invalid algos explicitly to avoid unintentional fallbacks.

## Individual Comments

### Comment 1
<location> `sensor/common/networkflow/manager/indicator/key.go:125` </location>
<code_context>
+	}
+}
+
+func hashStrings(h hash.Hash64, strs ...string) {
+	for _, s := range strs {
+		_, _ = h.Write([]byte(s))
</code_context>

<issue_to_address>
Potential risk of hash collisions due to lack of delimiters between concatenated strings.

Add a delimiter or length prefix between each string when writing to the hash to prevent collisions, especially if strings may contain arbitrary data.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@rhacs-bot
Copy link
Contributor

rhacs-bot commented Sep 2, 2025

Images are ready for the commit at 7d81935.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.9.x-640-g7d81935960.

@vikin91
Copy link
Contributor Author

vikin91 commented Sep 2, 2025

  • The hashToHexString function can be simplified to fmt.Sprintf("%016x", h.Sum64()) instead of manually building a byte slice, which will improve readability.

Yes, it can, but fmt.Sprintf is less performant. I optimized that function, added comment that it does the same as fmt.Sprintf("%016x", h.Sum64()) and persisted benchmark summary in a comment.

@vikin91
Copy link
Contributor Author

vikin91 commented Sep 2, 2025

  • Rather than silently defaulting unknown HashingAlgo values to the hash implementation, consider validating the input and handling invalid algos explicitly to avoid unintentional fallbacks.

I find this overkill. I can consider that once we add an env variable to switch the algorithm (that would come later).

@codecov
Copy link

codecov bot commented Sep 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.66%. Comparing base (d616685) to head (f73527e).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #16621      +/-   ##
==========================================
- Coverage   48.72%   48.66%   -0.06%     
==========================================
  Files        2658     2661       +3     
  Lines      198307   198976     +669     
==========================================
+ Hits        96625    96833     +208     
- Misses      94112    94552     +440     
- Partials     7570     7591      +21     
Flag Coverage Δ
go-unit-tests 48.66% <100.00%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vikin91 vikin91 added the auto-retest PRs with this label will be automatically retested if prow checks fails label Sep 2, 2025
@rhacs-bot
Copy link
Contributor

/retest

@red-hat-konflux

This comment was marked as off-topic.

@rhacs-bot
Copy link
Contributor

/retest

@vikin91 vikin91 force-pushed the piotr/ROX-30390-indicator-key branch from 7d81935 to d328baa Compare September 3, 2025 07:53
@rhacs-bot
Copy link
Contributor

rhacs-bot commented Sep 3, 2025

Images are ready for the commit at d9102ef.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.9.x-657-gd328baa384.

@rhacs-bot
Copy link
Contributor

rhacs-bot commented Sep 3, 2025

Images are ready for the commit at f73527e.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.9.x-659-gf73527e8db.

Copy link
Contributor

@lvalerom lvalerom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Love the bitwise operations ❤️

@openshift-ci
Copy link

openshift-ci bot commented Sep 3, 2025

@vikin91: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ocp-4-12-nongroovy-e2e-tests f73527e link false /test ocp-4-12-nongroovy-e2e-tests

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@vikin91
Copy link
Contributor Author

vikin91 commented Sep 3, 2025

CI failures of ocp-4-12-nongroovy-e2e-tests:

main: 2025/09/03 11:29:56.105095 container_instances_test.go:93: Info: {[]}
    container_instances_test.go:45: 
        	Error Trace:	/go/src/github.com/stackrox/stackrox/tests/container_instances_test.go:45
        	            				/go/src/github.com/stackrox/stackrox/pkg/testutils/retry.go:65
        	            				/go/src/github.com/stackrox/stackrox/tests/container_instances_test.go:40
        	            				/go/src/github.com/stackrox/stackrox/pkg/testutils/retry.go:65
        	            				/go/src/github.com/stackrox/stackrox/tests/container_instances_test.go:28
        	Error:      	"[]" should have 2 item(s), but has 0
        	Test:       	TestContainerInstances
--- FAIL: TestContainerInstances (247.67s)

and

main: 2025/09/03 11:38:14.245260 pods_test.go:222: Info: Get Events: {Pod:{IDStruct:{ID:a5f177b0-9cad-50d8-bf48-567d69646dc7} Name:end-to-end-api-test-pod-multi-container ContainerCount:2 Started:2025-09-03 11:36:52 +0000 UTC Events:[]}}
    pods_test.go:75: 20: Events: []
    pods_test.go:80: 
        	Error Trace:	/go/src/github.com/stackrox/stackrox/tests/pods_test.go:80
        	            				/go/src/github.com/stackrox/stackrox/pkg/testutils/retry.go:65
        	            				/go/src/github.com/stackrox/stackrox/tests/pods_test.go:44
        	Error:      	"21" is not less than or equal to "20"
        	Test:       	TestPod
--- FAIL: TestPod (270.62s)

They must be unrelated to this particular change, so I allow myself to skip investigating them deeper.

@rhacs-bot
Copy link
Contributor

/retest

@vikin91 vikin91 merged commit 817b235 into master Sep 3, 2025
141 of 170 checks passed
@vikin91 vikin91 deleted the piotr/ROX-30390-indicator-key branch September 3, 2025 12:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-review area/sensor auto-retest PRs with this label will be automatically retested if prow checks fails

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants