Skip to content

ROX-33557: fix node resolution from hostname#19435

Merged
Stringy merged 4 commits intomasterfrom
giles/ROX-33557-node-violations-missing
Mar 17, 2026
Merged

ROX-33557: fix node resolution from hostname#19435
Stringy merged 4 commits intomasterfrom
giles/ROX-33557-node-violations-missing

Conversation

@Stringy
Copy link
Contributor

@Stringy Stringy commented Mar 16, 2026

Description

On AWS (OSD, ROSA, EKS, etc), the node name does not match the hostname. Instead, the node name is the private DNS name of the form <hostname>.<domain> This changes the look up to perform a prefix match when the hostname doesn't match directly.

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

Verified on a ROSA cluster, triggering a node file activity event and seeing that the node is resolved correctly based on the hostname.

On AWS/EKS, the node name does not match the hostname. Instead, the
node name is the private DNS name of the form <hostname>.<domain>
This changes the look up to perform a prefix match when the hostname
doesn't match directly.
@Stringy Stringy requested a review from clickboo March 16, 2026 12:22
@Stringy Stringy requested a review from a team as a code owner March 16, 2026 12:22
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • In getNodeByPrefix, the read lock is acquired and immediately released before iterating the nodes map, which both defeats the purpose of the lock and risks a data race; keep the RLock held for the duration of the iteration (e.g., via defer s.mutex.RUnlock() inside the function).
  • Within getNodeByPrefix, you can avoid an extra map lookup by ranging over both key and value (for k, v := range s.nodes) and returning v instead of indexing s.nodes[k] again.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `getNodeByPrefix`, the read lock is acquired and immediately released before iterating the `nodes` map, which both defeats the purpose of the lock and risks a data race; keep the `RLock` held for the duration of the iteration (e.g., via `defer s.mutex.RUnlock()` inside the function).
- Within `getNodeByPrefix`, you can avoid an extra map lookup by ranging over both key and value (`for k, v := range s.nodes`) and returning `v` instead of indexing `s.nodes[k]` again.

## Individual Comments

### Comment 1
<location path="sensor/kubernetes/listener/resources/node_store.go" line_range="103-107" />
<code_context>
 	return s.nodes[nodeName]
 }

+func (s *nodeStoreImpl) getNodeByPrefix(prefix string) *nodeWrap {
+	s.mutex.RLock()
+	s.mutex.RUnlock()
+
+	for k := range s.nodes {
+		if strings.HasPrefix(k, prefix) {
+			return s.nodes[k]
</code_context>
<issue_to_address>
**issue (bug_risk):** The read lock is acquired and immediately released before iterating over the map, leaving the map access unprotected and racy.

`getNodeByPrefix` releases the read lock before iterating `s.nodes`, so the iteration and map lookups run without synchronization and can race with concurrent modifications. The lock should be held for the entire loop; if you keep early returns, use `defer s.mutex.RUnlock()` to ensure the lock is always released.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@rhacs-bot
Copy link
Contributor

rhacs-bot commented Mar 16, 2026

Images are ready for the commit at ad43479.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-333-gad434795e0.

@codecov
Copy link

codecov bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 0% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.36%. Comparing base (3ab317c) to head (ad43479).
⚠️ Report is 21 commits behind head on master.

Files with missing lines Patch % Lines
sensor/kubernetes/listener/resources/node_store.go 0.00% 28 Missing ⚠️
sensor/common/detector/detector.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19435      +/-   ##
==========================================
- Coverage   49.73%   49.36%   -0.38%     
==========================================
  Files        2703     2713      +10     
  Lines      204040   205005     +965     
==========================================
- Hits       101483   101200     -283     
- Misses      94981    96274    +1293     
+ Partials     7576     7531      -45     
Flag Coverage Δ
go-unit-tests 49.36% <0.00%> (-0.38%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@vikin91 vikin91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (with one comment about potential edge-case)

@Stringy Stringy force-pushed the giles/ROX-33557-node-violations-missing branch from 8118737 to ad43479 Compare March 17, 2026 12:56
@Stringy Stringy merged commit f5d5cf0 into master Mar 17, 2026
91 of 92 checks passed
@Stringy Stringy deleted the giles/ROX-33557-node-violations-missing branch March 17, 2026 15:50
rhacs-bot pushed a commit that referenced this pull request Mar 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants