
ROX-33555: Use ComplianceACK in Sensor-Compliance messages#19419

Open
vikin91 wants to merge 3 commits into master from piotr/ROX-33555-emit-compliance-ack-from-sensor

Conversation

@vikin91
Contributor

@vikin91 vikin91 commented Mar 13, 2026

Description

This PR completes node-scanning ACK wiring between Sensor and Compliance using ComplianceACK.

What changed:

  • Sensor now emits ComplianceACK for pre-Central node-scanning rejections:
    • Central unreachable
    • Node unknown to Sensor
  • Compliance now handles ComplianceACK for node-scanning message types only:
    • NODE_INVENTORY
    • NODE_INDEX_REPORT
      and maps ACK/NACK to existing UMH handlers (HandleACK/HandleNACK).
  • Added focused unit tests in Compliance and updated Sensor compliance tests for the new behavior.

Why:

  • ComplianceACK handling in Compliance was previously unimplemented.
  • This change moves all Node-Scanning ACKs between Sensor and Compliance to the new message.

User-facing documentation

Testing and quality

  • The change is production-ready: it is GA, or the functionality is gated behind a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

  • CI
  • By deploying it to a cluster and observing correct node-scanning ACK behavior by:
    • Looking at the debug logs
    • Looking at the metrics

vikin91 added 2 commits March 13, 2026 16:58
Emit `ComplianceACK` (instead of legacy `NodeInventoryACK`) when Sensor rejects node inventory/index before Central handling (Central unreachable, node unknown), and simplify helper naming/metrics via `sendComplianceAck`. Update compliance handler tests to count `ComplianceACK` paths.

AI-assisted: implementation and test updates. User-defined scope/compat assumptions, reviewed and corrected behavior.
@vikin91
Contributor Author

vikin91 commented Mar 13, 2026

This change is part of the following stack:

Change managed by git-spice.

@openshift-ci

openshift-ci bot commented Mar 13, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • In handleNodeScanningComplianceAck, consider guarding against nil umhNodeInventory/umhNodeIndex handlers before calling HandleACK/HandleNACK to avoid potential panics when the corresponding UMH is not initialized.
  • The sendComplianceAck helper now takes both a reason string and a metricReason enum; for the new call sites that cast metrics.AckReason* to string, double-check that the enum’s underlying type is actually a string and, if not, consider using an explicit string reason to avoid sending numeric codes in the protocol payload.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `handleNodeScanningComplianceAck`, consider guarding against `nil` `umhNodeInventory`/`umhNodeIndex` handlers before calling `HandleACK`/`HandleNACK` to avoid potential panics when the corresponding UMH is not initialized.
- The `sendComplianceAck` helper now takes both a `reason` string and a `metricReason` enum; for the new call sites that cast `metrics.AckReason*` to `string`, double-check that the enum’s underlying type is actually a string and, if not, consider using an explicit string reason to avoid sending numeric codes in the protocol payload.

## Individual Comments

### Comment 1
<location path="sensor/common/compliance/node_inventory_handler_test.go" line_range="712" />
<code_context>
-					ms.ACKCount++
-				case sensor.MsgToCompliance_NodeInventoryACK_NACK:
-					ms.NACKCount++
+				if ack := msg.Msg.GetAck(); ack != nil {
+					t.Logf("Executing ++ on legacy ack action %s", ack.GetAction())
+					switch ack.GetAction() {
</code_context>
<issue_to_address>
**suggestion (testing):** Differentiate assertions for legacy ACK and new ComplianceACK to better prove the new behavior

`consumeAndCountCompliance` currently only increments `ACKCount`/`NACKCount` without distinguishing legacy `NodeInventoryACK` from `ComplianceACK`. Since this PR is about moving node-scanning ACKs to `ComplianceACK`, the tests should assert which ACK type is sent in each scenario. Consider extending `messageStats` to track separate counts (and possibly message types) for legacy ACK vs `ComplianceACK`, and then assert in `TestHandlerNodeUnknown` / `TestHandlerCentralNotReady` that all NACKs are `ComplianceACK` of the expected `MessageType` to directly validate the new wiring.

Suggested implementation:

```golang
				if ack := msg.Msg.GetAck(); ack != nil {
					t.Logf("Executing ++ on legacy ack action %s", ack.GetAction())
					switch ack.GetAction() {
					case sensor.MsgToCompliance_NodeInventoryACK_ACK:
						ms.LegacyACKCount++
					case sensor.MsgToCompliance_NodeInventoryACK_NACK:
						ms.LegacyNACKCount++
					}
					continue
				}
				if complianceAck := msg.Msg.GetComplianceAck(); complianceAck != nil {
					t.Logf("Executing ++ on compliance ack action %s", complianceAck.GetAction())
					switch complianceAck.GetAction() {

```

To fully implement your review comment, the following additional changes are needed elsewhere in `sensor/common/compliance/node_inventory_handler_test.go`:

1. **Extend `messageStats` to distinguish legacy vs ComplianceACK and to track ComplianceACK message types**, for example:
   - Replace the existing `ACKCount` / `NACKCount` fields with:
     - `LegacyACKCount int`
     - `LegacyNACKCount int`
     - `ComplianceACKCount int`
     - `ComplianceNACKCount int`
   - Add a slice to track ComplianceACK types:
     - `ComplianceAckTypes []sensor.MsgToCompliance_ComplianceACK_MessageType` (or the correct enum type for `ComplianceACK.Type` in your proto).

2. **Update the ComplianceACK accounting block** in `consumeAndCountCompliance` (the `if complianceAck := msg.Msg.GetComplianceAck(); ...` section) to:
   - Increment `ms.ComplianceACKCount` / `ms.ComplianceNACKCount` instead of the generic `ACKCount` / `NACKCount`.
   - Append the ComplianceACK message type to `ms.ComplianceAckTypes`, for example:
     ```go
     ms.ComplianceAckTypes = append(ms.ComplianceAckTypes, complianceAck.GetType())
     ```
   - Ensure this block `continue`s after handling the ComplianceACK so non-ACK messages are handled separately as before.

3. **Adjust any other usages of `ACKCount` / `NACKCount`** in this test file:
   - Replace usages that are intended to count *Compliance* ACKs with `ComplianceACKCount` / `ComplianceNACKCount`.
   - Replace usages that are intended to count *legacy* NodeInventory ACKs with `LegacyACKCount` / `LegacyNACKCount`.

4. **Update `TestHandlerNodeUnknown` and `TestHandlerCentralNotReady`** to assert specifically on ComplianceACK behavior:
   - Verify that:
     - `ms.LegacyACKCount == 0` and `ms.LegacyNACKCount == 0` (if the new wiring should not emit legacy ACKs in these scenarios).
     - The expected number of NACKs is observed via `ms.ComplianceNACKCount`.
     - All `ms.ComplianceAckTypes` equal the expected `ComplianceACK` message type (e.g., `sensor.MsgToCompliance_ComplianceACK_NODE_SCAN` or whatever is correct for node scanning).
   - If these tests previously asserted on `ACKCount` / `NACKCount`, update them to use the new fields and to include explicit checks of `ComplianceAckTypes`.

These additional changes will allow the tests to clearly differentiate between legacy `NodeInventoryACK` and the new `ComplianceACK`, and to directly validate that node-scanning NACKs are sent as `ComplianceACK` with the expected message type.
</issue_to_address>

### Comment 2
<location path="sensor/common/compliance/node_inventory_handler_test.go" line_range="845" />
<code_context>
 	// expect centralConsumer to get 0 messages - sensor should drop inventory when node is not found
 	centralConsumer := consumeAndCount(h.ResponsesC(), 0)
-	// expect complianceConsumer to get 10 NACK messages
+	// expect complianceConsumer to get 10 NACK messages (ComplianceACK only)
 	complianceConsumer := consumeAndCount(h.ComplianceC(), 10)
 	s.NoError(producer.Stopped().Wait())
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen tests to assert NACKs are actually ComplianceACK with correct fields

The comments now specify "10 NACK messages (ComplianceACK only)", but the tests still only assert the total count. To validate the new behavior, please also assert that:
1) All 10 messages are `ComplianceACK` (no legacy ACKs), and
2) Each message has the expected `MessageType` and `ResourceId` / `Reason` (e.g., `NODE_INVENTORY` vs `NODE_INDEX_REPORT`, `CentralUnreachable` vs `NodeUnknown`).

This will ensure regressions in ACK type or message fields are caught even if the count remains 10.

Suggested implementation:

```golang
	// expect complianceConsumer to get 10 NACK messages (ComplianceACK only)
	// and verify they are ComplianceACK with the expected message fields
	nackMessagesNodeUnknown := make([]*sensor.MsgToCompliance, 0, 10)
	for i := 0; i < 10; i++ {
		select {
		case msg := <-h.ComplianceC():
			nackMessagesNodeUnknown = append(nackMessagesNodeUnknown, msg)
		case <-time.After(5 * time.Second):
			s.Fail("timed out waiting for Compliance NACK messages for unknown node")
		}
	}
	s.Len(nackMessagesNodeUnknown, 10, "expected exactly 10 Compliance NACK messages for unknown node")

	for _, msg := range nackMessagesNodeUnknown {
		ackMsg, ok := msg.Msg.(*sensor.MsgToCompliance_ComplianceACK)
		s.Require().True(ok, "expected ComplianceACK message (no legacy ACKs)")
		ack := ackMsg.ComplianceACK
		s.Require().NotNil(ack, "ComplianceACK payload must not be nil")

		// All messages must be NACKs with the expected type/reason for unknown node inventory
		s.Equal(sensor.ComplianceACK_NACK, ack.GetAckType(), "expected NACK ComplianceACK")
		s.Equal(sensor.ComplianceMessage_NODE_INVENTORY, ack.GetMessageType(), "unexpected ComplianceACK message type for unknown node")
		s.Equal(sensor.ComplianceAckReason_NODE_UNKNOWN, ack.GetReason(), "unexpected ComplianceACK reason for unknown node")
		// ResourceId should identify the (unknown) node; exact value may be test-specific
		s.NotEmpty(ack.GetResourceId(), "ComplianceACK ResourceId should be populated for unknown node")
	}

	s.NoError(producer.Stopped().Wait())
	s.NoError(centralConsumer.Stopped().Wait())

	s.NoError(h.Start())
	// expect centralConsumer to get 0 messages - sensor should NACK to compliance when the connection with central is not ready
	centralConsumer := consumeAndCount(h.ResponsesC(), 0)

	// expect complianceConsumer to get 10 NACK messages (ComplianceACK only)
	// and verify they are ComplianceACK with the expected message fields
	nackMessagesCentralUnreachable := make([]*sensor.MsgToCompliance, 0, 10)
	for i := 0; i < 10; i++ {
		select {
		case msg := <-h.ComplianceC():
			nackMessagesCentralUnreachable = append(nackMessagesCentralUnreachable, msg)
		case <-time.After(5 * time.Second):
			s.Fail("timed out waiting for Compliance NACK messages when central is unreachable")
		}
	}
	s.Len(nackMessagesCentralUnreachable, 10, "expected exactly 10 Compliance NACK messages when central is unreachable")

	for _, msg := range nackMessagesCentralUnreachable {
		ackMsg, ok := msg.Msg.(*sensor.MsgToCompliance_ComplianceACK)
		s.Require().True(ok, "expected ComplianceACK message (no legacy ACKs)")
		ack := ackMsg.ComplianceACK
		s.Require().NotNil(ack, "ComplianceACK payload must not be nil")

		// All messages must be NACKs with the expected type/reason for central-unreachable inventory/index reports
		s.Equal(sensor.ComplianceACK_NACK, ack.GetAckType(), "expected NACK ComplianceACK")
		s.Equal(sensor.ComplianceMessage_NODE_INDEX_REPORT, ack.GetMessageType(), "unexpected ComplianceACK message type when central is unreachable")
		s.Equal(sensor.ComplianceAckReason_CENTRAL_UNREACHABLE, ack.GetReason(), "unexpected ComplianceACK reason when central is unreachable")
		s.NotEmpty(ack.GetResourceId(), "ComplianceACK ResourceId should be populated when central is unreachable")
	}

	s.NoError(producer.Stopped().Wait())
	s.NoError(centralConsumer.Stopped().Wait())

```

1. Ensure the test file imports the `time` package if it is not already imported:
   `import "time"`.
2. Replace the enum/constants used in the assertions with the actual ones from your protobufs:
   - `sensor.ComplianceACK_NACK` should match the real enum value for a NACK ComplianceACK.
   - `sensor.ComplianceMessage_NODE_INVENTORY` and `sensor.ComplianceMessage_NODE_INDEX_REPORT` should match the actual message-type enum values for node inventory and node index report messages.
   - `sensor.ComplianceAckReason_NODE_UNKNOWN` and `sensor.ComplianceAckReason_CENTRAL_UNREACHABLE` should match the actual reason enums for "node unknown" and "central unreachable".
3. If `sensor.MsgToCompliance` and its oneof access (`msg.Msg.(*sensor.MsgToCompliance_ComplianceACK)`) differ in your generated code, adjust the type assertion and field access accordingly.
4. If `consumeAndCount` also needs to be used elsewhere for counting only, keep its implementation; in these two cases we intentionally bypass it to validate message contents as well as counts.
</issue_to_address>


@vikin91
Contributor Author

vikin91 commented Mar 13, 2026

@sourcery-ai re-review

@sourcery-ai
Contributor

sourcery-ai bot commented Mar 13, 2026

Sure! I'm generating a new review now.

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've reviewed your changes and they look great!



@sourcery-ai
Contributor

sourcery-ai bot commented Mar 13, 2026

Hey @vikin91, I've posted a new review for you!

@rhacs-bot
Contributor

rhacs-bot commented Mar 13, 2026

Images are ready for the commit at 3854815.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-321-g3854815024.

@vikin91 vikin91 marked this pull request as ready for review March 13, 2026 16:31
@vikin91 vikin91 requested a review from a team as a code owner March 13, 2026 16:31
@codecov

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.64%. Comparing base (66a258b) to head (3854815).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
...r/common/compliance/node_inventory_handler_impl.go 67.79% 19 Missing ⚠️
compliance/compliance.go 95.23% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19419      +/-   ##
==========================================
- Coverage   49.70%   49.64%   -0.07%     
==========================================
  Files        2701     2702       +1     
  Lines      203453   203808     +355     
==========================================
+ Hits       101134   101176      +42     
- Misses      94790    95108     +318     
+ Partials     7529     7524       -5     
Flag Coverage Δ
go-unit-tests 49.64% <75.00%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.