ROX-31227: Compute TP equivalence hash in Central#19924
ROX-31227: Compute TP equivalence hash in Central#19924
Conversation
Add namespace and set_values to the internal API ComplianceOperatorProfileV2 so Sensor forwards these fields to Central for hash computation. Add equivalence_hash to the storage proto for persisting the computed result. Previously the hash was computed in Sensor, which caused inconsistent hashes across clusters during rolling upgrades when the algorithm changed between Sensor versions. Computing in Central (single binary) eliminates this problem. Partially generated by AI.
Move equivalence hash computation from Sensor to Central to ensure consistent hashes across clusters regardless of Sensor version. Sensor changes: - Populate namespace and set_values on the V2 profile message so Central has all fields needed for hash computation. Central changes: - computeEquivalenceHash() in the conversion layer produces a SHA-256 digest from name, namespace, description, title, sorted rules, and sorted set_values (name+value only; rationale excluded as it is documentation, not configuration). - Hash is computed only for TAILORED_PROFILE kind; regular profiles use name-based equivalence and get an empty hash. No DB migration needed: CO resources skip deduplication and are re-sent on every Sensor reconnect, so hashes are populated on next sync. Partially generated by AI.
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The manual string concatenation in
computeEquivalenceHash(with ad‑hoc NUL separators and custom sorting) tightly couples the hash to this implementation detail; consider basing the hash on a canonicalised struct or protobuf serialization of the relevant fields so that future changes to the representation are less likely to introduce subtle collisions or omissions. - Right now only the rule name is included in the hash input; if other rule attributes (e.g. remediation or severity) ever become relevant for equivalence, it will be easy to forget to update
computeEquivalenceHash, so it may be worth adding a short comment near the rules loop stating that only the name is intentionally considered.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The manual string concatenation in `computeEquivalenceHash` (with ad‑hoc NUL separators and custom sorting) tightly couples the hash to this implementation detail; consider basing the hash on a canonicalised struct or protobuf serialization of the relevant fields so that future changes to the representation are less likely to introduce subtle collisions or omissions.
- Right now only the rule name is included in the hash input; if other rule attributes (e.g. remediation or severity) ever become relevant for equivalence, it will be easy to forget to update `computeEquivalenceHash`, so it may be worth adding a short comment near the rules loop stating that only the name is intentionally considered.
## Individual Comments
### Comment 1
<location path="central/convert/internaltov2storage/compliance_operator_profile_test.go" line_range="171-175" />
<code_context>
+ []string{"rule-a"}, []*central.ComplianceOperatorProfileV2_SetValue{svA, svB},
+ central.ComplianceOperatorProfileV2_TAILORED_PROFILE))
+
+ tests := map[string][]*central.ComplianceOperatorProfileV2_SetValue{
+ "different value": {{Name: "var-a", Value: "other"}, svB},
+ "different name": {{Name: "var-x", Value: "va"}, svB},
+ "extra entry": {svA, svB, {Name: "var-c", Value: "vc"}},
+ "fewer entries": {svA},
+ }
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Fix type mismatch in `TestEquivalenceHash_SetValuesSensitivity` map literal
In `TestEquivalenceHash_SetValuesSensitivity`, `tests` is typed as `map[string][]*central.ComplianceOperatorProfileV2_SetValue`, but the `"extra entry"` case includes `{Name: "var-c", Value: "vc"}` as a value, not a pointer. This won’t compile. Use `¢ral.ComplianceOperatorProfileV2_SetValue{Name: "var-c", Value: "vc"}` (and ensure all elements are pointers) so the test type-checks.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| tests := map[string][]*central.ComplianceOperatorProfileV2_SetValue{ | ||
| "different value": {{Name: "var-a", Value: "other"}, svB}, | ||
| "different name": {{Name: "var-x", Value: "va"}, svB}, | ||
| "extra entry": {svA, svB, {Name: "var-c", Value: "vc"}}, | ||
| "fewer entries": {svA}, |
There was a problem hiding this comment.
issue (bug_risk): Fix type mismatch in TestEquivalenceHash_SetValuesSensitivity map literal
In TestEquivalenceHash_SetValuesSensitivity, tests is typed as map[string][]*central.ComplianceOperatorProfileV2_SetValue, but the "extra entry" case includes {Name: "var-c", Value: "vc"} as a value, not a pointer. This won’t compile. Use ¢ral.ComplianceOperatorProfileV2_SetValue{Name: "var-c", Value: "vc"} (and ensure all elements are pointers) so the test type-checks.
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR adds support for tailored compliance profiles by introducing equivalence hashing. It extends the internal API schema with Changes
Sequence Diagram(s)sequenceDiagram
participant Sensor
participant Central
participant Storage
Sensor->>Central: Report ComplianceOperatorProfileV2<br/>(with namespace, set_values)
Central->>Central: Convert internal to storage format
Note over Central: Determine operatorKind
alt operatorKind == TAILORED_PROFILE
Central->>Central: computeEquivalenceHash()<br/>SHA-256(name, namespace,<br/>description, title,<br/>sorted rules, sorted set_values)
Central->>Central: Assign equivalence_hash
else non-tailored profile
Central->>Central: equivalence_hash = "" (empty)
end
Central->>Storage: Store ComplianceOperatorProfileV2<br/>(with equivalence_hash)
Storage-->>Storage: Persist profile record
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@central/convert/internaltov2storage/compliance_operator_profile.go`:
- Around line 119-149: computeEquivalenceHash currently preserves duplicate
entries when building ruleNames and setVals, making the hash sensitive to
multiplicity; after sorting the ruleNames slice and the setVals slice (the
nameValue slice sorted via slices.SortFunc) deduplicate consecutive equal
entries (for rules compare rule name, for set_values compare both name and
value) before writing to the hash in computeEquivalenceHash so only unique
effective entries are hashed; update the function that builds ruleNames and
setVals to remove duplicates after sorting and add a regression test that
constructs messages with repeated rules and set_values to assert identical
hashes for duplicated vs. deduped inputs (reference computeEquivalenceHash,
ruleNames, setVals, and the nameValue sort comparison).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: ad38a3bb-0b9a-41d5-aa8c-0c2fa67da4bf
⛔ Files ignored due to path filters (5)
generated/internalapi/central/compliance_operator.pb.gois excluded by!**/*.pb.go,!**/generated/**generated/internalapi/central/compliance_operator_vtproto.pb.gois excluded by!**/*.pb.go,!**/generated/**generated/internalapi/central/v1/token_service.pb.gois excluded by!**/*.pb.go,!**/generated/**generated/storage/compliance_operator_v2.pb.gois excluded by!**/*.pb.go,!**/generated/**generated/storage/compliance_operator_v2_vtproto.pb.gois excluded by!**/*.pb.go,!**/generated/**
📒 Files selected for processing (6)
central/convert/internaltov2storage/compliance_operator_profile.gocentral/convert/internaltov2storage/compliance_operator_profile_test.goproto/internalapi/central/compliance_operator.protoproto/storage/compliance_operator_v2.protosensor/kubernetes/complianceoperator/dispatchers/complianceoperatortailoredprofiles.gosensor/kubernetes/complianceoperator/dispatchers/complianceoperatortailoredprofiles_test.go
| // Rules: sorted for order independence. | ||
| ruleNames := make([]string, 0, len(msg.GetRules())) | ||
| for _, r := range msg.GetRules() { | ||
| ruleNames = append(ruleNames, r.GetRuleName()) | ||
| } | ||
| slices.Sort(ruleNames) | ||
| for _, r := range ruleNames { | ||
| _, _ = h.Write([]byte(r)) | ||
| _, _ = h.Write([]byte{0}) | ||
| } | ||
| // Section separator between rules and set_values. | ||
| _, _ = h.Write([]byte{0}) | ||
|
|
||
| // SetValues: sorted by name then value for order independence. | ||
| type nameValue struct{ name, value string } | ||
| setVals := make([]nameValue, 0, len(msg.GetSetValues())) | ||
| for _, sv := range msg.GetSetValues() { | ||
| setVals = append(setVals, nameValue{name: sv.GetName(), value: sv.GetValue()}) | ||
| } | ||
| slices.SortFunc(setVals, func(a, b nameValue) int { | ||
| if c := strings.Compare(a.name, b.name); c != 0 { | ||
| return c | ||
| } | ||
| return strings.Compare(a.value, b.value) | ||
| }) | ||
| for _, sv := range setVals { | ||
| _, _ = h.Write([]byte(sv.name)) | ||
| _, _ = h.Write([]byte{0}) | ||
| _, _ = h.Write([]byte(sv.value)) | ||
| _, _ = h.Write([]byte{0}) | ||
| } |
There was a problem hiding this comment.
Canonicalize duplicates before hashing.
computeEquivalenceHash() normalizes ordering, but it still hashes duplicate rules / set_values as distinct entries. That makes the digest depend on declaration multiplicity instead of effective configuration. The dispatcher already appends base rules and enabled rules without deduping in sensor/kubernetes/complianceoperator/dispatchers/complianceoperatortailoredprofiles.go at Lines 92-104, so a redundant enable can change the hash and break cross-cluster equivalence for otherwise identical tailored profiles. Please dedupe after sorting, and add a regression test for duplicate rules/set_values.
Proposed fix
slices.Sort(ruleNames)
- for _, r := range ruleNames {
+ var prevRule string
+ hasPrevRule := false
+ for _, r := range ruleNames {
+ if hasPrevRule && r == prevRule {
+ continue
+ }
_, _ = h.Write([]byte(r))
_, _ = h.Write([]byte{0})
+ prevRule = r
+ hasPrevRule = true
}
// Section separator between rules and set_values.
_, _ = h.Write([]byte{0})
// SetValues: sorted by name then value for order independence.
type nameValue struct{ name, value string }
setVals := make([]nameValue, 0, len(msg.GetSetValues()))
for _, sv := range msg.GetSetValues() {
setVals = append(setVals, nameValue{name: sv.GetName(), value: sv.GetValue()})
}
slices.SortFunc(setVals, func(a, b nameValue) int {
if c := strings.Compare(a.name, b.name); c != 0 {
return c
}
return strings.Compare(a.value, b.value)
})
- for _, sv := range setVals {
+ var prevSetVal nameValue
+ hasPrevSetVal := false
+ for _, sv := range setVals {
+ if hasPrevSetVal && sv == prevSetVal {
+ continue
+ }
_, _ = h.Write([]byte(sv.name))
_, _ = h.Write([]byte{0})
_, _ = h.Write([]byte(sv.value))
_, _ = h.Write([]byte{0})
+ prevSetVal = sv
+ hasPrevSetVal = true
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Rules: sorted for order independence. | |
| ruleNames := make([]string, 0, len(msg.GetRules())) | |
| for _, r := range msg.GetRules() { | |
| ruleNames = append(ruleNames, r.GetRuleName()) | |
| } | |
| slices.Sort(ruleNames) | |
| for _, r := range ruleNames { | |
| _, _ = h.Write([]byte(r)) | |
| _, _ = h.Write([]byte{0}) | |
| } | |
| // Section separator between rules and set_values. | |
| _, _ = h.Write([]byte{0}) | |
| // SetValues: sorted by name then value for order independence. | |
| type nameValue struct{ name, value string } | |
| setVals := make([]nameValue, 0, len(msg.GetSetValues())) | |
| for _, sv := range msg.GetSetValues() { | |
| setVals = append(setVals, nameValue{name: sv.GetName(), value: sv.GetValue()}) | |
| } | |
| slices.SortFunc(setVals, func(a, b nameValue) int { | |
| if c := strings.Compare(a.name, b.name); c != 0 { | |
| return c | |
| } | |
| return strings.Compare(a.value, b.value) | |
| }) | |
| for _, sv := range setVals { | |
| _, _ = h.Write([]byte(sv.name)) | |
| _, _ = h.Write([]byte{0}) | |
| _, _ = h.Write([]byte(sv.value)) | |
| _, _ = h.Write([]byte{0}) | |
| } | |
| // Rules: sorted for order independence. | |
| ruleNames := make([]string, 0, len(msg.GetRules())) | |
| for _, r := range msg.GetRules() { | |
| ruleNames = append(ruleNames, r.GetRuleName()) | |
| } | |
| slices.Sort(ruleNames) | |
| var prevRule string | |
| hasPrevRule := false | |
| for _, r := range ruleNames { | |
| if hasPrevRule && r == prevRule { | |
| continue | |
| } | |
| _, _ = h.Write([]byte(r)) | |
| _, _ = h.Write([]byte{0}) | |
| prevRule = r | |
| hasPrevRule = true | |
| } | |
| // Section separator between rules and set_values. | |
| _, _ = h.Write([]byte{0}) | |
| // SetValues: sorted by name then value for order independence. | |
| type nameValue struct{ name, value string } | |
| setVals := make([]nameValue, 0, len(msg.GetSetValues())) | |
| for _, sv := range msg.GetSetValues() { | |
| setVals = append(setVals, nameValue{name: sv.GetName(), value: sv.GetValue()}) | |
| } | |
| slices.SortFunc(setVals, func(a, b nameValue) int { | |
| if c := strings.Compare(a.name, b.name); c != 0 { | |
| return c | |
| } | |
| return strings.Compare(a.value, b.value) | |
| }) | |
| var prevSetVal nameValue | |
| hasPrevSetVal := false | |
| for _, sv := range setVals { | |
| if hasPrevSetVal && sv == prevSetVal { | |
| continue | |
| } | |
| _, _ = h.Write([]byte(sv.name)) | |
| _, _ = h.Write([]byte{0}) | |
| _, _ = h.Write([]byte(sv.value)) | |
| _, _ = h.Write([]byte{0}) | |
| prevSetVal = sv | |
| hasPrevSetVal = true | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@central/convert/internaltov2storage/compliance_operator_profile.go` around
lines 119 - 149, computeEquivalenceHash currently preserves duplicate entries
when building ruleNames and setVals, making the hash sensitive to multiplicity;
after sorting the ruleNames slice and the setVals slice (the nameValue slice
sorted via slices.SortFunc) deduplicate consecutive equal entries (for rules
compare rule name, for set_values compare both name and value) before writing to
the hash in computeEquivalenceHash so only unique effective entries are hashed;
update the function that builds ruleNames and setVals to remove duplicates after
sorting and add a regression test that constructs messages with repeated rules
and set_values to assert identical hashes for duplicated vs. deduped inputs
(reference computeEquivalenceHash, ruleNames, setVals, and the nameValue sort
comparison).
🚀 Build Images ReadyImages are ready for commit b13e27c. To use with deploy scripts: export MAIN_IMAGE_TAG=4.11.x-609-gb13e27c3bb |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #19924 +/- ##
==========================================
+ Coverage 49.61% 49.64% +0.03%
==========================================
Files 2765 2765
Lines 208541 208593 +52
==========================================
+ Hits 103464 103553 +89
+ Misses 97401 97363 -38
- Partials 7676 7677 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Description
Move tailored profile equivalence hash computation from Sensor to Central.
Problem: PR #19561 computed the hash in Sensor. During rolling upgrades when Sensors run different versions, algorithm changes produce inconsistent hashes for identical profiles across clusters.
Solution: Sensor forwards namespace and set_values to Central. Central computes a SHA-256 digest from the profile's content fields (name, namespace, description, title, sorted rules, sorted set_values). The hash is stored only for tailored profiles — regular profiles use name-based equivalence.
Alternative considered: keeping hash in Sensor with a version field to detect mismatches. Rejected because it adds complexity and still requires Central to reconcile different versions.
Key decisions:
equivalence_hashnot in internal API proto — Sensor never computes itUser-facing documentation
Testing and quality
Automated testing
How I validated my change
go build ./central/... ./sensor/...— cleangolangci-lint run— 0 issues