ROX-33361: Per-namespace persistence for process indicators#19957
ROX-33361: Per-namespace persistence for process indicators#19957
Conversation
🚀 Build Images ReadyImages are ready for commit f03bc24. To use with deploy scripts: export MAIN_IMAGE_TAG=4.11.x-632-gf03bc24f22 |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #19957 +/- ##
==========================================
+ Coverage 49.56% 49.63% +0.06%
==========================================
Files 2764 2766 +2
Lines 208357 208732 +375
==========================================
+ Hits 103276 103595 +319
- Misses 97430 97470 +40
- Partials 7651 7667 +16
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
680576b to
e2ec42b
Compare
|
/test ocp-4-21-qa-e2e-tests |
| // The difference between Static and Dynamic cluster config is that Dynamic values are sent over the Central to Sensor gRPC connection. This has the benefit of allowing for "hot reloading" of values without restarting Secured cluster components. | ||
| message DynamicClusterConfig { | ||
| message ProcessIndicators { | ||
| string namespace_filter = 1; |
There was a problem hiding this comment.
It's not obvious whether this filter means namespaces to include or exclude.
There was a problem hiding this comment.
+1. exclude_namespace_filter? namespace_exclusion_filter?
| message ProcessIndicators { | ||
| string namespace_filter = 1; | ||
| bool exclude_openshift_ns = 2; | ||
| bool persistence = 3; |
There was a problem hiding this comment.
What is the meaning of persistence?
There was a problem hiding this comment.
Let me add more commentaries to the proto file.
There was a problem hiding this comment.
Done, does it look better?
There was a problem hiding this comment.
I think a few stated in the design that this should be the first item as it controls the other 2.
57f2aa8 to
cad5788
Compare
|
The only failing tests are flake #19959 |
| // The difference between Static and Dynamic cluster config is that Dynamic values are sent over the Central to Sensor gRPC connection. This has the benefit of allowing for "hot reloading" of values without restarting Secured cluster components. | ||
| message DynamicClusterConfig { | ||
| // Configure per-namespace persistence for ProcessIndicators | ||
| message ProcessIndicators { |
There was a problem hiding this comment.
nit: ProcessIndicatorConfig
| if filter, err := clusterPkg.GetNamespaceFilter(c); err == nil { | ||
| compiledFilter, err := regexp.Compile(filter) | ||
|
|
||
| if err == nil { |
There was a problem hiding this comment.
We should log this error so we don't lose it.
| if filter, err := clusterPkg.GetNamespaceFilter(cluster); err == nil { | ||
| compiledFilter, err := regexp.Compile(filter) | ||
|
|
||
| if err == nil { |
There was a problem hiding this comment.
Adding onto this - logging the error is great, but the cluster upsert above already succeeded at this point with no indication that the filter is broken. Should the regex be validated higher up the chain so invalid patterns are rejected before this?
|
|
||
| // No configuration for runtime data means no filter | ||
| if config == nil { | ||
| return "", errors.New("No runtime data configuration") |
There was a problem hiding this comment.
Is this really an error? Should it be return nil, nil?
|
|
||
| // If everything is present, but empty, set no filter | ||
| if filter == "" { | ||
| return "", errors.New("Empty runtime data configuration") |
There was a problem hiding this comment.
Is this really an error? Should it be return nil, nil?
6deebde to
eda56d2
Compare
Allow to configure per-namespace persistence for process indicators, so
that Central wouldn't need to store information, which never will be used.
It could be configured via DynamicConfig of the cluster configuration in
the form:
```
message ProcessIndicators {
bool no_persistence = 1;
string exclude_namespace_filter = 2;
bool exclude_openshift_ns = 3;
}
```
Where `exclude_namespace_filter` allows to specify a custom regex to
filter out processes by matching namespace, `exclude_openshift_ns`
instructs Central to exclude anything from openshift-* namespaces, and
`no_persistence` can be used to disable storing process indicators at all.
eda56d2 to
f03bc24
Compare
|
/retest |
|
@erthalion: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| ds.idToNameCache.Add(cluster.GetId(), cluster.GetName()) | ||
| ds.nameToIDCache.Add(cluster.GetName(), cluster.GetId()) | ||
|
|
||
| if filter := clusterPkg.GetNamespaceFilter(cluster); filter != nil { |
There was a problem hiding this comment.
If a cluster previously had a namespace filter configured and the user later removes it (all fields empty/false), GetNamespaceFilter would return nil, so we skip this block. But the old compiled regex stays in idToRegexCache, so indicators will keep being filtered even after the filter is disabled. Do we need an else branch here that removes the cluster ID from the idToRegexCache ?
(same pattern in buildCache around line 186)
| return nil | ||
| } | ||
|
|
||
| return pointers.String(filter) |
There was a problem hiding this comment.
Should this be anchored with ^ and $? e.g. return pointers.String(fmt.Sprintf("^(?:%s)$", filter)). Currently, a filter like "test-.*" would match namespace "my-test-foo" because the pattern matches within the string.
There was a problem hiding this comment.
No, if we do that we are assuming intent of the filter that we don't control. I think the onus is on the customer to supply the regex they intend to supply. The platform config regexes include the anchors in the regex itself as they should.
|
|
||
| match, err := m.clusterDataStore.MatchProcessIndicator(lifecycleMgrCtx, indicator) | ||
|
|
||
| if err == nil && match { |
There was a problem hiding this comment.
nitpick - can we invert this to avoid having an empty if body? Something like
if err != nil {
log.Errorf("Cannot match indicator %s: %v", indicator.GetId(), err)
}
if err != nil || !match {
indicatorSlice = append(indicatorSlice, indicator)
}
Description
NOTE: It's been split from #19455, for the purposes of simplifying the review. The PR contains only the first commit, introducing the actual machinery. The implementation is exactly the same as in the original PR.
Allow to configure per-namespace persistence for process indicators, so that Central wouldn't need to store information, which never will be used.
It could be configured via DynamicConfig of the cluster configuration in the form:
Where
namespace_filterallows to specify a custom regex to filter out processes by matching namespace,exclude_openshift_nsinstructs Central to exclude anything from openshift-* namespaces, andpersistencecan be used to disable storing process indicators at all.User-facing documentation
Testing and quality
Automated testing
How I validated my change
Manual validation (creating an operator-managed cluster and modifying the configuration), as well as E2E tests. Split from #19455 to simplify the review.