ROX-33336: [Operator] Propagate cluster-wide TLS settings#18864
ROX-33336: [Operator] Propagate cluster-wide TLS settings#18864vladbologa wants to merge 13 commits intomasterfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
Images are ready for the commit at b9e88ce. To use with deploy scripts, first |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #18864 +/- ##
==========================================
+ Coverage 49.34% 49.38% +0.03%
==========================================
Files 2742 2746 +4
Lines 206953 207054 +101
==========================================
+ Hits 102126 102252 +126
+ Misses 97231 97216 -15
+ Partials 7596 7586 -10
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
65477e3 to
1947c4e
Compare
5f90175 to
67aa3a3
Compare
0f9fa38 to
be483be
Compare
be483be to
8a708ee
Compare
b3d11f9 to
6c56818
Compare
6c56818 to
25bf072
Compare
25bf072 to
b75703a
Compare
b75703a to
43f247e
Compare
0023d83 to
7f73d85
Compare
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- Consider using a bounded context (e.g., with timeout) instead of context.Background() when calling FetchProfile at startup, so a slow or non-responsive API server doesn’t block operator initialization indefinitely.
- When ConvertProfile returns nil because the adherence policy does not require honoring the cluster TLS profile and FORCE_OPENSHIFT_TLS_PROFILE is false, it may be helpful to log this decision once so users understand why TLS env vars are not injected even though a profile exists.
- The tls13Ciphers set in convert.go is hardcoded to the current three TLS 1.3 ciphers; consider delegating TLS 1.3 detection to library-go or another shared helper to avoid silently including new TLS 1.3 ciphers in future OpenShift releases.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider using a bounded context (e.g., with timeout) instead of context.Background() when calling FetchProfile at startup, so a slow or non-responsive API server doesn’t block operator initialization indefinitely.
- When ConvertProfile returns nil because the adherence policy does not require honoring the cluster TLS profile and FORCE_OPENSHIFT_TLS_PROFILE is false, it may be helpful to log this decision once so users understand why TLS env vars are not injected even though a profile exists.
- The tls13Ciphers set in convert.go is hardcoded to the current three TLS 1.3 ciphers; consider delegating TLS 1.3 detection to library-go or another shared helper to avoid silently including new TLS 1.3 ciphers in future OpenShift releases.
## Individual Comments
### Comment 1
<location path="operator/internal/tlsprofile/provider_test.go" line_range="145-152" />
<code_context>
+ assert.Equal(t, "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384", profile.OpenSSLCiphers)
+}
+
+func TestFetchProfile_StrictWithNilTLSSecurityProfile(t *testing.T) {
+ apiServer := &configv1.APIServer{
+ ObjectMeta: metav1.ObjectMeta{Name: "cluster"},
+ Spec: configv1.APIServerSpec{
+ TLSAdherence: configv1.TLSAdherencePolicyStrictAllComponents,
+ },
+ }
+
+ clusterTLS, err := FetchProfile(context.Background(), fakeClientWithAPIServer(apiServer))
+ require.NoError(t, err)
+ profile := ConvertProfile(clusterTLS, false)
+ require.NotNil(t, profile)
+ assert.Equal(t, "TLSv1.2", profile.MinVersion)
+}
+
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the nil TLSSecurityProfile test to also validate cipher conversion.
Since this path runs `FetchProfile` and `ConvertProfile` against library-go defaults, please also assert that `CipherSuites` and `OpenSSLCiphers` are populated and look reasonable (e.g., exclude TLS 1.3-only ciphers). That will verify default profiles yield valid cipher configuration, not just a minimum version.
```suggestion
clusterTLS, err := FetchProfile(context.Background(), fakeClientWithAPIServer(apiServer))
require.NoError(t, err)
profile := ConvertProfile(clusterTLS, false)
require.NotNil(t, profile)
assert.Equal(t, "TLSv1.2", profile.MinVersion)
// validate that default cipher configuration is populated and excludes TLS 1.3–only ciphers
assert.NotEmpty(t, profile.CipherSuites)
assert.Contains(t, profile.CipherSuites, "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384")
assert.Contains(t, profile.CipherSuites, "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384")
assert.NotContains(t, profile.CipherSuites, "TLS_AES_")
assert.NotEmpty(t, profile.OpenSSLCiphers)
assert.Contains(t, profile.OpenSSLCiphers, "ECDHE-ECDSA-AES256-GCM-SHA384")
assert.Contains(t, profile.OpenSSLCiphers, "ECDHE-RSA-AES256-GCM-SHA384")
assert.NotContains(t, profile.OpenSSLCiphers, "TLS_AES_")
}
```
</issue_to_address>
### Comment 2
<location path="operator/internal/tlsprofile/enricher_test.go" line_range="34-54" />
<code_context>
+ assert.Contains(t, envVars["ROX_OPENSSL_TLS_CIPHER_SUITES"], "ECDHE-ECDSA-AES256-GCM-SHA384")
+}
+
+func TestEnricher_UserValuesOverrideInjected(t *testing.T) {
+ e := NewEnricher(intermediateProfile)
+
+ vals := chartutil.Values{
+ "customize": map[string]interface{}{
+ "envVars": map[string]interface{}{
+ "ROX_TLS_MIN_VERSION": "TLSv1.3",
+ },
+ },
+ }
+ result, err := e.Enrich(context.Background(), &unstructured.Unstructured{}, vals)
+ require.NoError(t, err)
+
+ envVars, err := result.Table("customize.envVars")
+ require.NoError(t, err)
+
+ assert.Equal(t, "TLSv1.3", envVars["ROX_TLS_MIN_VERSION"],
+ "user-specified value should take precedence over injected value")
+ assert.Contains(t, envVars["ROX_TLS_CIPHER_SUITES"], "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
+ "non-overridden values should still be injected")
+}
</code_context>
<issue_to_address>
**suggestion (testing):** Extend override test to verify that unrelated user envVars are preserved alongside injected TLS vars.
To better cover the coalescing behavior, please add an extra non-TLS user env var (e.g. `"FOO": "bar"`) alongside `ROX_TLS_MIN_VERSION` and assert that it remains in `customize.envVars` after enrichment, confirming we don't drop arbitrary user-defined env vars when injecting TLS settings.
```suggestion
func TestEnricher_UserValuesOverrideInjected(t *testing.T) {
e := NewEnricher(intermediateProfile)
vals := chartutil.Values{
"customize": map[string]interface{}{
"envVars": map[string]interface{}{
"ROX_TLS_MIN_VERSION": "TLSv1.3",
"FOO": "bar",
},
},
}
result, err := e.Enrich(context.Background(), &unstructured.Unstructured{}, vals)
require.NoError(t, err)
envVars, err := result.Table("customize.envVars")
require.NoError(t, err)
assert.Equal(t, "TLSv1.3", envVars["ROX_TLS_MIN_VERSION"],
"user-specified value should take precedence over injected value")
assert.Contains(t, envVars["ROX_TLS_CIPHER_SUITES"], "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
"non-overridden values should still be injected")
assert.Equal(t, "bar", envVars["FOO"],
"unrelated user-specified env vars should be preserved alongside injected TLS settings")
}
```
</issue_to_address>
### Comment 3
<location path="operator/cmd/main.go" line_range="150" />
<code_context>
}
defer restore()
+ // Read the cluster TLS profile once at startup. The TLSProfileWatcher
+ // restarts the Operator when this changes, so a single read is always current.
+ // The result is used for both the Operator's own metrics server TLS and the
</code_context>
<issue_to_address>
**issue (complexity):** Consider encapsulating the TLS profile bootstrap, conversion, and watcher wiring into helper functions so that run() focuses only on wiring controllers and starting the manager.
You can keep all the TLS-profile functionality while reducing `run()`’s cognitive load by pushing the orchestration into the `tlsprofile` package (or a small local helper). The main benefits: `run()` goes back to being “wire controllers + start manager”, and TLS concerns become self-contained.
### 1. Encapsulate TLS profile bootstrap + operand conversion + watcher
Instead of handling client creation, profile fetch, conversion, and watcher wiring inline, introduce a helper that returns everything `run()` needs:
```go
// in internal/tlsprofile/init.go (for example)
package tlsprofile
import (
"context"
"crypto/tls"
configv1 "github.com/openshift/api/config/v1"
tlspkg "github.com/openshift/controller-runtime-common/pkg/tls"
"github.com/pkg/errors"
ctrl "sigs.k8s.io/controller-runtime"
ctrlClient "sigs.k8s.io/controller-runtime/pkg/client"
)
type InitResult struct {
ClusterProfile *configv1.TLSSecurityProfile
OperandProfile OperandTLSProfile // whatever ConvertProfile returns
ManagerTLSOption func(*tls.Config) // nil if no cluster profile
}
func Initialize(
ctx context.Context,
mgr ctrl.Manager,
cfg *rest.Config, // utils.GetRHACSConfigOrDie()
scheme *runtime.Scheme,
forceOpenShift bool,
log logr.Logger,
) (*InitResult, error) {
bootstrapClient, err := ctrlClient.New(cfg, ctrlClient.Options{Scheme: scheme})
if err != nil {
return nil, errors.Wrap(err, "create bootstrap client for TLS profile")
}
clusterProfile, err := FetchProfile(ctx, bootstrapClient)
if err != nil {
return nil, errors.Wrap(err, "fetch cluster TLS profile")
}
var tlsOpt func(*tls.Config)
if clusterProfile != nil {
tlsConfigFn, unsupported := tlspkg.NewTLSConfigFromProfile(clusterProfile.ProfileSpec)
if len(unsupported) > 0 {
log.Info("some ciphers from cluster TLS profile are not supported by Go, skipping them", "ciphers", unsupported)
}
tlsOpt = tlsConfigFn
}
operandProfile := ConvertProfile(clusterProfile, forceOpenShift)
return &InitResult{
ClusterProfile: clusterProfile,
OperandProfile: operandProfile,
ManagerTLSOption: tlsOpt,
}, nil
}
```
Then `run()` can be simplified to something like:
```go
ctx := context.Background()
tlsInit, err := tlsprofile.Initialize(
ctx,
mgr,
utils.GetRHACSConfigOrDie(),
scheme,
forceOpenShiftTLSProfile.BooleanSetting(),
setupLog,
)
if err != nil {
return err
}
var tlsOpts []func(*tls.Config)
if tlsInit.ManagerTLSOption != nil {
tlsOpts = append(tlsOpts, tlsInit.ManagerTLSOption)
}
if !enableHTTP2 {
tlsOpts = append(tlsOpts, func(c *tls.Config) {
c.NextProtos = []string{"http/1.1"}
})
}
// ...
operandTLSProfile := tlsInit.OperandProfile
if centralReconcilerEnabled.BooleanSetting() {
if err = centralReconciler.RegisterNewReconciler(mgr, centralLabelSelector, operandTLSProfile); err != nil {
return errors.Wrap(err, "unable to set up Central reconciler")
}
}
```
### 2. Hide watcher-specific context wiring
Similarly, hide the watcher + cancel wiring behind a helper so `run()` keeps the familiar pattern:
```go
// in internal/tlsprofile/init.go
func WithProfileWatcherContext(
baseCtx context.Context,
mgr ctrl.Manager,
clusterProfile *configv1.TLSSecurityProfile,
) (context.Context, error) {
if clusterProfile == nil {
return baseCtx, nil
}
ctx, cancel := context.WithCancel(baseCtx)
if err := SetupTLSProfileWatcher(mgr, *clusterProfile, cancel); err != nil {
cancel()
return nil, errors.Wrap(err, "set up TLS profile watcher")
}
return ctx, nil
}
```
And in `run()`:
```go
baseCtx := ctrl.SetupSignalHandler()
ctx, err := tlsprofile.WithProfileWatcherContext(baseCtx, mgr, tlsInit.ClusterProfile)
if err != nil {
return err
}
setupLog.Info("starting manager")
if err = mgr.Start(ctx); err != nil {
return errors.Wrap(err, "problem running manager")
}
```
These small helpers keep all the introduced behavior (profile fetch, operand conversion, manager TLS config, and watcher-triggered shutdown) but move the low-level details out of `run()`, which should make the startup flow easier to follow and maintain.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
7c68f15 to
8f7581f
Compare
|
/konflux-retest operator-bundle-on-push |
8f7581f to
b9e88ce
Compare
|
/test gke-operator-e2e-tests |
Description
When an OpenShift cluster has a cluster-wide TLS security profile configured via
apiserver.config.openshift.io/cluster, the Operator now:ROX_TLS_MIN_VERSION,ROX_TLS_CIPHER_SUITES, andROX_OPENSSL_TLS_CIPHER_SUITESenvironment variables (when thespec.tlsAdherencepolicy requires it, or whenFORCE_OPENSHIFT_TLS_PROFILE=true). The individual deployments can use the env var that is suitable for them (e.g. Go apps would useROX_TLS_CIPHER_SUITES, C++/Rust/Postgres components would useROX_OPENSSL_TLS_CIPHER_SUITES)The propagation uses
customize.envVarsin the Helm values, which would also allows users to set per-component overrides. TheFORCE_OPENSHIFT_TLS_PROFILEOperator env var allows enforcement on clusters wherespec.tlsAdherenceis not yet available.On non-OpenShift clusters, the feature is a no-op.
This PR follows the recommendations from this document, but adapted to the fact that we need to also support non-Openshift k8s. Note that in our implementation, we fetch the TLS settings only in the Operator. This avoids having multiple implementations in various languages for all our components and makes sure that the handling of these settings is consistent, by parsing them in a single place.
User-facing documentation
Testing and quality
Automated testing
How I validated my change
Tested on an OpenShift 4.21 cluster with
CentralandSecuredClusterCRs deployed. Verified env vars on all 8 workloads: central, central-db, sensor, admission-control, scanner-v4-indexer, scanner-v4-matcher, scanner-v4-db, collector.Default (Intermediate) profile with
FORCE_OPENSHIFT_TLS_PROFILE=true:FORCE_OPENSHIFT_TLS_PROFILE=trueon the operator CSVtlsSecurityProfileset on the cluster (defaults to Intermediate)ROX_TLS_MIN_VERSION=TLSv1.2, 6 ECDHE ciphers in both IANA and OpenSSL formatsModern profile (watcher restart):
apiserver.config.openshift.io/clustertotype: Moderncluster TLS profile changed, restarting Operator to apply new settings, oldMinTLSVersion=VersionTLS12, newMinTLSVersion=VersionTLS13ROX_TLS_MIN_VERSION=TLSv1.3, empty cipher lists (TLS 1.3 ciphers handled separately by Go/OpenSSL)Custom profile (watcher restart):
minTLSVersion: VersionTLS12ROX_TLS_MIN_VERSION=TLSv1.2, exactly the 3 specified ciphers in both IANA (TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) and OpenSSL formatNote that I had to use
FORCE_OPENSHIFT_TLS_PROFILE=truefor testing, as the tlsAdherence field will only be available in OpenShift 4.22.