ROX-12934: Fix operator reconciliation when external DB is in use#3322
ROX-12934: Fix operator reconciliation when external DB is in use#3322vladbologa merged 16 commits intomasterfrom
Conversation
|
Images are ready for the commit at e3059f6. To use with deploy scripts, first |
5f7c4e7 to
89fefa6
Compare
83cf14a to
823b611
Compare
0b709ce to
5213d5f
Compare
operator/pkg/central/extensions/reconcile_central_db_password.go
Outdated
Show resolved
Hide resolved
| tail: -1 | ||
| --- | ||
| apiVersion: apps/v1 | ||
| kind: Deployment |
There was a problem hiding this comment.
Might be good to also assert health of the central deployment in addition to central-db's health?
| cv.SetError(errors.New("if a connection string is provided, no persistence settings must be supplied")) | ||
| } | ||
|
|
||
| // TODO: there are other settings which are ignored in external mode - should we error if those are set, too? |
There was a problem hiding this comment.
+1 for erroring loud and clear on dubious configurations
There was a problem hiding this comment.
I can add it. My main reservation against this stemmed from the fact that switching between configurations gets challenging, as you cannot comment out stuff in objects on etcd when doing, e.g., oc edit. But I guess it's not a very important use case to switch back and forth between internal and external DB.
There was a problem hiding this comment.
Yeah, I think now I remember the problem, but that problem is even bigger.
The OpenShift console has the very unfortunate property of importing OpenAPI field defaults into the form and then acting as if the user has set those field values. See:
Screen.Recording.2022-10-26.at.18.26.49.mov
Currently there's no issue because Gorman didn't introduce a default value for the PVC name for the DB, but, arguably, there should be one, as we also have a stackrox-db default for the central persistence, and setting a default for a field is really nothing very unreasonable.
The bigger problem now is that setting a default for any of the fields means a user using OpenShift console can no longer disable central DB. OpenShift console simply does not respect the fact that (non-)nullness of fields in the CR might carry any meaning, it uses a lossy round-trip between the two representations. This, by the way, also applies in the other direction: with the existing CR structure, it is impossible to specify "use central DB, but with default settings" in the form view in OpenShift console, you have to set any of the fields (such as explicitly typing in the default central-db pvc name) or use YAML view.
@porridge I know you had some discussion with Gorman on the original PR re: an explicit boolean field vs the mere presence of another field (this was about external, not the DB as a whole). Putting the aspect of bool vs enum aside: what is your opinion on (a) avoiding explicit enabled/disabled fields and (b) erroring on inconsistent combinations of set fields, in light of these OpenShift console form view realities?
There was a problem hiding this comment.
Oh, wow. This is a game-changer. 😞
I guess this means that mere presence of a struct can no longer have any meaning.
This in turn means CentralDBEnabled() needs to look at a new explicit toggle (I would very much prefer an enum on the CentralComponentSpec level rather than bools) rather than at the nilness of .DB 🤔
Please also mention in a (non-user visible) comment somewhere why we don't look at struct nilness. This is probably not obvious for a non-openshift developer.
| return errors.New("no password secret was specified") | ||
| } | ||
|
|
||
| passwordSecretName := r.centralObj.Spec.Central.DB.PasswordSecret.Name |
There was a problem hiding this comment.
| passwordSecretName := r.centralObj.Spec.Central.DB.PasswordSecret.Name | |
| passwordSecretName := r.centralObj.Spec.Central.DB.GetPasswordSecret().Name |
There was a problem hiding this comment.
I think this only makes sense if there's a nil-safe getter for Name. Adding this.
There was a problem hiding this comment.
actually, we already check that GetPasswordSecret() is non-nil above, so I don't think this adds any value.
| return errors.Wrapf(err, "reading central db password from secret %s", passwordSecretName) | ||
| } | ||
|
|
||
| r.password = password |
There was a problem hiding this comment.
Why not returning the password and setting it from the main function? It feels odd to me because the result of that function is not returned, instead in sets state on the struct.
Setting it from the caller makes it more explicit to me.
There was a problem hiding this comment.
I can change it. I don't feel very strongly tbh because the reconciliation execution structs are heavily dependent on modifying state. For example, validate also sets the current password, and this is intended.
There was a problem hiding this comment.
Actually, I think it makes more sense to leave the behavior as-is. I will rename the function to readAndSetPassword...
| Namespace: testutils.TestNamespace, | ||
| }, | ||
| Data: map[string][]byte{ | ||
| "password": []byte(fmt.Sprintf("%s\n", pw1)), |
There was a problem hiding this comment.
Nice job, did not thought about this test-case.
| return central.Spec.Central.GetPersistence() | ||
| case DefaultCentralDBPVCName: | ||
| return convertDBPersistenceToPersistence(central.Spec.Central.DB.GetPersistence()) | ||
| func getPersistenceByTarget(central *platform.Central, target PVCTarget) *platform.Persistence { |
There was a problem hiding this comment.
Can you add a small doc string what this function is used for? I found it hard to understand, looking into the caller also did not really helped me.
5213d5f to
da160c9
Compare
| // When using this option, you must explicitly set a password secret; automatically generating a password will not | ||
| // be supported. |
There was a problem hiding this comment.
| // When using this option, you must explicitly set a password secret; automatically generating a password will not | |
| // be supported. | |
| // When using this option, you must explicitly set a password secret; a password will not be automatically generated. |
| cv.SetError(errors.New("if a connection string is provided, no persistence settings must be supplied")) | ||
| } | ||
|
|
||
| // TODO: there are other settings which are ignored in external mode - should we error if those are set, too? |
There was a problem hiding this comment.
Oh, wow. This is a game-changer. 😞
I guess this means that mere presence of a struct can no longer have any meaning.
This in turn means CentralDBEnabled() needs to look at a new explicit toggle (I would very much prefer an enum on the CentralComponentSpec level rather than bools) rather than at the nilness of .DB 🤔
Please also mention in a (non-user visible) comment somewhere why we don't look at struct nilness. This is probably not obvious for a non-openshift developer.
operator/pkg/central/extensions/reconcile_central_db_password_test.go
Outdated
Show resolved
Hide resolved
|
@misberner: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
d806cde to
e3059f6
Compare
Description
This PR fixes the reconciliation of the central-db persistent volume and password.
Unfortunately, the JIRA is missing a concrete description of the issue (and also doesn't cover all the existing issues). However, this PR functionally addresses the following:
central-db-secret, the contents of the user-supplied secret are mirrored into acentral-db-secretsecret (as only that is mounted into Central)passwordentry in the manually specified secret changed, thecentral-db-secretpassword is also updated upon reconciliationcentral-db-secret, and a password is manually specified in a different secret, reconciliation will fail if the passwords are differentOn the non-functional sides, this PR improves some naming and adds extensive unit tests for central DB password reconciler, as well as kuttl tests.
This PR is based upon #3293 by @connorgorman .
Checklist
Evaluated and added CHANGELOG entry if requiredDetermined and documented upgrade stepsDocumented user facing changes (create PR based on openshift/openshift-docs and merge into rhacs-docs)If any of these don't apply, please comment below.
Testing Performed