ROX-9128: Create certSecretsRepo type to store certificates in k8s secrets#457
ROX-9128: Create certSecretsRepo type to store certificates in k8s secrets#457
Conversation
|
Tag for build #210093 is 💻 For deploying this image using the dev scripts, run the following first: export MAIN_IMAGE_TAG='3.68.x-190-g0bc9854aaa'📦 You can also generate an installation bundle with: docker run -i --rm stackrox/main:3.68.x-190-g0bc9854aaa central generate interactive > bundle.zip🕹️ A |
porridge
left a comment
There was a problem hiding this comment.
The names and comments here are a bit abstract... I think I can see that this is a wrapper over a secrets client, but what value does it provide? Reading and writing collections of secrets? Perhaps mention that in the docstrings?
SimonBaeumer
left a comment
There was a problem hiding this comment.
Looks good but would like to keep it simpler. In this case removing the exponential backoff logic because it is already taken care of by the ticker.
Btw, great to see the effort you put into testing!
There was a problem hiding this comment.
Why the amount of service types a limit? I think the benefit is that the implementation is dynamic right?
There was a problem hiding this comment.
If we pass more secret names to newFixture than the size of serviceTypes then serviceTypes[i] will get out of bounds, this is a check to ensure we don't write tests wrong. We could change newFixture to replace secretNames ...string with a map[storage.ServiceType]string to prevent that, but newFixture is just a quick test utils function so I wanted to make it easy to call.
We cannot use serviceTypes[i %len(serviceTypes)] because then we would be overriding the value for a service type in the secretsMap map[storage.ServiceType]*v1.Secret that the repo is persisting
There was a problem hiding this comment.
I've finally removed the s.Require().LessOrEqual because newFixture now only accepts a single secret name, which is what the tests are using. See next comment for more context
There was a problem hiding this comment.
I would like to be more explicit here. It's a bit off that the first element of the given string array represents a specific service. Also your tests only ever use the first element.
There was a problem hiding this comment.
when I wrote newFixture I wasn't sure how many secrets we'd be adding to the fake client set. So I'll rewrite this so newFixture accepts a single secret name, and simplify the creation of these maps removing the look and the s.Require().LessOrEqual above. This should make the test easier to read
The value is basically in making the certificate refresher a bit simpler to understand by moving the responsibility of persisting the secrets to this type, by implementing the repository pattern. It's not a big win, but I think separating this is not a big effort either. I'll add a docstring to |
d08a7b9 to
e11ce62
Compare
porridge
left a comment
There was a problem hiding this comment.
Since we're trying to abstract away kubernetes by using the repository pattern, I wonder whether it would make sense to match this abstraction level in the function signatures, and accept/return the keys/values that are stored in a secret, rather than the whole Secret type?
There was a problem hiding this comment.
Why is this a problem? Perhaps the caller does not want to update all previously declared secrets this time?
What worries me more is that we're silently ignoring entries in secrets that were not declared at repo initialization time.
The motivation for having a separate |
|
FTR, we talked with Juan offline and he agreed to remove the use of |
I've changed this to a repository of
Please let me know what you think |
SimonBaeumer
left a comment
There was a problem hiding this comment.
Did not reviewed the test until now. S
I think this repo should be concrete for scanner atm. It can be extended following the next iterations.
My main argument is that the differences are in the data structures and logic how to persist the certificate rather than having a "real" generic solution here.
I fear introducing the abstractions here leads to a lot of unnecessary development with little value.
In contrast the ticker and synchronous messaging are patterns which could be repeated savely without breaking its abstractions.
(Whereas we decided against making sync messaging over gRPC streams because of the efforts)
sensor/kubernetes/localscanner/service_certificates_repository.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
To be honest I don't understand this, can you explain why this is necessary?
I would expect the secrets repo knows how to convert between v1.Secret <> storage.TypedServiceCertificateSet and communicating to the k8s API.
There was a problem hiding this comment.
This is to ensure the invariant "All secrets with non nil Data store the same CA PEM." A difference between storage.TypedServiceCertificateSet and v1.Secret is that the same CA is repeated in different secrets, while it appears only once in storage.TypedServiceCertificateSet.
But on second though that check is not really needed here because we just use secrets for the ObjectMeta but we ignore the Data in those secrets. So I'll remove it from here and turn it into an error in getServiceCertificates when different CAs are found for different secrets
There was a problem hiding this comment.
| func newServiceCertificatesRepoWithSecretsPersistence(secrets map[storage.ServiceType]*v1.Secret, | |
| func newServiceCertificatesRepo(secrets map[storage.ServiceType]*v1.Secret, |
There was a problem hiding this comment.
Also I think the repository can be initialized with all service types currently supported instead of passing them directly to it.
There was a problem hiding this comment.
The idea here is that the repository will use secrets as a template for storing secrets in k8s, and it will only modify the secret Data with the certificates received in storage.TypedServiceCertificateSet. That way the sensor component can handle the logic of setting up ObjectMeta with suitable OwnerReferences, and the repository doesn't care about that.
This is documented in the invariant "secrets and secretsClient are read-only, except the field Data of the entries in secrets" in the docstring for type serviceCertificatesRepoSecretsImpl
There was a problem hiding this comment.
Why checking here the secrets? The repo should be stateless and not know about the underlying data.
Imho it is up to the application layer to ensure that the CA cert are configured correctly.
There was a problem hiding this comment.
Same reason as I mentioned above: we are using the secrets as templates, be able to use the secret names for reading from k8s and to persist in k8s with the right ObjectMeta
sensor/kubernetes/localscanner/service_certificates_repository.go
Outdated
Show resolved
Hide resolved
sensor/kubernetes/localscanner/service_certificates_repository.go
Outdated
Show resolved
Hide resolved
sensor/kubernetes/localscanner/service_certificates_repository.go
Outdated
Show resolved
Hide resolved
sensor/kubernetes/localscanner/service_certificates_repository.go
Outdated
Show resolved
Hide resolved
sensor/kubernetes/localscanner/service_certificates_repository.go
Outdated
Show resolved
Hide resolved
sensor/kubernetes/localscanner/service_certificates_repository_test.go
Outdated
Show resolved
Hide resolved
I agree this is |
SimonBaeumer
left a comment
There was a problem hiding this comment.
I still think this repo should not support other services like it does it now. We can also discuss this offline if we don't agree.
There was a problem hiding this comment.
I lost track on the discussions, so commenting here.
Why should the serviceCertificateSecret be passed to the repo in the constructor?
To me it provides no value (expect DI) because we know upfront which storage.ServiceType are supported and which not.
Now every time the repo is used I need to construct the map[storage.ServiceType]serviceCertificateSecret and pass it to the repo? 🤔
I still think this should only be implemented for scanner certs for now and avoiding pre-optimizations.
The repo currently only wraps the k8s api and adds type conversion logic, for this the current implementation introduces complexity which is not necessary.
Let's keep it simple, if we need to support other services we still can add them.
There was a problem hiding this comment.
Why should the
serviceCertificateSecretbe passed to the repo in the constructor?
As discussed in #457 (comment) and #457 (comment) this repository is only concerned with modifying the secret data, not with creating secrets, or the secret name or ownership. By keeping that responsebility out of the repo we are making the repo simpler.
serviceCertificateSecret contains that information which is not created by the repository as a *v1.Secret, and also the paths for the certificates in the secret. All that information is constant for the life of the repository and that is why it is passed in the constructor. That way getServiceCertificates and putServiceCertificates have less parameters and are easier to use, and harder to use wrong.
There was a problem hiding this comment.
we know upfront which
storage.ServiceTypeare supported and which not
I understand that you mean we could instead pass two serviceCertificateSecret one for scanner and another for scannerDB, or we could use instead
type serviceCertificateSecrets struct {
scannerSecret serviceCertificateSecret
scannerDBSecret serviceCertificateSecret
}That is the same information in map[storage.ServiceType]serviceCertificateSecret, but with a type that is less flexible. I can change map[storage.ServiceType]serviceCertificateSecret to accept two serviceCertificateSecrets, but I see no value in changing the field secrets map[storage.ServiceType]serviceCertificateSecret in serviceCertificatesRepoSecretsImpl.
There was a problem hiding this comment.
expect DI
I don't know what "DI" means, can you please explain?
There was a problem hiding this comment.
DI = Dependency Injection
There was a problem hiding this comment.
I still think this is not necessary, instead a valid secret must be passed on every put to the repo.
If the ownerRef is necessary it should be injected instead.
There was a problem hiding this comment.
requiring passing additional parameters that would be constant for the life of the repository would make putServiceCertificates harder to use, which might lead to more errors
There was a problem hiding this comment.
| // we don't know where to persist this. | |
| // we don't know how to persist this. |
|
Some additional tests are missing, I'll complete the tests when we agree on the production code |
|
A problem with the current implementation of |
SimonBaeumer
left a comment
There was a problem hiding this comment.
Looks very good! Only smaller style suggestions after that ready to merge 💪
| sensorDeployment *appsApiv1.Deployment, secretName string) (*v1.Secret, error) { | ||
| secret, err := r.secretsClient.Get(ctx, secretName, metav1.GetOptions{}) | ||
|
|
||
| if k8sErrors.IsNotFound(err) { |
There was a problem hiding this comment.
Use secretClient.Create directly here, no need to check existence beforehand.
There was a problem hiding this comment.
this comment seems to be outdated, at 8456e48 createSecret uses Create directly
| // service type. | ||
| // - if the data for a secret is missing some expecting key then the corresponding field in the TypedServiceCertificate | ||
| // for that secret will contain a zero value. | ||
| func (r *serviceCertificatesRepoSecretsImpl) GetServiceCertificates(ctx context.Context) (*storage.TypedServiceCertificateSet, error) { |
There was a problem hiding this comment.
Return error here if secrets were not owned by the deployment. The GetServiceCertificates is then used to determine whether the ticker stops or not.
There was a problem hiding this comment.
this comment seems to be outdated, at 8456e48 getServiceCertificates returns ErrUnexpectedSecretsOwner if the owner is not as expected
| secretsClient corev1.SecretInterface | ||
| } | ||
|
|
||
| // serviceCertSecretSpec species the name of the secret where certificates for a service are stored, and |
There was a problem hiding this comment.
| // serviceCertSecretSpec species the name of the secret where certificates for a service are stored, and | |
| // serviceCertSecretSpec specifies the name of the secret where certificates for a service are stored, and |
| // - Otherwise, fails with ErrUnexpectedSecretsOwner in case the owner specified in the constructor is not the sole owner of all secrets. | ||
| // - Otherwise, fails with ErrMissingSecretData in case any secret has no data. | ||
| // - Otherwise, fails with ErrDifferentCAForDifferentServiceTypes in case the CA is not the same in all secrets. | ||
| // - Otherwise, fails with any k8s API error. |
There was a problem hiding this comment.
I think this doc line can be removed, at least to me it is clear that this happens in all cases where a connection is necessary.
| // TODO: if this works abstract as "highestPriorityError" func | ||
| if _, ok := errors[ErrUnexpectedSecretsOwner]; ok { | ||
| return nil, ErrUnexpectedSecretsOwner | ||
| } | ||
| if _, ok := errors[ErrMissingSecretData]; ok { | ||
| return nil, ErrMissingSecretData | ||
| } | ||
| if _, ok := errors[ErrDifferentCAForDifferentServiceTypes]; ok { | ||
| return nil, ErrDifferentCAForDifferentServiceTypes | ||
| } |
There was a problem hiding this comment.
Why not using multierror and returning all errors at the same time?
There was a problem hiding this comment.
I just investigated and I learned that errors.Is can be used to determine whether or not an error appears inside a multierror, so I'll follow your suggestion and use errors.Is in the client code
| "successful put": {nil, s.newFixture("")}, | ||
| "failed put": {errForced, s.newFixture("patch")}, | ||
| "cancelled put": {context.Canceled, s.newFixture("")}, |
| "successful get": {nil, s.newFixture("")}, | ||
| "failed get": {errForced, s.newFixture("get")}, | ||
| "cancelled get": {context.Canceled, s.newFixture("")}, |
| fixture *certSecretsRepoFixture | ||
| }{ | ||
| "successful put": {nil, s.newFixture("")}, | ||
| "failed put": {errForced, s.newFixture("patch")}, |
There was a problem hiding this comment.
Why should it fail here? Can you described that in the test description?
There was a problem hiding this comment.
These errors are forced in the k8s API mock, by passing "patch" to the verbToError parameter of newFixture, which uses clientSet.CoreV1().(*fakecorev1.FakeCoreV1).PrependReactor to force the error for the verb. I've renamed the test name here to "failed put due to k8s API error", and also in TestGet that was doing something similar
| s.Error(repo.createSecrets(ctx, certificates.Clone())) | ||
| } | ||
|
|
||
| func (s *serviceCertificatesRepoSecretsImplSuite) checkExpectedError(expectedErr, err error) { |
There was a problem hiding this comment.
Why did you created this function?
If errors are checked I expected either checking the type with s.ErrorIs or if not available checking the error msg with s.Contains(err.Error(), "expected string").
There was a problem hiding this comment.
replaced with s.ErrorIs, nice to know about this
| return s.newFixtureAdvancedOpts(verbToError, false) | ||
| } | ||
|
|
||
| func (s *serviceCertificatesRepoSecretsImplSuite) newFixtureAdvancedOpts(verbToError string, emptySecretData bool, |
There was a problem hiding this comment.
What I don't like about that signature is that verbToError magically gets empty strings passed which disturbes the reading flow of a test.
I need to first look at newFixtureAdvancedOpts and understand what's going on under the hood.
Can you use a struct as the input parameter which is more expressive instead of multiple parameters?
This makes it more obvious because the passed parameters become named automatically (or using constants for verbToError and emptySecretData).
I know a lot of our unit tests don't do that - and that's not good because it gets frustrating debugging these.
There was a problem hiding this comment.
verbToError is the HTTP verb of the k8s API for which we force an error in the fake k8s client set, so by passing "" we specify that all verbs that we'll use will work fine.
I created a certSecretsRepoFixtureSpec struct as suggested, and unified newFixtureAdvancedOpts and newFixture, into just newFixture that takes that struct. I've also added comments to both newFixture and certSecretsRepoFixtureSpec to clarify how this behaves
I think we may want to combine |
also fix some comments
I have replaced put and create with ensure, please let me know what you think. Some additional tests are required, but I'm leaving that for when we reach and agreement for the production code |
| func (r *serviceCertificatesRepoSecretsImpl) ensureServiceCertificate(ctx context.Context, caPem []byte, | ||
| cert *storage.TypedServiceCertificate, secretSpec serviceCertSecretSpec) error { | ||
| patchErr := r.patchServiceCertificate(ctx, caPem, cert, secretSpec) | ||
| if k8sErrors.IsNotFound(patchErr) { |
There was a problem hiding this comment.
I like the simplicity and readability of this function!
| } | ||
| } | ||
|
|
||
| type certSecretsRepoFixtureSpec struct { |
There was a problem hiding this comment.
Spec is misleading because typically Spec is reserved for k8s objects (i.e. PodSpec). For this Config or Options is better suited.
| type certSecretsRepoFixtureSpec struct { | |
| type certSecretsRepoFixtureConfig struct { |
| emptySecretData bool | ||
| // These keys will be removed from the data keys of the secret used to initialize the fake k8s client set. | ||
| missingSecretDataKeys []string | ||
| } |
There was a problem hiding this comment.
Thanks for adding docs!
|
completed tests |
Checklist
CHANGELOG entry and upgrade steps not required.
Testing Performed
Added unit test. New test are passing with
go test -count=100 -race -v -p=1 github.com/stackrox/rox/sensor/kubernetes/localscanner -run TestServiceCertificatesRepoSecretsImpl