ROX-9127: Create CertificateRequester to request certificates from central#463
ROX-9127: Create CertificateRequester to request certificates from central#463
Conversation
| } | ||
|
|
||
| func (i *certRequestSyncImpl) requestCertificates(ctx context.Context) (*central.IssueLocalScannerCertsResponse, error) { | ||
| msg := ¢ral.MsgFromSensor{ |
There was a problem hiding this comment.
I think one of the reasons why I found this a bit hard to review is that responsibilities aren't clear. The certRequestSyncImpl directly injects a message into the sensor stream, but the certificateRequesterImpl is responsible for handling the response to that and dispatching it.
Imho, the direct communication with the sensor stream should be exclusively reserved to the certificateRequesterImpl. I don't think a certRequestSyncImpl ever needs to know its ID, and I don't think the ID - after being written into the outgoing message - needs to be persisted anywhere other than in the map.
So, what I'd basically suggest is to send the outgoing message in (*certificateRequesterImpl).RequestCertificates, and only maintain the msgToSensorC in the certRequestSyncImpl struct (which I would suggest should be renamed to responseC).
There was a problem hiding this comment.
That would leave certRequestSyncImpl with just one field and just one method that is called only once in its lifetime. So I'll get rid of certRequestSyncImpl and replace it with a receiveResponse method in certificateRequesterImpl. That way the code will be a bit simpler
|
Tag for build #167065 is 💻 For deploying this image using the dev scripts, run the following first: export MAIN_IMAGE_TAG='3.68.x-104-g8a059f374b'📦 You can also generate an installation bundle with: docker run -i --rm stackrox/main:3.68.x-104-g8a059f374b central generate interactive > bundle.zip🕹️ A |
There was a problem hiding this comment.
Is it intentional to not be a receiver func of certificateRequesterImpl?
There was a problem hiding this comment.
yes, to stress we are not using any of the state, receive just uses information that is local to a particular request (the context and the receive channel)
There was a problem hiding this comment.
Should this perhaps assert that Start() has already been called?
Also I think an exported method requires a docstring.
Speaking of docstrings, it would be great to explain why we have this worker goroutine, rather than do the central response handling directly in RequestCertificates. I'm assuming this is because we want to support overlapping requests/response timelines?
There was a problem hiding this comment.
It's weird that the style checks have passed without such docstrings, I'll add that to Start, Stop and RequestCertificates, including that precondition in RequestCertificates.
We need to launch dispatchResponses to support concurrent requests, as we have a single channel to get the responses, which is the interface we have in SensorComponent. We have a unit test TestRequestConcurrentRequestDoNotInterfere for that.
For now we don't plan to have concurrent request for local scanner, because the retry ticker only runs one instance of the tick function at a time, but during this PR we came to the conclusion that supporting concurrent requests would make this type easier to use
There was a problem hiding this comment.
So, rather than just checking if Start() has been called via a simple if condition, I would propose to zero-initialize the stopC error signal. Then, you call r.stopC.Reset() at the beginning of Start(), and select on <-r.stopC.Done() in the send method, returning r.stopC.ErrorOrDefault(errors.New("not started")) if that branch is taken.
Re docstrings: Docstrings are actually only reported for functions that are explicitly exported. I.e., func Foo() requires a docstring, as does func (t *ExportedType) Foo(), but func (t *unexportedType) Foo() does not, because this concrete function (*unexportedType).Foo is not exported (OTOH, you can reference ExportedType.Foo even in the absence of a receiver object - it will be of type func(*ExportedType)). The function may be exported as an interface method, but the way Golang interfaces work, it generally isn't possible to rule that such an interface exists, as it might even be declared in a different package. Also, the generated go docs would never show this function anywhere (arguably, the linter should require a docstring on each method of an exported interface, but it doesn't do that).
There was a problem hiding this comment.
Thanks for the docstrings explanation @misberner , TIL.
porridge
left a comment
There was a problem hiding this comment.
Sorry, forgot to click the approve radio button in my previous review.
8ce3155 to
07f48cf
Compare
|
I just pushed a rebase with no change to these commits, to fix the issues with CI |
sometimes `dispatchResponses` was starting before `r.Stop` was reset, thus stopping dispatching immediatelly
to avoid interference between tests that causes DATA RACE errors
porridge
left a comment
There was a problem hiding this comment.
Mostly nitpicks, but I have a request for another test case that I think is pretty important.
| waitGroup := concurrency.NewWaitGroup(numConcurrentRequests) | ||
|
|
||
| for i := 0; i < numConcurrentRequests; i++ { | ||
| go f.respondRequest(t, nil) |
There was a problem hiding this comment.
I think this test is useful, though I would bump numConcurrentRequests to something more like 10.
However what we don't seem to be testing explicitly is the case where the simulated central reads multiple requests first, and then responds to them in a different order.
AFAICT, this test is likely to always respond to a request with a matching response before the next request appears. This means we could make the requester serialized by mistake, and not realize this until something would get stuck in production.
There was a problem hiding this comment.
I increased numConcurrentRequests to 10.
I added a jitter time.Duration parameter to respondRequest to randomly delay responses to simulate that, please let me know if that works:
if jitter > 0 {
time.Sleep(time.Duration(rand.Int63n(jitter.Milliseconds())) * time.Millisecond)
}
select {
case <-f.ctx.Done():
case f.receiveC <- response:
}There was a problem hiding this comment.
I also added another test case where responses are delayed with a time increasingly smaller, to get responses out of order in a deterministic way
There was a problem hiding this comment.
I think that will work, although I must say that the test has become somewhat elaborate now. Personally I'd prefer a separate, simpler to follow Test... function, but up to you.
There was a problem hiding this comment.
I changed the test case struct type to try to make this easier to read:
testCases := map[string]struct {
responseDelayFunc func(requestIndex int) (responseDelay time.Duration)
}{
"decreasing response delay": {func(requestIndex int) (responseDelay time.Duration) {
// responses are responded increasingly faster, so always out of order.
return time.Duration(numConcurrentRequests-(requestIndex+1)) * 10 * time.Millisecond
}},
"random response delay": {func(requestIndex int) (responseDelay time.Duration) {
// randomly out of order responses.
return time.Duration(rand.Intn(100)) * time.Millisecond
}},
}…y timeouts as needed
- check 10 concurrent requests - add jitter to simulate out of order responses
porridge
left a comment
There was a problem hiding this comment.
LGTM, whatever addition to the docstring you end up with should be fine.
to make the test easier to read
Description
The CertificateRequester to request certificates from central using two channels that can be provided by a SensorComponent, see example usage in draft PR #400
Checklist
No CHANGELOG entry or upgrade steps required.
Testing Performed
Added unit tests, tests are passing with
go test -count=500 -race -v -p=1 github.com/stackrox/rox/sensor/kubernetes/localscanner -run TestCertificateRequester