Skip to content

ROX-9127: Create CertificateRequester to request certificates from central#463

Merged
juanrh merged 29 commits intomasterfrom
juanrh/ROX-9127
Feb 2, 2022
Merged

ROX-9127: Create CertificateRequester to request certificates from central#463
juanrh merged 29 commits intomasterfrom
juanrh/ROX-9127

Conversation

@juanrh
Copy link
Contributor

@juanrh juanrh commented Jan 27, 2022

Description

The CertificateRequester to request certificates from central using two channels that can be provided by a SensorComponent, see example usage in draft PR #400

Checklist

  • Investigated and inspected CI test results
  • Unit test and regression tests added
  • Evaluated and added CHANGELOG entry if required
  • Determined and documented upgrade steps

No CHANGELOG entry or upgrade steps required.

Testing Performed

Added unit tests, tests are passing with go test -count=500 -race -v -p=1 github.com/stackrox/rox/sensor/kubernetes/localscanner -run TestCertificateRequester

}

func (i *certRequestSyncImpl) requestCertificates(ctx context.Context) (*central.IssueLocalScannerCertsResponse, error) {
msg := &central.MsgFromSensor{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one of the reasons why I found this a bit hard to review is that responsibilities aren't clear. The certRequestSyncImpl directly injects a message into the sensor stream, but the certificateRequesterImpl is responsible for handling the response to that and dispatching it.

Imho, the direct communication with the sensor stream should be exclusively reserved to the certificateRequesterImpl. I don't think a certRequestSyncImpl ever needs to know its ID, and I don't think the ID - after being written into the outgoing message - needs to be persisted anywhere other than in the map.

So, what I'd basically suggest is to send the outgoing message in (*certificateRequesterImpl).RequestCertificates, and only maintain the msgToSensorC in the certRequestSyncImpl struct (which I would suggest should be renamed to responseC).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would leave certRequestSyncImpl with just one field and just one method that is called only once in its lifetime. So I'll get rid of certRequestSyncImpl and replace it with a receiveResponse method in certificateRequesterImpl. That way the code will be a bit simpler

@juanrh juanrh requested a review from misberner January 28, 2022 15:37
@ghost
Copy link

ghost commented Jan 28, 2022

Tag for build #167065 is 3.68.x-104-g8a059f374b.

💻 For deploying this image using the dev scripts, run the following first:

export MAIN_IMAGE_TAG='3.68.x-104-g8a059f374b'

📦 You can also generate an installation bundle with:

docker run -i --rm stackrox/main:3.68.x-104-g8a059f374b central generate interactive > bundle.zip

🕹️ A roxctl binary artifact can be downloaded from CircleCI.

Copy link
Contributor

@misberner misberner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Contributor

@SimonBaeumer SimonBaeumer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job Juán!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional to not be a receiver func of certificateRequesterImpl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, to stress we are not using any of the state, receive just uses information that is local to a particular request (the context and the receive channel)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this perhaps assert that Start() has already been called?
Also I think an exported method requires a docstring.

Speaking of docstrings, it would be great to explain why we have this worker goroutine, rather than do the central response handling directly in RequestCertificates. I'm assuming this is because we want to support overlapping requests/response timelines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird that the style checks have passed without such docstrings, I'll add that to Start, Stop and RequestCertificates, including that precondition in RequestCertificates.

We need to launch dispatchResponses to support concurrent requests, as we have a single channel to get the responses, which is the interface we have in SensorComponent. We have a unit test TestRequestConcurrentRequestDoNotInterfere for that.
For now we don't plan to have concurrent request for local scanner, because the retry ticker only runs one instance of the tick function at a time, but during this PR we came to the conclusion that supporting concurrent requests would make this type easier to use

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, rather than just checking if Start() has been called via a simple if condition, I would propose to zero-initialize the stopC error signal. Then, you call r.stopC.Reset() at the beginning of Start(), and select on <-r.stopC.Done() in the send method, returning r.stopC.ErrorOrDefault(errors.New("not started")) if that branch is taken.

Re docstrings: Docstrings are actually only reported for functions that are explicitly exported. I.e., func Foo() requires a docstring, as does func (t *ExportedType) Foo(), but func (t *unexportedType) Foo() does not, because this concrete function (*unexportedType).Foo is not exported (OTOH, you can reference ExportedType.Foo even in the absence of a receiver object - it will be of type func(*ExportedType)). The function may be exported as an interface method, but the way Golang interfaces work, it generally isn't possible to rule that such an interface exists, as it might even be declared in a different package. Also, the generated go docs would never show this function anywhere (arguably, the linter should require a docstring on each method of an exported interface, but it doesn't do that).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the docstrings explanation @misberner , TIL.

Copy link
Contributor

@porridge porridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot to click the approve radio button in my previous review.

@juanrh
Copy link
Contributor Author

juanrh commented Jan 31, 2022

I just pushed a rebase with no change to these commits, to fix the issues with CI

Juan Rodriguez Hortala added 4 commits January 31, 2022 16:19
sometimes `dispatchResponses` was starting before `r.Stop` was
reset, thus stopping dispatching immediatelly
to avoid interference between tests that causes DATA RACE errors
@juanrh juanrh requested a review from porridge February 1, 2022 09:49
Copy link
Contributor

@porridge porridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nitpicks, but I have a request for another test case that I think is pretty important.

waitGroup := concurrency.NewWaitGroup(numConcurrentRequests)

for i := 0; i < numConcurrentRequests; i++ {
go f.respondRequest(t, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test is useful, though I would bump numConcurrentRequests to something more like 10.

However what we don't seem to be testing explicitly is the case where the simulated central reads multiple requests first, and then responds to them in a different order.

AFAICT, this test is likely to always respond to a request with a matching response before the next request appears. This means we could make the requester serialized by mistake, and not realize this until something would get stuck in production.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I increased numConcurrentRequests to 10.
I added a jitter time.Duration parameter to respondRequest to randomly delay responses to simulate that, please let me know if that works:

if jitter > 0 {
    time.Sleep(time.Duration(rand.Int63n(jitter.Milliseconds())) * time.Millisecond)
}
select {
case <-f.ctx.Done():
case f.receiveC <- response:
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added another test case where responses are delayed with a time increasingly smaller, to get responses out of order in a deterministic way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that will work, although I must say that the test has become somewhat elaborate now. Personally I'd prefer a separate, simpler to follow Test... function, but up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the test case struct type to try to make this easier to read:

testCases := map[string]struct {
		responseDelayFunc func(requestIndex int) (responseDelay time.Duration)
	}{
		"decreasing response delay": {func(requestIndex int) (responseDelay time.Duration) {
			// responses are responded increasingly faster, so always out of order.
			return time.Duration(numConcurrentRequests-(requestIndex+1)) * 10 * time.Millisecond
		}},
		"random response delay":     {func(requestIndex int) (responseDelay time.Duration) {
			// randomly out of order responses.
			return time.Duration(rand.Intn(100)) * time.Millisecond
		}},
	}

@juanrh juanrh requested a review from porridge February 1, 2022 18:18
Copy link
Contributor

@porridge porridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, whatever addition to the docstring you end up with should be fine.

@juanrh juanrh merged commit 0457053 into master Feb 2, 2022
@juanrh juanrh deleted the juanrh/ROX-9127 branch February 2, 2022 12:16
RTann pushed a commit that referenced this pull request Apr 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants