Skip to content

ROX-13661: Refactor Stopper and use it more#3931

Merged
msugakov merged 20 commits intomasterfrom
misha/stoppable-v2
Jan 2, 2023
Merged

ROX-13661: Refactor Stopper and use it more#3931
msugakov merged 20 commits intomasterfrom
misha/stoppable-v2

Conversation

@msugakov
Copy link
Contributor

@msugakov msugakov commented Nov 28, 2022

Description

Refactor existing Stopper to be used in all places where we previously had stopC and stoppedC.
Induced by #3800

Note that I am not trying to fix things in this PR, just reduce the amount of different patterns. Some things look very questionable but I refrain from fixing them in the absence of unit tests.

It is best to review this PR by going through commits. This way the size isn't overwhelming.

Checklist

  • Investigated and inspected CI test results
  • Unit test and regression tests added

No need in the following:

  • Evaluated and added CHANGELOG entry if required
  • Determined and documented upgrade steps
  • Documented user facing changes (create PR based on openshift/openshift-docs and merge into rhacs-docs)

Testing Performed

  • Relying on CI.

@ghost
Copy link

ghost commented Nov 28, 2022

Images are ready for the commit at 46adb80.

To use with deploy scripts, first export MAIN_IMAGE_TAG=3.73.x-331-g46adb80f44.

// StoppableFlow represents an interface for goroutine to check and report status of a graceful shutdown.
type StoppableFlow interface {
// StopWithError initiates a shutdown due to an error that happened during async processing in goroutine.
StopWithError(error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't StoppableClient.Stop and StoppableFlow.StopWithError be closer together in the same object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intentional.

.Client().Stop() is called externally (outside of goroutine), and I haven't found any cases when non-nil error was provided as parameter. That's why .Client().Stop() doesn't take error parameters.

OTOH, I found goroutines self-stop due to error (e.g. a channel closed). That's why I added .Flow().StopWithError(error) for internal interface of goroutines (that's what I call StoppableFlow).

@msugakov msugakov requested a review from a team November 29, 2022 09:19
@msugakov msugakov changed the title (WIP) Refactor existing Stopper to Stoppable ROX-13661: (WIP) Refactor existing Stopper to Stoppable Nov 29, 2022
@msugakov msugakov changed the title ROX-13661: (WIP) Refactor existing Stopper to Stoppable ROX-13661: (WIP) Refactor Stopper and use it more Nov 29, 2022
Copy link
Contributor

@vikin91 vikin91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far (as still WIP). I am curious to see it in the NodeScanHandler (or NodeInventoryHandler after landing #3755)

func (c *commandHandlerImpl) Stop(err error) {
c.stopC.SignalWithError(err)
func (c *commandHandlerImpl) Stop(_ error) {
c.stopper.Client().Stop()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we are loosing information about the error passed here. As c.stopper.Flow().StopWithError(err) does not fit here, then maybe we need a replacement for sending a command to stop with error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we are not loosing information because the error is always nil.

It is exactly the same topic as above - #3931 (comment)

I recommend you to check with IDE Find Usages what calls this Stop() method and whether there's any single place that gets non-nil error. I found zero.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so the error was a "plan for later" that we never used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. Something of this kind, or API was not separated for internal/external (from goroutine point of view) use cases and external functions were just a copy of internal needs. I guess we'd never know unless we go in archeology.

@msugakov msugakov force-pushed the misha/stoppable-v2 branch 2 times, most recently from 32651d4 to dcb444a Compare November 29, 2022 18:29
}

func (s *NodeScanHandlerTestSuite) TestHandlerStoppedError() {
func (s *NodeScanHandlerTestSuite) TestHandlerStopIgnoresError() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think TestHandlerStopIgnoresError test became redundant and can be removed. Would it be ok with you @vikin91 ?

@msugakov msugakov force-pushed the misha/stoppable-v2 branch 3 times, most recently from 2f3dd15 to 3105f6a Compare November 30, 2022 22:57
Comment on lines -43 to -45
if s.stopC.IsDone() {
return false, errors.New("Cannot start reader because stopC is already done. Reader might have already been stopped")
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was redundant because nothing in the current code tries to stop the reader before it's started.

return s.stopC.Signal()
func (s *auditLogReaderImpl) StopReader() {
s.stopper.Client().Stop()
_ = s.stopper.Client().Stopped().Wait()
Copy link
Contributor Author

@msugakov msugakov Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absence of this .Wait() line lead to the following error appear from time to time in logs when running tests.

collection/auditlog: 2022/11/30 12:19:56.751189 auditlog_impl.go:158: Error: Audit log tailer stopped with an error no such file or directory

Looks like I fixed a bug.

// Returns true if the reader can be started (log exists and can be read). Log file missing is not considered an error.
StartReader(ctx context.Context) (bool, error)
// StopReader will stop the reader if it's started. Will return false if it was already stopped.
StopReader() bool
Copy link
Contributor Author

@msugakov msugakov Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was nothing that looks at the returned value of StopReader().

@msugakov msugakov requested a review from a team as a code owner December 1, 2022 00:06
@msugakov msugakov force-pushed the misha/stoppable-v2 branch 2 times, most recently from 6b46a72 to 531b081 Compare December 1, 2022 00:32
Copy link
Contributor

@fredrb fredrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good. Some general comments and thoughts. No blockers.

The distinction between Client and Flow is more clear now on the second pass.

However, I have one last optional suggestion: I wonder if there's a way to enforce Client/Flow distinction (at least as a code pattern) in a way that avoid misuse of the Stopper interface inside the goroutine? For other dev that wants to build a new sensor component the distinction might not be so obvious.

We don't want the code inside the component's goroutine to call stopper.Client() functions, only stopper.Flow(), right? So maybe the goroutine should be anonymous and receive only what it should operate on?

Something like:

func (c *MyComponent) Start() {
    // do pre-work
    go func(flow concurrency.Flow, dependency T) {
           // Component main loop
    }(c.stopper.Flow(), someOtherDependency)
}

Maybe out of the scope for this PR. WDYT?

type SensorComponent interface {
Start() error
Stop(err error)
Stop(err error) // TODO: get rid of err argument as it always seems to be effectively nil.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@msugakov msugakov requested a review from fredrb December 20, 2022 18:09
@msugakov
Copy link
Contributor Author

@fredrb Sorry, I missed your suggestion previously.

I wonder if there's a way to enforce Client/Flow distinction (at least as a code pattern)

That's the ideal state I wanted to get it to but unfortunately couldn't find a way. Mainly the reason is that we use Stopper in a struct as a field and so every struct's function has access to everything in the Stopper. Even in the snippet you provide

func (c *MyComponent) Start() {
    // do pre-work
    go func(flow concurrency.Flow, dependency T) {
           // Component main loop
    }(c.stopper.Flow(), someOtherDependency)
}

there is c.stopper and goroutine can use that instead of flow. If some things can be done differently, humans will do them differently. The ideal state would be if there's only a single way and it hides things that should not be seen, but, as I mentioned, I did not find it.

We don't want the code inside the component's goroutine to call stopper.Client() functions, only stopper.Flow(), right?

For me it is actually the opposite. We don't want anything outside of goroutine to use stopper.Flow(). Not having stopper.Client() in goroutine would be nice, but there are some cases that violate that, e.g. sensor/common/sensor/central_receiver_impl.go.

@msugakov
Copy link
Contributor Author

msugakov commented Jan 2, 2023

/retest

@msugakov msugakov force-pushed the misha/stoppable-v2 branch from 2e17e81 to 46adb80 Compare January 2, 2023 12:53
@msugakov msugakov merged commit b976089 into master Jan 2, 2023
@msugakov msugakov deleted the misha/stoppable-v2 branch January 2, 2023 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants