ROX-28479: Add external IPs to fake workload generation #14574
JoukoVirtanen merged 13 commits into master
Skipping CI for Draft Pull Request. |
|
Images are ready for the commit at 68301c6. To use with deploy scripts, first |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #14574      +/-   ##
==========================================
- Coverage   49.26%   48.97%   -0.30%
==========================================
  Files        2528     2543      +15
  Lines      184979   186443    +1464
==========================================
+ Hits        91130    91305     +175
- Misses      86621    87909    +1288
- Partials     7228     7229       +1
Force-pushed from 3137adc to e406412.
Stringy left a comment
Looks good from my perspective, just a couple of minor comments/suggested simplifications. Will defer approval to the Sensor folks, who have more context on the implications of this than I do :)
…Ps in pool. Improving comments
parametalol left a comment
Please refactor a little.
Nits:
Not your change, but IPs could be kept as integers until they need to be formatted.
Also, pools might not be needed at all if sequential IPs are used instead of random ones.
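To make the two nits above concrete, here is a minimal Go sketch of the idea, not the code in this PR. It assumes IPv4 only; the names `sequentialIPs`, `newSequentialIPs`, and `formatIPv4` are hypothetical and do not come from the stackrox codebase. Addresses stay as `uint32` values and are formatted only on demand, and because each call returns the next address, no pool is needed to avoid duplicates.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// sequentialIPs (hypothetical name) hands out IPv4 addresses in increasing
// order, keeping each address as a uint32 until it has to be formatted.
type sequentialIPs struct {
	next uint32
}

// newSequentialIPs starts the sequence at the given dotted-quad address.
func newSequentialIPs(start string) (*sequentialIPs, error) {
	addr, err := netip.ParseAddr(start)
	if err != nil {
		return nil, err
	}
	if !addr.Is4() {
		return nil, fmt.Errorf("%q is not an IPv4 address", start)
	}
	b := addr.As4()
	return &sequentialIPs{next: binary.BigEndian.Uint32(b[:])}, nil
}

// Next returns the current address as an integer and advances the sequence.
// The counter wraps around after 255.255.255.255.
func (s *sequentialIPs) Next() uint32 {
	ip := s.next
	s.next++
	return ip
}

// formatIPv4 converts the integer form to dotted-quad text.
func formatIPv4(ip uint32) string {
	var b [4]byte
	binary.BigEndian.PutUint32(b[:], ip)
	return netip.AddrFrom4(b).String()
}

func main() {
	gen, err := newSequentialIPs("203.0.113.254")
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		fmt.Println(formatIPv4(gen.Next()))
	}
}
```

The trade-off is that sequential addresses are unique by construction but less varied than random ones, which may or may not matter for the scale tests.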
Co-authored-by: Michaël <smartptr@gmail.com> Co-authored-by: Piotr Rygielski <114479+vikin91@users.noreply.github.com>
…s should not have external IP
parametalol left a comment
I see room for refactoring, but that's fine.
vikin91 left a comment
Thank you for addressing my review comments, this looks good!
Taking this opportunity, I would like to highlight that this change may amplify the number of active connections being tracked in Sensor and Central (see https://issues.redhat.com/browse/ROX-28242, https://issues.redhat.com/browse/ROX-28259, and https://issues.redhat.com/browse/ROX-28315). The "bad" behavior is already there and mainly concerns the long-running cluster with fake workloads, so this is not a blocker for this PR. However, I would keep the next release managers informed that such things could be observed (Sensor OOMing, maybe also Central). We have not seen that yet in production code (non-fake workload), but it could theoretically also happen over longer time spans.
Luis and I are trying to prevent this from happening in #14538 and #14483, but the proper fix would be for the Collector to close the connections. We do not have a collector in fake workloads, so adding that to the workload generation would be beneficial. If you have an appetite for implementing it (separately from this PR), I would be happy to review.
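As a rough illustration of the suggestion above (closing fake connections so they do not accumulate), here is a minimal Go sketch. It is an assumption-laden toy, not Sensor or Collector code: the `conn` and `connTracker` types, the tick-based clock, and the fixed lifetime are all hypothetical.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a tracked fake network connection.
type conn struct {
	src, dst string
}

// connTracker (hypothetical) remembers when each open fake connection
// should be reported as closed.
type connTracker struct {
	lifetime int          // number of ticks a connection stays open
	open     map[conn]int // connection -> tick at which it expires
}

func newConnTracker(lifetime int) *connTracker {
	return &connTracker{lifetime: lifetime, open: make(map[conn]int)}
}

// Open records a connection opened at tick now.
func (t *connTracker) Open(c conn, now int) {
	t.open[c] = now + t.lifetime
}

// Expire removes and returns every connection whose lifetime has elapsed
// at tick now. The caller would emit a close event for each of them.
func (t *connTracker) Expire(now int) []conn {
	var closed []conn
	for c, deadline := range t.open {
		if deadline <= now {
			closed = append(closed, c)
			delete(t.open, c) // deleting during range is safe in Go
		}
	}
	return closed
}

func main() {
	tracker := newConnTracker(2)
	tracker.Open(conn{src: "10.0.0.1", dst: "203.0.113.7"}, 0)
	tracker.Open(conn{src: "10.0.0.2", dst: "203.0.113.8"}, 1)
	for tick := 2; tick <= 3; tick++ {
		closed := tracker.Expire(tick)
		fmt.Printf("tick %d: closed %d, still open %d\n", tick, len(closed), len(tracker.open))
	}
}
```

With something along these lines, the number of open connections is bounded by the open rate times the lifetime instead of growing for as long as the cluster runs.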
The long running cluster is actually two clusters, one of which has collector and workload generated by kube-burner creating berserker pods.
Yes, I am aware of this. I mentioned:
and the hint about closing connections for this cluster is still valid.
Description
Adds external IPs to fake workload generation. We need to be able to test the external IPs feature using the existing scale tests, in which Sensor creates fake data that it itself consumes. This will enable us to find bugs that might cause crashes, scalability issues, and leaks. This will be part of the long running cluster. It will also be run with K6 load testing, though the API endpoints relevant to external IPs are currently not called by K6 load testing. That change can be made later.
User-facing documentation
Testing and quality
Automated testing
- [ ] added e2e tests
- [ ] added regression tests
- [ ] added compatibility tests
- [ ] modified existing tests

No code that runs in production is modified, so no testing is needed. The fake workload generation currently does not have any testing.
How I validated my change
Scale testing was run using the following commands from the root of the stackrox/stackrox repository.
The following script was run to get metadata on the external entities.
Part of the output is shown below.
The script was run a few times and the value of totalEntities was seen to fluctuate.
Long running cluster
A long running cluster was also created, which essentially does the same thing as above, except that it is triggered through a GitHub Action and runs for much longer.
The following was done to run the GitHub Action.
The commit used for this tag was 8436aaa. The tag was later deleted.
The images were then built by CI.
Went to https://github.com/stackrox/test-gh-actions/actions/workflows/create-clusters.yml
Clicked on “Run workflow”. Set the image tag in “Version of the images” to 0.0.1. Selected “Create a long-running cluster on RC1” and clicked on the green button that says “Run workflow”.
Set the ROX_EXTERNAL_IPS environment variable by editing the central deployment manually.
Obtained the password and API token.
Ran the script from the first testing section and saw the following
Here are some screenshots from the long running cluster.