ROX-12967: Replace fake NodeScan with real NodeInventory#3722
ROX-12967: Replace fake NodeScan with real NodeInventory#3722Maddosaurus merged 14 commits intomasterfrom
Conversation
166591b to
0debacd
Compare
|
Images are ready for the commit at 17897d7. To use with deploy scripts, first |
7a0ae57 to
d1669fd
Compare
|
Local result on Colima: |
interesting. why are we trying to read |
17ccb31 to
d9d8759
Compare
17ccb31 to
9831cb4
Compare
|
@vikin91 is the review from the Merlin team expected? |
@ivan-degtiarenko sorry for the confusion, this PR is not yet ready for review. I marked it as draft to make it clear. |
9831cb4 to
9a5c795
Compare
|
@RTann @vikin91 I was also able to reproduce the docker layer message in a slightly different wording on an OS4 cluster as well: |
980bafe to
cbcdfa6
Compare
| microdnf clean all && \ | ||
| rm /tmp/snappy.rpm /tmp/postgres.rpm /tmp/postgres-libs.rpm RPM-GPG-KEY-CentOS-Official && \ | ||
| # (Optional) Remove line below to keep package management utilities | ||
| rpm -e --nodeps $(rpm -qa curl '*rpm*' '*dnf*' '*libsolv*' '*hawkey*' 'yum*') && \ |
There was a problem hiding this comment.
Why are we removing this here? (I am not sure what this file does, that is why I am asking)
There was a problem hiding this comment.
This is the spot where we remove all package managers from the main image.
I have only removed *RPM* from this line, as we want to keep RPM in the images for the rpmdb command Compliance needs.
IIUC, this file is the template from which the Dockerfile for main is generated.
There was a problem hiding this comment.
Oh, now I recall. We remove it from the removal code, so in fact we are keeping the package. All good
There was a problem hiding this comment.
Due to a lack of a follow-up // TODO(ROX-12345): comment, I have a question: where do we stand in this decision? Related Slack thread looks unfinished to me https://srox.slack.com/archives/C04213W844U/p1670426415333499
There was a problem hiding this comment.
Customers may ask about this (someone asked why Scanner still has it). Perhaps we should consider updating the docs or CHANGELOG? Definitely need to also make sure our default policy which allows Scanner to have RPM also allows our other images have it.
Perhaps a followup ticket, find a way to do this without rpm (eh. I wanted to remove it in Scanner, but it's the only official way to get rpmdb data), or perhaps use a separate Dockerfile for Compliance, so we minimize the number images with rpm installed
There was a problem hiding this comment.
I have created a follow up ticket (ROX-13934) so we don't forget about it and will push this as a TODO comment.
| microdnf clean all && \ | ||
| rm /tmp/snappy.rpm /tmp/postgres.rpm /tmp/postgres-libs.rpm RPM-GPG-KEY-CentOS-Official && \ | ||
| # (Optional) Remove line below to keep package management utilities | ||
| rpm -e --nodeps $(rpm -qa curl '*rpm*' '*dnf*' '*libsolv*' '*hawkey*' 'yum*') && \ |
There was a problem hiding this comment.
Customers may ask about this (someone asked why Scanner still has it). Perhaps we should consider updating the docs or CHANGELOG? Definitely need to also make sure our default policy which allows Scanner to have RPM also allows our other images have it.
Perhaps a followup ticket, find a way to do this without rpm (eh. I wanted to remove it in Scanner, but it's the only official way to get rpmdb data), or perhaps use a separate Dockerfile for Compliance, so we minimize the number images with rpm installed
compliance/collection/main.go
Outdated
| var scanner nodescanv2.NodeScanner | ||
| if features.UseFakeNodeInventory.Enabled() { | ||
| log.Infof("Using FakeNodeScanner") | ||
| scanner = &nodescanv2.FakeNodeScanner{} |
There was a problem hiding this comment.
not a big deal, but something like scanner = nodescanv2.NewFakeNodeScanner() looks cleaner (similar for below). Actually something like nodescanner.NewFake() and nodescanner.New() where this is defined in something like pkg/nodescanner/v2 looks even cleaner IMO, but maybe that's for another time
There was a problem hiding this comment.
Customers may ask about this
I have introduced a changelog entry in cb77bec informing of this change, and also created a follow up ticket ROX-13934 to track this TODO.
looks even cleaner IMO
I concur! The FakeNodeScanner was introduced as a temporary measure to unblock parallel workstreams on the RHCOS topic.
We aim to remove it (and the according Interface) with the MVP release. I really like your suggestion here, and am definitely using this on future occasions. But for now, I'd tend to leave it as is and remove this code with the MVP release.
image/templates/helm/stackrox-secured-cluster/templates/collector.yaml
Outdated
Show resolved
Hide resolved
| ) | ||
|
|
||
| // NodeInventorizer is the interface that defines the interface a scanner must implement | ||
| type NodeInventorizer interface { |
There was a problem hiding this comment.
How come we changed the name from NodeScanner? I figure if we define a Scan function, NodeScanner makes sense. I don't think Inventorizer is a word, so I'm a bit skeptical using it to name this interface
There was a problem hiding this comment.
We originally planned this as NodeScanning V2, but later changed it to NodeInventory, to improve separation, because the changes were rather confusing, as NodeScanning is already used in multiple places in the codebase.
I decided to reflect these changes in the implementation as well, to make it more clear we're talking about a NodeInventory and not the already existing NodeScan.
I concur that the name doesn't feel common, as it's a nonstandard word.
I'd be happy to consider alternatives, because I currently don't know how we could call it otherwise (except reverting back to NodeScanner, which would also be an option). WDYT?
| Scan(nodeName string) (*storage.NodeInventory, error) | ||
| } | ||
|
|
||
| // NodeInventoryCollector is the implementation of NodeInventorizer |
There was a problem hiding this comment.
nit: it is "an" implementation. not necessarily "the" only one
|
|
||
| // Scan scans the current node and returns the results as storage.NodeInventory object | ||
| func (n *NodeInventoryCollector) Scan(nodeName string) (*storage.NodeInventory, error) { | ||
| componentsHost, err := nodes.Analyze(nodeName, "/host/", false) |
There was a problem hiding this comment.
Also maybe comment why we use false here
There was a problem hiding this comment.
Also, since we have a log to say when we finish, can we have a log to state when we start?
There was a problem hiding this comment.
I've updated the code, added a more detailed comment and update the loglevel
| log.Errorf("Error scanning node /host inventory: %v", err) | ||
| return nil, err | ||
| } | ||
| log.Infof("Components found under /host: %v", componentsHost) |
There was a problem hiding this comment.
do we plan on keeping this? If so, then maybe do a Debug log instead
| return m, nil | ||
| } | ||
|
|
||
| func protoComponentsFromScanComponents(c *nodes.Components) *scannerV1.Components { |
There was a problem hiding this comment.
I believe we discussed probably moving this inside the Scanner repo, right?
There was a problem hiding this comment.
Yes, we did. I have created ROX-14029 to track this change for the future. Thank you for reminding me 👍
msugakov
left a comment
There was a problem hiding this comment.
This looks good to me. Some final touches could be:
- Please address my last two comments.
- Style checks are failing for some reason.
- I hope your testing works out fine.
Once you update the code and ping me for re-review, I'll most likely approve it.
|
@Maddosaurus: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Done!
It looks like style checks moved into GH Actions. Maybe a rebase fixes this.
Testing worked out fine with the latest changes from here as well as the combined Scanner PR (1026).
👍 |
Experiment with calls to Analyze
…iance image, temporary switch to bleeding edge scanner PR branch
- Restructured code - Moved nil handling into functions - Add TODOs for tracking open points
- Add changelog for RPM - Rename tmp volume in Helm chart - Rework logging for nodeinventory.go
Co-authored-by: Misha Sugakov <537715+msugakov@users.noreply.github.com>
7f222ff to
17897d7
Compare
|
@msugakov Let's see if the rebase fixed the style check. |
| v1rhelc := make([]*scannerV1.RHELComponent, 0, len(rc.Packages)) | ||
| for _, rhelc := range rc.Packages { | ||
| v1rhelc = append(v1rhelc, &scannerV1.RHELComponent{ | ||
| // TODO(ROX-13936): Find out if ID field is needed here |
There was a problem hiding this comment.
Note: We need a unique ID here. For now, use loop index.
If no index is provided, it is set to 0. If IDs are not unique, results are overwritten in ScannerDB
Idea for long term: Hash(Name+Version) that yields int64.
More information: https://srox.slack.com/archives/C04213W844U/p1667825859123589
vikin91
left a comment
There was a problem hiding this comment.
LGTM as discussed in the sync-review.
The issue with ID in scannerV1.RHELComponent must be addressed (we need a unique int64 for a name:version combination), but I am fine doing it in ROX-13936 (the ticket would need to be slightly repurposed for this).
Description
This PR introduces the real NodeScan functionality that collects installed components on a Node.
Prerequisite for this PR to work correctly is the use of latest Scanner after stackrox/scanner#1026 is merged.
The scanning code requires two changes to the running Compliance image:
rpmdbbinary, provided by theRPMpackage/tmpdir in Compliance (achieved through anemptyDirmount)Checklist
If any of these don't apply, please comment below.
Testing Performed
Deployed to an infra-provisioned OpenShift 4 cluster and observed the logs in the
Compliancecontainer to state the correct namespace (rhcos:4.11) as well as a list of installed components in its log message.Example log line:
Steps to reproduce
go get github.com/stackrox/scanner@mm/rox-12967-fix-rhcos-detection && go mod tidymake docker-build-roxctl-image) and main (make image) images to a private quay repoquay.io/mmeiding/playground./deploy/openshift/deploy.sh