Add migration for groups with invalid values#3789

Merged
dhaus67 merged 3 commits into master from master-dh/fix-groups-migration
Nov 16, 2022

@dhaus67 (Contributor) commented Nov 11, 2022

Description

The initial PR that migrated the groups bucket from a composite key (derived from the storage.GroupProperties set on the group) to a UUID missed groups that had an empty role name.

These groups surfaced recently while looking at a ticket (ROX-13446), in which the groups bucket contained two groups with properties set to nil and an empty role name.

Irrespective of how the groups got there in the first place, the first migration failed to add an ID to groups with either a) no role name set or b) no properties set.

Now that validation has been added, groups that do not reference a role name or an auth provider are considered invalid.

This PR creates a migration that specifically removes those groups.
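
For illustration, the validation rule described above could be sketched roughly as follows. This is a minimal sketch, not the code in this PR: `Group`, `GroupProperties`, and `isInvalidGroup` are simplified, hypothetical stand-ins for the `storage` protobuf types and the actual validation.

```go
package main

import "fmt"

// GroupProperties and Group are simplified, hypothetical stand-ins for the
// storage.GroupProperties and storage.Group messages referenced in the PR.
type GroupProperties struct {
	AuthProviderID string
	Key            string
	Value          string
}

type Group struct {
	Props    *GroupProperties
	RoleName string
}

// isInvalidGroup reports whether a group references no role name or no auth
// provider, which is the rule stated in the description (an assumption about
// how the real validation is expressed).
func isInvalidGroup(g *Group) bool {
	if g == nil || g.RoleName == "" {
		return true
	}
	return g.Props == nil || g.Props.AuthProviderID == ""
}

func main() {
	groups := []*Group{
		{Props: nil, RoleName: ""},
		{Props: &GroupProperties{AuthProviderID: "177332fb-c6ae-4062-ab60-54b42cdeb024"}, RoleName: ""},
		{Props: &GroupProperties{AuthProviderID: "177332fb-c6ae-4062-ab60-54b42cdeb024"}, RoleName: "Admin"},
	}
	for i, g := range groups {
		fmt.Printf("group %d invalid: %t\n", i, isInvalidGroup(g))
	}
}
```

The first two entries correspond to the two shapes of invalid group shown in the manual test below; the role name "Admin" in the third entry is only a placeholder.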

Checklist

  • Investigated and inspected CI test results
  • Unit test and regression tests added
    - [ ] Evaluated and added CHANGELOG entry if required
    - [ ] Determined and documented upgrade steps
    - [ ] Documented user facing changes (create PR based on openshift/openshift-docs and merge into rhacs-docs)

If any of these don't apply, please comment below.

Testing Performed

  • See the unit tests added for the migration.

Additionally, the following tests were done manually to ensure the migration worked as expected:

  1. Create a Central instance that inserts invalid data on startup (I used a custom build that added improper groups with nil properties and an empty role name).
  2. Query the groups endpoint; you should observe the following output:
roxcurl  /v1/groups
{"groups":[{"props":null,"roleName":""},{"props":{"authProviderId":"177332fb-c6ae-4062-ab60-54b42cdeb024","key":"","value":""},"roleName":""}
  3. Upgrade to this version.
  4. Query the groups endpoint again and verify that the previously invalid entries are no longer present:
roxcurl  /v1/groups
{}


ghost commented Nov 11, 2022

Images are ready for the commit at 74da504.

To use with deploy scripts, first export MAIN_IMAGE_TAG=3.72.x-621-g74da504ac6.

// In a prior migration, the groups bucket was migrated from groups being stored by a composite key to groups
// being stored by a UUID.
// Within that migration, groups that had an empty role name were skipped during migration.
// This led to bucket entries where the properties and role name were both empty, thus making it impossible
Contributor

Should we also delete groups with empty role names here as a precaution?

Contributor Author

We do delete groups with empty role names as long as they are still stored under the composite key.

I don't think we need to remove groups with empty role names when they are stored using a unique ID. Currently it would be possible to delete those directly via the API (I also don't expect those to be there, but then again I didn't expect the current situation either). Still, I'd rather not do this, since it would add complexity to the migration.

Contributor

For me, empty role name = invalid group = should be deleted. This corresponds with the goal of this migration as I see it: fix our previous mistake and return the groups bucket to a valid state. Maybe I misunderstood and we do delete groups with an empty role somewhere, but definitely not during startup:

var isEmptyGroupPropertiesF = func(props *storage.GroupProperties) bool {
	if props.GetAuthProviderId() == "" && props.GetKey() == "" && props.GetValue() == "" {
		return true
	}
	return false
}

I won't fixate on this though. If you feel strongly about not adding this to the migration right now, let's put it on hold. For me, it makes sense to do it now given our release cadence.


// 1. Remove the value stored behind the composite key, since the migrated group is now successfully stored.
if err := bucket.Delete(compositeKey); err != nil {
return err
Contributor

Should we make this migration best-effort by appending an error into multierror instead of returning immediately?
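
A best-effort variant along these lines might look like the sketch below. It uses the standard library's `errors.Join` (Go 1.20+) as a stand-in for the `multierror` package mentioned in the review, and `deleteAll`, `deleteFn`, and `errSimulated` are hypothetical names introduced only for this example; `deleteFn` plays the role of `bucket.Delete`.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated is a hypothetical error used only to demonstrate a failing delete.
var errSimulated = errors.New("simulated failure")

// deleteAll attempts every deletion and collects failures instead of
// returning at the first one. deleteFn is a hypothetical stand-in for
// bucket.Delete.
func deleteAll(keys [][]byte, deleteFn func(key []byte) error) error {
	var errs []error
	for _, key := range keys {
		if err := deleteFn(key); err != nil {
			errs = append(errs, fmt.Errorf("deleting key %q: %w", key, err))
		}
	}
	// errors.Join returns nil when no errors were collected.
	return errors.Join(errs...)
}

func main() {
	keys := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	attempted := 0
	err := deleteAll(keys, func(key []byte) error {
		attempted++
		if string(key) == "b" {
			return errSimulated
		}
		return nil
	})
	fmt.Println("attempted:", attempted)
	fmt.Println("error:", err)
}
```

The trade-off is that the caller sees all failures at once, while the remaining keys are still processed after a failed delete.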

for i := range groupStoredByCompositeKeys {
compositeKey := groupStoredByCompositeKeys[i].compositeKey

// 1. Remove the value stored behind the composite key, since the migrated group is now successfully stored.
Contributor

There's no 2

return nil
})
})
return groupsStoredByCompositeKey, err
Contributor

let's explicitly return nil instead of err here

return nil
}

for i := range groupStoredByCompositeKeys {
Contributor

What if we use for _, group := range groupStoredByCompositeKeys? Then we don't need to access the slice element by index within the loop.
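
As a small sketch of that suggestion: the type below is a hypothetical stand-in for the slice element (only the `compositeKey` field appears in the reviewed code), and `collectKeys` exists only for this example.

```go
package main

import "fmt"

// groupStoredByCompositeKey is a hypothetical stand-in for the element type
// of the slice under review; only compositeKey is known from the diff.
type groupStoredByCompositeKey struct {
	compositeKey []byte
}

// collectKeys ranges over the slice by value, so no index lookup is needed
// inside the loop body.
func collectKeys(groups []groupStoredByCompositeKey) [][]byte {
	keys := make([][]byte, 0, len(groups))
	for _, group := range groups {
		keys = append(keys, group.compositeKey)
	}
	return keys
}

func main() {
	groups := []groupStoredByCompositeKey{
		{compositeKey: []byte("first")},
		{compositeKey: []byte("second")},
	}
	for _, key := range collectKeys(groups) {
		fmt.Printf("%s\n", key)
	}
}
```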

suite.Run(t, new(removeGroups))
}

type removeGroups struct {
Contributor

let's call it removeGroupsMigrationSuite or something similar

@ivan-degtiarenko (Contributor) left a comment

LGTM modulo a couple of nits

},
}

// 1. Buckets don't exist should succeed still
Contributor

super-nit

Suggested change
- // 1. Buckets don't exist should succeed still
+ // 1. Migration should succeed if the bucket does not exist


var deleteGroupErrs *multierror.Error
for _, group := range groupStoredByCompositeKeys {
compositeKey := group.compositeKey
Contributor

nit: can be inlined

}

// EncodeBytesList takes a list of byte slices and encodes them into a single byte slice.
func EncodeBytesList(byteSlices ...[]byte) []byte {
Contributor

Who is using it?

Contributor Author

I followed the existing examples for copying stackrox code into the migrations, which is to copy the package as is without modifying it, even though some functions may be unused.


// DecodeBytesList takes a byte buffer encoded via EncodeBytesList or WriteBytesList and decodes it into a list of byte
// slices.
func DecodeBytesList(buf []byte) ([][]byte, error) {
Contributor

Who is using it?

Contributor Author

Same as above.

@dhaus67 dhaus67 force-pushed the master-dh/fix-groups-migration branch from cc540a9 to 74da504 Compare November 15, 2022 23:56
@dhaus67 dhaus67 enabled auto-merge (squash) November 15, 2022 23:57
@dhaus67 dhaus67 merged commit f6d3f9d into master Nov 16, 2022
@dhaus67 dhaus67 deleted the master-dh/fix-groups-migration branch November 16, 2022 00:30
vikin91 added a commit that referenced this pull request Nov 25, 2022
7ffc6be ROX-13368: Skip failing nongroovy tests on PG (#3721)
bbdd7a0 Bump github.com/gofrs/uuid from 4.3.0+incompatible to 4.3.1+incompatible (#3642)
1f253f2 Bump github.com/google/certificate-transparency-go from 1.1.3 to 1.1.4 (#3543)
d434c8d [ROX-13030] : Add delete collection API endpoint and service implementation (#3648)
f062c21 Dashrews/ROX-13253 wait for central-db to come back after bounce and allow FATAL connection lost error (#3537)
edc1174 CI: Fill the gaps for https://testgrid.k8s.io/ (#3715)
86d7c54 ROX-13231: use passed context when non-postgres (#3540)
9093195 Add less specific type for BE collection response string (#3728)
5abb652 Only enable ROX_OBJECT_COLLECTIONS feature flag during gke-postgres-ui-e2e job (#3727)
4f64cd1 Add centralDBOnly mode in render (#3707)
d67bbe5 Dashrews/ROX-13082 UUID searcher and common updates to set allow use of postgres UUID PR 1 of 4 (#3679)
6f829d5 ROX-13259: graphInit called during init time (#3705)
f3bc50d ROX-13380: Conditional rendering edges for deployments and namespaces (#3641)
3764476 ROX-12319: implement smoke test step with groovy test filter (#3220)
f202fd4 ROX-11826: Disable kernel support package uploads for managed central (#3661)
61f03dc ROX-11101: Remove deprecated resources from central (#3115)
e6aa6d7 ROX-11101: Restore Role permission in UI (#3428)
3203e04 ROX-11101: Remove deprecated resources (#3036)
a35f41e Bump golang.org/x/sys from 0.1.0 to 0.2.0 (#3733)
4120524 Bump snakeyaml from 1.29 to 1.33 in /qa-tests-backend (#3732)
e1785c0 Bump github.com/coreos/go-systemd/v22 from 22.4.0 to 22.5.0 (#3724)
870df4a Bump google.golang.org/api from 0.101.0 to 0.102.0 (#3723)
721454c Generalize User-Agent setup (#3672)
6a11bf0 Bumps collector version to 3.11.x-145-gc345f72f5e (#3736)
ec5d343 [ROX-12923] Walk retries - remainder work (#3729)
40f3d43 ROX-13440: Replace ambiguous central with sensor in networkGraph integration test (#3730)
281ed22 Bump groovy-xml from 2.5.18 to 2.5.19 in /qa-tests-backend (#3741)
49d1651 Bump github.com/prometheus/client_golang from 1.13.1 to 1.14.0 (#3742)
b5544aa Bump cloud.google.com/go/storage from 1.27.0 to 1.28.0 (#3743)
9c61e53 ensure CVSS is present for istio vulns (#3706)
ae29d52 ROX-13452: don't always clobber scoped ctx when non-postgres (#3748)
517bf05 ROX-13261: DryRunUpdate on collection datastore (#3687)
baf7654 ROX-13378: Group new resources with deprecated in UI (#3690)
569922f ROX-13421: Enable roxctl netpol generate and add tech-preview messages (#3740)
2465fc5 Dashrews/ROX-13082 UUID generator templates PR 2 of 4 (#3681)
c093c68 Bump slack-api-client from 1.20.2 to 1.27.0 in /qa-tests-backend (#3752)
2c860bb Bump ubi8-micro from 8.6 to 8.7 in /operator (#3751)
80eb04c Make deploy.sh and deploy-local.sh pass shellcheck (#3582)
2182b43 Dashrews/ROX-13082 UUID test updates PR 3 of 4 (#3694)
6dc6ca5 [ROX-13403] : Fix node -> topVuln sub resolver bug when node cves is empty (#3689)
1b21361 Move integration tests for page title from general to specific containers (#3675)
e1a9f31 Bump google.golang.org/api from 0.102.0 to 0.103.0 (#3773)
a05ea31 Bump golang.org/x/crypto from 0.1.0 to 0.2.0 (#3772)
65ddf4f ROX-12824: Add roxctl commands to generate Central DB bundle (#3602)
c3f1e2f Remove obsolete authProviders request for Integrations page (#3759)
7ccd54d Dashrews/ROX-13082 UUID protos generated PR 4 of 4 (#3698)
9ab5c8f cleanup image digest utilities (#3764)
187ed44 ROX-11931: Convert junit failure artifacts to Slack attachments (#3438)
b5d8790 ROX-13432: leaning up unused code copied/pasted from topology demo (#3750)
ab05bfc Refactor collection form page for better composition (#3744)
c5562f7 Remove babel devDependencies in ui-components (#3761)
2b90b3a Extract collection form from drawer wrapper layout (#3745)
a779fc9 [ROX-12625 + ROX-13032] : Add GetCollectionCount and UpdateCollection endpoints and  services (#3749)
e77f0da Upgrade cypress 11.0.0 devDependencies in ui (#3760)
a3fba94 ROX-13068: Use real data for deployment details (#3688)
4c7d90e ROX-12617: Collection to search query converter (#3683)
3e98aec ROX-13067: fill out port configurations section of deployment details (#3714)
a48de36 ROX-12835: Add support for NodeScanV2 to Sensor (#3533)
30c5dc7 ROX-13466: Fix deletion of groups with empty properties (#3756)
5cb2470 Add autocomplete for name selector dropdowns (#3676)
b9a75ad ROX-13464 adding flows dropdown in NG (#3763)
3217a67 [ROX-13500] Perform type check for V1 CronJob (#3787)
af3790d Remove bulk delete from collections table (#3776)
dda123b Add more info in migration log (#3788)
179f0c9 ROX-13502: Remove the circular dependency between cluster datastore init and cscc notifier init (#3790)
029d584 Update SCANNER_VERSION (#3774)
cbca57c Bump github.com/ckaznocha/protoc-gen-lint from 0.2.4 to 0.3.0 (#3783)
3613b56 Bump golang.org/x/tools from 0.2.0 to 0.3.0 (#3782)
5fc0a6a Bump github.com/google/go-containerregistry from 0.12.0 to 0.12.1 (#3781)
1d1c687 Bump controller-gen version to 0.10.0 (#3754)
c3a5290 Untie documentation link from the product version (#3799)
ed822aa use correct package for migration (#3784)
397a0b4 Validate that label keys are valid k8s labels and ensure correct key splitting (#3777)
edd1050 Rename variable ScannerGRPCEndpoint to ScannerSlimGRPCEndpoint (#3657)
6662c9f ROX-13378: Access Control page permissions (#3720)
b0e73c5 fix Operator reconciliation for external Central DB (#3796)
b83bc1f ROX-13505: Fix error log scanning the postgres stat collection (#3795)
ca660cb Prevent the collection being edited from displaying in its own embedded list (#3778)
3f7b3fc [ROX-13441][POSTGRES] Propagate context correctly in retries (#3793)
e0cbc6f ROX-12839: Update changelog to announce removal of in-product docs (#3805)
696e8bc [ROX-12358] Follow up on vulnerability request proto change (#2851)
c4b46d8 Change getCollectionCount endpoint and updateCollection request type
5f2efbc remove make proto-fmt (#3804)
0c75540 Remove os.Std* from roxctl/central (#3758)
25a90de Add ability to view embedded collections in a pop up modal (#3747)
5c1bf81 ROX-13240: fix scanner-slim updates when WebSockets are used (#3704)
1d98577 Add more context to jira notifier logging (#3812)
da2fd28 ROX-13031: DryRun Collection API (#3766)
1c418d5 Test data migration code in postgres tests (#3803)
ed95b37 Update UI Collection requests for BE compatibiltiy (#3762)
09cc188 ROX-11931: Fix junit-parse install in CI (#3811)
d2b01e3 ROX-12814: Disable PolicyFieldsTest on openshift. (#3797)
d10ce27 ROX-13345: disable 'missing required registry' aspect on openshift (#3798)
3d22396 Update collector to 3.12 (#3809)
1eb33fb ROX-13347: Modify scope queries to included quoted cluster and nameace names, to allow exact matches instead of erroneous and unintended prefix matches. (#3767)
3811a69 ROX-12621: list collection selectors api (#3806)
f6d3f9d Add migration for groups with invalid values (#3789)
cc21125 Bugfixes for collection autocomplete (#3816)
7623dec ROX-9350 Use fine-grained host paths for compliance mounts (#2479)
b4bf5c2 Fix collector volumeMounts  (#3826)
0e9be05 ROX-12953: figure out last 4 versions of sensor automatically (#3611)
459c7ae ROX-12814: Add proper todo for reenabling the test (#3817)
9ee40ff ROX-13523: add isEnabled enum to central db spec (#3815)
535bc72 Replace requestConfig with routeMatcherMap in helper functions for integration tests (#3686)
2b75b61 `gosec` G104: Add `ShouldErr(err)` that returns `err` (#3830)
fb1b82f WIP: Introduce nodescan call
35f8a8f WIP: Prepare converter
716144b Moved and renamed fake nodescan tests
4748de7 Introduce real node scanner with conversion functions
5e6d9a8 wip: real scanner
0169868 wip: log results
1438d2c wip: Debug Analyze call
b09894c wip: Debug Analyze call
3ceed72 wip: Update and improve debug logs
d1669fd Remove copied lib, bump scanner version, add debug
14a3f73 Merge branch 'master' into mm/ROX-12967-real-nodescan
fbd0450 Fix style issues
17ccb31 Debug: let both scans finish to see what they return
@dhaus67 dhaus67 mentioned this pull request Dec 8, 2022
5 tasks

3 participants