
ROX-13819: Recreate groups bucket #4068

Merged
dhaus67 merged 3 commits into master from
master-dh/ROX-13819-recreate-groups-bucket-
Dec 8, 2022

Conversation

@dhaus67 dhaus67 commented Dec 8, 2022

Description

We are currently running into an issue, ROX-13819, where we observe invalid group entries.

Although there was an earlier attempt to fix this with a migration, it turns out that the customer's specific bucket was not migrated properly.

The reason for this is currently unknown.

Now, to ensure we start with a clean bucket, we have agreed on doing the following:

  • Read all the bucket entries into memory (since the groups store is a low-frequency, low-volume store, this should be of no concern).
  • Drop the groups bucket.
  • Re-create the groups bucket with the key-value pairs held in memory. If it cannot be determined that a key is an ID belonging to a group, the entry is dropped and logged.

This way, we ensure we are reinstating a clean groups bucket after this migration.
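
The three steps above can be sketched roughly as follows. This is a minimal illustration and not the migration code from this PR: the `store` type is a hypothetical in-memory stand-in for the database, and `isGroupKey` is an assumed check based on the `io.stackrox.authz.group.` prefix visible in the `/v1/groups` output, whereas the real migration inspects the unmarshalled value.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// groupIDPrefix is the ID prefix visible in the /v1/groups output of this PR.
const groupIDPrefix = "io.stackrox.authz.group."

// store is a hypothetical stand-in for the database: bucket name -> key -> value.
type store map[string]map[string][]byte

// isGroupKey is an assumed validity check: the key must be a prefixed group ID
// and the value must be non-empty. The real migration unmarshals the value instead.
func isGroupKey(key string, value []byte) bool {
	return strings.HasPrefix(key, groupIDPrefix) &&
		len(key) > len(groupIDPrefix) &&
		len(value) > 0
}

// recreateBucket reads all entries into memory, drops the bucket, and
// re-creates it with the valid entries only. It returns the dropped keys, sorted.
func recreateBucket(db store, name string) []string {
	// 1. Copy every entry into memory.
	inMemory := make(map[string][]byte, len(db[name]))
	for k, v := range db[name] {
		inMemory[k] = append([]byte(nil), v...)
	}

	// 2. Drop the bucket.
	delete(db, name)

	// 3. Re-create the bucket from the in-memory copy, skipping invalid entries.
	db[name] = make(map[string][]byte)
	var dropped []string
	for k, v := range inMemory {
		if !isGroupKey(k, v) {
			dropped = append(dropped, k) // a real migration would also log this
			continue
		}
		db[name][k] = v
	}
	sort.Strings(dropped)
	return dropped
}

func main() {
	db := store{"groups": {
		groupIDPrefix + "340a6739-4b5c-4bef-b229-a7755b621593": []byte("serialized group"),
		"some-legacy-key": []byte("serialized props"),
	}}
	dropped := recreateBucket(db, "groups")
	fmt.Printf("kept=%d dropped=%v\n", len(db["groups"]), dropped)
	// prints: kept=1 dropped=[some-legacy-key]
}
```

Holding everything in memory keeps the sketch simple, which matches the assumption above that the groups store is small.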

Checklist

  • Investigated and inspected CI test results
  • Unit test and regression tests added
  • Evaluated and added CHANGELOG entry if required
  • Determined and documented upgrade steps
  • Documented user facing changes (create PR based on openshift/openshift-docs and merge into rhacs-docs)

If any of these don't apply, please comment below.

Testing Performed

1. Create a prior instance (e.g. 3.73.0 release) _without_ postgres enabled.
2. Create a couple of groups.
roxcurl /v1/groups -X POST -d '{"roleName": "", "props": {"authProviderId": "something", "key": "somewhere", "value":"somehow"}}'

roxcurl /v1/groups
{"groups":[{"props":{"id":"io.stackrox.authz.group.340a6739-4b5c-4bef-b229-a7755b621593","traits":null,"authProviderId":"something","key":"somewhere","value":"somehow"},"roleName":"abc"},{"props":{"id":"io.stackrox.authz.group.37817f74-8b68-47a9-b7e3-b326739c4b96","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"name","value":"aa"},"roleName":"Sensor Creator"},{"props":{"id":"io.stackrox.authz.group.3fe77b94-01aa-41ea-b6e5-d42dd8606451","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"","value":""},"roleName":"Admin"},{"props":{"id":"io.stackrox.authz.group.7e6d645a-ba75-445e-9402-6fbfbb743912","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"groups","value":"adsadasd"},"roleName":"Sensor Creator"},{"props":{"id":"io.stackrox.authz.group.ab329ef5-29be-4bcd-859d-062303455574","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"userid","value":"asdasdasdasdasd"},"roleName":"Analyst"},{"props":{"id":"io.stackrox.authz.group.be588a60-1cd7-4704-85f9-190bd69b1a1d","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"groups","value":"asdqweqeqweqw"},"roleName":"Vulnerability Management Approver"},{"props":{"id":"io.stackrox.authz.group.c6eda2df-479d-437a-a958-c95a197f2d0d","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"userid","value":"123123123123"},"roleName":"Analyst"},{"props":{"id":"io.stackrox.authz.group.e32849f3-2be7-4914-b53a-a36d369464b4","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"email","value":"someone@somewhere"},"roleName":"Vulnerability Management Approver"}]}

roxcurl /v1/groups | jq -r '.groups | length'
8

4. Upgrade to the current version which includes the migration.
5. Observe within the logs that the migration for sequence 113 is executed:
Migrator: 2022/12/08 21:44:56.923080 log.go:18: Info: Run migrator.run() with version: 3.73.x-188-ge0a6c74774, DB sequence: 113
Migrator: 2022/12/08 21:44:56.923511 log.go:18: Info: conf.Maintenance.ForceRollbackVersion: none
pkg/migrations: 2022/12/08 21:44:56.923987 migration_version.go:58: Info: Migration version of database at /var/lib/stackrox/current: &{/var/lib/stackrox/current 3.73.0 112 0001-01-01 00:00:00 +0000 UTC}
clone/rocksdb: 2022/12/08 21:44:56.924136 db_clone_manager_impl.go:66: Info: Found clone current -> .db-init
clone/rocksdb: 2022/12/08 21:44:56.924223 db_clone_manager_impl.go:124: Info: Database clones:
clone/rocksdb: 2022/12/08 21:44:56.924275 db_clone_manager_impl.go:126: Info: current -> &{/var/lib/stackrox/current 3.73.0 112 0001-01-01 00:00:00 +0000 UTC}
pkg/migrations: 2022/12/08 21:44:56.931565 migration_version.go:58: Info: Migration version of database at /var/lib/stackrox/.db-46ce3067-d797-49f2-9909-1efb6b7c9b04: &{/var/lib/stackrox/.db-46ce3067-d797-49f2-9909-1efb6b7c9b04 3.73.0 112 0001-01-01 00:00:00 +0000 UTC}
Migrator: 2022/12/08 21:44:56.931671 log.go:18: Info: Clone to Migrate "temp", ""
Migrator: 2022/12/08 21:44:56.931720 log.go:13: Info: starting DB compaction
Migrator: 2022/12/08 21:44:56.931865 log.go:18: Info: Free fraction of 0.0625 (16384/262144) is < 0.7500. Will not compact
Migrator: 2022/12/08 21:44:57.042237 log.go:18: Info: In runner.Run
Migrator: 2022/12/08 21:44:57.042567 log.go:18: Info: Found DB at version 112, which is less than what we expect (113). Running migrations...
Migrator: 2022/12/08 21:44:57.050522 log.go:18: Info: Successfully updated DB from version 112 to 113

7. Verify that the number of groups is the same as beforehand:
roxcurl /v1/groups
{"groups":[{"props":{"id":"io.stackrox.authz.group.340a6739-4b5c-4bef-b229-a7755b621593","traits":null,"authProviderId":"something","key":"somewhere","value":"somehow"},"roleName":"abc"},{"props":{"id":"io.stackrox.authz.group.37817f74-8b68-47a9-b7e3-b326739c4b96","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"name","value":"aa"},"roleName":"Sensor Creator"},{"props":{"id":"io.stackrox.authz.group.3fe77b94-01aa-41ea-b6e5-d42dd8606451","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"","value":""},"roleName":"Admin"},{"props":{"id":"io.stackrox.authz.group.7e6d645a-ba75-445e-9402-6fbfbb743912","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"groups","value":"adsadasd"},"roleName":"Sensor Creator"},{"props":{"id":"io.stackrox.authz.group.ab329ef5-29be-4bcd-859d-062303455574","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"userid","value":"asdasdasdasdasd"},"roleName":"Analyst"},{"props":{"id":"io.stackrox.authz.group.be588a60-1cd7-4704-85f9-190bd69b1a1d","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"groups","value":"asdqweqeqweqw"},"roleName":"Vulnerability Management Approver"},{"props":{"id":"io.stackrox.authz.group.c6eda2df-479d-437a-a958-c95a197f2d0d","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"userid","value":"123123123123"},"roleName":"Analyst"},{"props":{"id":"io.stackrox.authz.group.e32849f3-2be7-4914-b53a-a36d369464b4","traits":null,"authProviderId":"a779715b-d696-4c6d-ae9b-3aaf953554ff","key":"email","value":"someone@somewhere"},"roleName":"Vulnerability Management Approver"}]}

roxcurl /v1/groups | jq -r '.groups | length'
8
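
Comparing counts alone would not catch a group being replaced by a different one. As a possible extension of the check above (not something that was run for this PR), the two `/v1/groups` responses could be compared by ID. The sketch below assumes the responses were saved as JSON; `groupIDs` and `missingIDs` are hypothetical helper names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// groupsResponse mirrors only the fields of the /v1/groups output used here.
type groupsResponse struct {
	Groups []struct {
		Props struct {
			ID string `json:"id"`
		} `json:"props"`
	} `json:"groups"`
}

// groupIDs extracts the sorted group IDs from a /v1/groups JSON response.
func groupIDs(raw []byte) ([]string, error) {
	var resp groupsResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(resp.Groups))
	for _, g := range resp.Groups {
		ids = append(ids, g.Props.ID)
	}
	sort.Strings(ids)
	return ids, nil
}

// missingIDs returns the IDs present in before but absent from after.
func missingIDs(before, after []string) []string {
	seen := make(map[string]bool, len(after))
	for _, id := range after {
		seen[id] = true
	}
	var missing []string
	for _, id := range before {
		if !seen[id] {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	// Shortened sample taken from the output above; in practice, use the
	// saved responses from before and after the upgrade.
	before := []byte(`{"groups":[{"props":{"id":"io.stackrox.authz.group.340a6739-4b5c-4bef-b229-a7755b621593"},"roleName":"abc"}]}`)
	after := before

	b, err := groupIDs(before)
	if err != nil {
		panic(err)
	}
	a, err := groupIDs(after)
	if err != nil {
		panic(err)
	}
	fmt.Printf("before=%d after=%d missing=%v\n", len(b), len(a), missingIDs(b, a))
}
```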


@dhaus67 dhaus67 requested a review from md2119 December 8, 2022 19:21

dhaus67 commented Dec 8, 2022

/retest

ghost commented Dec 8, 2022

Images are ready for the commit at 490a5bb.

To use with deploy scripts, first export MAIN_IMAGE_TAG=3.73.x-188-g490a5bb90d.

@ivan-degtiarenko ivan-degtiarenko left a comment

Some minor comments + questions regarding why we are leaving some groups behind

if err := proto.Unmarshal(value, &group); err != nil {
	return false
}
return group.GetProps().GetId() != ""

Also curious why we give up on groups without props id here

So, after the migration we had beforehand, no group should exist that doesn't have an ID associated with it.
This check is just to ensure we have no "invalid" group properties in the store (if you tried to delete such a group, you wouldn't be able to at the moment).

@md2119 md2119 left a comment

You need to update the sequence number in pkg/migrations/internal/seq_num.go. After that, 🚢.

md2119 commented Dec 8, 2022

Also, add the 3.73.1-rc.2 milestone to the PR so that it gets picked into the patch.

@dhaus67 dhaus67 added this to the 3.73.1-rc.2 milestone Dec 8, 2022

dhaus67 commented Dec 8, 2022

/retest

@dhaus67 dhaus67 force-pushed the master-dh/ROX-13819-recreate-groups-bucket- branch from e0a6c74 to 490a5bb Compare December 8, 2022 21:28

dhaus67 commented Dec 8, 2022

/retest

@ivan-degtiarenko ivan-degtiarenko left a comment

🚀

@dhaus67 dhaus67 enabled auto-merge (squash) December 8, 2022 21:48
@dhaus67 dhaus67 merged commit c5755ab into master Dec 8, 2022
@dhaus67 dhaus67 deleted the master-dh/ROX-13819-recreate-groups-bucket- branch December 8, 2022 21:55
@dhaus67 dhaus67 removed this from the 3.73.1-rc.2 milestone Dec 9, 2022
dhaus67 added a commit that referenced this pull request Dec 9, 2022
rhybrillou pushed a commit that referenced this pull request Dec 13, 2022
rhybrillou pushed a commit that referenced this pull request Dec 13, 2022
rhybrillou pushed a commit that referenced this pull request Dec 13, 2022
rhybrillou pushed a commit that referenced this pull request Dec 13, 2022
rhybrillou pushed a commit that referenced this pull request Dec 14, 2022
rhybrillou pushed a commit that referenced this pull request Dec 14, 2022
rhybrillou pushed a commit that referenced this pull request Dec 14, 2022
rhybrillou pushed a commit that referenced this pull request Dec 15, 2022
rhybrillou pushed a commit that referenced this pull request Dec 15, 2022
rhybrillou pushed a commit that referenced this pull request Dec 15, 2022
rhybrillou pushed a commit that referenced this pull request Jan 5, 2023
rhybrillou pushed a commit that referenced this pull request Jan 5, 2023
rhybrillou pushed a commit that referenced this pull request Jan 5, 2023