ROX-30961: migration to populate indicator container start time WIP DO NOT REVIEW #16935
dashrews78 wants to merge 20 commits into master from
Conversation
This change is part of the following stack: Change managed by git-spice.
Skipping CI for Draft Pull Request.
Images are ready for the commit at 302311e. To use with deploy scripts, first
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@ Coverage Diff @@
##           master   #16935      +/-  ##
==========================================
+ Coverage   48.80%   48.82%   +0.01%
==========================================
  Files        2706     2716      +10
  Lines      202090   202398     +308
==========================================
+ Hits        98631    98814     +183
- Misses      95695    95804     +109
- Partials     7764     7780      +16
```
/retest
Caution: There are some errors in your PipelineRun template.
```go
var clusters []string
db.Model(&updatedSchema.ProcessIndicators{}).Distinct("clusterid").Pluck("clusterid", &clusters)
log.Infof("clusters found: %v", clusters)
```
Debug? Applies to a bunch of the messages, and I assume you've set them to Info for purposes of testing and plan to change most of them to Debug after.
Actually, I've set some of these intentionally. What we found with the network flow migration is that customers would look at the log, see it wasn't moving, and restart, assuming it was stuck. So I logged a little extra to try to avoid that.
```go
// have no need to worry about the old schema and can simply perform all our work on the new one.
db := database.GormDB
pgutils.CreateTableFromModel(database.DBCtx, db, updatedSchema.CreateTableProcessIndicatorsStmt)
db = db.WithContext(database.DBCtx).Table(updatedSchema.ProcessIndicatorsTableName)
```
As we discussed, use the clusters table here for a less expensive query.
```go
log.Infof("Processing %s with %d indicators", cluster, len(storeIndicators))
for objBatch := range slices.Chunk(storeIndicators, batchSize) {
	if err = store.UpsertMany(ctx, objBatch); err != nil {
		return errors.Wrap(err, "failed to upsert all converted objects")
```
nit: "upsert %d objects" or "upsert chunk" (these are not converted). There is a repeated instance below, so this applies in other places too.
The repeated instance is unnecessary, because batching with slices.Chunk doesn't leave any leftovers. So really I was upserting them twice, which was probably slow.
charmik-redhat left a comment:
Why migrate by cluster and not directly migrate all processes_indicators in batches? Could there be process indicators without a cluster ID which are skipped this way?
Why copy the generic store too?
I did it purely to remove the mutex locks since there should be no risk of deadlock here as the migration is the only thing running.
```go
defer wg.Done()
err := migrateByCluster(cluster, database)
if err != nil {
	errorList = append(errorList, err)
```
Would concurrent additions to the errorList cause any issue?
```go
ctx, cancel := context.WithTimeout(database.DBCtx, types.DefaultMigrationTimeout)
defer cancel()

store := updatedStore.New(database.PostgresDB)
```
Initialize it outside the function to avoid doing it many times
This looks the same as migrator/migrations/m_212_to_m_213_add_container_start_column_to_indicators/schema/clusters.go
Can the tests use the above clusters schema instead?
/test
/test gke-race-condition-qa-e2e-tests
Indicators without a cluster ID aren't valid; the pipeline adds the cluster to the indicator on create. As for the reason: I was trying to group them in such a way that I could process multiple batches at the same time, so I could remove the mutex lock in the
/retest
/test gke-upgrade-tests
@dashrews78: The following tests failed, say
The migrator and pools don't easily support concurrency. We would need a lot of work to disable statement caching and various other things.
Description
Adds a migration to populate the container start time column in process indicators.
User-facing documentation
Testing and quality
Automated testing
How I validated my change
Unit tests and the upgrade test, plus additional manual testing to ensure the upgrade and rollback worked properly.