Skip to content

ROX-17183: Migrate persistent data #1 - scanner definition#6068

Merged
c-du merged 60 commits intomasterfrom
cong/vul-migrate
May 22, 2023
Merged

ROX-17183: Migrate persistent data #1 - scanner definition#6068
c-du merged 60 commits intomasterfrom
cong/vul-migrate

Conversation

@c-du
Copy link
Contributor

@c-du c-du commented May 17, 2023

Description

Add:

  1. Large objects utilities in addition to the Gorm
  2. Add migration to move scanner definition to blobstore with test.

Checklist

  • Investigated and inspected CI test results
  • Unit test and regression tests added
  • [ ] Evaluated and added CHANGELOG entry if required
  • [ ] Determined and documented upgrade steps
  • [ ] Documented user facing changes (create PR based on openshift/openshift-docs and merge into rhacs-docs)

If any of these don't apply, please comment below.

Testing Performed

  • Add unit test
  • Run a migration test to add a scanner bundle and then upgrade and then verify.
  • Run a migration without scanner bundle and upgrade and then verify.

In addition to reviewing your code, reviewers must also review your testing
instructions and make sure they are sufficient.

@@ -0,0 +1,124 @@
package m179tom180
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

package name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will fix.

@@ -0,0 +1,98 @@
//go:build sql_integration

package m179tom180
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

package

Copy link
Contributor

@dashrews78 dashrews78 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update the sequence number. Should update the minimum sequence number too.

Can you add some thoughts to the description on what the advantages of using gorm in this case are? It feels duplicative from what you already did with the blob store.

@c-du
Copy link
Contributor Author

c-du commented May 18, 2023

The discussions around why use Gorm was documented here. https://docs.google.com/document/d/1OLMylIIoPvqD0yqOdF6fJGF4kSCPznPjTse1pBjdkJE/edit#heading=h.bupciudrwmna

The scenarios are the same. Migrator needs to have backward compatibility to earlier versions while the schema and utilities evolving. The target is to have the schema stable while we still have the freedom to innovate and optimize the central without keep looking back.

Due to lack of facilities and experience, we are still copying codes of stores in addition to the schema in migrations which is cumbersome which IMHO are all feasible with Gorm. What is missing here is to have enough examples and utilities to show people how to create gorm-style migration. We only need to have a frozen schema instead of the whole store.

In this migration, large object is not supported by Gorm but it is easy to add it. This can be used in the future if any other migration may need it.

@c-du c-du requested a review from dashrews78 May 18, 2023 16:55
Copy link
Contributor

@RTann RTann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with Gorm nor the blob stuff, so I just focused on the main migration file and the test

pgutils.CreateTableFromModel(context.Background(), db, schema.CreateTableBlobsStmt)

tx := db.Begin(&sql.TxOptions{Isolation: sql.LevelRepeatableRead})
if err = moveScannerDefination(tx); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: moveScannerDefinitions

// Prepare blob
blob := &storage.Blob{
Name: scannerDefBlobName,
Oid: 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this ID used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This oid is used to reference the large object in the database. It is filled inside the store if this is a new blob.

if err != nil {
return errors.Wrapf(err, "invalid timestamp %v", stat.ModTime())
}
fd, err := os.Open(scannerDefPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be sure to close the fd

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also remove the file, nor no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we do not remove the file. All the persistent data including databases will be left there for rollback.

s.Equal(scannerDefBlobName, blob.GetName())
s.EqualValues(size, blob.GetLength())

fileInfo, err := file.Stat()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking we should get the mod time before calling modeToBlobs to make sure moveToBlobs does not change the time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we need to move the stat out to make sure it does not change time? It is the same stat operation and it does not change modification time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the test, we don't want to assume moveToBlobs is just going to call Stat, so that's why I thought we should check the time before we call moveToBlobs to make sure it didn't change the mod time before making the blob

@c-du c-du requested a review from RTann May 18, 2023 18:10
@dashrews78
Copy link
Contributor

The discussions around why use Gorm was documented here. https://docs.google.com/document/d/1OLMylIIoPvqD0yqOdF6fJGF4kSCPznPjTse1pBjdkJE/edit#heading=h.bupciudrwmna

The scenarios are the same. Migrator needs to have backward compatibility to earlier versions while the schema and utilities evolving. The target is to have the schema stable while we still have the freedom to innovate and optimize the central without keep looking back.

Due to lack of facilities and experience, we are still copying codes of stores in addition to the schema in migrations which is cumbersome which IMHO are all feasible with Gorm. What is missing here is to have enough examples and utilities to show people how to create gorm-style migration. We only need to have a frozen schema instead of the whole store.

In this migration, large object is not supported by Gorm but it is easy to add it. This can be used in the future if any other migration may need it.

If you intend to use this as an example of a Gorm migration, it might be worth updating the README and pointing this out as an example.

Copy link
Contributor

@RTann RTann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two minor things (word misspelling and potential nil dereference). Otherwise LGTM from Scanner side

if os.IsNotExist(err) {
return nil
}
defer utils.IgnoreError(fd.Close)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move this below the error check? It may be possible fd is nil at this point (I'm not 100% sure)

if err = moveScannerDefination(tx); err != nil {
result := tx.Rollback()
if result.Error != nil {
return result.Error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ends up covering up the initial error, so would be nice to wrap?

Copy link
Contributor

@connorgorman connorgorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits, some comments about the tx vs gorm.DB

return tx.Commit().Error
}

func moveScannerDefination(tx *gorm.DB) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: defination -> definition?

if err != nil {
return errors.Wrapf(err, "invalid timestamp %v", stat.ModTime())
}
fd, err := os.Open(scannerDefPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we just open once, then call the Stat() function on the file? Then we don't need to check existence twice

// Prepare blob
blob := &storage.Blob{
Name: scannerDefBlobName,
Oid: 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to initialize, I would just leave it unset

return result.Error
}

if len(targets) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check and distinguish in this case? Migrations only run once and we can be guaranteed that no object will exist in the database because all of this is wrapped in a txn. I think it's a fair assumption that this does not exist and that we will always need to add and create

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could happen when the first migration fails maybe at migration 182 and then we fixed the problem somehow and customer tries again. All migrations should be re-entriable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point, do you think it'd be generally worthwhile to write out the version after each sequence going forward? Not in this PR of course

"github.com/stretchr/testify/suite"
)

type categoriesMigrationTestSuite struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming? should be relation to the test


// Start a new transaction
tx2 := s.gormDB.Begin()
// tx := gormDB.Begin(&sql.TxOptions{Isolation: sql.LevelRepeatableRead})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment?

// returns err
func (o *LargeObject) wrapClose(err error) error {
if closeErr := o.Close(); closeErr != nil {
return closeErr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will cover up the initial error, can we wrap this?

return err
}
defer func() {
err = obj.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will overwrite the initial error if there is an error on close

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Find a better way to join the errors.

defer func() {
err = obj.Close()
}()
_, err = obj.Truncate(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: inline this into an if _, err =
and also the same for copy

// This is originally created with similar API with existing github.com/jackc/pgx
// For more details see: http://www.postgresql.org/docs/current/static/largeobjects.html
type LargeObjects struct {
*gorm.DB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be gorm.Tx?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. gorm.Tx is only an interface to have access to transaction and context operation. That is not enough for large object operations.

Copy link
Contributor

@connorgorman connorgorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits, some comments about the tx vs gorm.DB

Base automatically changed from cong/vuldef to master May 19, 2023 17:24
@c-du c-du requested a review from connorgorman May 19, 2023 17:35
@c-du
Copy link
Contributor Author

c-du commented May 19, 2023

/test ocp-4-10-qa-e2e-tests
/test ocp-4-13-qa-e2e-tests
/test gke-nongroovy-e2e-tests

@openshift-ci
Copy link

openshift-ci bot commented May 19, 2023

@c-du: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/gke-nongroovy-e2e-tests e40addf link false /test gke-nongroovy-e2e-tests
ci/prow/ocp-4-10-qa-e2e-tests e40addf link false /test ocp-4-10-qa-e2e-tests

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@c-du c-du merged commit 220bc3a into master May 22, 2023
@c-du c-du deleted the cong/vul-migrate branch May 22, 2023 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants