Skip to content

ROX-28253: Change run query for schema#14939

Merged
mtodor merged 1 commit intomasterfrom
mtodor/ROX-28253-add-query-for-schema-fn
Apr 14, 2025
Merged

ROX-28253: Change run query for schema#14939
mtodor merged 1 commit intomasterfrom
mtodor/ROX-28253-add-query-for-schema-fn

Conversation

@mtodor
Copy link
Contributor

@mtodor mtodor commented Apr 9, 2025

Description

This PR has added some refactoring for Postgres query handling and introduces logic that will apply cursors in cases where that is needed.

This is related to changes made in #14636 <- that PR will be modified to use new function.

User-facing documentation

  • CHANGELOG is updated OR update is not needed
  • documentation PR is created and is linked above OR is not needed

Testing and quality

  • the change is production ready: the change is GA or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • no new tests have been added. Since the change is on an already existing function, tests should confirm that nothing is broken.

How I validated my change

  • local unit tests for changed package (not Postgres unit tests)
  • let CI validate it

Summary by Sourcery

Refactor Postgres query handling to optimize cursor usage and improve query execution logic

Enhancements:

  • Introduce a more flexible query execution method that applies cursors only when necessary
  • Improve error handling and retry mechanisms for database queries
  • Simplify query preparation and execution logic

Chores:

  • Deprecate existing RunCursorQueryForSchemaFn method in favor of new RunQueryForSchemaFn
  • Update multiple migration and datastore files to use new query execution method

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Apr 9, 2025

Reviewer's Guide by Sourcery

This PR refactors Postgres query handling to use cursors only when necessary, specifically when the query pagination limit is not set or exceeds the cursor batch size. It introduces a new function RunQueryForSchemaFn that conditionally applies cursors, improving performance by avoiding unnecessary cursor creation for small queries. The changes also include the replacement of direct calls to RunCursorQueryForSchemaFn with the new RunQueryForSchemaFn in multiple files.

Updated class diagram for cursorSession

classDiagram
    class cursorSession {
        -id string
        -tx *postgres.Tx
        -close func()
    }
Loading

File-Level Changes

Change Details Files
Introduces a new function RunQueryForSchemaFn that uses cursors only when the query pagination limit is not set or is greater than the cursor batch size.
  • Introduces RunQueryForSchemaFn to conditionally use cursors based on the query limit.
  • Implements runQueryForSchemaWithCursorFn to handle queries with cursors.
  • Implements runQueryForSchemaFn to handle queries without cursors.
  • Adds helper functions prepareQuery, retryableGetRows, handleRowsWithCallback, and retryableGetCursorSession to improve code modularity and readability.
  • Replaces direct calls to RunCursorQueryForSchemaFn with RunQueryForSchemaFn in multiple files.
pkg/search/postgres/common.go
central/image/datastore/store/postgres/store.go
central/image/datastore/store/v2/postgres/store.go
central/node/datastore/store/postgres/store.go
migrator/migrations/m_168_to_m_169_postgres_remove_clustercve_permission/permissionsetpostgresstore/postgres_plugin.go
migrator/migrations/m_169_to_m_170_collections_sac_resource_migration/permissionsetpostgresstore/postgres_plugin.go
migrator/migrations/m_170_to_m_171_create_policy_categories_and_edges/policycategoryedgepostgresstore/store.go
migrator/migrations/m_170_to_m_171_create_policy_categories_and_edges/policycategorypostgresstore/store.go
migrator/migrations/m_170_to_m_171_create_policy_categories_and_edges/policypostgresstore/store.go
migrator/migrations/m_171_to_m_172_move_scope_to_collection_in_report_configurations/collectionPostgresStore/postgres_plugin.go
migrator/migrations/m_171_to_m_172_move_scope_to_collection_in_report_configurations/reportConfigurationPostgresStore/postgres_plugin.go
migrator/migrations/m_173_to_m_174_group_unique_constraint/stores/previous/postgres_plugin.go
migrator/migrations/m_173_to_m_174_group_unique_constraint/stores/updated/postgres_plugin.go
migrator/migrations/m_174_to_m_175_enable_search_on_api_tokens/oldapitokenpostgresstore/postgres_plugin.go
migrator/migrations/m_176_to_m_177_network_baselines_cidr/networkbaselinestore/postgres_plugin.go
migrator/migrations/m_177_to_m_178_group_permissions/permissionsetpostgresstore/postgres_plugin.go
migrator/migrations/m_178_to_m_179_embedded_collections_search_label/reportconfigstore/postgres_plugin.go
migrator/migrations/m_181_to_m_182_group_role_permission_with_access_one/permissionsetstore/store.go
migrator/migrations/m_182_to_m_183_remove_default_scope_manager_role/apitokenstore/postgres_plugin.go
migrator/migrations/m_182_to_m_183_remove_default_scope_manager_role/groupstore/postgres_plugin.go
migrator/migrations/m_182_to_m_183_remove_default_scope_manager_role/permissionsetstore/postgres_plugin.go
migrator/migrations/m_182_to_m_183_remove_default_scope_manager_role/rolestore/postgres_plugin.go
migrator/migrations/m_183_to_m_184_move_declarative_config_health/declarativeconfig/store/postgres_plugin.go
migrator/migrations/m_183_to_m_184_move_declarative_config_health/integrationhealth/store/postgres_plugin.go
migrator/migrations/m_184_to_m_185_remove_policy_vulnerability_report_resources/permissionsetstore/postgres_plugin.go
migrator/migrations/policymigrationhelper/categorypostgresstorefortest/postgres_policy_category_plugin.go
pkg/search/postgres/store.go

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@mtodor mtodor changed the title ROX-28253: Change run query for schema to use cursors only when required ROX-28253: Change run query for schema Apr 9, 2025
@rhacs-bot
Copy link
Contributor

rhacs-bot commented Apr 9, 2025

Images are ready for the commit at 37461f4.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.8.x-437-g37461f4181.

@codecov
Copy link

codecov bot commented Apr 9, 2025

Codecov Report

Attention: Patch coverage is 52.11268% with 34 lines in your changes missing coverage. Please review.

Project coverage is 48.95%. Comparing base (d174ead) to head (37461f4).
Report is 46 commits behind head on master.

Files with missing lines Patch % Lines
pkg/search/postgres/common.go 52.11% 30 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #14939      +/-   ##
==========================================
- Coverage   48.96%   48.95%   -0.02%     
==========================================
  Files        2550     2551       +1     
  Lines      187228   187411     +183     
==========================================
+ Hits        91677    91747      +70     
- Misses      88301    88404     +103     
- Partials     7250     7260      +10     
Flag Coverage Δ
go-unit-tests 48.95% <52.11%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mtodor mtodor force-pushed the mtodor/ROX-28253-add-query-for-schema-fn branch from 18262f3 to 7008d28 Compare April 9, 2025 15:33
@dashrews78
Copy link
Contributor

How did you test this? I'm not sure CI is sufficient as I would imagine by and large the CI queries and flows would come in under the cursor threshold. I would think a new unit test would probably be valid as well since we are creating a new path.

@janisz janisz self-requested a review April 10, 2025 14:42
@mtodor mtodor force-pushed the mtodor/ROX-28253-add-query-for-schema-fn branch from 7008d28 to 37461f4 Compare April 11, 2025 14:04
@mtodor
Copy link
Contributor Author

mtodor commented Apr 11, 2025

@dashrews78 I have changed functions. Based on @janisz tests, using a cursor is expensive, and we don't want to keep it everywhere. I didn't add a test because there is no if anymore. All common functions are primarily tested via store functions. It will be the same with RunQueryForSchemaFn when we start using it. Please take a look.

@janisz, can you please check this PR again?

@dashrews78
Copy link
Contributor

@dashrews78 I have changed functions. Based on @janisz tests, using a cursor is expensive, and we don't want to keep it everywhere. I didn't add a test because there is no if anymore. All common functions are primarily tested via store functions. It will be the same with RunQueryForSchemaFn when we start using it. Please take a look.

@janisz, can you please check this PR again?

I will be honest, these changes have been on blast and I have not had the time to keep up due to critical feature priority. So forgive me if the questions I'm about to ask have been addressed elsewhere.

Regarding the "cursor is expensive" statement, there are 2 sides to that equation. Central expense and Postgres expense. Have we looked at how these changes impact consumption of resources on Postgres? Do we have quantification of the cursor expense some place? What type of load have these tests been performed? What set of objects and queries were primarily used?

@janisz
Copy link
Contributor

janisz commented Apr 11, 2025

From my tests on alerts, cursor is 20 to 30% slower than without it.
I have not checked the impact on the postgres yet. We will summarize it in one place and decide if we need cursors or not.

@janisz
Copy link
Contributor

janisz commented Apr 14, 2025

Regarding the "cursor is expensive" statement, there are 2 sides to that equation. Central expense and Postgres expense. Have we looked at how these changes impact consumption of resources on Postgres? Do we have quantification of the cursor expense some place? What type of load have these tests been performed? What set of objects and queries were primarily used?

I think we are not getting it worse as we keep Walk with cursor and GetBy... without it. Eventually we can discuss if we should keep cursor and maybe merge Walk with GetByQuery or even GetMany (if we don't need missing indices) but for now I think migration to a callback give us more performance.

@mtodor mtodor merged commit 9831caa into master Apr 14, 2025
91 checks passed
@mtodor mtodor deleted the mtodor/ROX-28253-add-query-for-schema-fn branch April 14, 2025 08:32
@dashrews78
Copy link
Contributor

Regarding the "cursor is expensive" statement, there are 2 sides to that equation. Central expense and Postgres expense. Have we looked at how these changes impact consumption of resources on Postgres? Do we have quantification of the cursor expense some place? What type of load have these tests been performed? What set of objects and queries were primarily used?

I think we are not getting it worse as we keep Walk with cursor and GetBy... without it. Eventually we can discuss if we should keep cursor and maybe merge Walk with GetByQuery or even GetMany (if we don't need missing indices) but for now I think migration to a callback give us more performance.

For posterity, I just want to point out that I don't think any of that actually answered my questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants