
perf: avoid repeated scan of entire venv via packages_distributions() at import time #16579

Open
bonauer-pf wants to merge 1 commit into googleapis:main from bonauer-pf:fix/avoid-packages-distributions-scan

@bonauer-pf bonauer-pf commented Apr 8, 2026

packages_distributions() scans every installed package in the environment to build a complete module-to-distribution mapping. In large venvs (500+ packages, common with many google-cloud-* libs), this causes multi-second import delays for google.api_core and every library that depends on it.

This PR contains 2 changes:

  • Wrap packages_distributions() with functools.cache so the expensive O(n) scan happens at most once per process.
  • Defer the package label resolution in check_python_version() so it only runs when a warning is actually emitted, not on the common happy path of a supported Python version.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #15015 and #16552.

google-cla bot commented Apr 8, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request optimizes the _get_pypi_package_name function by replacing the slow packages_distributions() scan with targeted metadata.distribution() lookups. It also introduces lazy resolution for package labels to improve performance on supported Python versions. However, the top-level import of importlib.metadata breaks compatibility with Python 3.7. Feedback was provided to handle this as an optional import and to ensure _get_pypi_package_name handles cases where the metadata module is unavailable.

@bonauer-pf bonauer-pf force-pushed the fix/avoid-packages-distributions-scan branch from e63555e to 6425506 April 8, 2026 10:49
@bonauer-pf bonauer-pf changed the title from "perf: avoid full venv scan via packages_distributions() at import time" to "perf: avoid repeated scan of entire venv via packages_distributions() at import time" Apr 8, 2026
@bonauer-pf bonauer-pf marked this pull request as ready for review April 8, 2026 14:03
@bonauer-pf bonauer-pf requested review from a team as code owners April 8, 2026 14:03
@vchudnov-g (Contributor) commented

Thanks for your PR! It looks very sensible. We'll approve once it passes the presubmits.

Development

Successfully merging this pull request may close these issues.

Import time is very slow (>5s)

2 participants