Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Jan 20, 2026

cc @ShaharNaveh @moreal @terryluan12

To understand better what lib is related to its dependencies

$ python3 scripts/update_lib deps dis            
[+] lib: cpython/Lib/dis.py
[+] test: cpython/Lib/test/test_dis.py
soft_deps:
- [ ] dis (native: _opcode, sys)
  - [ ] collections (native: _collections, _weakref, itertools, sys)
    - [ ] copy
      - [ ] copyreg
        - [ ] operator (native: _operator, builtins)
          - [x] functools
        - [x] typing
      - [ ] types (native: _types, sys)
        - [x] _collections_abc, functools
      - [ ] weakref (native: _weakref, atexit, gc, itertools, sys)
        - [ ] _weakrefset (native: _weakref)
          - [ ] types
        - [ ] copy
        - [x] _collections_abc
    - [ ] operator (native: _operator, builtins)
      - [x] functools
    - [ ] reprlib (native: _thread, builtins, itertools, math, sys)
    - [x] _collections_abc, heapq, keyword
  - [ ] io (native: _io, _thread, errno, msvcrt, sys)
    - [ ] codecs (native: _codecs, builtins, sys)
      - [ ] encodings (native: _codecs_cn, _codecs_hk, _codecs_iso2022, _codecs_jp, _codecs_kr, _codecs_tw, _multibytecodec, _win_cp_codecs, binascii, sys, unicodedata, zlib)
        - [ ] quopri (native: binascii, sys)
          - [ ] getopt (native: sys)
            - [ ] gettext (native: builtins, errno, sys)
              - [ ] locale (native: _locale, builtins, sys)
                - [ ] os (native: nt, posix, sys)
                  - [ ] ntpath (native: _winapi, nt, sys)
                  - [ ] posixpath (native: errno, posix, pwd, sys)
                  - [ ] subprocess (native: _posixsubprocess, _winapi, builtins, errno, fcntl, grp, msvcrt, pwd, select, sys, time)
                  - [ ] io
                  - [x] _collections_abc, abc, stat, warnings
                - [ ] encodings
                - [x] _collections_abc, functools, re, warnings
              - [ ] os (native: nt, posix, sys)
                - [ ] io, ntpath, posixpath, subprocess
                - [x] _collections_abc, abc, stat, warnings
              - [ ] struct (native: _struct)
              - [ ] copy, operator
              - [x] re, warnings
            - [ ] os (native: nt, posix, sys)
              - [ ] io, ntpath, posixpath, subprocess
              - [x] _collections_abc, abc, stat, warnings
          - [ ] io
        - [ ] codecs, io
        - [x] base64, bz2, re, stringprep
      - [x] warnings
    - [ ] locale (native: _locale, builtins, sys)
      - [ ] encodings, os
      - [x] _collections_abc, functools, re, warnings
    - [ ] os (native: nt, posix, sys)
      - [ ] io, ntpath, posixpath, subprocess
      - [x] _collections_abc, abc, stat, warnings
    - [x] _collections_abc, abc, stat, warnings
  - [ ] opcode (native: _opcode, builtins)
    - [ ] _opcode_metadata
  - [ ] types (native: _types, sys)
    - [x] _collections_abc, functools
  - [x] argparse

Summary by CodeRabbit

  • New Features

    • Added automated library dependency checking workflow with GitHub Actions integration and PR commenting
    • Added CLI command to analyze and display module dependencies with hierarchical views
  • Bug Fixes

    • Improved error handling and path validation in library operations
    • Enhanced robustness in AST parsing and file operations with safer fallback handling
  • Tests

    • Added comprehensive test coverage for dependency analysis and path validation

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 20, 2026

Warning

Rate limit exceeded

@youknowone has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minutes and 4 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between a9d231f and af13255.

📒 Files selected for processing (10)
  • .github/workflows/lib-deps-check.yaml
  • scripts/update_lib/auto_mark.py
  • scripts/update_lib/copy_lib.py
  • scripts/update_lib/deps.py
  • scripts/update_lib/io_utils.py
  • scripts/update_lib/path.py
  • scripts/update_lib/quick.py
  • scripts/update_lib/show_deps.py
  • scripts/update_lib/tests/test_copy_lib.py
  • scripts/update_lib/tests/test_deps.py
📝 Walkthrough

Walkthrough

This PR introduces a comprehensive library dependency checking system via a new GitHub Actions workflow and supporting Python tooling. It adds a deps command to the update_lib CLI, creates new dependency analysis functions for import parsing and module introspection, refactors path handling with reusable utilities, hardens error handling across scripts, and expands test coverage for the new functionality.

Changes

Cohort / File(s) Summary
GitHub Actions Workflow
.github/workflows/lib-deps-check.yaml
New workflow triggered on PR changes to Lib/; clones CPython v3.14.2, detects modified modules, runs dependency checks, and posts sticky PR comments with dependency details or removes them if no changes exist.
CLI Integration
scripts/update_lib/__main__.py
Adds deps subcommand routing that delegates to show_deps_main function for dependency report generation.
Dependency Analysis Core
scripts/update_lib/deps.py
Major enhancements: refactors path construction to use new helpers, adds import parsing functions (parse_lib_imports, get_all_imports), introduces soft/Rust dependency classification (get_soft_deps, get_rust_deps), adds module freshness detection (is_up_to_date), and integrates safe AST parsing.
New Dependency Display Script
scripts/update_lib/show_deps.py
New script that formats and displays hierarchical dependency information with up-to-date status, native extensions, and configurable depth; includes module discovery, tree formatting, and CLI interface.
Path Utility Layer
scripts/update_lib/path.py
Three new helper functions: resolve_module_path (flexible .py/directory resolution), construct_lib_path (standardized Lib/ path building), get_module_name (module extraction handling __init__.py).
Script Refactoring
scripts/update_lib/auto_mark.py, scripts/update_lib/copy_lib.py, scripts/update_lib/quick.py
auto_mark.py: uses safe AST parsing. copy_lib.py: enforces /Lib/ path requirement with fail-fast validation. quick.py: adopts new path utilities, safe file reading, robust directory handling, improved error messaging.
Test Coverage
scripts/update_lib/tests/test_copy_lib.py, scripts/update_lib/tests/test_deps.py
test_copy_lib.py: validates /Lib/ path validation error. test_deps.py: comprehensive tests for import parsing and soft dependency detection covering edge cases (syntax errors, circular imports, missing modules).

Sequence Diagram

sequenceDiagram
    participant GH as GitHub Actions
    participant WF as Workflow
    participant Git as Git
    participant Py as Python Script
    participant CP as CPython Repo
    participant PR as PR Comments API

    GH->>WF: Trigger on PR to Lib/
    WF->>Git: Checkout base + fetch PR head
    WF->>CP: Clone CPython v3.14.2
    WF->>Py: Detect changed files in Lib/
    Py-->>WF: Return module list
    alt Modules Detected
        WF->>Py: Run dependency check
        Py->>CP: Analyze imports & dependencies
        Py-->>WF: Return deps_output
        WF->>PR: Post sticky comment with details
    else No Changes
        WF->>PR: Delete existing comment
    end
    PR-->>GH: Comment posted/removed
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • update_lib hard dependency resolver #6817: Modifies the same scripts/update_lib/ codebase with dependency-resolution functions and path helpers, creating direct code-level overlap.
  • Remake update_lib #6796: Changes the same scripts/update_lib/ tooling including path utilities and module refactoring, indicating concurrent development in the same subsystem.

Suggested reviewers

  • ShaharNaveh

Poem

🐰 Hop, hop—dependencies now clear,
Our scripts check what lib modules hold dear,
Safe parsing guards against syntax's snare,
Path helpers hop through CPython with care,
A workflow watches each Lib change with flair! 🌟

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title '[update_lib] show deps' clearly and concisely describes the main feature addition: introducing dependency visualization for library modules.
Docstring Coverage ✅ Passed Docstring coverage is 92.11% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

Code has been automatically formatted

The code in this PR has been formatted using:

  • ruff format
    Please pull the latest changes before pushing again:
git pull origin show-deps

@youknowone youknowone added the skip:ci Skip running the ci label Jan 20, 2026
@youknowone youknowone force-pushed the show-deps branch 2 times, most recently from 7bc4ef4 to a9d231f Compare January 20, 2026 15:55
@youknowone youknowone marked this pull request as ready for review January 20, 2026 15:56
@youknowone
Copy link
Member Author

probably we need the opposite graph. finding the bottom-most library and solve one by one from there. Then everything could be updated smoothly

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @.github/workflows/lib-deps-check.yaml:
- Around line 67-85: The workflow step "Run deps check" currently expands
untrusted module names directly in the shell via the command starting with
"python scripts/update_lib deps ${{ steps.changed-files.outputs.modules }}
--depth 2", which allows command injection; change this to stop interpolating
the raw variable and instead stream the modules through a safe tool (e.g., use
printf or echo to produce one module per line and pipe into xargs) so each
module is passed as a positional argument and include a "--" marker before
module names to prevent option injection, invoking "python scripts/update_lib
deps -- <module> --depth 2" for each module; keep the rest of the step (writing
/tmp/deps_output.txt and populating GITHUB_OUTPUT and has_output) unchanged.
- Around line 35-60: The duplicate-module check uses a regex substring test in
the condition 'if [[ -n "$module" && ! " $modules " =~ " $module " ]]; then'
which misdetects modules like "os" when "posix" exists; replace this with
exact-match deduplication by tracking seen modules with an associative array
(e.g., declare -A seen) inside the same loop (for file in $changed) and only
append when the module key is not present (check via [[ -z "${seen[$module]}"
]]); after appending set seen[$module]=1 and keep producing the trimmed $modules
and GITHUB_OUTPUT as before.

In `@scripts/update_lib/deps.py`:
- Around line 12-13: Remove the unused safe_read_text import from the import
statement in update_lib/deps.py: update the line that imports from
update_lib.io_utils so it only imports read_python_files and safe_parse_ast,
eliminating safe_read_text to satisfy lint rule F401 and avoid unused-import
warnings.
- Around line 333-369: The directory comparison in is_up_to_date uses
filecmp.dircmp but only inspects the top-level diff_files/left_only/right_only,
missing nested differences; add a recursive check (e.g., a helper like
_dircmp_has_diff) that walks dcmp.subdirs (dcmp.subdirs.values()) and returns
True if any dcmp.diff_files, dcmp.left_only, dcmp.right_only is non-empty at any
level, and use that helper in is_up_to_date when handling the directory branch
so nested subpackage mismatches are detected; reference symbols: is_up_to_date,
get_lib_paths, filecmp.dircmp, and dcmp.subdirs.

In `@scripts/update_lib/tests/test_deps.py`:
- Around line 10-14: The new assertions/logic added to the test file violate the
rule that test files under **/*test*.py may only change expectedFailure
annotations; revert the behavioral changes in tests like get_soft_deps,
get_test_dependencies, get_test_paths, parse_lib_imports, and parse_test_imports
and either (a) move any new validation logic into a non-test helper/module
(e.g., scripts/update_lib/helpers.py) and import it from the tests, or (b) if
the test must change behavior because of a known failing condition, mark the
test with `@pytest.mark.xfail` (or expectedFailure) instead of changing
assertions; ensure only annotation changes remain in the test file.
🧹 Nitpick comments (4)
scripts/update_lib/__main__.py (1)

52-56: Consider updating the usage docstring to include deps.

The top-level usage block doesn’t mention the new subcommand, which can confuse CLI users.

.github/workflows/lib-deps-check.yaml (1)

20-33: Consider pinning external actions and CPython clone to commit SHAs for supply-chain hardening.

The external actions at lines 21 and 63 (actions/checkout@v6.0.1 and actions/setup-python@v6.1.0) and the CPython clone at line 33 (v3.14.2) use floating tags rather than commit SHAs. While these versions are valid and current, GitHub Actions security best practices recommend pinning to full-length commit SHAs and including the tag in a comment for human traceability (e.g., uses: actions/checkout@<sha> # v6.0.1). This reduces supply-chain risk by preventing mutations to the pinned reference.

scripts/update_lib/path.py (1)

112-144: Validate prefer to avoid silent fallbacks.

A typo currently routes to the dir-first branch; consider rejecting unknown values up-front.

♻️ Proposed guard
 def resolve_module_path(
     name: str, prefix: str = "cpython", prefer: str = "file"
 ) -> pathlib.Path:
@@
-    file_path = pathlib.Path(f"{prefix}/Lib/{name}.py")
+    if prefer not in {"file", "dir"}:
+        raise ValueError(f"prefer must be 'file' or 'dir' (got: {prefer})")
+
+    file_path = pathlib.Path(f"{prefix}/Lib/{name}.py")
scripts/update_lib/show_deps.py (1)

200-223: Avoid duplicate output when all is combined with explicit names.

Expansion can repeat modules; de-dupe while preserving order.

♻️ Proposed de-duplication
     expanded_names = []
     for name in names:
         if name == "all":
             expanded_names.extend(get_all_modules(cpython_prefix))
         else:
             expanded_names.append(name)
+    # Preserve order but remove duplicates
+    expanded_names = list(dict.fromkeys(expanded_names))

Comment on lines +35 to +60
- name: Get changed Lib files
id: changed-files
run: |
# Get the list of changed files under Lib/
changed=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} -- 'Lib/*.py' 'Lib/**/*.py' | head -50)
echo "Changed files:"
echo "$changed"
# Extract unique module names (top-level only, skip test/)
modules=""
for file in $changed; do
# Skip test files
if [[ "$file" == Lib/test/* ]]; then
continue
fi
# Extract module name: Lib/foo.py -> foo, Lib/foo/__init__.py -> foo
module=$(echo "$file" | sed -E 's|^Lib/||; s|/__init__\.py$||; s|\.py$||; s|/.*||')
if [[ -n "$module" && ! " $modules " =~ " $module " ]]; then
modules="$modules $module"
fi
done
modules=$(echo "$modules" | xargs) # trim whitespace
echo "Detected modules: $modules"
echo "modules=$modules" >> $GITHUB_OUTPUT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix duplicate-module detection (substring match can skip modules).

Line 52 uses a regex substring check against a concatenated string, which can treat os as already present when posix is in the list. This can silently drop modules from the deps run.

🔧 Proposed fix (exact-match dedup)
-          modules=""
+          modules=""
+          declare -A seen=()
@@
-            if [[ -n "$module" && ! " $modules " =~ " $module " ]]; then
-              modules="$modules $module"
-            fi
+            if [[ -n "$module" && -z "${seen[$module]:-}" ]]; then
+              seen["$module"]=1
+              modules="$modules $module"
+            fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Get changed Lib files
id: changed-files
run: |
# Get the list of changed files under Lib/
changed=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} -- 'Lib/*.py' 'Lib/**/*.py' | head -50)
echo "Changed files:"
echo "$changed"
# Extract unique module names (top-level only, skip test/)
modules=""
for file in $changed; do
# Skip test files
if [[ "$file" == Lib/test/* ]]; then
continue
fi
# Extract module name: Lib/foo.py -> foo, Lib/foo/__init__.py -> foo
module=$(echo "$file" | sed -E 's|^Lib/||; s|/__init__\.py$||; s|\.py$||; s|/.*||')
if [[ -n "$module" && ! " $modules " =~ " $module " ]]; then
modules="$modules $module"
fi
done
modules=$(echo "$modules" | xargs) # trim whitespace
echo "Detected modules: $modules"
echo "modules=$modules" >> $GITHUB_OUTPUT
- name: Get changed Lib files
id: changed-files
run: |
# Get the list of changed files under Lib/
changed=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} -- 'Lib/*.py' 'Lib/**/*.py' | head -50)
echo "Changed files:"
echo "$changed"
# Extract unique module names (top-level only, skip test/)
modules=""
declare -A seen=()
for file in $changed; do
# Skip test files
if [[ "$file" == Lib/test/* ]]; then
continue
fi
# Extract module name: Lib/foo.py -> foo, Lib/foo/__init__.py -> foo
module=$(echo "$file" | sed -E 's|^Lib/||; s|/__init__\.py$||; s|\.py$||; s|/.*||')
if [[ -n "$module" && -z "${seen[$module]:-}" ]]; then
seen["$module"]=1
modules="$modules $module"
fi
done
modules=$(echo "$modules" | xargs) # trim whitespace
echo "Detected modules: $modules"
echo "modules=$modules" >> $GITHUB_OUTPUT
🤖 Prompt for AI Agents
In @.github/workflows/lib-deps-check.yaml around lines 35 - 60, The
duplicate-module check uses a regex substring test in the condition 'if [[ -n
"$module" && ! " $modules " =~ " $module " ]]; then' which misdetects modules
like "os" when "posix" exists; replace this with exact-match deduplication by
tracking seen modules with an associative array (e.g., declare -A seen) inside
the same loop (for file in $changed) and only append when the module key is not
present (check via [[ -z "${seen[$module]}" ]]); after appending set
seen[$module]=1 and keep producing the trimmed $modules and GITHUB_OUTPUT as
before.

Comment on lines +67 to +85
- name: Run deps check
if: steps.changed-files.outputs.modules != ''
id: deps-check
run: |
# Run deps for all modules at once
python scripts/update_lib deps ${{ steps.changed-files.outputs.modules }} --depth 2 > /tmp/deps_output.txt 2>&1 || true
# Read output for GitHub Actions
echo "deps_output<<EOF" >> $GITHUB_OUTPUT
cat /tmp/deps_output.txt >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT
# Check if there's any meaningful output
if [ -s /tmp/deps_output.txt ]; then
echo "has_output=true" >> $GITHUB_OUTPUT
else
echo "has_output=false" >> $GITHUB_OUTPUT
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Guard against command injection from PR-controlled module names.

Line 72 expands untrusted module names directly into a shell command in a pull_request_target workflow. A crafted filename could inject shell tokens. Pipe modules through xargs and include -- to avoid option injection.

🔒 Proposed fix (no shell eval of module names)
-          python scripts/update_lib deps ${{ steps.changed-files.outputs.modules }} --depth 2 > /tmp/deps_output.txt 2>&1 || true
+          printf '%s\n' ${{ steps.changed-files.outputs.modules }} | \
+            xargs -r python scripts/update_lib deps --depth 2 -- > /tmp/deps_output.txt 2>&1 || true
🤖 Prompt for AI Agents
In @.github/workflows/lib-deps-check.yaml around lines 67 - 85, The workflow
step "Run deps check" currently expands untrusted module names directly in the
shell via the command starting with "python scripts/update_lib deps ${{
steps.changed-files.outputs.modules }} --depth 2", which allows command
injection; change this to stop interpolating the raw variable and instead stream
the modules through a safe tool (e.g., use printf or echo to produce one module
per line and pipe into xargs) so each module is passed as a positional argument
and include a "--" marker before module names to prevent option injection,
invoking "python scripts/update_lib deps -- <module> --depth 2" for each module;
keep the rest of the step (writing /tmp/deps_output.txt and populating
GITHUB_OUTPUT and has_output) unchanged.

Comment on lines +12 to +13
from update_lib.io_utils import read_python_files, safe_parse_ast, safe_read_text
from update_lib.path import construct_lib_path, resolve_module_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove unused safe_read_text import.

This is unused and will trip lint (F401).

🧹 Proposed cleanup
-from update_lib.io_utils import read_python_files, safe_parse_ast, safe_read_text
+from update_lib.io_utils import read_python_files, safe_parse_ast
🧰 Tools
🪛 Flake8 (7.3.0)

[error] 12-12: 'update_lib.io_utils.safe_read_text' imported but unused

(F401)

🤖 Prompt for AI Agents
In `@scripts/update_lib/deps.py` around lines 12 - 13, Remove the unused
safe_read_text import from the import statement in update_lib/deps.py: update
the line that imports from update_lib.io_utils so it only imports
read_python_files and safe_parse_ast, eliminating safe_read_text to satisfy lint
rule F401 and avoid unused-import warnings.

Comment on lines +10 to 14
get_soft_deps,
get_test_dependencies,
get_test_paths,
parse_lib_imports,
parse_test_imports,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Test logic changes violate repo test-editing rules.

New test logic/assertions were added in a **/*test*.py file, which conflicts with the rule that only expectedFailure annotations may be changed. Please confirm an exception or move these checks to a non-test module. As per coding guidelines, ...

Also applies to: 236-343

🤖 Prompt for AI Agents
In `@scripts/update_lib/tests/test_deps.py` around lines 10 - 14, The new
assertions/logic added to the test file violate the rule that test files under
**/*test*.py may only change expectedFailure annotations; revert the behavioral
changes in tests like get_soft_deps, get_test_dependencies, get_test_paths,
parse_lib_imports, and parse_test_imports and either (a) move any new validation
logic into a non-test helper/module (e.g., scripts/update_lib/helpers.py) and
import it from the tests, or (b) if the test must change behavior because of a
known failing condition, mark the test with `@pytest.mark.xfail` (or
expectedFailure) instead of changing assertions; ensure only annotation changes
remain in the test file.

@youknowone youknowone merged commit 58c8043 into RustPython:main Jan 20, 2026
8 of 9 checks passed
@youknowone youknowone deleted the show-deps branch January 20, 2026 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip:ci Skip running the ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant