-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Update the zipfile + zipimport libraries + tests - v3.13.11 #6639
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update the zipfile + zipimport libraries + tests - v3.13.11 #6639
Conversation
|
Important Review skippedReview was skipped due to path filters ⛔ Files ignored due to path filters (8)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including You can disable this status message by setting the 📝 WalkthroughWalkthroughUpdates SRE constants and tooling: SRE_MAGIC value changed, the SreFlag TEMPLATE bit removed, VM repr mapping no longer lists TEMPLATE, and a new Python script generates Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using git pull origin update_zipfile_zipimport |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
scripts/generate_sre_constants.py (2)
7-16: Consider clarifying ValueError handling.The function correctly avoids unnecessary writes when content hasn't changed. However, catching
ValueErrorin addition toOSErroris unusual for file operations. If there's a specific reason for catchingValueError, consider adding a comment. Otherwise,OSErroralone should suffice.🔎 Simplified exception handling
def update_file(file, content): try: with open(file, "r") as fobj: if fobj.read() == content: return False - except (OSError, ValueError): + except OSError: pass with open(file, "w") as fobj: fobj.write(content) return True
60-82: Minor style improvement for conditional append.The function correctly generates bitflags definitions. Line 75 uses a conditional expression with
append()which always returnsNone. While functional, this is slightly unconventional.🔎 Cleaner conditional logic
content = ["bitflags! {\n"] - content.append(f"{derives}\n") if derives else None + if derives: + content.append(f"{derives}\n") content.append(f" pub struct {struct_name}: {int_t} {{\n")
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (18)
Lib/re/__init__.pyis excluded by!Lib/**Lib/re/_casefix.pyis excluded by!Lib/**Lib/re/_compiler.pyis excluded by!Lib/**Lib/re/_constants.pyis excluded by!Lib/**Lib/re/_parser.pyis excluded by!Lib/**Lib/sre_constants.pyis excluded by!Lib/**Lib/test/test_re.pyis excluded by!Lib/**Lib/test/test_runpy.pyis excluded by!Lib/**Lib/test/test_socket.pyis excluded by!Lib/**Lib/test/test_unittest/test_async_case.pyis excluded by!Lib/**Lib/test/test_zipfile/_path/test_path.pyis excluded by!Lib/**Lib/test/test_zipfile/test_core.pyis excluded by!Lib/**Lib/test/test_zipimport.pyis excluded by!Lib/**Lib/test/zipimport_data/sparse-zip64-c0-0x000000000.partis excluded by!Lib/**Lib/test/zipimport_data/sparse-zip64-c0-0x100000000.partis excluded by!Lib/**Lib/test/zipimport_data/sparse-zip64-c0-0x200000000.partis excluded by!Lib/**Lib/zipfile/__init__.pyis excluded by!Lib/**Lib/zipimport.pyis excluded by!Lib/**
📒 Files selected for processing (3)
crates/sre_engine/src/constants.rscrates/vm/src/stdlib/sre.rsscripts/generate_sre_constants.py
💤 Files with no reviewable changes (1)
- crates/vm/src/stdlib/sre.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/sre_engine/src/constants.rs
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
scripts/generate_sre_constants.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check the WASM package and demo
🔇 Additional comments (7)
crates/sre_engine/src/constants.rs (4)
6-7: LGTM! Clear documentation of auto-generation.The header correctly indicates this file is auto-generated from Python source, which helps prevent inadvertent manual edits and improves maintainability.
65-121: LGTM! Improved code organization.The additional blank lines between enum and bitflags definitions improve readability and align with Rust formatting conventions.
108-120: TEMPLATE flag removal is complete and has no breaking references.The TEMPLATE flag has been successfully removed from SreFlag, and verification confirms no remaining references to
SRE_FLAG_TEMPLATEorre.TEMPLATEexist in the codebase. This aligns with Python's deprecation of the flag in 3.6+.
16-16: Verify SRE_MAGIC aligns with the target CPython version being synchronized.The SRE_MAGIC constant is a version identifier for the regex engine and is properly exported to the Python layer as
MAGICin the sre module. Changing this value could affect regex pattern compilation compatibility.Confirm that the value 20230612 matches the SRE_MAGIC in the specific CPython version (3.12 or 3.13) being synchronized. Check the target CPython source or run
python -c "import _sre; print(_sre.MAGIC)"on the target version to verify alignment.Note: The .pyc file references in the codebase are for Python bytecode, not regex pattern caches, so the primary concern is ensuring the engine version is compatible with the source being synchronized.
scripts/generate_sre_constants.py (3)
1-5: LGTM! Clear script identification.The shebang and documentation clearly identify the script's purpose and usage.
85-132: Build script with trusted input - exec() usage is acceptable.The
exec()call on line 92 executes code fromLib/re/_constants.py. Whileexec()generally poses security risks, this is acceptable here because:
- The input file is part of the trusted codebase
- This is a build-time script run by developers, not production code
- The script's purpose is to read Python constants and generate Rust code
The generated constants correctly match the structure in
constants.rs.
141-184: Remove commented-out code.This large block of commented-out code should be removed. Version control preserves history if needed, and dead code reduces maintainability. Additionally, the hardcoded list at lines 158-168 includes the now-deprecated
TEMPLATEflag, making it outdated.🔎 Remove dead code
- -# dump(f, OPCODES, "SreOpcode", "u32", "") -# dump(f, ATCODES, "SreAtCode", "u32", "AT_") -# dump(f, CHCODES, "SreCatCode", "u32", "CATEGORY_") -# -# def bitflags(typ, int_t, prefix, flags): -# f.write(f"""\ -# bitflags! {{ -# pub struct {typ}: {int_t} {{ -# """) -# for name in flags: -# val = globals()[prefix + name] -# f.write(f" const {name} = {val};\n") -# f.write("""\ -# } -# } -# """) -# -# bitflags("SreFlag", "u16", "SRE_FLAG_", [ -# "TEMPLATE", -# "IGNORECASE", -# "LOCALE", -# "MULTILINE", -# "DOTALL", -# "UNICODE", -# "VERBOSE", -# "DEBUG", -# "ASCII", -# ]) -# -# bitflags("SreInfo", "u32", "SRE_INFO_", [ -# "PREFIX", "LITERAL", "CHARSET", -# ]) -# -# print("done") - - -# if __name__ == "__main__": -# import sys -# if len(sys.argv) > 1: -# constants_file = sys.argv[1] -# else: -# import os -# constants_file = os.path.join(os.path.dirname(__file__), "../../sre-engine/src/constants.rs") -# with open(constants_file, "w") as f:Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
scripts/generate_sre_constants.py (1)
36-58: Address the past review comment regarding enum value handling.A previous review flagged an inconsistency in how
dump_enum()handles values compared todump_bitflags(). The past review comment suggests either:
- Updating
dump_enum()to preserve values when a dict is passed (likedump_bitflags()does), or- Clarifying in the docstring that the function expects a list and will always assign sequential values starting from 0.
Since the current usage passes lists (
OPCODES,ATCODES,CHCODES), option 2 (clarifying the docstring) may be sufficient.
🧹 Nitpick comments (2)
scripts/generate_sre_constants.py (2)
76-76: Use proper if statement for conditional side effects.Using a ternary expression for side effects is not idiomatic Python.
🔎 Proposed fix
content = ["bitflags! {\n"] - content.append(f"{derives}\n") if derives else None + if derives: + content.append(f"{derives}\n") content.append(f" pub struct {struct_name}: {int_t} {{\n")
86-133: Verify error handling for missing keys and file operations.The script uses
exec()to read constants from a trusted source file, which is acceptable for a build-time code generator. However, there's no validation that required keys (MAGIC,OPCODES,ATCODES,CHCODES,SRE_FLAG_*,SRE_INFO_*) exist in the namespace after execution. If the structure ofLib/re/_constants.pychanges, the script will fail with crypticKeyErrorexceptions.Consider adding validation or wrapping the key accesses in try-except blocks to provide clearer error messages.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
scripts/generate_sre_constants.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
scripts/generate_sre_constants.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
scripts/generate_sre_constants.py (2)
86-134: Consider documenting security assumptions forexec()usage.The use of
exec()on line 93 executes arbitrary Python code from the input file. While this is acceptable for a build script reading from a trusted source file in the same repository, consider adding a comment to document this security assumption for future maintainers.Additionally, the function lacks explicit error handling for file operations. While build scripts commonly fail fast, adding a try-except block with descriptive error messages would improve the developer experience when issues occur.
🔎 Optional improvements
def main( infile="Lib/re/_constants.py", outfile_constants="crates/sre_engine/src/constants.rs", ): + # Execute trusted source file to load constants + # WARNING: This executes arbitrary Python code from infile + # Only use with trusted source files from this repository ns = {} - with open(infile) as fp: - code = fp.read() - exec(code, ns) + try: + with open(infile) as fp: + code = fp.read() + except FileNotFoundError: + raise FileNotFoundError(f"Input file not found: {infile}") + except Exception as e: + raise Exception(f"Failed to read {infile}: {e}") + + exec(code, ns)
136-139: LGTM!Standard entry point pattern with support for command-line arguments.
As per coding guidelines, consider running
ruffto ensure PEP 8 compliance and catch any potential linting issues.#!/bin/bash # Description: Run ruff linter on the new Python script to ensure PEP 8 compliance # Check if ruff is available and run it on the script if command -v ruff &> /dev/null; then ruff check scripts/generate_sre_constants.py else echo "ruff is not installed. Install it with: pip install ruff" fi
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
scripts/generate_sre_constants.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
scripts/generate_sre_constants.py
🧠 Learnings (1)
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: Applies to **/*test*.py : NEVER comment out or delete any test code lines except for removing `unittest.expectedFailure` decorators and upper TODO comments
Applied to files:
scripts/generate_sre_constants.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Check the WASM package and demo
🔇 Additional comments (4)
scripts/generate_sre_constants.py (4)
1-5: LGTM!Clear script header with appropriate shebang and documentation.
7-16: LGTM!The idempotent file update logic is well-implemented, only writing when content actually changes.
19-33: LGTM!Appropriate copyright header for generated Rust code.
60-83: LGTM!The bitflags generation logic correctly preserves original values from the dictionary.
|
@terryluan12 can you please split this PR into multiple PRs? Unless there's a connection between all the 20 files. I'd rather review multiple smaller PRs rather than one big PR, this will help with merging what can already be merged and not blocked by other unrelated changes. And tysm for your contributions<3 |
|
probably zipfile and zipimport is related. If splitting re is possible, that will be the best. |
deb1063 to
83ec1c3
Compare
Sounds good! Just done so on #6648 |
|
(Btw, double checking if skip is needed) |
503a471 to
5c07d25
Compare
5c07d25 to
f94cb62
Compare
Lib/test/test_zipfile/test_core.py
Outdated
| with zipfile.ZipFile(TESTFN, "r") as zipfp: | ||
| self._test_read(zipfp, expected_names, self.file_content) | ||
|
|
||
| @unittest.skip('TODO: RUSTPYTHON') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These skips too. Actually, I'd like to entirely remove skip in this patch.
Let's see what's happening without skip.
Usually we avoid skip because it hides bugs and problems. If it still fails, I will append the reason what justify skip
55d87a0 to
b68b623
Compare
|
|
||
|
|
||
| # TODO: RUSTPYTHON | ||
| @unittest.skip("TODO: RUSTPYTHON shift_jis encoding unsupported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@terryluan12 Because this is a setUp issue, skipping the entire tests as previous version is better than marking every functions as skip. They are caused by the single reason
youknowone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much!
…on#6639) * Updated zipimport library + test * Updated zipfile library + test * Annotated failing/erroring tests in test_zipfile and test_zipimport * Changed all skips in `test_core.py` to expectedFailures * skip EncodedMetadataTests --------- Co-authored-by: Jeong YunWon <jeong@youknowone.org>
Also includes updates to
sre_compile.py,sre_constants.py,sre_parse.pySummary by CodeRabbit
Chores
User-visible
✏️ Tip: You can customize this high-level summary in your review settings.