-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Support pickle better with __getnewargs_ex__ #6749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughRefactors object pickling/reduction to add a new reduce_newobj path and helpers for extracting getnewargs/getstate/slot and item data, adds getstate/setstate to IO wrapper classes for pickling, interns getnewargs_ex in VM context, and removes a legacy Python reducelib module. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro ⛔ Files ignored due to path filters (11)
📒 Files selected for processing (5)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used📓 Path-based instructions (1)**/*.rs📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
🔇 Additional comments (12)
✏️ Tip: You can disable this entire section by setting Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin pickle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/vm/src/stdlib/io.rs`:
- Around line 4105-4140: The StringIO __setstate__ method is missing the
closed-file guard present in BytesIO.__setstate__; add a check at the top of fn
__setstate__(zelf: PyRef<Self>, ...) that tests zelf.closed.load() and returns
the same error used by BytesIO.__setstate__ when the object is closed (use the
same vm error construction so behavior and message match). Ensure this check
runs before mutating self.buffer or seeking so __setstate__ fails fast for
closed StringIO instances.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (7)
Lib/test/test_copy.pyis excluded by!Lib/**Lib/test/test_csv.pyis excluded by!Lib/**Lib/test/test_descr.pyis excluded by!Lib/**Lib/test/test_enum.pyis excluded by!Lib/**Lib/test/test_lzma.pyis excluded by!Lib/**Lib/test/test_memoryio.pyis excluded by!Lib/**Lib/test/test_pickle.pyis excluded by!Lib/**
📒 Files selected for processing (3)
crates/vm/src/builtins/object.rscrates/vm/src/stdlib/io.rscrates/vm/src/vm/context.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style usingcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/vm/context.rscrates/vm/src/builtins/object.rscrates/vm/src/stdlib/io.rs
🧬 Code graph analysis (1)
crates/vm/src/builtins/object.rs (2)
crates/vm/src/builtins/tuple.rs (3)
class(41-43)class(527-529)__getnewargs__(352-362)crates/vm/src/builtins/int.rs (3)
__getnewargs__(560-562)a(623-623)a(659-659)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (ubuntu-latest)
🔇 Additional comments (12)
crates/vm/src/vm/context.rs (1)
138-138: LGTM!The
__getnewargs_ex__constant is correctly added in alphabetical order after__getnewargs__, following the existing pattern for magic method names.crates/vm/src/stdlib/io.rs (4)
161-162: LGTM!The added imports for
PyDictandPyTupleRefare necessary for the new__getstate__and__setstate__methods.
4081-4103: LGTM!The
__getstate__implementation correctly captures StringIO state (content, newline, position, dict). The TODO at line 4096 acknowledges that actual newline handling needs future implementation.
4290-4309: LGTM!The
BytesIO.__getstate__correctly returns a 3-tuple (content, position, dict) which matches CPython's expected state format for BytesIO.
4311-4347: LGTM!The
BytesIO.__setstate__implementation properly validates the closed state, checks for exports viatry_resizable, and correctly restores content, position, and dict.crates/vm/src/builtins/object.rs (7)
626-644: LGTM!The
is_getstate_overriddenfunction correctly detects if a type has overridden__getstate__by comparing the method reference againstobject.__getstate__.
646-656: LGTM!The
object_getstatefunction correctly delegates to either the overridden__getstate__method or the default implementation based on whether the method is overridden.
658-674: LGTM!The
get_items_iterfunction correctly returns iterators for list/dict items when applicable, usingfast_isinstancefor efficient type checks.
676-734: LGTM!The
reduce_newobjfunction correctly implements protocol >= 2 reduction:
- Properly checks for
tp_newavailability- Correctly chooses between
__newobj__and__newobj_ex__based on kwargs presence- Computes the
requiredflag correctly for state retrieval- Returns the expected 5-tuple format
736-743: LGTM!The
common_reducefunction cleanly delegates toreduce_newobjfor protocol >= 2, maintaining the fallback tocopyreg._reduce_exfor older protocols.
564-624: Verify identifier! macro resolution for__getnewargs_ex__.The function uses
identifier!(vm, __getnewargs_ex__)at line 571. Ensure this macro properly resolves to the newly added constant incontext.rs. Unable to verify in current sandbox environment—manual verification or compilation check viacargo clippyrequired.
187-239: No changes needed. The weakref detection logic inobject_getstate_defaultis correct and consistent with the codebase patterns.Both detection mechanisms are intentional and accurate:
- Heap types without explicit
__slots__automatically support weakref—this matches CPython behavior and is confirmed by comments intype.rs(lines 1294-1295).- For builtin types, checking
attributes.contains_key()for__weakref__is the correct method in this codebase to detect weakref support, following the same pattern used for__dict__descriptor detection (type.rs, lines 1441-1446).
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Summary by CodeRabbit
New Features
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.