check surrogates #6547
Caution: Review failed. The pull request is closed.
Estimated code review effort: 4 (Complex) | ~45 minutes
Pre-merge checks and finishing touches:
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
5bed3d3 to 4457769
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/builtins/type.rs (1)
1359-1379: Validate __slots__ string items for UTF-8 surrogates to match validation of type name and __doc__.
The PR adds UTF-8 surrogate validation for type names (line 1166) and __doc__ (line 1315), but __slots__ string items lack this validation. Currently, slot names containing surrogates (e.g., '\ud800') would pass the bytes check and try_into_typed without validation, then propagate through mangle_name and intern_str. Calling ensure_valid_utf8 on each slot string after conversion would maintain consistency with the validation applied elsewhere.
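As a rough Python-level illustration of the check suggested above, the sketch below assumes that ensure_valid_utf8 rejects any string that cannot be encoded as strict UTF-8, which is the case for strings holding surrogate code points. The helper names contains_surrogate and invalid_slot_names are hypothetical and are not part of RustPython or CPython.

```python
def contains_surrogate(text: str) -> bool:
    """Return True if text holds a surrogate code point (U+D800..U+DFFF)."""
    try:
        # Strict UTF-8 encoding refuses surrogate code points.
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


def invalid_slot_names(slots):
    """Hypothetical helper: collect the str slot names a validator would reject."""
    return [
        name
        for name in slots
        if isinstance(name, str) and contains_surrogate(name)
    ]
```

Non-string items are deliberately skipped here, since the review treats the bytes check as a separate step.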
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_builtin.py is excluded by !Lib/**
📒 Files selected for processing (2)
crates/vm/src/builtins/str.rs
crates/vm/src/builtins/type.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/builtins/str.rs
crates/vm/src/builtins/type.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run snippets and cpython tests (macos-latest)
🔇 Additional comments (6)
crates/vm/src/builtins/str.rs (1)
444-461: LGTM - Visibility change enables proper UTF-8 validation across the crate.
The pub(crate) visibility is the appropriate scope for this helper, allowing type construction paths to validate strings without exposing it publicly.
crates/vm/src/builtins/type.rs (5)
1154-1177: LGTM - Proper UTF-8 validation for type name setter.
The validation is correctly placed after the null character check, ensuring type names don't contain surrogates. This aligns with CPython's behavior.
1252-1259: LGTM - UTF-8 validation during type construction.
Correctly validates the type name immediately after the null character check, preventing type creation with surrogate-containing names.
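The two type name checks above can be observed from Python. This is a minimal sketch of the behavior the PR targets, assuming CPython semantics in which a null character in a type name raises ValueError and a surrogate raises UnicodeEncodeError. The helper names try_type_name and try_rename are hypothetical.

```python
def try_type_name(name):
    """Create a type with the given name; return the exception class name or None."""
    try:
        type(name, (), {})
    except ValueError as exc:  # UnicodeEncodeError is a ValueError subclass
        return type(exc).__name__
    return None


def try_rename(cls, name):
    """Assign cls.__name__; return the exception class name or None."""
    try:
        cls.__name__ = name
    except ValueError as exc:
        return type(exc).__name__
    return None


class Sample:
    pass
```

On CPython, try_type_name('A\x00B') reports a ValueError and try_type_name('A\udcdcB') reports a UnicodeEncodeError; the setter path via try_rename(Sample, ...) behaves the same way and leaves the existing name untouched on failure.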
1311-1316: LGTM - Validates __doc__ strings for surrogates during type creation.
The check correctly handles the optional nature of __doc__, only validating when it exists and is a string.
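A short sketch of the __doc__ case, again assuming CPython semantics: only a string __doc__ is checked, so a missing or non-string value passes through unchanged. The helper name doc_error is hypothetical.

```python
def doc_error(doc):
    """Create a type with the given __doc__; return the exception class name or None."""
    try:
        type("A", (), {"__doc__": doc})
    except ValueError as exc:  # covers UnicodeEncodeError as well
        return type(exc).__name__
    return None
```

For example, doc_error('x\udcdcy') reports a UnicodeEncodeError on CPython, while doc_error('plain text'), doc_error(42), and doc_error(None) all succeed.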
1351-1358: LGTM - Rejects bytes as __slots__ value.
This correctly rejects __slots__ = b"foo" with an appropriate error message, matching CPython's behavior.
1366-1373: LGTM - Rejects bytes items within __slots__ iterable.
Correctly handles cases like __slots__ = ['a', b'b'] by validating each item during iteration.
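Both bytes cases can be reproduced with a small probe. This sketch assumes CPython semantics, where a bytes value for __slots__ and a bytes item inside the __slots__ iterable each raise TypeError. The helper name slots_error is hypothetical, and the exact error message is not asserted because it may differ between implementations.

```python
def slots_error(slots):
    """Create a type with the given __slots__; return the exception class name or None."""
    try:
        type("A", (), {"__slots__": slots})
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None
```

Here slots_error(b'foo') and slots_error(['a', b'b']) both report a TypeError, while slots_error(['a', 'b']) succeeds.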
Code has been automatically formatted. The code in this PR has been formatted; pull the latest changes with: git pull origin surrogate-check
755c0a8 to c29d54a