-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Assign opcode ids #6637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Assign opcode ids #6637
Conversation
📝 WalkthroughWalkthroughRefactors the bytecode Instruction enum to use explicit numeric discriminants, renames and reshapes many opcodes (including argument-bearing variants), and propagates those renames across compiler, codegen, VM, JIT, stdlib metadata, tests, and the opcode-generation script. Changes
Sequence Diagram(s)(omitted) Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
crates/compiler-core/src/bytecode.rs (2)
592-871: Opcode discriminant assignments and TryFrom mapping look soundThe explicit repr(u8) values and the whitelist in
Instruction::try_fromare internally consistent: every value admitted by the match corresponds to a real variant (including the 222–255 range), so themem::transmute::<u8, Instruction>call is sound and there are no “hole” values that would cause UB. From a maintainability perspective, if you expect to churn opcodes often, you might eventually consider an explicitmatch value { ... => Ok(Variant), _ => Err(..) }mapping instead of a broad range + transmute, but the current approach is fine correctness‑wise.Also applies to: 875-915
656-662: AlignCheckExcMatchdocumentation with its current behaviorThe doc comment for
Instruction::CheckExcMatchstill describes it as pushing a boolean result (STACK[-2]vsSTACK[-1]), but the variant now carries aLabelarg, is included inlabel_arg, and itsstack_effectis-2, matching the “duplicate exc + type, then consume both and keep the original exc” pattern used incompile_try_statement. It would be less confusing for future readers if the comment were updated to describe the jump‑on‑no‑match semantics and the fact that it no longer produces a standalone bool on the stack.Also applies to: 1643-1656, 1761-1762
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
crates/codegen/src/compile.rscrates/compiler-core/src/bytecode.rscrates/vm/src/frame.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/frame.rscrates/compiler-core/src/bytecode.rscrates/codegen/src/compile.rs
🧬 Code graph analysis (2)
crates/vm/src/frame.rs (2)
crates/vm/src/vm/mod.rs (2)
value(675-675)value(677-677)crates/vm/src/exceptions.rs (2)
value(426-426)value(590-590)
crates/compiler-core/src/bytecode.rs (1)
crates/compiler-core/src/marshal.rs (1)
try_from(79-111)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run tests under miri
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check the WASM package and demo
🔇 Additional comments (19)
crates/vm/src/frame.rs (10)
557-564: LGTM: Instruction rename is consistent.The rename from
BinarySubscripttoBinarySubscraligns with the PR's opcode standardization objective. The implementation logic remains unchanged.
791-803: LGTM: Semantic clarification with DeleteName.The rename from
DeleteLocaltoDeleteNamebetter reflects that this instruction deletes by name from the locals mapping (usingass_subscript), distinguishing it fromDeleteFastwhich operates on the fastlocals array by index. This improves code clarity.
804-804: LGTM: Instruction rename is consistent.The rename from
DeleteSubscripttoDeleteSubscraligns with the opcode standardization.
1350-1358: LGTM: Instruction rename is consistent.The rename from
PopExceptiontoPopExceptfollows the abbreviated naming convention used elsewhere in the bytecode.
1370-1370: LGTM: Instruction rename is consistent.The rename from
RaisetoRaiseVarargsbetter communicates that this instruction handles variable arguments for raise statements (reraise, raise with cause, etc.).
1418-1418: LGTM: Instruction rename is consistent.The rename from
SetupAnnotation(singular) toSetupAnnotations(plural) is more accurate since the instruction sets up the__annotations__dictionary which holds multiple annotations.
1487-1492: LGTM: Semantic clarification with StoreName.The rename from
StoreLocaltoStoreNamebetter reflects that this instruction stores by name in the locals mapping (usingass_subscript), distinguishing it fromStoreFastwhich stores by index in the fastlocals array. This improves code clarity.
1493-1493: LGTM: Instruction rename is consistent.The rename from
StoreSubscripttoStoreSubscraligns with the opcode standardization and abbreviated naming convention.
530-1578: Ensure code passescargo fmtandcargo clippybefore merging.Per the coding guidelines for Rust files, run
cargo fmtto format the code andcargo clippyto check for lints. Fix any warnings introduced by these changes before final submission.
700-722: CheckExcMatch implementation is correct.The implementation properly validates that exception types inherit from
BaseException, handles both single types and tuples, and leveragesis_instancewhich correctly delegates tuple matching toobject_recursive_isinstance. The bytecode instruction is correctly emitted during exception handler compilation, and the logic accurately mirrors Python's exception matching behavior.crates/compiler-core/src/bytecode.rs (2)
1696-1823: Stack effects for new/renamed opcodes match codegen usageThe updated
stack_effectarms forBinarySubscr,Subscript,StoreSubscr,DeleteSubscr,StoreName/DeleteName,CheckEgMatch,CheckExcMatch,RaiseVarargs,EndAsyncFor,SetupAnnotations, etc., line up with how they’re emitted incompile.rs:
StoreSubscr => -3andDeleteSubscr => -2match the[value, container, index]and[container, index]stack shapes produced bycompile_subscript,compile_store, andcompile_augassign.BinarySubscr => -1is used where we duplicate the subject and then index into it in pattern helpers.RaiseVarargs’s-(kind.get(arg) as u8 as i32)matches the number of values the compiler pushes for eachRaiseKind(0/1/2).CheckEgMatch => 0(pop 2, push 2) is consistent with the[orig, list, rest, type] -> [orig, list, new_rest, match]comment incompile_try_star_except.CheckExcMatch(_) => -2matches theCopyItem+exc_typepattern incompile_try_statement(3 stack entries down to the original exc).I don’t see any inconsistency between these definitions and the codegen sites.
875-915: Confirm bytecode compatibility expectations after changing opcode valuesBy pinning explicit
repr(u8)discriminants and updatingInstruction::try_fromaccordingly, the binary encoding of bytecode has changed. Any previously serializedCodeUnit/CodeObject(e.g., cached.pyc-like artifacts) using the old opcode layout will now either fail to decode (InvalidBytecode) or, if produced by older code that also usedtransmute, could be misinterpreted if reused across versions. If RustPython ever persists bytecode across versions, it would be good to confirm that:
- Either the magic/marshal version is bumped so old caches are rejected explicitly, or
- You guarantee such caches are never reused across a change like this.
Otherwise, you may get hard‑to‑diagnose failures when loading stale compiled artifacts.
crates/codegen/src/compile.rs (7)
445-502: Subscript codegen matches new BinarySubscr/StoreSubscr/DeleteSubscr semanticsThe various subscript paths are internally consistent with the new opcodes and their stack effects:
compile_subscript:
- Load: evaluates
value, thenslice, and callsSubscript, giving[obj, index] -> resultas expected.- Store: when used from
compile_store, we start with the assigned value on the stack; after compilingvalueandslice(andBuildSlicewhen optimizing 2‑element slices) we end up with[value, container, sub], which matchesStoreSubscr => -3.- Delete: compiles
valuethensliceand callsDeleteSubscr, with[container, sub]matchingDeleteSubscr => -2.
pattern_helper_sequence_subscrduplicates the subject and usesBinarySubscrwith[subject, subject, idx]so the helper keeps the original subject while extracting an element, consistent withBinarySubscr => -1.
compile_annotated_assignbuilds[annotation, __annotations__, name]beforeStoreSubscr, so__annotations__[name] = annotationis encoded correctly.
compile_augassign’s subscript case carefully arranges the stack to[result, container, index]beforeStoreSubscr, again matching the[value, container, sub]convention.Given the stack_effect definitions in
bytecode.rs, all of these sites look correct.Also applies to: 1801-1827, 3621-3622, 4484-4511, 4588-4648
1180-1302: NameOp::Name migration to LoadNameAny/StoreName/DeleteName is coherentRouting
NameOp::NamethroughLoadNameAny/StoreName/DeleteNameis consistent with how indices are allocated:
name()andget_global_name_index()both index intoCodeInfo.metadata.names, which is exactly what these opcodes consume.- Module/class bodies (
compile_program,compile_program_single,compile_class_body) and class metadata setup (__module__,__qualname__,__doc__,__firstlineno__,__type_params__,__classcell__) all usename()andStoreName, so they’re aligned with the bytecode enum.As long as the VM side now treats
LOAD_NAME_ANY/STORE_NAME/DELETE_NAMEas the generic name path (mirroring the old local/name opcodes), behavior should be unchanged.Also applies to: 2959-3042
1501-1518: RaiseVarargs and exception-matching opcodes integrate cleanly in try/except and assert pathsThe switch from
RaisetoRaiseVarargsand fromJumpIfNotExcMatchtoCheckExcMatchis wired consistently:
Stmt::Raise:
RaiseKind::Raiseis emitted after compiling one expression (stack_effect-1).RaiseKind::RaiseCauseis emitted after compiling two expressions (stack_effect-2).- Bare
raiseusesRaiseKind::Reraise(stack_effect0), matching the absence of new stack values.In
assertlowering, the explicitRaiseVarargs { kind: Raise }matches compiling exactly one exception instance.Classic
try/except:
- For typed handlers, the
CopyItem+exc_type+CheckExcMatch(next_handler)pattern creates[exc, exc, type]so thatCheckExcMatchcan consume the duplicate and type and still leave the originalexcfor the bound name orPopTop.- Unhandled cases jump to the final
next_handlerblock, whereRaiseVarargs { kind: Reraise }is emitted, so unmatched exceptions are re‑raised.
try/except*:
- Uses
CheckEgMatchwith the documented[orig, list, rest, type] -> [orig, list, new_rest, match]shape.SetExcInfois used before running the handler body so a bareraisepicks up the right match.- The “no more to reraise” path does
PopTop(drop theNoneresult) andPopExcept, while the re‑raise path deliberately omitsPopExceptand just executesRaiseVarargs { kind: Raise }, leaving block unwinding to the VM.These all line up with the stack_effect definitions in
bytecode.rs; I don’t see a mismatch.Also applies to: 1586-1591, 2053-2147, 2178-2420
2923-2957: SetupAnnotations usage remains consistent with annotation detectionThe rename to
Instruction::SetupAnnotationshas been plumbed through all relevant sites:
- Module and REPL compilation (
compile_program,compile_program_single) callfind_annand emitSetupAnnotationsonly when needed.- Class bodies (
compile_class_body) also callfind_annand emitSetupAnnotationsonce at the top when annotations are present.Since
compile_annotated_assignis already gated on!ctx.in_func()for runtime annotation storage, this preserves the prior behavior with the new opcode name.Also applies to: 1034-1053, 1063-1065, 3019-3022
3821-4022: CompareOp/ComparisonOperator wiring for patterns and comparisons looks correctThe migration to
ComparisonOperator+Instruction::CompareOpis coherent:
- Mapping in
compile_addcompare:
CmpOp::Eq/NotEq/Lt/LtE/Gt/GtEmap toEqual,NotEqual,Less,LessOrEqual,Greater,GreaterOrEqualrespectively.- Structural pattern matching:
- Mapping patterns use
CompareOp { op: GreaterOrEqual }for thelen(subject) >= sizeandlen(subject) >= size-1checks.- Sequence patterns use
CompareOp { op: Equal }forlen(subject) == sizeandGreaterOrEqualwhere a*restis present.- Value patterns use
CompareOp { op: Equal }forsubject == value.- Other relational/inclusion ops (
In/NotIn,Is/IsNot) delegate toContainsOpandIsOpwith the rightInvertflag.Given
CompareOphas stack effect-1and the surrounding code always leaves the subject underneath the comparison, these changes look correct.Also applies to: 4132-4209, 4211-4223, 4361-4376
4484-4517: Annotation storage and augmented assignment now use StoreSubscr consistentlyTwo key sites now rely on
StoreSubscrand are consistent with its stack contract:
compile_annotated_assign:
- For simple
name: Toutside functions, aftercompile_annotationthe stack is[annotation], thenLoadNameAny(__annotations__), then the mangled name string, yielding[annotation, __annotations__, name]beforeStoreSubscr, which matches the[value, container, index]convention.
compile_augassign:
- The subscript case builds
[container, slice], duplicates both to get a copy for loading, usesSubscriptto obtain the original value, then performs the inplace binary op with the RHS and finally reorders to[result, container, slice]just beforeStoreSubscr.Both cases align with the
StoreSubscrstack_effect and with how simple subscript stores are compiled viacompile_subscript.Also applies to: 4588-4648
5443-5630: Async for/comprehensions correctly use EndAsyncFor under the new opcode layoutIn both explicit
async forloops and comprehensions:
compile_foremitsEndAsyncForon the async path after theelseblock, matchingEndAsyncFor => -2(it cleans up the async iterator state).compile_comprehensionemitsEndAsyncForwhen any generator is async, and usesGetAIter/GetANext+Resume { AfterAwait }, consistent with the updated async for bytecodes.Given the updated numeric assignment for
EndAsyncForand the unchanged stack_effect, the codegen here still looks correct.Also applies to: 3331-3377
|
#6638 will change instruction a lot. Can this be aligned to that patch? |
153a03a to
5369dab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/compiler-core/src/bytecode.rs (1)
1063-1074: Verify transmute safety with non-contiguous discriminants.The
TryFrom<u8>implementation usesunsafe { core::mem::transmute::<u8, Self>(value) }after checkingvalue <= u8::from(LAST_INSTRUCTION). With explicit discriminants that have gaps (e.g., no opcodes at 0, 3-4, 11, 13, etc.), transmuting these gap values could produce invalidInstructionenum variants, which is undefined behavior.Consider either:
- Filling all gaps with
Reservedvariants (e.g.,Reserved0,Reserved3,Reserved4, etc.)- Using an explicit match instead of transmute to validate each discriminant
🔎 Example of safer implementation using match
impl TryFrom<u8> for Instruction { type Error = MarshalError; #[inline] fn try_from(value: u8) -> Result<Self, MarshalError> { - if value <= u8::from(LAST_INSTRUCTION) { - Ok(unsafe { core::mem::transmute::<u8, Self>(value) }) - } else { - Err(MarshalError::InvalidBytecode) - } + // Safe alternative: explicit validation + match value { + 1 => Ok(Self::BeforeAsyncWith), + 2 => Ok(Self::BeforeWith), + 5 => Ok(Self::BinarySubscr), + // ... other valid opcodes + 149 => Ok(Self::Resume { arg: Arg::marker() }), + _ => Err(MarshalError::InvalidBytecode), + } } }Note: This requires #[repr(u8)] on the enum and careful maintenance, but is memory-safe.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
crates/compiler-core/src/bytecode.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/compiler-core/src/bytecode.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (2)
crates/compiler-core/src/bytecode.rs (2)
679-712: Coordinate with PR #6638 before merging.Based on the PR comments, PR #6638 will significantly change the instruction structure. Since this PR assigns explicit discriminants to the
Instructionenum, merging both could create conflicts or require substantial rework.Recommend coordinating with the author of PR #6638 to determine:
- Which PR should merge first
- Whether discriminant assignments should be deferred until after #6638
- How to handle potential conflicts in the instruction enum structure
679-712: Cache (opcode 0) and Reserved3 (opcode 3) variants appear to be referenced but not defined in the Instruction enum.The review comment correctly identifies that opcode 0 is missing from the visible enum definition (BeforeAsyncWith starts at 1). However, the code references
CacheandReserved3in match statements withinstack_effect()(line 1846) and the display logic (line 2084), but these variants cannot be located in the enum definition. If not properly defined, this would cause compilation errors.This discrepancy needs clarification:
- If Cache and Reserved3 are intentionally omitted per CPython 3.13 spec (which removed CACHE as a separate opcode), they should be removed from match statements.
- If they should be included, they must be added to the enum with appropriate discriminants (Cache = 0, Reserved3 = 3).
5369dab to
0701649
Compare
94ad7d5 to
c56c91f
Compare
| LoadClosure(Arg<NameIdx>) = 253, // CPython uses pseudo-op 258 | ||
| LoadMethod { | ||
| idx: Arg<NameIdx>, | ||
| } = 254, // CPython uses pseudo-op 259 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will actually put these ops on pseudo ops once we get them.
Planning to do it for next task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That, will be great:)
If you already have a local branch with the changes, can you please open a draft PR? I might have some inputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not yet. I will start one during weekend
crates/compiler-core/src/bytecode.rs
Outdated
| } else { | ||
| Err(MarshalError::InvalidBytecode) | ||
| match value { | ||
| 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not the opposite way? listing reserved may be easier. Reserved ops doesn't have args, so using names is also easy like Self::Reserved3 as u8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand. What is the purposed code change?
I created this match arm with a simple script
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to avoid using literal numbers in the code. That's not easy to read, and maintainers must not forget to run the 'simple script'. Using name prevent to run the script.
if (1..=Self::Resume as u8).contains(value) && ![Self::Reserved3 as u8, Self::Reserved17 as u8, Self::Reserved141 as u8...].contains(value) {
Ok(core::mem::transmute::<u8, Self>(value))
} else {
Err(InvalidBytecode)
}
This code will automatically error-catachable by breaking build if we change reserved141 to something other. (Though not working the inverse way)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we have "holes" in our IDs, this can lead to undefined behavior.
I would love to use num_enum crate, but it doesn't support enum variants that holds a value:/
I do think we should rename the Instruction enum to Opcode, and have an Instruction struct that holds:
struct Instruction {
opcode: Opcode,
oparg: Oparg // Oparg enum
}Like a typed CodeUnit.
But that's out of scope for this PR, and needs a further dissection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we had that form originally. The commits of that change or comments of the PR might have note that why we moved here.
cc @coolreader18
| | Continue { target: l } | ||
| | Send { target: l } | ||
| // Dummy jump instructions (placeholders) | ||
| | JumpBackward { target: l } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are going to implement these to replace our own old ones, I hesitate to remove it now to recover it back later.
We will not implement any other instruction in this slot anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My philosophy is that we can always add new code, but to have a placeholder code (that can be deleted) is just clutter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit to delete them? They are deletable, but no reason to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO they are just noise, but I can re-add them, that's not an issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that could be. At the same time, it is also like TODO
| def test_boundaries(self): | ||
| self.assertEqual(dis.opmap["EXTENDED_ARG"], dis.EXTENDED_ARG) | ||
|
|
||
| @unittest.expectedFailure # TODO: RUSTPYTHON; AssertionError: 29 not less than or equal to 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because BINARY_OP_INPLACE_ADD_UNICODE opcode was added, we are missing some truncation logic when formatting the dis output
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In @crates/codegen/src/compile.rs:
- Around line 1016-1025: The stack effect for PopExcept is incorrectly
documented as 0 but the VM's frame implementation (where PopExcept does let
prev_exc = self.pop_value()) consumes one value; update the stack_effect mapping
for the PopExcept opcode (the entry for PopExcept in your
stack_effect/stack_effects function or match table) from 0 to -1 so the
documented contract matches the VM behavior and downstream stack analysis.
- Around line 1881-1885: The generic-class keyword handling is wrong: instead of
emitting individual STRING constants for each kw name and increasing nargs with
keywords, build a single tuple constant of keyword name strings (use
ConstantData::Tuple) and emit that one tuple onto the stack, and emit the call
with nargs equal to the number of positional arguments only (do not add keywords
to nargs); mirror the pattern in compile_call_inner so the VM's
collect_keyword_args receives a PyTuple at the top of the stack and the
Call/CallKw ABI is satisfied.
In @crates/compiler-core/src/bytecode.rs:
- Around line 679-958: The VM currently ends bytecode dispatch with a catch-all
that calls unreachable!(), so placeholder enum variants in Bytecode (e.g.,
BinaryOpInplaceAddUnicode, BinarySlice, EndFor, LoadLocals, PushNull) can cause
a panic if accidentally executed; change the dispatch default arm to return a
descriptive runtime error (or Err) indicating an invalid/placeholder opcode
rather than panicking, and add an explicit match arm or helper check (e.g.,
is_placeholder_opcode) that logs or maps placeholder discriminants to that
error; additionally, add a validation step in bytecode loading or TryFrom<u8>
conversion for Bytecode to reject placeholder discriminants early (or emit
warnings) and document the retained placeholder variants in the Bytecode enum
comment to clarify they must never be emitted by codegen.
In @crates/vm/src/frame.rs:
- Around line 1782-1784: The match currently uses a catch-all arm with
unreachable!() for the Instruction enum which defeats compile-time
exhaustiveness checking; replace the wildcard arm by explicitly matching every
Instruction variant used in this match so the compiler will error if new
variants are added (i.e., remove the `_ => unreachable!()` arm in the match on
`instruction` and add explicit arms for each `Instruction::<Variant>` handled
here), or if the enum is intended to be extensible annotate the enum definition
with `#[non_exhaustive]` and then handle the `_` case with a documented
conversion/error path; ensure you update the match in the function/method in
frame.rs where `instruction` is matched so future additions to `Instruction`
cause a compile-time failure instead of a runtime panic.
In @scripts/generate_opcode_metadata.py:
- Around line 21-29: The cpython_name property contains multiple return
statements after the intended two-step transformation, making the last two
returns unreachable; edit the cpython_name method to keep the special-case for
"CopyItem" and the intended conversion (assign rust_name to name, apply the two
regex substitutions and return name.upper()), and remove the extraneous return
lines (the duplicate re.sub returns) so only the correct transformation and
single return remain in the cpython_name property.
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode.rs (1)
975-999: Document intentional gaps in opcode ranges.The
TryFrom<u8>implementation validates four separate ranges:
- CPython opcodes:
Cache(0) toYieldValue(118)- Resume: 149
- RustPython-only:
Break(119) toSubscript(135)- Pseudo opcodes:
Jump(252) toPopBlock(255)This leaves gaps (136-148, 150-251) that will return
InvalidBytecodeerrors. Consider:
- Adding a comment in the TryFrom implementation documenting these intentional gaps.
- Verifying that deserialization of legacy bytecode won't encounter these gap values.
📝 Add documentation for opcode gaps
#[inline] fn try_from(value: u8) -> Result<Self, MarshalError> { + // Valid opcode ranges: + // - 0..=118: CPython 3.13 compatible opcodes + // - 149: Resume instruction + // - 119..=135: RustPython-specific opcodes + // - 252..=255: Pseudo opcodes (not emitted in bytecode) + // Gaps (136-148, 150-251) are reserved for future use let cpython_start = u8::from(Self::Cache);
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
Lib/_opcode_metadata.pyis excluded by!Lib/**Lib/test/test__opcode.pyis excluded by!Lib/**Lib/test/test_dis.pyis excluded by!Lib/**
📒 Files selected for processing (7)
crates/codegen/src/compile.rscrates/compiler-core/src/bytecode.rscrates/jit/src/instructions.rscrates/jit/tests/common.rscrates/stdlib/src/opcode.rscrates/vm/src/frame.rsscripts/generate_opcode_metadata.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/stdlib/src/opcode.rscrates/jit/src/instructions.rscrates/jit/tests/common.rscrates/vm/src/frame.rscrates/codegen/src/compile.rscrates/compiler-core/src/bytecode.rs
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
scripts/generate_opcode_metadata.py
🧠 Learnings (1)
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When modifying bytecode instructions, perform a full clean build by running `rm -r target/debug/build/rustpython-* && find . | grep -E '\.pyc$' | xargs rm -r`
Applied to files:
scripts/generate_opcode_metadata.py
🧬 Code graph analysis (2)
crates/codegen/src/compile.rs (1)
crates/compiler-core/src/bytecode.rs (8)
from(394-396)from(437-439)from(964-967)from(1044-1046)from(1050-1052)try_from(974-999)try_from(1020-1025)try_from(1034-1040)
crates/compiler-core/src/bytecode.rs (3)
crates/vm/src/exceptions.rs (4)
value(438-438)value(602-602)w(1775-1775)w(1776-1776)crates/vm/src/frame.rs (3)
arg(2646-2647)name(1322-1322)name(2638-2638)crates/codegen/src/compile.rs (1)
name(1210-1212)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
🔇 Additional comments (16)
crates/compiler-core/src/bytecode.rs (2)
1763-1914: LGTM! Stack effect calculations properly updated.The stack effect calculations have been correctly updated to use the renamed instruction variants (
LoadName,StoreName,DeleteName,Call,CallKw,RaiseVarargs, etc.). The logic for each instruction's stack manipulation appears correct.
1982-2104: LGTM! Disassembly formatting correctly updated.The
fmt_dismethod has been properly updated to display the renamed instruction variants with their appropriate names (e.g.,BINARY_SUBSCR,DELETE_SUBSCR,LOAD_NAME,STORE_NAME,CALL,CALL_KW,RAISE_VARARGS, etc.).scripts/generate_opcode_metadata.py (2)
55-81: LGTM! Generation logic correctly parses and outputs opcode metadata.The script properly:
- Reads the Rust source file
- Extracts the Instruction enum body
- Parses opcodes with their discriminants
- Generates sorted output in CPython 3.13 compatible format
The dynamic parsing approach eliminates the need for manual updates when opcodes change.
32-37: Python version requirement is acceptable for this development script.This is a development-only script located in the
scripts/directory used to generate opcode metadata. The script already indicates compatibility with CPython 3.13 in its header comment, and the project'swhats_left.pyexplicitly requires Python 3.13+. Sinceitertools.batchedwas introduced in Python 3.12, its use is fully compatible with the project's stated Python version requirements.No action needed.
crates/jit/tests/common.rs (1)
99-105: LGTM! Test instruction handling correctly updated.The test stack machine has been properly updated to handle the renamed instruction variants:
LoadNameAny→LoadNameStoreLocal→StoreNameThe logic remains the same, only the variant names changed.
crates/jit/src/instructions.rs (2)
457-476: LGTM! JIT call instruction updated.The JIT compiler correctly handles the renamed
Callinstruction (previouslyCallFunctionPositional). The implementation logic remains unchanged.
492-545: LGTM! Comparison instruction renamed.The
CompareOpinstruction (previouslyCompareOperation) is correctly handled in the JIT with proper type conversions and condition code mappings for both integer and float comparisons.crates/stdlib/src/opcode.rs (1)
66-83: LGTM! Opcode name classification correctly updated.The
has_namefunction has been properly updated to recognize the renamed instruction variants:
DeleteLocal→DeleteNameLoadNameAny→LoadNameStoreLocal→StoreNameAll three renamed variants correctly appear in the pattern match for instructions that use
co_names.crates/vm/src/frame.rs (1)
394-394: LGTM! Instruction variant renames are consistent.The renamed instruction variants align well with standard Python bytecode naming conventions:
RaiseVarargs,BinarySubscr,CallKw,Call,CompareOp,DeleteName,DeleteSubscr,LoadName,PopExcept,SetupAnnotations,StoreName, andStoreSubscrare all clearer and more conventional names.The logic remains unchanged—these are purely API surface updates.
Also applies to: 613-613, 714-718, 756-756, 825-825, 838-838, 1195-1195, 1431-1431, 1460-1460, 1562-1562, 1624-1624, 1630-1630
crates/codegen/src/compile.rs (7)
1818-1819: The kind mapping correctly covers all three raise forms (bare raise, raise exc, raise exc from cause), and the encoding aligns with the RaiseKind enum and opcode metadata. No issues found.
5222-5227: Annotation assignment, augassign, and normal call paths updated to new ops—monitor for stack effect regressionsAll three locations (lines 5222–5227, 5363–5364, 6111–6115) correctly use the updated instructions (LoadName, StoreSubscr, Call, CallKw). Run the full test suite plus targeted cases (subscripting stores,
__annotations__in class/module, kwargs calls, exception handling) to verify stack effects before merge.
1345-1346: SetupAnnotations instruction verified: correctly gated and scopedSetupAnnotations is properly emitted in all three entry points (module, function, class) using consistent gating with
Self::find_ann(), which recursively detects annotated assignments. The runtime implementation correctly initializes the__annotations__dict only when needed. No silent breakage risk identified—scopes align with Python semantics and the instruction behavior is sound.
1597-1602: LoadName/StoreName/DeleteName scope semantics are correctLoadNameAny does not exist in the codebase. The current implementation correctly emits
LoadNameforNameOp::Name, which handles the proper LEGB lookup (Locals → Enclosing → Globals → Builtins): it first triesself.locals.mapping().subscript(name, vm), and on KeyError falls back toself.load_global_or_builtin(name, vm). This is the correct behavior for module and class scopes. StoreName and DeleteName similarly uselocals.mapping()for storage/deletion, which is appropriate for these contexts. No behavior regression exists.
4338-4339: Pattern-matching opcode renames (BinarySubscr / CompareOp / DeleteSubscr) are properly defined and handled
All three opcodes are correctly defined in the Instruction enum (BinarySubscr = 5, DeleteSubscr = 9, CompareOp with op field), with proper VM dispatch handlers in frame.rs, correct stack effect tracking, and consistent bytecode serialization. All usages in compile.rs align with the defined opcode metadata.
497-510: Subscript opcodes are properly defined with documented stack semanticsThe
StoreSubscr(opcode 39, -3 stack effect),DeleteSubscr(opcode 9, -2 stack effect), andBinarySubscr(opcode 5, -1 stack effect) instructions are well-defined incrates/compiler-core/src/bytecode.rswith clear stack effect documentation. The usage patterns incompile.rsare correct: for store operations, the container and index are on the stack followed by the value; for delete operations, the container and index are consumed. No stack order issues detected.
3564-3596: Verify LoadName semantics for.type_paramsaccess in PEP 695 generic class contextsWhile
.type_paramsis registered with theTYPE_PARAMflag (making it a bound symbol), the code usesLoadNameinstructions to access it rather thanLoadDeref. This appears to work because:
- In
builtins.rs(line 1040),.type_paramsis explicitly set in the class function's namespace dictionary before execution- The
compile_namefunction has special handling forTypeParamsscope that allows fallback to parent scope lookup- At runtime,
LoadNamesucceeds by finding.type_paramsin the namespace dictHowever, this relies on runtime namespace provisioning rather than the symbol scope mechanism. Confirm that:
.type_paramsis guaranteed to be in the namespace dict in all PEP 695 execution paths- No optimization or refactoring removes the
.type_paramsnamespace entry before the class code executes- The fallback logic in
compile_nameconsistently resolves.type_paramscorrectly across all TypeParams scope access patterns (lines 3589-3590, 3679-3680, 3690, and others)
| // We need to pop: exc, prev_exc (via PopExcept) | ||
| if preserve_tos { | ||
| emit!(self, Instruction::Swap { index: 2 }); | ||
| } | ||
| emit!(self, Instruction::PopTop); // exc | ||
| if preserve_tos { | ||
| emit!(self, Instruction::Swap { index: 2 }); | ||
| } | ||
| emit!(self, Instruction::PopException); // prev_exc is restored | ||
| emit!(self, Instruction::PopExcept); // prev_exc is restored | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find PopExcept / PushExcInfo / RaiseVarargs implementations + stack comments
rg -n --hidden --no-messages "PopExcept|PushExcInfo|RaiseVarargs|ReraiseFromStack" crates | head -n 200Repository: RustPython/RustPython
Length of output: 5787
🏁 Script executed:
#!/bin/bash
# Inspect Instruction definition in compiler-core bytecode
rg -n --hidden --no-messages "enum Instruction" crates/compiler-core/src/bytecode.rs | head -n 20Repository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
#!/bin/bash
# Look for bytecode.rs file to understand Instruction definition
find crates -name "bytecode.rs" -type fRepository: RustPython/RustPython
Length of output: 101
🏁 Script executed:
#!/bin/bash
# Read bytecode.rs around the Instruction enum and stack effect docs
sed -n '1835,1910p' crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 3439
🏁 Script executed:
#!/bin/bash
# Also read the Instruction enum definition
sed -n '677,750p' crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 1937
🏁 Script executed:
#!/bin/bash
# Check the PopExcept and PushExcInfo implementations in frame.rs
sed -n '1425,1470p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 2441
🏁 Script executed:
#!/bin/bash
# Check PushExcInfo implementation
sed -n '1500,1520p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 893
🏁 Script executed:
#!/bin/bash
# Read more context around the stack_effect function to understand the contract
sed -n '1830,1905p' crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 3463
🏁 Script executed:
#!/bin/bash
# Check if there are any test cases or comments explaining the expected behavior
rg -n "PopExcept" crates --context=3 -A 2 -B 2Repository: RustPython/RustPython
Length of output: 7469
🏁 Script executed:
#!/bin/bash
# Look at the compile.rs code around all mentioned line ranges to check for consistency
sed -n '1014,1030p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 877
🏁 Script executed:
#!/bin/bash
# Check other PopExcept usages in compile.rs
sed -n '1058,1070p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 582
🏁 Script executed:
#!/bin/bash
# Look at lines 2382-2400 mentioned in the review to check expected stack state
sed -n '2375,2410p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 1437
🏁 Script executed:
#!/bin/bash
# Check if there are any recent comments or issues about PopExcept stack effect
rg -n "stack.effect|stack_effect" crates/compiler-core/src/bytecode.rs | grep -A 2 -B 2 "PopExcept"Repository: RustPython/RustPython
Length of output: 47
PopExcept's documented stack effect is incorrect and must be fixed
The bytecode stack effect documentation states PopExcept => 0, (bytecode.rs:1897), but the VM implementation (frame.rs:1431) actually pops one value: let prev_exc = self.pop_value();. This is a -1 stack effect, not 0.
All usage sites in compile.rs correctly assume PopExcept has a -1 effect (e.g., lines 1016-1024 comment "prev_exc is restored"), and the comments match the actual behavior. However, the incorrect documented contract could cause issues with stack analysis tools or future verification passes that rely on the stack_effect function.
Fix the stack_effect documentation in bytecode.rs:1897 from PopExcept => 0, to PopExcept => -1, to match the actual VM implementation.
🤖 Prompt for AI Agents
In @crates/codegen/src/compile.rs around lines 1016 - 1025, The stack effect for
PopExcept is incorrectly documented as 0 but the VM's frame implementation
(where PopExcept does let prev_exc = self.pop_value()) consumes one value;
update the stack_effect mapping for the PopExcept opcode (the entry for
PopExcept in your stack_effect/stack_effects function or match table) from 0 to
-1 so the documented contract matches the VM behavior and downstream stack
analysis.
| emit!(self, Instruction::Call { nargs: 1 }); | ||
| } | ||
| None => { | ||
| emit!(self, Instruction::CallFunctionPositional { nargs: 0 }); | ||
| emit!(self, Instruction::Call { nargs: 0 }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the compile.rs file at the mentioned lines
head -c 1000000 crates/codegen/src/compile.rs | wc -lRepository: RustPython/RustPython
Length of output: 69
🏁 Script executed:
# Look at the code around the specified lines
sed -n '1875,1890p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 762
🏁 Script executed:
# Find all occurrences of CallKw and Call instructions in the codebase
rg -n "CallKw|Instruction::Call\b" crates/codegen/src/compile.rs | head -50Repository: RustPython/RustPython
Length of output: 1029
🏁 Script executed:
# Find the compile_call_inner function
rg -n "fn compile_call_inner" cratesRepository: RustPython/RustPython
Length of output: 126
🏁 Script executed:
# Search for CALL and CALL_KW instruction implementations in the VM
rg -n "CALL_KW|CALL\b" crates --type rust | grep -i "opcode\|instruction\|match" | head -40Repository: RustPython/RustPython
Length of output: 158
🏁 Script executed:
# Read the compile_call_inner function (around line 6127)
sed -n '6120,6200p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 3137
🏁 Script executed:
# Look for CallKw enum variant definition
rg -n "CallKw" crates/codegen/src/compile.rs -B2 -A2Repository: RustPython/RustPython
Length of output: 583
🏁 Script executed:
# Find instruction definitions - look for CallKw struct/variant
fd "bytecode.rs\|instruction" crates --type f | head -10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Search for where Call/CallKw instructions are defined
rg -n "pub enum Instruction|struct Call|struct CallKw" crates --type rust | head -20Repository: RustPython/RustPython
Length of output: 249
🏁 Script executed:
# Read the Instruction enum definition in bytecode.rs around line 677
sed -n '677,750p' crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 1937
🏁 Script executed:
# Search for Call and CallKw variants in the enum
rg -n "Call\s*\{|CallKw\s*\{" crates/compiler-core/src/bytecode.rs -A2Repository: RustPython/RustPython
Length of output: 1373
🏁 Script executed:
# Find VM implementation of CALL and CALL_KW instructions
rg -n "Instruction::Call\s*\{|Instruction::CallKw\s*\{" crates/vm/src --type rust -B2 -A5 | head -80Repository: RustPython/RustPython
Length of output: 927
🏁 Script executed:
# Look for the execute function that handles instructions in the VM
fd "interpreter.rs\|execute.rs" crates/vm/src --type fRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Find collect_keyword_args and collect_positional_args implementations
rg -n "fn collect_keyword_args|fn collect_positional_args" crates/vm/src -A10Repository: RustPython/RustPython
Length of output: 1214
🏁 Script executed:
# Look at the context around CallKw in frame.rs to understand the full call flow
sed -n '700,730p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 1505
🏁 Script executed:
# Search for stack effect calculations to understand the contract better
rg -n "CallKw.*stack|collect_keyword" crates --type rust | head -20Repository: RustPython/RustPython
Length of output: 351
🏁 Script executed:
# Look at the context around line 3733-3735 where CallKw is emitted differently
sed -n '3720,3760p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 1713
🏁 Script executed:
# Find what function this is in
sed -n '3700,3750p' crates/codegen/src/compile.rs | head -60Repository: RustPython/RustPython
Length of output: 2052
🏁 Script executed:
# Check if there are multiple compile functions handling different call contexts
rg -n "fn compile.*call" crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 304
🏁 Script executed:
# Look for the function containing the class definition code at line 3733
sed -n '3600,3650p' crates/codegen/src/compile.rs | head -60Repository: RustPython/RustPython
Length of output: 1745
🏁 Script executed:
# Search for tests of generic class definitions to verify this code path executes
rg -n "generic.*class|TypeParam" crates --type rust | grep test | head -20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Get the full function signature where line 3733 is located
sed -n '3600,3650p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 1745
🏁 Script executed:
# Look at the full context of the compile_class_def function, especially the CallKw usage
sed -n '3630,3780p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 5811
🏁 Script executed:
# Check if this is within an if statement checking for generics/type params
sed -n '3650,3770p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 4735
🏁 Script executed:
# Let's verify how the generic class CallKw differs from the normal call path
# by examining what should be on the stack
rg -n "emit_load_const.*Tuple|BuildTuple" crates/codegen/src/compile.rs | grep -A2 -B2 "keyword\|kw"Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check the implementation of compile_keywords to see the correct pattern
rg -n "fn compile_keywords" crates/codegen/src/compile.rs -A20Repository: RustPython/RustPython
Length of output: 1112
🏁 Script executed:
# Look at how non-generic class handles keywords via compile_call_inner
sed -n '6160,6175p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 663
🏁 Script executed:
# Verify the emit_load_const pattern used in the generic class case (line 3728)
sed -n '3720,3745p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 1017
🏁 Script executed:
# Let's trace through what stack state emerges from the generic class keyword code
# Look at the exact sequence of stack operations
sed -n '3721,3743p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 931
🏁 Script executed:
# Search for tests or examples of generic classes with keyword arguments
rg -n "class.*\[.*\].*:" tests --type py | head -10Repository: RustPython/RustPython
Length of output: 110
🏁 Script executed:
# Check if there are any Python test files with generic classes
fd "test.*\.py" tests --type f | xargs grep -l "class.*\[.*\]" | head -5Repository: RustPython/RustPython
Length of output: 157
🏁 Script executed:
# Find test directories
find . -type d -name "*test*" | head -10Repository: RustPython/RustPython
Length of output: 136
🏁 Script executed:
# Look for Python files with generics
rg -n "TypeVar|class.*\[" --type py | head -20Repository: RustPython/RustPython
Length of output: 1579
🏁 Script executed:
# Let's check if there are any tests that would exercise generic class definitions
rg -n "class.*Generic\|TypeParams" --type py | head -10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's verify the stack state at the generic class CallKw point
# by checking if the pattern matches what compile_keywords produces
sed -n '6059,6085p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 1154
🏁 Script executed:
# Let's verify the exact stack layout difference
# For the generic class case, trace what's on stack before CallKw
# Line 3721: first load function
# Line 3724: load class name string
# Line 3727-3730: load bases
# Line 3732: load generic_base
# Then the keyword loop at 3734-3742
# Now let's check what compile_keywords produces vs what generic class produces
sed -n '6059,6085p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 1154
🏁 Script executed:
# Check the exact difference: does compile_keywords build a map or tuple?
sed -n '6079,6088p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 304
🏁 Script executed:
# Look at CallMethodKeyword to see if there's another CallKw pattern
rg -n "CallMethodKeyword" crates/codegen/src/compile.rs -B3 -A3Repository: RustPython/RustPython
Length of output: 446
🏁 Script executed:
# Check the stack effect difference between the two patterns
# Look at line 6165 in compile_call_inner
sed -n '6163,6167p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 258
🏁 Script executed:
# Let me verify the critical difference in stack patterns
# Generic class (line 3734-3742): for each keyword: emit STRING, then compile_expression for value
# compile_call_inner (line 6161-6167): builds a TUPLE of all strings, then compiles expressions
# Let's confirm this difference by looking at what's built
echo "=== Generic class pattern ==="
sed -n '3734,3742p' crates/codegen/src/compile.rs
echo ""
echo "=== compile_call_inner pattern ==="
sed -n '6161,6167p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 744
🏁 Script executed:
# Verify if compile_keywords (used by compile_call_inner) creates a tuple
# and compile_keywords is called from where
rg -n "compile_keywords" crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 215
🏁 Script executed:
# The key question: does the generic class CallKw ever get executed?
# Look for any tests or examples that would use generic classes with kwargs
rg -n "class.*\[.*\].*:" extra_tests --type py | head -20Repository: RustPython/RustPython
Length of output: 176
🏁 Script executed:
# Let me trace the exact stack layout for the generic class CallKw case
# At line 3734-3742, what's the exact sequence?
echo "=== Full generic class keyword handling (lines 3721-3743) ==="
sed -n '3721,3743p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 993
🏁 Script executed:
# Now compare with the correct compile_call_inner pattern (lines 6161-6168)
echo "=== Correct compile_call_inner CallKw pattern (lines 6161-6168) ==="
sed -n '6161,6168p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 452
🏁 Script executed:
# Let's verify the critical difference:
# In generic class: individual STRING constants are emitted
# In compile_call_inner: a TUPLE of strings is built and emitted as a single constant
# Check the exact bytecode produced
echo "=== What does emit_load_const produce for each? ==="
rg -n "fn emit_load_const" crates/codegen/src/compile.rs -A5Repository: RustPython/RustPython
Length of output: 387
🏁 Script executed:
# Let me understand the exact stack layout difference
# For generic class, we need to see the full picture
echo "=== Stack analysis ==="
echo "Generic class with kwargs: emits individual STRINGS in loop, then CallKw"
echo "compile_call_inner with kwargs: builds TUPLE of strings, emits LoadConst with tuple, then CallKw"
echo ""
# Let's look at what the VM actually does with CallKw
sed -n '2072,2090p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 986
🏁 Script executed:
# The key insight: generic class emits strings individually in a loop
# But collect_keyword_args pops a TUPLE first!
# This is a stack mismatch!
# Let's verify by checking if generic class CallKw is even reachable
# Search for any test that would hit the is_generic && !arguments.keywords.is_empty() path
rg -n "TypeParam\|generic" extra_tests --type py | head -20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Search for Python 3.12+ generic class tests
fd "*.py" extra_tests | xargs grep -l "class.*\[.*\]:" 2>/dev/null | head -10Repository: RustPython/RustPython
Length of output: 348
🏁 Script executed:
# Final confirmation: check if there's any comment or note about this being a known issue
rg -n "TODO\|FIXME\|BUG\|HACK" crates/codegen/src/compile.rs | grep -A2 -B2 "373[0-5]\|CallKw"Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Verify the nargs calculation in the generic class case one more time
echo "=== Generic class nargs calculation ==="
sed -n '3739,3746p' crates/codegen/src/compile.rs
echo ""
echo "=== compile_call_inner nargs calculation ==="
sed -n '6170,6172p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 452
🏁 Script executed:
# Let me get the exact nargs value at line 3733-3735
sed -n '3720,3747p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 1130
🏁 Script executed:
# Understand what nargs contains before the keyword handling
sed -n '3718,3735p' crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 774
Critical: CallKw nargs ABI violation and stack layout mismatch in generic class handler
The generic class keyword argument handling at lines 3721–3747 contains two critical bugs that contradict the compile_call_inner pattern (lines 6161–6168):
-
Stack Layout Mismatch: Generic class emits individual STRING constants in a loop for each keyword name, resulting in stack layout
[string_1, value_1, string_2, value_2, ...]. The VM'scollect_keyword_args(frame.rs:2072–2090) expects a singlePyTupleat the top of the stack, and will panic when attempting to downcast the final keyword value (not a tuple) toPyTuple. -
nargs ABI Violation: Generic class calculates
nargs: nargs + arguments.keywords.len(), including keyword count in nargs. This contradicts the VM contract where nargs represents positional argument count only. The correct pattern incompile_call_inneremits a tuple constant for keyword names and passes only positional count as nargs.
The non-generic class path correctly uses compile_call_inner (line 3777), which follows the proper ABI. Fix the generic class code to build a tuple of keyword name strings (via ConstantData::Tuple) and pass only the positional argument count as nargs, matching the compile_call_inner implementation.
🤖 Prompt for AI Agents
In @crates/codegen/src/compile.rs around lines 1881 - 1885, The generic-class
keyword handling is wrong: instead of emitting individual STRING constants for
each kw name and increasing nargs with keywords, build a single tuple constant
of keyword name strings (use ConstantData::Tuple) and emit that one tuple onto
the stack, and emit the call with nargs equal to the number of positional
arguments only (do not add keywords to nargs); mirror the pattern in
compile_call_inner so the VM's collect_keyword_args receives a PyTuple at the
top of the stack and the Call/CallKw ABI is satisfied.
| Cache = 0, // Placeholder | ||
| BeforeAsyncWith = 1, | ||
| BeforeWith = 2, | ||
| BinaryOpInplaceAddUnicode = 3, // Placeholder | ||
| BinarySlice = 4, // Placeholder | ||
| BinarySubscr = 5, | ||
| CheckEgMatch = 6, | ||
| CheckExcMatch = 7, | ||
| CleanupThrow = 8, | ||
| DeleteSubscr = 9, | ||
| EndAsyncFor = 10, | ||
| EndFor = 11, // Placeholder | ||
| EndSend = 12, | ||
| ExitInitCheck = 13, // Placeholder | ||
| FormatSimple = 14, | ||
| FormatWithSpec = 15, | ||
| GetAIter = 16, | ||
| Reserved = 17, | ||
| GetANext = 18, | ||
| GetIter = 19, | ||
| GetLen = 20, | ||
| GetYieldFromIter = 21, | ||
| InterpreterExit = 22, // Placeholder | ||
| LoadAssertionError = 23, // Placeholder | ||
| LoadBuildClass = 24, | ||
| LoadLocals = 25, // Placeholder | ||
| MakeFunction = 26, | ||
| MatchKeys = 27, | ||
| MatchMapping = 28, | ||
| MatchSequence = 29, | ||
| Nop = 30, | ||
| PopExcept = 31, | ||
| PopTop = 32, | ||
| PushExcInfo = 33, | ||
| PushNull = 34, // Placeholder | ||
| ReturnGenerator = 35, // Placeholder | ||
| ReturnValue = 36, | ||
| SetupAnnotations = 37, | ||
| StoreSlice = 38, // Placeholder | ||
| StoreSubscr = 39, | ||
| ToBool = 40, | ||
| UnaryInvert = 41, | ||
| UnaryNegative = 42, | ||
| UnaryNot = 43, | ||
| WithExceptStart = 44, | ||
| // ==================== With-argument instructions (opcode > 44) ==================== | ||
| BinaryOp { | ||
| op: Arg<BinaryOperator>, | ||
| }, | ||
| // 46: BUILD_CONST_KEY_MAP - not implemented, placeholder | ||
| } = 45, | ||
| BuildConstKeyMap { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 47: BUILD_LIST | ||
| } = 46, // Placeholder | ||
| BuildList { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 48: BUILD_MAP | ||
| } = 47, | ||
| BuildMap { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 49: BUILD_SET | ||
| } = 48, | ||
| BuildSet { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 50: BUILD_SLICE | ||
| } = 49, | ||
| BuildSlice { | ||
| argc: Arg<BuildSliceArgCount>, | ||
| }, | ||
| // 51: BUILD_STRING | ||
| } = 50, | ||
| BuildString { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 52: BUILD_TUPLE | ||
| } = 51, | ||
| BuildTuple { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 53: CALL | ||
| CallFunctionPositional { | ||
| } = 52, | ||
| Call { | ||
| nargs: Arg<u32>, | ||
| }, | ||
| // 54: CALL_FUNCTION_EX | ||
| } = 53, | ||
| CallFunctionEx { | ||
| has_kwargs: Arg<bool>, | ||
| }, | ||
| // 55: CALL_INTRINSIC_1 | ||
| } = 54, | ||
| CallIntrinsic1 { | ||
| func: Arg<IntrinsicFunction1>, | ||
| }, | ||
| // 56: CALL_INTRINSIC_2 | ||
| } = 55, | ||
| CallIntrinsic2 { | ||
| func: Arg<IntrinsicFunction2>, | ||
| }, | ||
| // 57: CALL_KW | ||
| CallFunctionKeyword { | ||
| } = 56, | ||
| CallKw { | ||
| nargs: Arg<u32>, | ||
| }, | ||
| // 58: COMPARE_OP | ||
| CompareOperation { | ||
| } = 57, | ||
| CompareOp { | ||
| op: Arg<ComparisonOperator>, | ||
| }, | ||
| // 59: CONTAINS_OP | ||
| ContainsOp(Arg<Invert>), | ||
| // 60: CONVERT_VALUE | ||
| } = 58, | ||
| ContainsOp(Arg<Invert>) = 59, | ||
| ConvertValue { | ||
| oparg: Arg<ConvertValueOparg>, | ||
| }, | ||
| // 61: COPY | ||
| } = 60, | ||
| CopyItem { | ||
| index: Arg<u32>, | ||
| }, | ||
| // 62: COPY_FREE_VARS - not implemented, placeholder | ||
| } = 61, | ||
| CopyFreeVars { | ||
| count: Arg<u32>, | ||
| }, | ||
| // 63: DELETE_ATTR | ||
| } = 62, // Placeholder | ||
| DeleteAttr { | ||
| idx: Arg<NameIdx>, | ||
| }, | ||
| // 64: DELETE_DEREF | ||
| DeleteDeref(Arg<NameIdx>), | ||
| // 65: DELETE_FAST | ||
| DeleteFast(Arg<NameIdx>), | ||
| // 66: DELETE_GLOBAL | ||
| DeleteGlobal(Arg<NameIdx>), | ||
| // 67: DELETE_NAME | ||
| DeleteLocal(Arg<NameIdx>), | ||
| // 68: DICT_MERGE - not implemented, placeholder | ||
| } = 63, | ||
| DeleteDeref(Arg<NameIdx>) = 64, | ||
| DeleteFast(Arg<NameIdx>) = 65, | ||
| DeleteGlobal(Arg<NameIdx>) = 66, | ||
| DeleteName(Arg<NameIdx>) = 67, | ||
| DictMerge { | ||
| index: Arg<u32>, | ||
| }, | ||
| // 69: DICT_UPDATE | ||
| } = 68, // Placeholder | ||
| DictUpdate { | ||
| index: Arg<u32>, | ||
| }, | ||
| // 70: ENTER_EXECUTOR - not implemented, placeholder | ||
| EnterExecutor { | ||
| index: Arg<u32>, | ||
| }, | ||
| // 71: EXTENDED_ARG | ||
| ExtendedArg, | ||
| // 72: FOR_ITER | ||
| } = 69, | ||
| EnterExecutor = 70, // Placeholder | ||
| ExtendedArg = 71, | ||
| ForIter { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 73: GET_AWAITABLE | ||
| GetAwaitable, | ||
| // 74: IMPORT_FROM | ||
| } = 72, | ||
| GetAwaitable = 73, // TODO: Make this instruction to hold an oparg | ||
| ImportFrom { | ||
| idx: Arg<NameIdx>, | ||
| }, | ||
| // 75: IMPORT_NAME | ||
| } = 74, | ||
| ImportName { | ||
| idx: Arg<NameIdx>, | ||
| }, | ||
| // 76: IS_OP | ||
| IsOp(Arg<Invert>), | ||
| // 77: JUMP_BACKWARD - not implemented, placeholder | ||
| } = 75, | ||
| IsOp(Arg<Invert>) = 76, | ||
| JumpBackward { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 78: JUMP_BACKWARD_NO_INTERRUPT - not implemented, placeholder | ||
| } = 77, // Placeholder | ||
| JumpBackwardNoInterrupt { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 79: JUMP_FORWARD - not implemented, placeholder | ||
| } = 78, // Placeholder | ||
| JumpForward { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 80: LIST_APPEND | ||
| } = 79, // Placeholder | ||
| ListAppend { | ||
| i: Arg<u32>, | ||
| }, | ||
| // 81: LIST_EXTEND - not implemented, placeholder | ||
| } = 80, | ||
| ListExtend { | ||
| i: Arg<u32>, | ||
| }, | ||
| // 82: LOAD_ATTR | ||
| } = 81, // Placeholder | ||
| LoadAttr { | ||
| idx: Arg<NameIdx>, | ||
| }, | ||
| // 83: LOAD_CONST | ||
| } = 82, | ||
| LoadConst { | ||
| idx: Arg<u32>, | ||
| }, | ||
| // 84: LOAD_DEREF | ||
| LoadDeref(Arg<NameIdx>), | ||
| // 85: LOAD_FAST | ||
| LoadFast(Arg<NameIdx>), | ||
| // 86: LOAD_FAST_AND_CLEAR | ||
| LoadFastAndClear(Arg<NameIdx>), | ||
| // 87: LOAD_FAST_CHECK - not implemented, placeholder | ||
| LoadFastCheck(Arg<NameIdx>), | ||
| // 88: LOAD_FAST_LOAD_FAST - not implemented, placeholder | ||
| } = 83, | ||
| LoadDeref(Arg<NameIdx>) = 84, | ||
| LoadFast(Arg<NameIdx>) = 85, | ||
| LoadFastAndClear(Arg<NameIdx>) = 86, | ||
| LoadFastCheck(Arg<NameIdx>) = 87, // Placeholder | ||
| LoadFastLoadFast { | ||
| arg: Arg<u32>, | ||
| }, | ||
| // 89: LOAD_FROM_DICT_OR_DEREF - not implemented, placeholder | ||
| LoadFromDictOrDeref(Arg<NameIdx>), | ||
| // 90: LOAD_FROM_DICT_OR_GLOBALS - not implemented, placeholder | ||
| LoadFromDictOrGlobals(Arg<NameIdx>), | ||
| // 91: LOAD_GLOBAL | ||
| LoadGlobal(Arg<NameIdx>), | ||
| // 92: LOAD_NAME | ||
| LoadNameAny(Arg<NameIdx>), | ||
| // 93: LOAD_SUPER_ATTR - not implemented, placeholder | ||
| } = 88, // Placeholder | ||
| LoadFromDictOrDeref(Arg<NameIdx>) = 89, // Placeholder | ||
| LoadFromDictOrGlobals(Arg<NameIdx>) = 90, // Placeholder | ||
| LoadGlobal(Arg<NameIdx>) = 91, | ||
| LoadName(Arg<NameIdx>) = 92, | ||
| LoadSuperAttr { | ||
| arg: Arg<u32>, | ||
| }, | ||
| // 94: MAKE_CELL - not implemented, placeholder | ||
| MakeCell(Arg<NameIdx>), | ||
| // 95: MAP_ADD | ||
| } = 93, // Placeholder | ||
| MakeCell(Arg<NameIdx>) = 94, // Placeholder | ||
| MapAdd { | ||
| i: Arg<u32>, | ||
| }, | ||
| // 96: MATCH_CLASS | ||
| MatchClass(Arg<u32>), | ||
| // 97: POP_JUMP_IF_FALSE | ||
| } = 95, | ||
| MatchClass(Arg<u32>) = 96, | ||
| PopJumpIfFalse { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 98: POP_JUMP_IF_NONE - not implemented, placeholder | ||
| } = 97, | ||
| PopJumpIfNone { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 99: POP_JUMP_IF_NOT_NONE - not implemented, placeholder | ||
| } = 98, // Placeholder | ||
| PopJumpIfNotNone { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 100: POP_JUMP_IF_TRUE | ||
| } = 99, // Placeholder | ||
| PopJumpIfTrue { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 101: RAISE_VARARGS | ||
| Raise { | ||
| } = 100, | ||
| RaiseVarargs { | ||
| kind: Arg<RaiseKind>, | ||
| }, | ||
| // 102: RERAISE | ||
| } = 101, | ||
| Reraise { | ||
| depth: Arg<u32>, | ||
| }, | ||
| // 103: RETURN_CONST | ||
| } = 102, | ||
| ReturnConst { | ||
| idx: Arg<u32>, | ||
| }, | ||
| // 104: SEND | ||
| } = 103, | ||
| Send { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 105: SET_ADD | ||
| } = 104, | ||
| SetAdd { | ||
| i: Arg<u32>, | ||
| }, | ||
| // 106: SET_FUNCTION_ATTRIBUTE | ||
| } = 105, | ||
| SetFunctionAttribute { | ||
| attr: Arg<MakeFunctionFlags>, | ||
| }, | ||
| // 107: SET_UPDATE - not implemented, placeholder | ||
| } = 106, | ||
| SetUpdate { | ||
| i: Arg<u32>, | ||
| }, | ||
| // 108: STORE_ATTR | ||
| } = 107, // Placeholder | ||
| StoreAttr { | ||
| idx: Arg<NameIdx>, | ||
| }, | ||
| // 109: STORE_DEREF | ||
| StoreDeref(Arg<NameIdx>), | ||
| // 110: STORE_FAST | ||
| StoreFast(Arg<NameIdx>), | ||
| // 111: STORE_FAST_LOAD_FAST | ||
| } = 108, | ||
| StoreDeref(Arg<NameIdx>) = 109, | ||
| StoreFast(Arg<NameIdx>) = 110, | ||
| StoreFastLoadFast { | ||
| store_idx: Arg<NameIdx>, | ||
| load_idx: Arg<NameIdx>, | ||
| }, | ||
| // 112: STORE_FAST_STORE_FAST - not implemented, placeholder | ||
| } = 111, | ||
| StoreFastStoreFast { | ||
| arg: Arg<u32>, | ||
| }, | ||
| // 113: STORE_GLOBAL | ||
| StoreGlobal(Arg<NameIdx>), | ||
| // 114: STORE_NAME | ||
| StoreLocal(Arg<NameIdx>), | ||
| // 115: SWAP | ||
| } = 112, // Placeholder | ||
| StoreGlobal(Arg<NameIdx>) = 113, | ||
| StoreName(Arg<NameIdx>) = 114, | ||
| Swap { | ||
| index: Arg<u32>, | ||
| }, | ||
| // 116: UNPACK_EX | ||
| } = 115, | ||
| UnpackEx { | ||
| args: Arg<UnpackExArgs>, | ||
| }, | ||
| // 117: UNPACK_SEQUENCE | ||
| } = 116, | ||
| UnpackSequence { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 118: YIELD_VALUE | ||
| } = 117, | ||
| YieldValue { | ||
| arg: Arg<u32>, | ||
| }, | ||
| // ==================== RustPython-only instructions (119+) ==================== | ||
| // 119: BREAK | ||
| } = 118, | ||
| Resume { | ||
| arg: Arg<u32>, | ||
| } = 149, | ||
| // ==================== RustPython-only instructions (119-135) ==================== | ||
| // Ideally, we want to be fully aligned with CPython opcodes, but we still have some leftovers. | ||
| // So we assign random IDs to these opcodes. | ||
| Break { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 120: BUILD_LIST_FROM_TUPLES | ||
| } = 119, | ||
| BuildListFromTuples { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 121: BUILD_MAP_FOR_CALL | ||
| } = 120, | ||
| BuildMapForCall { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 122: BUILD_SET_FROM_TUPLES | ||
| } = 121, | ||
| BuildSetFromTuples { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 123: BUILD_TUPLE_FROM_ITER | ||
| BuildTupleFromIter, | ||
| // 124: BUILD_TUPLE_FROM_TUPLES | ||
| } = 122, | ||
| BuildTupleFromIter = 123, | ||
| BuildTupleFromTuples { | ||
| size: Arg<u32>, | ||
| }, | ||
| // 125: CALL_METHOD | ||
| } = 124, | ||
| CallMethodPositional { | ||
| nargs: Arg<u32>, | ||
| }, | ||
| // 126: CALL_METHOD_KW | ||
| } = 125, | ||
| CallMethodKeyword { | ||
| nargs: Arg<u32>, | ||
| }, | ||
| // 127: CALL_METHOD_EX | ||
| } = 126, | ||
| CallMethodEx { | ||
| has_kwargs: Arg<bool>, | ||
| }, | ||
| // 128: CONTINUE | ||
| } = 127, | ||
| Continue { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 129: JUMP (CPython uses pseudo-op 256) | ||
| Jump { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 130: JUMP_IF_FALSE_OR_POP | ||
| } = 128, | ||
| JumpIfFalseOrPop { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 131: JUMP_IF_TRUE_OR_POP | ||
| } = 129, | ||
| JumpIfTrueOrPop { | ||
| target: Arg<Label>, | ||
| }, | ||
| // 132: JUMP_IF_NOT_EXC_MATCH | ||
| JumpIfNotExcMatch(Arg<Label>), | ||
| // 133: LOAD_CLASSDEREF | ||
| LoadClassDeref(Arg<NameIdx>), | ||
| // 134: LOAD_CLOSURE (CPython uses pseudo-op 258) | ||
| LoadClosure(Arg<NameIdx>), | ||
| // 135: LOAD_METHOD (CPython uses pseudo-op 259) | ||
| LoadMethod { | ||
| idx: Arg<NameIdx>, | ||
| }, | ||
| // 136: POP_BLOCK (CPython uses pseudo-op 263) | ||
| PopBlock, | ||
| // 137: REVERSE | ||
| } = 130, | ||
| JumpIfNotExcMatch(Arg<Label>) = 131, | ||
| LoadClassDeref(Arg<NameIdx>) = 132, | ||
| Reverse { | ||
| amount: Arg<u32>, | ||
| }, | ||
| // 138: SET_EXC_INFO | ||
| SetExcInfo, | ||
| // 139: SUBSCRIPT | ||
| Subscript, | ||
| // 140-148: Reserved (padding to keep RESUME at 149) | ||
| Reserved140, | ||
| Reserved141, | ||
| Reserved142, | ||
| Reserved143, | ||
| Reserved144, | ||
| Reserved145, | ||
| Reserved146, | ||
| Reserved147, | ||
| Reserved148, | ||
| // 149: RESUME | ||
| Resume { | ||
| arg: Arg<u32>, | ||
| }, | ||
| // If you add a new instruction here, be sure to keep LAST_INSTRUCTION updated | ||
| } = 133, | ||
| SetExcInfo = 134, | ||
| Subscript = 135, | ||
| // ===== Pseudo Opcodes (252+) ====== | ||
| Jump { | ||
| target: Arg<Label>, | ||
| } = 252, // CPython uses pseudo-op 256 | ||
| LoadClosure(Arg<NameIdx>) = 253, // CPython uses pseudo-op 258 | ||
| LoadMethod { | ||
| idx: Arg<NameIdx>, | ||
| } = 254, // CPython uses pseudo-op 259 | ||
| PopBlock = 255, // CPython uses pseudo-op 263 | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check placeholder opcode handling in VM and JIT
echo "=== Checking VM frame dispatch for placeholder handling ==="
rg -n -C3 "BinarySlice|EndFor|LoadLocals|PushNull" crates/vm/src/frame.rs
echo "=== Checking JIT instruction handling ==="
rg -n -C3 "BinarySlice|EndFor|LoadLocals|PushNull" crates/jit/src/instructions.rs
echo "=== Checking compiler emission ==="
rg -n -C3 "BinarySlice|EndFor|LoadLocals|PushNull" crates/codegen/src/compile.rsRepository: RustPython/RustPython
Length of output: 201
🏁 Script executed:
# Also check if there's a TryFrom or match statement that handles these opcodes
echo "=== Checking for opcode dispatch mechanisms ==="
rg -n "impl.*TryFrom.*Instruction\|match.*opcode\|Instruction::" crates/compiler-core/src/bytecode.rs | head -20Repository: RustPython/RustPython
Length of output: 112
🏁 Script executed:
# Look for panic or unreachable patterns on instruction handling
echo "=== Checking for error handling on unsupported opcodes ==="
rg -n -C2 "panic\!|unreachable\!|unimplemented\!" crates/vm/src/frame.rs | grep -A2 -B2 "opcode\|instruction"Repository: RustPython/RustPython
Length of output: 249
🏁 Script executed:
# Find the actual structure
echo "=== Repository structure ==="
git ls-files | grep -E "(vm|jit|interpreter)" | head -20
echo "=== Looking for instruction-related files ==="
git ls-files | grep -E "instruction|opcode|dispatch|frame" | head -20
echo "=== Check if crates/vm/src/frame.rs exists ==="
ls -la crates/vm/src/ 2>/dev/null || echo "crates/vm/src not found"
echo "=== List all crates ==="
ls -la crates/ | grep "^d"Repository: RustPython/RustPython
Length of output: 5171
🏁 Script executed:
# Search for instruction enum usage
echo "=== Finding where Instruction enum is used ==="
rg -l "Instruction::" | head -10
echo "=== Search for match statements on Instruction ==="
rg -n "match.*instruction\|match.*opcode" -i | head -20Repository: RustPython/RustPython
Length of output: 363
🏁 Script executed:
# Check the bytecode.rs file itself for usage patterns
echo "=== Check bytecode.rs for impl blocks ==="
rg -n "^impl" crates/compiler-core/src/bytecode.rs
echo "=== Look for execute or dispatch methods ==="
rg -n "fn.*execute\|fn.*dispatch" crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 2145
🏁 Script executed:
# Check the TryFrom implementation for Instruction
echo "=== TryFrom<u8> for Instruction (around line 970) ==="
sed -n '970,1010p' crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 1293
🏁 Script executed:
# Check the Instruction impl block at line 1711
echo "=== Instruction impl block (around line 1711) ==="
sed -n '1711,1750p' crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 1360
🏁 Script executed:
# Look for instruction dispatch in frame.rs
echo "=== Looking for instruction dispatch in frame.rs ==="
rg -n "Instruction::" crates/vm/src/frame.rs | head -30Repository: RustPython/RustPython
Length of output: 2292
🏁 Script executed:
# Check what happens with unsupported instructions
echo "=== Looking for dispatch patterns in frame.rs ==="
sed -n '1,100p' crates/vm/src/frame.rs | head -50Repository: RustPython/RustPython
Length of output: 1863
🏁 Script executed:
# Find the execute_instruction method or similar
echo "=== Find main execution loop ==="
rg -n "fn.*execute\|fn.*dispatch\|match.*instruction" crates/vm/src/frame.rs | head -15Repository: RustPython/RustPython
Length of output: 97
🏁 Script executed:
# Look for the large match statement that handles all instructions
echo "=== Search for instruction execution pattern ==="
rg -n "bytecode::Instruction::" crates/vm/src/frame.rs | wc -l
# Find where the instruction execution happens
echo "=== Looking for instruction match arms ==="
sed -n '550,800p' crates/vm/src/frame.rs | grep -n "Instruction::"Repository: RustPython/RustPython
Length of output: 2474
🏁 Script executed:
# Get a larger view to see the dispatch pattern
echo "=== Finding the execute instruction method ==="
rg -n "fn execute\|fn apply\|match.*instruction" crates/vm/src/frame.rs -A5 | head -40Repository: RustPython/RustPython
Length of output: 111
🏁 Script executed:
# Search for all variants mentioned in the review as placeholders
echo "=== Check if placeholder instructions are handled ==="
rg -n "BinaryOpInplaceAddUnicode|BinarySlice|EndFor|LoadLocals|PushNull|BuildConstKeyMap" crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 118
🏁 Script executed:
# See the complete instruction match statement structure
wc -l crates/vm/src/frame.rs
sed -n '560,900p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 14993
🏁 Script executed:
# Let's see if the match statement has a catch-all or default case
echo "=== Looking for match statement structure and completeness ==="
sed -n '560,2700p' crates/vm/src/frame.rs | tail -200Repository: RustPython/RustPython
Length of output: 8160
🏁 Script executed:
# Search for the end of the match statement
echo "=== Find where the match statement ends ==="
rg -n "^ }" crates/vm/src/frame.rs | tail -5Repository: RustPython/RustPython
Length of output: 185
🏁 Script executed:
# Check if there's a wildcard or default pattern in the match
echo "=== Check for default/wildcard patterns in match ==="
rg -n "_ =>" crates/vm/src/frame.rs | grep -A2 -B2 "instruction\|opcode"Repository: RustPython/RustPython
Length of output: 117
🏁 Script executed:
# Get context around line 2651 to see if there's a wildcard pattern
echo "=== Context around match closing brace ==="
sed -n '2600,2660p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 2614
🏁 Script executed:
# Count all instruction variants in the enum to see if all are handled
echo "=== Count Instruction enum variants ==="
grep -E "^\s+\w+\s*(\{|=)" crates/compiler-core/src/bytecode.rs | grep -v "impl\|^--" | wc -l
echo "=== Count instruction match arms in frame.rs ==="
rg "bytecode::Instruction::" crates/vm/src/frame.rs | wc -lRepository: RustPython/RustPython
Length of output: 161
🏁 Script executed:
# Look for any unreachable or panic patterns related to instructions
echo "=== Search for instruction validation ==="
rg -n "is_valid\|panic\|unreachable\|match.*instruction" crates/compiler-core/src/bytecode.rs | head -20Repository: RustPython/RustPython
Length of output: 106
🏁 Script executed:
# Check if the match is marked as exhaustive
echo "=== Check the actual match structure ==="
sed -n '585,595p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 502
🏁 Script executed:
# This is critical - 437 variants but only 117 match arms
# Let's find the complete match statement in frame.rs
echo "=== Find the main instruction match statement ==="
sed -n '585,2500p' crates/vm/src/frame.rs | grep -n "^ }" | head -1Repository: RustPython/RustPython
Length of output: 129
🏁 Script executed:
# Let's see what's around line 585 more broadly to see if the match closes
echo "=== Checking match statement closure ==="
sed -n '1200,1250p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 2885
🏁 Script executed:
# Search for where the big instruction match ends
echo "=== Finding match arms count more accurately ==="
rg -n "bytecode::Instruction::" crates/vm/src/frame.rs | tail -20Repository: RustPython/RustPython
Length of output: 1398
🏁 Script executed:
# Check if there's any wildcard or default handling
echo "=== Check the last match arms ==="
sed -n '1100,1200p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 4826
🏁 Script executed:
# Look for any code that might catch unhandled instructions
echo "=== Search for error handling ==="
rg -n "unreachable\|panic\|todo\|unimplemented" crates/vm/src/frame.rs | head -20Repository: RustPython/RustPython
Length of output: 98
🏁 Script executed:
# Let's get the actual size of the match statement by looking more carefully
echo "=== Get the full match statement in execute_instruction ==="
sed -n '540,1800p' crates/vm/src/frame.rs | tail -50Repository: RustPython/RustPython
Length of output: 1974
🏁 Script executed:
# Look for where the match statement actually ends
echo "=== Search for the end of the instruction match ==="
rg -n "^ \}" crates/vm/src/frame.rs | head -10Repository: RustPython/RustPython
Length of output: 260
🏁 Script executed:
# Check if there's a wildcard pattern at the end
echo "=== Last 20 lines of instruction match arms ==="
sed -n '1770,1800p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 1252
🏁 Script executed:
# Let's verify by finding all unhandled instruction variants
echo "=== All Instruction variants ==="
sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep -E "^\s+\w+(\s|{)" | head -60Repository: RustPython/RustPython
Length of output: 138
🏁 Script executed:
# Check if compiler validation prevents emitting placeholder opcodes
echo "=== Search for placeholder validation in codegen ==="
rg -n "Placeholder\|BinarySlice\|EndFor" crates/codegen/src/compile.rs | head -20Repository: RustPython/RustPython
Length of output: 117
🏁 Script executed:
# Check all Instruction variants more carefully
echo "=== List instruction variants from bytecode.rs ==="
sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep "=" | grep -v "impl\|//" | head -70Repository: RustPython/RustPython
Length of output: 1544
🏁 Script executed:
# Check codegen to see if it emits placeholder opcodes
echo "=== Check if codegen emits these opcodes ==="
rg -n "emit\|Instruction::" crates/codegen/src/compile.rs | grep -E "Binary|EndFor|LoadLocals|PushNull" | head -20Repository: RustPython/RustPython
Length of output: 109
🏁 Script executed:
# Check if there's validation before bytecode generation
echo "=== Look for any bytecode validation ==="
rg -n "validate\|check.*instruction\|verify" crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 105
🏁 Script executed:
# Verify if the TryFrom implementation actually accepts placeholder discriminants
echo "=== Re-check the TryFrom ranges ==="
sed -n '970,1010p' crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 1275
🏁 Script executed:
# Check what the ranges actually are - look at the enum values
echo "=== Extract discriminant values for placeholders ==="
sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep -E "BinarySlice|EndFor|LoadLocals|PushNull|BinaryOpInplaceAddUnicode"Repository: RustPython/RustPython
Length of output: 326
🏁 Script executed:
# Confirm the actual enum discriminant values
echo "=== Get exact discriminant values from enum ==="
sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep -E "= [0-9]+" | head -80Repository: RustPython/RustPython
Length of output: 2024
🏁 Script executed:
# Check if YieldValue is indeed 118
echo "=== Find YieldValue discriminant ==="
rg -n "YieldValue.*=" crates/compiler-core/src/bytecode.rsRepository: RustPython/RustPython
Length of output: 203
🏁 Script executed:
# Find YieldValue in the enum definition
echo "=== Find YieldValue enum variant ==="
sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep -n "YieldValue"Repository: RustPython/RustPython
Length of output: 122
🏁 Script executed:
# Get context around line 118 to see what's there
echo "=== Get all discriminants with their values ==="
sed -n '679,960p' crates/compiler-core/src/bytecode.rs | tail -100Repository: RustPython/RustPython
Length of output: 2580
VM dispatch for placeholder opcodes relies on panic rather than graceful error handling.
Placeholder instruction variants (e.g., BinaryOpInplaceAddUnicode, BinarySlice, EndFor, LoadLocals, PushNull) are defined in the enum and pass TryFrom<u8> validation since their discriminants fall within accepted ranges (0–118, 149, 119–135, 252–255). However, the VM's instruction dispatch ends with a catch-all pattern that calls unreachable!(), which will panic if any unhandled instruction reaches execution.
While this prevents silent failures, it's fragile: if codegen or bytecode loading ever accidentally emits a placeholder opcode, the VM will crash instead of failing gracefully. Consider:
- Adding explicit error handling or a warning for placeholders in the catch-all pattern.
- Validating at compile time that codegen never emits these opcodes.
- Documenting why these opcodes exist in the enum if they're never used.
🤖 Prompt for AI Agents
In @crates/compiler-core/src/bytecode.rs around lines 679 - 958, The VM
currently ends bytecode dispatch with a catch-all that calls unreachable!(), so
placeholder enum variants in Bytecode (e.g., BinaryOpInplaceAddUnicode,
BinarySlice, EndFor, LoadLocals, PushNull) can cause a panic if accidentally
executed; change the dispatch default arm to return a descriptive runtime error
(or Err) indicating an invalid/placeholder opcode rather than panicking, and add
an explicit match arm or helper check (e.g., is_placeholder_opcode) that logs or
maps placeholder discriminants to that error; additionally, add a validation
step in bytecode loading or TryFrom<u8> conversion for Bytecode to reject
placeholder discriminants early (or emit warnings) and document the retained
placeholder variants in the Bytecode enum comment to clarify they must never be
emitted by codegen.
| _ => { | ||
| unreachable!("{instruction:?} instruction should not be executed") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's see the context around lines 1782-1784 in frame.rs
head -1790 crates/vm/src/frame.rs | tail -20Repository: RustPython/RustPython
Length of output: 732
🏁 Script executed:
# Find the Instruction enum definition
fd -t f "bytecode.rs" | head -5Repository: RustPython/RustPython
Length of output: 144
🏁 Script executed:
# Let's also check if there's an instruction definition in frame.rs itself
rg "enum Instruction" crates/vm/src/ -A 20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Find and read the Instruction enum definition
rg "pub enum Instruction" crates/compiler-core/src/bytecode.rs -A 100Repository: RustPython/RustPython
Length of output: 2516
🏁 Script executed:
# Get more context around lines 1782-1784 to see the full match statement
sed -n '1700,1790p' crates/vm/src/frame.rs | head -100Repository: RustPython/RustPython
Length of output: 4048
🏁 Script executed:
# Find the beginning of the match statement that contains lines 1782-1784
# Start from line 1782 and go backwards to find the match statement
sed -n '1600,1782p' crates/vm/src/frame.rs | tail -200Repository: RustPython/RustPython
Length of output: 8457
🏁 Script executed:
# Search for the function name that contains this code
sed -n '1650,1782p' crates/vm/src/frame.rs | grep -E "fn |match " | head -20Repository: RustPython/RustPython
Length of output: 119
🏁 Script executed:
# Find the function definition containing this match statement
sed -n '1400,1700p' crates/vm/src/frame.rs | grep -B 50 "match.*instruction" | head -60Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's search for the execute or dispatch function that contains this match
rg "fn execute" crates/vm/src/frame.rs -A 5 | head -40Repository: RustPython/RustPython
Length of output: 1542
🏁 Script executed:
# Find the beginning of execute_instruction function
rg "fn execute_instruction" crates/vm/src/frame.rs -A 500 | head -100Repository: RustPython/RustPython
Length of output: 4067
🏁 Script executed:
# Count the number of instruction variants and see if the match is exhaustive
rg "bytecode::Instruction::" crates/vm/src/frame.rs | wc -lRepository: RustPython/RustPython
Length of output: 68
🏁 Script executed:
# Check if there are placeholder instructions and how they're handled
rg "Placeholder" crates/compiler-core/src/bytecode.rs | head -20Repository: RustPython/RustPython
Length of output: 788
🏁 Script executed:
# Look for comments around the catch-all unreachable to understand its intent
sed -n '1770,1790p' crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 786
🏁 Script executed:
# Get the full list of instruction variants from the enum
rg "^\s+([\w:]+)\s*=" crates/compiler-core/src/bytecode.rs | grep -E "pub enum Instruction" -A 200 | grep "=" | wc -lRepository: RustPython/RustPython
Length of output: 66
🏁 Script executed:
# Check if Rust compiler would complain about non-exhaustive match
# Look for warnings about unhandled variants
rg "unreachable" crates/vm/src/frame.rs -B 5 -A 2Repository: RustPython/RustPython
Length of output: 344
🏁 Script executed:
# Let's verify if the Instruction enum is marked as exhaustive
rg "enum Instruction" crates/compiler-core/src/bytecode.rs -B 2 -A 1Repository: RustPython/RustPython
Length of output: 232
🏁 Script executed:
# Get the complete list of all instruction variants from the enum
rg "pub enum Instruction" crates/compiler-core/src/bytecode.rs -A 500 | grep -E "^\s+\w+.*=" | sed 's/.*Instruction:://' | sed 's/\s*=.*//' | sort > /tmp/enum_variants.txt
# Get all the instruction variants handled in execute_instruction
rg "bytecode::Instruction::" crates/vm/src/frame.rs | sed 's/.*bytecode::Instruction:://' | sed 's/[^a-zA-Z_].*//' | sort | uniq > /tmp/handled_variants.txt
# Compare the two
echo "=== Variants in enum but NOT handled (should be empty or only placeholders) ==="
comm -23 /tmp/enum_variants.txt /tmp/handled_variants.txt
echo ""
echo "=== Total enum variants ==="
wc -l /tmp/enum_variants.txt
echo ""
echo "=== Total handled variants ==="
wc -l /tmp/handled_variants.txtRepository: RustPython/RustPython
Length of output: 2554
🏁 Script executed:
# Let me get a cleaner list of just the variant names from the enum
rg "pub enum Instruction" crates/compiler-core/src/bytecode.rs -A 500 | grep -E "^\s+[A-Z][a-zA-Z0-9]*(\s*{|$|\()" | awk '{print $1}' | sort | uniq > /tmp/enum_clean.txt
# Get handled variants more carefully
rg "Instruction::[A-Z]" crates/vm/src/frame.rs -o | sed 's/Instruction:://' | sort | uniq > /tmp/handled_clean.txt
echo "=== Unhandled instruction variants ==="
comm -23 /tmp/enum_clean.txt /tmp/handled_clean.txt
echo ""
echo "=== Count ==="
echo "Enum variants: $(wc -l < /tmp/enum_clean.txt)"
echo "Handled variants: $(wc -l < /tmp/handled_clean.txt)"Repository: RustPython/RustPython
Length of output: 1752
🏁 Script executed:
# Let's also check if there are any placeholder instructions that don't have handlers
rg "// Placeholder" crates/compiler-core/src/bytecode.rs | grep -oE "[A-Z][a-zA-Z0-9]*.*Placeholder" | sed 's/ .*//' > /tmp/placeholders.txt
echo "=== Placeholder instructions ==="
cat /tmp/placeholders.txtRepository: RustPython/RustPython
Length of output: 379
🏁 Script executed:
# Let me extract instruction handlers more carefully by looking at the actual match arms
rg "bytecode::Instruction::" crates/vm/src/frame.rs | head -100 | sed 's/.*Instruction:://' | sed 's/\s*=>.*//' | sed 's/\s*{.*//' | awk '{print $1}' | sort | uniqRepository: RustPython/RustPython
Length of output: 1286
🏁 Script executed:
# Count how many match arms are in execute_instruction
rg "bytecode::Instruction::" crates/vm/src/frame.rs | wc -lRepository: RustPython/RustPython
Length of output: 68
🏁 Script executed:
# Check if there are explicit handlers for the placeholder instructions
rg "Instruction::(Cache|BinaryOpInplaceAddUnicode|BinarySlice|EndFor|ExitInitCheck|InterpreterExit|LoadAssertionError|LoadLocals|PushNull|ReturnGenerator|StoreSlice)" crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's verify the total count of instruction variants more accurately
# by extracting just the variant names from the enum definition
sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs | grep -E '^\s+[A-Z][a-zA-Z0-9]*\s*(\{|=|\()' | wc -lRepository: RustPython/RustPython
Length of output: 68
🏁 Script executed:
# Check the structure of the enum more carefully
sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs | tail -50Repository: RustPython/RustPython
Length of output: 1280
🏁 Script executed:
# Let's verify if Rust's exhaustiveness checking would work here
# by checking if the enum has any special attributes
rg "#\[.*\]" crates/compiler-core/src/bytecode.rs | grep -B 5 "pub enum Instruction" | head -20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's get the exact list of unhandled variants by comparing the handled list with enum
sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs | grep -E '^\s+[A-Z]' | awk '{print $1}' | sort > /tmp/all_variants.txt
# Get all handled variants from execute_instruction
rg "bytecode::Instruction::" crates/vm/src/frame.rs | sed 's/.*Instruction:://' | sed 's/\s*=>.*//' | sed 's/\s*{.*//' | sed 's/\s*(.*//' | awk '{print $1}' | sort | uniq > /tmp/handled.txt
# Find variants that are not handled
echo "=== Variants NOT explicitly handled ==="
comm -23 /tmp/all_variants.txt /tmp/handled.txt
echo ""
echo "=== Are these all placeholders? ==="
# Check which of the unhandled are marked as placeholders
sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs | grep -B 1 "Placeholder" | grep -E "^\s+[A-Z]" | awk '{print $1}' | sort > /tmp/placeholders_list.txt
cat /tmp/placeholders_list.txtRepository: RustPython/RustPython
Length of output: 1504
🏁 Script executed:
# Let's verify that these "unhandled" variants are actually being rejected
# by checking if there's any handling outside of execute_instruction
rg "LoadFast|LoadDeref|StoreFast|StoreDeref|DeleteFast" crates/vm/src/frame.rs | head -20Repository: RustPython/RustPython
Length of output: 544
🏁 Script executed:
# Check if there's pattern matching on the argument or other routing logic
# that might handle these variants separately
sed -n '1200,1400p' crates/vm/src/frame.rs | grep -E "match|fn " | head -30Repository: RustPython/RustPython
Length of output: 2371
🏁 Script executed:
# Let's look at the actual match statement more carefully
# and search for specific instruction handlers
sed -n '1300,1500p' crates/vm/src/frame.rs | grep -A 3 "LoadFast\|StoreFast\|DeleteFast"Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's see if the unhandled variants are indeed in the match statement
# by checking the line count and structure more carefully
rg "bytecode::Instruction::" crates/vm/src/frame.rs | grep -c "=> {"Repository: RustPython/RustPython
Length of output: 67
🏁 Script executed:
# Get a comprehensive list of what's actually being matched
sed -n '/fn execute_instruction/,/^ fn /p' crates/vm/src/frame.rs | grep "bytecode::Instruction::" | sed 's/.*Instruction:://' | sed 's/\s*=>.*//' | sed 's/\s*{.*//' | awk '{print $1}' | sort | uniq | wc -lRepository: RustPython/RustPython
Length of output: 68
🏁 Script executed:
# Let's verify the exact list of handled variants with a more precise approach
sed -n '/fn execute_instruction/,/^ fn [a-z]/p' crates/vm/src/frame.rs | grep "bytecode::Instruction::" | sed 's/.*bytecode::Instruction:://' | sed 's/\s*[=({].*//' | sort | uniq > /tmp/actually_handled.txt
cat /tmp/actually_handled.txt | wc -l
echo ""
echo "=== Sample of handled variants ==="
head -20 /tmp/actually_handled.txtRepository: RustPython/RustPython
Length of output: 366
🏁 Script executed:
# Check if these unhandled variants are truly all placeholders or only used variants
comm -23 /tmp/all_variants.txt /tmp/actually_handled.txt | while read var; do
grep -q "$var.*Placeholder" <(sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs) && echo "$var: PLACEHOLDER" || echo "$var: NOT PLACEHOLDER"
doneRepository: RustPython/RustPython
Length of output: 4028
🏁 Script executed:
# Let's verify if the key instructions like LoadFast are actually being handled
# by checking the actual match arms
rg "Instruction::LoadFast\(" crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 118
🏁 Script executed:
# Let's count the total lines of the match statement to understand its scope
sed -n '/fn execute_instruction/,/^ fn /p' crates/vm/src/frame.rs | wc -lRepository: RustPython/RustPython
Length of output: 69
🏁 Script executed:
# Let's check the end of execute_instruction to see the catch-all
sed -n '/fn execute_instruction/,/^ fn /p' crates/vm/src/frame.rs | tail -20Repository: RustPython/RustPython
Length of output: 765
Add compile-time exhaustiveness checking for new instruction variants.
The catch-all pattern with unreachable!() disables Rust's exhaustiveness checking. If new Instruction variants are added to the enum without corresponding match arms here, they will panic at runtime rather than fail at compile time. Consider restructuring to leverage the compiler's exhaustiveness checking—for example, by removing the catch-all if all variants are already handled, or by using a #[non_exhaustive] attribute on the enum to make missing handlers a compile error.
🤖 Prompt for AI Agents
In @crates/vm/src/frame.rs around lines 1782 - 1784, The match currently uses a
catch-all arm with unreachable!() for the Instruction enum which defeats
compile-time exhaustiveness checking; replace the wildcard arm by explicitly
matching every Instruction variant used in this match so the compiler will error
if new variants are added (i.e., remove the `_ => unreachable!()` arm in the
match on `instruction` and add explicit arms for each `Instruction::<Variant>`
handled here), or if the enum is intended to be extensible annotate the enum
definition with `#[non_exhaustive]` and then handle the `_` case with a
documented conversion/error path; ensure you update the match in the
function/method in frame.rs where `instruction` is matched so future additions
to `Instruction` cause a compile-time failure instead of a runtime panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShaharNaveh uuuuh, this is totally right. I immediately broken build by forgetting change arm during merge. listing all is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
| @property | ||
| def cpython_name(self) -> str: | ||
| if self.rust_name == "CopyItem": | ||
| return "COPY" | ||
| name = self.rust_name | ||
| name = re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name) | ||
| return re.sub(r"(\D)(\d+)$", r"\1_\2", name).upper() | ||
| return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", self.rust_name).upper() | ||
| return re.sub(r"(?<!^)(?=[A-Z])", "_", self.rust_name).upper() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix unreachable code in cpython_name property.
Lines 28-29 are unreachable because line 27 always executes a return statement. This appears to be a copy-paste error during development.
🐛 Remove unreachable code
@property
def cpython_name(self) -> str:
if self.rust_name == "CopyItem":
return "COPY"
name = self.rust_name
name = re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name)
return re.sub(r"(\D)(\d+)$", r"\1_\2", name).upper()
- return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", self.rust_name).upper()
- return re.sub(r"(?<!^)(?=[A-Z])", "_", self.rust_name).upper()🤖 Prompt for AI Agents
In @scripts/generate_opcode_metadata.py around lines 21 - 29, The cpython_name
property contains multiple return statements after the intended two-step
transformation, making the last two returns unreachable; edit the cpython_name
method to keep the special-case for "CopyItem" and the intended conversion
(assign rust_name to name, apply the two regex substitutions and return
name.upper()), and remove the extraneous return lines (the duplicate re.sub
returns) so only the correct transformation and single return remain in the
cpython_name property.
youknowone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Summary by CodeRabbit
Refactor
Chores
✏️ Tip: You can customize this high-level summary in your review settings.