Assign opcode ids #6637

ShaharNaveh · 2026-01-03T12:55:51Z

Summary by CodeRabbit

Refactor
- Internal bytecode instruction names and encodings were reorganized for clearer, strongly-discriminated layouts; this improves consistency in disassembly, error messages, and runtime handling without changing observable behavior.
Chores
- Opcode metadata generation was automated from source to keep tooling and printed opcode names synchronized.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-03T12:56:04Z

📝 Walkthrough

Walkthrough

Refactors the bytecode Instruction enum to use explicit numeric discriminants, renames and reshapes many opcodes (including argument-bearing variants), and propagates those renames across compiler, codegen, VM, JIT, stdlib metadata, tests, and the opcode-generation script.

Changes

Cohort / File(s)	Summary
Bytecode enum `crates/compiler-core/src/bytecode.rs`	Introduced explicit numeric discriminants and new/renamed variants (e.g., `BinaryOp { op }`, `Call`, `CallKw`, `RaiseVarargs`, `LoadName`, `StoreName`, `BinarySubscr`, `DeleteSubscr`, `PopExcept`, `SetupAnnotations`, `Jump { target }`, etc.). Adjusted TryFrom, display/serialization, label_arg, and stack-effect logic to match the new discriminants and shapes.
Code generation `crates/codegen/src/compile.rs`	Replaced emission sites to use renamed instructions (`Call`/`CallKw`, `LoadName`, `StoreName`, `StoreSubscr`/`DeleteSubscr`, `BinarySubscr`, `RaiseVarargs`, `CompareOp`, `PopExcept`, `SetupAnnotations`, etc.). Many call/raise/name handling sites updated.
VM / Frame execution `crates/vm/src/frame.rs`	Updated interpreter match arms to new variants (`Call`/`CallKw`, `LoadName`, `StoreName`, `BinarySubscr`, `DeleteSubscr`, `RaiseVarargs`, `PopExcept`, `CompareOp`, `SetupAnnotations`, etc.) and consolidated unreachable/default handling for removed placeholders.
JIT & runtime helpers `crates/jit/src/instructions.rs`, `crates/jit/tests/common.rs`	Adjusted internal dispatch and tests to new variant names (`Call`, `CompareOp`, `LoadName`, `StoreName`), preserving argument shapes where applicable.
Stdlib opcode utilities `crates/stdlib/src/opcode.rs`	Updated `Opcode::has_name`/`has_jump` checks to reflect renames (`LoadName`, `StoreName`, `DeleteName`) and removed some jump variants from the jump set.
Opcode metadata generator `scripts/generate_opcode_metadata.py`	Rewrote generator to parse the Rust `Instruction` enum directly, derive opcode mapping dynamically, add `Opcode` NamedTuple with `cpython_name`, and emit `Lib/_opcode_metadata.py` instead of relying on hard-coded maps.
Tests / misc `crates/jit/tests/*`, various call sites across crates	Propagated many variant renames in tests and call sites; ensure alignment with new discriminants and argument names.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Remove Rotate* & Duplicate* instructions #6303 — Large Instruction enum edits and propagation; likely overlaps in variant renames and encoding changes.
Unify BINARY_OP bytecodes #6317 — Consolidates binary operation variants into a unified BinaryOp; directly related to the BinaryOp change here.
Make use of Unary opcodes #6647 — Related enum/variant changes for unary operations and operator opcodes that overlap with this refactor.

Suggested reviewers

youknowone
arihant2math

Poem

🐇 Hop, hop — bytes align in rows,
Names and numbers find their pose,
Calls and raises, tidy, neat,
Opcodes march with steady feet,
A rabbit cheers: the bytecode grows! 🎉

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Assign opcode ids' directly and clearly describes the main objective of the changeset: assigning explicit discriminant values to instruction opcodes throughout the codebase.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

crates/compiler-core/src/bytecode.rs (2)

592-871: Opcode discriminant assignments and TryFrom mapping look sound

The explicit repr(u8) values and the whitelist in Instruction::try_from are internally consistent: every value admitted by the match corresponds to a real variant (including the 222–255 range), so the mem::transmute::<u8, Instruction> call is sound and there are no “hole” values that would cause UB. From a maintainability perspective, if you expect to churn opcodes often, you might eventually consider an explicit match value { ... => Ok(Variant), _ => Err(..) } mapping instead of a broad range + transmute, but the current approach is fine correctness‑wise.

Also applies to: 875-915

656-662: Align CheckExcMatch documentation with its current behavior

The doc comment for Instruction::CheckExcMatch still describes it as pushing a boolean result (STACK[-2] vs STACK[-1]), but the variant now carries a Label arg, is included in label_arg, and its stack_effect is -2, matching the “duplicate exc + type, then consume both and keep the original exc” pattern used in compile_try_statement. It would be less confusing for future readers if the comment were updated to describe the jump‑on‑no‑match semantics and the fact that it no longer produces a standalone bool on the stack.

Also applies to: 1643-1656, 1761-1762

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f0526b9 and f4b8be8.

📒 Files selected for processing (3)

crates/codegen/src/compile.rs
crates/compiler-core/src/bytecode.rs
crates/vm/src/frame.rs

🧰 Additional context used

📓 Path-based instructions (1)

**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

crates/vm/src/frame.rs
crates/compiler-core/src/bytecode.rs
crates/codegen/src/compile.rs

🧬 Code graph analysis (2)

crates/vm/src/frame.rs (2)

crates/vm/src/vm/mod.rs (2)

value (675-675)

value (677-677)

crates/vm/src/exceptions.rs (2)

value (426-426)

value (590-590)

crates/compiler-core/src/bytecode.rs (1)

crates/compiler-core/src/marshal.rs (1)

try_from (79-111)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Run rust tests (windows-latest)
GitHub Check: Run snippets and cpython tests (windows-latest)
GitHub Check: Run snippets and cpython tests (macos-latest)
GitHub Check: Run snippets and cpython tests on wasm-wasi
GitHub Check: Check Rust code with clippy
GitHub Check: Run tests under miri
GitHub Check: Ensure compilation on various targets
GitHub Check: Check the WASM package and demo

🔇 Additional comments (19)

crates/vm/src/frame.rs (10)

557-564: LGTM: Instruction rename is consistent.

The rename from BinarySubscript to BinarySubscr aligns with the PR's opcode standardization objective. The implementation logic remains unchanged.

791-803: LGTM: Semantic clarification with DeleteName.

The rename from DeleteLocal to DeleteName better reflects that this instruction deletes by name from the locals mapping (using ass_subscript), distinguishing it from DeleteFast which operates on the fastlocals array by index. This improves code clarity.

804-804: LGTM: Instruction rename is consistent.

The rename from DeleteSubscript to DeleteSubscr aligns with the opcode standardization.

1350-1358: LGTM: Instruction rename is consistent.

The rename from PopException to PopExcept follows the abbreviated naming convention used elsewhere in the bytecode.

1370-1370: LGTM: Instruction rename is consistent.

The rename from Raise to RaiseVarargs better communicates that this instruction handles variable arguments for raise statements (reraise, raise with cause, etc.).

1418-1418: LGTM: Instruction rename is consistent.

The rename from SetupAnnotation (singular) to SetupAnnotations (plural) is more accurate since the instruction sets up the __annotations__ dictionary which holds multiple annotations.

1487-1492: LGTM: Semantic clarification with StoreName.

The rename from StoreLocal to StoreName better reflects that this instruction stores by name in the locals mapping (using ass_subscript), distinguishing it from StoreFast which stores by index in the fastlocals array. This improves code clarity.

1493-1493: LGTM: Instruction rename is consistent.

The rename from StoreSubscript to StoreSubscr aligns with the opcode standardization and abbreviated naming convention.

530-1578: Ensure code passes cargo fmt and cargo clippy before merging.

Per the coding guidelines for Rust files, run cargo fmt to format the code and cargo clippy to check for lints. Fix any warnings introduced by these changes before final submission.

700-722: CheckExcMatch implementation is correct.

The implementation properly validates that exception types inherit from BaseException, handles both single types and tuples, and leverages is_instance which correctly delegates tuple matching to object_recursive_isinstance. The bytecode instruction is correctly emitted during exception handler compilation, and the logic accurately mirrors Python's exception matching behavior.

crates/compiler-core/src/bytecode.rs (2)

1696-1823: Stack effects for new/renamed opcodes match codegen usage

The updated stack_effect arms for BinarySubscr, Subscript, StoreSubscr, DeleteSubscr, StoreName/DeleteName, CheckEgMatch, CheckExcMatch, RaiseVarargs, EndAsyncFor, SetupAnnotations, etc., line up with how they’re emitted in compile.rs:

StoreSubscr => -3 and DeleteSubscr => -2 match the [value, container, index] and [container, index] stack shapes produced by compile_subscript, compile_store, and compile_augassign.

BinarySubscr => -1 is used where we duplicate the subject and then index into it in pattern helpers.

RaiseVarargs’s -(kind.get(arg) as u8 as i32) matches the number of values the compiler pushes for each RaiseKind (0/1/2).

CheckEgMatch => 0 (pop 2, push 2) is consistent with the [orig, list, rest, type] -> [orig, list, new_rest, match] comment in compile_try_star_except.

CheckExcMatch(_) => -2 matches the CopyItem + exc_type pattern in compile_try_statement (3 stack entries down to the original exc).

I don’t see any inconsistency between these definitions and the codegen sites.

875-915: Confirm bytecode compatibility expectations after changing opcode values

By pinning explicit repr(u8) discriminants and updating Instruction::try_from accordingly, the binary encoding of bytecode has changed. Any previously serialized CodeUnit/CodeObject (e.g., cached .pyc-like artifacts) using the old opcode layout will now either fail to decode (InvalidBytecode) or, if produced by older code that also used transmute, could be misinterpreted if reused across versions. If RustPython ever persists bytecode across versions, it would be good to confirm that:

Either the magic/marshal version is bumped so old caches are rejected explicitly, or

You guarantee such caches are never reused across a change like this.

Otherwise, you may get hard‑to‑diagnose failures when loading stale compiled artifacts.

crates/codegen/src/compile.rs (7)

445-502: Subscript codegen matches new BinarySubscr/StoreSubscr/DeleteSubscr semantics

The various subscript paths are internally consistent with the new opcodes and their stack effects:

compile_subscript:

Load: evaluates value, then slice, and calls Subscript, giving [obj, index] -> result as expected.

Store: when used from compile_store, we start with the assigned value on the stack; after compiling value and slice (and BuildSlice when optimizing 2‑element slices) we end up with [value, container, sub], which matches StoreSubscr => -3.

Delete: compiles value then slice and calls DeleteSubscr, with [container, sub] matching DeleteSubscr => -2.

pattern_helper_sequence_subscr duplicates the subject and uses BinarySubscr with [subject, subject, idx] so the helper keeps the original subject while extracting an element, consistent with BinarySubscr => -1.

compile_annotated_assign builds [annotation, __annotations__, name] before StoreSubscr, so __annotations__[name] = annotation is encoded correctly.

compile_augassign’s subscript case carefully arranges the stack to [result, container, index] before StoreSubscr, again matching the [value, container, sub] convention.

Given the stack_effect definitions in bytecode.rs, all of these sites look correct.

Also applies to: 1801-1827, 3621-3622, 4484-4511, 4588-4648

1180-1302: NameOp::Name migration to LoadNameAny/StoreName/DeleteName is coherent

Routing NameOp::Name through LoadNameAny / StoreName / DeleteName is consistent with how indices are allocated:

name() and get_global_name_index() both index into CodeInfo.metadata.names, which is exactly what these opcodes consume.

Module/class bodies (compile_program, compile_program_single, compile_class_body) and class metadata setup (__module__, __qualname__, __doc__, __firstlineno__, __type_params__, __classcell__) all use name() and StoreName, so they’re aligned with the bytecode enum.

As long as the VM side now treats LOAD_NAME_ANY/STORE_NAME/DELETE_NAME as the generic name path (mirroring the old local/name opcodes), behavior should be unchanged.

Also applies to: 2959-3042

1501-1518: RaiseVarargs and exception-matching opcodes integrate cleanly in try/except and assert paths

The switch from Raise to RaiseVarargs and from JumpIfNotExcMatch to CheckExcMatch is wired consistently:

Stmt::Raise:

RaiseKind::Raise is emitted after compiling one expression (stack_effect -1).

RaiseKind::RaiseCause is emitted after compiling two expressions (stack_effect -2).

Bare raise uses RaiseKind::Reraise (stack_effect 0), matching the absence of new stack values.

In assert lowering, the explicit RaiseVarargs { kind: Raise } matches compiling exactly one exception instance.

Classic try/except:

For typed handlers, the CopyItem + exc_type + CheckExcMatch(next_handler) pattern creates [exc, exc, type] so that CheckExcMatch can consume the duplicate and type and still leave the original exc for the bound name or PopTop.

Unhandled cases jump to the final next_handler block, where RaiseVarargs { kind: Reraise } is emitted, so unmatched exceptions are re‑raised.

try/except*:

Uses CheckEgMatch with the documented [orig, list, rest, type] -> [orig, list, new_rest, match] shape.

SetExcInfo is used before running the handler body so a bare raise picks up the right match.

The “no more to reraise” path does PopTop (drop the None result) and PopExcept, while the re‑raise path deliberately omits PopExcept and just executes RaiseVarargs { kind: Raise }, leaving block unwinding to the VM.

These all line up with the stack_effect definitions in bytecode.rs; I don’t see a mismatch.

Also applies to: 1586-1591, 2053-2147, 2178-2420

2923-2957: SetupAnnotations usage remains consistent with annotation detection

The rename to Instruction::SetupAnnotations has been plumbed through all relevant sites:

Module and REPL compilation (compile_program, compile_program_single) call find_ann and emit SetupAnnotations only when needed.

Class bodies (compile_class_body) also call find_ann and emit SetupAnnotations once at the top when annotations are present.

Since compile_annotated_assign is already gated on !ctx.in_func() for runtime annotation storage, this preserves the prior behavior with the new opcode name.

Also applies to: 1034-1053, 1063-1065, 3019-3022

3821-4022: CompareOp/ComparisonOperator wiring for patterns and comparisons looks correct

The migration to ComparisonOperator + Instruction::CompareOp is coherent:

Mapping in compile_addcompare:

CmpOp::Eq/NotEq/Lt/LtE/Gt/GtE map to Equal, NotEqual, Less, LessOrEqual, Greater, GreaterOrEqual respectively.

Structural pattern matching:

Mapping patterns use CompareOp { op: GreaterOrEqual } for the len(subject) >= size and len(subject) >= size-1 checks.

Sequence patterns use CompareOp { op: Equal } for len(subject) == size and GreaterOrEqual where a *rest is present.

Value patterns use CompareOp { op: Equal } for subject == value.

Other relational/inclusion ops (In/NotIn, Is/IsNot) delegate to ContainsOp and IsOp with the right Invert flag.

Given CompareOp has stack effect -1 and the surrounding code always leaves the subject underneath the comparison, these changes look correct.

Also applies to: 4132-4209, 4211-4223, 4361-4376

4484-4517: Annotation storage and augmented assignment now use StoreSubscr consistently

Two key sites now rely on StoreSubscr and are consistent with its stack contract:

compile_annotated_assign:

For simple name: T outside functions, after compile_annotation the stack is [annotation], then LoadNameAny(__annotations__), then the mangled name string, yielding [annotation, __annotations__, name] before StoreSubscr, which matches the [value, container, index] convention.

compile_augassign:

The subscript case builds [container, slice], duplicates both to get a copy for loading, uses Subscript to obtain the original value, then performs the inplace binary op with the RHS and finally reorders to [result, container, slice] just before StoreSubscr.

Both cases align with the StoreSubscr stack_effect and with how simple subscript stores are compiled via compile_subscript.

Also applies to: 4588-4648

5443-5630: Async for/comprehensions correctly use EndAsyncFor under the new opcode layout

In both explicit async for loops and comprehensions:

compile_for emits EndAsyncFor on the async path after the else block, matching EndAsyncFor => -2 (it cleans up the async iterator state).

compile_comprehension emits EndAsyncFor when any generator is async, and uses GetAIter/GetANext + Resume { AfterAwait }, consistent with the updated async for bytecodes.

Given the updated numeric assignment for EndAsyncFor and the unchanged stack_effect, the codegen here still looks correct.

Also applies to: 3331-3377

youknowone · 2026-01-03T13:21:52Z

#6638 will change instruction a lot. Can this be aligned to that patch?

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/compiler-core/src/bytecode.rs (1)
1063-1074: Verify transmute safety with non-contiguous discriminants.

The TryFrom<u8> implementation uses unsafe { core::mem::transmute::<u8, Self>(value) } after checking value <= u8::from(LAST_INSTRUCTION). With explicit discriminants that have gaps (e.g., no opcodes at 0, 3-4, 11, 13, etc.), transmuting these gap values could produce invalid Instruction enum variants, which is undefined behavior.

Consider either:

Filling all gaps with Reserved variants (e.g., Reserved0, Reserved3, Reserved4, etc.)

Using an explicit match instead of transmute to validate each discriminant
🔎 Example of safer implementation using match
 impl TryFrom<u8> for Instruction {
     type Error = MarshalError;
 
     #[inline]
     fn try_from(value: u8) -> Result<Self, MarshalError> {
-        if value <= u8::from(LAST_INSTRUCTION) {
-            Ok(unsafe { core::mem::transmute::<u8, Self>(value) })
-        } else {
-            Err(MarshalError::InvalidBytecode)
-        }
+        // Safe alternative: explicit validation
+        match value {
+            1 => Ok(Self::BeforeAsyncWith),
+            2 => Ok(Self::BeforeWith),
+            5 => Ok(Self::BinarySubscr),
+            // ... other valid opcodes
+            149 => Ok(Self::Resume { arg: Arg::marker() }),
+            _ => Err(MarshalError::InvalidBytecode),
+        }
     }
 }
Note: This requires #[repr(u8)] on the enum and careful maintenance, but is memory-safe.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 153a03a and 5369dab.

📒 Files selected for processing (1)

crates/compiler-core/src/bytecode.rs

🧰 Additional context used

📓 Path-based instructions (1)

**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

crates/compiler-core/src/bytecode.rs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Check Rust code with clippy
GitHub Check: Run rust tests (windows-latest)
GitHub Check: Run snippets and cpython tests (windows-latest)
GitHub Check: Ensure compilation on various targets

🔇 Additional comments (2)

crates/compiler-core/src/bytecode.rs (2)

679-712: Coordinate with PR #6638 before merging.

Based on the PR comments, PR #6638 will significantly change the instruction structure. Since this PR assigns explicit discriminants to the Instruction enum, merging both could create conflicts or require substantial rework.

Recommend coordinating with the author of PR #6638 to determine:

Which PR should merge first

Whether discriminant assignments should be deferred until after #6638

How to handle potential conflicts in the instruction enum structure

679-712: Cache (opcode 0) and Reserved3 (opcode 3) variants appear to be referenced but not defined in the Instruction enum.

The review comment correctly identifies that opcode 0 is missing from the visible enum definition (BeforeAsyncWith starts at 1). However, the code references Cache and Reserved3 in match statements within stack_effect() (line 1846) and the display logic (line 2084), but these variants cannot be located in the enum definition. If not properly defined, this would cause compilation errors.

This discrepancy needs clarification:

If Cache and Reserved3 are intentionally omitted per CPython 3.13 spec (which removed CACHE as a separate opcode), they should be removed from match statements.

If they should be included, they must be added to the enum with appropriate discriminants (Cache = 0, Reserved3 = 3).

youknowone · 2026-01-06T14:43:34Z

crates/compiler-core/src/bytecode.rs

+    LoadClosure(Arg<NameIdx>) = 253, // CPython uses pseudo-op 258
+    LoadMethod {
+        idx: Arg<NameIdx>,
+    } = 254, // CPython uses pseudo-op 259


We will actually put these ops on pseudo ops once we get them.
Planning to do it for next task.

That, will be great:)
If you already have a local branch with the changes, can you please open a draft PR? I might have some inputs

not yet. I will start one during weekend

youknowone · 2026-01-06T14:46:27Z

crates/compiler-core/src/bytecode.rs

-        } else {
-            Err(MarshalError::InvalidBytecode)
+        match value {
+            1


Why not the opposite way? listing reserved may be easier. Reserved ops doesn't have args, so using names is also easy like Self::Reserved3 as u8

Not sure I understand. What is the purposed code change?

I created this match arm with a simple script

I'd like to avoid using literal numbers in the code. That's not easy to read, and maintainers must not forget to run the 'simple script'. Using name prevent to run the script.

if (1..=Self::Resume as u8).contains(value) && ![Self::Reserved3 as u8, Self::Reserved17 as u8, Self::Reserved141 as u8...].contains(value) { Ok(core::mem::transmute::<u8, Self>(value)) } else { Err(InvalidBytecode) }

This code will automatically error-catachable by breaking build if we change reserved141 to something other. (Though not working the inverse way)

But we have "holes" in our IDs, this can lead to undefined behavior.

I would love to use num_enum crate, but it doesn't support enum variants that holds a value:/

I do think we should rename the Instruction enum to Opcode, and have an Instruction struct that holds:

struct Instruction { opcode: Opcode, oparg: Oparg // Oparg enum }

Like a typed CodeUnit.
But that's out of scope for this PR, and needs a further dissection

Hmm, we had that form originally. The commits of that change or comments of the PR might have note that why we moved here.
cc @coolreader18

youknowone · 2026-01-06T14:47:36Z

crates/compiler-core/src/bytecode.rs

            | Continue { target: l }
-            | Send { target: l }
-            // Dummy jump instructions (placeholders)
-            | JumpBackward { target: l }


If we are going to implement these to replace our own old ones, I hesitate to remove it now to recover it back later.
We will not implement any other instruction in this slot anyway.

My philosophy is that we can always add new code, but to have a placeholder code (that can be deleted) is just clutter

What's the benefit to delete them? They are deletable, but no reason to do.

IMO they are just noise, but I can re-add them, that's not an issue

Yeah, that could be. At the same time, it is also like TODO

ShaharNaveh · 2026-01-09T12:36:12Z

Lib/test/test_dis.py

    def test_boundaries(self):
        self.assertEqual(dis.opmap["EXTENDED_ARG"], dis.EXTENDED_ARG)

+    @unittest.expectedFailure # TODO: RUSTPYTHON; AssertionError: 29 not less than or equal to 20


Because BINARY_OP_INPLACE_ADD_UNICODE opcode was added, we are missing some truncation logic when formatting the dis output

coderabbitai

Actionable comments posted: 5

🤖 Fix all issues with AI agents

In @crates/codegen/src/compile.rs:
- Around line 1016-1025: The stack effect for PopExcept is incorrectly
documented as 0 but the VM's frame implementation (where PopExcept does let
prev_exc = self.pop_value()) consumes one value; update the stack_effect mapping
for the PopExcept opcode (the entry for PopExcept in your
stack_effect/stack_effects function or match table) from 0 to -1 so the
documented contract matches the VM behavior and downstream stack analysis.
- Around line 1881-1885: The generic-class keyword handling is wrong: instead of
emitting individual STRING constants for each kw name and increasing nargs with
keywords, build a single tuple constant of keyword name strings (use
ConstantData::Tuple) and emit that one tuple onto the stack, and emit the call
with nargs equal to the number of positional arguments only (do not add keywords
to nargs); mirror the pattern in compile_call_inner so the VM's
collect_keyword_args receives a PyTuple at the top of the stack and the
Call/CallKw ABI is satisfied.

In @crates/compiler-core/src/bytecode.rs:
- Around line 679-958: The VM currently ends bytecode dispatch with a catch-all
that calls unreachable!(), so placeholder enum variants in Bytecode (e.g.,
BinaryOpInplaceAddUnicode, BinarySlice, EndFor, LoadLocals, PushNull) can cause
a panic if accidentally executed; change the dispatch default arm to return a
descriptive runtime error (or Err) indicating an invalid/placeholder opcode
rather than panicking, and add an explicit match arm or helper check (e.g.,
is_placeholder_opcode) that logs or maps placeholder discriminants to that
error; additionally, add a validation step in bytecode loading or TryFrom<u8>
conversion for Bytecode to reject placeholder discriminants early (or emit
warnings) and document the retained placeholder variants in the Bytecode enum
comment to clarify they must never be emitted by codegen.

In @crates/vm/src/frame.rs:
- Around line 1782-1784: The match currently uses a catch-all arm with
unreachable!() for the Instruction enum which defeats compile-time
exhaustiveness checking; replace the wildcard arm by explicitly matching every
Instruction variant used in this match so the compiler will error if new
variants are added (i.e., remove the `_ => unreachable!()` arm in the match on
`instruction` and add explicit arms for each `Instruction::<Variant>` handled
here), or if the enum is intended to be extensible annotate the enum definition
with `#[non_exhaustive]` and then handle the `_` case with a documented
conversion/error path; ensure you update the match in the function/method in
frame.rs where `instruction` is matched so future additions to `Instruction`
cause a compile-time failure instead of a runtime panic.

In @scripts/generate_opcode_metadata.py:
- Around line 21-29: The cpython_name property contains multiple return
statements after the intended two-step transformation, making the last two
returns unreachable; edit the cpython_name method to keep the special-case for
"CopyItem" and the intended conversion (assign rust_name to name, apply the two
regex substitutions and return name.upper()), and remove the extraneous return
lines (the duplicate re.sub returns) so only the correct transformation and
single return remain in the cpython_name property.

🧹 Nitpick comments (1)

crates/compiler-core/src/bytecode.rs (1)
975-999: Document intentional gaps in opcode ranges.

The TryFrom<u8> implementation validates four separate ranges:

CPython opcodes: Cache (0) to YieldValue (118)

Resume: 149

RustPython-only: Break (119) to Subscript (135)

Pseudo opcodes: Jump (252) to PopBlock (255)

This leaves gaps (136-148, 150-251) that will return InvalidBytecode errors. Consider:

Adding a comment in the TryFrom implementation documenting these intentional gaps.

Verifying that deserialization of legacy bytecode won't encounter these gap values.
📝 Add documentation for opcode gaps
     #[inline]
     fn try_from(value: u8) -> Result<Self, MarshalError> {
+        // Valid opcode ranges:
+        // - 0..=118: CPython 3.13 compatible opcodes
+        // - 149: Resume instruction
+        // - 119..=135: RustPython-specific opcodes
+        // - 252..=255: Pseudo opcodes (not emitted in bytecode)
+        // Gaps (136-148, 150-251) are reserved for future use
         let cpython_start = u8::from(Self::Cache);

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5369dab and 258f61c.

⛔ Files ignored due to path filters (3)

Lib/_opcode_metadata.py is excluded by !Lib/**
Lib/test/test__opcode.py is excluded by !Lib/**
Lib/test/test_dis.py is excluded by !Lib/**

📒 Files selected for processing (7)

crates/codegen/src/compile.rs
crates/compiler-core/src/bytecode.rs
crates/jit/src/instructions.rs
crates/jit/tests/common.rs
crates/stdlib/src/opcode.rs
crates/vm/src/frame.rs
scripts/generate_opcode_metadata.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

crates/stdlib/src/opcode.rs
crates/jit/src/instructions.rs
crates/jit/tests/common.rs
crates/vm/src/frame.rs
crates/codegen/src/compile.rs
crates/compiler-core/src/bytecode.rs

**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

scripts/generate_opcode_metadata.py

🧠 Learnings (1)

📚 Learning: 2025-12-27T14:03:49.034Z

Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When modifying bytecode instructions, perform a full clean build by running `rm -r target/debug/build/rustpython-* && find . | grep -E '\.pyc$' | xargs rm -r`

Applied to files:

scripts/generate_opcode_metadata.py

🧬 Code graph analysis (2)

crates/codegen/src/compile.rs (1)

crates/compiler-core/src/bytecode.rs (8)

from (394-396)

from (437-439)

from (964-967)

from (1044-1046)

from (1050-1052)

try_from (974-999)

try_from (1020-1025)

try_from (1034-1040)

crates/compiler-core/src/bytecode.rs (3)

crates/vm/src/exceptions.rs (4)

value (438-438)

value (602-602)

w (1775-1775)

w (1776-1776)

crates/vm/src/frame.rs (3)

arg (2646-2647)

name (1322-1322)

name (2638-2638)

crates/codegen/src/compile.rs (1)

name (1210-1212)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Run snippets and cpython tests (macos-latest)
GitHub Check: Run snippets and cpython tests (windows-2025)
GitHub Check: Run snippets and cpython tests (ubuntu-latest)

🔇 Additional comments (16)

crates/compiler-core/src/bytecode.rs (2)

1763-1914: LGTM! Stack effect calculations properly updated.

The stack effect calculations have been correctly updated to use the renamed instruction variants (LoadName, StoreName, DeleteName, Call, CallKw, RaiseVarargs, etc.). The logic for each instruction's stack manipulation appears correct.

1982-2104: LGTM! Disassembly formatting correctly updated.

The fmt_dis method has been properly updated to display the renamed instruction variants with their appropriate names (e.g., BINARY_SUBSCR, DELETE_SUBSCR, LOAD_NAME, STORE_NAME, CALL, CALL_KW, RAISE_VARARGS, etc.).

scripts/generate_opcode_metadata.py (2)

55-81: LGTM! Generation logic correctly parses and outputs opcode metadata.

The script properly:

Reads the Rust source file

Extracts the Instruction enum body

Parses opcodes with their discriminants

Generates sorted output in CPython 3.13 compatible format

The dynamic parsing approach eliminates the need for manual updates when opcodes change.

32-37: Python version requirement is acceptable for this development script.

This is a development-only script located in the scripts/ directory used to generate opcode metadata. The script already indicates compatibility with CPython 3.13 in its header comment, and the project's whats_left.py explicitly requires Python 3.13+. Since itertools.batched was introduced in Python 3.12, its use is fully compatible with the project's stated Python version requirements.

No action needed.

crates/jit/tests/common.rs (1)

99-105: LGTM! Test instruction handling correctly updated.

The test stack machine has been properly updated to handle the renamed instruction variants:

LoadNameAny → LoadName

StoreLocal → StoreName

The logic remains the same, only the variant names changed.

crates/jit/src/instructions.rs (2)

457-476: LGTM! JIT call instruction updated.

The JIT compiler correctly handles the renamed Call instruction (previously CallFunctionPositional). The implementation logic remains unchanged.

492-545: LGTM! Comparison instruction renamed.

The CompareOp instruction (previously CompareOperation) is correctly handled in the JIT with proper type conversions and condition code mappings for both integer and float comparisons.

crates/stdlib/src/opcode.rs (1)

66-83: LGTM! Opcode name classification correctly updated.

The has_name function has been properly updated to recognize the renamed instruction variants:

DeleteLocal → DeleteName

LoadNameAny → LoadName

StoreLocal → StoreName

All three renamed variants correctly appear in the pattern match for instructions that use co_names.

crates/vm/src/frame.rs (1)

394-394: LGTM! Instruction variant renames are consistent.

The renamed instruction variants align well with standard Python bytecode naming conventions:

RaiseVarargs, BinarySubscr, CallKw, Call, CompareOp, DeleteName, DeleteSubscr, LoadName, PopExcept, SetupAnnotations, StoreName, and StoreSubscr are all clearer and more conventional names.

The logic remains unchanged—these are purely API surface updates.

Also applies to: 613-613, 714-718, 756-756, 825-825, 838-838, 1195-1195, 1431-1431, 1460-1460, 1562-1562, 1624-1624, 1630-1630

crates/codegen/src/compile.rs (7)

1818-1819: The kind mapping correctly covers all three raise forms (bare raise, raise exc, raise exc from cause), and the encoding aligns with the RaiseKind enum and opcode metadata. No issues found.

5222-5227: Annotation assignment, augassign, and normal call paths updated to new ops—monitor for stack effect regressions

All three locations (lines 5222–5227, 5363–5364, 6111–6115) correctly use the updated instructions (LoadName, StoreSubscr, Call, CallKw). Run the full test suite plus targeted cases (subscripting stores, __annotations__ in class/module, kwargs calls, exception handling) to verify stack effects before merge.

1345-1346: SetupAnnotations instruction verified: correctly gated and scoped

SetupAnnotations is properly emitted in all three entry points (module, function, class) using consistent gating with Self::find_ann(), which recursively detects annotated assignments. The runtime implementation correctly initializes the __annotations__ dict only when needed. No silent breakage risk identified—scopes align with Python semantics and the instruction behavior is sound.

1597-1602: LoadName/StoreName/DeleteName scope semantics are correct

LoadNameAny does not exist in the codebase. The current implementation correctly emits LoadName for NameOp::Name, which handles the proper LEGB lookup (Locals → Enclosing → Globals → Builtins): it first tries self.locals.mapping().subscript(name, vm), and on KeyError falls back to self.load_global_or_builtin(name, vm). This is the correct behavior for module and class scopes. StoreName and DeleteName similarly use locals.mapping() for storage/deletion, which is appropriate for these contexts. No behavior regression exists.

4338-4339: Pattern-matching opcode renames (BinarySubscr / CompareOp / DeleteSubscr) are properly defined and handled
All three opcodes are correctly defined in the Instruction enum (BinarySubscr = 5, DeleteSubscr = 9, CompareOp with op field), with proper VM dispatch handlers in frame.rs, correct stack effect tracking, and consistent bytecode serialization. All usages in compile.rs align with the defined opcode metadata.

497-510: Subscript opcodes are properly defined with documented stack semantics

The StoreSubscr (opcode 39, -3 stack effect), DeleteSubscr (opcode 9, -2 stack effect), and BinarySubscr (opcode 5, -1 stack effect) instructions are well-defined in crates/compiler-core/src/bytecode.rs with clear stack effect documentation. The usage patterns in compile.rs are correct: for store operations, the container and index are on the stack followed by the value; for delete operations, the container and index are consumed. No stack order issues detected.

3564-3596: Verify LoadName semantics for .type_params access in PEP 695 generic class contexts

While .type_params is registered with the TYPE_PARAM flag (making it a bound symbol), the code uses LoadName instructions to access it rather than LoadDeref. This appears to work because:

In builtins.rs (line 1040), .type_params is explicitly set in the class function's namespace dictionary before execution

The compile_name function has special handling for TypeParams scope that allows fallback to parent scope lookup

At runtime, LoadName succeeds by finding .type_params in the namespace dict

However, this relies on runtime namespace provisioning rather than the symbol scope mechanism. Confirm that:

.type_params is guaranteed to be in the namespace dict in all PEP 695 execution paths

No optimization or refactoring removes the .type_params namespace entry before the class code executes

The fallback logic in compile_name consistently resolves .type_params correctly across all TypeParams scope access patterns (lines 3589-3590, 3679-3680, 3690, and others)

coderabbitai · 2026-01-09T12:59:56Z

crates/codegen/src/compile.rs

+                // We need to pop: exc, prev_exc (via PopExcept)
                if preserve_tos {
                    emit!(self, Instruction::Swap { index: 2 });
                }
                emit!(self, Instruction::PopTop); // exc
                if preserve_tos {
                    emit!(self, Instruction::Swap { index: 2 });
                }
-                emit!(self, Instruction::PopException); // prev_exc is restored
+                emit!(self, Instruction::PopExcept); // prev_exc is restored
            }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Find PopExcept / PushExcInfo / RaiseVarargs implementations + stack comments rg -n --hidden --no-messages "PopExcept|PushExcInfo|RaiseVarargs|ReraiseFromStack" crates | head -n 200

Repository: RustPython/RustPython

Length of output: 5787

🏁 Script executed:

#!/bin/bash # Inspect Instruction definition in compiler-core bytecode rg -n --hidden --no-messages "enum Instruction" crates/compiler-core/src/bytecode.rs | head -n 20

Repository: RustPython/RustPython

Length of output: 91

🏁 Script executed:

#!/bin/bash # Look for bytecode.rs file to understand Instruction definition find crates -name "bytecode.rs" -type f

Repository: RustPython/RustPython

Length of output: 101

🏁 Script executed:

#!/bin/bash # Read bytecode.rs around the Instruction enum and stack effect docs sed -n '1835,1910p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 3439

🏁 Script executed:

#!/bin/bash # Also read the Instruction enum definition sed -n '677,750p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 1937

🏁 Script executed:

#!/bin/bash # Check the PopExcept and PushExcInfo implementations in frame.rs sed -n '1425,1470p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 2441

🏁 Script executed:

#!/bin/bash # Check PushExcInfo implementation sed -n '1500,1520p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 893

🏁 Script executed:

#!/bin/bash # Read more context around the stack_effect function to understand the contract sed -n '1830,1905p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 3463

🏁 Script executed:

#!/bin/bash # Check if there are any test cases or comments explaining the expected behavior rg -n "PopExcept" crates --context=3 -A 2 -B 2

Repository: RustPython/RustPython

Length of output: 7469

🏁 Script executed:

#!/bin/bash # Look at the compile.rs code around all mentioned line ranges to check for consistency sed -n '1014,1030p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 877

🏁 Script executed:

#!/bin/bash # Check other PopExcept usages in compile.rs sed -n '1058,1070p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 582

🏁 Script executed:

#!/bin/bash # Look at lines 2382-2400 mentioned in the review to check expected stack state sed -n '2375,2410p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 1437

🏁 Script executed:

#!/bin/bash # Check if there are any recent comments or issues about PopExcept stack effect rg -n "stack.effect|stack_effect" crates/compiler-core/src/bytecode.rs | grep -A 2 -B 2 "PopExcept"

Repository: RustPython/RustPython

Length of output: 47

PopExcept's documented stack effect is incorrect and must be fixed

The bytecode stack effect documentation states PopExcept => 0, (bytecode.rs:1897), but the VM implementation (frame.rs:1431) actually pops one value: let prev_exc = self.pop_value();. This is a -1 stack effect, not 0.

All usage sites in compile.rs correctly assume PopExcept has a -1 effect (e.g., lines 1016-1024 comment "prev_exc is restored"), and the comments match the actual behavior. However, the incorrect documented contract could cause issues with stack analysis tools or future verification passes that rely on the stack_effect function.

Fix the stack_effect documentation in bytecode.rs:1897 from PopExcept => 0, to PopExcept => -1, to match the actual VM implementation.

🤖 Prompt for AI Agents

In @crates/codegen/src/compile.rs around lines 1016 - 1025, The stack effect for PopExcept is incorrectly documented as 0 but the VM's frame implementation (where PopExcept does let prev_exc = self.pop_value()) consumes one value; update the stack_effect mapping for the PopExcept opcode (the entry for PopExcept in your stack_effect/stack_effects function or match table) from 0 to -1 so the documented contract matches the VM behavior and downstream stack analysis.

coderabbitai · 2026-01-09T12:59:56Z

crates/codegen/src/compile.rs

+                            emit!(self, Instruction::Call { nargs: 1 });
                        }
                        None => {
-                            emit!(self, Instruction::CallFunctionPositional { nargs: 0 });
+                            emit!(self, Instruction::Call { nargs: 0 });
                        }


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the compile.rs file at the mentioned lines head -c 1000000 crates/codegen/src/compile.rs | wc -l

Repository: RustPython/RustPython

Length of output: 69

🏁 Script executed:

# Look at the code around the specified lines sed -n '1875,1890p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 762

🏁 Script executed:

# Find all occurrences of CallKw and Call instructions in the codebase rg -n "CallKw|Instruction::Call\b" crates/codegen/src/compile.rs | head -50

Repository: RustPython/RustPython

Length of output: 1029

🏁 Script executed:

# Find the compile_call_inner function rg -n "fn compile_call_inner" crates

Repository: RustPython/RustPython

Length of output: 126

🏁 Script executed:

# Search for CALL and CALL_KW instruction implementations in the VM rg -n "CALL_KW|CALL\b" crates --type rust | grep -i "opcode\|instruction\|match" | head -40

Repository: RustPython/RustPython

Length of output: 158

🏁 Script executed:

# Read the compile_call_inner function (around line 6127) sed -n '6120,6200p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 3137

🏁 Script executed:

# Look for CallKw enum variant definition rg -n "CallKw" crates/codegen/src/compile.rs -B2 -A2

Repository: RustPython/RustPython

Length of output: 583

🏁 Script executed:

# Find instruction definitions - look for CallKw struct/variant fd "bytecode.rs\|instruction" crates --type f | head -10

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Search for where Call/CallKw instructions are defined rg -n "pub enum Instruction|struct Call|struct CallKw" crates --type rust | head -20

Repository: RustPython/RustPython

Length of output: 249

🏁 Script executed:

# Read the Instruction enum definition in bytecode.rs around line 677 sed -n '677,750p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 1937

🏁 Script executed:

# Search for Call and CallKw variants in the enum rg -n "Call\s*\{|CallKw\s*\{" crates/compiler-core/src/bytecode.rs -A2

Repository: RustPython/RustPython

Length of output: 1373

🏁 Script executed:

# Find VM implementation of CALL and CALL_KW instructions rg -n "Instruction::Call\s*\{|Instruction::CallKw\s*\{" crates/vm/src --type rust -B2 -A5 | head -80

Repository: RustPython/RustPython

Length of output: 927

🏁 Script executed:

# Look for the execute function that handles instructions in the VM fd "interpreter.rs\|execute.rs" crates/vm/src --type f

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Find collect_keyword_args and collect_positional_args implementations rg -n "fn collect_keyword_args|fn collect_positional_args" crates/vm/src -A10

Repository: RustPython/RustPython

Length of output: 1214

🏁 Script executed:

# Look at the context around CallKw in frame.rs to understand the full call flow sed -n '700,730p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 1505

🏁 Script executed:

# Search for stack effect calculations to understand the contract better rg -n "CallKw.*stack|collect_keyword" crates --type rust | head -20

Repository: RustPython/RustPython

Length of output: 351

🏁 Script executed:

# Look at the context around line 3733-3735 where CallKw is emitted differently sed -n '3720,3760p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 1713

🏁 Script executed:

# Find what function this is in sed -n '3700,3750p' crates/codegen/src/compile.rs | head -60

Repository: RustPython/RustPython

Length of output: 2052

🏁 Script executed:

# Check if there are multiple compile functions handling different call contexts rg -n "fn compile.*call" crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 304

🏁 Script executed:

# Look for the function containing the class definition code at line 3733 sed -n '3600,3650p' crates/codegen/src/compile.rs | head -60

Repository: RustPython/RustPython

Length of output: 1745

🏁 Script executed:

# Search for tests of generic class definitions to verify this code path executes rg -n "generic.*class|TypeParam" crates --type rust | grep test | head -20

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Get the full function signature where line 3733 is located sed -n '3600,3650p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 1745

🏁 Script executed:

# Look at the full context of the compile_class_def function, especially the CallKw usage sed -n '3630,3780p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 5811

🏁 Script executed:

# Check if this is within an if statement checking for generics/type params sed -n '3650,3770p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 4735

🏁 Script executed:

# Let's verify how the generic class CallKw differs from the normal call path # by examining what should be on the stack rg -n "emit_load_const.*Tuple|BuildTuple" crates/codegen/src/compile.rs | grep -A2 -B2 "keyword\|kw"

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Check the implementation of compile_keywords to see the correct pattern rg -n "fn compile_keywords" crates/codegen/src/compile.rs -A20

Repository: RustPython/RustPython

Length of output: 1112

🏁 Script executed:

# Look at how non-generic class handles keywords via compile_call_inner sed -n '6160,6175p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 663

🏁 Script executed:

# Verify the emit_load_const pattern used in the generic class case (line 3728) sed -n '3720,3745p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 1017

🏁 Script executed:

# Let's trace through what stack state emerges from the generic class keyword code # Look at the exact sequence of stack operations sed -n '3721,3743p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 931

🏁 Script executed:

# Search for tests or examples of generic classes with keyword arguments rg -n "class.*\[.*\].*:" tests --type py | head -10

Repository: RustPython/RustPython

Length of output: 110

🏁 Script executed:

# Check if there are any Python test files with generic classes fd "test.*\.py" tests --type f | xargs grep -l "class.*\[.*\]" | head -5

Repository: RustPython/RustPython

Length of output: 157

🏁 Script executed:

# Find test directories find . -type d -name "*test*" | head -10

Repository: RustPython/RustPython

Length of output: 136

🏁 Script executed:

# Look for Python files with generics rg -n "TypeVar|class.*\[" --type py | head -20

Repository: RustPython/RustPython

Length of output: 1579

🏁 Script executed:

# Let's check if there are any tests that would exercise generic class definitions rg -n "class.*Generic\|TypeParams" --type py | head -10

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Let's verify the stack state at the generic class CallKw point # by checking if the pattern matches what compile_keywords produces sed -n '6059,6085p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 1154

🏁 Script executed:

# Let's verify the exact stack layout difference # For the generic class case, trace what's on stack before CallKw # Line 3721: first load function # Line 3724: load class name string # Line 3727-3730: load bases # Line 3732: load generic_base # Then the keyword loop at 3734-3742 # Now let's check what compile_keywords produces vs what generic class produces sed -n '6059,6085p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 1154

🏁 Script executed:

# Check the exact difference: does compile_keywords build a map or tuple? sed -n '6079,6088p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 304

🏁 Script executed:

# Look at CallMethodKeyword to see if there's another CallKw pattern rg -n "CallMethodKeyword" crates/codegen/src/compile.rs -B3 -A3

Repository: RustPython/RustPython

Length of output: 446

🏁 Script executed:

# Check the stack effect difference between the two patterns # Look at line 6165 in compile_call_inner sed -n '6163,6167p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 258

🏁 Script executed:

# Let me verify the critical difference in stack patterns # Generic class (line 3734-3742): for each keyword: emit STRING, then compile_expression for value # compile_call_inner (line 6161-6167): builds a TUPLE of all strings, then compiles expressions # Let's confirm this difference by looking at what's built echo "=== Generic class pattern ===" sed -n '3734,3742p' crates/codegen/src/compile.rs echo "" echo "=== compile_call_inner pattern ===" sed -n '6161,6167p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 744

🏁 Script executed:

# Verify if compile_keywords (used by compile_call_inner) creates a tuple # and compile_keywords is called from where rg -n "compile_keywords" crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 215

🏁 Script executed:

# The key question: does the generic class CallKw ever get executed? # Look for any tests or examples that would use generic classes with kwargs rg -n "class.*\[.*\].*:" extra_tests --type py | head -20

Repository: RustPython/RustPython

Length of output: 176

🏁 Script executed:

# Let me trace the exact stack layout for the generic class CallKw case # At line 3734-3742, what's the exact sequence? echo "=== Full generic class keyword handling (lines 3721-3743) ===" sed -n '3721,3743p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 993

🏁 Script executed:

# Now compare with the correct compile_call_inner pattern (lines 6161-6168) echo "=== Correct compile_call_inner CallKw pattern (lines 6161-6168) ===" sed -n '6161,6168p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 452

🏁 Script executed:

# Let's verify the critical difference: # In generic class: individual STRING constants are emitted # In compile_call_inner: a TUPLE of strings is built and emitted as a single constant # Check the exact bytecode produced echo "=== What does emit_load_const produce for each? ===" rg -n "fn emit_load_const" crates/codegen/src/compile.rs -A5

Repository: RustPython/RustPython

Length of output: 387

🏁 Script executed:

# Let me understand the exact stack layout difference # For generic class, we need to see the full picture echo "=== Stack analysis ===" echo "Generic class with kwargs: emits individual STRINGS in loop, then CallKw" echo "compile_call_inner with kwargs: builds TUPLE of strings, emits LoadConst with tuple, then CallKw" echo "" # Let's look at what the VM actually does with CallKw sed -n '2072,2090p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 986

🏁 Script executed:

# The key insight: generic class emits strings individually in a loop # But collect_keyword_args pops a TUPLE first! # This is a stack mismatch! # Let's verify by checking if generic class CallKw is even reachable # Search for any test that would hit the is_generic && !arguments.keywords.is_empty() path rg -n "TypeParam\|generic" extra_tests --type py | head -20

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Search for Python 3.12+ generic class tests fd "*.py" extra_tests | xargs grep -l "class.*\[.*\]:" 2>/dev/null | head -10

Repository: RustPython/RustPython

Length of output: 348

🏁 Script executed:

# Final confirmation: check if there's any comment or note about this being a known issue rg -n "TODO\|FIXME\|BUG\|HACK" crates/codegen/src/compile.rs | grep -A2 -B2 "373[0-5]\|CallKw"

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Verify the nargs calculation in the generic class case one more time echo "=== Generic class nargs calculation ===" sed -n '3739,3746p' crates/codegen/src/compile.rs echo "" echo "=== compile_call_inner nargs calculation ===" sed -n '6170,6172p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 452

🏁 Script executed:

# Let me get the exact nargs value at line 3733-3735 sed -n '3720,3747p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 1130

🏁 Script executed:

# Understand what nargs contains before the keyword handling sed -n '3718,3735p' crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 774

Critical: CallKw nargs ABI violation and stack layout mismatch in generic class handler

The generic class keyword argument handling at lines 3721–3747 contains two critical bugs that contradict the compile_call_inner pattern (lines 6161–6168):

Stack Layout Mismatch: Generic class emits individual STRING constants in a loop for each keyword name, resulting in stack layout [string_1, value_1, string_2, value_2, ...]. The VM's collect_keyword_args (frame.rs:2072–2090) expects a single PyTuple at the top of the stack, and will panic when attempting to downcast the final keyword value (not a tuple) to PyTuple.

nargs ABI Violation: Generic class calculates nargs: nargs + arguments.keywords.len(), including keyword count in nargs. This contradicts the VM contract where nargs represents positional argument count only. The correct pattern in compile_call_inner emits a tuple constant for keyword names and passes only positional count as nargs.

The non-generic class path correctly uses compile_call_inner (line 3777), which follows the proper ABI. Fix the generic class code to build a tuple of keyword name strings (via ConstantData::Tuple) and pass only the positional argument count as nargs, matching the compile_call_inner implementation.

🤖 Prompt for AI Agents

In @crates/codegen/src/compile.rs around lines 1881 - 1885, The generic-class keyword handling is wrong: instead of emitting individual STRING constants for each kw name and increasing nargs with keywords, build a single tuple constant of keyword name strings (use ConstantData::Tuple) and emit that one tuple onto the stack, and emit the call with nargs equal to the number of positional arguments only (do not add keywords to nargs); mirror the pattern in compile_call_inner so the VM's collect_keyword_args receives a PyTuple at the top of the stack and the Call/CallKw ABI is satisfied.

coderabbitai · 2026-01-09T12:59:56Z

crates/compiler-core/src/bytecode.rs

+    Cache = 0, // Placeholder
+    BeforeAsyncWith = 1,
+    BeforeWith = 2,
+    BinaryOpInplaceAddUnicode = 3, // Placeholder
+    BinarySlice = 4,               // Placeholder
+    BinarySubscr = 5,
+    CheckEgMatch = 6,
+    CheckExcMatch = 7,
+    CleanupThrow = 8,
+    DeleteSubscr = 9,
+    EndAsyncFor = 10,
+    EndFor = 11, // Placeholder
+    EndSend = 12,
+    ExitInitCheck = 13, // Placeholder
+    FormatSimple = 14,
+    FormatWithSpec = 15,
+    GetAIter = 16,
+    Reserved = 17,
+    GetANext = 18,
+    GetIter = 19,
+    GetLen = 20,
+    GetYieldFromIter = 21,
+    InterpreterExit = 22,    // Placeholder
+    LoadAssertionError = 23, // Placeholder
+    LoadBuildClass = 24,
+    LoadLocals = 25, // Placeholder
+    MakeFunction = 26,
+    MatchKeys = 27,
+    MatchMapping = 28,
+    MatchSequence = 29,
+    Nop = 30,
+    PopExcept = 31,
+    PopTop = 32,
+    PushExcInfo = 33,
+    PushNull = 34,        // Placeholder
+    ReturnGenerator = 35, // Placeholder
+    ReturnValue = 36,
+    SetupAnnotations = 37,
+    StoreSlice = 38, // Placeholder
+    StoreSubscr = 39,
+    ToBool = 40,
+    UnaryInvert = 41,
+    UnaryNegative = 42,
+    UnaryNot = 43,
+    WithExceptStart = 44,
+    // ==================== With-argument instructions (opcode > 44) ====================
    BinaryOp {
        op: Arg<BinaryOperator>,
-    },
-    // 46: BUILD_CONST_KEY_MAP - not implemented, placeholder
+    } = 45,
    BuildConstKeyMap {
        size: Arg<u32>,
-    },
-    // 47: BUILD_LIST
+    } = 46, // Placeholder
    BuildList {
        size: Arg<u32>,
-    },
-    // 48: BUILD_MAP
+    } = 47,
    BuildMap {
        size: Arg<u32>,
-    },
-    // 49: BUILD_SET
+    } = 48,
    BuildSet {
        size: Arg<u32>,
-    },
-    // 50: BUILD_SLICE
+    } = 49,
    BuildSlice {
        argc: Arg<BuildSliceArgCount>,
-    },
-    // 51: BUILD_STRING
+    } = 50,
    BuildString {
        size: Arg<u32>,
-    },
-    // 52: BUILD_TUPLE
+    } = 51,
    BuildTuple {
        size: Arg<u32>,
-    },
-    // 53: CALL
-    CallFunctionPositional {
+    } = 52,
+    Call {
        nargs: Arg<u32>,
-    },
-    // 54: CALL_FUNCTION_EX
+    } = 53,
    CallFunctionEx {
        has_kwargs: Arg<bool>,
-    },
-    // 55: CALL_INTRINSIC_1
+    } = 54,
    CallIntrinsic1 {
        func: Arg<IntrinsicFunction1>,
-    },
-    // 56: CALL_INTRINSIC_2
+    } = 55,
    CallIntrinsic2 {
        func: Arg<IntrinsicFunction2>,
-    },
-    // 57: CALL_KW
-    CallFunctionKeyword {
+    } = 56,
+    CallKw {
        nargs: Arg<u32>,
-    },
-    // 58: COMPARE_OP
-    CompareOperation {
+    } = 57,
+    CompareOp {
        op: Arg<ComparisonOperator>,
-    },
-    // 59: CONTAINS_OP
-    ContainsOp(Arg<Invert>),
-    // 60: CONVERT_VALUE
+    } = 58,
+    ContainsOp(Arg<Invert>) = 59,
    ConvertValue {
        oparg: Arg<ConvertValueOparg>,
-    },
-    // 61: COPY
+    } = 60,
    CopyItem {
        index: Arg<u32>,
-    },
-    // 62: COPY_FREE_VARS - not implemented, placeholder
+    } = 61,
    CopyFreeVars {
        count: Arg<u32>,
-    },
-    // 63: DELETE_ATTR
+    } = 62, // Placeholder
    DeleteAttr {
        idx: Arg<NameIdx>,
-    },
-    // 64: DELETE_DEREF
-    DeleteDeref(Arg<NameIdx>),
-    // 65: DELETE_FAST
-    DeleteFast(Arg<NameIdx>),
-    // 66: DELETE_GLOBAL
-    DeleteGlobal(Arg<NameIdx>),
-    // 67: DELETE_NAME
-    DeleteLocal(Arg<NameIdx>),
-    // 68: DICT_MERGE - not implemented, placeholder
+    } = 63,
+    DeleteDeref(Arg<NameIdx>) = 64,
+    DeleteFast(Arg<NameIdx>) = 65,
+    DeleteGlobal(Arg<NameIdx>) = 66,
+    DeleteName(Arg<NameIdx>) = 67,
    DictMerge {
        index: Arg<u32>,
-    },
-    // 69: DICT_UPDATE
+    } = 68, // Placeholder
    DictUpdate {
        index: Arg<u32>,
-    },
-    // 70: ENTER_EXECUTOR - not implemented, placeholder
-    EnterExecutor {
-        index: Arg<u32>,
-    },
-    // 71: EXTENDED_ARG
-    ExtendedArg,
-    // 72: FOR_ITER
+    } = 69,
+    EnterExecutor = 70, // Placeholder
+    ExtendedArg = 71,
    ForIter {
        target: Arg<Label>,
-    },
-    // 73: GET_AWAITABLE
-    GetAwaitable,
-    // 74: IMPORT_FROM
+    } = 72,
+    GetAwaitable = 73, // TODO: Make this instruction to hold an oparg
    ImportFrom {
        idx: Arg<NameIdx>,
-    },
-    // 75: IMPORT_NAME
+    } = 74,
    ImportName {
        idx: Arg<NameIdx>,
-    },
-    // 76: IS_OP
-    IsOp(Arg<Invert>),
-    // 77: JUMP_BACKWARD - not implemented, placeholder
+    } = 75,
+    IsOp(Arg<Invert>) = 76,
    JumpBackward {
        target: Arg<Label>,
-    },
-    // 78: JUMP_BACKWARD_NO_INTERRUPT - not implemented, placeholder
+    } = 77, // Placeholder
    JumpBackwardNoInterrupt {
        target: Arg<Label>,
-    },
-    // 79: JUMP_FORWARD - not implemented, placeholder
+    } = 78, // Placeholder
    JumpForward {
        target: Arg<Label>,
-    },
-    // 80: LIST_APPEND
+    } = 79, // Placeholder
    ListAppend {
        i: Arg<u32>,
-    },
-    // 81: LIST_EXTEND - not implemented, placeholder
+    } = 80,
    ListExtend {
        i: Arg<u32>,
-    },
-    // 82: LOAD_ATTR
+    } = 81, // Placeholder
    LoadAttr {
        idx: Arg<NameIdx>,
-    },
-    // 83: LOAD_CONST
+    } = 82,
    LoadConst {
        idx: Arg<u32>,
-    },
-    // 84: LOAD_DEREF
-    LoadDeref(Arg<NameIdx>),
-    // 85: LOAD_FAST
-    LoadFast(Arg<NameIdx>),
-    // 86: LOAD_FAST_AND_CLEAR
-    LoadFastAndClear(Arg<NameIdx>),
-    // 87: LOAD_FAST_CHECK - not implemented, placeholder
-    LoadFastCheck(Arg<NameIdx>),
-    // 88: LOAD_FAST_LOAD_FAST - not implemented, placeholder
+    } = 83,
+    LoadDeref(Arg<NameIdx>) = 84,
+    LoadFast(Arg<NameIdx>) = 85,
+    LoadFastAndClear(Arg<NameIdx>) = 86,
+    LoadFastCheck(Arg<NameIdx>) = 87, // Placeholder
    LoadFastLoadFast {
        arg: Arg<u32>,
-    },
-    // 89: LOAD_FROM_DICT_OR_DEREF - not implemented, placeholder
-    LoadFromDictOrDeref(Arg<NameIdx>),
-    // 90: LOAD_FROM_DICT_OR_GLOBALS - not implemented, placeholder
-    LoadFromDictOrGlobals(Arg<NameIdx>),
-    // 91: LOAD_GLOBAL
-    LoadGlobal(Arg<NameIdx>),
-    // 92: LOAD_NAME
-    LoadNameAny(Arg<NameIdx>),
-    // 93: LOAD_SUPER_ATTR - not implemented, placeholder
+    } = 88, // Placeholder
+    LoadFromDictOrDeref(Arg<NameIdx>) = 89, // Placeholder
+    LoadFromDictOrGlobals(Arg<NameIdx>) = 90, // Placeholder
+    LoadGlobal(Arg<NameIdx>) = 91,
+    LoadName(Arg<NameIdx>) = 92,
    LoadSuperAttr {
        arg: Arg<u32>,
-    },
-    // 94: MAKE_CELL - not implemented, placeholder
-    MakeCell(Arg<NameIdx>),
-    // 95: MAP_ADD
+    } = 93, // Placeholder
+    MakeCell(Arg<NameIdx>) = 94, // Placeholder
    MapAdd {
        i: Arg<u32>,
-    },
-    // 96: MATCH_CLASS
-    MatchClass(Arg<u32>),
-    // 97: POP_JUMP_IF_FALSE
+    } = 95,
+    MatchClass(Arg<u32>) = 96,
    PopJumpIfFalse {
        target: Arg<Label>,
-    },
-    // 98: POP_JUMP_IF_NONE - not implemented, placeholder
+    } = 97,
    PopJumpIfNone {
        target: Arg<Label>,
-    },
-    // 99: POP_JUMP_IF_NOT_NONE - not implemented, placeholder
+    } = 98, // Placeholder
    PopJumpIfNotNone {
        target: Arg<Label>,
-    },
-    // 100: POP_JUMP_IF_TRUE
+    } = 99, // Placeholder
    PopJumpIfTrue {
        target: Arg<Label>,
-    },
-    // 101: RAISE_VARARGS
-    Raise {
+    } = 100,
+    RaiseVarargs {
        kind: Arg<RaiseKind>,
-    },
-    // 102: RERAISE
+    } = 101,
    Reraise {
        depth: Arg<u32>,
-    },
-    // 103: RETURN_CONST
+    } = 102,
    ReturnConst {
        idx: Arg<u32>,
-    },
-    // 104: SEND
+    } = 103,
    Send {
        target: Arg<Label>,
-    },
-    // 105: SET_ADD
+    } = 104,
    SetAdd {
        i: Arg<u32>,
-    },
-    // 106: SET_FUNCTION_ATTRIBUTE
+    } = 105,
    SetFunctionAttribute {
        attr: Arg<MakeFunctionFlags>,
-    },
-    // 107: SET_UPDATE - not implemented, placeholder
+    } = 106,
    SetUpdate {
        i: Arg<u32>,
-    },
-    // 108: STORE_ATTR
+    } = 107, // Placeholder
    StoreAttr {
        idx: Arg<NameIdx>,
-    },
-    // 109: STORE_DEREF
-    StoreDeref(Arg<NameIdx>),
-    // 110: STORE_FAST
-    StoreFast(Arg<NameIdx>),
-    // 111: STORE_FAST_LOAD_FAST
+    } = 108,
+    StoreDeref(Arg<NameIdx>) = 109,
+    StoreFast(Arg<NameIdx>) = 110,
    StoreFastLoadFast {
        store_idx: Arg<NameIdx>,
        load_idx: Arg<NameIdx>,
-    },
-    // 112: STORE_FAST_STORE_FAST - not implemented, placeholder
+    } = 111,
    StoreFastStoreFast {
        arg: Arg<u32>,
-    },
-    // 113: STORE_GLOBAL
-    StoreGlobal(Arg<NameIdx>),
-    // 114: STORE_NAME
-    StoreLocal(Arg<NameIdx>),
-    // 115: SWAP
+    } = 112, // Placeholder
+    StoreGlobal(Arg<NameIdx>) = 113,
+    StoreName(Arg<NameIdx>) = 114,
    Swap {
        index: Arg<u32>,
-    },
-    // 116: UNPACK_EX
+    } = 115,
    UnpackEx {
        args: Arg<UnpackExArgs>,
-    },
-    // 117: UNPACK_SEQUENCE
+    } = 116,
    UnpackSequence {
        size: Arg<u32>,
-    },
-    // 118: YIELD_VALUE
+    } = 117,
    YieldValue {
        arg: Arg<u32>,
-    },
-    // ==================== RustPython-only instructions (119+) ====================
-    // 119: BREAK
+    } = 118,
+    Resume {
+        arg: Arg<u32>,
+    } = 149,
+    // ==================== RustPython-only instructions (119-135) ====================
+    // Ideally, we want to be fully aligned with CPython opcodes, but we still have some leftovers.
+    // So we assign random IDs to these opcodes.
    Break {
        target: Arg<Label>,
-    },
-    // 120: BUILD_LIST_FROM_TUPLES
+    } = 119,
    BuildListFromTuples {
        size: Arg<u32>,
-    },
-    // 121: BUILD_MAP_FOR_CALL
+    } = 120,
    BuildMapForCall {
        size: Arg<u32>,
-    },
-    // 122: BUILD_SET_FROM_TUPLES
+    } = 121,
    BuildSetFromTuples {
        size: Arg<u32>,
-    },
-    // 123: BUILD_TUPLE_FROM_ITER
-    BuildTupleFromIter,
-    // 124: BUILD_TUPLE_FROM_TUPLES
+    } = 122,
+    BuildTupleFromIter = 123,
    BuildTupleFromTuples {
        size: Arg<u32>,
-    },
-    // 125: CALL_METHOD
+    } = 124,
    CallMethodPositional {
        nargs: Arg<u32>,
-    },
-    // 126: CALL_METHOD_KW
+    } = 125,
    CallMethodKeyword {
        nargs: Arg<u32>,
-    },
-    // 127: CALL_METHOD_EX
+    } = 126,
    CallMethodEx {
        has_kwargs: Arg<bool>,
-    },
-    // 128: CONTINUE
+    } = 127,
    Continue {
        target: Arg<Label>,
-    },
-    // 129: JUMP (CPython uses pseudo-op 256)
-    Jump {
-        target: Arg<Label>,
-    },
-    // 130: JUMP_IF_FALSE_OR_POP
+    } = 128,
    JumpIfFalseOrPop {
        target: Arg<Label>,
-    },
-    // 131: JUMP_IF_TRUE_OR_POP
+    } = 129,
    JumpIfTrueOrPop {
        target: Arg<Label>,
-    },
-    // 132: JUMP_IF_NOT_EXC_MATCH
-    JumpIfNotExcMatch(Arg<Label>),
-    // 133: LOAD_CLASSDEREF
-    LoadClassDeref(Arg<NameIdx>),
-    // 134: LOAD_CLOSURE (CPython uses pseudo-op 258)
-    LoadClosure(Arg<NameIdx>),
-    // 135: LOAD_METHOD (CPython uses pseudo-op 259)
-    LoadMethod {
-        idx: Arg<NameIdx>,
-    },
-    // 136: POP_BLOCK (CPython uses pseudo-op 263)
-    PopBlock,
-    // 137: REVERSE
+    } = 130,
+    JumpIfNotExcMatch(Arg<Label>) = 131,
+    LoadClassDeref(Arg<NameIdx>) = 132,
    Reverse {
        amount: Arg<u32>,
-    },
-    // 138: SET_EXC_INFO
-    SetExcInfo,
-    // 139: SUBSCRIPT
-    Subscript,
-    // 140-148: Reserved (padding to keep RESUME at 149)
-    Reserved140,
-    Reserved141,
-    Reserved142,
-    Reserved143,
-    Reserved144,
-    Reserved145,
-    Reserved146,
-    Reserved147,
-    Reserved148,
-    // 149: RESUME
-    Resume {
-        arg: Arg<u32>,
-    },
-    // If you add a new instruction here, be sure to keep LAST_INSTRUCTION updated
+    } = 133,
+    SetExcInfo = 134,
+    Subscript = 135,
+    // ===== Pseudo Opcodes (252+) ======
+    Jump {
+        target: Arg<Label>,
+    } = 252, // CPython uses pseudo-op 256
+    LoadClosure(Arg<NameIdx>) = 253, // CPython uses pseudo-op 258
+    LoadMethod {
+        idx: Arg<NameIdx>,
+    } = 254, // CPython uses pseudo-op 259
+    PopBlock = 255,                  // CPython uses pseudo-op 263
 }


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check placeholder opcode handling in VM and JIT echo "=== Checking VM frame dispatch for placeholder handling ===" rg -n -C3 "BinarySlice|EndFor|LoadLocals|PushNull" crates/vm/src/frame.rs echo "=== Checking JIT instruction handling ===" rg -n -C3 "BinarySlice|EndFor|LoadLocals|PushNull" crates/jit/src/instructions.rs echo "=== Checking compiler emission ===" rg -n -C3 "BinarySlice|EndFor|LoadLocals|PushNull" crates/codegen/src/compile.rs

Repository: RustPython/RustPython

Length of output: 201

🏁 Script executed:

# Also check if there's a TryFrom or match statement that handles these opcodes echo "=== Checking for opcode dispatch mechanisms ===" rg -n "impl.*TryFrom.*Instruction\|match.*opcode\|Instruction::" crates/compiler-core/src/bytecode.rs | head -20

Repository: RustPython/RustPython

Length of output: 112

🏁 Script executed:

# Look for panic or unreachable patterns on instruction handling echo "=== Checking for error handling on unsupported opcodes ===" rg -n -C2 "panic\!|unreachable\!|unimplemented\!" crates/vm/src/frame.rs | grep -A2 -B2 "opcode\|instruction"

Repository: RustPython/RustPython

Length of output: 249

🏁 Script executed:

# Find the actual structure echo "=== Repository structure ===" git ls-files | grep -E "(vm|jit|interpreter)" | head -20 echo "=== Looking for instruction-related files ===" git ls-files | grep -E "instruction|opcode|dispatch|frame" | head -20 echo "=== Check if crates/vm/src/frame.rs exists ===" ls -la crates/vm/src/ 2>/dev/null || echo "crates/vm/src not found" echo "=== List all crates ===" ls -la crates/ | grep "^d"

Repository: RustPython/RustPython

Length of output: 5171

🏁 Script executed:

# Search for instruction enum usage echo "=== Finding where Instruction enum is used ===" rg -l "Instruction::" | head -10 echo "=== Search for match statements on Instruction ===" rg -n "match.*instruction\|match.*opcode" -i | head -20

Repository: RustPython/RustPython

Length of output: 363

🏁 Script executed:

# Check the bytecode.rs file itself for usage patterns echo "=== Check bytecode.rs for impl blocks ===" rg -n "^impl" crates/compiler-core/src/bytecode.rs echo "=== Look for execute or dispatch methods ===" rg -n "fn.*execute\|fn.*dispatch" crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 2145

🏁 Script executed:

# Check the TryFrom implementation for Instruction echo "=== TryFrom<u8> for Instruction (around line 970) ===" sed -n '970,1010p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 1293

🏁 Script executed:

# Check the Instruction impl block at line 1711 echo "=== Instruction impl block (around line 1711) ===" sed -n '1711,1750p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 1360

🏁 Script executed:

# Look for instruction dispatch in frame.rs echo "=== Looking for instruction dispatch in frame.rs ===" rg -n "Instruction::" crates/vm/src/frame.rs | head -30

Repository: RustPython/RustPython

Length of output: 2292

🏁 Script executed:

# Check what happens with unsupported instructions echo "=== Looking for dispatch patterns in frame.rs ===" sed -n '1,100p' crates/vm/src/frame.rs | head -50

Repository: RustPython/RustPython

Length of output: 1863

🏁 Script executed:

# Find the execute_instruction method or similar echo "=== Find main execution loop ===" rg -n "fn.*execute\|fn.*dispatch\|match.*instruction" crates/vm/src/frame.rs | head -15

Repository: RustPython/RustPython

Length of output: 97

🏁 Script executed:

# Look for the large match statement that handles all instructions echo "=== Search for instruction execution pattern ===" rg -n "bytecode::Instruction::" crates/vm/src/frame.rs | wc -l # Find where the instruction execution happens echo "=== Looking for instruction match arms ===" sed -n '550,800p' crates/vm/src/frame.rs | grep -n "Instruction::"

Repository: RustPython/RustPython

Length of output: 2474

🏁 Script executed:

# Get a larger view to see the dispatch pattern echo "=== Finding the execute instruction method ===" rg -n "fn execute\|fn apply\|match.*instruction" crates/vm/src/frame.rs -A5 | head -40

Repository: RustPython/RustPython

Length of output: 111

🏁 Script executed:

# Search for all variants mentioned in the review as placeholders echo "=== Check if placeholder instructions are handled ===" rg -n "BinaryOpInplaceAddUnicode|BinarySlice|EndFor|LoadLocals|PushNull|BuildConstKeyMap" crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 118

🏁 Script executed:

# See the complete instruction match statement structure wc -l crates/vm/src/frame.rs sed -n '560,900p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 14993

🏁 Script executed:

# Let's see if the match statement has a catch-all or default case echo "=== Looking for match statement structure and completeness ===" sed -n '560,2700p' crates/vm/src/frame.rs | tail -200

Repository: RustPython/RustPython

Length of output: 8160

🏁 Script executed:

# Search for the end of the match statement echo "=== Find where the match statement ends ===" rg -n "^ }" crates/vm/src/frame.rs | tail -5

Repository: RustPython/RustPython

Length of output: 185

🏁 Script executed:

# Check if there's a wildcard or default pattern in the match echo "=== Check for default/wildcard patterns in match ===" rg -n "_ =>" crates/vm/src/frame.rs | grep -A2 -B2 "instruction\|opcode"

Repository: RustPython/RustPython

Length of output: 117

🏁 Script executed:

# Get context around line 2651 to see if there's a wildcard pattern echo "=== Context around match closing brace ===" sed -n '2600,2660p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 2614

🏁 Script executed:

# Count all instruction variants in the enum to see if all are handled echo "=== Count Instruction enum variants ===" grep -E "^\s+\w+\s*(\{|=)" crates/compiler-core/src/bytecode.rs | grep -v "impl\|^--" | wc -l echo "=== Count instruction match arms in frame.rs ===" rg "bytecode::Instruction::" crates/vm/src/frame.rs | wc -l

Repository: RustPython/RustPython

Length of output: 161

🏁 Script executed:

# Look for any unreachable or panic patterns related to instructions echo "=== Search for instruction validation ===" rg -n "is_valid\|panic\|unreachable\|match.*instruction" crates/compiler-core/src/bytecode.rs | head -20

Repository: RustPython/RustPython

Length of output: 106

🏁 Script executed:

# Check if the match is marked as exhaustive echo "=== Check the actual match structure ===" sed -n '585,595p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 502

🏁 Script executed:

# This is critical - 437 variants but only 117 match arms # Let's find the complete match statement in frame.rs echo "=== Find the main instruction match statement ===" sed -n '585,2500p' crates/vm/src/frame.rs | grep -n "^ }" | head -1

Repository: RustPython/RustPython

Length of output: 129

🏁 Script executed:

# Let's see what's around line 585 more broadly to see if the match closes echo "=== Checking match statement closure ===" sed -n '1200,1250p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 2885

🏁 Script executed:

# Search for where the big instruction match ends echo "=== Finding match arms count more accurately ===" rg -n "bytecode::Instruction::" crates/vm/src/frame.rs | tail -20

Repository: RustPython/RustPython

Length of output: 1398

🏁 Script executed:

# Check if there's any wildcard or default handling echo "=== Check the last match arms ===" sed -n '1100,1200p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 4826

🏁 Script executed:

# Look for any code that might catch unhandled instructions echo "=== Search for error handling ===" rg -n "unreachable\|panic\|todo\|unimplemented" crates/vm/src/frame.rs | head -20

Repository: RustPython/RustPython

Length of output: 98

🏁 Script executed:

# Let's get the actual size of the match statement by looking more carefully echo "=== Get the full match statement in execute_instruction ===" sed -n '540,1800p' crates/vm/src/frame.rs | tail -50

Repository: RustPython/RustPython

Length of output: 1974

🏁 Script executed:

# Look for where the match statement actually ends echo "=== Search for the end of the instruction match ===" rg -n "^ \}" crates/vm/src/frame.rs | head -10

Repository: RustPython/RustPython

Length of output: 260

🏁 Script executed:

# Check if there's a wildcard pattern at the end echo "=== Last 20 lines of instruction match arms ===" sed -n '1770,1800p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 1252

🏁 Script executed:

# Let's verify by finding all unhandled instruction variants echo "=== All Instruction variants ===" sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep -E "^\s+\w+(\s|{)" | head -60

Repository: RustPython/RustPython

Length of output: 138

🏁 Script executed:

# Check if compiler validation prevents emitting placeholder opcodes echo "=== Search for placeholder validation in codegen ===" rg -n "Placeholder\|BinarySlice\|EndFor" crates/codegen/src/compile.rs | head -20

Repository: RustPython/RustPython

Length of output: 117

🏁 Script executed:

# Check all Instruction variants more carefully echo "=== List instruction variants from bytecode.rs ===" sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep "=" | grep -v "impl\|//" | head -70

Repository: RustPython/RustPython

Length of output: 1544

🏁 Script executed:

# Check codegen to see if it emits placeholder opcodes echo "=== Check if codegen emits these opcodes ===" rg -n "emit\|Instruction::" crates/codegen/src/compile.rs | grep -E "Binary|EndFor|LoadLocals|PushNull" | head -20

Repository: RustPython/RustPython

Length of output: 109

🏁 Script executed:

# Check if there's validation before bytecode generation echo "=== Look for any bytecode validation ===" rg -n "validate\|check.*instruction\|verify" crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 105

🏁 Script executed:

# Verify if the TryFrom implementation actually accepts placeholder discriminants echo "=== Re-check the TryFrom ranges ===" sed -n '970,1010p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 1275

🏁 Script executed:

# Check what the ranges actually are - look at the enum values echo "=== Extract discriminant values for placeholders ===" sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep -E "BinarySlice|EndFor|LoadLocals|PushNull|BinaryOpInplaceAddUnicode"

Repository: RustPython/RustPython

Length of output: 326

🏁 Script executed:

# Confirm the actual enum discriminant values echo "=== Get exact discriminant values from enum ===" sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep -E "= [0-9]+" | head -80

Repository: RustPython/RustPython

Length of output: 2024

🏁 Script executed:

# Check if YieldValue is indeed 118 echo "=== Find YieldValue discriminant ===" rg -n "YieldValue.*=" crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 203

🏁 Script executed:

# Find YieldValue in the enum definition echo "=== Find YieldValue enum variant ===" sed -n '679,958p' crates/compiler-core/src/bytecode.rs | grep -n "YieldValue"

Repository: RustPython/RustPython

Length of output: 122

🏁 Script executed:

# Get context around line 118 to see what's there echo "=== Get all discriminants with their values ===" sed -n '679,960p' crates/compiler-core/src/bytecode.rs | tail -100

Repository: RustPython/RustPython

Length of output: 2580

VM dispatch for placeholder opcodes relies on panic rather than graceful error handling.

Placeholder instruction variants (e.g., BinaryOpInplaceAddUnicode, BinarySlice, EndFor, LoadLocals, PushNull) are defined in the enum and pass TryFrom<u8> validation since their discriminants fall within accepted ranges (0–118, 149, 119–135, 252–255). However, the VM's instruction dispatch ends with a catch-all pattern that calls unreachable!(), which will panic if any unhandled instruction reaches execution.

While this prevents silent failures, it's fragile: if codegen or bytecode loading ever accidentally emits a placeholder opcode, the VM will crash instead of failing gracefully. Consider:

Adding explicit error handling or a warning for placeholders in the catch-all pattern.

Validating at compile time that codegen never emits these opcodes.

Documenting why these opcodes exist in the enum if they're never used.

🤖 Prompt for AI Agents

In @crates/compiler-core/src/bytecode.rs around lines 679 - 958, The VM currently ends bytecode dispatch with a catch-all that calls unreachable!(), so placeholder enum variants in Bytecode (e.g., BinaryOpInplaceAddUnicode, BinarySlice, EndFor, LoadLocals, PushNull) can cause a panic if accidentally executed; change the dispatch default arm to return a descriptive runtime error (or Err) indicating an invalid/placeholder opcode rather than panicking, and add an explicit match arm or helper check (e.g., is_placeholder_opcode) that logs or maps placeholder discriminants to that error; additionally, add a validation step in bytecode loading or TryFrom<u8> conversion for Bytecode to reject placeholder discriminants early (or emit warnings) and document the retained placeholder variants in the Bytecode enum comment to clarify they must never be emitted by codegen.

coderabbitai · 2026-01-09T12:59:56Z

crates/vm/src/frame.rs

+            _ => {
+                unreachable!("{instruction:?} instruction should not be executed")
            }


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's see the context around lines 1782-1784 in frame.rs head -1790 crates/vm/src/frame.rs | tail -20

Repository: RustPython/RustPython

Length of output: 732

🏁 Script executed:

# Find the Instruction enum definition fd -t f "bytecode.rs" | head -5

Repository: RustPython/RustPython

Length of output: 144

🏁 Script executed:

# Let's also check if there's an instruction definition in frame.rs itself rg "enum Instruction" crates/vm/src/ -A 20

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Find and read the Instruction enum definition rg "pub enum Instruction" crates/compiler-core/src/bytecode.rs -A 100

Repository: RustPython/RustPython

Length of output: 2516

🏁 Script executed:

# Get more context around lines 1782-1784 to see the full match statement sed -n '1700,1790p' crates/vm/src/frame.rs | head -100

Repository: RustPython/RustPython

Length of output: 4048

🏁 Script executed:

# Find the beginning of the match statement that contains lines 1782-1784 # Start from line 1782 and go backwards to find the match statement sed -n '1600,1782p' crates/vm/src/frame.rs | tail -200

Repository: RustPython/RustPython

Length of output: 8457

🏁 Script executed:

# Search for the function name that contains this code sed -n '1650,1782p' crates/vm/src/frame.rs | grep -E "fn |match " | head -20

Repository: RustPython/RustPython

Length of output: 119

🏁 Script executed:

# Find the function definition containing this match statement sed -n '1400,1700p' crates/vm/src/frame.rs | grep -B 50 "match.*instruction" | head -60

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Let's search for the execute or dispatch function that contains this match rg "fn execute" crates/vm/src/frame.rs -A 5 | head -40

Repository: RustPython/RustPython

Length of output: 1542

🏁 Script executed:

# Find the beginning of execute_instruction function rg "fn execute_instruction" crates/vm/src/frame.rs -A 500 | head -100

Repository: RustPython/RustPython

Length of output: 4067

🏁 Script executed:

# Count the number of instruction variants and see if the match is exhaustive rg "bytecode::Instruction::" crates/vm/src/frame.rs | wc -l

Repository: RustPython/RustPython

Length of output: 68

🏁 Script executed:

# Check if there are placeholder instructions and how they're handled rg "Placeholder" crates/compiler-core/src/bytecode.rs | head -20

Repository: RustPython/RustPython

Length of output: 788

🏁 Script executed:

# Look for comments around the catch-all unreachable to understand its intent sed -n '1770,1790p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 786

🏁 Script executed:

# Get the full list of instruction variants from the enum rg "^\s+([\w:]+)\s*=" crates/compiler-core/src/bytecode.rs | grep -E "pub enum Instruction" -A 200 | grep "=" | wc -l

Repository: RustPython/RustPython

Length of output: 66

🏁 Script executed:

# Check if Rust compiler would complain about non-exhaustive match # Look for warnings about unhandled variants rg "unreachable" crates/vm/src/frame.rs -B 5 -A 2

Repository: RustPython/RustPython

Length of output: 344

🏁 Script executed:

# Let's verify if the Instruction enum is marked as exhaustive rg "enum Instruction" crates/compiler-core/src/bytecode.rs -B 2 -A 1

Repository: RustPython/RustPython

Length of output: 232

🏁 Script executed:

# Get the complete list of all instruction variants from the enum rg "pub enum Instruction" crates/compiler-core/src/bytecode.rs -A 500 | grep -E "^\s+\w+.*=" | sed 's/.*Instruction:://' | sed 's/\s*=.*//' | sort > /tmp/enum_variants.txt # Get all the instruction variants handled in execute_instruction rg "bytecode::Instruction::" crates/vm/src/frame.rs | sed 's/.*bytecode::Instruction:://' | sed 's/[^a-zA-Z_].*//' | sort | uniq > /tmp/handled_variants.txt # Compare the two echo "=== Variants in enum but NOT handled (should be empty or only placeholders) ===" comm -23 /tmp/enum_variants.txt /tmp/handled_variants.txt echo "" echo "=== Total enum variants ===" wc -l /tmp/enum_variants.txt echo "" echo "=== Total handled variants ===" wc -l /tmp/handled_variants.txt

Repository: RustPython/RustPython

Length of output: 2554

🏁 Script executed:

# Let me get a cleaner list of just the variant names from the enum rg "pub enum Instruction" crates/compiler-core/src/bytecode.rs -A 500 | grep -E "^\s+[A-Z][a-zA-Z0-9]*(\s*{|$|\()" | awk '{print $1}' | sort | uniq > /tmp/enum_clean.txt # Get handled variants more carefully rg "Instruction::[A-Z]" crates/vm/src/frame.rs -o | sed 's/Instruction:://' | sort | uniq > /tmp/handled_clean.txt echo "=== Unhandled instruction variants ===" comm -23 /tmp/enum_clean.txt /tmp/handled_clean.txt echo "" echo "=== Count ===" echo "Enum variants: $(wc -l < /tmp/enum_clean.txt)" echo "Handled variants: $(wc -l < /tmp/handled_clean.txt)"

Repository: RustPython/RustPython

Length of output: 1752

🏁 Script executed:

# Let's also check if there are any placeholder instructions that don't have handlers rg "// Placeholder" crates/compiler-core/src/bytecode.rs | grep -oE "[A-Z][a-zA-Z0-9]*.*Placeholder" | sed 's/ .*//' > /tmp/placeholders.txt echo "=== Placeholder instructions ===" cat /tmp/placeholders.txt

Repository: RustPython/RustPython

Length of output: 379

🏁 Script executed:

# Let me extract instruction handlers more carefully by looking at the actual match arms rg "bytecode::Instruction::" crates/vm/src/frame.rs | head -100 | sed 's/.*Instruction:://' | sed 's/\s*=>.*//' | sed 's/\s*{.*//' | awk '{print $1}' | sort | uniq

Repository: RustPython/RustPython

Length of output: 1286

🏁 Script executed:

# Count how many match arms are in execute_instruction rg "bytecode::Instruction::" crates/vm/src/frame.rs | wc -l

Repository: RustPython/RustPython

Length of output: 68

🏁 Script executed:

# Check if there are explicit handlers for the placeholder instructions rg "Instruction::(Cache|BinaryOpInplaceAddUnicode|BinarySlice|EndFor|ExitInitCheck|InterpreterExit|LoadAssertionError|LoadLocals|PushNull|ReturnGenerator|StoreSlice)" crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Let's verify the total count of instruction variants more accurately # by extracting just the variant names from the enum definition sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs | grep -E '^\s+[A-Z][a-zA-Z0-9]*\s*(\{|=|\()' | wc -l

Repository: RustPython/RustPython

Length of output: 68

🏁 Script executed:

# Check the structure of the enum more carefully sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs | tail -50

Repository: RustPython/RustPython

Length of output: 1280

🏁 Script executed:

# Let's verify if Rust's exhaustiveness checking would work here # by checking if the enum has any special attributes rg "#\[.*\]" crates/compiler-core/src/bytecode.rs | grep -B 5 "pub enum Instruction" | head -20

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Let's get the exact list of unhandled variants by comparing the handled list with enum sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs | grep -E '^\s+[A-Z]' | awk '{print $1}' | sort > /tmp/all_variants.txt # Get all handled variants from execute_instruction rg "bytecode::Instruction::" crates/vm/src/frame.rs | sed 's/.*Instruction:://' | sed 's/\s*=>.*//' | sed 's/\s*{.*//' | sed 's/\s*(.*//' | awk '{print $1}' | sort | uniq > /tmp/handled.txt # Find variants that are not handled echo "=== Variants NOT explicitly handled ===" comm -23 /tmp/all_variants.txt /tmp/handled.txt echo "" echo "=== Are these all placeholders? ===" # Check which of the unhandled are marked as placeholders sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs | grep -B 1 "Placeholder" | grep -E "^\s+[A-Z]" | awk '{print $1}' | sort > /tmp/placeholders_list.txt cat /tmp/placeholders_list.txt

Repository: RustPython/RustPython

Length of output: 1504

🏁 Script executed:

# Let's verify that these "unhandled" variants are actually being rejected # by checking if there's any handling outside of execute_instruction rg "LoadFast|LoadDeref|StoreFast|StoreDeref|DeleteFast" crates/vm/src/frame.rs | head -20

Repository: RustPython/RustPython

Length of output: 544

🏁 Script executed:

# Check if there's pattern matching on the argument or other routing logic # that might handle these variants separately sed -n '1200,1400p' crates/vm/src/frame.rs | grep -E "match|fn " | head -30

Repository: RustPython/RustPython

Length of output: 2371

🏁 Script executed:

# Let's look at the actual match statement more carefully # and search for specific instruction handlers sed -n '1300,1500p' crates/vm/src/frame.rs | grep -A 3 "LoadFast\|StoreFast\|DeleteFast"

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Let's see if the unhandled variants are indeed in the match statement # by checking the line count and structure more carefully rg "bytecode::Instruction::" crates/vm/src/frame.rs | grep -c "=> {"

Repository: RustPython/RustPython

Length of output: 67

🏁 Script executed:

# Get a comprehensive list of what's actually being matched sed -n '/fn execute_instruction/,/^ fn /p' crates/vm/src/frame.rs | grep "bytecode::Instruction::" | sed 's/.*Instruction:://' | sed 's/\s*=>.*//' | sed 's/\s*{.*//' | awk '{print $1}' | sort | uniq | wc -l

Repository: RustPython/RustPython

Length of output: 68

🏁 Script executed:

# Let's verify the exact list of handled variants with a more precise approach sed -n '/fn execute_instruction/,/^ fn [a-z]/p' crates/vm/src/frame.rs | grep "bytecode::Instruction::" | sed 's/.*bytecode::Instruction:://' | sed 's/\s*[=({].*//' | sort | uniq > /tmp/actually_handled.txt cat /tmp/actually_handled.txt | wc -l echo "" echo "=== Sample of handled variants ===" head -20 /tmp/actually_handled.txt

Repository: RustPython/RustPython

Length of output: 366

🏁 Script executed:

# Check if these unhandled variants are truly all placeholders or only used variants comm -23 /tmp/all_variants.txt /tmp/actually_handled.txt | while read var; do grep -q "$var.*Placeholder" <(sed -n '/pub enum Instruction/,/^}/p' crates/compiler-core/src/bytecode.rs) && echo "$var: PLACEHOLDER" || echo "$var: NOT PLACEHOLDER" done

Repository: RustPython/RustPython

Length of output: 4028

🏁 Script executed:

# Let's verify if the key instructions like LoadFast are actually being handled # by checking the actual match arms rg "Instruction::LoadFast\(" crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 118

🏁 Script executed:

# Let's count the total lines of the match statement to understand its scope sed -n '/fn execute_instruction/,/^ fn /p' crates/vm/src/frame.rs | wc -l

Repository: RustPython/RustPython

Length of output: 69

🏁 Script executed:

# Let's check the end of execute_instruction to see the catch-all sed -n '/fn execute_instruction/,/^ fn /p' crates/vm/src/frame.rs | tail -20

Repository: RustPython/RustPython

Length of output: 765

Add compile-time exhaustiveness checking for new instruction variants.

The catch-all pattern with unreachable!() disables Rust's exhaustiveness checking. If new Instruction variants are added to the enum without corresponding match arms here, they will panic at runtime rather than fail at compile time. Consider restructuring to leverage the compiler's exhaustiveness checking—for example, by removing the catch-all if all variants are already handled, or by using a #[non_exhaustive] attribute on the enum to make missing handlers a compile error.

🤖 Prompt for AI Agents

In @crates/vm/src/frame.rs around lines 1782 - 1784, The match currently uses a catch-all arm with unreachable!() for the Instruction enum which defeats compile-time exhaustiveness checking; replace the wildcard arm by explicitly matching every Instruction variant used in this match so the compiler will error if new variants are added (i.e., remove the `_ => unreachable!()` arm in the match on `instruction` and add explicit arms for each `Instruction::<Variant>` handled here), or if the enum is intended to be extensible annotate the enum definition with `#[non_exhaustive]` and then handle the `_` case with a documented conversion/error path; ensure you update the match in the function/method in frame.rs where `instruction` is matched so future additions to `Instruction` cause a compile-time failure instead of a runtime panic.

@ShaharNaveh uuuuh, this is totally right. I immediately broken build by forgetting change arm during merge. listing all is better.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

coderabbitai · 2026-01-09T12:59:56Z

scripts/generate_opcode_metadata.py

+    @property
+    def cpython_name(self) -> str:
+        if self.rust_name == "CopyItem":
+            return "COPY"
+        name = self.rust_name
+        name = re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name)
+        return re.sub(r"(\D)(\d+)$", r"\1_\2", name).upper()
+        return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", self.rust_name).upper()
+        return re.sub(r"(?<!^)(?=[A-Z])", "_", self.rust_name).upper()


⚠️ Potential issue | 🔴 Critical

Fix unreachable code in cpython_name property.

Lines 28-29 are unreachable because line 27 always executes a return statement. This appears to be a copy-paste error during development.

🐛 Remove unreachable code

@property def cpython_name(self) -> str: if self.rust_name == "CopyItem": return "COPY" name = self.rust_name name = re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name) return re.sub(r"(\D)(\d+)$", r"\1_\2", name).upper() - return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", self.rust_name).upper() - return re.sub(r"(?<!^)(?=[A-Z])", "_", self.rust_name).upper()

🤖 Prompt for AI Agents

In @scripts/generate_opcode_metadata.py around lines 21 - 29, The cpython_name property contains multiple return statements after the intended two-step transformation, making the last two returns unreachable; edit the cpython_name method to keep the special-case for "CopyItem" and the intended conversion (assign rust_name to name, apply the two regex substitutions and return name.upper()), and remove the extraneous return lines (the duplicate re.sub returns) so only the correct transformation and single return remain in the cpython_name property.

youknowone

👍

coderabbitai bot reviewed Jan 3, 2026

View reviewed changes

ShaharNaveh force-pushed the bytecode-opcode-id branch from 153a03a to 5369dab Compare January 6, 2026 08:22

ShaharNaveh marked this pull request as draft January 6, 2026 08:22

coderabbitai bot reviewed Jan 6, 2026

View reviewed changes

ShaharNaveh force-pushed the bytecode-opcode-id branch from 5369dab to 0701649 Compare January 6, 2026 11:26

ShaharNaveh added 6 commits January 6, 2026 16:11

All opcodes have id

5e4f61f

Fix names

2b6b0bd

Fix frame.rs

6b96d0e

Fix codegen

2a6672f

re add GetYieldFromIter

c5eab6a

Fix frame.rs

c56c91f

ShaharNaveh force-pushed the bytecode-opcode-id branch from 94ad7d5 to c56c91f Compare January 6, 2026 14:11

ShaharNaveh added 4 commits January 6, 2026 16:14

Fix opcode.rs

04f6d42

Fix TryFrom<u8> for Instruction

a7a0eda

Mark failing tests

1646643

fix jit

e166fda

youknowone reviewed Jan 6, 2026

View reviewed changes

ShaharNaveh added 10 commits January 7, 2026 10:56

Fix more jit

6df35bf

Merge remote-tracking branch 'upstream/main' into bytecode-opcode-id

2ab530e

Merge remote-tracking branch 'upstream/main' into bytecode-opcode-id

04139e4

VarArgs -> Varargs

b222c80

Refactor scripts/generate_opcode_metadata.py

0006700

Add placeholder opcodes

89c20f6

Fix

6cdf6f5

Don't crash on placejolder bytecode

8e1433c

Unmark passing test

9e6cbed

Mark failing test

258f61c

ShaharNaveh commented Jan 9, 2026

View reviewed changes

ShaharNaveh marked this pull request as ready for review January 9, 2026 12:50

ShaharNaveh requested a review from youknowone January 9, 2026 12:50

coderabbitai bot reviewed Jan 9, 2026

View reviewed changes

youknowone approved these changes Jan 9, 2026

View reviewed changes

youknowone merged commit c2bfdf3 into RustPython:main Jan 9, 2026
13 checks passed

This was referenced Jan 10, 2026

Remove Reverse bytecode #6675

Merged

Move Instruction enum to its own file #6693

Merged

Super instructions #6694

Merged

Move OpArg to its own file #6703

Merged

terryluan12 pushed a commit to terryluan12/RustPython that referenced this pull request Jan 15, 2026

Assign opcode ids (RustPython#6637)

b9a1554

Assign opcode ids #6637

Assign opcode ids #6637

Uh oh!

Conversation

ShaharNaveh commented Jan 3, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

youknowone commented Jan 3, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ShaharNaveh Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

ShaharNaveh commented Jan 3, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 3, 2026 •

edited

Loading

ShaharNaveh Jan 9, 2026 •

edited

Loading

coderabbitai bot Jan 9, 2026 •

edited

Loading