Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Jan 3, 2026

Fix #6555

And also generate opcode metadata

These instructions are newly introduced in Python 3.11 (exception) and 3.12 (with)

Summary by CodeRabbit

  • New Features

    • Added proper exception handling and error detection for async operations (async for, async with, async comprehensions outside async functions).
    • Optimized comprehension execution through inlining support.
    • Enhanced async generator finalization.
  • Bug Fixes

    • Improved exception chain preservation during generator and coroutine execution.
    • Fixed exception context handling across yields and awaits.
  • Chores

    • Updated internal bytecode structure for CPython 3.13 compatibility.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 3, 2026

📝 Walkthrough

Walkthrough

Implements a comprehensive Python async/exception handling system with CPython 3.13 bytecode alignment: exception table generation via varint encoding, frame exception-table-driven unwinding, async error variants, PEP 709 comprehension inlining, async generator destructors, and expanded 1-byte-aligned instruction set.

Changes

Cohort / File(s) Summary
Exception Table & Varint Infrastructure
crates/compiler-core/src/bytecode.rs, crates/compiler-core/src/varint.rs, crates/compiler-core/src/lib.rs
Added public ExceptionTableEntry, encode_exception_table(), find_exception_handler() for exception handling; new varint.rs with variable-length encoding (write_varint, read_varint, signed variants, start-marker support); exposed varint module publicly.
Codegen Exception Handler Wiring
crates/codegen/src/ir.rs, crates/codegen/src/error.rs
Added ExceptHandlerInfo struct with handler block/stack depth/lasti preservation tracking; integrated exception table generation into CodeInfo.finalize_code(); added three new error variants (InvalidAsyncFor, InvalidAsyncWith, InvalidAsyncComprehension).
Comprehension Inlining (PEP 709)
crates/codegen/src/symboltable.rs
Added comp_inlined field to SymbolTable; new merging step to propagate inlined comprehension bindings from child to parent scope; differentiation between generator vs non-generator comprehensions for inlining eligibility.
Instruction Set & Bytecode Expansion
crates/compiler-core/src/bytecode.rs
Expanded Instruction enum with CPython 3.13 alignment: 60+ no-argument variants (Cache, BeforeAsyncWith, LoadConst, etc.) and 80+ with-argument variants (BinaryOp, BuildList, CallFunctionPositional, JumpBackward, etc.); updated RaiseKind enum with BareRaise, Raise, RaiseCause, ReraiseFromStack.
Frame Exception Unwinding Redesign
crates/vm/src/frame.rs
Removed internal block-tracking structures (Block, BlockType, blocks field); replaced with exception-table-driven unwinding via find_exception_handler(); added reraise detection to skip traceback/context updates; enhanced yield-from/await logic; added LoadFastAndClear for inlined comprehensions; new exception state (PUSH_EXC_INFO) management.
VM Exception Context & Coroutine Handling
crates/vm/src/vm/mod.rs, crates/vm/src/coroutine.rs
Added frame-level exception context push/pop for isolation in with_frame; removed take_exception() method; enhanced contextualize_exception() with cycle-detection traversal; refactored run_with_context to swap/restore generator exception state across yields; added early-exit for generator close when not started.
Async Generator Finalization
crates/vm/src/builtins/asyncgenerator.rs
Imported and declared Destructor in PyAsyncGen class; implemented Destructor trait with del() method to trigger finalizer on drop if not closed.
Opcode & Code Object Utilities
crates/stdlib/src/opcode.rs, crates/vm/src/builtins/code.rs, crates/vm/src/version.rs
Converted opcode metadata methods (is_valid, has_arg, has_const, etc.) from const fn to runtime validation via Instruction::try_from() pattern matching; refactored _varname_from_oparg_ to multi-source lookup (varnames → cellvars → freevars); updated PYC_MAGIC_NUMBER from 3531 to 2997.
Bytecode Instruction Handling
crates/jit/src/instructions.rs
Adjusted block termination logic to inspect last instruction before jumping; expanded reachability signaling to treat Jump as unreachable code (alongside ReturnValue/ReturnConst); refined trap insertion for truly unterminated blocks; removed explicit SetupLoop handling.
Opcode Metadata Generation
scripts/generate_opcode_metadata.py
New script to generate CPython 3.13-compatible opcode metadata: parses RustPython Instruction enum, maps variant names to CPython names via lookup table, generates Lib/_opcode_metadata.py with opcode name-to-index mapping and compatibility constants.
Spell-Check Dictionary
.cspell.dict/cpython.txt
Added token CLASSDEREF to spell-check dictionary.

Sequence Diagram

sequenceDiagram
    participant Codegen as Codegen<br/>(ir.rs)
    participant ExcTable as Exception<br/>Table Gen
    participant Bytecode as Bytecode<br/>(bytecode.rs)
    participant Frame as Frame<br/>(frame.rs)
    participant Handler as Handler<br/>Block
    
    Codegen->>Codegen: Collect exception handlers<br/>per instruction
    Codegen->>Codegen: Build ExceptHandlerInfo<br/>(block, depth, lasti)
    Codegen->>ExcTable: finalize_code() calls<br/>generate_exception_table()
    ExcTable->>ExcTable: Encode entries via<br/>varint encoding
    ExcTable->>Bytecode: Return serialized<br/>exception table
    Bytecode->>Bytecode: Create CodeObject<br/>with exception table
    
    Note over Frame,Handler: Runtime Exception Handling
    
    Frame->>Frame: Execute instruction<br/>raises exception
    Frame->>Bytecode: find_exception_handler()<br/>(exception_table, offset)
    Bytecode->>Bytecode: Decode & match<br/>varint entries
    Bytecode-->>Frame: Return matching handler<br/>(target, depth, push_lasti)
    Frame->>Frame: Push exception info<br/>to stack
    Frame->>Handler: Jump to handler block
    Handler->>Handler: Execute exception<br/>handler logic
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • code object linetable #6150 — Introduces exception-table and linetable fields in code objects with varint utilities that this PR directly completes and integrates into codegen/bytecode generation.
  • sys.set_asyncgen_hook #6439 — Adds async-generator finalization hook infrastructure (call_finalizer) that this PR extends by implementing the Destructor trait-based del() path for proper cleanup.
  • Implement more except* #6545 — Implements exception state handling (SetExcInfo, except*) in frame/bytecode that overlaps with this PR's frame unwinding rewrite and exception-table-driven exception routing.

Suggested reviewers

  • ShaharNaveh
  • coolreader18

Poem

🐰 Hopping through exception tables fine,
With varint encoding in a line,
Async handlers now take their place,
Frame unwinding at a faster pace—
Comprehensions inlined just right,
CPython 3.13 shines bright!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'Rework compiler about exception handling' accurately describes the main objective, which is a comprehensive rework of the compiler's exception handling system across multiple modules including error types, IR structures, symbol tables, bytecode, and frame execution logic.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone mentioned this pull request Jan 3, 2026
@youknowone youknowone requested a review from ShaharNaveh January 3, 2026 13:28
@youknowone youknowone force-pushed the exceptions branch 4 times, most recently from 1ea292b to d95ffa4 Compare January 3, 2026 15:16
@youknowone youknowone marked this pull request as ready for review January 3, 2026 15:39
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (8)
scripts/generate_opcode_metadata.py (2)

14-28: Regex-based Instruction extraction is convenient but somewhat brittle

The re.search(r"pub enum Instruction \{(.+?)\n\}", ...) + line‑based variant extraction will work as long as Instruction stays as a simple enum with each variant starting with a capitalized identifier and the closing brace on its own line.

If Instruction ever grows nested braces, attributes/macros around variants, or a different closing‑brace layout, this may fail silently or mis‑parse.

You might consider:

  • tightening the regex to anchor on ^pub enum Instruction and ^} with re.MULTILINE, or
  • using a tiny Rust/syn‑based helper (or a more structural Python parser) if this script becomes part of regular tooling.

188-246: Deriving metadata constants from data instead of hardcoding would reduce drift risk

HAVE_ARGUMENT = 44 and MIN_INSTRUMENTED_OPCODE = 236 are currently hardcoded to CPython 3.13 expectations. That’s fine now, but it’s easy for them to fall out of sync with future Instruction layout changes.

Consider deriving them from opmap instead, e.g.:

  • compute HAVE_ARGUMENT as the smallest opcode index whose CPython name is known to require an argument (e.g. WITH_EXCEPT_START), or at least assert that opmap['WITH_EXCEPT_START'] == 44, and
  • compute MIN_INSTRUMENTED_OPCODE from the minimum index of the relevant instrumented opcodes, or assert they match expected constants.

Even a simple assert using find_opcode(...) against expected values would catch mismatches early.

crates/vm/src/frame.rs (1)

2741-2759: pop_multiple now panics loudly on stack underflow – acceptable as an internal invariant check

The added diagnostic panic in pop_multiple (with opcode name and source metadata) is a good guardrail against internal bytecode/VM invariants being violated and should help debug compiler bugs quickly. It does mean a bad bytecode stream will crash the interpreter rather than raising a Python SystemError, but that’s probably acceptable given this is “should never happen” territory.

If you ever want to make this failure mode more user‑friendly, you could convert the panic into a fatal() call that raises a SystemError (or at least a clearly tagged internal error) instead of aborting the process, but that’s not strictly necessary.

crates/stdlib/src/opcode.rs (1)

55-123: Minor redundancy in validation pattern.

Each has_* function calls is_valid() then immediately does Instruction::try_from() again. Since is_valid() already performs the conversion, you could consider caching the result or combining the checks. However, this is a minor efficiency concern and the code is functionally correct.

🔎 Optional: Simplify by reusing conversion result
-        pub fn has_const(opcode: i32) -> bool {
-            if !Self::is_valid(opcode) {
-                return false;
-            }
-            if let Ok(instr) = Instruction::try_from(opcode as u8) {
+        pub fn has_const(opcode: i32) -> bool {
+            if !(0..=255).contains(&opcode) {
+                return false;
+            }
+            if let Ok(instr) = Instruction::try_from(opcode as u8) {
                 matches!(
                     instr,
                     Instruction::LoadConst { .. } | Instruction::ReturnConst { .. }
                 )
             } else {
                 false
             }
         }
crates/compiler-core/src/varint.rs (1)

100-138: Tests cover basic cases but could be more thorough.

The tests verify encoding size and round-trip behavior, which is good. Consider adding value verification for the basic write_varint/read_varint functions to ensure the decoded values match the encoded ones.

🔎 Optional: Add value verification to tests
 #[test]
 fn test_write_read_varint() {
     let mut buf = Vec::new();
     write_varint(&mut buf, 0);
     write_varint(&mut buf, 63);
     write_varint(&mut buf, 64);
     write_varint(&mut buf, 4095);

     // Values: 0, 63, 64, 4095
     assert_eq!(buf.len(), 1 + 1 + 2 + 2);
+
+    // Verify round-trip
+    let mut pos = 0;
+    assert_eq!(read_varint(&buf, &mut pos), Some(0));
+    assert_eq!(read_varint(&buf, &mut pos), Some(63));
+    assert_eq!(read_varint(&buf, &mut pos), Some(64));
+    assert_eq!(read_varint(&buf, &mut pos), Some(4095));
 }
crates/compiler-core/src/bytecode.rs (3)

61-85: Consider early termination optimization for find_exception_handler.

The current implementation scans all entries even after finding a match. If entries are sorted by start offset (which is typical), you could add an early termination when start > offset to avoid scanning remaining entries.

🔎 Optional optimization
 pub fn find_exception_handler(table: &[u8], offset: u32) -> Option<ExceptionTableEntry> {
     let mut pos = 0;
     while pos < table.len() {
         let start = read_varint_with_start(table, &mut pos)?;
         let size = read_varint(table, &mut pos)?;
         let target = read_varint(table, &mut pos)?;
         let depth_lasti = read_varint(table, &mut pos)?;
 
         let end = start + size;
         let depth = (depth_lasti >> 1) as u16;
         let push_lasti = (depth_lasti & 1) != 0;
 
         if offset >= start && offset < end {
             return Some(ExceptionTableEntry {
                 start,
                 end,
                 target,
                 depth,
                 push_lasti,
             });
         }
+        // Early termination if entries are sorted by start offset
+        // if start > offset {
+        //     break;
+        // }
     }
     None
 }

2063-2069: Reasonable defensive default for BuildSlice stack effect.

Using try_get with a fallback prevents panics on invalid args. However, consider logging when the fallback is used to help identify potential compiler bugs that generate invalid arguments.


2366-2436: Good test coverage for exception table encoding.

The tests cover basic roundtrip, empty table edge case, and boundary conditions. Consider adding a test with large offset/depth values to exercise varint encoding edge cases.

🔎 Optional: Add test for large values
#[test]
fn test_exception_table_large_values() {
    // Test varint encoding with values > 63 (requires multi-byte encoding)
    let entries = vec![ExceptionTableEntry::new(1000, 2000, 5000, 255, true)];
    let encoded = encode_exception_table(&entries);
    
    let handler = find_exception_handler(&encoded, 1500);
    assert!(handler.is_some());
    let handler = handler.unwrap();
    assert_eq!(handler.start, 1000);
    assert_eq!(handler.end, 2000);
    assert_eq!(handler.target, 5000);
    assert_eq!(handler.depth, 255);
    assert!(handler.push_lasti);
}
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f0526b9 and 9bf4fd4.

⛔ Files ignored due to path filters (15)
  • Lib/_opcode_metadata.py is excluded by !Lib/**
  • Lib/importlib/_bootstrap_external.py is excluded by !Lib/**
  • Lib/test/test__opcode.py is excluded by !Lib/**
  • Lib/test/test_asyncgen.py is excluded by !Lib/**
  • Lib/test/test_compile.py is excluded by !Lib/**
  • Lib/test/test_contextlib.py is excluded by !Lib/**
  • Lib/test/test_contextlib_async.py is excluded by !Lib/**
  • Lib/test/test_dis.py is excluded by !Lib/**
  • Lib/test/test_exceptions.py is excluded by !Lib/**
  • Lib/test/test_generators.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_util.py is excluded by !Lib/**
  • Lib/test/test_patma.py is excluded by !Lib/**
  • Lib/test/test_traceback.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_compiler_core__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (17)
  • .cspell.dict/cpython.txt
  • crates/codegen/src/compile.rs
  • crates/codegen/src/error.rs
  • crates/codegen/src/ir.rs
  • crates/codegen/src/symboltable.rs
  • crates/compiler-core/src/bytecode.rs
  • crates/compiler-core/src/lib.rs
  • crates/compiler-core/src/varint.rs
  • crates/jit/src/instructions.rs
  • crates/stdlib/src/opcode.rs
  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/coroutine.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/version.rs
  • crates/vm/src/vm/mod.rs
  • scripts/generate_opcode_metadata.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/version.rs
  • crates/codegen/src/error.rs
  • crates/stdlib/src/opcode.rs
  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/vm/mod.rs
  • crates/codegen/src/symboltable.rs
  • crates/codegen/src/ir.rs
  • crates/compiler-core/src/varint.rs
  • crates/compiler-core/src/lib.rs
  • crates/vm/src/coroutine.rs
  • crates/jit/src/instructions.rs
  • crates/vm/src/frame.rs
  • crates/compiler-core/src/bytecode.rs
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • scripts/generate_opcode_metadata.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: RustPython is a Python 3 interpreter written in Rust implementing Python 3.13.0+ compatibility with complete Python environment in Rust (not CPython bindings), clean implementation without compatibility hacks, and cross-platform support including WebAssembly compilation
📚 Learning: 2025-12-09T08:46:58.660Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6358
File: crates/vm/src/exception_group.rs:173-185
Timestamp: 2025-12-09T08:46:58.660Z
Learning: In crates/vm/src/exception_group.rs, the derive() method intentionally always creates a BaseExceptionGroup instance rather than preserving the original exception class type. This is a deliberate design decision that differs from CPython's behavior.

Applied to files:

  • crates/vm/src/vm/mod.rs
  • crates/codegen/src/ir.rs
  • crates/vm/src/coroutine.rs
  • crates/vm/src/frame.rs
  • crates/compiler-core/src/bytecode.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When modifying bytecode instructions, perform a full clean build by running `rm -r target/debug/build/rustpython-* && find . | grep -E '\.pyc$' | xargs rm -r`

Applied to files:

  • scripts/generate_opcode_metadata.py
🧬 Code graph analysis (3)
crates/stdlib/src/opcode.rs (1)
crates/compiler-core/src/bytecode.rs (3)
  • try_from (1131-1137)
  • try_from (1158-1163)
  • try_from (1172-1178)
crates/compiler-core/src/varint.rs (1)
crates/vm/src/builtins/code.rs (1)
  • read_varint (1078-1096)
crates/vm/src/coroutine.rs (6)
crates/stdlib/src/ssl.rs (1)
  • result (2652-2652)
crates/stdlib/src/csv.rs (1)
  • result (689-708)
crates/stdlib/src/openssl.rs (1)
  • result (713-713)
crates/vm/src/stdlib/io.rs (1)
  • vm (3791-3793)
crates/vm/src/frame.rs (1)
  • func (2389-2390)
crates/vm/src/stdlib/functools.rs (2)
  • func (62-64)
  • func (223-223)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run tests under miri
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (38)
.cspell.dict/cpython.txt (1)

17-17: Add CLASSDEREF spelling entry – looks correct

Including CLASSDEREF here keeps the dictionary in sync with the new LOAD_CLASSDEREF/CLASSDEREF usage and avoids false positives from the spell checker.

crates/vm/src/version.rs (1)

131-140: Confirm PYC magic number alignment with upstream CPython

PYC_MAGIC_NUMBER drives .pyc compatibility; changing it to 2997 is fine as long as it matches the CPython version you intend to emulate (per Lib/importlib/_bootstrap_external.py). Please double‑check against the exact CPython commit / release you’re targeting so that cross‑runtime .pyc files behave as expected.

crates/vm/src/builtins/code.rs (1)

1003-1027: Updated _varname_from_oparg correctly handles locals + cell/free vars

The new logic that interprets opcode as an index into varnames, then cellvars, then freevars (with explicit bounds checks) matches CPython’s combined-local indexing model for debug/introspection helpers. This should fix name resolution for cell/free-variable opargs in tracebacks and tooling.

Please ensure existing tests that exercise _varname_from_oparg (e.g. for closures, class scopes, comprehensions) still pass and consider adding a small focused test covering each region (varnames / cellvars / freevars) if not already present.

crates/jit/src/instructions.rs (1)

171-228: Block termination & unreachable tracking changes look sound; keep an eye on Cranelift expectations

The updated logic to:

  • add a fallthrough jump only when the current block is non‑empty and its last instruction is not already a terminator, and
  • mark in_unreachable_code for Jump in addition to ReturnValue / ReturnConst, and
  • emit a final trap only when the last block truly lacks a terminator

is consistent with Cranelift’s requirement that every block end in a terminator and avoids double‑terminator issues. It should also prevent generating dead instructions after unconditional branches, while still allowing re‑entry at label targets.

It would be good to exercise this via JIT tests that include:

  • an unconditional JUMP followed by non‑labelled bytecode (to confirm it’s dropped),
  • a label that’s both the entry point and a jump target, and
  • functions whose last instruction is and isn’t a terminator, to ensure traps are only inserted in the latter case.
crates/vm/src/builtins/asyncgenerator.rs (1)

76-87: Async generator finalization via Destructor looks correct; consider coverage for edge cases

Hooking async generator cleanup through Destructor::del and reusing call_finalizer guarded by inner.closed matches CPython’s “async gen finalizer” semantics:

  • finalizer is only invoked when a hook was installed and the generator isn’t already closed,
  • errors from the finalizer are intentionally ignored.

To be safe, please verify with tests that:

  • async generators with a registered finalizer run it exactly once on GC when not exhausted,
  • exhausted/closed async generators do not trigger the finalizer in del, and
  • finalizer exceptions do not leak but also don’t break interpreter state.

Also applies to: 647-657

crates/vm/src/frame.rs (5)

324-425: Exception handling path (run + handle_exception + unwind_blocks) is coherent with exception-table design

The new flow:

  • increments lasti before executing,
  • on error, computes idx for the faulting instruction,
  • adds a traceback frame and contextualizes the exception only for non‑reraise operations,
  • uses the exception table (find_exception_handler) to locate handlers and manipulate the stack (depth/push_lasti) before jumping, and
  • restores frame.lasti from the traceback when the exception escapes the frame

matches CPython’s zero‑cost exception model and keeps f_lineno consistent even after cleanup code has run.

Given how subtle this is, it’s worth re‑running the CPython test suites that stress:

  • nested try/except/finally with raise, raise from, bare raise, and RERAISE,
  • correct __traceback__ chains and __context__ behavior, and
  • traceback line numbers / frame.f_lineno alignment, especially after exceptions propagate out of cleanup blocks.

427-532: yield_from_target + gen_throw forwarding logic looks right but is subtle

The new yield_from_target heuristic (checking for SEND or RESUME after a YIELD_VALUE with non‑zero arg and using TOS as the delegate) plus gen_throw’s:

  • delegation to the sub‑iterator’s throw (or Coro::throw) when present, and
  • fallback to pushing None and re‑entering via unwind_blocks/exception table so CLEANUP_THROW can run

appear consistent with CPython’s modern yield from / SEND / CLEANUP_THROW protocol.

Because this interacts with both sync and async generators, I’d strongly suggest ensuring you have tests for:

  • throw into a generator currently delegating with yield from, including when the sub‑iterator swallows the exception, raises StopIteration (with a value), or raises another exception, and
  • that async for and async generators don’t regress in their handling of throw / aclose() in the presence of delegation.

969-1026: GetAwaitable + coroutine “already awaited” detection relies on yield_from_target – tests should cover it

The change to GetAwaitable that rejects PyCoroutine objects whose frame’s yield_from_target() is Some(_) is a reasonable analogue to CPython’s “coroutine is being awaited already” check.

This is subtle and easy to get wrong; please confirm you have (or add) tests for:

  • awaiting the same coroutine twice (must raise the appropriate RuntimeError),
  • awaiting a coroutine that has completed (should be allowed to re‑iterate via its result, matching CPython semantics), and
  • nested await / yield from combinations where the coroutine delegates further (to ensure yield_from_target returns Some exactly in the “already awaited” cases).

1505-1764: New exception opcodes (PushExcInfo, PopException, CheckExcMatch, Reraise, CleanupThrow) appear consistent with CPython’s stack protocol

The wiring for:

  • PushExcInfo and PopException maintaining vm.current_exception() and stacking (prev_exc, exc),
  • CheckExcMatch operating on [exc, type][exc, bool],
  • Reraise { depth } popping the saved values and re‑raising either the TOS exception or the current one, and
  • CleanupThrow implementing the SEND/END_SEND cleanup contract for StopIteration vs other exceptions

matches the documented CPython stack discipline for these opcodes.

It would be useful to run tests that cover:

  • except* blocks (including RAISE_STAR / PREP_RERAISE_STAR) to ensure exception state and chaining behave correctly,
  • yield‑from / async‑for error paths that exercise CleanupThrow, and
  • nested exception handlers where PushExcInfo/PopException are used multiple times to save/restore sys.exc_info().

1921-2005: unwind_blocks now delegates only Raising to the exception table; Break/Continue use direct jumps

For UnwindReason::Raising, using find_exception_handler with (lasti - 1) as the faulting offset, trimming the value stack to entry.depth, optionally pushing lasti, then pushing the exception and jumping to entry.target is in line with the exception‑table model (see find_exception_handler in crates/compiler-core/src/bytecode.rs). For Returning and Break/Continue, directly returning or jumping is straightforward, and the fast‑locals traceback‑cleanup on Returning is a nice mitigation for reference cycles in the absence of a tracing GC.

Given that break/continue cleanup is now assumed to be fully encoded in the bytecode/exception table rather than via an explicit block stack, please verify with tests that:

  • break/continue inside try/finally, with, and async with execute the expected cleanup/finally code, and
  • returns from inside except/finally blocks both preserve or clear exception tracebacks as intended and do not leak cycles that were previously broken by explicit cleanup code.
crates/compiler-core/src/lib.rs (1)

10-10: LGTM!

Making the varint module public is necessary to allow other crates (like codegen) to use the encoding utilities for exception table generation.

crates/vm/src/vm/mod.rs (2)

523-529: LGTM!

The symmetric push/pop of exception context around frame execution ensures proper isolation. Each frame starts fresh (None), and the caller's exception state is restored after the frame completes.


852-871: LGTM!

The Floyd's cycle detection algorithm correctly handles potential circular references in the exception context chain. The tortoise-and-hare approach (slow_o advances every other iteration via slow_update_toggle) efficiently detects cycles while traversing the chain.

crates/vm/src/coroutine.rs (2)

84-106: LGTM!

The exception state management correctly integrates with the new with_frame isolation pattern. The pop/push swap on lines 99-100 replaces the generic None context with the generator's saved exception state, maintaining the proper chain for coroutine semantics. Saving the exception state on line 104 before the closure returns ensures the generator's context is preserved across yields.


174-178: LGTM!

Early return for not-started generators (lasti == 0) is correct. There's no need to throw GeneratorExit into a frame that has never executed—simply marking it closed is sufficient and more efficient.

crates/codegen/src/error.rs (2)

79-81: LGTM!

The new async-related error variants complete the error handling for async constructs used outside their valid contexts.


119-126: LGTM!

The error messages match CPython's wording, ensuring consistent user experience across interpreters.

crates/codegen/src/symboltable.rs (5)

58-62: LGTM!

The comp_inlined field properly supports PEP 709 comprehension inlining with clear documentation.


352-381: LGTM!

The inlined comprehension symbol merging correctly:

  1. Uses BOUND_FLAGS to identify symbols actually bound in the comprehension
  2. Skips the internal .0 iterator parameter
  3. Avoids overwriting existing parent symbols (line 373 check)
  4. Marks merged symbols as Local in the parent scope

This properly implements PEP 709 semantics where inlined comprehension bindings become locals in the enclosing scope.


1280-1281: LGTM!

Generator expressions correctly pass is_generator=true since they cannot be inlined (lazy evaluation semantics must be preserved).


1294-1295: LGTM!

List comprehensions correctly pass is_generator=false, enabling potential inlining.


1494-1516: LGTM!

The inlining decision logic correctly follows CPython's rules:

  • Generator expressions cannot be inlined
  • Comprehensions in class scope context cannot be inlined (due to scoping complexities)

The saturating_sub(2) safely handles edge cases when there aren't enough parent tables.

crates/stdlib/src/opcode.rs (3)

40-53: LGTM!

The validity check now properly validates against the actual Instruction enum, and has_arg correctly gates on validity before checking the threshold.


125-163: LGTM!

The instruction categorization for free variables (closure cells) and local variables is correct and comprehensive.


165-171: LGTM!

Correctly returns false since RustPython now uses exception tables (CPython 3.11+ style) rather than embedding exception info in instructions.

crates/compiler-core/src/varint.rs (3)

6-30: LGTM!

The varint encoding correctly implements:

  1. 6-bit chunks with 0x40 continuation bit (CPython-compatible)
  2. Zigzag encoding for signed integers with proper INT_MIN handling via wrapping_sub

32-41: LGTM!

The start marker (0x80) enables parsing exception table entries sequentially. Safe access via get_mut handles edge cases.


43-98: Shift logic is consistent between read_varint_with_start and read_varint.

Both functions extract 6-bit chunks at the correct bit positions: the first byte contributes bits 0–5, subsequent bytes contribute bits 6–11, 12–17, and so on. The difference in initialization (shift starting at 6 vs. 0) is correct because read_varint_with_start pre-extracts the first byte's 6 bits into val before the loop, whereas read_varint uses the loop from the start.

The condition on line 87 (byte & 0x80 != 0 && shift > 0) correctly prevents breaking on the first byte and only stops when encountering a start-marked entry (set by write_varint_with_start). The included tests (test_varint_with_start, etc.) validate round-trip correctness.

crates/codegen/src/ir.rs (6)

96-105: LGTM! Clean exception handler metadata struct.

The ExceptHandlerInfo struct appropriately captures the necessary metadata for exception handlers matching CPython 3.11+ semantics.


193-206: Good addition of block-to-instruction index mapping.

The block_to_index array is correctly populated alongside block_to_offset, with clear comments explaining why this is needed for the exception table.


332-343: Good safety guard against potential infinite loops.

The iteration limit is a reasonable defensive measure. The error return is better than silently breaking, as it prevents underestimated stack depth which could cause runtime issues.


390-418: Handler depth calculation aligns with CPython 3.11+ semantics.

The calculation handler.stack_depth + 1 + (preserve_lasti as u32) correctly accounts for the exception object and optional lasti pushed before entering the handler block. The detailed comments make the logic clear.


462-468: Correct monotonic depth tracking for exception handlers.

The updated condition ensures we track the maximum depth reaching each block (important for exception handlers that may reach blocks with different stack depths), and properly handles first-visit via the u32::MAX check.


603-676: Well-structured exception table generation.

The function correctly coalesces consecutive instructions with identical exception handlers into single entries, properly compares all handler fields (block, depth, preserve_lasti), and handles the trailing entry case. The instruction index accounting for EXTENDED_ARG is correct.

crates/compiler-core/src/bytecode.rs (4)

614-629: Clear documentation for raise semantics.

The unified RaiseKind enum with detailed comments clarifies the mapping to CPython opcodes (RAISE_VARARGS opargs 0-2 and RERAISE).


673-1116: Comprehensive CPython 3.13 opcode alignment.

The instruction enum is carefully structured to match CPython 3.13 opcode numbers with explicit placeholder variants for unimplemented opcodes. The 1-byte size assertion at line 1117 ensures marshaling compatibility.


2015-2021: Correct stack effects for async generator instructions.

The stack effects for Send (0), EndSend (-1), and CleanupThrow (-1) correctly model the stack transitions for async generator operations.


2030-2042: Correct stack effects for all Raise variants.

The stack effects properly differentiate between bare raise (no stack pop), single exception, exception with cause, and reraise from stack.

Comment on lines +994 to +998
// 111: STORE_FAST_LOAD_FAST
StoreFastLoadFast {
store_idx: Arg<NameIdx>,
load_idx: Arg<NameIdx>,
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n "StoreFastLoadFast" --type rust -A 5 -B 2

Repository: RustPython/RustPython

Length of output: 3722


🏁 Script executed:

# Get the fmt_dis function around lines 2292-2298
sed -n '2280,2310p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 1579


🏁 Script executed:

# Search for any unpacking/decoding logic for this instruction
rg -n "store_idx|load_idx" crates/compiler-core/src/bytecode.rs | head -20

Repository: RustPython/RustPython

Length of output: 283


🏁 Script executed:

# Find the Arg type definition
rg -n "^pub struct Arg|^pub enum Arg" crates/compiler-core/src/bytecode.rs -A 10

Repository: RustPython/RustPython

Length of output: 773


🏁 Script executed:

# Search for similar dual-index instructions to see the pattern
rg -n "idx.*Arg.*\{" crates/compiler-core/src/bytecode.rs | head -15

Repository: RustPython/RustPython

Length of output: 165


🏁 Script executed:

# Check how StoreFastLoadFast is created/compiled
rg -n "StoreFastLoadFast" crates/compiler-core/src/ --type rust -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 1836


🏁 Script executed:

# Search for construction of StoreFastLoadFast in compiler code
rg -n "StoreFastLoadFast\s*\{" crates/compiler-core/src/ --type rust -B 5 -A 5

Repository: RustPython/RustPython

Length of output: 2887


🏁 Script executed:

# Check the Arg::get implementation and related methods
rg -n "impl.*Arg" crates/compiler-core/src/bytecode.rs -A 20 | head -60

Repository: RustPython/RustPython

Length of output: 1837


🏁 Script executed:

# Look for how OpArg encodes/decodes multiple values
rg -n "OpArg|to_op_arg|from_op_arg" crates/compiler-core/src/bytecode.rs | head -30

Repository: RustPython/RustPython

Length of output: 1259


🏁 Script executed:

# Find the .get() method implementation for Arg<T>
rg -n "impl.*Arg.*\{" crates/compiler-core/src/bytecode.rs -A 40 | grep -A 40 "pub fn get"

Repository: RustPython/RustPython

Length of output: 2349


🏁 Script executed:

# More targeted search for the get method
sed -n '531,600p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 1706


🏁 Script executed:

# Find NameIdx definition and its OpArgType implementation
rg -n "NameIdx" crates/compiler-core/src/bytecode.rs -A 10 | head -50

Repository: RustPython/RustPython

Length of output: 1627


🏁 Script executed:

# Check for any special handling in the actual compilation where StoreFastLoadFast is created
fd -e rs -x grep -l "StoreFastLoadFast" {} \; | xargs grep -l "compile\|codegen" | head -5

Repository: RustPython/RustPython

Length of output: 128


🏁 Script executed:

# Search in compiler code for StoreFastLoadFast usage
rg -n "StoreFastLoadFast" crates/compiler-core/src/compile/ --type rust -B 5 -A 5

Repository: RustPython/RustPython

Length of output: 138


🏁 Script executed:

# Find the compiler code structure
find crates/compiler-core/src -type d

Repository: RustPython/RustPython

Length of output: 89


🏁 Script executed:

# Search for StoreFastLoadFast in all Rust files under compiler-core
rg -n "StoreFastLoadFast" crates/compiler-core/ --type rust -B 5 -A 5

Repository: RustPython/RustPython

Length of output: 2887


🏁 Script executed:

# Look at how this instruction is serialized - check if there's unpacking logic for arguments
rg -n "from_instruction|write_instruction" crates/compiler-core/src/ --type rust -A 10

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# List the compiler-core source files
ls -la crates/compiler-core/src/

Repository: RustPython/RustPython

Length of output: 537


🏁 Script executed:

# Look for instruction argument encoding logic - search for methods that encode multiple args into one OpArg
rg -n "encode\|pack\|split.*arg" crates/compiler-core/src/bytecode.rs -i

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there's a macro or pattern for multi-index instructions
sed -n '1900,2050p' crates/compiler-core/src/bytecode.rs | head -100

Repository: RustPython/RustPython

Length of output: 4458


🏁 Script executed:

# Look for instruction deserialization - how instructions are read from bytecode
rg -n "from_code_unit|read.*instruction|parse.*instruction" crates/compiler-core/src/ --type rust -A 10

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check the marshal.rs file which likely handles bytecode serialization/deserialization
head -100 crates/compiler-core/src/marshal.rs

Repository: RustPython/RustPython

Length of output: 2550


🏁 Script executed:

# Search for where OpArg values are unpacked or decoded
rg -n "0xff|mask|shift" crates/compiler-core/src/bytecode.rs | head -20

Repository: RustPython/RustPython

Length of output: 564


🏁 Script executed:

# Look at the actual instruction bytecode encoding - find where instructions are assembled into CodeUnit
rg -n "CodeUnit|Instruction::to_" crates/compiler-core/src/bytecode.rs -A 5 | head -50

Repository: RustPython/RustPython

Length of output: 1707


🏁 Script executed:

# Check what the issue is in detail - look for how dual-argument instructions pack their values
sed -n '1200,1350p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 4881


🏁 Script executed:

# Look at how the instruction is supposed to be created - search for instruction construction
rg -n "StoreFastLoadFast\s*\{" crates/ --type rust -B 10 -A 10

Repository: RustPython/RustPython

Length of output: 8234


🏁 Script executed:

# Look for custom OpArgType implementations that might handle packing
rg -n "impl OpArgType" crates/compiler-core/src/bytecode.rs -A 8

Repository: RustPython/RustPython

Length of output: 2319


🏁 Script executed:

# Check if there's any byte-swapping or bit manipulation for extracting indices
sed -n '600,750p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 4285


🏁 Script executed:

# Look at how LoadFastLoadFast works - another dual-index instruction
rg -n "LoadFastLoadFast" crates/ --type rust -B 5 -A 10

Repository: RustPython/RustPython

Length of output: 5635


🏁 Script executed:

# Check how UnpackExArgs unpacks - this pattern should apply to StoreFastLoadFast too
rg -n "UnpackExArgs" crates/compiler-core/src/bytecode.rs -B 5 -A 15

Repository: RustPython/RustPython

Length of output: 2602


🏁 Script executed:

# Compare StoreFastLoadFast vs LoadFastLoadFast more carefully
sed -n '919,921p' crates/compiler-core/src/bytecode.rs
sed -n '995,998p' crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 213


🏁 Script executed:

# Check if there's a custom OpArgType for handling paired indices
rg -n "before.*after|pair|split" crates/compiler-core/src/bytecode.rs -i -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 1236


🏁 Script executed:

# Check if NameIdx has a custom OpArgType implementation
rg -n "impl.*OpArgType.*NameIdx" crates/compiler-core/src/bytecode.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Verify NameIdx is just a u32 type alias
rg -n "^pub type NameIdx" crates/compiler-core/src/bytecode.rs -A 2

Repository: RustPython/RustPython

Length of output: 136


🏁 Script executed:

# Look at the full context of how StoreFastLoadFast is handled in frame.rs
sed -n '1603,1617p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 768


🏁 Script executed:

# Let me verify the issue by creating a test to understand how this would work
python3 << 'EOF'
# Simulate what happens with the current code
# If arg is, say, 0x12345678 (store_idx should be 0x5678, load_idx should be 0x1234)
# Both store_idx.get(arg) and load_idx.get(arg) would return 0x12345678

arg_value = 0x12345678

# Current behavior: both return the full arg
store_idx_result = arg_value
load_idx_result = arg_value

print(f"arg: {hex(arg_value)}")
print(f"store_idx.get(arg) returns: {hex(store_idx_result)}")
print(f"load_idx.get(arg) returns: {hex(load_idx_result)}")
print(f"Are they the same? {store_idx_result == load_idx_result}")

# What it should be (like UnpackExArgs):
import struct
bytes_val = struct.pack('<I', arg_value)
store_idx_should = struct.unpack('<H', bytes_val[0:2])[0]
load_idx_should = struct.unpack('<H', bytes_val[2:4])[0]

print(f"\nCorrect behavior:")
print(f"store_idx should be: {hex(store_idx_should)}")
print(f"load_idx should be: {hex(load_idx_should)}")
EOF

Repository: RustPython/RustPython

Length of output: 255


🏁 Script executed:

# Check git history for this file to see if there were recent changes
git log --oneline crates/compiler-core/src/bytecode.rs | head -20

Repository: RustPython/RustPython

Length of output: 95


🏁 Script executed:

# Check if there are any TODOs or FIXMEs related to StoreFastLoadFast
rg -n "TODO|FIXME|XXX" crates/compiler-core/src/bytecode.rs | grep -i "store\|fast"

Repository: RustPython/RustPython

Length of output: 149


🏁 Script executed:

# Final check: Look at how CPython 3.13 implements STORE_FAST_LOAD_FAST
web_search CPython 3.13 STORE_FAST_LOAD_FAST opcode implementation

Repository: RustPython/RustPython

Length of output: 115


StoreFastLoadFast indices extraction is incorrect — both fields extract the same value from a single OpArg.

StoreFastLoadFast has two separate index fields (store_idx and load_idx), but both call .get(arg) on the same arg without unpacking. Since NameIdx is a simple u32 type alias with no custom OpArgType implementation, both calls return the full 32-bit argument unchanged.

This causes the VM code in crates/vm/src/frame.rs:1610-1611 to use the same index for both store and load operations, breaking inlined comprehensions.

Following the pattern from UnpackExArgs (which unpacks two u8 values from a u32), StoreFastLoadFast needs a custom OpArgType implementation that splits the packed indices from a single OpArg:

impl OpArgType for FastLoadStoreArgs {
    fn from_op_arg(x: u32) -> Option<Self> {
        let [store, load, ..] = x.to_le_bytes();
        Some(Self { store: store as u32, load: load as u32 })
    }
    fn to_op_arg(self) -> u32 {
        u32::from_le_bytes([self.store as u8, self.load as u8, 0, 0])
    }
}

Then update the instruction to use a single packed argument instead of two separate Arg fields.

@youknowone youknowone merged commit eee3608 into RustPython:main Jan 4, 2026
12 of 13 checks passed
@rocky
Copy link

rocky commented Jan 4, 2026

What do I have to do to get the updated magic number? I did a git pull; cargo clean; cargo build; cargo run -- and this isn't showing me the updated magic.

@fanninpm
Copy link
Contributor

fanninpm commented Jan 4, 2026

What do I have to do to get the updated magic number? I did a git pull; cargo clean; cargo build; cargo run -- and this isn't showing me the updated magic.

Have you tried rm -rf **/__pycache__ yet?

@youknowone
Copy link
Member Author

@fanninpm rm -rf **/__pycache__ is required when we are not updating the magic number. This time it is updated.

@rocky How did you check the magic number? Let me try the same way

@youknowone youknowone deleted the exceptions branch January 4, 2026 07:36
@rocky
Copy link

rocky commented Jan 4, 2026

@fanninpm rm -rf **/__pycache__ is required when we are not updating the magic number. This time it is updated.

@rocky How did you check the magic number? Let me try the same way

The same method was used when I opened the report. I'll copy it in again here:

$ cargo run -- 
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.37s
     Running `target/debug/rustpython`
Welcome to the magnificent Rust Python 0.4.0 interpreter 😱 🖖
RustPython 3.13.0
Type "help", "copyright", "credits" or "license" for more information.
>>>>> from importlib.util import MAGIC_NUMBER
>>>>> import struct
>>>>> struct.unpack("<Hcc", b'\xcb\r\r\n')[0]
3531

I tried the rm -rf and that didn't change anything. And if you think about it, the fixes weren't in Python code but in Rust code.

Maybe there is a different branch I need? Here is my SHA1 for HEAD:

$ git rev-parse --short HEAD
fd27c2d78

@youknowone
Copy link
Member Author

@rocky you are not decoding MAGIC_NUMBER, but decoding a literal b'\xcb\r\r\n'

@youknowone youknowone mentioned this pull request Jan 4, 2026
@rocky
Copy link

rocky commented Jan 4, 2026

>>>>> struct.unpack("<Hcc", b'\xcb\r\r\n')[0]

Doh! My apologies for making this simple, stupid mistake. I get 2997 now, which is fine as far as I can tell.

I make lots of mistakes, which is why I work on debuggers.

Sorry for the noise. (But I will be putting in another feature for something a little more subtle and more subjective).. I see this has already been addressed.

terryluan12 pushed a commit to terryluan12/RustPython that referenced this pull request Jan 5, 2026
* asyncgen

* fix coroutine

* rewrite compiler

* tests

* set pyc magic number
terryluan12 pushed a commit to terryluan12/RustPython that referenced this pull request Jan 5, 2026
* asyncgen

* fix coroutine

* rewrite compiler

* tests

* set pyc magic number
@youknowone
Copy link
Member Author

@rocky no worry, thank you so much for your reports. I was underevaluating the value confusion to be honest.

terryluan12 pushed a commit to terryluan12/RustPython that referenced this pull request Jan 5, 2026
* asyncgen

* fix coroutine

* rewrite compiler

* tests

* set pyc magic number
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RustPython 3.13 uses CPython 3.12 magic number 3531

3 participants