Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Jan 19, 2026

Remove RustPython-specific opcodes (BuildListFromTuples, BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples) and replace their usage with CPython 3.14 standard opcode sequences:

  • BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop
  • BuildSetFromTuples → BUILD_SET + SET_UPDATE loop
  • BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple)
  • BuildMapForCall → DICT_MERGE loop

Implement missing opcodes:

  • ListExtend: Extend list with iterable elements
  • SetUpdate: Add iterable elements to set
  • DictMerge: Merge dict with duplicate key checking

Summary by CodeRabbit

Release Notes

  • Refactor
    • Star-unpacking and collection assembly now build collections incrementally (extend/update then convert), streamlining unpacking and call-argument construction.
  • Bug Fixes
    • Keyword-argument merging now detects duplicate keys and invalid mappings, surfacing clearer errors.
  • New Features
    • Runtime supports explicit list-extend and set-update operations for expanded iterables.

✏️ Tip: You can customize this high-level summary in your review settings.

@youknowone youknowone marked this pull request as ready for review January 19, 2026 01:14
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 19, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Replaces tuple/map/set-from-tuples instructions with a two-step build pattern: emit an empty list, extend it per element, then convert to tuple/list or update set/dict using new VM ops (ListExtend, SetUpdate, DictMerge). Removes several legacy Build*FromTuples opcodes and related handlers.

Changes

Cohort / File(s) Summary
Codegen pattern updates
crates/codegen/src/compile.rs
Replaces BuildTupleFromTuples/BuildListFromTuples/BuildSetFromTuples/BuildMapForCall emission sites with sequence: BuildList { size:0 } → multiple ListExtend { i }/SetUpdate { i }/DictMerge (and ListToTuple when tuple needed). Updates star-unpacking and call/class/type-parameter gathering paths.
Bytecode enum changes
crates/compiler-core/src/bytecode/instruction.rs
Removes enum variants BuildListFromTuples, BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples and their TryFrom, stack-effect, and disasm mappings.
VM runtime handlers
crates/vm/src/frame.rs
Removes Build*FromTuples handling and helpers (flatten_tuples, execute_build_map_for_call). Adds new instruction handlers: DictMerge { index }, ListExtend { i }, SetUpdate { i } with runtime validation and merging/extension behavior.

Sequence Diagram(s)

sequenceDiagram
  participant Codegen
  participant Bytecode
  participant VM
  participant Runtime

  Codegen->>Bytecode: emit BuildList(size=0)
  Codegen->>Bytecode: emit ListExtend(i) (repeated per iterable)
  Codegen->>Bytecode: emit ListToTuple (if tuple target)
  Bytecode->>VM: instructions stream
  VM->>Runtime: create empty list (BuildList)
  VM->>Runtime: for each ListExtend -> iterate source -> append items
  alt convert to tuple
    VM->>Runtime: call ListToTuple intrinsic -> produce tuple
  end
  VM->>Runtime: DictMerge -> validate mapping keys -> merge into target dict
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

Poem

🐰
I hopped through opcodes, bright and spry,
Replaced big-builds with extend-and-try;
Lists grow, sets update, dicts merge neat—
Bytecode pruned, new instructions fleet.
Hooray—more hopping, lighter feet! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main objective of the PR: replacing RustPython-specific opcodes with CPython standard opcode sequences.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 542-568: gather_elements leaves tuple chunks on the stack before
the collection is created, so emitting Instruction::BuildList/BuildSet after
those chunks causes ListExtend/SetUpdate to index the wrong stack slot and
trigger unsafe casts; fix by ensuring the collection is created before tuple
chunks are left on the stack (emit Instruction::BuildList { size: 0 } /
Instruction::BuildSet { size: 0 } prior to the code that pushes tuple chunks in
gather_elements) or, if reordering is infeasible, adjust the ListExtend { i: ...
} / SetUpdate { i: ... } indices to account for the tuple-first layout
(increment the depth used) in the CollectionType::List and CollectionType::Set
handling (also apply same fix to the other occurrences noted).
🧹 Nitpick comments (2)
crates/vm/src/frame.rs (2)

941-958: Consider using lazy iteration over keys instead of collecting into Vec.

The current implementation collects all keys into a Vec before iterating, which allocates memory proportional to the source dict size. This could be inefficient for large mappings and doesn't match CPython's lazy iteration behavior.

Consider using the iterator protocol directly:

♻️ Suggested refactor using lazy iteration
-                // Get keys from source and check for duplicates
-                let keys_iter = vm.call_method(&source, "keys", ())?;
-                for key in keys_iter.try_to_value::<Vec<PyObjectRef>>(vm)? {
+                // Get keys from source and check for duplicates
+                let keys_obj = vm.call_method(&source, "keys", ())?;
+                let keys_iter = keys_obj.get_iter(vm)?;
+                while let PyIterReturn::Return(key) = keys_iter.next(vm)? {
                     // Check if keyword argument is a string
                     if key.downcast_ref::<PyStr>().is_none() {
                         return Err(vm.new_type_error("keywords must be strings".to_owned()));
                     }
                     // Check for duplicate keyword arguments
                     if dict.contains_key(&*key, vm) {
                         let key_repr = key.repr(vm)?;
                         return Err(vm.new_type_error(format!(
                             "got multiple values for keyword argument {}",
                             key_repr.as_str()
                         )));
                     }
                     let value = vm.call_method(&source, "__getitem__", (key.clone(),))?;
                     dict.set_item(&*key, value, vm)?;
                 }

1594-1607: Consider using PySet's update_internal method for consistency with ListExtend.

While ListExtend delegates to list.extend(), this handler manually iterates and adds items. PySet has an update_internal() method that takes a single iterable and would provide the same consistency pattern: replace the manual iteration with set.update_internal(iterable, vm)?;.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/vm/src/frame.rs`:
- Around line 921-959: Instruction::DictMerge currently pops the source with
pop_value() then calls nth_value(index.get(arg)) without compensating for the
pop, so it reads the wrong stack slot; mirror the DictUpdate pattern by
subtracting 1 from the computed slot before calling nth_value (i.e., use
nth_value(index.get(arg) - 1) or otherwise adjust the index after pop_value()),
ensuring you reference the same index handling as in DictUpdate and keep the
existing checks (get_method, keys iteration, contains_key, set_item) intact.
♻️ Duplicate comments (3)
crates/codegen/src/compile.rs (3)

539-570: Fix LIST_EXTEND/SET_UPDATE stack target (collection is on TOS).

gather_elements leaves tuple chunks on the stack, then BUILD_LIST/BUILD_SET pushes the collection above them. LIST_EXTEND/SET_UPDATE expects the collection below the iterable on the stack, so this order can target a tuple instead of the list/set and fail at runtime. Please reorder stack construction or adjust the index/depth to match opcode expectations.


4436-4448: Fix LIST_EXTEND stack target for starred bases.

This sequence builds the list on TOS above tuple chunks, but LIST_EXTEND expects the list below the iterable. That can mis-target the tuple and trigger unsafe casts. Please reorder or adjust index depth as in the earlier unpacking path.


6994-7004: *Fix LIST_EXTEND stack target when building args tuple.

Here the list is pushed on top of tuple chunks, but LIST_EXTEND expects the list below the iterable. This can mis-target a tuple at runtime. Please reorder the stack or adjust the depth/index so the list is below the iterable.

@youknowone youknowone marked this pull request as draft January 19, 2026 02:24
Remove RustPython-specific opcodes (BuildListFromTuples,
BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples)
and replace their usage with CPython 3.14 standard opcode
sequences:

- BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop
- BuildSetFromTuples → BUILD_SET + SET_UPDATE loop
- BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple)
- BuildMapForCall → DICT_MERGE loop

Implement missing opcodes:
- ListExtend: Extend list with iterable elements
- SetUpdate: Add iterable elements to set
- DictMerge: Merge dict with duplicate key checking
@github-actions
Copy link
Contributor

Code has been automatically formatted

The code in this PR has been formatted using:

  • python scripts/generate_opcode_metadata.py
    Please pull the latest changes before pushing again:
git pull origin no-rustpython-ops

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant