Replace custom opcodes with CPython standard sequences #6794

youknowone · 2026-01-19T01:13:58Z

Remove RustPython-specific opcodes (BuildListFromTuples, BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples) and replace their usage with CPython 3.14 standard opcode sequences:

BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop
BuildSetFromTuples → BUILD_SET + SET_UPDATE loop
BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple)
BuildMapForCall → DICT_MERGE loop

Implement missing opcodes:

ListExtend: Extend list with iterable elements
SetUpdate: Add iterable elements to set
DictMerge: Merge dict with duplicate key checking

Summary by CodeRabbit

Release Notes

Refactor
- Star-unpacking and collection assembly now build collections incrementally (extend/update then convert), streamlining unpacking and call-argument construction.
Bug Fixes
- Keyword-argument merging now detects duplicate keys and invalid mappings, surfacing clearer errors.
New Features
- Runtime supports explicit list-extend and set-update operations for expanded iterables.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-19T01:14:08Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Replaces tuple/map/set-from-tuples instructions with a two-step build pattern: emit an empty list, extend it per element, then convert to tuple/list or update set/dict using new VM ops (ListExtend, SetUpdate, DictMerge). Removes several legacy Build*FromTuples opcodes and related handlers.

Changes

Cohort / File(s)	Summary
Codegen pattern updates `crates/codegen/src/compile.rs`	Replaces `BuildTupleFromTuples`/`BuildListFromTuples`/`BuildSetFromTuples`/`BuildMapForCall` emission sites with sequence: `BuildList { size:0 }` → multiple `ListExtend { i }`/`SetUpdate { i }`/`DictMerge` (and `ListToTuple` when tuple needed). Updates star-unpacking and call/class/type-parameter gathering paths.
Bytecode enum changes `crates/compiler-core/src/bytecode/instruction.rs`	Removes enum variants `BuildListFromTuples`, `BuildMapForCall`, `BuildSetFromTuples`, `BuildTupleFromTuples` and their TryFrom, stack-effect, and disasm mappings.
VM runtime handlers `crates/vm/src/frame.rs`	Removes `Build*FromTuples` handling and helpers (`flatten_tuples`, `execute_build_map_for_call`). Adds new instruction handlers: `DictMerge { index }`, `ListExtend { i }`, `SetUpdate { i }` with runtime validation and merging/extension behavior.

Sequence Diagram(s)

sequenceDiagram
  participant Codegen
  participant Bytecode
  participant VM
  participant Runtime

  Codegen->>Bytecode: emit BuildList(size=0)
  Codegen->>Bytecode: emit ListExtend(i) (repeated per iterable)
  Codegen->>Bytecode: emit ListToTuple (if tuple target)
  Bytecode->>VM: instructions stream
  VM->>Runtime: create empty list (BuildList)
  VM->>Runtime: for each ListExtend -> iterate source -> append items
  alt convert to tuple
    VM->>Runtime: call ListToTuple intrinsic -> produce tuple
  end
  VM->>Runtime: DictMerge -> validate mapping keys -> merge into target dict

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Bytecode pseudo opcodes #6715: Modifies bytecode instruction representation and emission paths; related to opcode changes here.
Sort Instruction enum & related match arms #6322: Touches the same Build*FromTuples variants and their handling.
Renewd DictUpdate instruction #6085: Related changes around dict-merge/update bytecode/VM path and indexed merging.

Suggested reviewers

ShaharNaveh

Poem

🐰
I hopped through opcodes, bright and spry,
Replaced big-builds with extend-and-try;
Lists grow, sets update, dicts merge neat—
Bytecode pruned, new instructions fleet.
Hooray—more hopping, lighter feet! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately captures the main objective of the PR: replacing RustPython-specific opcodes with CPython standard opcode sequences.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/codegen/src/compile.rs`:
- Around line 542-568: gather_elements leaves tuple chunks on the stack before
the collection is created, so emitting Instruction::BuildList/BuildSet after
those chunks causes ListExtend/SetUpdate to index the wrong stack slot and
trigger unsafe casts; fix by ensuring the collection is created before tuple
chunks are left on the stack (emit Instruction::BuildList { size: 0 } /
Instruction::BuildSet { size: 0 } prior to the code that pushes tuple chunks in
gather_elements) or, if reordering is infeasible, adjust the ListExtend { i: ...
} / SetUpdate { i: ... } indices to account for the tuple-first layout
(increment the depth used) in the CollectionType::List and CollectionType::Set
handling (also apply same fix to the other occurrences noted).

🧹 Nitpick comments (2)

crates/vm/src/frame.rs (2)
941-958: Consider using lazy iteration over keys instead of collecting into Vec.

The current implementation collects all keys into a Vec before iterating, which allocates memory proportional to the source dict size. This could be inefficient for large mappings and doesn't match CPython's lazy iteration behavior.

Consider using the iterator protocol directly:
♻️ Suggested refactor using lazy iteration
-                // Get keys from source and check for duplicates
-                let keys_iter = vm.call_method(&source, "keys", ())?;
-                for key in keys_iter.try_to_value::<Vec<PyObjectRef>>(vm)? {
+                // Get keys from source and check for duplicates
+                let keys_obj = vm.call_method(&source, "keys", ())?;
+                let keys_iter = keys_obj.get_iter(vm)?;
+                while let PyIterReturn::Return(key) = keys_iter.next(vm)? {
                     // Check if keyword argument is a string
                     if key.downcast_ref::<PyStr>().is_none() {
                         return Err(vm.new_type_error("keywords must be strings".to_owned()));
                     }
                     // Check for duplicate keyword arguments
                     if dict.contains_key(&*key, vm) {
                         let key_repr = key.repr(vm)?;
                         return Err(vm.new_type_error(format!(
                             "got multiple values for keyword argument {}",
                             key_repr.as_str()
                         )));
                     }
                     let value = vm.call_method(&source, "__getitem__", (key.clone(),))?;
                     dict.set_item(&*key, value, vm)?;
                 }
1594-1607: Consider using PySet's update_internal method for consistency with ListExtend.

While ListExtend delegates to list.extend(), this handler manually iterates and adds items. PySet has an update_internal() method that takes a single iterable and would provide the same consistency pattern: replace the manual iteration with set.update_internal(iterable, vm)?;.

crates/codegen/src/compile.rs

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/vm/src/frame.rs`:
- Around line 921-959: Instruction::DictMerge currently pops the source with
pop_value() then calls nth_value(index.get(arg)) without compensating for the
pop, so it reads the wrong stack slot; mirror the DictUpdate pattern by
subtracting 1 from the computed slot before calling nth_value (i.e., use
nth_value(index.get(arg) - 1) or otherwise adjust the index after pop_value()),
ensuring you reference the same index handling as in DictUpdate and keep the
existing checks (get_method, keys iteration, contains_key, set_item) intact.

♻️ Duplicate comments (3)

crates/codegen/src/compile.rs (3)

539-570: Fix LIST_EXTEND/SET_UPDATE stack target (collection is on TOS).

gather_elements leaves tuple chunks on the stack, then BUILD_LIST/BUILD_SET pushes the collection above them. LIST_EXTEND/SET_UPDATE expects the collection below the iterable on the stack, so this order can target a tuple instead of the list/set and fail at runtime. Please reorder stack construction or adjust the index/depth to match opcode expectations.

4436-4448: Fix LIST_EXTEND stack target for starred bases.

This sequence builds the list on TOS above tuple chunks, but LIST_EXTEND expects the list below the iterable. That can mis-target the tuple and trigger unsafe casts. Please reorder or adjust index depth as in the earlier unpacking path.

6994-7004: *Fix LIST_EXTEND stack target when building args tuple.

Here the list is pushed on top of tuple chunks, but LIST_EXTEND expects the list below the iterable. This can mis-target a tuple at runtime. Please reorder the stack or adjust the depth/index so the list is below the iterable.

crates/vm/src/frame.rs

Remove RustPython-specific opcodes (BuildListFromTuples, BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples) and replace their usage with CPython 3.14 standard opcode sequences: - BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop - BuildSetFromTuples → BUILD_SET + SET_UPDATE loop - BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple) - BuildMapForCall → DICT_MERGE loop Implement missing opcodes: - ListExtend: Extend list with iterable elements - SetUpdate: Add iterable elements to set - DictMerge: Merge dict with duplicate key checking

github-actions · 2026-01-19T08:42:26Z

Code has been automatically formatted

The code in this PR has been formatted using:

python scripts/generate_opcode_metadata.py
Please pull the latest changes before pushing again:

git pull origin no-rustpython-ops

youknowone marked this pull request as ready for review January 19, 2026 01:14

coderabbitai bot reviewed Jan 19, 2026

View reviewed changes

crates/codegen/src/compile.rs Outdated Show resolved Hide resolved

youknowone force-pushed the no-rustpython-ops branch from ccb2a22 to 8c3b698 Compare January 19, 2026 02:01

coderabbitai bot reviewed Jan 19, 2026

View reviewed changes

crates/vm/src/frame.rs Show resolved Hide resolved

youknowone marked this pull request as draft January 19, 2026 02:24

youknowone force-pushed the no-rustpython-ops branch from 8c3b698 to a35f3bb Compare January 19, 2026 02:25

youknowone force-pushed the no-rustpython-ops branch from a35f3bb to d8b46b1 Compare January 19, 2026 08:42

Auto-generate: generate_opcode_metadata.py

aebd863

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Replace custom opcodes with CPython standard sequences #6794

Replace custom opcodes with CPython standard sequences #6794

youknowone commented Jan 19, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 19, 2026 •

edited

Loading

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

github-actions bot commented Jan 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Replace custom opcodes with CPython standard sequences #6794

Are you sure you want to change the base?

Replace custom opcodes with CPython standard sequences #6794

Conversation

youknowone commented Jan 19, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions bot commented Jan 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

youknowone commented Jan 19, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 19, 2026 •

edited

Loading