@youknowone
Member

@youknowone youknowone commented Jan 4, 2026

Fix #6239

Summary by CodeRabbit

  • New Features

    • Support for loading and executing precompiled Python bytecode (.pyc) from files or in-memory bytes
    • New VM API to run bytecode directly from memory and a new helper to create a main scope
  • Refactor

    • Simplified main-run flow and scope initialization, consolidating setup and cleanup behavior

@coderabbitai
Contributor

coderabbitai bot commented Jan 4, 2026

📝 Walkthrough

This PR adds APIs and helpers to load and execute .pyc bytecode from memory or from files: new PyCode constructors that build code objects from pyc bytes or paths, a VM API to run pyc bytes, and a centralized pyc magic-number check. It also moves main-scope and module initialization into a VM-provided helper.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| PyCode .pyc constructors | `crates/vm/src/builtins/code.rs` | Added `from_pyc_path(path, vm)` and `from_pyc(pyc_bytes, name, bytecode_path, source_path, vm)` to construct PyCode from .pyc bytes/files; validate magic and invoke frozen importlib `_compile_bytecode`. |
| PYC magic check utility | `crates/vm/src/import.rs` | Added `pub(crate) fn check_pyc_magic_number_bytes(buf: &[u8]) -> bool` to centralize two-byte pyc magic validation. |
| VM API & helpers for running pyc | `crates/vm/src/vm/mod.rs` | Exported `PyDict` publicly; added `pub fn run_pyc_bytes(&self, pyc_bytes: &[u8], scope: Scope) -> PyResult<()>` and private helpers `with_simple_run()` and `flush_io()` to prepare main scope, run code, and manage IO/cleanup. |
| Run-file and pyc detection refactor | `crates/vm/src/vm/python_run.rs` | Reworked `run_simple_file` to use `with_simple_run`; changed module dict types from `PyDictRef` to `Py<PyDict>`; removed the inline `__file__`/`__cached__` handling and `flush_io`; replaced direct magic-byte compare with `check_pyc_magic_number_bytes()` and simplified `maybe_pyc_file_with_magic`. |
| Main-scope creation helper | `crates/vm/src/vm/vm_new.rs` | Added `pub fn new_scope_with_main(&self) -> PyResult<Scope>` to create a scope with builtins, a main module, initialize `__annotations__`, and register it in `sys.modules`. |
| Consume new main-scope API | `src/lib.rs` | Removed `setup_main_module()` and replaced its uses with `vm.new_scope_with_main()`, delegating main module initialization to the VM helper. |
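
The two-byte magic validation listed for crates/vm/src/import.rs could look roughly like the sketch below. It uses only the standard library. The constant `PYC_MAGIC` and the function name `check_magic_prefix` are assumptions made for illustration; the real magic value and the body of `check_pyc_magic_number_bytes` are not shown in this summary.

```rust
/// Placeholder two-byte magic prefix (an assumption for this sketch);
/// the real value is defined inside the VM crate.
const PYC_MAGIC: [u8; 2] = [0x2b, 0x0e];

/// Returns true when `buf` begins with the two magic bytes.
/// `starts_with` returns false for buffers shorter than the prefix,
/// so no indexing or explicit length check is needed.
pub fn check_magic_prefix(buf: &[u8]) -> bool {
    buf.starts_with(&PYC_MAGIC)
}
```

Using `starts_with` rather than indexing `buf[0]` and `buf[1]` means an empty or one-byte buffer is rejected instead of panicking.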

Sequence Diagram

sequenceDiagram
    autonumber
    actor Client
    participant VM as VirtualMachine
    participant Scope
    participant MainMod as __main__ Module
    participant PyCode
    participant FrozenImportlib as frozen_importlib._compile_bytecode

    Client->>VM: run_pyc_bytes(pyc_bytes, scope)
    activate VM
    VM->>VM: with_simple_run("__main__", closure)
    VM->>Scope: ensure scope / builtins (new_scope_with_main)
    VM->>MainMod: create/register __main__, init __annotations__
    Note right of VM: Inject __file__ / __cached__ into MainMod
    VM->>PyCode: from_pyc(pyc_bytes, metadata, vm)
    activate PyCode
    PyCode->>PyCode: check_pyc_magic_number_bytes(pyc_bytes)
    PyCode->>FrozenImportlib: load & call _compile_bytecode(bytes, metadata)
    FrozenImportlib-->>PyCode: code object
    deactivate PyCode
    PyCode-->>VM: return PyCode
    VM->>MainMod: execute code object in __main__ dict
    Note right of VM: flush IO / cleanup injected state
    deactivate VM
    VM-->>Client: PyResult<()>

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • ShaharNaveh
  • arihant2math

Poem

🐰 Hops through bytes and magic signs,

I stitch the pyc into lines.
From frozen importlib's gentle hum,
The VM wakes — the bytecode runs! 🎉

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'vm.run_pyc_bytes' directly reflects the main new public API method added to VirtualMachine, making it clear and specific about the primary change. |
| Linked Issues check | ✅ Passed | The PR successfully implements the core coding objective from issue #6239: adding vm.run_pyc_bytes as a new public VM API to execute precompiled bytecode directly without requiring a compiler. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to supporting the new run_pyc_bytes functionality: PyCode constructors from .pyc bytes/files, magic number validation, scope creation, and refactoring of existing python_run.rs to use the new abstractions. No unrelated changes detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e90531 and 8634a91.

📒 Files selected for processing (2)
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/import.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/import.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with clippy


@youknowone youknowone changed the title from "Run pyc bytes" to "vm.run_pyc_bytes" Jan 4, 2026
@youknowone youknowone marked this pull request as ready for review January 4, 2026 12:53
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
crates/vm/src/vm/vm_new.rs (1)

65-80: Consider propagating the error instead of panicking on __annotations__ initialization.

Line 71 uses .expect(), which panics if setting __annotations__ fails, while the rest of the method propagates errors with ?. For consistency and robustness, consider using ? here as well.

🔎 Proposed fix
     pub fn new_scope_with_main(&self) -> PyResult<Scope> {
         let scope = self.new_scope_with_builtins();
         let main_module = self.new_module("__main__", scope.globals.clone(), None);
         main_module
             .dict()
-            .set_item("__annotations__", self.ctx.new_dict().into(), self)
-            .expect("Failed to initialize __main__.__annotations__");
+            .set_item("__annotations__", self.ctx.new_dict().into(), self)?;
 
         self.sys_module.get_attr("modules", self)?.set_item(
             "__main__",
             main_module.into(),
             self,
         )?;
 
         Ok(scope)
     }
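
The difference between the two styles can be seen in isolation with stand-in types. In the sketch below, `ModuleDict`, `SetItemError`, and `init_main` are hypothetical names that only mimic the shape of the VM code; the real method returns the VM's `PyResult`.

```rust
use std::collections::HashMap;

/// Stand-in error type (hypothetical).
#[derive(Debug, PartialEq)]
pub struct SetItemError(pub String);

/// Stand-in for a module dict whose insertion can fail (hypothetical).
pub struct ModuleDict {
    items: HashMap<String, String>,
    read_only: bool,
}

impl ModuleDict {
    pub fn new(read_only: bool) -> Self {
        Self { items: HashMap::new(), read_only }
    }

    /// Fails when the dict is read-only, otherwise stores the pair.
    pub fn set_item(&mut self, key: &str, value: &str) -> Result<(), SetItemError> {
        if self.read_only {
            return Err(SetItemError(format!("cannot set {}", key)));
        }
        self.items.insert(key.to_owned(), value.to_owned());
        Ok(())
    }
}

/// Every fallible step uses `?`, so a failure reaches the caller as `Err`
/// instead of aborting the process the way `.expect()` would.
pub fn init_main(dict: &mut ModuleDict) -> Result<(), SetItemError> {
    dict.set_item("__annotations__", "{}")?;
    dict.set_item("__name__", "__main__")?;
    Ok(())
}
```

With a read-only dict, `init_main` returns the first error to its caller, which is the behavior an embedder would likely expect from a function that already returns a result type.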
crates/vm/src/builtins/code.rs (1)

339-352: Consider using the actual source path instead of the hardcoded "<source>".

source_path is hardcoded to "<source>", which may be confusing in tracebacks. Consider deriving it from the path or allowing it to be None.

🔎 Proposed fix
         Self::from_pyc(
             &content,
             Some(&name),
             Some(&path.display().to_string()),
-            Some("<source>"),
+            None,  // Let the loader determine the source path
             vm,
         )
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec564ac and 5e90531.

📒 Files selected for processing (6)
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/import.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/python_run.rs
  • crates/vm/src/vm/vm_new.rs
  • src/lib.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/import.rs
  • src/lib.rs
  • crates/vm/src/vm/vm_new.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/python_run.rs
  • crates/vm/src/builtins/code.rs
🧠 Learnings (4)
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When testing Python code, always use RustPython instead of the standard `python` command; use `cargo run -- script.py` or `cargo run` for interactive REPL

Applied to files:

  • src/lib.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/python_run.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When modifying bytecode instructions, perform a full clean build by running `rm -r target/debug/build/rustpython-* && find . | grep -E '\.pyc$' | xargs rm -r`

Applied to files:

  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/python_run.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: RustPython is a Python 3 interpreter written in Rust implementing Python 3.13.0+ compatibility with complete Python environment in Rust (not CPython bindings), clean implementation without compatibility hacks, and cross-platform support including WebAssembly compilation

Applied to files:

  • crates/vm/src/vm/mod.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/vm/python_run.rs
🧬 Code graph analysis (3)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/builtins/code.rs (1)
  • from_pyc (353-370)
crates/vm/src/vm/python_run.rs (2)
crates/vm/src/vm/mod.rs (1)
  • import (689-693)
crates/vm/src/import.rs (1)
  • check_pyc_magic_number_bytes (11-13)
crates/vm/src/builtins/code.rs (1)
crates/vm/src/import.rs (1)
  • check_pyc_magic_number_bytes (11-13)
🔇 Additional comments (7)
src/lib.rs (2)

209-209: LGTM!

The refactoring to use vm.new_scope_with_main() centralizes the main module initialization logic and improves maintainability.


354-360: LGTM!

Test code correctly updated to use the new new_scope_with_main() API.

crates/vm/src/vm/mod.rs (3)

463-497: Well-documented API with clear usage example.

The run_pyc_bytes implementation correctly:

  • Provides clear documentation with usage examples
  • Delegates to PyCode::from_pyc for loading
  • Uses with_simple_run for proper main module setup

One minor note: the path "<source>" passed to with_simple_run is generic. Consider allowing callers to specify a more descriptive path in a future enhancement.


539-571: LGTM!

The with_simple_run helper properly:

  • Sets __file__ and __cached__ only if not already set
  • Cleans up these attributes after execution
  • Flushes IO streams
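
The set-if-absent and cleanup behavior described above can be sketched with a plain map standing in for the `__main__` dict. `ModuleDict` and `with_injected_path` are hypothetical names, and treating the two keys independently is an assumption; the actual `with_simple_run` may track them with a single flag.

```rust
use std::collections::HashMap;

/// Stand-in for the `__main__` module dict (hypothetical).
pub type ModuleDict = HashMap<String, String>;

/// Sets `__file__` and `__cached__` only when absent, runs `body`,
/// then removes only the keys this call injected.
pub fn with_injected_path<R>(
    dict: &mut ModuleDict,
    path: &str,
    body: impl FnOnce(&mut ModuleDict) -> R,
) -> R {
    let mut injected = Vec::new();
    if !dict.contains_key("__file__") {
        dict.insert("__file__".to_owned(), path.to_owned());
        injected.push("__file__");
    }
    if !dict.contains_key("__cached__") {
        dict.insert("__cached__".to_owned(), "None".to_owned());
        injected.push("__cached__");
    }

    let result = body(dict);

    // Values that existed before the call are left untouched.
    for key in injected {
        dict.remove(key);
    }
    result
}
```

Recording which keys were injected keeps a caller-provided `__file__` intact after the run, which matches the "only if not already set" note.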

573-583: LGTM!

The flush_io helper appropriately ignores errors during cleanup, which is the correct behavior for post-execution flushing.
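
A best-effort flush of this kind can be illustrated with the standard `Write` trait. The names `flush_best_effort` and `CountingWriter` are hypothetical, and the real `flush_io` works on the VM's stdout and stderr objects rather than on Rust writers.

```rust
use std::io::{self, Write};

/// Best-effort flush: each error is deliberately discarded so that one
/// failing stream does not prevent the remaining streams from flushing.
pub fn flush_best_effort(streams: &mut [&mut dyn Write]) {
    for stream in streams.iter_mut() {
        let _ = stream.flush();
    }
}

/// Test writer that counts flush calls and can be told to fail (hypothetical).
pub struct CountingWriter {
    pub fail: bool,
    pub flush_calls: usize,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        self.flush_calls += 1;
        if self.fail {
            Err(io::Error::new(io::ErrorKind::Other, "flush failed"))
        } else {
            Ok(())
        }
    }
}
```

Writing `let _ =` makes the decision to ignore the result explicit, so the compiler's unused-result lint stays quiet without hiding the intent.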

crates/vm/src/vm/python_run.rs (2)

27-31: LGTM!

The refactoring to use with_simple_run properly centralizes the __file__ and __cached__ management, reducing code duplication.


133-150: LGTM!

The magic number validation is now properly centralized through check_pyc_magic_number_bytes. The function correctly validates that exactly 2 bytes were read before checking the magic.
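
The "exactly 2 bytes were read" condition can be sketched against any `Read` source. `looks_like_pyc` and `PYC_MAGIC` are hypothetical names with a placeholder magic value; this is not the body of `maybe_pyc_file_with_magic`.

```rust
use std::io::Read;

/// Placeholder magic prefix for illustration only (an assumption).
const PYC_MAGIC: [u8; 2] = [0x2b, 0x0e];

/// Reads at most two bytes from `reader` and reports whether they match
/// the magic prefix. A short read is treated as "not a pyc".
pub fn looks_like_pyc<R: Read>(reader: R) -> std::io::Result<bool> {
    let mut buf = Vec::with_capacity(2);
    // `take(2)` caps the read at two bytes; `read_to_end` keeps reading
    // until EOF or the cap, so readers that return one byte per call
    // are still handled.
    reader.take(2).read_to_end(&mut buf)?;
    Ok(buf.len() == 2 && buf == PYC_MAGIC)
}
```

Checking the length before comparing keeps a truncated or empty file from being misclassified, and I/O errors are returned to the caller rather than being folded into `false`.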

@youknowone youknowone merged commit e1b22f1 into RustPython:main Jan 4, 2026
13 checks passed
@youknowone youknowone deleted the run_pyc_bytes branch January 4, 2026 13:58
terryluan12 pushed a commit to terryluan12/RustPython that referenced this pull request Jan 5, 2026
* rustpython_vm::import::check_pyc_magic_number_bytes

* vm.new_scope_with_main

* PyCode::from_pyc

* vm.run_pyc_bytes

* add boundary check
terryluan12 pushed a commit to terryluan12/RustPython that referenced this pull request Jan 15, 2026
* rustpython_vm::import::check_pyc_magic_number_bytes

* vm.new_scope_with_main

* PyCode::from_pyc

* vm.run_pyc_bytes

* add boundary check
Development

Successfully merging this pull request may close these issues.

Add vm.run_bytecode