Frame.co_code implementation. by alanjds · Pull Request #2149 · RustPython/RustPython

alanjds · 2020-08-25T22:13:26Z

TBD

This reverts commit c0b11de.

Existence of co_code is used by debugging tools to detect the Code object

This is needed so that sys.settrace(fn) does not call fn() right away with its own return.
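For reference, a minimal Python sketch of the behaviour that commit message seems to be after, as CPython exhibits it: installing a trace function does not deliver events for the frame that called sys.settrace; events begin with the next new frame. The names `tracer`, `traced` and `run` are illustrative only and are not part of this PR.

```python
import sys

events = []

def tracer(frame, event, arg):
    # Record every event; returning the tracer keeps local tracing enabled.
    events.append((event, frame.f_code.co_name))
    return tracer

def traced():
    return 1

def run():
    previous = sys.gettrace()  # keep whatever tracer was installed before
    try:
        sys.settrace(tracer)
        # Snapshot taken before any new Python frame is entered.
        seen_after_install = list(events)
        traced()
    finally:
        sys.settrace(previous)
    return seen_after_install
```

On CPython the snapshot is expected to be empty, and the recorded events all belong to `traced`, starting with its `call` and ending with its `return`.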

vm/src/frame.rs

alanjds · 2020-11-19T22:23:00Z

When testing pdb/bdb with this branch, I realized that one source of tooling confusion is that the AST generator (or the parser, I'm not sure) marks many bytecodes as "line 0", even ones far down the file. As a result, pdb cannot reliably print the current line or work out where to stop after a step. I consider this a bug, even if it is an interpreter-dependent one.

I then came across PEP 626, accepted for CPython 3.10. It adds a co_lines() method on code objects that iterates over tuples of bytecode offsets and source line, and co_lnotab becomes a lazy attribute generated from co_lines().

I was having trouble implementing co_lnotab for RustPython; building on co_lines could ease the whole implementation while also making it future-proof.
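As a rough sketch of what deriving co_lnotab from a list of line starts could look like, assuming the CPython 3.6 to 3.9 layout (pairs of an unsigned offset delta and a signed line delta, with large deltas split into chunks). `encode_lnotab` and `decode_lnotab` are hypothetical helpers written for illustration; they are not RustPython or CPython APIs, and the decoder only mirrors the general shape of the older `dis.findlinestarts`.

```python
def encode_lnotab(linestarts, first_lineno):
    """Encode sorted (offset, line) pairs as lnotab-style bytes (assumed 3.6-3.9 layout)."""
    out = bytearray()
    prev_offset, prev_line = 0, first_lineno
    for offset, line in linestarts:
        d_off = offset - prev_offset
        d_line = line - prev_line
        if d_off == 0 and d_line == 0:
            continue  # nothing to record for this entry
        while d_off > 255:          # offset deltas are unsigned bytes
            out += bytes((255, 0))
            d_off -= 255
        while d_line > 127:         # line deltas are signed bytes
            out += bytes((d_off, 127))
            d_off = 0
            d_line -= 127
        while d_line < -128:
            out += bytes((d_off, 0x80))  # 0x80 is -128 as a signed byte
            d_off = 0
            d_line += 128
        out += bytes((d_off, d_line & 0xFF))
        prev_offset, prev_line = offset, line
    return bytes(out)


def decode_lnotab(lnotab, first_lineno):
    """Yield (offset, line) pairs back from lnotab-style bytes."""
    last_line = None
    line = first_lineno
    addr = 0
    for byte_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if byte_incr:
            if line != last_line:
                yield addr, line
                last_line = line
            addr += byte_incr
        if line_incr >= 0x80:
            line_incr -= 0x100  # reinterpret as a signed byte
        line += line_incr
    if line != last_line:
        yield addr, line


starts = [(0, 1), (4, 2), (300, 200), (302, 3)]
table = encode_lnotab(starts, first_lineno=1)
assert list(decode_lnotab(table, first_lineno=1)) == starts
```

The round trip at the end is the main point: whatever table the interpreter emits, tools only need the decoded line starts to agree with the instruction locations.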

Could I get some guidance on where/how to implement it?

alanjds · 2020-11-19T22:31:17Z

Btw, the co_lnotab attribute will be deprecated in 3.10 and removed in 3.12.

coolreader18 · 2020-11-20T00:06:00Z

Yeah, I think co_lines seems better, especially if we're having issues with bytecode locations. For implementing, I think we could iterate through the locations vec and do something like group_by to find the ranges of where the bytecode locations are identical.
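The grouping idea can be sketched in Python with `itertools.groupby`. This assumes a hypothetical `locations` list holding one source line per instruction, and uses instruction indices as offsets; the real RustPython locations vec holds richer location data, so treat this only as an illustration of the range-finding step.

```python
from itertools import groupby

def co_lines_from_locations(locations):
    """Collapse per-instruction line numbers into (start, end, line) ranges."""
    ranges = []
    start = 0
    for line, run in groupby(locations):
        length = sum(1 for _ in run)  # number of consecutive instructions on this line
        ranges.append((start, start + length, line))
        start += length
    return ranges

# Instructions 0-1 come from line 1, 2-4 from line 2, 5 from line 1 again, 6 from line 3.
print(co_lines_from_locations([1, 1, 2, 2, 2, 1, 3]))
# prints [(0, 2, 1), (2, 5, 2), (5, 6, 1), (6, 7, 3)]
```

Because `groupby` only merges adjacent equal values, a line that appears again later (as line 1 does here) correctly gets a second range.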

coolreader18 · 2020-11-20T00:08:02Z

Also, what are you planning on using the co_code attribute for? Our bytecode format isn't really stable, and all the existing code that reads co_code probably expects the cpython bytecode format. I think I'd prefer to keep it unimplemented if possible, just so we avoid things reading co_code as "corrupted" because it isn't the format it expects.

alanjds · 2020-11-20T13:59:11Z

> Also, what are you planning on using the co_code attribute for? Our bytecode format isn't really stable, and all the existing code that reads co_code probably expects the cpython bytecode format. I think I'd prefer to keep it unimplemented if possible, just so we avoid things reading co_code as "corrupted" because it isn't the format it expects.

Well, not really.

I needed co_code to be available because pdb.py uses the dis and inspect modules to get the source lines, via co_lnotab + co_code. It's not nice, but it's how the debuggers work today :/
The internal dis.rs is OK for demonstrations but not enough for the inspect and pdb modules.

Looking at the code of pdb and bdb, I saw nothing very specific to the bytecode format, which makes sense, as the same pdb.py is used on PyPy, for example, which has a different bytecode set than CPython.

Implementing co_code + co_lnotab seemed a more stable approach, as pdb.py, inspect and dis are fairly tightly coupled around them. I started by patching pdb.py to replace the parts touching co_code + co_lnotab, but the rabbit hole showed itself once I started digging.

Another thing is that, once pdb is working, my next targets would be IPython, wdb and debugpy. These will probably need patches and maintenance. Implementing co_code + co_lnotab seems like less work both now and in the future.

alanjds · 2020-11-20T14:03:03Z

Ah, I also plan to implement PEP 3147 (aka __pycache__ folder) in the near future. This will need direct access to the bytecode to be dumped to a file on disk.

alanjds · 2020-11-20T14:09:34Z

This is where pdb.py indirectly uses co_code:
https://github.com/python/cpython/blob/022bc7572f061e1d1132a4db9d085b29707701e7/Lib/pdb.py#L107-L122

def getsourcelines(obj):
    lines, lineno = inspect.findsource(obj)
    if inspect.isframe(obj) and obj.f_globals is obj.f_locals:
        # must be a module frame: do not try to cut a block out of it
        return lines, 1
    elif inspect.ismodule(obj):
        return lines, 1
    return inspect.getblock(lines[lineno:]), lineno+1

def lasti2lineno(code, lasti):
    linestarts = list(dis.findlinestarts(code))
    linestarts.reverse()
    for i, lineno in linestarts:
        if lasti >= i:
            return lineno
    return 0

Also, pdb.py sets f_lasti when jumping around. This will need a way to translate the line number to the bytecode offset, if I understood correctly.
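A minimal sketch of that reverse translation, assuming the line starts are already available as `(offset, lineno)` pairs like the ones `dis.findlinestarts` yields. `lineno2lasti` is a hypothetical helper name, not something pdb.py or RustPython defines.

```python
def lineno2lasti(linestarts, lineno):
    """Return the lowest bytecode offset that starts `lineno`, or None if there is none."""
    candidates = [offset for offset, line in linestarts if line == lineno]
    return min(candidates) if candidates else None

# Line 2 starts at offsets 4 and 16 (for example a loop body re-entered later).
linestarts = [(0, 1), (4, 2), (10, 3), (16, 2)]
print(lineno2lasti(linestarts, 2))  # prints 4
print(lineno2lasti(linestarts, 7))  # prints None
```

Picking the lowest offset is a design choice for the sketch; a real jump implementation would also need to check that the target is a valid place to resume execution.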

coolreader18 · 2020-11-20T15:58:34Z

> I also plan to implement PEP 3147

I think that's already implemented, just by the importlib frozen module.

Also, it looks like the way findlinestarts is implemented assumes that co_code is actually bytecode, i.e. f_lasti corresponds to an index into f_code.co_code, or at least len(co_code) represents the max that f_lasti can be. I think because the dis module is so tied to the bytecode format of the interpreter, it would make sense to implement findlinestarts in the dis.rs module we already have.
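To illustrate why a native findlinestarts would not need co_code at all, here is a Python sketch that works purely from `(start, end, line)` ranges in the style of PEP 626's co_lines. The function names and the sample `ranges` are assumptions for illustration; offsets here are whatever unit f_lasti uses in the interpreter.

```python
def findlinestarts(line_ranges):
    """Yield (offset, line) each time the line changes, skipping ranges with no line."""
    last_line = None
    for start, _end, line in line_ranges:
        if line is not None and line != last_line:
            last_line = line
            yield start, line

def lasti2lineno(line_ranges, lasti):
    """Map an instruction offset to its line; 0 if the offset is outside every range."""
    for start, end, line in line_ranges:
        if start <= lasti < end:
            return line
    return 0

ranges = [(0, 2, 1), (2, 5, 2), (5, 6, 1), (6, 7, 3)]
print(list(findlinestarts(ranges)))  # prints [(0, 1), (2, 2), (5, 1), (6, 3)]
print(lasti2lineno(ranges, 4))       # prints 2
print(lasti2lineno(ranges, 7))       # prints 0
```

The end of the last range plays the role that len(co_code) plays in the CPython version: it bounds the largest value f_lasti can take.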

alanjds · 2020-11-20T19:03:19Z

I was playing with an implementation of findlinestarts in dis.rs when I left it half-baked here.

Will explore this path first. Thanks for the guidance.

alanjds added 11 commits August 11, 2020 17:31

Merge branch 'master' into support-pdb

0f8db25

Comment on incompatibility on Frame.__delattr__

688a3c8

No .unwrap() is needed on PyRef !!

f5d119b

gc.collect() is dumb to always return 0 collected objects

f5bb995

Disable gc on RustPython tests

c542022

Revert python-based sys.displayhook implementation

8b58390

This reverts commit c0b11de.

Merge remote-tracking branch 'official/master' into support-pdb

3475470

Frame co_code is bytes serialization of bytecode instructions

d14ccda

Existence of co_code is used by debugging tools to detect the Code object

Simpler return for sys_settrace

3bc6acc

vm.skip_frame_next::<Bool> skips the next frame from tracing

b6b029d

This is needed so that sys.settrace(fn) does not call fn() right away with its own return.

Merge branch 'master' into _frame_co_code

4d2f881

coolreader18 reviewed Aug 26, 2020

vm/src/frame.rs

frame.rs is LF, and remove two warnings

ee54f3b

youknowone mentioned this pull request Jun 1, 2022

[RFC] Executing by stepping each instruction #3761

Open

alanjds wants to merge 12 commits into RustPython:main from alanjds:_frame_co_code


2 participants
