Skip to content

gh-131798: Split _CHECK_AND_ALLOCATE_OBJECT into smaller uops#148433

Merged
Fidget-Spinner merged 7 commits intopython:mainfrom
Sacul0457:Split-_CHECK_AND_ALLOCATE_OBJECT
Apr 12, 2026
Merged

gh-131798: Split _CHECK_AND_ALLOCATE_OBJECT into smaller uops#148433
Fidget-Spinner merged 7 commits intopython:mainfrom
Sacul0457:Split-_CHECK_AND_ALLOCATE_OBJECT

Conversation

@Sacul0457
Copy link
Copy Markdown
Contributor

@Sacul0457 Sacul0457 commented Apr 12, 2026

@Sacul0457
Copy link
Copy Markdown
Contributor Author

Hey @Fidget-Spinner, I'm a little lost here
From my understanding:

  1. In _CHECK_OBJECT if tp->tp_version_tag == type_version is true, we remove the guard.
  2. Do we set callable to a const here like it currently does?
  3. If 1. is false, we need to add a type watcher.

assert(init != NULL);
assert(PyFunction_Check(init));
callable = sym_new_const(ctx, init);
sym_set_const(callable, init);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This should go into the new _ALLOCATE_OBJECT
  2. It should be callable = sym_new_const(ctx, init); not sym_set_const(callable, init);. There's a difference --- the first overrides the current callable with a new one, the second sets the current callable to a known constant. If you read the code in bytecodes.c, you'll see it's the first one.

PyType_Watch(TYPE_WATCHER_ID, (PyObject *)type);
_Py_BloomFilter_Add(dependencies, type);
}
}
Copy link
Copy Markdown
Member

@cocolato cocolato Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same to bytecodes.c:

Suggested change
}
}
self_or_null = sym_new_not_null(ctx);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait sorry, shouldn't this be sym_set_non_null?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Please look at bytecodes.c. It's creating a new object, self. That's what _ALLOCATE_OBJECT does. So it should be a new non-null. This is same as the above where it grabs a new __init__ callable. In both of these cases, we do not set the existing symbol, but create a new one. This is a good learning point as it's tricky and requires observing bytecodes.c!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok

Co-authored-by: Hai Zhu <haiizhu@outlook.com>
@Fidget-Spinner
Copy link
Copy Markdown
Member

@Sacul0457 I'm going to push to this PR, as I think it's a little tricky to get right. Please add a test after I push.

@Fidget-Spinner
Copy link
Copy Markdown
Member

@Sacul0457 please add the following tests:

  1. One test where there's a class outside the trace, and needs to be inferred. E.g.
class MyPoint:
    def __init__(self, x, y):
        return None

def testfunc(n):
    point_local = MyPoint
    for _ in range(n):
        p = point_local(1.0, 2.0)
        p = point_local(1.0, 2.0)

  1. One test where it's directly using the class, and that gets constant folded and removed. E.g.
class MyPoint:
    def __init__(self, x, y):
        return None

def testfunc(n):
    for _ in range(n):
        p = MyPoint(1.0, 2.0)
        p = MyPoint(1.0, 2.0)

@Sacul0457
Copy link
Copy Markdown
Contributor Author

Sorry for the wait, I printed the uops trace for the second test and how can we tell if it is constant folded?

['_START_EXECUTOR', '_MAKE_WARM', '_SET_IP', '_CHECK_PERIODIC', '_CHECK_VALIDITY', '_ITER_CHECK_RANGE', '_GUARD_NOT_EXHAUSTED_RANGE', 
'_ITER_NEXT_RANGE', '_SET_IP', '_SWAP_FAST_1', '_SPILL_OR_RELOAD', '_POP_TOP', '_CHECK_VALIDITY', '_SET_IP', '_LOAD_DEREF', 
'_CHECK_VALIDITY', '_PUSH_NULL', '_LOAD_CONST_INLINE_BORROW', '_SPILL_OR_RELOAD', '_LOAD_CONST_INLINE_BORROW', '_SET_IP', 
'_SPILL_OR_RELOAD', '_CHECK_OBJECT', '_ALLOCATE_OBJECT', '_CREATE_INIT_FRAME', '_PUSH_FRAME', '_CHECK_VALIDITY', 
'_TIER2_RESUME_CHECK', '_LOAD_CONST_INLINE_BORROW', '_SET_IP', '_RETURN_VALUE', '_CHECK_VALIDITY', '_SET_IP', '_EXIT_INIT_CHECK', 
'_CHECK_VALIDITY', '_SET_IP', '_MAKE_HEAP_SAFE', '_RETURN_VALUE', '_GUARD_IP_RETURN_VALUE', '_GUARD_CODE_VERSION_RETURN_VALUE', 
'_CHECK_VALIDITY', '_SET_IP', '_SWAP_FAST_2', '_POP_TOP', '_CHECK_VALIDITY', '_SET_IP', '_LOAD_DEREF', '_CHECK_VALIDITY', 
'_PUSH_NULL', '_LOAD_CONST_INLINE_BORROW', '_SPILL_OR_RELOAD', '_LOAD_CONST_INLINE_BORROW', '_SET_IP', '_SPILL_OR_RELOAD', 
'_CHECK_OBJECT', '_ALLOCATE_OBJECT', '_CREATE_INIT_FRAME', '_PUSH_FRAME', '_CHECK_VALIDITY', '_TIER2_RESUME_CHECK', 
'_LOAD_CONST_INLINE_BORROW', '_SET_IP', '_RETURN_VALUE', '_CHECK_VALIDITY', '_SET_IP', '_EXIT_INIT_CHECK', '_CHECK_VALIDITY', 
'_SET_IP', '_MAKE_HEAP_SAFE', '_RETURN_VALUE', '_GUARD_IP_RETURN_VALUE', '_GUARD_CODE_VERSION_RETURN_VALUE', '_CHECK_VALIDITY', '_SET_IP', '_SWAP_FAST_2', '_POP_TOP', '_JUMP_TO_TOP', ...]

@Fidget-Spinner
Copy link
Copy Markdown
Member

@Sacul0457 that's strange, seems like it's not being constant folded in the second case. Let me fix that.

@Sacul0457
Copy link
Copy Markdown
Contributor Author

Oh alright. I'm probably going to bed now, please feel free to add the tests too as I don't want to hold up this PR being merged. Thank you!

@Sacul0457 Sacul0457 marked this pull request as ready for review April 12, 2026 17:42
@Fidget-Spinner
Copy link
Copy Markdown
Member

Sorry Sacul, I forgot to mention that you can't nest MyPoint in the test, it needs to be a global to be promoted. My bad, and good night!

Copy link
Copy Markdown
Member

@Fidget-Spinner Fidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for splitting the ops in bytecodes.c Sacul.

@Fidget-Spinner Fidget-Spinner merged commit 18d7d90 into python:main Apr 12, 2026
78 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants