gh-147988: Initialize digits in long_alloc() in debug mode#147989
gh-147988: Initialize digits in long_alloc() in debug mode#147989vstinner merged 5 commits intopython:mainfrom
Conversation
When Python is built in debug mode, long_alloc() now initializes digits with a pattern to detect usage of uninitialized digits. _PyLong_CompactValue() makes sure that the digit is zero when the sign is zero.
|
I also modified PyLongWriter_Finish() to detect uninitialized digits. It's a good place for such checks :-) |
| assert(PyType_HasFeature(op->ob_base.ob_type, Py_TPFLAGS_LONG_SUBCLASS)); | ||
| assert(PyUnstable_Long_IsCompact(op)); | ||
| sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ||
| if (sign == 0) { |
There was a problem hiding this comment.
Does not it emit a warning or generate inefficient code in non-debug build?
You can write assert(size != 0 || ...).
There was a problem hiding this comment.
I think that sane compiler should recognize here no-op. I've tested assembly code for a toy program with gcc -O2.
There was a problem hiding this comment.
In release mode, assertions are removed and so the code becomes if (size == 0) {}. C compilers are good to remove such dead code. I looked at the assembly code of Python built in release mode, and I cannot see any instruction related to if (size == 0).
gcc -O3 doesn't emit a warning on such code.
| int sign = is_signed ? -1: 1; | ||
| if (idigit == 0) { | ||
| sign = 0; | ||
| v->long_value.ob_digit[0] = 0; |
There was a problem hiding this comment.
Is it needed in non-debug build?
There was a problem hiding this comment.
As described in the issue, MSAN generates a memory error without this line, since _PyLong_CompactValue() uses uninitialized memory (ob_digit[0]).
If you ignore MSAN (and other sanitizers), maybe_small_long() returns the zero singleton, and _PyLong_CompactValue() returns 0 because it computes sign * digit where sign is 0 and digit is a random number.
This PR moves v->long_value.ob_digit[0] = 0; from long_alloc() to _PyLong_FromByteArray(), and make it conditional: only write ob_digit[0] if (idigit == 0) is true. So in the common case, the assignment is no longer done.
There was a problem hiding this comment.
Technically, we could live without all this. But I think it's a good idea to set digit[0] for zero.
Historically, we set number of digits to zero for zero. I think it should be 1 to reflect actual memory allocation. Then zero representation will make sense in both "big int" format (sign + magnitude as 1-element array) and as a "small int" (sign*digit[0])
| Py_UNREACHABLE(); | ||
| } | ||
|
|
||
| if ((size_z + negz) == 0) { |
There was a problem hiding this comment.
This is equivalent to size_z == 0 & !negz, right? Why write the condition in such form? Why add this if? Is it needed in non-debug build?
There was a problem hiding this comment.
This is equivalent to size_z == 0 & !negz, right? Why write the condition in such form?
My intent is to add a special case for long_alloc(0) on long_alloc(size_z + negz) below.
Why add this if?
The alternative is to add z->long_value.ob_digit[0] = 0; after long_alloc() below.
Do you prefer adding z->long_value.ob_digit[0] = 0; ?
Is it needed in non-debug build?
See my reply on _PyLong_FromByteArray() change.
There was a problem hiding this comment.
Well, it does not matter.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
| // long_alloc() fills digits with 0xFF byte pattern. | ||
| Py_ssize_t ndigits = _PyLong_DigitCount(obj); | ||
| if (ndigits == 0) { | ||
| // Check ob_digit[0] digit for the number zero |
There was a problem hiding this comment.
Should not we simply assert obj->long_value.ob_digit[0] == 0?
There was a problem hiding this comment.
I modified the code to raise SystemError, so there is now more complex code below.
| Py_UNREACHABLE(); | ||
| } | ||
|
|
||
| if ((size_z + negz) == 0) { |
There was a problem hiding this comment.
Well, it does not matter.
PyLongWriter_Finish() now raises SystemError instead of stopping the process with abort(). Add test on PyLongWriter_Finish() bug.
|
I modified |
When Python is built in debug mode, long_alloc() now initializes digits with a pattern to detect usage of uninitialized digits.
_PyLong_CompactValue() makes sure that the digit is zero when the sign is zero.
Objects/longobject.c(_PyLong_New()) #147988