Closed
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs), type-bug (An unexpected behavior, bug, or error)
Description
Bug report
Bug description:
The function _PyUnicodeWriter_WriteASCIIString in Objects/unicodeobject.c (or Objects/unicode_writer.c in some versions) contains potential undefined behavior (UB). When len is 0 and ascii is NULL, it calls memcpy with a NULL source pointer.
According to the C standard (C11 7.24.1p2), pointer arguments to memcpy must be valid even when the count is zero, so passing NULL is undefined behavior regardless of len.
Proof of Concept
Clang's UndefinedBehaviorSanitizer (UBSan) reports:
../Objects/unicode_writer.c:494:36: runtime error: null pointer passed as argument 2, which is declared to never be null
This happens in the following block:
    case PyUnicode_1BYTE_KIND:
    {
        const Py_UCS1 *str = (const Py_UCS1 *)ascii;
        Py_UCS1 *data = writer->data;
        memcpy(data + writer->pos, str, len); // <--- UB if str is NULL and len is 0
        break;
    }
Mitigation
The function should return early if len == 0. This is a common pattern in CPython to avoid unnecessary work and prevent UB with memory functions.
    if (len == -1)
        len = strlen(ascii);
    if (len == 0)
        return 0;
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
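Standalone sketch of the guard
For anyone who wants to try the pattern outside of CPython, here is a minimal sketch of the early return proposed above. The helper copy_ascii is hypothetical and is not part of CPython; it only mirrors the len == -1 / len == 0 handling from the mitigation and uses ptrdiff_t as a stand-in for Py_ssize_t.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not CPython code): copy len bytes of ASCII from src
 * into dst. A len of -1 means "src is NUL-terminated, measure it". */
static int
copy_ascii(unsigned char *dst, const char *src, ptrdiff_t len)
{
    if (len == -1) {
        /* As in the original function, src must be non-NULL on this path. */
        len = (ptrdiff_t)strlen(src);
    }
    if (len == 0) {
        /* Nothing to copy: return before memcpy can see a NULL src. */
        return 0;
    }
    memcpy(dst, src, (size_t)len);
    return 0;
}
```

With this guard, a call such as copy_ascii(buf, NULL, 0) never reaches memcpy, so building with -fsanitize=undefined should no longer report the null-pointer argument for the empty case.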