gh-145142: Make str.maketrans safe under free-threading#145157
gh-145142: Make str.maketrans safe under free-threading#145157VanshAgarwal24036 wants to merge 10 commits intopython:mainfrom
Conversation
Objects/unicodeobject.c
Outdated
| } | ||
| PyObject *items = PyDict_Items(x); | ||
| if(items == NULL) goto err; | ||
| Py_ssize_t n = PyList_GET_SIZE(items); |
There was a problem hiding this comment.
Use a critical section here instead of copying the items to a list. You'll need to fix up the goto err. Something like:
{
...
Py_BEGIN_CRITICAL_SECTION(x);
while (PyDict_Next(x, &i, &key, &value)) {
...
}
goto done;
err:
Py_CLEAR(new);
done:
Py_END_CRITICAL_SECTION();
return new;
}There was a problem hiding this comment.
- It fails the Windows _freeze_module build (syntax errors from macro expansion).
- A version with early returns builds but causes test_str failures under free-threading on macOS/Linux
Is there a preferred pattern for using Py_BEGIN_CRITICAL_SECTION here, or would you prefer using the dict’s internal locking helpers instead?
Misc/NEWS.d/next/Core_and_Builtins/2026-02-23-23-18-28.gh-issue-145142.T-XbVe.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
I would suggest isolating the loop in a separate function so that we do not have those weird labels jumps and confusing names but this should be benchmarked on PGO/LTO tocheck that we do not introduce an overhead (though I am not that worried as str.maketrans() is usually not a hot function).
Objects/unicodeobject.c
Outdated
| newkey = PyLong_FromLong(PyUnicode_READ(kind, data, 0)); | ||
| if (!newkey) | ||
| goto err; | ||
| goto error; |
There was a problem hiding this comment.
Having the err and error labels is really confusing. We should not have both.
Objects/unicodeobject.c
Outdated
| goto done; | ||
| error: | ||
| Py_CLEAR(new); | ||
| done: | ||
| Py_END_CRITICAL_SECTION(); | ||
| return new; |
There was a problem hiding this comment.
Having a "done" while not entirely done (that is, being done inside the function block) is confusing as well. There are too many jumps for it to be clear when reading the code. I would suggest (1) not using labels at all or (2) find better names for the labels (3) or just duplicate the Py_CLEAR(new) and remove the error label (and rename the done label). (4) wrap the logic in a separate helper.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
d487864 to
05ba3f3
Compare
206f6b2 to
4e715b0
Compare
|
I have made the requested changes; please review again. |
|
Thanks for making the requested changes! @picnixz: please review the changes made to this pull request. |
| goto err; | ||
| } | ||
| } | ||
| if(unicode_maketrans_from_dict(x, new) < 0) |
There was a problem hiding this comment.
I think @picnixz suggestion was to put the critical section outside the function. Something like:
int err;
Py_BEGIN_CRITICAL_SECTION(x);
err = unicode_maketrans_from_dict(x, new);
Py_END_CRITICAL_SECTION();
if (x < 0) {
goto err;
}
Then inside unicode_maketrans_from_dict, the error returns can just be return -1 instead of break and no need for the if (!PyErr_Occurred()) check.
Replace unsafe PyDict_Next iteration with snapshot iteration using PyDict_Items to avoid crashes when the input dict is mutated concurrently under free-threading.
Adds a regression test.