gh-144984: Fix crash in ExternalEntityParserCreate() error paths #144992

Open
raminfp wants to merge 1 commit into python:main from raminfp:fix-pyexpat-extentity-error-paths

Contributor

@raminfp commented Feb 19, 2026

Fix crash when ExternalEntityParserCreate() hits an error path
(allocation failure). Py_DECREF(new_parser) calls xmlparse_dealloc()
on a partially-initialized object where handlers is NULL, causing a
NULL pointer dereference in clear_handlers. Additionally,
Py_CLEAR(parent) in dealloc already decrements the parent's refcount,
making the subsequent Py_DECREF(self) a double-decrement.

  • Add NULL guard in clear_handlers
  • Set parent = NULL before Py_DECREF(new_parser) in each error path

When ExternalEntityParserCreate() hits an error path (allocation
failure), Py_DECREF(new_parser) triggers xmlparse_dealloc() on a
partially-initialized object:

1. handlers is NULL, so clear_handlers dereferences NULL (SEGV).
2. Py_CLEAR(parent) in dealloc already decrements the parent's
   refcount, so the explicit Py_DECREF(self) is a double-decrement.

Fix by adding a NULL guard in clear_handlers and setting parent to
NULL before Py_DECREF(new_parser) in each error path so that dealloc
does not over-decrement the parent's refcount.
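
The double decrement is easier to follow with explicit counters. What follows is a toy Python model, not the pyexpat source: `ToyParser`, `decref`, `dealloc` and `failing_create` are hypothetical names, and the assumption that the creation routine takes one reference to the parent on the child's behalf before the failure point is inferred from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyParser:
    """Toy stand-in for a refcounted parser object (assumption, not real code)."""
    refcnt: int = 1
    parent: Optional["ToyParser"] = None
    handlers: Optional[list] = None  # None models the NULL `handlers` pointer


def clear_handlers(p: ToyParser) -> None:
    # Models the NULL guard: a partially initialized object has no handlers.
    if p.handlers is None:
        return
    for i in range(len(p.handlers)):
        p.handlers[i] = None


def dealloc(p: ToyParser) -> None:
    clear_handlers(p)
    if p.parent is not None:  # models Py_CLEAR(parent)
        parent, p.parent = p.parent, None
        decref(parent)


def decref(p: ToyParser) -> None:
    p.refcnt -= 1
    if p.refcnt == 0:
        dealloc(p)


def failing_create(parent: ToyParser, detach_parent: bool) -> None:
    """Simulate an allocation failure right after the child is linked."""
    parent.refcnt += 1                # reference taken on the child's behalf
    child = ToyParser(parent=parent)  # handlers not allocated yet
    if detach_parent:
        child.parent = None           # the approach in this PR
    decref(child)                     # dealloc may release child.parent
    decref(parent)                    # explicit decrement on the error path


buggy = ToyParser()
failing_create(buggy, detach_parent=False)
fixed = ToyParser()
failing_create(fixed, detach_parent=True)
print(buggy.refcnt, fixed.refcnt)  # 0 1
```

In the model, the caller still holds its own reference to each parent, so a count of 0 means one reference too many was released, while 1 is the expected balance.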
Comment on lines +839 to +842
_testcapi = import_helper.import_module('_testcapi')
parser = expat.ParserCreate()
parser.buffer_text = True
rc_before = sys.getrefcount(parser)
Member

I suspect we'll need _testcapi later for this class (if we find other bugs), so add a setUpClass method that binds cls.testcapi so that the test methods can use it.
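
A minimal sketch of what such a `setUpClass` could look like. The class name and the test method are hypothetical, and `importlib` with an explicit `unittest.SkipTest` is used here only to keep the sketch self-contained; inside CPython's test suite, `import_helper.import_module('_testcapi')` raises `SkipTest` on its own when the module is missing.

```python
import importlib
import unittest


class ExternalEntityErrorPathTest(unittest.TestCase):  # hypothetical name
    @classmethod
    def setUpClass(cls):
        # Bind the module once per class; skip every test if it is missing.
        try:
            cls.testcapi = importlib.import_module('_testcapi')
        except ImportError as exc:
            raise unittest.SkipTest(f'_testcapi is unavailable: {exc}')

    def test_module_is_bound(self):
        # Any test method can now reach the module through self.testcapi.
        self.assertEqual(self.testcapi.__name__, '_testcapi')
```

Raising `SkipTest` from `setUpClass` marks the whole class as skipped, so individual methods do not need their own import guard.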

Comment on lines +845 to +849
try:
    parser.ExternalEntityParserCreate(None)
except MemoryError:
    pass
finally:
Member

Explicitly catch the MemoryError.
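
One possible reading of this suggestion, sketched under assumptions: a `try`/`except MemoryError: pass` block also passes when no exception is raised at all, whereas `assertRaises` fails in that case. `create_subparser_without_memory` is a hypothetical stand-in for the call made while allocation failures are being injected; it is not part of pyexpat or `_testcapi`.

```python
import unittest


def create_subparser_without_memory():
    """Hypothetical stand-in that fails the way an allocation failure would."""
    raise MemoryError


class ErrorPathRaisesTest(unittest.TestCase):  # hypothetical name
    def test_memory_error_is_raised(self):
        # Fails if the call returns normally instead of raising MemoryError.
        with self.assertRaises(MemoryError):
            create_subparser_without_memory()
```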

if (self->buffer != NULL) {
    new_parser->buffer = PyMem_Malloc(new_parser->buffer_size);
    if (new_parser->buffer == NULL) {
        new_parser->parent = NULL;
Member

Actually, the issue isn't here; it's the decref on self. Instead, let's do new_parser->parent = Py_NewRef(self) and let the new parser's deallocator be responsible for decrefing this reference, instead of us doing the Py_DECREF(self).
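
The ownership rule proposed here can be sketched with a toy Python model (not the pyexpat source). `ToyObject`, `new_ref`, `decref` and `create_child` are hypothetical names; `new_ref` mimics what `Py_NewRef` does in C, which is to increment the count and return the same object.

```python
class ToyObject:
    """Toy refcounted object (assumption, not CPython's real machinery)."""

    def __init__(self):
        self.refcnt = 1
        self.parent = None


def new_ref(obj):
    # Models Py_NewRef: take a new strong reference and return the object.
    obj.refcnt += 1
    return obj


def decref(obj):
    obj.refcnt -= 1
    if obj.refcnt == 0 and obj.parent is not None:
        # Models the deallocator releasing the reference the child owns.
        parent, obj.parent = obj.parent, None
        decref(parent)


def create_child(parent, fail):
    child = ToyObject()
    child.parent = new_ref(parent)  # the child owns this reference
    if fail:
        decref(child)  # the only cleanup needed; no decref(parent) here
        return None
    return child


parent = ToyObject()
create_child(parent, fail=True)
print(parent.refcnt)  # 1
```

With a single owner for the parent reference, every error path reduces to releasing the child, and the parent's count stays balanced whether creation fails or succeeds.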
