fix(compiler): parse named HTML entities containing digits #67229

Open
yogeshwaran-c wants to merge 1 commit into angular:main from yogeshwaran-c:fix/named-entity-digits

Conversation

@yogeshwaran-c

What kind of change does this PR introduce?

Bug fix

What is the current behavior?

Named HTML entities containing digits (e.g. &sup1;, &frac12;, &blk34;, &there4;) are not processed correctly by the Angular template compiler. Instead of being decoded to their corresponding Unicode characters, they are output as escaped text (e.g. the literal text &sup1; instead of ¹).

This affects all 24 valid HTML named entities that contain digits in their names: blk12, blk14, blk34, emsp13, emsp14, frac12, frac13, frac14, frac15, frac16, frac18, frac23, frac25, frac34, frac35, frac38, frac45, frac56, frac58, frac78, sup1, sup2, sup3, there4.

Closes #51323

What is the new behavior?

The lexer's isNamedEntityEnd function now accepts digits as valid characters within named entity references, allowing all standard HTML named entities to be parsed and decoded correctly.

Additional context

The root cause was in the isNamedEntityEnd function in packages/compiler/src/ml_parser/lexer.ts. This function determines when to stop scanning a named entity reference. It used !chars.isAsciiLetter(code) as its termination condition, which meant any digit would immediately end the entity name scan. For an entity like &sup1;, the scanner would stop at sup (before the 1), never reach the semicolon, and fall back to treating the & as plain text.

The fix adds chars.isDigit(code) to the valid character check, so entity names like sup1 and frac12 are fully scanned.

The entity lookup table in entities.ts already contained all 24 entities — only the lexer scanning logic needed to be updated.
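
For reviewers who want to see the failure mode in isolation, here is a minimal, self-contained sketch of the behavior described above. It is not the actual lexer code: the helper functions, the EOF constant, decodeEntity, and the four-entry table are hypothetical stand-ins for chars, the lexer's scan loop, and entities.ts.

```typescript
// Hypothetical stand-ins for the helpers in the compiler's `chars` module.
const AMPERSAND = 38; // '&'
const SEMICOLON = 59; // ';'
const EOF = 0; // assumed end-of-input sentinel

const isAsciiLetter = (code: number): boolean =>
  (code >= 97 && code <= 122) || (code >= 65 && code <= 90);
const isDigit = (code: number): boolean => code >= 48 && code <= 57;

// Old termination check: any non-letter, including a digit, ends the name.
const isNamedEntityEndOld = (code: number): boolean =>
  code === SEMICOLON || code === EOF || !isAsciiLetter(code);

// New termination check: digits are accepted as part of the name.
const isNamedEntityEndNew = (code: number): boolean =>
  code === SEMICOLON || code === EOF || !(isAsciiLetter(code) || isDigit(code));

// Tiny excerpt standing in for the lookup table in entities.ts.
const NAMED_ENTITIES = new Map<string, string>([
  ['amp', '&'],
  ['sup1', '\u00B9'],
  ['frac12', '\u00BD'],
  ['there4', '\u2234'],
]);

// Decodes a single entity reference at the start of `input`.
// Returns `input` unchanged when no valid reference is recognized.
function decodeEntity(input: string, isEnd: (code: number) => boolean): string {
  if (input.charCodeAt(0) !== AMPERSAND) return input;
  let i = 1;
  while (i < input.length && !isEnd(input.charCodeAt(i))) i++;
  // The scan has to stop on ';' for the reference to be valid.
  if (input.charCodeAt(i) !== SEMICOLON) return input;
  const decoded = NAMED_ENTITIES.get(input.slice(1, i));
  return decoded === undefined ? input : decoded + input.slice(i + 1);
}
```

With the old predicate, decodeEntity('&sup1;', isNamedEntityEndOld) stops at the 1, does not find a semicolon there, and returns the input as plain text. With the new predicate the scan reaches the semicolon and the table lookup succeeds, even though the table itself is the same in both cases.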

The lexer's isNamedEntityEnd function stopped scanning entity names
when encountering a digit character, causing 24 valid HTML named
entities with digits in their names (e.g. &sup1;, &frac12;, &blk34;)
to be treated as plain text instead of decoded to their corresponding
Unicode characters.

Fixes angular#51323
@pullapprove pullapprove bot requested a review from thePunderWoman February 23, 2026 21:04
@angular-robot angular-robot bot added the area: compiler Issues related to `ngc`, Angular's template compiler label Feb 23, 2026
@ngbot ngbot bot added this to the Backlog milestone Feb 23, 2026
Contributor

@thePunderWoman thePunderWoman left a comment

LGTM

Labels

area: compiler Issues related to `ngc`, Angular's template compiler

Development

Successfully merging this pull request may close these issues.

Named characters references containing digits are not processed properly

2 participants