fix(compiler): allow digits in named character reference entity names #67716

Closed
mango766 wants to merge 1 commit into angular:main from mango766:fix/named-entities-with-digits

Conversation

@mango766

Summary

Named HTML character references containing digits (e.g. &sup1;, &frac12;, &blk14;, &there4;, &emsp13;) are not processed correctly in Angular templates. There are 24 such entities defined in the HTML spec, and none of them render properly.

Root cause

The isNamedEntityEnd function in packages/compiler/src/ml_parser/lexer.ts uses !chars.isAsciiLetter(code) to determine when the entity name ends. Since digits are not ASCII letters, the lexer stops reading the entity name prematurely when it encounters a digit character.

For example, when parsing &sup1;:

  1. The lexer reads s, u, p (all ASCII letters ✓)
  2. It hits 1; isAsciiLetter('1') is false, so it stops
  3. The captured name is sup instead of sup1
  4. The next character is 1, not ;, so the entity parse fails
  5. The lexer falls back to emitting & as literal text
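The steps above can be reproduced with a minimal, self-contained sketch. This is not the Angular lexer: `isAsciiLetter`, `isNamedEntityEndLettersOnly`, and `scanEntityName` are hypothetical stand-ins written for illustration, assuming only the letters-only behavior described in this PR.

```typescript
const SEMICOLON = ';'.charCodeAt(0);

// Hypothetical stand-in for the letter check used by the lexer.
function isAsciiLetter(code: number): boolean {
  return (code >= 97 && code <= 122) || (code >= 65 && code <= 90);
}

// Letters-only end-of-name check, mirroring the behavior described above.
function isNamedEntityEndLettersOnly(code: number): boolean {
  return code === SEMICOLON || !isAsciiLetter(code);
}

// Scans the name that follows '&' and reports whether a ';' terminated it.
function scanEntityName(
  input: string,
  isEnd: (code: number) => boolean,
): {name: string; terminated: boolean} {
  let i = 1; // skip the leading '&'
  while (i < input.length && !isEnd(input.charCodeAt(i))) {
    i++;
  }
  return {name: input.slice(1, i), terminated: input[i] === ';'};
}

// The scan stops at '1', so the name is truncated and no ';' is found.
const broken = scanEntityName('&sup1;', isNamedEntityEndLettersOnly);
console.log(broken.name, broken.terminated);
```

Because the scan ends on a character other than `;`, a lexer following this logic has no valid entity to emit and falls back to treating `&` as text.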

Fix

Added !chars.isDigit(code) to the isNamedEntityEnd predicate so that digits are treated as valid continuation characters in named entity names. This matches the HTML spec, which allows alphanumeric characters in named character references.
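A rough sketch of the adjusted predicate, under the assumption that it simply treats digits like letters. The helper names (`isAsciiLetter`, `isDigit`, `readEntityName`) are local to this example and only approximate what the description says; the actual change in lexer.ts may be shaped differently.

```typescript
const SEMICOLON = ';'.charCodeAt(0);

// Hypothetical stand-ins for the character helpers.
function isAsciiLetter(code: number): boolean {
  return (code >= 97 && code <= 122) || (code >= 65 && code <= 90);
}

function isDigit(code: number): boolean {
  return code >= 48 && code <= 57;
}

// The name ends at ';' or at any character that is neither a letter nor a digit.
function isNamedEntityEnd(code: number): boolean {
  return code === SEMICOLON || (!isAsciiLetter(code) && !isDigit(code));
}

// Returns the entity name if it is terminated by ';', otherwise null.
function readEntityName(input: string): string | null {
  let i = 1; // skip the leading '&'
  while (i < input.length && !isNamedEntityEnd(input.charCodeAt(i))) {
    i++;
  }
  return input[i] === ';' ? input.slice(1, i) : null;
}

const names = ['&sup1;', '&frac12;', '&there4;', '&amp;'].map(readEntityName);
const unterminated = readEntityName('&sup1 ');
console.log(names, unterminated);
```

Names without digits such as `amp` scan as before, and a name that is not followed by `;` is still rejected.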

Test plan

  • Added test case 'should parse named entities containing digits' covering 6 different entities: &sup1;, &frac12;, &frac34;, &blk14;, &there4;, &emsp13;
  • These cover entities with digits at the end, in the middle, and the special there4 case
  • Verified all entities are already present in the NAMED_ENTITIES table in entities.ts
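For illustration, a table-driven check in the spirit of this test plan might look like the sketch below. It is not the test added in this PR: `SAMPLE_ENTITIES` is a small local excerpt (the code points are those the HTML spec assigns to these names) and `decodeNamedEntities` is a hypothetical regex-based decoder, not Angular's lexer.

```typescript
// Local excerpt of named references that contain digits.
const SAMPLE_ENTITIES: Record<string, string> = {
  sup1: '\u00B9',
  frac12: '\u00BD',
  frac34: '\u00BE',
  blk14: '\u2591',
  there4: '\u2234',
  emsp13: '\u2004',
};

// Hypothetical decoder: both letters and digits are accepted in the name.
// Unknown names are left untouched.
function decodeNamedEntities(text: string): string {
  return text.replace(/&([A-Za-z0-9]+);/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(SAMPLE_ENTITIES, name)
      ? SAMPLE_ENTITIES[name]
      : match,
  );
}

// Each entity should decode to its code point, wherever the digit sits in the name.
const failures: string[] = [];
for (const name of Object.keys(SAMPLE_ENTITIES)) {
  const actual = decodeNamedEntities(`a&${name};b`);
  if (actual !== `a${SAMPLE_ENTITIES[name]}b`) {
    failures.push(name);
  }
}
console.log(failures.length === 0 ? 'all decoded' : `failed: ${failures.join(', ')}`);
```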

Fixes #51323

The `isNamedEntityEnd` function in the HTML lexer only recognized ASCII
letters as valid characters within named entity names. This caused all
24 HTML named character references that contain digits (e.g. `&sup1;`,
`&frac12;`, `&blk14;`, `&there4;`, `&emsp13;`) to be incorrectly
parsed. The lexer would stop reading at the first digit, fail to find
the terminating semicolon, and fall back to emitting `&` as literal
text — effectively breaking these entities.

The fix adds `chars.isDigit(code)` to the `isNamedEntityEnd` predicate
so that digits are treated as valid continuation characters in named
entity names, matching the HTML spec.

Fixes angular#51323
@google-cla

google-cla bot commented Mar 17, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@pullapprove pullapprove bot requested a review from JoostK March 17, 2026 03:48
@angular-robot angular-robot bot added the area: compiler Issues related to `ngc`, Angular's template compiler label Mar 17, 2026
@ngbot ngbot bot added this to the Backlog milestone Mar 17, 2026
@SkyZeroZx
Contributor

This is a duplicate of #67229

@JoostK
Member

JoostK commented Mar 17, 2026

@mango766 thanks for the PR, but there's an open one already in #67229

@JoostK JoostK closed this Mar 17, 2026
Labels

area: compiler Issues related to `ngc`, Angular's template compiler

Development

Successfully merging this pull request may close these issues.

Named characters references containing digits are not processed properly

3 participants