Optimize name table in unicodedata by excluding names derived by rule NR2 #144882

For most ideographs, the Name property value is derived by concatenating a script-specific prefix string to the code point, expressed in uppercase hexadecimal, with the usual 4- to 6-digit convention (see rule NR2 in [chapter 4.8.1 of Unicode 17.0.0 spec](https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-4/#G135207)).

Thus, names for Hangul syllables and most Han and Tangut ideographic characters are not explicitly listed in UnicodeData.txt; `unicodedata` generates them algorithmically (see #80667). However, ideographic characters of scripts other than Han and Tangut, as well as Egyptian hieroglyphs, have their names listed explicitly in UnicodeData.txt, even when those names are derived by rule NR2. We can reduce the size of the name table if we exclude the names derived by rule NR2 and generate them using the existing code.
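To illustrate the idea, here is a minimal sketch of NR2 derivation and of filtering derivable names out of a table. It is not the actual generator code: `NR2_RANGES`, `nr2_name` and `strip_derivable` are hypothetical names, and the range bounds are an illustrative subset that may not match Unicode 17.0.0 exactly.

```python
import unicodedata

# Hypothetical, illustrative subset of NR2 ranges: (first, last, prefix).
# The bounds are assumptions for this sketch, not taken from the real tables.
NR2_RANGES = [
    (0x4E00, 0x9FFF, "CJK UNIFIED IDEOGRAPH-"),
    (0x17000, 0x187F7, "TANGUT IDEOGRAPH-"),
    (0x1B170, 0x1B2FB, "NUSHU CHARACTER-"),
]


def nr2_name(code_point):
    """Return the NR2-derived name for code_point, or None if no range matches.

    NR2: prefix + code point in uppercase hexadecimal, at least 4 digits.
    """
    for first, last, prefix in NR2_RANGES:
        if first <= code_point <= last:
            return f"{prefix}{code_point:04X}"
    return None


def strip_derivable(names):
    """Keep only entries whose explicit name differs from the NR2-derived one."""
    return {cp: name for cp, name in names.items() if name != nr2_name(cp)}


# Names as they might appear explicitly in UnicodeData.txt.
explicit = {
    0x0041: "LATIN CAPITAL LETTER A",
    0x1B170: "NUSHU CHARACTER-1B170",
}
print(strip_derivable(explicit))
# The derived name agrees with what unicodedata already generates for Han.
print(nr2_name(0x4E00) == unicodedata.name("\u4e00"))
```

Only the entry that cannot be derived by rule needs to stay in the stored table; the rest can be regenerated on lookup from the code point and its prefix.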


### Linked PRs
* gh-144883
