src: support domains with empty labels in i18n #12707
src: support domains with empty labels in i18n #12707watilde wants to merge 1 commit intonodejs:masterfrom
Conversation
|
cc @nodejs/url |
TimothyGu
left a comment
There was a problem hiding this comment.
Since long domain name labels and long domain names are invalid already, it might not need to ignore
UIDNA_ERROR_LABEL_TOO_LONGandUIDNA_ERROR_DOMAIN_NAME_TOO_LONG.
Can you elaborate on this? What do you mean by "invalid already"?
src/node_i18n.cc
Outdated
There was a problem hiding this comment.
I'd leave off the if as the & will do nothing if the error is not present, and I'd add a reference to https://url.spec.whatwg.org/#concept-domain-to-ascii (especially the VerifyDnsLength bit) for why we are ignoring this error.
|
@TimothyGu Sorry for my unclear description. I meant we cannot ignore -- |
dd3b613 to
f06dd84
Compare
src/node_i18n.cc
Outdated
There was a problem hiding this comment.
I'd add an explicit reference to the VerifyDnsLength option.
// The WHATWG URL "domain to ASCII" algorithm explicitly sets the
// VerifyDnsLength flag to false, which disables the domain name length
// verification step in ToASCII (as specified by UTS #46). Unfortunately,
// ICU4C's IDNA module does not support disabling this flag through `options`,
// so just filter out the errors that may be caused by the verification step
// afterwards.There was a problem hiding this comment.
Thank you for the actual comment example! I put it into the commit :)
src/node_i18n.cc
Outdated
There was a problem hiding this comment.
Hmm this is a bit tricky. A more detailed comment like the comment should be warranted:
// UTS #46's ToUnicode operation applies no validation of domain name length
// (nor a flag requesting it to do so, like VerifyDnsLength for ToASCII). For
// that reason, unlike ToASCII below, ICU4C correctly accepts long domain
// names. However, ICU4C still sets the EMPTY_LABEL error in contrary to UTS
// #46. Therefore, explicitly filters out that error here.Follow the spec of domainToASCII/domainToUnicode in whatwg, and synchronise WPT url test data. Refs: web-platform-tests/wpt#5397
f06dd84 to
d080abb
Compare
TimothyGu
left a comment
There was a problem hiding this comment.
Many thanks for upstreaming the WPT changes as well!
Follow the spec of domainToASCII/domainToUnicode in whatwg, and synchronise WPT url test data. Refs: web-platform-tests/wpt#5397 PR-URL: #12707 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Timothy Gu <timothygu99@gmail.com>
|
landed in 0f58d3c |
Follow the spec of domainToASCII/domainToUnicode in whatwg, and synchronise WPT url test data. Refs: web-platform-tests/wpt#5397 PR-URL: nodejs#12707 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Timothy Gu <timothygu99@gmail.com>
|
Looks like we should backport this to v7.x, no? /cc @watilde |
|
Yes, we should backport it into v7. I'm not sure I should make a patch for it, will check it if it makes conflicts. |
Summary
The
domainToUnicodeand thedomainToASCIIcould be used in the middle of parsing the origin, and it means the input values could be a non-final domain name label. Then the converter should ignore theUIDNA_ERROR_EMPTY_LABELerror. Since long domain name labels and long domain names are invalid already, it might not need to ignoreUIDNA_ERROR_LABEL_TOO_LONGandUIDNA_ERROR_DOMAIN_NAME_TOO_LONG.Refs:
node/test/fixtures/url-idna.js
Lines 192 to 201 in 6c21397
Updates
Checklist
make -j4 testAffected core subsystem(s)
src, test