feat(core): Support locale variants in messages object #360

dpchamps · 2022-02-24T23:37:58Z

When compiling against locale variants (e.g. en-US, es-MX), the generated output does not match expectations.

For example

const MessageFormat = require("@messageformat/core");
const compileMessageModule = require("@messageformat/core/lib/compile-module");

const messagePacks = {
  "en-US": {
    utilsDate: "{date, date, ::EEEMMMd}"
  },
  "es-MX": {
    utilsDate: "{date, date, ::EEEMMMd}"
  },
};

const mf = new MessageFormat("*");
const result = compileMessageModule(mf, messagePacks);

Where result is

const date_en_EEEMMMd_tmb11z = (function() {
  var opt = {"weekday":"short","month":"short","day":"numeric"};
  var dtf = new Intl.DateTimeFormat("en", opt);
//------------------------------------^ this is problematic
  return function(value) { return dtf.format(value); }
})();

export default {
  "en-US": {
    utilsDate: (d) => date_en_EEEMMMd_tmb11z(d.date)
  },
  "es-MX": {
    utilsDate: (d) => date_en_EEEMMMd_tmb11z(d.date)
  }
}

The expectation is instead

const date_en_EEEMMMd_tmb11z = (function() {
  var opt = {"weekday":"short","month":"short","day":"numeric"};
  var dtf = new Intl.DateTimeFormat("en", opt);
  return function(value) { return dtf.format(value); }
})();

const date_es_EEEMMMd_m7z124 = (function() {
  var opt = {"weekday":"short","month":"short","day":"numeric"};
  var dtf = new Intl.DateTimeFormat("es", opt);
  return function(value) { return dtf.format(value); }
})();

export default {
  "en-US": {
    utilsDate: (d) => date_en_EEEMMMd_tmb11z(d.date)
  },
  "es-MX": {
    utilsDate: (d) => date_es_EEEMMMd_m7z124(d.date)
  }
}

Variants are an expected part of locale strings, so it makes sense to support them inside of imports. This bug was caught in our codebase because we want to maintain parity with the ICU variant spec.

The patch proposed here is (I think) pretty uncontroversial: if a given plural key can be normalized, perform the lookup based on the normalized value.

dpchamps · 2022-02-24T23:39:22Z

packages/core/src/compiler.ts

      for (const key of Object.keys(src)) {
-        const pl = (plurals && plurals[key]) || plural;
+        const normalizedPluralsKey = key.length > 2 ? normalize(key) : key;
+        const pl = (plurals && plurals[normalizedPluralsKey])  || plural;


I considered initially making this a double fallback:

const pl = (plurals && plurals[normalizedPluralsKey]) || (plurals && plurals[key]) || plural;

But could not determine whether this covers a desirable fallback path. Could use guidance here.

If (1) normalization is to be trusted and (2) only normalized values are to be supported, then this appears to be a safe refactor.

dpchamps · 2022-02-24T23:52:19Z

~~I'm not quite sure how to address the test failures... They seem to be unrelated to my changeset and pass locally under the same Node versions. Is there known flakiness here?~~

It looks like some tests are failing on master under some versions of Node; see 2bf89a7 for a proposed fix.

packages/core/src/compile-module.test.ts

cdaringe · 2022-02-24T23:55:32Z

packages/core/src/compiler.ts

      const result: StringStructure = {};
      for (const key of Object.keys(src)) {
-        const pl = (plurals && plurals[key]) || plural;
+        const normalizedPluralsKey = key.length > 2 ? normalize(key) : key;


could we just normalize 💯 % of the time? seems like this length check is already implemented there. may be nice to dedupe it

No: normalize throws if given unexpected input, and since single-character keys will show up at some point, we need to make sure a key is a candidate for normalization first. We could catch the error here instead, which seems less desirable, or duplicate the normalize function to be more permissive.

To me this seems like the simplest solution: if the key can be normalized, normalize it; otherwise use it as-is. I suppose it could move to a separate function, maybeGetNormalizedValue, but that seems like overkill.
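
For reference, the helper floated above could look roughly like the sketch below. Note that normalize here is a simplified stand-in written for this example (it throws on keys shorter than two characters and otherwise keeps the primary subtag); the library's real function may behave differently, and maybeGetNormalizedValue is only the hypothetical name from this thread.

```javascript
// Simplified stand-in for the library's normalize(); an assumption for this
// sketch, not the real implementation.
function normalize(locale) {
  if (typeof locale !== "string" || locale.length < 2) {
    throw new RangeError(`Invalid language tag: ${locale}`);
  }
  // Keep everything before the first "-" or "_" (the primary language subtag).
  return locale.split(/[-_]/)[0];
}

// Hypothetical helper: only keys longer than two characters are normalized,
// mirroring the length guard in the diff above.
function maybeGetNormalizedValue(key) {
  return key.length > 2 ? normalize(key) : key;
}

console.log(maybeGetNormalizedValue("es-MX")); // "es"
console.log(maybeGetNormalizedValue("en"));    // "en"
console.log(maybeGetNormalizedValue("a"));     // "a" (normalize is never called)
```

The guard keeps short keys away from normalize entirely, so nothing needs to be caught.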

eemeli · 2022-02-27T12:19:36Z

packages/date-skeleton/src/get-date-formatter.test.ts

      'Dhuʻl-Hijjah 2, 1426',
-      'Dhuʻl-Hijjah 2, 1426 AH'
+      'Dhuʻl-Hijjah 2, 1426 AH',
+      '2 Dhuʻl-Hijjah 1426 AH'


Thank you, this looks like the right fix for this test. Would you be ok if I cherry-picked this commit separately from the rest of the PR?

please do :)

eemeli

This looks like it may be a breaking change. Consider this case, and how it works at the moment:

// input
const mf = new MessageFormat(['en', 'no']); // English & Norwegian
compileModule(mf, { 'no-one': '{var, plural, one{one} other{other}}' });

// output
import { en } from "@messageformat/runtime/lib/cardinals";
import { plural } from "@messageformat/runtime";
export default {
  "no-one": (d) => plural(d["var"], 0, en, { one: "one", other: "other" })
}

With normalized as well as exact keys getting tested, that message would end up using Norwegian rather than English pluralisation rules. That's a breaking change.
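
To make the collision concrete, here is a minimal sketch outside the library. primarySubtag is a hypothetical stand-in for the normalization step and plurals is a toy lookup table; neither is the actual compiler code.

```javascript
// Hypothetical stand-in for normalization: keep the text before the first
// "-" or "_". Not the library's function.
const primarySubtag = (key) => key.split(/[-_]/)[0];

// Toy table standing in for the locales the instance was created with.
const plurals = { en: "English rules", no: "Norwegian rules" };
const defaultPlural = plurals.en;

const key = "no-one"; // an ordinary message key, not a locale

// Exact-match lookup: there is no "no-one" entry, so the default applies.
const exact = plurals[key] || defaultPlural;

// Normalized lookup: "no-one" collapses to "no" and picks up Norwegian.
const normalized = plurals[primarySubtag(key)] || defaultPlural;

console.log(exact);      // "English rules"
console.log(normalized); // "Norwegian rules"
```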

The current documented behaviour is that:

If the messageformat instance has been initialized with support for more than one locale, using a key that matches the locale's identifier at any depth of a messages object will set its child elements to use that locale.

which in your case ends up interacting with the '*' locale identifier:

If locale has the special value '*', it will match all available locales. This may be useful if you want your messages to be completely determined by your data, but may provide surprising results if your input message object includes any 2-3 character keys that are not locale identifiers.

The documentation bug here is that the latter part refers to "locale identifiers" and only implicitly specifies that to really mean "BCP-47 primary language subtag".

While I agree that we ought to find a solution for your use case, I think this change shouldn't be it.

dpchamps · 2022-02-27T18:27:43Z

@eemeli Yes, I was concerned that there may be some edge cases like this. Makes total sense.

Ideas for how we might be able to find a solution that works without a breaking change:

1. Implement a canonicalization procedure (level 1 should be sufficient).
   a. Alternatively, continue to extract the key via normalize without the canonicalization procedure (as I'm doing in this PR).
2. Determine whether the country code / variant is valid.
3. If so, select the locale identifier from the parsed input.
4. (Optionally) put this behind a configuration flag, something like supportLocaleKeyVariants.

That way, by ensuring that the country code / variant is in the set of valid codes, it would be very difficult to accidentally specify a key that would be misinterpreted as a variant.

Supporting a configuration flag would prevent any accidents at all from happening.

However, this would require an additional set of all valid country codes / variants...
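
As a rough sketch of the idea, assuming a hand-maintained allow-list: localeFromKey and KNOWN_REGIONS are hypothetical names made up for this example, the flag name is the one suggested above, and only two-letter region codes are handled (variants are left out for brevity).

```javascript
// Hypothetical, deliberately tiny allow-list; a real one would need every
// valid region code.
const KNOWN_REGIONS = new Set(["US", "MX", "GB"]);

// Returns the locale a message-object key should select, or null if the key
// should be treated as an ordinary message key.
function localeFromKey(key, availableLocales, options = {}) {
  const { supportLocaleKeyVariants = false } = options;

  // Exact matches are accepted regardless of the flag.
  if (availableLocales.includes(key)) return key;

  // Without the opt-in flag, nothing else is treated as a locale.
  if (!supportLocaleKeyVariants) return null;

  // Accept only "language-REGION" keys with a two-letter region.
  const match = /^([a-z]{2,3})[-_]([A-Za-z]{2})$/.exec(key);
  if (match === null) return null;

  const language = match[1];
  const region = match[2].toUpperCase();
  if (!KNOWN_REGIONS.has(region)) return null;

  return availableLocales.includes(language) ? language : null;
}

const opts = { supportLocaleKeyVariants: true };
console.log(localeFromKey("es-MX", ["en", "es"], opts));  // "es"
console.log(localeFromKey("no-one", ["en", "no"], opts)); // null
console.log(localeFromKey("es-MX", ["en", "es"]));        // null (flag is off)
```

With this shape, a key like no-one is rejected because "one" is not a two-letter region code, while es-MX resolves to es only when the flag is on.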

dpchamps · 2022-03-09T19:27:34Z

Hey @eemeli, hope you're well. Do you have any thoughts on #360 (comment)?

Apologies for the bump. We have some work that is currently blocked by this and I'm more than happy to contribute to the codebase if it makes sense. If not, totally understandable-- I can find a workaround in the interim. Thanks so much!

cdaringe · 2023-02-10T00:09:57Z

hey @eemeli, hope all is well. re-reading thru this issue i'm still trying to degrok things, but curious if you'd be willing to give @dpchamps or myself any concrete guidance here for a solution you'd be interested in supporting. Thanks for your time and consideration

eemeli · 2023-03-11T11:37:50Z

Closing this in favour of #386.

dpchamps added 2 commits February 24, 2022 15:17

feat(core): Support locale variants in messages object

e2d1a06

feat(core): add date specification

7789798

dpchamps commented Feb 24, 2022

chore(core): run prettier

df9bd4e

cdaringe reviewed Feb 24, 2022

chore(core): remove duplicate specification

6c089d4

cdaringe reviewed Feb 24, 2022

fix(date-skeleton): broken test

2bf89a7

cdaringe approved these changes Feb 25, 2022

chore(core): remove unecessary async in spec

1028fdb

dpchamps mentioned this pull request Feb 25, 2022

bug: Potentially unexpected output given the "*" operator #361

Open

eemeli reviewed Feb 27, 2022

eemeli requested changes Feb 27, 2022

eemeli mentioned this pull request Feb 10, 2023

Add localeCodeFromKey option #386

Merged

eemeli closed this Mar 11, 2023
