-
Notifications
You must be signed in to change notification settings - Fork 387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLDR-17644 Add additional test data for locale display name algorithm #3728
Conversation
// only when we match do we get & process the data at that path | ||
if (path.startsWith(LANGUAGE_PATH)) { | ||
// Get display name | ||
String value = cldrFile.getStringValue(path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each cldrFile has a method to create a locale name (optionally with language, script, and region). It would be better to use that.
Clearly we don't want to produce all the combinations.
I would suggest. Take the locales in Organization.cldr (call that LOC)
for locale in LOC
for localeForName in LOC
test to see if each localeForName's pieces is in Organization.cldr's coverage for locale
skip if not
emit the locale identifier and the name of localeForName in locale
An enhancement of that would be to also have alt values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After more work with @sffc , we pared down the generated test data to a small list of input locales that exercise interesting corner cases (lower & higher coverage levels, which pieces of LSR are used to form the dialect display name, etc.).
We also moved the logic into a preexisting test data file generator (rather than creating a new class file of code and output file with a duplicate purpose).
We went ahead and ensure that both the {standard, dialect} styles of display are used to generate test cases, whereas it previously only used standard.
One question we had: it seems that reading data from |
@@ -3919,6 +3919,10 @@ public synchronized String getName( | |||
if (localePattern == null) { | |||
localePattern = "{0} ({1})"; | |||
} | |||
// Hack | |||
if (localeSeparator == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. This should never happen. Can you see which locales this occurs in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurs in about three-quarters of locales, including major ones like it
, de
. See my comment about my suspicion that they aren't following fallback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fixed per our discussion earlier!
We also switched to using CalculatedCoverageLevels
, which cleans up a lot of code and output.
Does the PR look good now?
There is a test failure:
|
CLDRFile cldrFile = | ||
factory.make(locale, true, DraftStatus.contributed); // don't include | ||
// draft=unconfirmed/provisional | ||
CLDRFile unresolved = cldrFile.getUnresolved(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@macchiati: if you pass true
in factory.make
, it should fall back to root. The unresolved one does not fall back to root. It looks like it's getting display names with the unresolved version. If you have the unresolved version, it does not fall back to root, but it falls back to code. For example, if there's no value, you return the code. For example, display name of "en-GB" becomes "en (GB)". You probably want to test that because it is in the spec. The resolved version still uses code fallback.
Alternatively, there is a method on CLDRFile
that tells you where the string comes from. What it returns is that, for a particular value, what locale is it coming from? It also tells you if the path changed when fetching. That's probably not relevant to this code, but that detail is available.
@@ -50,6 +50,7 @@ adp ; dz | |||
afr ; af |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: There's a tool GenerateAllTests (?) that runs all the tools.
@@ -50,6 +50,7 @@ adp ; dz | |||
afr ; af | |||
agp ; apf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: There's a class called ICUServiceBuilder that plugs CLDR data into ICU formatters, including number and date formatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Hi @macchiati, we almost resolved all the test failures, but we need your help to resolve the final 4 remaining assertion failures:
The test data hasn't changed, so we're not sure where the translation data for the IPA variant is coming from. |
@sffc I paired with @macchiati to debug, and we found that the |
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
Just need a rubber stamp now (I had to squash to make the PR checker bot happy). |
Let me know if you need me to merge. |
Actually, yes, can you also merge for me? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
CLDR-17644