CLDR-17644 Add additional test data for locale display name algorithm #3728

Merged
merged 1 commit into from
Jun 25, 2024

echeran
Contributor

@echeran echeran commented May 16, 2024

CLDR-17644

  • This PR completes the ticket.

// only when we match do we get & process the data at that path
if (path.startsWith(LANGUAGE_PATH)) {
    // Get display name
    String value = cldrFile.getStringValue(path);
Member

Each cldrFile has a method to create a locale name (optionally with language, script, and region). It would be better to use that.

Clearly we don't want to produce all the combinations.

I would suggest the following. Take the locales in Organization.cldr (call that LOC):
for locale in LOC
    for localeForName in LOC
        test to see if each of localeForName's pieces is in Organization.cldr's coverage for locale
        skip if not
        emit the locale identifier and the name of localeForName in locale

An enhancement of that would be to also have alt values.
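
For illustration, the nested loop suggested above could be sketched roughly as follows. This is a self-contained sketch, not CLDR tooling code: the class `LocaleNamePairs`, the `coverage` map, the `pieces` helper, the `nameOf` callback, and the output line format are all hypothetical stand-ins for the real coverage lookup and the CLDRFile name method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;

public class LocaleNamePairs {
    // Hypothetical helper: treats each hyphen-separated subtag as one "piece".
    static List<String> pieces(String localeId) {
        return Arrays.asList(localeId.split("-"));
    }

    // For every (locale, localeForName) pair, emit a line only when all pieces
    // of localeForName are within the (assumed) coverage set of locale.
    static List<String> generate(
            List<String> locales,
            Map<String, Set<String>> coverage,
            BinaryOperator<String> nameOf) {
        List<String> lines = new ArrayList<>();
        for (String locale : locales) {
            Set<String> covered = coverage.getOrDefault(locale, Collections.emptySet());
            for (String localeForName : locales) {
                if (!covered.containsAll(pieces(localeForName))) {
                    continue; // skip: some piece is outside coverage for this locale
                }
                lines.add(locale + "; " + localeForName + "; "
                        + nameOf.apply(locale, localeForName));
            }
        }
        return lines;
    }

    // Made-up sample data; real coverage would come from CLDR itself.
    static List<String> demo() {
        List<String> locales = Arrays.asList("en", "fr", "de-CH");
        Map<String, Set<String>> coverage = new HashMap<>();
        coverage.put("en", new HashSet<>(Arrays.asList("en", "fr", "de", "CH")));
        coverage.put("fr", new HashSet<>(Arrays.asList("en", "fr")));
        return generate(locales, coverage,
                (locale, target) -> "<name of " + target + " in " + locale + ">");
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

With this sample data, `de-CH` is skipped as a target for `fr` because its pieces are not in the assumed coverage for `fr`, which keeps the output well below the full cross product.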

Contributor Author

After more work with @sffc , we pared down the generated test data to a small list of input locales that exercise interesting corner cases (lower & higher coverage levels, which pieces of LSR are used to form the dialect display name, etc.).

We also moved the logic into a preexisting test data file generator (rather than creating a new class file of code and output file with a duplicate purpose).

We also ensured that both the {standard, dialect} styles of display are used to generate test cases, whereas previously only standard was used.

@echeran echeran requested a review from macchiati May 22, 2024 22:21
@sffc
Member

sffc commented May 23, 2024

One question we had: it seems that reading data from CLDRFile does not follow fallback. For example, hi-Latn doesn't return useful data, and many locales were failing to load the display name separator pattern {0}, {1} presumably because they were falling back to root. Is there an easy way to get the display names following fallback?
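
To make the question concrete, here is a minimal sketch of the difference between a lookup that stops at the locale's own data and one that walks the fallback chain to root. Everything here is an assumption for illustration: the class `FallbackLookup`, the nested-map data model, and the truncation-only parent rule (CLDR's real inheritance also has explicit parent overrides, which this ignores).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class FallbackLookup {
    // Simplified parent chain by truncation only: "de-CH" -> "de" -> "root".
    static List<String> chain(String localeId) {
        List<String> result = new ArrayList<>();
        String current = localeId;
        while (!current.isEmpty() && !current.equals("root")) {
            result.add(current);
            int cut = current.lastIndexOf('-');
            current = cut < 0 ? "" : current.substring(0, cut);
        }
        result.add("root");
        return result;
    }

    // "Unresolved" style: look only at the locale's own data.
    static Optional<String> lookupOwn(
            Map<String, Map<String, String>> data, String localeId, String key) {
        return Optional.ofNullable(
                data.getOrDefault(localeId, Collections.emptyMap()).get(key));
    }

    // "Resolved" style: first value found along the fallback chain wins.
    static Optional<String> lookupResolved(
            Map<String, Map<String, String>> data, String localeId, String key) {
        for (String candidate : chain(localeId)) {
            Optional<String> value = lookupOwn(data, candidate, key);
            if (value.isPresent()) {
                return value;
            }
        }
        return Optional.empty();
    }

    // Made-up data: only root defines the separator pattern.
    static Map<String, Map<String, String>> sampleData() {
        Map<String, Map<String, String>> data = new HashMap<>();
        data.put("root", Collections.singletonMap("localeSeparator", "{0}, {1}"));
        data.put("de", Collections.emptyMap());
        return data;
    }

    public static void main(String[] args) {
        System.out.println(lookupOwn(sampleData(), "de", "localeSeparator").isPresent());
        System.out.println(lookupResolved(sampleData(), "de-CH", "localeSeparator").orElse(null));
    }
}
```

In this toy model the own-data lookup for `de` finds nothing, while the resolved lookup inherits the pattern from root, which matches the symptom described above.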

@@ -3919,6 +3919,10 @@ public synchronized String getName(
        if (localePattern == null) {
            localePattern = "{0} ({1})";
        }
        // Hack
        if (localeSeparator == null) {
Member

Hmmm. This should never happen. Can you see which locales this occurs in?

Member

It occurs in about three-quarters of locales, including major ones like it and de. See my comment about my suspicion that they aren't following fallback.

Member

This is fixed per our discussion earlier!

We also switched to using CalculatedCoverageLevels, which cleans up a lot of code and output.

Does the PR look good now?

@sffc
Member

sffc commented Jun 3, 2024

There is a test failure:

2024-05-28T19:03:11.3397737Z   TestLocale {
2024-05-28T19:03:11.3398117Z     TestBrackets (0.007s) Passed
2024-05-28T19:03:11.3398566Z     TestCanonicalizer (0.003s) Passed
2024-05-28T19:03:11.3399045Z     TestConsistency (0.016s) Passed
2024-05-28T19:03:11.3399517Z     TestExtendedLanguage (0.001s) Passed
2024-05-28T19:03:11.3400007Z     TestLanguageRegions (0.003s) Passed
2024-05-28T19:03:11.3400473Z     TestLocaleDisplay
2024-05-28T19:03:11.5397438Z ##[warning] (TestLocale.java:513)  Warning: 
2024-05-28T19:03:11.5399427Z Use -v to get samples for tests
2024-05-28T19:03:11.7397349Z 
2024-05-28T19:03:11.9410968Z ##[error] (TestAll.java:165)  Error: (TestAll.java:165) java.lang.IllegalArgumentException: Bad line: @languageDisplay=standard
2024-05-28T19:03:11.9413644Z java.lang.IllegalArgumentException: Bad line: @languageDisplay=standard
2024-05-28T19:03:11.9414992Z 	at org.unicode.cldr.unittest.TestLocale.TestLocaleDisplay(TestLocale.java:548)
2024-05-28T19:03:11.9416372Z 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2024-05-28T19:03:11.9418080Z 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2024-05-28T19:03:11.9419994Z 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2024-05-28T19:03:11.9421498Z 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2024-05-28T19:03:11.9422616Z 	at com.ibm.icu.dev.test.TestFmwk$MethodTarget.execute(TestFmwk.java:416)
2024-05-28T19:03:11.9423737Z 	at com.ibm.icu.dev.test.TestFmwk$Target.run(TestFmwk.java:355)
2024-05-28T19:03:11.9424839Z 	at com.ibm.icu.dev.test.TestFmwk$ClassTarget.execute(TestFmwk.java:473)
2024-05-28T19:03:11.9425933Z 	at com.ibm.icu.dev.test.TestFmwk$Target.run(TestFmwk.java:355)
2024-05-28T19:03:11.9427099Z 	at com.ibm.icu.dev.test.TestFmwk$ClassTarget.execute(TestFmwk.java:473)
2024-05-28T19:03:11.9428574Z 	at com.ibm.icu.dev.test.TestFmwk$Target.run(TestFmwk.java:355)
2024-05-28T19:03:11.9429607Z 	at com.ibm.icu.dev.test.TestFmwk.runTests(TestFmwk.java:629)
2024-05-28T19:03:11.9430609Z 	at com.ibm.icu.dev.test.TestFmwk.run(TestFmwk.java:577)
2024-05-28T19:03:11.9431562Z 	at org.unicode.cldr.unittest.TestAll.runTests(TestAll.java:165)
2024-05-28T19:03:11.9432592Z 	at org.unicode.cldr.unittest.TestShim.TestAll(TestShim.java:19)
2024-05-28T19:03:11.9433850Z 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2024-05-28T19:03:11.9435452Z 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2024-05-28T19:03:11.9437228Z 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2024-05-28T19:03:11.9438686Z 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2024-05-28T19:03:11.9439953Z 	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
2024-05-28T19:03:11.9441495Z 	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
2024-05-28T19:03:11.9443432Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
2024-05-28T19:03:11.9445439Z 	at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
2024-05-28T19:03:11.9447217Z 	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
2024-05-28T19:03:11.9449077Z 	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
2024-05-28T19:03:11.9451061Z 	at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
2024-05-28T19:03:11.9453029Z 	at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
2024-05-28T19:03:11.9454994Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
2024-05-28T19:03:11.9457023Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
2024-05-28T19:03:11.9459017Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
2024-05-28T19:03:11.9461442Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
2024-05-28T19:03:11.9463260Z 	at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
2024-05-28T19:03:11.9464879Z 	at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
2024-05-28T19:03:11.9466822Z 	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
2024-05-28T19:03:11.9468930Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9470947Z 	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
2024-05-28T19:03:11.9473477Z 	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135)
2024-05-28T19:03:11.9475325Z 	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66)
2024-05-28T19:03:11.9477309Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
2024-05-28T19:03:11.9479308Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9481328Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
2024-05-28T19:03:11.9482965Z 	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
2024-05-28T19:03:11.9484952Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
2024-05-28T19:03:11.9486940Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9488832Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
2024-05-28T19:03:11.9490724Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
2024-05-28T19:03:11.9492057Z 	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
2024-05-28T19:03:11.9494087Z 	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
2024-05-28T19:03:11.9496576Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
2024-05-28T19:03:11.9498505Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9500698Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
2024-05-28T19:03:11.9502373Z 	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
2024-05-28T19:03:11.9504018Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
2024-05-28T19:03:11.9506091Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9508003Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
2024-05-28T19:03:11.9509767Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
2024-05-28T19:03:11.9511072Z 	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
2024-05-28T19:03:11.9513070Z 	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
2024-05-28T19:03:11.9515536Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
2024-05-28T19:03:11.9517531Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9519463Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
2024-05-28T19:03:11.9521129Z 	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
2024-05-28T19:03:11.9522791Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
2024-05-28T19:03:11.9524747Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9526658Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
2024-05-28T19:03:11.9528491Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
2024-05-28T19:03:11.9531278Z 	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
2024-05-28T19:03:11.9533746Z 	at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
2024-05-28T19:03:11.9535901Z 	at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
2024-05-28T19:03:11.9538003Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
2024-05-28T19:03:11.9540367Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
2024-05-28T19:03:11.9542445Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
2024-05-28T19:03:11.9545037Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
2024-05-28T19:03:11.9547271Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
2024-05-28T19:03:11.9548986Z 	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
2024-05-28T19:03:11.9550470Z 	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
2024-05-28T19:03:11.9552277Z 	at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
2024-05-28T19:03:11.9554090Z 	at org.apache.maven.surefire.junitplatform.LazyLauncher.execute(LazyLauncher.java:50)
2024-05-28T19:03:11.9555835Z 	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.execute(JUnitPlatformProvider.java:184)
2024-05-28T19:03:11.9557882Z 	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:148)
2024-05-28T19:03:11.9559899Z 	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:122)
2024-05-28T19:03:11.9561646Z 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
2024-05-28T19:03:11.9563126Z 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
2024-05-28T19:03:11.9564445Z 	at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
2024-05-28T19:03:11.9565779Z 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
2024-05-28T19:03:11.9566568Z 
2024-05-28T19:03:11.9566577Z 
2024-05-28T19:03:11.9566584Z 
2024-05-28T19:03:11.9566819Z  (0.603s) FAILED (1 failure(s), 1 warning(s))

@sffc sffc requested a review from macchiati June 3, 2024 22:15
Comment on lines 197 to 200
CLDRFile cldrFile =
        factory.make(locale, true, DraftStatus.contributed); // don't include
                                                             // draft=unconfirmed/provisional
CLDRFile unresolved = cldrFile.getUnresolved();
Member

@macchiati: if you pass true to factory.make, the resulting file falls back to root. The unresolved file does not fall back to root, and it looks like the display names here are being read from the unresolved version. The unresolved version does still fall back to the code: if there is no value, the code itself is returned. For example, the display name of "en-GB" becomes "en (GB)". You probably want to test that because it is in the spec. The resolved version also uses code fallback.

Alternatively, there is a method on CLDRFile that tells you where a string comes from: for a particular value, it returns the locale that supplied it. It also tells you whether the path changed during the lookup. That's probably not relevant to this code, but the detail is available.
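
As a rough illustration of the code fallback described here, the sketch below substitutes the subtag code whenever no display name is available and then applies the "{0} ({1})" locale pattern. The class `CodeFallbackName`, the plain maps standing in for CLDR data, and the sample names are assumptions made for this example only; this is not the CLDRFile implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class CodeFallbackName {
    // Default pattern, as in the getName() hunk quoted earlier in this PR.
    static final String LOCALE_PATTERN = "{0} ({1})";

    // If a name is missing, fall back to the code itself.
    static String displayName(
            Map<String, String> languageNames,
            Map<String, String> regionNames,
            String language,
            String region) {
        String languagePart = languageNames.getOrDefault(language, language);
        String regionPart = regionNames.getOrDefault(region, region);
        return LOCALE_PATTERN.replace("{0}", languagePart).replace("{1}", regionPart);
    }

    public static void main(String[] args) {
        // No names available at all: both pieces fall back to their codes.
        System.out.println(displayName(
                Collections.emptyMap(), Collections.emptyMap(), "en", "GB"));

        // Made-up sample names for comparison.
        Map<String, String> languages = new HashMap<>();
        languages.put("en", "English");
        Map<String, String> regions = new HashMap<>();
        regions.put("GB", "United Kingdom");
        System.out.println(displayName(languages, regions, "en", "GB"));
    }
}
```

The first call yields "en (GB)", the same shape as the example in the comment above; the second shows the result once names are available.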

@@ -50,6 +50,7 @@ adp ; dz
afr ; af
Member

Note: There's a tool GenerateAllTests (?) that runs all the tools.

@@ -50,6 +50,7 @@ adp ; dz
afr ; af
agp ; apf
Member

Note: There's a class called ICUServiceBuilder that plugs CLDR data into ICU formatters, including number and date formatting.

@sffc sffc changed the title Create initial test data for locale display name algorithm Add additional test data for locale display name algorithm Jun 5, 2024
macchiati
macchiati previously approved these changes Jun 6, 2024
Member

@macchiati macchiati left a comment

LGTM

@echeran echeran changed the title Add additional test data for locale display name algorithm CLDR-17644 Add additional test data for locale display name algorithm Jun 7, 2024
@echeran
Contributor Author

echeran commented Jun 7, 2024

Hi @macchiati, we resolved almost all the test failures, but we need your help with the 4 remaining assertion failures:

Error:  (TestLocale.java:586)  Error: : ka; zh-Hans-fonipa: expected "ჩინური (გამარტივებული, FONIPA)", got "ჩინური (გამარტივებული, IPA)"
Error:  (TestLocale.java:586)  Error: : ka; zh-Hans-fonipa: expected "გამარტივებული ჩინური (FONIPA)", got "გამარტივებული ჩინური (IPA)"
Error:  (TestLocale.java:586)  Error: : ko; zh-Hans-fonipa: expected "중국어(간체, FONIPA)", got "중국어(간체, IPA 음성학)"
Error:  (TestLocale.java:586)  Error: : ko; zh-Hans-fonipa: expected "중국어(간체, FONIPA)", got "중국어(간체, IPA 음성학)"

The test data hasn't changed, so we're not sure where the translation data for the IPA variant is coming from.

@echeran
Contributor Author

echeran commented Jun 12, 2024

@sffc I paired with @macchiati to debug, and we found that the ka locale's expected value for the -fonipa variant was annotated with draft="unconfirmed" in the source data in ka.xml. That led us to discover that the DraftStatus value differed between the code that generates the test file and the code in LocaleData.java that tests the generated file. Setting both to the same value fixed the remaining failing assertions.
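
A small sketch of why a mismatched minimum draft status produces this kind of diff. The class `DraftStatusDemo`, its `Draft` enum, the `Candidate` type, and the upper-cased code fallback are hypothetical simplifications chosen to mirror the "FONIPA" versus "IPA" failures above; they are not the CLDR classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class DraftStatusDemo {
    // Ordered from least to most vetted (assumed ordering for this sketch).
    enum Draft { UNCONFIRMED, PROVISIONAL, CONTRIBUTED, APPROVED }

    static final class Candidate {
        final String value;
        final Draft draft;

        Candidate(String value, Draft draft) {
            this.value = value;
            this.draft = draft;
        }
    }

    // Returns the first value at or above the minimum draft status;
    // otherwise falls back to the upper-cased variant code.
    static String variantName(List<Candidate> candidates, Draft minimum, String code) {
        for (Candidate candidate : candidates) {
            if (candidate.draft.compareTo(minimum) >= 0) {
                return candidate.value;
            }
        }
        return code.toUpperCase(Locale.ROOT);
    }

    // Sample data: one translation of the variant, marked unconfirmed.
    static List<Candidate> sample() {
        return Arrays.asList(new Candidate("IPA", Draft.UNCONFIRMED));
    }

    public static void main(String[] args) {
        // Generator side: unconfirmed data is filtered out, so the code is used.
        System.out.println(variantName(sample(), Draft.CONTRIBUTED, "fonipa"));
        // Test side: unconfirmed data is accepted, so the translation is used.
        System.out.println(variantName(sample(), Draft.UNCONFIRMED, "fonipa"));
    }
}
```

The two calls disagree on the same input, which is the shape of the expected/got mismatch reported for zh-Hans-fonipa; using one minimum status on both sides removes the disagreement.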

@jira-pull-request-webhook

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@echeran echeran requested review from macchiati and sffc June 20, 2024 04:30
@echeran
Contributor Author

echeran commented Jun 20, 2024

Just need a rubber stamp now (I had to squash to make the PR checker bot happy).

@macchiati
Member

Let me know if you need me to merge.

@echeran
Contributor Author

echeran commented Jun 20, 2024

Actually, yes, can you also merge for me? Thanks!

@macchiati macchiati merged commit 27462b9 into unicode-org:main Jun 25, 2024
10 checks passed
Member

@sffc sffc left a comment

LGTM
