CLDR-17644 Add additional test data for locale display name algorithm #3728

echeran · 2024-05-16T20:58:09Z

This PR completes the ticket.

macchiati · 2024-05-16T22:36:16Z

tools/cldr-code/src/main/java/org/unicode/cldr/tool/GenerateDisplayNameTestData.java

+          // only when we match do we get & process the data at that path
+          if (path.startsWith(LANGUAGE_PATH)) {
+            // Get display name
+            String value = cldrFile.getStringValue(path);


Each cldrFile has a method to create a locale name (optionally with language, script, and region). It would be better to use that.

Clearly we don't want to produce all the combinations.

I would suggest the following. Take the locales in Organization.cldr (call that LOC):
for locale in LOC
    for localeForName in LOC
        test whether each of localeForName's pieces is in Organization.cldr's coverage for locale
        skip if not
        emit the locale identifier and the name of localeForName in locale

An enhancement of that would be to also have alt values.
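
The loop suggested above might look roughly like the following. This is a self-contained sketch with toy data, not the CLDR tooling: `LOC`, `COVERED_PIECES`, `nameOf`, and the `locale; localeForName; name` output layout are all hypothetical stand-ins for the real locale list, coverage lookup, naming method, and file format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch of the suggested nested loop; all data and helpers are hypothetical stand-ins. */
class DisplayNameTestCases {
    // Toy stand-in for "the locales in Organization.cldr" (LOC).
    static final List<String> LOC = List.of("en", "de", "zh-Hans");

    // Toy stand-in for coverage: the subtags each locale is expected to have names for.
    static final Map<String, Set<String>> COVERED_PIECES = Map.of(
            "en", Set.of("en", "de", "zh", "Hans"),
            "de", Set.of("en", "de"),
            "zh-Hans", Set.of("en", "zh", "Hans"));

    // Toy stand-in for the per-locale naming method; returns a placeholder string.
    static String nameOf(String localeForName, String inLocale) {
        return "<" + localeForName + " in " + inLocale + ">";
    }

    static List<String> generate() {
        List<String> lines = new ArrayList<>();
        for (String locale : LOC) {
            Set<String> covered = COVERED_PIECES.get(locale);
            for (String localeForName : LOC) {
                // Skip unless every piece (language, script, region) is covered.
                if (!covered.containsAll(Arrays.asList(localeForName.split("-")))) {
                    continue;
                }
                lines.add(locale + "; " + localeForName + "; " + nameOf(localeForName, locale));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        generate().forEach(System.out::println);
    }
}
```

With this toy data, `de` never emits a case for `zh-Hans`, because neither `zh` nor `Hans` is in its covered set. That is the "skip if not" step keeping the output from growing to every combination.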

After more work with @sffc , we pared down the generated test data to a small list of input locales that exercise interesting corner cases (lower & higher coverage levels, which pieces of LSR are used to form the dialect display name, etc.).

We also moved the logic into a preexisting test data file generator (rather than creating a new class file of code and output file with a duplicate purpose).

We also ensured that both display styles, {standard, dialect}, are used to generate test cases; previously only standard was used.

sffc · 2024-05-23T03:19:12Z

One question we had: it seems that reading data from CLDRFile does not follow fallback. For example, hi-Latn doesn't return useful data, and many locales failed to load the display name separator pattern {0}, {1}, presumably because they depend on falling back to root for it. Is there an easy way to get the display names following fallback?

macchiati · 2024-05-23T03:34:14Z

tools/cldr-code/src/main/java/org/unicode/cldr/util/CLDRFile.java

@@ -3919,6 +3919,10 @@ public synchronized String getName(
        if (localePattern == null) {
            localePattern = "{0} ({1})";
        }
+        // Hack
+        if (localeSeparator == null) {


Hmmm. This should never happen. Can you see which locales this occurs in?

It occurs in about three-quarters of locales, including major ones like it and de. See my comment above about my suspicion that the lookups aren't following fallback.

This is fixed per our discussion earlier!

We also switched to using CalculatedCoverageLevels, which cleans up a lot of code and output.

Does the PR look good now?

sffc · 2024-06-03T22:15:23Z

There is a test failure:

2024-05-28T19:03:11.3397737Z   TestLocale {
2024-05-28T19:03:11.3398117Z     TestBrackets (0.007s) Passed
2024-05-28T19:03:11.3398566Z     TestCanonicalizer (0.003s) Passed
2024-05-28T19:03:11.3399045Z     TestConsistency (0.016s) Passed
2024-05-28T19:03:11.3399517Z     TestExtendedLanguage (0.001s) Passed
2024-05-28T19:03:11.3400007Z     TestLanguageRegions (0.003s) Passed
2024-05-28T19:03:11.3400473Z     TestLocaleDisplay
2024-05-28T19:03:11.5397438Z ##[warning] (TestLocale.java:513)  Warning: 
2024-05-28T19:03:11.5399427Z Use -v to get samples for tests
2024-05-28T19:03:11.7397349Z 
2024-05-28T19:03:11.9410968Z ##[error] (TestAll.java:165)  Error: (TestAll.java:165) java.lang.IllegalArgumentException: Bad line: @languageDisplay=standard
2024-05-28T19:03:11.9413644Z java.lang.IllegalArgumentException: Bad line: @languageDisplay=standard
2024-05-28T19:03:11.9414992Z 	at org.unicode.cldr.unittest.TestLocale.TestLocaleDisplay(TestLocale.java:548)
2024-05-28T19:03:11.9416372Z 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2024-05-28T19:03:11.9418080Z 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2024-05-28T19:03:11.9419994Z 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2024-05-28T19:03:11.9421498Z 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2024-05-28T19:03:11.9422616Z 	at com.ibm.icu.dev.test.TestFmwk$MethodTarget.execute(TestFmwk.java:416)
2024-05-28T19:03:11.9423737Z 	at com.ibm.icu.dev.test.TestFmwk$Target.run(TestFmwk.java:355)
2024-05-28T19:03:11.9424839Z 	at com.ibm.icu.dev.test.TestFmwk$ClassTarget.execute(TestFmwk.java:473)
2024-05-28T19:03:11.9425933Z 	at com.ibm.icu.dev.test.TestFmwk$Target.run(TestFmwk.java:355)
2024-05-28T19:03:11.9427099Z 	at com.ibm.icu.dev.test.TestFmwk$ClassTarget.execute(TestFmwk.java:473)
2024-05-28T19:03:11.9428574Z 	at com.ibm.icu.dev.test.TestFmwk$Target.run(TestFmwk.java:355)
2024-05-28T19:03:11.9429607Z 	at com.ibm.icu.dev.test.TestFmwk.runTests(TestFmwk.java:629)
2024-05-28T19:03:11.9430609Z 	at com.ibm.icu.dev.test.TestFmwk.run(TestFmwk.java:577)
2024-05-28T19:03:11.9431562Z 	at org.unicode.cldr.unittest.TestAll.runTests(TestAll.java:165)
2024-05-28T19:03:11.9432592Z 	at org.unicode.cldr.unittest.TestShim.TestAll(TestShim.java:19)
2024-05-28T19:03:11.9433850Z 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2024-05-28T19:03:11.9435452Z 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2024-05-28T19:03:11.9437228Z 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2024-05-28T19:03:11.9438686Z 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2024-05-28T19:03:11.9439953Z 	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
2024-05-28T19:03:11.9441495Z 	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
2024-05-28T19:03:11.9443432Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
2024-05-28T19:03:11.9445439Z 	at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
2024-05-28T19:03:11.9447217Z 	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
2024-05-28T19:03:11.9449077Z 	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
2024-05-28T19:03:11.9451061Z 	at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
2024-05-28T19:03:11.9453029Z 	at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
2024-05-28T19:03:11.9454994Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
2024-05-28T19:03:11.9457023Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
2024-05-28T19:03:11.9459017Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
2024-05-28T19:03:11.9461442Z 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
2024-05-28T19:03:11.9463260Z 	at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
2024-05-28T19:03:11.9464879Z 	at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
2024-05-28T19:03:11.9466822Z 	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
2024-05-28T19:03:11.9468930Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9470947Z 	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
2024-05-28T19:03:11.9473477Z 	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135)
2024-05-28T19:03:11.9475325Z 	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66)
2024-05-28T19:03:11.9477309Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
2024-05-28T19:03:11.9479308Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9481328Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
2024-05-28T19:03:11.9482965Z 	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
2024-05-28T19:03:11.9484952Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
2024-05-28T19:03:11.9486940Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9488832Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
2024-05-28T19:03:11.9490724Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
2024-05-28T19:03:11.9492057Z 	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
2024-05-28T19:03:11.9494087Z 	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
2024-05-28T19:03:11.9496576Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
2024-05-28T19:03:11.9498505Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9500698Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
2024-05-28T19:03:11.9502373Z 	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
2024-05-28T19:03:11.9504018Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
2024-05-28T19:03:11.9506091Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9508003Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
2024-05-28T19:03:11.9509767Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
2024-05-28T19:03:11.9511072Z 	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
2024-05-28T19:03:11.9513070Z 	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
2024-05-28T19:03:11.9515536Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
2024-05-28T19:03:11.9517531Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9519463Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
2024-05-28T19:03:11.9521129Z 	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
2024-05-28T19:03:11.9522791Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
2024-05-28T19:03:11.9524747Z 	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
2024-05-28T19:03:11.9526658Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
2024-05-28T19:03:11.9528491Z 	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
2024-05-28T19:03:11.9531278Z 	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
2024-05-28T19:03:11.9533746Z 	at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
2024-05-28T19:03:11.9535901Z 	at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
2024-05-28T19:03:11.9538003Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
2024-05-28T19:03:11.9540367Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
2024-05-28T19:03:11.9542445Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
2024-05-28T19:03:11.9545037Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
2024-05-28T19:03:11.9547271Z 	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
2024-05-28T19:03:11.9548986Z 	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
2024-05-28T19:03:11.9550470Z 	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
2024-05-28T19:03:11.9552277Z 	at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
2024-05-28T19:03:11.9554090Z 	at org.apache.maven.surefire.junitplatform.LazyLauncher.execute(LazyLauncher.java:50)
2024-05-28T19:03:11.9555835Z 	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.execute(JUnitPlatformProvider.java:184)
2024-05-28T19:03:11.9557882Z 	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:148)
2024-05-28T19:03:11.9559899Z 	at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:122)
2024-05-28T19:03:11.9561646Z 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
2024-05-28T19:03:11.9563126Z 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
2024-05-28T19:03:11.9564445Z 	at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
2024-05-28T19:03:11.9565779Z 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
2024-05-28T19:03:11.9566568Z 
2024-05-28T19:03:11.9566577Z 
2024-05-28T19:03:11.9566584Z 
2024-05-28T19:03:11.9566819Z  (0.603s) FAILED (1 failure(s), 1 warning(s))
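
One plausible reading of the `Bad line: @languageDisplay=standard` error is that the test's reader did not accept that settings line. The sketch below shows a reader that treats `@key=value` lines as settings applying to the cases that follow. The file layout (`@key=value` settings, `#` comments, `input; expected` cases) and the names `TestDataLineParser` and `Case` are assumptions for illustration, not the actual TestLocale code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a line reader that accepts "@key=value" settings; the file layout is assumed. */
class TestDataLineParser {
    /** One test case plus a snapshot of the settings in effect when it was read. */
    static final class Case {
        final Map<String, String> settings;
        final String input;
        final String expected;

        Case(Map<String, String> settings, String input, String expected) {
            this.settings = settings;
            this.input = input;
            this.expected = expected;
        }
    }

    static List<Case> parse(List<String> lines) {
        Map<String, String> settings = new HashMap<>();
        List<Case> cases = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank line or comment
            }
            if (line.startsWith("@")) {
                // Settings line: remember the value for all following cases.
                int eq = line.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("Bad line: " + line);
                }
                settings.put(line.substring(1, eq), line.substring(eq + 1));
                continue;
            }
            String[] parts = line.split(";", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Bad line: " + line);
            }
            cases.add(new Case(new HashMap<>(settings), parts[0].trim(), parts[1].trim()));
        }
        return cases;
    }

    public static void main(String[] args) {
        // Illustrative input only; not copied from the generated test file.
        List<Case> cases = parse(List.of(
                "@languageDisplay=standard",
                "en-GB; English (United Kingdom)",
                "@languageDisplay=dialect",
                "en-GB; British English"));
        for (Case c : cases) {
            System.out.println(c.settings + " " + c.input + " -> " + c.expected);
        }
    }
}
```

Copying the settings map into each case keeps earlier cases unaffected when a later `@languageDisplay` line switches the style.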

sffc · 2024-06-04T17:22:28Z

tools/cldr-code/src/main/java/org/unicode/cldr/tool/GenerateLocaleIDTestData.java

+                    CLDRFile cldrFile =
+                            factory.make(locale, true, DraftStatus.contributed); // don't include
+                    // draft=unconfirmed/provisional
+                    CLDRFile unresolved = cldrFile.getUnresolved();


@macchiati: if you pass true to factory.make, the resulting file is resolved and falls back to root. The unresolved one does not fall back to root, and it looks like the code is getting display names from the unresolved version. The unresolved version does still fall back to the code: if there's no value, the code itself is returned, so the display name of "en-GB" becomes "en (GB)". You probably want to test that because it is in the spec. The resolved version also uses code fallback.

Alternatively, there is a method on CLDRFile that tells you where a string comes from: for a particular value, it returns the locale that supplied it. It also tells you whether the path changed when fetching. That's probably not relevant to this code, but that detail is available.
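
To make the distinction concrete, here is a toy model of resolved versus unresolved lookup, together with code fallback. It is only a sketch: it does not use CLDRFile, the data and the names `DATA`, `chain`, `resolved`, `unresolved`, and `displayName` are hypothetical, and real CLDR inheritance has more rules than simple truncation. The `{0} ({1})` pattern is the default that appears in the CLDRFile diff earlier in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of locale inheritance plus code fallback; data and helper names are hypothetical. */
class FallbackSketch {
    // Each locale stores only what it defines itself.
    static final Map<String, Map<String, String>> DATA = Map.of(
            "root", Map.of("localeSeparator", "{0}, {1}"),
            "de", Map.of("language/en", "Englisch"),
            "de-CH", Map.<String, String>of());

    // Simple truncation chain, e.g. de-CH -> de -> root.
    static List<String> chain(String locale) {
        List<String> result = new ArrayList<>();
        String cur = locale;
        while (true) {
            result.add(cur);
            int dash = cur.lastIndexOf('-');
            if (dash < 0) {
                break;
            }
            cur = cur.substring(0, dash);
        }
        if (!locale.equals("root")) {
            result.add("root");
        }
        return result;
    }

    // Unresolved lookup: only the locale's own data, no inheritance.
    static String unresolved(String locale, String key) {
        return DATA.getOrDefault(locale, Map.<String, String>of()).get(key);
    }

    // Resolved lookup: first value along the chain, paired with the locale that supplied it.
    static String[] resolved(String locale, String key) {
        for (String cur : chain(locale)) {
            String value = unresolved(cur, key);
            if (value != null) {
                return new String[] {value, cur};
            }
        }
        return null;
    }

    // Code fallback: a subtag with no name anywhere in the chain is shown as its code.
    static String displayName(String inLocale, String language, String region) {
        String[] lang = resolved(inLocale, "language/" + language);
        String[] reg = resolved(inLocale, "region/" + region);
        String langName = lang != null ? lang[0] : language;
        String regName = reg != null ? reg[0] : region;
        return "{0} ({1})".replace("{0}", langName).replace("{1}", regName);
    }

    public static void main(String[] args) {
        System.out.println(unresolved("de-CH", "localeSeparator")); // own data only
        System.out.println(String.join(" from ", resolved("de-CH", "localeSeparator")));
        System.out.println(displayName("xx", "en", "GB")); // no names found: codes are used
    }
}
```

In this model the separator pattern is missing from the unresolved `de-CH` data but is found in `root` once the chain is followed, which mirrors the null `localeSeparator` symptom discussed above.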

sffc · 2024-06-04T17:23:35Z

common/testData/localeIdentifiers/localeCanonicalization.txt

@@ -50,6 +50,7 @@ adp	;	dz
 afr	;	af


Note: There's a tool GenerateAllTests (?) that runs all the tools.

sffc · 2024-06-04T17:25:41Z

common/testData/localeIdentifiers/localeCanonicalization.txt

@@ -50,6 +50,7 @@ adp	;	dz
 afr	;	af
 agp	;	apf


Note: There's a class called ICUServiceBuilder that plugs CLDR data into ICU formatters, including number and date formatting.

macchiati

LGTM

echeran · 2024-06-07T21:24:44Z

Hi @macchiati, we resolved almost all the test failures, but we need your help with the final 4 remaining assertion failures:

Error:  (TestLocale.java:586)  Error: : ka; zh-Hans-fonipa: expected "ჩინური (გამარტივებული, FONIPA)", got "ჩინური (გამარტივებული, IPA)"
Error:  (TestLocale.java:586)  Error: : ka; zh-Hans-fonipa: expected "გამარტივებული ჩინური (FONIPA)", got "გამარტივებული ჩინური (IPA)"
Error:  (TestLocale.java:586)  Error: : ko; zh-Hans-fonipa: expected "중국어(간체, FONIPA)", got "중국어(간체, IPA 음성학)"
Error:  (TestLocale.java:586)  Error: : ko; zh-Hans-fonipa: expected "중국어(간체, FONIPA)", got "중국어(간체, IPA 음성학)"

The test data hasn't changed, so we're not sure where the translation data for the IPA variant is coming from.

echeran · 2024-06-12T22:46:33Z

@sffc I paired with @macchiati to debug, and we found that the ka locale's value for the -fonipa variant was annotated with draft="unconfirmed" in the source data in ka.xml. That led us to discover that we had different values of DraftStatus between the code used to generate the test file and the code in LocaleData.java that is testing that generated test file. Making those the same fixed the remaining failing assertions.
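
The effect of mismatched draft thresholds can be reproduced in miniature. The sketch below is a toy model, not CLDR code: `Draft`, `Candidate`, and `pick` are hypothetical names, and the data only echoes the shape of the failure (an unconfirmed name for the variant, with the uppercased code as the fallback).

```java
import java.util.List;

/** Toy model of how differing draft thresholds change the chosen value; names are hypothetical. */
class DraftMismatchSketch {
    // Ordered from least to most vetted, following the values of CLDR's draft attribute.
    enum Draft { UNCONFIRMED, PROVISIONAL, CONTRIBUTED, APPROVED }

    static final class Candidate {
        final String value;
        final Draft draft;

        Candidate(String value, Draft draft) {
            this.value = value;
            this.draft = draft;
        }
    }

    // Returns the first candidate at or above the minimum draft level, else the code itself.
    static String pick(List<Candidate> candidates, Draft minimum, String code) {
        for (Candidate c : candidates) {
            if (c.draft.compareTo(minimum) >= 0) {
                return c.value;
            }
        }
        return code;
    }

    public static void main(String[] args) {
        // The only available name for the variant is unconfirmed.
        List<Candidate> fonipa = List.of(new Candidate("IPA", Draft.UNCONFIRMED));
        // A reader requiring at least "contributed" ignores it and falls back to the code.
        System.out.println(pick(fonipa, Draft.CONTRIBUTED, "FONIPA"));
        // A reader accepting "unconfirmed" uses the translated name instead.
        System.out.println(pick(fonipa, Draft.UNCONFIRMED, "FONIPA"));
    }
}
```

The two calls return different strings for the same data, which is the same pattern as the `expected "... FONIPA)", got "... IPA)"` assertion failures: the generator and the test each need to read the data with the same minimum draft level.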

See unicode-org#3728

jira-pull-request-webhook · 2024-06-19T22:45:25Z

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

echeran · 2024-06-20T22:10:55Z

Just need a rubber stamp now (I had to squash to make the PR checker bot happy).

macchiati · 2024-06-20T22:53:32Z

Let me know if you need me to merge.

echeran · 2024-06-20T23:08:54Z

Actually, yes, can you also merge for me? Thanks!

sffc

LGTM

macchiati reviewed May 16, 2024

echeran requested a review from macchiati May 22, 2024 22:21

macchiati reviewed May 23, 2024

sffc requested a review from macchiati June 3, 2024 22:15

sffc reviewed Jun 4, 2024

sffc changed the title ~~Create initial test data for locale display name algorithm~~ Add additional test data for locale display name algorithm Jun 5, 2024

macchiati previously approved these changes Jun 6, 2024

echeran dismissed macchiati’s stale review via 91bfee9 June 7, 2024 20:10

echeran changed the title ~~Add additional test data for locale display name algorithm~~ CLDR-17644 Add additional test data for locale display name algorithm Jun 7, 2024

CLDR-17644 Add additional test data for locale display name algorithm

2c0af66

See unicode-org#3728

echeran force-pushed the testdata-gen branch from 16094b3 to 2c0af66 Compare June 19, 2024 22:45

echeran requested review from macchiati and sffc June 20, 2024 04:30

macchiati approved these changes Jun 20, 2024

macchiati merged commit 27462b9 into unicode-org:main Jun 25, 2024
10 checks passed

sffc reviewed Jun 25, 2024
