Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU-22503 Add support for property Indic_Conjunct_Break #3049

Merged
merged 1 commit into from
Jul 26, 2024

Conversation

echeran
Copy link
Contributor

@echeran echeran commented Jun 26, 2024

This PR is for adding support for property InCB (Indic_Conjunct_Break).

Checklist
  • Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22503
  • Required: The PR title must be prefixed with a JIRA Issue number.
  • Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
  • Required: Each commit message must be prefixed with a JIRA Issue number.
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

@jira-pull-request-webhook

This comment was marked as outdated.

@echeran
Copy link
Contributor Author

echeran commented Jun 28, 2024

I rebased this PR on top of #3028 . Therefore, this PR:

  1. should not be merged until ICU-22707 Unicode 16 beta jun04 #3028 is merged
  2. should only introduce changes for InCB, and should not appear to introduce any changes for other properties after this branch is updated from the latest on main after ICU-22707 Unicode 16 beta jun04 #3028 is merged

@echeran echeran marked this pull request as ready for review June 28, 2024 22:45
@markusicu markusicu self-assigned this Jul 18, 2024
@markusicu markusicu requested a review from richgillam July 18, 2024 16:01
@echeran echeran force-pushed the indic-conjunct-break branch from d243193 to 419d617 Compare July 23, 2024 21:00
@jira-pull-request-webhook

This comment was marked as outdated.

richgillam
richgillam previously approved these changes Jul 23, 2024
Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOKTM. Do you also have to make changes on the Java side?

@echeran
Copy link
Contributor Author

echeran commented Jul 23, 2024

Yes, I need the change in Java, since we need a constant (effectively an enum) to represent the new property so that people can provide that vlaue into UCharacter.getPropertValueEnum() or UCharacter.getPropertyName().

Although I'm puzzled why that addition of a single line int constant is causing the Maven build to just break altogether.

@richgillam
Copy link
Contributor

Yes, I need the change in Java, since we need a constant (effectively an enum) to represent the new property so that people can provide that vlaue into UCharacter.getPropertValueEnum() or UCharacter.getPropertyName().

Although I'm puzzled why that addition of a single line int constant is causing the Maven build to just break altogether.

Oops! My fault. You do have the Java change in there too. I missed that. But it's just INDIC_CONJUNCT_BREAK. Don't you also need constants for the individual properties?

@echeran
Copy link
Contributor Author

echeran commented Jul 23, 2024

Don't you also need constants for the individual properties?

Yes indeed! Thanks for catching that.

richgillam
richgillam previously approved these changes Jul 23, 2024
Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lookin' good!

@echeran
Copy link
Contributor Author

echeran commented Jul 23, 2024

is causing the Maven build

The Unicode version constant is not being initialized from the Java resource files. I didn't proceed all the way through in the data update instructions in changes.txt so I'll be going through that further.

Copy link
Member

@markusicu markusicu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the Java versions of the updated pnames.icu & uprops.icu files, but since the major version number has not changed in this PR, I would expect uprops.icu to still load fine, except that all characters would look like InCB=None.

tools/unicode/py/preparseucd.py Outdated Show resolved Hide resolved
tools/unicode/py/preparseucd.py Show resolved Hide resolved
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/data/unidata/changes.txt is different
  • icu4c/source/test/cintltst/cucdtst.c is now changed in the branch
  • icu4j/main/core/src/main/java/com/ibm/icu/impl/UCharacterProperty.java is now changed in the branch
  • icu4j/main/core/src/main/resources/com/ibm/icu/impl/data/icudata/pnames.icu is now changed in the branch
  • icu4j/main/core/src/main/resources/com/ibm/icu/impl/data/icudata/uprops.icu is now changed in the branch
  • icu4j/main/core/src/test/java/com/ibm/icu/dev/test/lang/UCharacterTest.java is now changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@echeran echeran force-pushed the indic-conjunct-break branch from f02235a to 59fa413 Compare July 26, 2024 21:00
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • tools/unicode/py/preparseucd.py is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@markusicu
Copy link
Member

FYI --

Don't you also need constants for the individual properties?

Yes indeed! Thanks for catching that.

It was also missing the Java port of the uprops.h / uprops.cpp changes, to UCharacterProperty.java. The class initializer error indicated that an int property had been added (and the _LIMIT constant incremented) without adding an entry to the array of IntProps which defines how to fetch the values for each enumerated property from the data structure.

@echeran echeran merged commit 06c077b into unicode-org:main Jul 26, 2024
105 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants