Commit

ICU-22707 Unicode 16 security data first cut
markusicu committed Apr 26, 2024
1 parent 09addda commit c90ec08
Showing 9 changed files with 1,415 additions and 1,351 deletions.
2,369 changes: 1,185 additions & 1,184 deletions icu4c/source/common/uchar_props_data.h

Large diffs are not rendered by default.

Binary file modified icu4c/source/data/in/uprops.icu
3 changes: 0 additions & 3 deletions icu4c/source/data/unidata/changes.txt
@@ -49,9 +49,6 @@ and see the change logs below.

 Unicode 16.0 update for ICU 76

-TODO
-- In corepropsbuilder.cpp, remove the isA9CF hack.
-
 https://www.unicode.org/versions/Unicode16.0.0/
 https://www.unicode.org/versions/beta-16.0.0.html
 https://www.unicode.org/Public/draft/
48 changes: 42 additions & 6 deletions icu4c/source/data/unidata/confusables.txt

Large diffs are not rendered by default.

288 changes: 144 additions & 144 deletions icu4c/source/data/unidata/ppucd.txt

Large diffs are not rendered by default.


10 changes: 2 additions & 8 deletions tools/unicode/c/genprops/corepropsbuilder.cpp
@@ -522,7 +522,7 @@ encodeNumericValue(UChar32 start, const char *s, UErrorCode &errorCode) {
     return ntv;
 }

-uint32_t encodeIdentifierType(const UnicodeSet &idType, bool isA9CF, UErrorCode &errorCode) {
+uint32_t encodeIdentifierType(const UnicodeSet &idType, UErrorCode &errorCode) {
     if(U_FAILURE(errorCode)) { return 0; }
     if(idType.isEmpty()) {
         fprintf(stderr, "genprops error: data line has an empty Identifier_Type\n");
@@ -532,12 +532,6 @@ uint32_t encodeIdentifierType(const UnicodeSet &idType, bool isA9CF, UErrorCode
     if(idType.contains(U_ID_TYPE_EXCLUSION) && idType.contains(U_ID_TYPE_LIMITED_USE)) {
         // By definition, Exclusion and Limited_Use are mutually exclusive.
         // We rely on that for the data structure.
-        if(isA9CF) {
-            // TODO: Known bug in Unicode 15.1 and before.
-            // See PAG issue #217.
-            // See L2/24-064: UTC #179 properties feedback & recommendations
-            return (UPROPS_ID_TYPE_LIMITED_USE|UPROPS_ID_TYPE_UNCOMMON_USE)&~UPROPS_ID_TYPE_BIT;
-        }
         fprintf(stderr,
                 "genprops error: data line has both Identifier_Type Exclusion and Limited_Use\n");
         errorCode=U_ILLEGAL_ARGUMENT_ERROR;
@@ -829,7 +823,7 @@ CorePropsBuilder::setProps(const UniProps &props, const UnicodeSet &newValues,
         upvec_setValue(pv, start, pvecEnd, 0, scriptX, UPROPS_SCRIPT_X_MASK, &errorCode);
     }
     if(newValues.contains(UCHAR_IDENTIFIER_TYPE)) {
-        uint32_t encodedType=encodeIdentifierType(props.idType, start==0xA9CF && start==end, errorCode);
+        uint32_t encodedType=encodeIdentifierType(props.idType, errorCode);
         upvec_setValue(
             pv, start, pvecEnd, 2,
             encodedType << UPROPS_2_ID_TYPE_SHIFT, UPROPS_2_ID_TYPE_MASK,
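
With the U+A9CF special case removed, encodeIdentifierType() rejects any data line that carries both Exclusion and Limited_Use, with no exceptions. The sketch below isolates that validation rule so it can be read and tested without ICU. It is only an illustration under stated assumptions: the `IdType` bit flags and the `validateIdTypes` helper are hypothetical names, whereas the real function takes a `UnicodeSet` and returns an encoded `uint32_t` through ICU's `UErrorCode` convention.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical bit flags standing in for Identifier_Type values.
// These are NOT ICU's U_ID_TYPE_* or UPROPS_ID_TYPE_* constants.
enum IdType : uint32_t {
    kExclusion   = 1u << 0,
    kLimitedUse  = 1u << 1,
    kUncommonUse = 1u << 2
};

// Hypothetical helper: returns true if the combination is acceptable,
// false (after reporting to stderr) if it must be rejected.
bool validateIdTypes(uint32_t types) {
    if (types == 0) {
        std::fprintf(stderr, "error: data line has an empty Identifier_Type\n");
        return false;
    }
    // Exclusion and Limited_Use are mutually exclusive by definition,
    // so a line carrying both is treated as bad input, with no per-character exception.
    if ((types & kExclusion) != 0 && (types & kLimitedUse) != 0) {
        std::fprintf(stderr,
                     "error: data line has both Exclusion and Limited_Use\n");
        return false;
    }
    return true;
}
```

Under these assumptions, a combination such as Limited_Use plus Uncommon_Use passes, while Exclusion plus Limited_Use fails for every code point.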
