Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU-22843 UnicodeString <-> std::u16string_view / wstring_view #3076

Merged
merged 1 commit into from
Aug 13, 2024

Conversation

markusicu
Copy link
Member

@markusicu markusicu commented Aug 1, 2024

Checklist
  • Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22843
  • Required: The PR title must be prefixed with a JIRA Issue number.
  • Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
  • Required: Each commit message must be prefixed with a JIRA Issue number.
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

Copy link
Member Author

@markusicu markusicu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @richgillam I got help from @eggrobin who showed me (https://godbolt.org/z/5WMbWszoE) how to templatize the inputs so that in most cases we go from a group of four new overloads to just one templatized one. I like this much better! Do you agree?

See just the third commit for the changes compared with the proposal.

I added a few "WDYT?" comments for both of you.

icu4c/source/common/unicode/unistr.h Outdated Show resolved Hide resolved
icu4c/source/common/unicode/unistr.h Outdated Show resolved Hide resolved
icu4c/source/common/unicode/unistr.h Outdated Show resolved Hide resolved
icu4c/source/common/unicode/unistr.h Outdated Show resolved Hide resolved
icu4c/source/test/intltest/ustrtest.cpp Outdated Show resolved Hide resolved
@@ -1082,7 +1082,7 @@ UnicodeString DecimalQuantity::toScientificString() const {
result.append(u'E');
int32_t _scale = upperPos + scale + exponent;
if (_scale == INT32_MIN) {
result.append({u"-2147483648", -1});
result.append(UnicodeString(true, u"-2147483648", -1));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI
Adding overloads can still break certain call sites, but I think this is rare.
In this case, the old code also copied the contents; now I am aliasing it.

icu4c/source/common/unicode/unistr.h Show resolved Hide resolved
richgillam
richgillam previously approved these changes Aug 3, 2024
Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow. There's a lot of stuff here I don't understand (because I've never really worked in a modern dialect of C++), but I definitely like the conciseness. You'll have to update the API proposal, of course, and maybe that's a good place to go into a little more detail about what you're actually doing. I was having trouble following it, but maybe that's just me. Regardless, I trust you guys...

icu4c/source/common/unicode/char16ptr.h Show resolved Hide resolved
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/i18n/number_decimalquantity.cpp is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@aheninger
Copy link
Contributor

I like this template-function approach. It's forcing me to learn something new (to me) with std::enable_if

richgillam
richgillam previously approved these changes Aug 6, 2024
Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got lost in some of the changes (I still haven't really gotten my head around enable_if), but it all sounds good and it looks like you've done appropriate testing...

@markusicu
Copy link
Member Author

I finally finished the unit test code. PTAL

@markusicu markusicu marked this pull request as ready for review August 9, 2024 21:51
@markusicu
Copy link
Member Author

I will look at the MSVC failures :-/

@eggrobin
Copy link
Member

eggrobin commented Aug 9, 2024

I will look at the MSVC failures :-/

Since I do my ICU4C work on MSVC, I am happy to look at them next week if you prefer.


UnicodeString aliasFromStr = UnicodeString::readOnlyAlias(str16);
assertTrue("aliasFromStr pointer alias", aliasFromStr.getBuffer() == str16.data());
assertEquals("aliasFromStr length", (int32_t)str16.length(), aliasFromStr.length());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would note that we now consistently use static_cast for this since #3060, and that while it is verbose there is some value in not having to think about any of those C casts that might hide nasty stuff…

But really we should just not have to cast here, assertEquals should be templatized and should accept heterogeneous arguments, and then the call site can look unambiguously nicer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would note that we now consistently use static_cast for this since #3060, and that while it is verbose there is some value in not having to think about any of those C casts that might hide nasty stuff…

I think that static_cast for simple integer type casts adds a lot more clutter than value.

But really we should just not have to cast here, assertEquals should be templatized and should accept heterogeneous arguments, and then the call site can look unambiguously nicer.

I don't know if templates would let us compare signed and unsigned types unhindered.

We have had a little discussion about switching ICU4C tests to using the Google unit test framework, but someone would have to work out how to do so. Probably only for new test code, which means adding a whole new, very different test suite, and one probably less portable than the primitive one we have.

@markusicu
Copy link
Member Author

It looks like I got MSVC to work where the wchar_t version of the test runs. Back to @richgillam & @eggrobin

@markusicu markusicu requested a review from eggrobin August 12, 2024 16:00
@markusicu
Copy link
Member Author

All tests pass 🎉 , except of course for the single-commit check. I plan to squash after review.


Also, I am considering a follow-up proposal+PR for the same kind of template overload for operator!=. Seems silly not to have done it right away. But I also want to get this PR here in as is.

I am still not planning to add overloads for most of the UnicodeString API, and we should at some point have a discussion about future ICU4C APIs taking string inputs.

Copy link
Contributor

@richgillam richgillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests look good. I had one nit, but I'm fine if you ignore it.

icu4c/source/test/intltest/ustrtest.cpp Show resolved Hide resolved
@richgillam
Copy link
Contributor

Also, I am considering a follow-up proposal+PR for the same kind of template overload for operator!=. Seems silly not to have done it right away.

Ouch! I'm bummed we didn't think of that. I'm fine with that in a follow-up PR, although I kind of wish we'd thought about it in time to get it into this one.

@markusicu markusicu force-pushed the unistr-u16string_view branch from 2e1805e to 150466b Compare August 12, 2024 19:58
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@rp9-next rp9-next closed this Aug 13, 2024
@rp9-next rp9-next reopened this Aug 13, 2024
@rp9-next rp9-next closed this Aug 13, 2024
@rp9-next rp9-next reopened this Aug 13, 2024
@rp9-next rp9-next closed this Aug 13, 2024
@rp9-next rp9-next reopened this Aug 13, 2024
@markusicu markusicu force-pushed the unistr-u16string_view branch from 150466b to c8a7897 Compare August 13, 2024 15:18
@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@markusicu markusicu merged commit 7220649 into unicode-org:main Aug 13, 2024
99 checks passed
@markusicu markusicu deleted the unistr-u16string_view branch August 13, 2024 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants