Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GCC 15 Wunterminated-string-initialization issues #1543

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ac000
Copy link
Member

@ac000 ac000 commented Jan 21, 2025

    Fix GCC 15 Wunterminated-string-initialization issues
    
    When building with the upcoming GCC 15, we are hit by the new
    Wunterminated-string-initialization compiler warning. E.g.
    
      src/nxt_string.c: In function ‘nxt_encode_uri’:
      src/nxt_string.c:601:36: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization]
        601 |     static const u_char  hex[16] = "0123456789ABCDEF";
            |                                    ^~~~~~~~~~~~~~~~~~
    
    We have created a character array containing 16 elements, but is not NUL
    terminated. While this type of (lookup/translation table) use case is
    perfectly valid and we could simply disable the warning, it is probably
    still a useful warning to have enabled.
    
    There is a 'nonstring' attribute, but that doesn't influence this
    warning.
    
    Lets just do what nginx does and create them NUL terminated. Except in
    src/nxt_http_parse.c where they *really* want to be non-NUL
    terminated and we use *arrays* of chars.
     
    Signed-off-by: Andrew Clayton <[email protected]>

@ac000 ac000 force-pushed the unterminated-strings branch 3 times, most recently from fe37226 to a365f98 Compare January 21, 2025 04:03
@ac000 ac000 changed the title Don't declare unterminated character arrays/strings Fix GCC 15 Wunterminated-string-initialization issues Jan 21, 2025
@ac000 ac000 force-pushed the unterminated-strings branch from a365f98 to 72ea5ce Compare January 21, 2025 04:17
@ac000 ac000 marked this pull request as ready for review January 21, 2025 05:49
@ac000 ac000 requested review from hongzhidao and avahahn January 21, 2025 05:50
@ac000 ac000 force-pushed the unterminated-strings branch from 72ea5ce to 98f11e6 Compare January 21, 2025 15:51
@ac000
Copy link
Member Author

ac000 commented Jan 21, 2025

  • Highlight that this is a build failure...
$ git range-diff 72ea5ce9...98f11e65
1:  72ea5ce9 ! 1:  98f11e65 Fix GCC 15 Wunterminated-string-initialization issues
    @@ Commit message
         Fix GCC 15 Wunterminated-string-initialization issues
     
         When building with the upcoming GCC 15, we are hit by the new
    -    Wunterminated-string-initialization compiler warning. E.g.
    +    Wunterminated-string-initialization compiler warning (however, due to
    +    -Werror, this is a build failure). E.g.
     
           src/nxt_string.c: In function ‘nxt_encode_uri’:
           src/nxt_string.c:601:36: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization]

Copy link
Contributor

@hongzhidao hongzhidao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that I’m unable to check if the change is reasonable.
As I understand, we introduce nxt_str_t to handle string with a non traditional way, that it explicitly specify its length. In this case, the stuff like hex is an array of u_char/uint8 type, I think they are not entirely strings whose ending char ‘\0’ is used.
I’d like to ask team’s review.
Thanks.

@ac000
Copy link
Member Author

ac000 commented Jan 21, 2025

Taking this as an example

    static const u_char  hex[16] = "0123456789ABCDEF";                          

That declares a non-NUL terminated string (by loosing the terminating \0 from the string literal), which in this case is OK, but could often be an error, which this new compiler warning warns about. See here for details (Yes, this was introduced by Alex!).

You know that these are usually used as translation table type things and is really being treated as just an array of characters.

So assuming we want to keep this new warning enabled. The two main options are

Make it NUL terminated, which this patch does. (As you say ) It doesn't matter that it's NUL terminated or not (other than using an extra byte) as we only ever access the array indices 0-15 and never the NUL byte.

This is what nginx does.

The other option would be to define it as a character array like

    static const u_char  hex[16] = { '0', '1', '2', '3', '4', '5', '6', '7',
                                     '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' }; 

I chose the first option for most of the cases as it's simpler and they don't matter being NUL terminated.

You can see I took the second option for a couple of them where it does matter that they are non-NUL terminated.

I suppose you could maybe declare it as a nxt_str_t... OK, lets stop there... (besides that's just going to give you a pointer to a NUL terminated string literal, which is no better than char [] and char [] better describes its purpose...)

When building with the upcoming GCC 15, we are hit by the new
Wunterminated-string-initialization compiler warning (however, due to
-Werror, this is a build failure). E.g.

  src/nxt_string.c: In function ‘nxt_encode_uri’:
  src/nxt_string.c:601:36: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization]
    601 |     static const u_char  hex[16] = "0123456789ABCDEF";
        |                                    ^~~~~~~~~~~~~~~~~~

We have created a character array containing 16 elements, but is not NUL
terminated. While this type of (lookup/translation table) use case is
perfectly valid and we could simply disable the warning, it is probably
still a useful warning to have enabled.

There is a 'nonstring' attribute, but that doesn't influence this
warning. Well not yet at least... but that looks to change!

Lets just do what nginx does and create them NUL terminated. Except in
src/nxt_http_parse.c where they *really* want to be non-NUL
terminated and we use *arrays* of chars.

Link: <https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Common-Variable-Attributes.html>
Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117178#c21>
Signed-off-by: Andrew Clayton <[email protected]>
@ac000 ac000 force-pushed the unterminated-strings branch from 98f11e6 to 36d6901 Compare January 21, 2025 23:04
@ac000
Copy link
Member Author

ac000 commented Jan 21, 2025

  • There is work under way to make the nonstring attribute silence the Wunterminated-string-initialization warning. Although I think it's too late for that work to make it into GCC 15 as it's now in the regression and documentation fixes stage...
 $ git range-diff 98f11e65...36d69013
1:  98f11e65 ! 1:  36d69013 Fix GCC 15 Wunterminated-string-initialization issues
    @@ Commit message
         still a useful warning to have enabled.
     
         There is a 'nonstring' attribute, but that doesn't influence this
    -    warning.
    +    warning. Well not yet at least... but that looks to change!
     
         Lets just do what nginx does and create them NUL terminated. Except in
         src/nxt_http_parse.c where they *really* want to be non-NUL
         terminated and we use *arrays* of chars.
     
    +    Link: <https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Common-Variable-Attributes.html>
    +    Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117178#c21>
         Signed-off-by: Andrew Clayton <[email protected]>
     
      ## src/nxt_http_parse.c ##

@hongzhidao
Copy link
Contributor

To minimize it, it’s d discussion about if the below code is good or not.

static const u_char hex[16] = "0123456789ABCDEF";

Personally I prefer to keep the explicit [16] because:

  1. It clearly expresses the intent of a hexadecimal lookup table
  2. Provides explicit boundary checking
  3. Self-documenting code - size is known without looking at the initialization value

Regarding string initialization: The original syntax fits this lookup table use case. GCC 15's warning is a bit overzealous here. We don't actually need this string to be NUL-terminated since it's only accessed through 0-15 indices.

Removing [16] just to satisfy the compiler warning actually obscures the code's intent. If we really need to handle this warning, I'd prefer to suppress it with a compiler directive rather than change the code structure.

Welcome to others suggestions on it. It’s also a discussion if we need to support GCC 15 and the new compile option.

@ac000
Copy link
Member Author

ac000 commented Jan 22, 2025

To minimize it, it’s d discussion about if the below code is good or not.

static const u_char hex[16] = "0123456789ABCDEF";

Personally I prefer to keep the explicit [16] because:

This fails to build (due to -Werror) with GCC 15 (I.e. the Fedora Rawhide GCC CI check, the Clang check is failing due to Ruby 3.4 API deprecations)... which is the whole point of this PR...

Regarding string initialization: The original syntax fits this lookup table use case. GCC 15's warning is a bit overzealous here. We don't actually need this string to be NUL-terminated since it's only accessed through 0-15 indices.

No, I know we don't need it to be NUL terminated, if we did then we had bigger problems.

Making it NUL terminated is harmless however... and is what the nginx folks did. nginx actually builds fine with GCC 15 (with -Wextra).

Removing [16] just to satisfy the compiler warning actually obscures the code's intent. If we really need to handle this

Yes, I agree with that...

warning, I'd prefer to suppress it with a compiler directive rather than change the code structure.

Welcome to others suggestions on it. It’s also a discussion if we need to support GCC 15 and the new compile option.

Do we need to support GCC 15?, err yes... why wouldn't we? We support all versions of Clang/GCC that support -std=gnu11

Seeing as there is work in progress to make the nonstring variable attribute quell this warning, that will hopefully make GCC 16. We could temporarily disable this warning on GCC and the re-enable it for GCC 16 (hopefully) and later.

@hongzhidao
Copy link
Contributor

I thought the PR is only because of the option -Wu-s-i* support in GCC 15.

@ac000
Copy link
Member Author

ac000 commented Jan 22, 2025

Correct.

@hongzhidao
Copy link
Contributor

While I'm comfortable with the current approach, I'll stay neutral as others might prefer the new GCC
15 option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants