-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace trailing 1-element arrays with C99 flex-arrays #3785
Conversation
Using a trailing 1-element array as a variable-length array is undefined behavior. Compilers have traditionally allowed this, as this is a traditional, pre-C99 convention. But it's being increasingly phased out in favor of C99 flexible arrays, which avoid ambiguity about whether the array is intended to be variable-length or not. gcc 13+ and clang 16+ support the -fstrict-flex-arrays=3 option, which Linux v6.5+ enables globally, resulting in UBSAN errors when these arrays are accessed. (See https://lore.kernel.org/r/[email protected]) Fix this by making zstd use C99 flexible arrays instead of 1-element arrays. This only affects three places (that I could find): one in fse_decompress.c and two in zstdmt_compress.c. Note, this breaks the c89build CI job and thus might not be suitable for merging yet. I recommend that zstd drop support for C89/C90.
Hi @ebiggers! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at [email protected]. Thanks! |
We still value I can see why this programming pattern, which used to be tolerated in the past, would be a problem for modern unforgiving static analyzers. Maybe a more future-proof solution could be to move away from this pattern, though I also understand it's a much larger refactor. |
It's not just about static analyzers. This issue prevents enabling the UBSAN array bounds sanitizer, which can be desirable as a security feature (not just debugging), in code that is being built with Other options could include |
Yeah, this is a concern. Shortest solution might be to use selective Longer term, I still believe it will be better to entirely move away from this pattern. |
the flexArray in structure FSE_DecompressWksp is just a way to derive a pointer easily, without risk/complexity of calculating it manually. Not sure if this change is good enough to avoid ubsan warnings though.
Btw, I just tried this setting
I presume it must be a recent setting, with is not widely supported yet. |
Thanks @ebiggers for bringing this to our attention & submitting a PR! I will look into fixing this upstream and merging the relevant fix into my tree. Unfortunately I missed those messages in the LKML. It looks like my email has to stop syncing my folder containing LKML messages that CC me. So I have to fix that and work through my backlog. |
@Cyan4973 What do you think about adding a macro to our
Then using that for all of our flexible arrays? |
This clever use of macro would answer the specific It also assumes that the new Finally, I also believe that it's generally good practice to move away from this convention, whenever practical. For the |
the flexArray in structure FSE_DecompressWksp is just a way to derive a pointer easily, without risk/complexity of calculating it manually. Not sure if this change is good enough to avoid ubsan warnings though.
solving flexArray issue #3785 in fse
the flexArray in structure FSE_DecompressWksp is just a way to derive a pointer easily, without risk/complexity of calculating it manually. Not sure if this change is good enough to avoid ubsan warnings though.
the flexArray in structure FSE_DecompressWksp is just a way to derive a pointer easily, without risk/complexity of calculating it manually. Not sure if this change is good enough to avoid ubsan warnings though.
the flexArray in structure FSE_DecompressWksp is just a way to derive a pointer easily, without risk/complexity of calculating it manually. Not sure if this change is good enough to avoid ubsan warnings though.
Using a trailing 1-element array as a variable-length array is undefined behavior. Compilers have traditionally allowed this, as this is a traditional, pre-C99 convention. But it's being increasingly phased out in favor of C99 flexible arrays, which avoid ambiguity about whether the array is intended to be variable-length or not. gcc 13+ and clang 16+ support the -fstrict-flex-arrays=3 option, which Linux v6.5+ enables globally, resulting in UBSAN errors when these arrays are accessed. (See https://lore.kernel.org/r/[email protected])
Fix this by making zstd use C99 flexible arrays instead of 1-element arrays. This only affects three places (that I could find): one in fse_decompress.c and two in zstdmt_compress.c.
Note, this breaks the c89build CI job and thus might not be suitable for merging yet. I recommend that zstd drop support for C89/C90.