Remove unconditional NO_AVX512=1 for flang builds #4789
Comments
To repeat, I'm aware that this most likely needs upstream fixes, which I want to help bring about. Still, from my POV this issue makes sense as a place to figure out the OpenBLAS side of things and to track the removal of those work-arounds.
This is probably not related to use of AVX512 intrinsics enabled by the macro, but to the
(although the skylakex-avx512 option should get filtered out since 52b71a1 (March 22 of this year), as it is no longer supported by more recent versions of flang-new)
It wasn't supported before either; AFAIU it's just that flang 18 started erroring on unknown flags. In any case, the builds in conda-forge/openblas-feedstock#115 are based on 0.3.27, so they contain 52b71a1. I genuinely don't know if it's a bug/misconfiguration in OpenBLAS, or a compiler error in flang. In any case, if we can figure out (or someone can explain to me) how
I must admit I do not see how NO_AVX512 can influence Fortran targets - unless this is some weird register usage problem in their Fortran/C interoperability.
reconfiguring my SkylakeX system for testing locally.
Not sure if this'll play any role in this, but flang just gained support for
may be useful for future performance (or to introduce more fma-related deviations in the lapack test results...). my local test hit an unexpected problem in that some "#include"s of the actual sources by the cmake-generated files cannot be resolved by make, although the exact same absolute paths work for browsing the affected files. can't remember getting this before with msys...
include problem solved (path apparently too long) but flang 18.1.8 fails when compiling LAPACK's slamch (which does divisions by huge and near-zero numbers to determine machine constants). need to check if this was already reported/fixed
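As a rough illustration of the kind of arithmetic mentioned in the comment above, here is a minimal sketch in C (not the Fortran source of slamch, and not an OpenBLAS or LAPACK API). It assumes IEEE 754 single precision and computes a "safe minimum", i.e. a small positive value whose reciprocal does not overflow, by dividing by the largest finite float. The function name `demo_safe_min` is hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper for illustration only (not part of LAPACK/OpenBLAS).
 * Returns a small positive float whose reciprocal is still finite.
 * The division by the largest finite value is the sort of extreme-operand
 * operation a compiler has to handle correctly for such routines. */
float demo_safe_min(void)
{
    const float eps  = FLT_EPSILON * 0.5f; /* unit roundoff, round-to-nearest */
    const float tiny = FLT_MIN;            /* smallest normalized positive float */
    volatile float big = FLT_MAX;          /* volatile: keep the division at run time */
    float inv_big = 1.0f / big;            /* very small, possibly subnormal */
    float sfmin = tiny;

    assert(eps > 0.0f && tiny > 0.0f);

    /* If 1/big were not smaller than tiny, 1/tiny could overflow,
     * so nudge the result slightly above 1/big. */
    if (inv_big >= sfmin) {
        sfmin = inv_big * (1.0f + eps);
    }
    return sfmin;
}
```

On an IEEE 754 platform, 1/FLT_MAX is smaller than FLT_MIN, so the branch is not taken and the function returns FLT_MIN. A compiler that mishandles such constants or divisions would show up as a build failure or a wrong value in a routine like this.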
If you're able to use conda-forge compilers, you could install
which should give you a flang built off of llvm/llvm-project@3bb2563, so ~2 weeks old. LLVM 19 branches in about a week; I plan to have rc1 built soon after.
thank you. unfortunately I had a few problems with my miniconda installation - finally got a build that reproduces the blas3 test failures but have not yet found what causes them
NVM, I'm just too tired, forgot to add
So the conda-forge LLVM 19.1 (used in conjunction with VS2022) appears to work correctly even with AVX512 enabled. I am currently experiencing two problems with this setup though - with BUILD_STATIC_LIBS, all tests using CSCAL or ZSCAL fail to link due to an unresolved symbol
closing here as AVX512 appears to pose no additional problem anymore; any remaining issues can be addressed in the context of #4768
I want to figure out what's happening with the errors with flang when using the default build options on a platform that supports AVX512. This problem was already observed in #4016 (CC @mmuetzel), leading to work-arounds like the following
OpenBLAS/.github/workflows/dynamic_arch.yml
Lines 177 to 179 in e1eef56
However, these problems still occur in conda-forge with the in-progress flang 19, almost 3 releases later (cf. #4768); more precisely, the errors are
which matches what happened in #4016. There are some more detailed failure logs in that PR that I haven't yet tried to reproduce.
Before raising an upstream bug report, I would first like to properly understand what's happening in OpenBLAS itself, because for now I haven't been able to construct the link between NO_AVX512 and any Fortran code.
Running on Azure Pipelines, we regularly get skylakex agents, which have some AVX512 instructions and thus fall into the above failures (there are still some non-AVX512 agents around; when I caught one, the tests passed). As hoped, adding NO_AVX512=1 does in fact cause the tests to pass, with the following difference in configuration:
The macro HAVE_AVX512VL doesn't appear often outside of the config setup; basically the only usage AFAICT is
OpenBLAS/kernel/simd/intrin.h
Lines 59 to 61 in e1eef56
What I don't understand is how intrin_avx512.h influences any Fortran code.
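To make the question concrete, here is a minimal sketch in C of how a feature macro typically gates which SIMD backend a C translation unit is compiled against. This is not the content of the referenced lines in intrin.h; the macros `DEMO_HAVE_AVX512VL` and `DEMO_HAVE_AVX2` and the function names are hypothetical stand-ins chosen for illustration.

```c
#include <string.h>

/* Hypothetical feature macros, standing in for what a build system
 * would pass on the compiler command line (e.g. -DDEMO_HAVE_AVX512VL). */
#if defined(DEMO_HAVE_AVX512VL)
#  define DEMO_SIMD_BACKEND "avx512"
#elif defined(DEMO_HAVE_AVX2)
#  define DEMO_SIMD_BACKEND "avx2"
#else
#  define DEMO_SIMD_BACKEND "scalar"
#endif

/* Reports which backend this translation unit was compiled for. */
const char *demo_simd_backend(void)
{
    return DEMO_SIMD_BACKEND;
}

/* Returns 1 if the AVX512 branch was selected at preprocessing time. */
int demo_uses_avx512(void)
{
    return strcmp(demo_simd_backend(), "avx512") == 0;
}
```

The selection happens in the C preprocessor, so it only changes C sources that include the gated header. Fortran sources compiled by flang never see these macros directly, which is why a link between such a macro and the Fortran test failures would have to go through the C kernels the Fortran code calls, or through compiler flags set alongside the macro.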