Aliasing in geadd
#4538
Comments
As recently as 2015 it seems, back when I was just a hapless user... At first glance, the implementation uses AXPY, so I am not sure whether aliasing is safe to use on all platforms (given the various optimized assembler routines for that).
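To make the concern concrete, here is a minimal sketch of how a two-pass formulation (scale first, then AXPY) could go wrong under aliasing. The helpers `scal_ref`, `axpy_ref`, and `geadd_two_pass` are hypothetical plain-C stand-ins written for this illustration; they are not OpenBLAS's kernels, and the actual implementation may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for xSCAL: x := beta * x. */
static void scal_ref(size_t n, double beta, double *x)
{
    for (size_t i = 0; i < n; ++i)
        x[i] *= beta;
}

/* Hypothetical stand-in for xAXPY: y := alpha * x + y. */
static void axpy_ref(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* Two-pass vector form of c := alpha * a + beta * c (an assumed
 * formulation, for illustration only). It is correct when a and c are
 * distinct; when a == c, the first pass has already overwritten the
 * input that the second pass reads. */
static void geadd_two_pass(size_t n, double alpha, const double *a,
                           double beta, double *c)
{
    assert(a != NULL && c != NULL);
    scal_ref(n, beta, c);      /* c := beta * c      */
    axpy_ref(n, alpha, a, c);  /* c := alpha * a + c */
}
```

With `alpha = 2`, `beta = 3`, and one buffer passed as both `a` and `c`, this version scales the input by `beta * (1 + alpha)` instead of the intended `alpha + beta`.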
This is actually coded as B := alpha*A + beta*B without a standalone C ever coming into play. (Presumably this is how ATLAS did it - I have not checked there yet, and it certainly does not help that IBM's ESSL has another, more elaborate API for ?GEADD that even supports transposition of the matrices.)
There's a similar one from Intel too (
That reads a lot like ESSL's version of GEADD... and there's already a request to implement this (#4236)
Fortran's calling rules assume that array arguments do not alias each other, but many BLAS functions like `ddot` nevertheless still work correctly with aliasing in the `const` inputs - e.g. one can pass the same array as both `x` and `y`.
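As an illustration, the following sketch passes one array as both vector arguments of a dot product. `dot_ref` is a hypothetical plain-C stand-in with a `ddot`-like argument order, not the library routine; it only shows why read-only aliasing is harmless for a loop of this shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference dot product with a ddot-like argument order.
 * Both inputs are only read, so x and y may point at the same memory. */
static double dot_ref(size_t n, const double *x, size_t incx,
                      const double *y, size_t incy)
{
    assert(incx > 0 && incy > 0);  /* this sketch handles positive strides only */
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i * incx] * y[i * incy];
    return sum;
}

/* Squared Euclidean norm obtained by aliasing x with itself. */
static double squared_norm(size_t n, const double *x)
{
    return dot_ref(n, x, 1, x, 1);
}
```

Because nothing is written through either pointer, the result is the same as if two separate copies of the array had been passed.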
OpenBLAS recently introduced the `geadd` functions (https://github.com/OpenMathLib/OpenBLAS/wiki/OpenBLAS-Extensions), which perform matrix addition.
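Written out, the update discussed in this thread has the form C := alpha*A + beta*C, with the second matrix serving as both input and output. A minimal plain-C sketch of that semantics, assuming column-major storage, could look like this; `geadd_ref` is a hypothetical name, and the loop is not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop: C := alpha * A + beta * C.
 * A and C are m-by-n, column-major, with leading dimensions lda and ldc. */
static void geadd_ref(size_t m, size_t n,
                      double alpha, const double *a, size_t lda,
                      double beta, double *c, size_t ldc)
{
    assert(lda >= m && ldc >= m);
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i)
            c[i + j * ldc] = alpha * a[i + j * lda] + beta * c[i + j * ldc];
}
```

Each output element depends only on the elements of A and C at the same (i, j) position, which is the property the aliasing question relies on.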
Compared to functions like `gemm`, in these each input value contributes to exactly one output value, so if it were implemented as a simple C loop, it should work correctly even if `C` is aliased with `A` or `B`.

From some quick experiments, it seems to work as expected when `B` and `C` refer to the same array, but I want to ask nevertheless: is this function guaranteed to produce correct output when there is aliasing in the inputs?
i.e. can it be used to perform an operation like this?
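As a hypothetical illustration of such a call (the routine name `add_scaled` and its three-operand signature are assumptions made for this sketch, not an OpenBLAS API), an element-wise loop declared without `restrict` gives a well-defined result when the output shares storage with an input, because each output element depends only on the input elements at the same index:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-operand add: out[i] = alpha * a[i] + beta * b[i].
 * No pointer is restrict-qualified, so the compiler must assume they may
 * overlap; out may be exactly a or exactly b. Partial overlap (shifted
 * pointers) is NOT safe, since later reads could see earlier writes. */
static void add_scaled(size_t n, double alpha, const double *a,
                       double beta, const double *b, double *out)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = alpha * a[i] + beta * b[i];
}

/* In-place update b := alpha * a + beta * b, done by aliasing out with b. */
static void add_scaled_inplace(size_t n, double alpha, const double *a,
                               double beta, double *b)
{
    add_scaled(n, alpha, a, beta, b, b);
}
```

This property holds for the plain loop only; hand-written assembly or vectorized kernels may make their own assumptions about overlap.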
If so, I think it'd be a nice addition to the documentation, given that the function signatures do not use `restrict`.