-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AssertionError: M == 1 #506
Comments
My suggested workaround is function testy(N, MDM, rhs, fx)
NN = N^2
rng = 1:NN
v_rhs = reshape(view(rhs, rng), (N,N))
v_fx = reshape(view(fx, rng), (N,N))
x = 1.0
@turbo for j = 1:N
for i = 1:N
rhsij = 0.0
for k = 1:N
# rhsij += MDM[i,k] * v_fx[k+(N-1)*j]
rhsij += MDM[i,k] * v_fx[k,j]
end
v_rhs[i,j] = rhsij * x
end
end
end I haven't tested it, but I assume doing function testy(N, MDM, rhs, fx)
NN = N^2
rng = 1:NN
v_rhs = reshape(view(rhs, rng), (N,N))
v_fx = reshape(view(fx, rng), (N,N))
x = 1.0
@turbo for j = 1:N
for i = 1:N
rhsij = 0.0
for k = 1:N
# rhsij += MDM[i,k] * v_fx[k+(N-1)*j]
rhsij += MDM[i,k] * v_fx[k,j]
end
rhsijx = rhsij * x
v_rhs[i,j] = rhsijx
end
end
end |
Indeed, both versions work. Is this a user problem, a current limitation of LV or just a bug? |
I'd say it is just a bug, but I am personally unlikely to spend any time fixing it. |
Its fine with me, since your suggested workaround does the job quite well. Btw I encountered another issue when extending the above MWE by a second inner loop. Consider using LoopVectorization
function testy(N, MDM, rhs, fx)
NN = N^2
rng = 1:NN
v_rhs = reshape(view(rhs, rng), (N,N))
v_fx = reshape(view(fx, rng), (N,N))
x = 1.0
@turbo for j = 1:N
for i = 1:N
rhsij = 0.0
for k = 1:N
rhsij += MDM[i,k] * v_fx[k,j]
end
for l = 1:N # changing this to k gives silently wrong results
rhsij += MDM[i,l] * v_fx[l,j]
end
v_rhs[i,j] = rhsij * x
end
end
end
N = 20
MDM = reshape(Float64.(collect(1:N^2)), (N,N))
rhs = deepcopy(MDM)
fx = deepcopy(MDM)
testy(N, MDM, rhs, fx)
display(sum(rhs)) Depending on whether you use I wonder if this falls under this limitation from the README?
|
LoopVectorization doesn't support that. LoopModels will. LoopVectorization.jl requires a loop chain from outer to inner, i.e. no trees. |
So I shouldn't rely on the code to work, although when using different indices |
I am surprised it happened to work at all. Maybe if you try different (larger) sizes, it'll give wrong results? |
Indeed, for Luckily, in my application I can split the extra loop off into a separate |
I was thinking it nested all the loops, meaning it made a chain of loops 4 deep, instead of a tree with 2 leaves. |
Firstly, many thanks for this amazing package!
I managed to cook it down to the following MWE
gives
Changing one of the following makes it work though
@turbo
(obviously)@turbo
onto thei
loopN
to 7 or smallerrhsij += MDM[i,k] * v_fx[k+(N-1)*j]
rhsij *= x
from within thei
loopMy setup is
and using
LoopVectorization v0.12.165
.The text was updated successfully, but these errors were encountered: