[WIP] Outer product implementation #690
Conversation
src/linalg/impl_linalg.rs
Outdated
Sb: Data<Elem = T>,
I: Dimension,
T: crate::ScalarOperand,
for<'a> &'a ArrayBase<Sb, I>: std::ops::Mul<T, Output = Array<T, I>>,
This seems inefficient since it's using a multiplication that produces an `Array` result; can we use a reusable scratch array for this, or some other way to make this mostly in-place?
Maybe using `Zip` to make the assignment directly without using `assign`.
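For illustration, that suggestion could look roughly like the following sketch, assuming ndarray 0.13-era APIs (`exact_chunks_mut`, `Zip::apply`) and specializing to the 2-D case; `kron_into` is a hypothetical name, not code from this PR:

```rust
use ndarray::{Array2, Zip};

// Write kron(a, b) into `res` without per-chunk temporaries.
// `res` must already have shape (a_rows * b_rows, a_cols * b_cols).
fn kron_into(a: &Array2<f64>, b: &Array2<f64>, res: &mut Array2<f64>) {
    let (bm, bn) = b.dim();
    // One result chunk per element of `a`; each chunk receives a_elem * b,
    // written element by element instead of allocating a temporary Array.
    Zip::from(res.exact_chunks_mut((bm, bn)))
        .and(a)
        .apply(|mut chunk, &a_elem| {
            Zip::from(&mut chunk)
                .and(b)
                .apply(|r, &b_elem| *r = a_elem * b_elem);
        });
}
```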
Hi!
Yes indeed, this is not the most efficient. Yet, I haven't really found a function allowing me to directly perform a multiplication into a specified array/buffer. A possibility would be to use `MulAssign`, but then I would have to add a pass to copy the first matrix and then mul-assign each chunk, which I'm not sure would be more efficient. Another possibility is to just go and write ad-hoc code for the multiplication, but that creates code duplication, and if the `std::ops` implementations ever benefit from performance improvements (for example by using BLAS), these functions won't.
> Maybe using `Zip` to make the assignment directly without using `assign`.
Not sure what you mean here.
Let me know what you think, and thanks for your review.
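For concreteness, the two-pass `MulAssign` alternative described above might look like this sketch (again assuming ndarray 0.13-era APIs and the 2-D case; `kron_two_pass` is a hypothetical name). Whether the extra copy pass loses to a direct per-element write is exactly the trade-off in question:

```rust
use ndarray::{Array2, Zip};

// Two passes per chunk: copy `b` into the chunk, then scale it in place.
fn kron_two_pass(a: &Array2<f64>, b: &Array2<f64>) -> Array2<f64> {
    let (am, an) = a.dim();
    let (bm, bn) = b.dim();
    let mut res = Array2::<f64>::zeros((am * bm, an * bn));
    Zip::from(res.exact_chunks_mut((bm, bn)))
        .and(a)
        .apply(|mut chunk, &a_elem| {
            chunk.assign(b); // pass 1: copy b into the chunk
            chunk *= a_elem; // pass 2: scalar MulAssign, in place
        });
    res
}
```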
Hey @bluss, allow me to bump the discussion: do you have any opinion? Let me know if something isn't clear.
This new API is easier to use and takes into account the comments of @bluss. Apart from function names, I'm satisfied with it. Let me know if you have any comments, and I'll start writing documentation!
@LukeMathWalker @jturner314 I'm not sure @bluss is around anymore, do you have any input?
I won't be around much, but I'm trying to get 0.13.1 done when I'm here, and maybe get started on the next release. Why do we have both generic-dim and dynamic-dim functions? If you replace the dimension type parameter with `IxDyn`, both functions have the same type signature. That should mean we only need one function of each kind. Your methods are not callable by crate users, because they are not public. I suppose that tells us this is a draft PR, and that's ok.
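To illustrate that point (signature only; hypothetical names, not the PR's actual code): with a single generic signature, the dynamic-dimension version is just the `D = IxDyn` instantiation.

```rust
use ndarray::{Array, ArrayBase, Data, Dimension};

// One generic signature covers both cases: static-dim callers instantiate
// D = Ix2 (etc.), dynamic-dim callers simply use D = IxDyn.
fn kron<Sa, Sb, T, D>(a: &ArrayBase<Sa, D>, b: &ArrayBase<Sb, D>) -> Array<T, D>
where
    Sa: Data<Elem = T>,
    Sb: Data<Elem = T>,
    D: Dimension,
{
    unimplemented!() // body elided; the signature is the point here
}
```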
In general, should this be in ndarray-linalg or here? I think as much as possible can go in ndarray-linalg, but have we defined the boundaries? Since this is more about the dimensionality of the operation and less about the linalg, maybe the ndarray crate is a good fit(?). Maybe @LukeMathWalker knows.
src/linalg/impl_linalg.rs
Outdated
.collect();

unsafe {
    let mut res: ArrayD<T> = ArrayBase::uninitialized(res_dim);
If you have time, this can be ported to `Array::maybe_uninit` now. Look into using `Zip::apply_assign_into` - see example in tests? I'd still use `T: Copy` as the restriction, as long as we don't handle drop of partially filled arrays on panics.
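A minimal sketch of the suggested pattern, mirroring the snippet later in this thread; it assumes the ndarray 0.13.1-era names `Array::maybe_uninit`, `Zip::apply_assign_into`, and `assume_init` (later releases renamed these), and `scaled_copy` is a made-up example function:

```rust
use ndarray::{Array2, Zip};
use std::mem::MaybeUninit;

// Fill an uninitialized array through Zip, then assert it is fully written.
fn scaled_copy(a: &Array2<f64>, factor: f64) -> Array2<f64> {
    let mut res = Array2::<f64>::maybe_uninit(a.raw_dim());
    Zip::from(a).apply_assign_into(res.view_mut(), |&x| MaybeUninit::new(x * factor));
    // Safe: the Zip assigned every element exactly once. With T: Copy there
    // are no drops of partially filled data to worry about on panic.
    unsafe { res.assume_init() }
}
```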
Nice!
Hi @bluss, thanks for your comments.
I made it so because, in general, if you take the Kronecker product of two matrices, one with n dimensions and the other with m dimensions, the result will have max(n, m) dimensions. As the "max" operation is not implemented on types having the `Dimension` trait, we have to lose the dimensionality information at the type level. But the special case where both matrices have the same dimensionality is quite common, and it allows us to preserve the dimensionality info at the type level, hence my choice of two implementations.
Yes, in general I think it should be up to you, the crate maintainers, to decide the API. I have implemented the modifications for both.
I'm not sure I fully understand what you mean. In general, I think you should double-check my unsafe code; I'm pretty new to this, so I'm not so confident it is ok.
In one case the dimensionality inputs are `D` and `D` (where `D` is a type parameter for the dimension) and the output dimensionality is `D`. For the dynamic case, the dims of the inputs are `IxDyn`, `IxDyn` and the output has type `IxDyn`, so it looks like it can be handled by the first case (substitute `D = IxDyn` and they are the same). There might be something I'm missing. Edit: (reading more) it seems like it will work; both operations are really the same, just adapting to the dimensionality. Making one interface with type parameter `D` will work, but it might be more tricky to code.
src/linalg/impl_linalg.rs
Outdated
{
    let mut res_dim = a.raw_dim();
    let mut res_dim_view = res_dim.as_array_view_mut();
    res_dim_view *= &b.raw_dim().as_array_view();
Obviously I kind of love this, that we can use array methods on dimensions 🙂.
The unsafe code looks good; we only need to prove that all elements are assigned to, and they will be if the exact chunks don't leave any uneven remainder. And that looks good to me: B's shape evenly divides the result's shape, right?
Ah no, we also have to check the product. We need to use a saturating multiplication; otherwise we can overflow, and then the above doesn't hold?
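For example, the shape computation could guard the elementwise products explicitly. A hypothetical helper (using checked rather than saturating multiplication, so overflow fails loudly instead of producing a huge shape):

```rust
use ndarray::IxDyn;

// Result shape of kron: the elementwise product of the input shapes,
// with each multiplication checked instead of silently wrapping.
fn kron_dim(a_shape: &[usize], b_shape: &[usize]) -> IxDyn {
    assert_eq!(a_shape.len(), b_shape.len());
    let dims: Vec<usize> = a_shape
        .iter()
        .zip(b_shape)
        .map(|(&x, &y)| x.checked_mul(y).expect("kron result shape overflows usize"))
        .collect();
    IxDyn(&dims)
}
```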
It is indeed quite tricky to code: I think the problem can be seen as: what should the type of
I ended up with the current solution, which could replace the zeros with some uninitialized memory, but that would require unsafe code, I think. I can also try to optimize a bit for the case where the number of dimensions is the same. I also fixed a bug in the chunks/inputs shape compatibility for different ndims, and added two new tests for that.
I think both problems can be solved 😸. Since I've written most of ndarray, I have done a fair bit of weird dimension hacks. Not sure if it needs to be solved right now; I can't commit that much time at the moment, otherwise I'd do it. Creating a new dimension value:
// Reshape input arrays to compatible shapes
let a_reshape = a.view().into_shape(a_shape_completed).unwrap();
let b_reshape = b.view().into_shape(b_shape_completed.clone()).unwrap();
into_shape is unfortunately not general enough to always succeed here :( I'm sorry if I have pushed you over to this solution with into_shape. We'll need to think about what we can do, because we know the completed shape is compatible.
What's the issue with it? The layout constraint?
It only supports c/f-contiguous arrays, unfortunately.
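A quick demonstration of the constraint, assuming the 0.13-era `into_shape` behavior (it errors whenever the source isn't C- or F-contiguous, even if the element counts match):

```rust
use ndarray::{s, Array2};

fn main() {
    let a = Array2::<f64>::zeros((4, 4));
    // A strided view (every other column) is neither C- nor F-contiguous.
    let v = a.slice(s![.., ..;2]); // shape (4, 2)
    // Same element count as the target shape, but the reshape still fails.
    assert!(v.into_shape(8).is_err());
}
```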
The shape-completed code can be much simpler, I guess. Make views `av` and `bv` and use `.insert_axis()` until they both have the same number of axes.
Hm no, `insert_axis` changes the type, that's no good :(
One can check if it's a dyn dimension - then convert to explicitly typed `ArrayD` and convert back to `Array<_, D>` later with `.into_dimensionality()`? Is that truly the best way? Not sure.
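A sketch of that round trip (hypothetical helper; `into_dyn`, `insert_axis`, and `into_dimensionality` are existing ndarray methods):

```rust
use ndarray::{ArrayView, ArrayViewD, Axis, Dimension};

// Pad a view with leading length-1 axes. Going through IxDyn lets the axis
// count change freely; the caller can restore a static dimension type
// afterwards with .into_dimensionality::<D2>() if needed.
fn pad_to_ndim<'a, A, D: Dimension>(v: ArrayView<'a, A, D>, ndim: usize) -> ArrayViewD<'a, A> {
    let mut dv = v.into_dyn();
    while dv.ndim() < ndim {
        dv = dv.insert_axis(Axis(0));
    }
    dv
}
```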
Zip::from(&b_reshape)
    .apply_assign_into(res_chunk, |&b_elem| MaybeUninit::new(f(a_elem, b_elem)))
});
// This is safe because the exact chunks covered exactly the res
nice
I'm closing this as too old. If anyone wants to rework this, feel free to let me know.
Hi!
This is my first contribution to ndarray! :)
This deals with issue #652.
There are two proposed functions here, more or less general. I think that a lot of choices can be made (should the `f` function be able to work on chunks or not?), and efficiency-wise, I am not too sure about what I have. Anyhow, I think it is a nice starting point, so let me know what you think about it, whether interface- or implementation-wise. Also, I didn't really know where to put the code, so if you want to move it somewhere else, please do.