I used the Flask-based GUI to view roofline data for a kernel that is heavy on INT32 VALU arithmetic instructions but has no FP16, FP32, FP64, or INT8 instructions. Because of this, the plot marker for this kernel does not show up on the displayed Roofline plots. This can be disorienting: the user is left wondering why the kernel is missing from the plots, and the reason may only become clear when looking at the instruction mix further down. Is there any possibility or plan to extend the roofline plots to show the performance of kernels heavy on INT32 arithmetic?
Testing info:
ROCm v5.4.3
OmniPerf v1.0.6
Hardware: Single GCD of MI250x
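To illustrate the symptom, here is a minimal sketch of how a roofline point is typically computed. It is not taken from the Omniperf source; the function names and counter values are hypothetical, and the assumption is that the plots use logarithmic axes and count only the data types named above. Under that assumption, a kernel with zero counted operations has an arithmetic intensity of zero, which has no position on a log axis.

```python
def roofline_point(op_count, bytes_moved, runtime_s):
    """Return (arithmetic intensity, achieved ops/s) for one kernel.

    op_count    -- operations of the data type the plot tracks (hypothetical counter)
    bytes_moved -- bytes transferred at the memory level of interest
    runtime_s   -- kernel runtime in seconds
    """
    intensity = op_count / bytes_moved   # operations per byte
    throughput = op_count / runtime_s    # operations per second
    return intensity, throughput


def plottable_on_log_axes(intensity, throughput):
    """A point can be drawn on log-log axes only if both coordinates are positive."""
    return intensity > 0 and throughput > 0


# Hypothetical INT32-heavy kernel: no FP or INT8 operations are counted,
# so the point computed for an FP or INT8 plot collapses to (0, 0).
fp_point = roofline_point(op_count=0, bytes_moved=1_000_000, runtime_s=1e-3)

# The same kernel, if INT32 operations were counted instead (made-up count).
int32_point = roofline_point(op_count=4_000_000, bytes_moved=1_000_000, runtime_s=1e-3)

print(plottable_on_log_axes(*fp_point))     # zero intensity: no marker
print(plottable_on_log_axes(*int32_point))  # positive intensity: marker visible
```

If this is what happens, counting INT32 operations (or flagging kernels with zero counted operations) would make the missing marker easier to understand.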
Thanks for reaching out @mrowan137. The reasoning behind our two Empirical Roofline plots is:
- FP32/FP64 for HPC applications
- FP16/INT8 for ML applications
My understanding is that these data types encapsulate a majority of the arithmetic for these two crowds. To justify adding this to our model, could you tell me a little more about your application and what group this would fit in?
Hi @coleramos425, the data I collected are from an application called ALE3D, which we are supporting at LLNL as part of ELCAP bring-up. This would fall in the HPC category.