Describe the suggestion
Build SMI-sampling to get more accurate clock rates
Justification
See #149 (comment). Ideally, the profiler would give us a reliable average clock rate for each kernel. Note that 'other' profilers either let the user control the clock rate (subject to throttling) or actively report the number of clocks elapsed in various time domains.
Implementation
It can be as simple as spinning up a background thread to sample the clock rate while the application runs, but a more robust version would attribute clock samples to specific kernels (e.g., if we also control rocprofiler directly). We might also ask for enhanced rocprofiler support.
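A minimal sketch of the background-thread variant, to make the idea concrete. Everything named here is hypothetical: `ClockSampler` is not an existing class, and `read_clock_mhz` stands in for whatever SMI query we end up using (the sketch deliberately takes it as a callable rather than assuming a particular SMI API). Timestamps come from the host's monotonic clock, so mapping samples onto kernel BeginNs/EndNs would additionally require aligning the two time domains.

```python
import threading
import time
from statistics import mean
from typing import Callable, List, Optional, Tuple


class ClockSampler:
    """Polls a clock-reading callable on a background thread (hypothetical helper)."""

    def __init__(self, read_clock_mhz: Callable[[], float], interval_s: float = 0.01):
        # read_clock_mhz is an assumption: any callable returning the current clock in MHz.
        self._read = read_clock_mhz
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.samples: List[Tuple[int, float]] = []  # (host timestamp in ns, MHz)

    def _run(self) -> None:
        while not self._stop.is_set():
            self.samples.append((time.monotonic_ns(), self._read()))
            # wait() returns early once stop is set, so shutdown is prompt.
            self._stop.wait(self._interval_s)

    def __enter__(self) -> "ClockSampler":
        self._thread.start()
        return self

    def __exit__(self, *exc) -> bool:
        self._stop.set()
        self._thread.join()
        return False

    def mean_mhz(self, begin_ns: Optional[int] = None, end_ns: Optional[int] = None) -> float:
        """Average the samples whose host timestamp falls inside [begin_ns, end_ns]."""
        picked = [mhz for t, mhz in self.samples
                  if (begin_ns is None or t >= begin_ns)
                  and (end_ns is None or t <= end_ns)]
        if not picked:
            raise ValueError("no clock samples in the requested window")
        return mean(picked)
```

Usage would look like `with ClockSampler(reader) as s: run_app()` followed by `s.mean_mhz()`. Short kernels may fall entirely between two samples, which is one reason the per-kernel approach would be more robust.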
Additional Notes
We might also be able to get an approximate clock rate from GRBM_GUI_ACTIVE / (EndNs - BeginNs), i.e. busy cycles divided by the kernel's elapsed time in nanoseconds, which gives cycles per nanosecond (GHz)?
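As a rough sketch of that estimate: a frequency is cycles divided by elapsed time, so the counter goes in the numerator. The function name and argument names below are hypothetical, and the calculation assumes the GRBM_GUI_ACTIVE value was collected for the same dispatch as the timestamps and that the GPU was busy for the whole interval; if it idled for part of it, the result underestimates the real clock.

```python
def estimate_clock_mhz(begin_ns: int, end_ns: int, grbm_gui_active: int) -> float:
    """Approximate average clock in MHz for one kernel (hypothetical helper).

    grbm_gui_active is assumed to be the busy-cycle count for the same
    dispatch that produced begin_ns and end_ns.
    """
    elapsed_ns = end_ns - begin_ns
    if elapsed_ns <= 0:
        raise ValueError("end_ns must be greater than begin_ns")
    cycles_per_ns = grbm_gui_active / elapsed_ns  # cycles per ns == GHz
    return cycles_per_ns * 1e3                    # GHz -> MHz


# 1,500,000 busy cycles over 1 ms works out to 1.5 GHz.
print(estimate_clock_mhz(0, 1_000_000, 1_500_000))
```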
Originally posted by @arghdos in #153 (comment)