
Filtering by block doesn't consider cross-block dependencies for metrics #21

Open
skyreflectedinmirrors opened this issue Nov 10, 2022 · 4 comments
Labels: bug (Something isn't working), Profiling (Related to the profiling done in Omniperf)

@skyreflectedinmirrors
Contributor

Specifically, we noticed this while trying to collect the coalescing metric (which lives in the TCP section):

https://github.com/AMDResearch/omniperf/blob/62d130b458a21a2c964da234cf7a24420e01efe1/src/omniperf_cli/configs/gfx90a/1600_L1_cache.yaml#L20

but its expression uses values from the TA block (i.e., TA_TOTAL_WAVEFRONTS_sum).

So, if a user does:

```shell
omniperf profile -b TCP -n bar -- <foo>
omniperf analyze -p workloads/bar/mi200
```

the resulting Buffer Coalescing value in the L1 section will be empty.
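To illustrate the shape of the problem (a simplified, hypothetical sketch; the real definition is at the link above, and the key names and formula here are illustrative, not the actual config):

```yaml
# Hypothetical sketch, NOT the actual 1600_L1_cache.yaml: a metric in the
# L1/TCP section whose expression references a counter from the TA block.
Buffer Coalescing:
  # TA_TOTAL_WAVEFRONTS_sum belongs to the TA block, so profiling with
  # `-b TCP` never collects it and the metric evaluates to empty.
  expr: AVG(TA_TOTAL_WAVEFRONTS_sum / TCP_TOTAL_ACCESSES_sum)  # illustrative formula
```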

@coleramos425
Collaborator

Ah, good catch. Thanks for reporting this.

We'll have to refine the logic for IP block filtering to account for metrics that reference counters from other blocks, such as this one. We'll add this to the next release.
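One way to sketch that refinement (a hypothetical illustration, not Omniperf's actual implementation): scan the metric expressions for counter names, map each counter's prefix to its IP block, and expand the user's `-b` filter with any extra blocks found. The prefix-to-block mapping below is an assumption for illustration only.

```python
import re

# Assumed prefix -> IP-block mapping, for illustration only.
PREFIX_TO_BLOCK = {"TA": "TA", "TCP": "TCP", "TD": "TD", "SQ": "SQ", "SQC": "SQC"}

# Counters look like TA_TOTAL_WAVEFRONTS_sum: the uppercase prefix before
# the first underscore identifies the hardware block.
COUNTER_RE = re.compile(r"\b([A-Z]+)_[A-Za-z0-9_]+")

def blocks_needed(metric_exprs, requested_blocks):
    """Return the set of IP blocks required to evaluate the given metrics."""
    needed = set(requested_blocks)
    for expr in metric_exprs:
        for prefix in COUNTER_RE.findall(expr):
            if prefix in PREFIX_TO_BLOCK:
                needed.add(PREFIX_TO_BLOCK[prefix])
    return needed
```

With the Buffer Coalescing example, filtering on TCP would then transparently pull in TA as well, so the referenced counter is still collected.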

@coleramos425 coleramos425 added this to the v1.0.5 milestone Nov 11, 2022
@coleramos425 coleramos425 self-assigned this Nov 11, 2022
@coleramos425 coleramos425 added bug Something isn't working Profiling Related to the profiling done in Omniperf labels Nov 11, 2022
@coleramos425
Collaborator

Adding this to a future milestone. IP Block, dispatch, and kernel filtering are going to be overhauled when we introduce alternative profiling to users.

This alternative profiling option will introduce a single output CSV, where organizing logical IP blocks is much easier. It will also eliminate the issue we have with metrics that use counters from different blocks, like our Memory Chart. A similar issue is described below.

The issue was that these metrics used SQ_ACCUM_PREV_HIRES, a counter generated in several IP blocks, so we needed to specify which CSV to pull the counter from in the .yaml configs.

Another issue exists with these two cache latencies:

[image: screenshot of the two cache-latency metrics]

The expressions for these metrics use counters from two IP blocks, e.g.:

`L1D Cache Latency = AVG(SQ_ACCUM_PREV_HIRES [from SQ_IFETCH_LEVEL] / SQC_DCACHE_REQ [from pmc_perf])`

The coll_level fix used above won't work for these, as two different CSVs would need to be specified, and coll_level only lets us specify one. We could either:

- Reorder performance counters in the rocprof perfmon config
- Modify the CLI tool's coll_level implementation
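A sketch of what the second option might look like (hypothetical, not the actual coll_level code): instead of a single collection-level file per metric, map each counter to the CSV it was collected in, so one metric expression can mix counters from different files. The file names follow the comment above; the table layout is an assumption for illustration.

```python
# Assumed counter -> source-CSV mapping, per the expression quoted above.
COUNTER_SOURCE = {
    "SQ_ACCUM_PREV_HIRES": "SQ_IFETCH_LEVEL.csv",  # accumulate collection
    "SQC_DCACHE_REQ": "pmc_perf.csv",              # default collection
}

def resolve(counter, tables):
    """Look up a counter value in the table loaded from its source CSV.

    tables: {csv_name: {counter_name: value}} -- one aggregated row per file.
    """
    return tables[COUNTER_SOURCE[counter]][counter]

def l1d_cache_latency(tables):
    # Follows the L1D Cache Latency expression quoted above; the AVG over
    # kernels is omitted for brevity.
    return resolve("SQ_ACCUM_PREV_HIRES", tables) / resolve("SQC_DCACHE_REQ", tables)
```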

@coleramos425 coleramos425 modified the milestones: v1.0.5, v1.0.6 Dec 12, 2022
@koomie koomie modified the milestones: v1.0.6, v.1.0.7 Dec 21, 2022
@coleramos425 coleramos425 modified the milestones: v.1.0.7, v.1.1.0 Feb 21, 2023
@coleramos425 coleramos425 modified the milestones: v.1.0.9, v1.1.0 Aug 14, 2023
@ppanchad-amd

Closing since ticket is no longer relevant. Thanks!

@ppanchad-amd ppanchad-amd closed this as not planned Oct 4, 2024
@skyreflectedinmirrors
Contributor Author

This is definitely still relevant.
