Skip to content

Latest commit

 

History

History
97 lines (80 loc) · 5.83 KB

README.md

File metadata and controls

97 lines (80 loc) · 5.83 KB

LinuxPerf.jl -- Julia wrapper for Linux's perf

Build Status Coverage

the kernel multiplexes event counter that requires limited hardware resources so some counters are only active for a fraction of the running time (% on the right).

if you need to compare two quantities you must put them in the same event group so they are always scheduled at the same time (or not at all).

julia> using LinuxPerf

julia> @noinline function g(a)
           c = 0
           for x in a
               if x > 0
                   c += 1
               end
           end
           c
       end
g (generic function with 1 method)

julia> g(zeros(10000))
0

julia> data = zeros(10000); @measure g(data)
┌───────────────────────┬────────────┬─────────────┐
│                       │ Events     │ Active Time │
├───────────────────────┼────────────┼─────────────┤
│             hw:cycles │ 25,583,165100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│       hw:cache_access │ 1,640,429100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│       hw:cache_misses │ 328,561100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│           hw:branches │ 6,164,138100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│ hw:branch_mispredicts │ 223,272100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│       hw:instructions │ 28,115,285100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│       sw:ctx_switches │ 0100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│        sw:page_faults │ 41100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│  sw:minor_page_faults │ 41100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│  sw:major_page_faults │ 0100.0 %     │
├───────────────────────┼────────────┼─────────────┤
│     sw:cpu_migrations │ 0100.0 %     │
└───────────────────────┴────────────┴─────────────┘

The @pstats macro provides another (perhaps more concise) tool to measure performance events, which can be used in the same way as @timed of the standard library. The following example measures default events and reports its summary:

julia> using LinuxPerf, Random

julia> mt = MersenneTwister(1234);

julia> @pstats rand(mt, 1_000_000);  # compile

julia> @pstats rand(mt, 1_000_000)  # default events
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
┌ cpu-cycles               2.88e+06   58.1%  #  1.2 cycles per ns
│ stalled-cycles-frontend  9.50e+03   58.1%  #  0.3% of cycles
└ stalled-cycles-backend   1.76e+06   58.1%  # 61.2% of cycles
┌ instructions             1.11e+07   41.9%  #  3.9 insns per cycle
│ branch-instructions      5.32e+05   41.9%  #  4.8% of insns
└ branch-misses            2.07e+03   41.9%  #  0.4% of branch insns
┌ task-clock               2.38e+06  100.0%  #  2.4 ms
│ context-switches         0.00e+00  100.0%
│ cpu-migrations           0.00e+00  100.0%
└ page-faults              1.95e+03  100.0%
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

julia> insn = "(instructions,branch-instructions,branch-misses)"

julia> @pstats insn rand(mt, 1_000_000)  # specific events
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
┌ instructions             1.05e+07  100.0%
│ branch-instructions      5.03e+05  100.0%  #  4.8% of insns
└ branch-misses            2.01e+03  100.0%  #  0.4% of branch insns
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

See the documentation of @pstats for more details and available options.

For more fine tuned performance profile examples, please check out the test directory.

Similar tools

If you're willing to install separate software, you might want to check out LIKWID.jl which provides a simpler and more powerful interface to hardware performance monitoring.