The ARM PMU (Performance Monitor Unit) provides useful information for debugging and performance profiling on ARM-based platforms, and its driver is already included in the Linux kernel. This section focuses on how to enable PMU support on the RZ/G2L platform.
Add the following arm-pmu node to the RZ/G2L device tree in kernel-source/arch/arm64/boot/dts/renesas/r9a07g044l2.dtsi:
arm-pmu {
    compatible = "arm,armv8-pmuv3";
    interrupt-parent = <&gic>;
    interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
    interrupt-affinity = <&a55_0>, <&a55_1>;
};
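Once a kernel built with this node is running on the board, a quick sanity check is to read the node back from the live device tree. The following is a minimal sketch, assuming the node sits at the root of the tree and therefore appears as /proc/device-tree/arm-pmu. The helper name pmu_compatible is hypothetical and only used for illustration.

```sh
#!/bin/sh
# Sketch: print the "compatible" string of the arm-pmu node from a device tree directory.
# Assumption: the node is at the root of the tree, i.e. <dt_root>/arm-pmu.
# The first argument overrides the tree location (defaults to the live tree).
pmu_compatible() {
    dt_root="${1:-/proc/device-tree}"
    node="$dt_root/arm-pmu/compatible"
    # Fail if the node (or its compatible property) is not present.
    [ -r "$node" ] || return 1
    # Device tree strings are NUL-terminated; turn the terminator into a newline.
    tr '\0' '\n' < "$node"
}
```

On the target, running pmu_compatible should print arm,armv8-pmuv3 if the node was picked up. The PMU driver usually also logs a line containing "hw perfevents" at boot, so `dmesg | grep -i perfevents` can serve as a second check.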
PMU support requires the ARM PMU driver. Make sure the following two options are enabled in the kernel configuration:
CONFIG_PERF_EVENTS
CONFIG_ARM_PMU
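To confirm that both options actually ended up in the final configuration, the config file can be checked with a small script. This is a sketch only: the helper name check_kconfig is hypothetical, and it assumes the options are built in (=y) rather than modules.

```sh
#!/bin/sh
# Sketch: report whether each given Kconfig symbol is set to "y" in a config file.
# Usage: check_kconfig <config-file> <SYMBOL>...
# Returns non-zero if at least one symbol is not enabled.
check_kconfig() {
    cfg="$1"; shift
    missing=0
    for sym in "$@"; do
        # Match the exact line "SYMBOL=y"; "# SYMBOL is not set" does not match.
        if grep -q "^${sym}=y$" "$cfg"; then
            echo "$sym: enabled"
        else
            echo "$sym: NOT enabled"
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call from the kernel build directory would be `check_kconfig .config CONFIG_PERF_EVENTS CONFIG_ARM_PMU`. On a running target, the same check works on a copy of /proc/config.gz unpacked with zcat, provided the kernel was built with CONFIG_IKCONFIG_PROC.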
In build/conf/local.conf, add the perf command to the image:
IMAGE_INSTALL_append = " perf"
After rebuilding the image and booting the board, the perf command is available on the target:
root@smarc-rzg2l:~# perf

 usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

 The most commonly used perf commands are:
   annotate        Read perf.data (created by perf record) and display annotated code
   archive         Create archive with object files with build-ids found in perf.data file
   bench           General framework for benchmark suites
   buildid-cache   Manage build-id cache.
   buildid-list    List the buildids in a perf.data file
   c2c             Shared Data C2C/HITM Analyzer.
   config          Get and set variables in a configuration file.
   data            Data file related processing
   diff            Read perf.data files and display the differential profile
   evlist          List the event names in a perf.data file
   ftrace          simple wrapper for kernel's ftrace functionality
   inject          Filter to augment the events stream with additional information
   kallsyms        Searches running kernel for symbols
   kmem            Tool to trace/measure kernel memory properties
   kvm             Tool to trace/measure kvm guest os
   list            List all symbolic event types
   lock            Analyze lock events
   mem             Profile memory accesses
   record          Run a command and record its profile into perf.data
   report          Read perf.data (created by perf record) and display the profile
   sched           Tool to trace/measure scheduler properties (latencies)
   script          Read perf.data (created by perf record) and display trace output
   stat            Run a command and gather performance counter statistics
   test            Runs sanity tests.
   timechart       Tool to visualize total system behavior during a workload
   top             System profiling tool.
   probe           Define new dynamic tracepoints
   trace           strace inspired tool

 See 'perf help COMMAND' for more information on a specific command.
With the PMU enabled, perf list shows the hardware and armv8_pmuv3 events in addition to the software events:
root@smarc-rzg2l:~# perf list

List of pre-defined events (to be used in -e):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]

  alignment-faults                                   [Software event]
  bpf-output                                         [Software event]
  context-switches OR cs                             [Software event]
  cpu-clock                                          [Software event]
  cpu-migrations OR migrations                       [Software event]
  dummy                                              [Software event]
  emulation-faults                                   [Software event]
  major-faults                                       [Software event]
  minor-faults                                       [Software event]
  page-faults OR faults                              [Software event]
  task-clock                                         [Software event]

  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-store-misses                             [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  L1-icache-loads                                    [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  iTLB-loads                                         [Hardware cache event]

  armv8_pmuv3/br_immed_retired/                      [Kernel PMU event]
  armv8_pmuv3/br_mis_pred/                           [Kernel PMU event]
  armv8_pmuv3/br_mis_pred_retired/                   [Kernel PMU event]
  armv8_pmuv3/br_pred/                               [Kernel PMU event]
  armv8_pmuv3/br_retired/                            [Kernel PMU event]
  armv8_pmuv3/br_return_retired/                     [Kernel PMU event]
  armv8_pmuv3/bus_access/                            [Kernel PMU event]
  armv8_pmuv3/bus_cycles/                            [Kernel PMU event]
  armv8_pmuv3/cid_write_retired/                     [Kernel PMU event]
  armv8_pmuv3/cpu_cycles/                            [Kernel PMU event]
  armv8_pmuv3/exc_return/                            [Kernel PMU event]
  armv8_pmuv3/exc_taken/                             [Kernel PMU event]
  armv8_pmuv3/inst_retired/                          [Kernel PMU event]
  armv8_pmuv3/inst_spec/                             [Kernel PMU event]
  armv8_pmuv3/l1d_cache/                             [Kernel PMU event]
  armv8_pmuv3/l1d_cache_refill/                      [Kernel PMU event]
  armv8_pmuv3/l1d_cache_wb/                          [Kernel PMU event]
  armv8_pmuv3/l1d_tlb/                               [Kernel PMU event]
  armv8_pmuv3/l1d_tlb_refill/                        [Kernel PMU event]
  armv8_pmuv3/l1i_cache/                             [Kernel PMU event]
  armv8_pmuv3/l1i_cache_refill/                      [Kernel PMU event]
  armv8_pmuv3/l1i_tlb/                               [Kernel PMU event]
  armv8_pmuv3/l1i_tlb_refill/                        [Kernel PMU event]
  armv8_pmuv3/l2d_cache/                             [Kernel PMU event]
  armv8_pmuv3/l2d_cache_allocate/                    [Kernel PMU event]
  armv8_pmuv3/l2d_cache_refill/                      [Kernel PMU event]
  armv8_pmuv3/l2d_tlb/                               [Kernel PMU event]
  armv8_pmuv3/l2d_tlb_refill/                        [Kernel PMU event]
  armv8_pmuv3/ld_retired/                            [Kernel PMU event]
  armv8_pmuv3/mem_access/                            [Kernel PMU event]
  armv8_pmuv3/memory_error/                          [Kernel PMU event]
  armv8_pmuv3/pc_write_retired/                      [Kernel PMU event]
  armv8_pmuv3/st_retired/                            [Kernel PMU event]
  armv8_pmuv3/stall_backend/                         [Kernel PMU event]
  armv8_pmuv3/stall_frontend/                        [Kernel PMU event]
  armv8_pmuv3/sw_incr/                               [Kernel PMU event]
  armv8_pmuv3/ttbr_write_retired/                    [Kernel PMU event]
  armv8_pmuv3/unaligned_ldst_retired/                [Kernel PMU event]

  rNNN                                               [Raw hardware event descriptor]
  cpu/t1=v1[,t2=v2,t3 ...]/modifier                  [Raw hardware event descriptor]
   (see 'man perf-list' on how to encode it)

  mem:<addr>[/len][:access]                          [Hardware breakpoint]
perf stat runs a command and reports a default set of counters, including hardware events such as cycles and instructions:
root@smarc-rzg2l:~# perf stat ls

 Performance counter stats for 'ls':

              3.94 msec task-clock                #    0.493 CPUs utilized
                 6      context-switches          #    0.002 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                53      page-faults               #    0.013 M/sec
           4556982      cycles                    #    1.156 GHz
            914867      instructions              #    0.20  insn per cycle
            106623      branches                  #   27.043 M/sec
             20024      branch-misses             #   18.78% of all branches

       0.007990161 seconds time elapsed

       0.003118000 seconds user
       0.003188000 seconds sys
Specific events can be selected with the -e option:
root@smarc-rzg2l:~# perf stat -e 'cache-references,cache-misses' ls

 Performance counter stats for 'ls':

            302505      cache-references
             11986      cache-misses              #    3.962 % of all cache refs

       0.008759965 seconds time elapsed

       0.000000000 seconds user
       0.005878000 seconds sys
The armv8_pmuv3 kernel PMU events from perf list can be selected the same way:
root@smarc-rzg2l:~# perf stat -e 'armv8_pmuv3/mem_access/' ls

 Performance counter stats for 'ls':

            354269      armv8_pmuv3/mem_access/

       0.010624637 seconds time elapsed

       0.000000000 seconds user
       0.006220000 seconds sys
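For scripted measurements, perf stat can emit machine-readable output with the -x option, and derived figures such as the cache miss percentage can then be computed from the raw counts. The sketch below assumes the CSV layout in which the first field is the counter value and the third field is the event name. The helper name event_ratio is hypothetical.

```sh
#!/bin/sh
# Sketch: read `perf stat -x,` CSV lines on stdin and print
# 100 * <numerator-event> / <denominator-event> with three decimals.
# Assumption: field 1 is the counter value and field 3 is the event name.
# Usage: event_ratio <numerator-event> <denominator-event>
event_ratio() {
    awk -F, -v num="$1" -v den="$2" '
        $3 == num { n = $1 }
        $3 == den { d = $1 }
        END {
            # Fail if the denominator is missing, zero, or not a number.
            if (d + 0 > 0) printf "%.3f\n", 100 * n / d
            else exit 1
        }'
}
```

Because perf stat writes its statistics to stderr, a possible invocation on the target is `perf stat -x, -e cache-references,cache-misses ls 2>&1 >/dev/null | event_ratio cache-misses cache-references`. With the counts from the run shown earlier (11986 misses out of 302505 references), the result is 3.962, matching the percentage perf itself prints.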