Profiling and Tracing
Interactive debugging using a source-level debugger, as described in the previous chapter, can give you
an insight into the way a program works, but it constrains your view to a small body of code.
In this chapter, I'll begin with the well-known top command as a means of getting an overview. Often
the problem can be localized to a single program, which you can analyze using the Linux profiler, perf.
If the problem is not so localized and you want to get a broader picture, perf can do that as well. To
diagnose problems associated with the kernel, I will describe some trace tools, Ftrace, LTTng, and BPF,
as a means of gathering detailed information.
Generating call graphs relies on the ability to extract call frames from the stack, just as is necessary for backtraces
in GDB. The information needed to unwind stacks is encoded in the debug information of the executables, but not
all combinations of architecture and toolchains are capable of doing so.
perf annotate
Now that you know which functions to look at, it would be nice to step inside and see the code and to
have hit counts for each instruction. That is what perf annotate does,
by calling down to a copy of objdump installed on the target. You just need to use perf annotate
in place of perf report.
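As a rough sketch of that workflow, assuming a perf.data file has already been recorded with perf record, the following helper prints the annotated disassembly for a single symbol. The function name annotate_symbol and the PERF override variable are illustrative only and are not part of perf itself.

```shell
#!/bin/sh
# Hypothetical helper: show per-instruction hit counts for one symbol.
# PERF is an assumption of this sketch; it lets you point at a different
# perf binary and defaults to whatever "perf" is on the PATH.
annotate_symbol() {
    symbol="$1"
    data="${2:-perf.data}"      # sample file written by perf record
    # --stdio prints plain text instead of opening the interactive viewer
    "${PERF:-perf}" annotate --stdio -i "$data" -s "$symbol"
}
```

For example, annotate_symbol main would annotate the function main using the samples in perf.data in the current directory.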
perf annotate requires symbol tables for the executables and vmlinux.
$ arm-buildroot-linux-gnueabi-objdump --dwarf lib/libc-2.19.so | grep DW_AT_comp_dir
<3f> DW_AT_comp_dir : /home/chris/buildroot/output/build/hostgcc-initial-4.8.3/build/arm-buildroot-linux-gnueabi/libgcc
Tracing events
Function tracing involves instrumenting the code with tracepoints that capture information
about the event, and may include some or all of the following:
• A timestamp
• Context, such as the current PID
• Function parameters and return values
• A callstack
Introducing Ftrace
The kernel function tracer Ftrace evolved from work done by Steven Rostedt and many others as they
were tracking down the causes of high scheduling latency in real-time applications.
Ftrace has a very embedded-friendly user interface that is entirely implemented through virtual files
in the debugfs filesystem, meaning that you do not have to install any tools on the target to make it
work. Nevertheless, there are other user interfaces if you prefer: trace-cmd is a command-line tool
that records and views traces and is available in Buildroot (BR2_PACKAGE_TRACE_CMD) and the
Yocto Project (trace-cmd). There is a graphical trace viewer named KernelShark that is available
as a package for the Yocto Project. Like perf, enabling Ftrace requires setting certain kernel
configuration options.
For reasons that will become clear later, you would be well advised to turn on these options as well:
# cat /sys/kernel/debug/tracing/available_tracers
blk function_graph function nop
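If a script needs to know whether a particular tracer was built into the kernel, it can test the contents of available_tracers before going any further. This is a minimal sketch: the TRACING variable and the tracer_available function are names invented for the example.

```shell
#!/bin/sh
# Directory holding the Ftrace control files (variable name is an assumption).
TRACING="${TRACING:-/sys/kernel/debug/tracing}"

# Return success (exit status 0) if the named tracer is listed in
# available_tracers, failure otherwise.
tracer_available() {
    # The file is a single space-separated line, so put one name per line
    # and then look for an exact, whole-line match.
    tr ' ' '\n' < "$TRACING/available_tracers" | grep -qxF -- "$1"
}
```

A typical use would be tracer_available function_graph || echo "function_graph is not available".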
To capture a trace, select the tracer by writing the name of one of the available_tracers to
current_tracer, and then enable tracing for a short while, as shown here:
# cat /sys/kernel/debug/tracing/trace
# tracer: function
#
# entries-in-buffer/entries-written: 40051/40051   #P:1
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
              sh-361   [000] ...1   992.990646: mutex_unlock <-rb_simple_write
              sh-361   [000] ...1   992.990658: fsnotify_parent <-vfs_write
              sh-361   [000] ...1   992.990661: fsnotify <-vfs_write
              sh-361   [000] ...1   992.990663: srcu_read_lock <-fsnotify
              sh-361   [000] ...1   992.990666: preempt_count_add <-srcu_read_lock
              sh-361   [000] ...2   992.990668: preempt_count_sub <-srcu_read_lock
              sh-361   [000] ...1   992.990670: srcu_read_unlock <-fsnotify
              sh-361   [000] ...1   992.990672: sb_end_write <-vfs_write
              sh-361   [000] ...1   992.990674: preempt_count_add <-sb_end_write
[…]
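The capture sequence described before the listing (write a tracer name to current_tracer, switch tracing on briefly, then read trace) can be wrapped in a small shell function. This is a sketch under assumptions: the TRACING variable and the capture_trace name are made up for illustration, and it relies on the tracing_on control file to start and stop recording.

```shell
#!/bin/sh
# Directory holding the Ftrace control files (variable name is an assumption).
TRACING="${TRACING:-/sys/kernel/debug/tracing}"

# Record with the given tracer for a number of seconds, then print the buffer.
capture_trace() {
    tracer="${1:-function}"
    seconds="${2:-1}"
    echo "$tracer" > "$TRACING/current_tracer" || return 1
    echo 1 > "$TRACING/tracing_on"      # start recording
    sleep "$seconds"
    echo 0 > "$TRACING/tracing_on"      # stop, so the buffer is stable while read
    cat "$TRACING/trace"
}
```

Calling capture_trace function 1 > /tmp/trace.txt would record for about one second. Keeping the window short matters because the function tracer fills the buffer very quickly.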
With the function_graph tracer, Ftrace captures call graphs like this:
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 0) + 63.167 us   |              } /* cpdma_ctlr_int_ctrl */
 0) + 73.417 us   |            } /* cpsw_intr_disable */
 0)               |            disable_irq_nosync() {
 0)               |              disable_irq_nosync() {
 0)               |                irq_get_desc_lock() {
 0)   0.541 us    |                  irq_to_desc();
 0)   0.500 us    |                  preempt_count_add();
 0) + 16.000 us   |                }
 0)               |                disable_irq() {
 0)   0.500 us    |                  irq_disable();
 0)   8.208 us    |                }
 0)               |                irq_put_desc_unlock() {
 0)   0.459 us    |                  preempt_count_sub();
 0)   8.000 us    |                }
 0) + 55.625 us   |              }
 0) + 63.375 us   |            }
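In this output, the + marker flags durations above 10 microseconds. To pick out the slow calls from a long function_graph trace with a threshold of your own choosing, a short awk filter is usually enough. The sketch below is one possible approach; the slow_calls name is hypothetical and it assumes the column layout shown above, with the duration to the left of the | separator.

```shell
#!/bin/sh
# Print the calls whose duration is at least the given number of microseconds.
# Reads function_graph output on standard input.
slow_calls() {
    awk -F'|' -v min="$1" '
        /^#/ { next }                       # skip the header lines
        {
            # Left of the separator: CPU, optional marker, duration, "us"
            n = split($1, field, " ")
            if (n >= 3 && field[n] == "us" && field[n - 1] + 0 >= min + 0) {
                call = $2
                sub(/^[ \t]+/, "", call)    # drop the call-depth indentation
                print field[n - 1], "us", call
            }
        }'
}
```

For example, cat trace | slow_calls 50 would list only the entries that took 50 microseconds or longer.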
Dynamic Ftrace and trace filters
Enabling CONFIG_DYNAMIC_FTRACE allows Ftrace to modify the function trace sites at
runtime, which has a couple of benefits. Firstly, it triggers additional build-time processing of the
trace function probes, which allows the Ftrace subsystem to locate them at boot time and
overwrite them with NOP instructions, thus reducing the overhead of the function trace code to
almost nothing.
The second advantage is that you can selectively enable function trace sites rather than tracing
everything.
# cd /sys/kernel/debug/tracing
# echo "tcp*" > set_ftrace_filter
# echo function > current_tracer
# echo 1 > tracing_on
# cat trace
# tracer: function
#
# entries-in-buffer/entries-written: 590/590   #P:1
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
        dropbear-375   [000] ...1 48545.022235: tcp_poll <-sock_poll
        dropbear-375   [000] ...1 48545.022372: tcp_poll <-sock_poll
        dropbear-375   [000] ...1 48545.022393: tcp_sendmsg <-inet_sendmsg
        dropbear-375   [000] ...1 48545.022398: tcp_send_mss <-tcp_sendmsg
        dropbear-375   [000] ...1 48545.022400: tcp_current_mss <-tcp_send_mss
[…]
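Before writing a pattern such as tcp* to set_ftrace_filter, it can be useful to preview which functions it would select. The sketch below reads the kernel's list of traceable functions from available_filter_functions and applies shell pattern matching, which only approximates the kernel's own glob matching. The TRACING variable and the preview_filter name are assumptions made for this example.

```shell
#!/bin/sh
# Directory holding the Ftrace control files (variable name is an assumption).
TRACING="${TRACING:-/sys/kernel/debug/tracing}"

# List the traceable functions that a glob such as "tcp*" would select.
preview_filter() {
    pattern="$1"
    # Each line begins with a function name; functions that belong to a
    # module carry a trailing "[module]" tag, which is discarded here.
    while read -r name _rest; do
        case "$name" in
            $pattern) echo "$name" ;;
        esac
    done < "$TRACING/available_filter_functions"
}
```

For example, preview_filter "tcp*" | wc -l gives a rough idea of how much trace data the filter will let through.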
Trace events
The function and function_graph tracers described in the preceding section record only the
time at which the function was executed. The trace events feature also records parameters associated
with the call, making the trace more readable and informative.
You can see the list of events available at runtime in /sys/kernel/debug/tracing/available_events.
They are named subsystem:function, for example, kmem:kmalloc. Each event is also represented by a
subdirectory in tracing/events/[subsystem]/[function], as demonstrated here:
# ls events/kmem/kmalloc
enable filter format id trigger
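The enable file in each event directory switches that event on or off. As a minimal sketch, the function below maps an event name in subsystem:function form onto its directory and writes to enable. The TRACING variable and the set_event_enabled name are hypothetical.

```shell
#!/bin/sh
# Directory holding the Ftrace control files (variable name is an assumption).
TRACING="${TRACING:-/sys/kernel/debug/tracing}"

# Enable or disable one trace event, named as in available_events.
set_event_enabled() {
    event="$1"                  # for example kmem:kmalloc
    state="${2:-1}"             # 1 to enable, 0 to disable
    subsystem="${event%%:*}"    # text before the colon
    name="${event#*:}"          # text after the colon
    control="$TRACING/events/$subsystem/$name/enable"
    [ -f "$control" ] || { echo "unknown event: $event" >&2; return 1; }
    echo "$state" > "$control"
}
```

For example, set_event_enabled kmem:kmalloc turns the event on and set_event_enabled kmem:kmalloc 0 turns it off again.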