mirror of https://github.com/Qortal/Brooklyn
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
181 lines
8.6 KiB
181 lines
8.6 KiB
############### |
|
Timerlat tracer |
|
############### |
|
|
|
The timerlat tracer aims to help the preemptive kernel developers to |
|
find souces of wakeup latencies of real-time threads. Like cyclictest, |
|
the tracer sets a periodic timer that wakes up a thread. The thread then |
|
computes a *wakeup latency* value as the difference between the *current |
|
time* and the *absolute time* that the timer was set to expire. The main |
|
goal of timerlat is tracing in such a way to help kernel developers. |
|
|
|
Usage |
|
----- |
|
|
|
Write the ASCII text "timerlat" into the current_tracer file of the |
|
tracing system (generally mounted at /sys/kernel/tracing). |
|
|
|
For example:: |
|
|
|
[root@f32 ~]# cd /sys/kernel/tracing/ |
|
[root@f32 tracing]# echo timerlat > current_tracer |
|
|
|
It is possible to follow the trace by reading the trace trace file:: |
|
|
|
[root@f32 tracing]# cat trace |
|
# tracer: timerlat |
|
# |
|
# _-----=> irqs-off |
|
# / _----=> need-resched |
|
# | / _---=> hardirq/softirq |
|
# || / _--=> preempt-depth |
|
# || / |
|
# |||| ACTIVATION |
|
# TASK-PID CPU# |||| TIMESTAMP ID CONTEXT LATENCY |
|
# | | | |||| | | | | |
|
<idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns |
|
<...>-867 [000] .... 54.029339: #1 context thread timer_latency 11700 ns |
|
<idle>-0 [001] dNh1 54.029346: #1 context irq timer_latency 2833 ns |
|
<...>-868 [001] .... 54.029353: #1 context thread timer_latency 9820 ns |
|
<idle>-0 [000] d.h1 54.030328: #2 context irq timer_latency 769 ns |
|
<...>-867 [000] .... 54.030330: #2 context thread timer_latency 3070 ns |
|
<idle>-0 [001] d.h1 54.030344: #2 context irq timer_latency 935 ns |
|
<...>-868 [001] .... 54.030347: #2 context thread timer_latency 4351 ns |
|
|
|
|
|
The tracer creates a per-cpu kernel thread with real-time priority that |
|
prints two lines at every activation. The first is the *timer latency* |
|
observed at the *hardirq* context before the activation of the thread. |
|
The second is the *timer latency* observed by the thread. The ACTIVATION |
|
ID field serves to relate the *irq* execution to its respective *thread* |
|
execution. |
|
|
|
The *irq*/*thread* splitting is important to clarify at which context |
|
the unexpected high value is coming from. The *irq* context can be |
|
delayed by hardware related actions, such as SMIs, NMIs, IRQs |
|
or by a thread masking interrupts. Once the timer happens, the delay |
|
can also be influenced by blocking caused by threads. For example, by |
|
postponing the scheduler execution via preempt_disable(), by the |
|
scheduler execution, or by masking interrupts. Threads can |
|
also be delayed by the interference from other threads and IRQs. |
|
|
|
Tracer options |
|
--------------------- |
|
|
|
The timerlat tracer is built on top of osnoise tracer. |
|
So its configuration is also done in the osnoise/ config |
|
directory. The timerlat configs are: |
|
|
|
- cpus: CPUs at which a timerlat thread will execute. |
|
- timerlat_period_us: the period of the timerlat thread. |
|
- osnoise/stop_tracing_us: stop the system tracing if a |
|
timer latency at the *irq* context higher than the configured |
|
value happens. Writing 0 disables this option. |
|
- stop_tracing_total_us: stop the system tracing if a |
|
timer latency at the *thread* context higher than the configured |
|
value happens. Writing 0 disables this option. |
|
- print_stack: save the stack of the IRQ ocurrence, and print |
|
it afte the *thread context* event". |
|
|
|
timerlat and osnoise |
|
---------------------------- |
|
|
|
The timerlat can also take advantage of the osnoise: traceevents. |
|
For example:: |
|
|
|
[root@f32 ~]# cd /sys/kernel/tracing/ |
|
[root@f32 tracing]# echo timerlat > current_tracer |
|
[root@f32 tracing]# echo 1 > events/osnoise/enable |
|
[root@f32 tracing]# echo 25 > osnoise/stop_tracing_total_us |
|
[root@f32 tracing]# tail -10 trace |
|
cc1-87882 [005] d..h... 548.771078: #402268 context irq timer_latency 13585 ns |
|
cc1-87882 [005] dNLh1.. 548.771082: irq_noise: local_timer:236 start 548.771077442 duration 7597 ns |
|
cc1-87882 [005] dNLh2.. 548.771099: irq_noise: qxl:21 start 548.771085017 duration 7139 ns |
|
cc1-87882 [005] d...3.. 548.771102: thread_noise: cc1:87882 start 548.771078243 duration 9909 ns |
|
timerlat/5-1035 [005] ....... 548.771104: #402268 context thread timer_latency 39960 ns |
|
|
|
In this case, the root cause of the timer latency does not point to a |
|
single cause, but to multiple ones. Firstly, the timer IRQ was delayed |
|
for 13 us, which may point to a long IRQ disabled section (see IRQ |
|
stacktrace section). Then the timer interrupt that wakes up the timerlat |
|
thread took 7597 ns, and the qxl:21 device IRQ took 7139 ns. Finally, |
|
the cc1 thread noise took 9909 ns of time before the context switch. |
|
Such pieces of evidence are useful for the developer to use other |
|
tracing methods to figure out how to debug and optimize the system. |
|
|
|
It is worth mentioning that the *duration* values reported |
|
by the osnoise: events are *net* values. For example, the |
|
thread_noise does not include the duration of the overhead caused |
|
by the IRQ execution (which indeed accounted for 12736 ns). But |
|
the values reported by the timerlat tracer (timerlat_latency) |
|
are *gross* values. |
|
|
|
The art below illustrates a CPU timeline and how the timerlat tracer |
|
observes it at the top and the osnoise: events at the bottom. Each "-" |
|
in the timelines means circa 1 us, and the time moves ==>:: |
|
|
|
External timer irq thread |
|
clock latency latency |
|
event 13585 ns 39960 ns |
|
| ^ ^ |
|
v | | |
|
|-------------| | |
|
|-------------+-------------------------| |
|
^ ^ |
|
======================================================================== |
|
[tmr irq] [dev irq] |
|
[another thread...^ v..^ v.......][timerlat/ thread] <-- CPU timeline |
|
========================================================================= |
|
|-------| |-------| |
|
|--^ v-------| |
|
| | | |
|
| | + thread_noise: 9909 ns |
|
| +-> irq_noise: 6139 ns |
|
+-> irq_noise: 7597 ns |
|
|
|
IRQ stacktrace |
|
--------------------------- |
|
|
|
The osnoise/print_stack option is helpful for the cases in which a thread |
|
noise causes the major factor for the timer latency, because of preempt or |
|
irq disabled. For example:: |
|
|
|
[root@f32 tracing]# echo 500 > osnoise/stop_tracing_total_us |
|
[root@f32 tracing]# echo 500 > osnoise/print_stack |
|
[root@f32 tracing]# echo timerlat > current_tracer |
|
[root@f32 tracing]# tail -21 per_cpu/cpu7/trace |
|
insmod-1026 [007] dN.h1.. 200.201948: irq_noise: local_timer:236 start 200.201939376 duration 7872 ns |
|
insmod-1026 [007] d..h1.. 200.202587: #29800 context irq timer_latency 1616 ns |
|
insmod-1026 [007] dN.h2.. 200.202598: irq_noise: local_timer:236 start 200.202586162 duration 11855 ns |
|
insmod-1026 [007] dN.h3.. 200.202947: irq_noise: local_timer:236 start 200.202939174 duration 7318 ns |
|
insmod-1026 [007] d...3.. 200.203444: thread_noise: insmod:1026 start 200.202586933 duration 838681 ns |
|
timerlat/7-1001 [007] ....... 200.203445: #29800 context thread timer_latency 859978 ns |
|
timerlat/7-1001 [007] ....1.. 200.203446: <stack trace> |
|
=> timerlat_irq |
|
=> __hrtimer_run_queues |
|
=> hrtimer_interrupt |
|
=> __sysvec_apic_timer_interrupt |
|
=> asm_call_irq_on_stack |
|
=> sysvec_apic_timer_interrupt |
|
=> asm_sysvec_apic_timer_interrupt |
|
=> delay_tsc |
|
=> dummy_load_1ms_pd_init |
|
=> do_one_initcall |
|
=> do_init_module |
|
=> __do_sys_finit_module |
|
=> do_syscall_64 |
|
=> entry_SYSCALL_64_after_hwframe |
|
|
|
In this case, it is possible to see that the thread added the highest |
|
contribution to the *timer latency* and the stack trace, saved during |
|
the timerlat IRQ handler, points to a function named |
|
dummy_load_1ms_pd_init, which had the following code (on purpose):: |
|
|
|
static int __init dummy_load_1ms_pd_init(void) |
|
{ |
|
preempt_disable(); |
|
mdelay(1); |
|
preempt_enable(); |
|
return 0; |
|
|
|
}
|
|
|