mirror of https://github.com/Qortal/Brooklyn
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
2115 lines
87 KiB
2115 lines
87 KiB
.. SPDX-License-Identifier: GPL-2.0 |
|
|
|
====================== |
|
Histogram Design Notes |
|
====================== |
|
|
|
:Author: Tom Zanussi <[email protected]> |
|
|
|
This document attempts to provide a description of how the ftrace |
|
histograms work and how the individual pieces map to the data |
|
structures used to implement them in trace_events_hist.c and |
|
tracing_map.c. |
|
|
|
Note: All the ftrace histogram command examples assume the working |
|
directory is the ftrace /tracing directory. For example:: |
|
|
|
# cd /sys/kernel/debug/tracing |
|
|
|
Also, the histogram output displayed for those commands will be |
|
generally be truncated - only enough to make the point is displayed. |
|
|
|
'hist_debug' trace event files |
|
============================== |
|
|
|
If the kernel is compiled with CONFIG_HIST_TRIGGERS_DEBUG set, an |
|
event file named 'hist_debug' will appear in each event's |
|
subdirectory. This file can be read at any time and will display some |
|
of the hist trigger internals described in this document. Specific |
|
examples and output will be described in test cases below. |
|
|
|
Basic histograms |
|
================ |
|
|
|
First, basic histograms. Below is pretty much the simplest thing you |
|
can do with histograms - create one with a single key on a single |
|
event and cat the output:: |
|
|
|
# echo 'hist:keys=pid' >> events/sched/sched_waking/trigger |
|
|
|
# cat events/sched/sched_waking/hist |
|
|
|
{ pid: 18249 } hitcount: 1 |
|
{ pid: 13399 } hitcount: 1 |
|
{ pid: 17973 } hitcount: 1 |
|
{ pid: 12572 } hitcount: 1 |
|
... |
|
{ pid: 10 } hitcount: 921 |
|
{ pid: 18255 } hitcount: 1444 |
|
{ pid: 25526 } hitcount: 2055 |
|
{ pid: 5257 } hitcount: 2055 |
|
{ pid: 27367 } hitcount: 2055 |
|
{ pid: 1728 } hitcount: 2161 |
|
|
|
Totals: |
|
Hits: 21305 |
|
Entries: 183 |
|
Dropped: 0 |
|
|
|
What this does is create a histogram on the sched_waking event using |
|
pid as a key and with a single value, hitcount, which even if not |
|
explicitly specified, exists for every histogram regardless. |
|
|
|
The hitcount value is a per-bucket value that's automatically |
|
incremented on every hit for the given key, which in this case is the |
|
pid. |
|
|
|
So in this histogram, there's a separate bucket for each pid, and each |
|
bucket contains a value for that bucket, counting the number of times |
|
sched_waking was called for that pid. |
|
|
|
Each histogram is represented by a hist_data struct. |
|
|
|
To keep track of each key and value field in the histogram, hist_data |
|
keeps an array of these fields named fields[]. The fields[] array is |
|
an array containing struct hist_field representations of each |
|
histogram val and key in the histogram (variables are also included |
|
here, but are discussed later). So for the above histogram we have one |
|
key and one value; in this case the one value is the hitcount value, |
|
which all histograms have, regardless of whether they define that |
|
value or not, which the above histogram does not. |
|
|
|
Each struct hist_field contains a pointer to the ftrace_event_field |
|
from the event's trace_event_file along with various bits related to |
|
that such as the size, offset, type, and a hist_field_fn_t function, |
|
which is used to grab the field's data from the ftrace event buffer |
|
(in most cases - some hist_fields such as hitcount don't directly map |
|
to an event field in the trace buffer - in these cases the function |
|
implementation gets its value from somewhere else). The flags field |
|
indicates which type of field it is - key, value, variable, variable |
|
reference, etc., with value being the default. |
|
|
|
The other important hist_data data structure in addition to the |
|
fields[] array is the tracing_map instance created for the histogram, |
|
which is held in the .map member. The tracing_map implements the |
|
lock-free hash table used to implement histograms (see |
|
kernel/trace/tracing_map.h for much more discussion about the |
|
low-level data structures implementing the tracing_map). For the |
|
purposes of this discussion, the tracing_map contains a number of |
|
buckets, each bucket corresponding to a particular tracing_map_elt |
|
object hashed by a given histogram key. |
|
|
|
Below is a diagram the first part of which describes the hist_data and |
|
associated key and value fields for the histogram described above. As |
|
you can see, there are two fields in the fields array, one val field |
|
for the hitcount and one key field for the pid key. |
|
|
|
Below that is a diagram of a run-time snapshot of what the tracing_map |
|
might look like for a given run. It attempts to show the |
|
relationships between the hist_data fields and the tracing_map |
|
elements for a couple hypothetical keys and values.:: |
|
|
|
+------------------+ |
|
| hist_data | |
|
+------------------+ +----------------+ |
|
| .fields[] |---->| val = hitcount |----------------------------+ |
|
+----------------+ +----------------+ | |
|
| .map | | .size | | |
|
+----------------+ +--------------+ | |
|
| .offset | | |
|
+--------------+ | |
|
| .fn() | | |
|
+--------------+ | |
|
. | |
|
. | |
|
. | |
|
+----------------+ <--- n_vals | |
|
| key = pid |----------------------------|--+ |
|
+----------------+ | | |
|
| .size | | | |
|
+--------------+ | | |
|
| .offset | | | |
|
+--------------+ | | |
|
| .fn() | | | |
|
+----------------+ <--- n_fields | | |
|
| unused | | | |
|
+----------------+ | | |
|
| | | | |
|
+--------------+ | | |
|
| | | | |
|
+--------------+ | | |
|
| | | | |
|
+--------------+ | | |
|
n_keys = n_fields - n_vals | | |
|
|
|
The hist_data n_vals and n_fields delineate the extent of the fields[] | | |
|
array and separate keys from values for the rest of the code. | | |
|
|
|
Below is a run-time representation of the tracing_map part of the | | |
|
histogram, with pointers from various parts of the fields[] array | | |
|
to corresponding parts of the tracing_map. | | |
|
|
|
The tracing_map consists of an array of tracing_map_entrys and a set | | |
|
of preallocated tracing_map_elts (abbreviated below as map_entry and | | |
|
map_elt). The total number of map_entrys in the hist_data.map array = | | |
|
map->max_elts (actually map->map_size but only max_elts of those are | | |
|
used. This is a property required by the map_insert() algorithm). | | |
|
|
|
If a map_entry is unused, meaning no key has yet hashed into it, its | | |
|
.key value is 0 and its .val pointer is NULL. Once a map_entry has | | |
|
been claimed, the .key value contains the key's hash value and the | | |
|
.val member points to a map_elt containing the full key and an entry | | |
|
for each key or value in the map_elt.fields[] array. There is an | | |
|
entry in the map_elt.fields[] array corresponding to each hist_field | | |
|
in the histogram, and this is where the continually aggregated sums | | |
|
corresponding to each histogram value are kept. | | |
|
|
|
The diagram attempts to show the relationship between the | | |
|
hist_data.fields[] and the map_elt.fields[] with the links drawn | | |
|
between diagrams:: |
|
|
|
+-----------+ | | |
|
| hist_data | | | |
|
+-----------+ | | |
|
| .fields | | | |
|
+---------+ +-----------+ | | |
|
| .map |---->| map_entry | | | |
|
+---------+ +-----------+ | | |
|
| .key |---> 0 | | |
|
+---------+ | | |
|
| .val |---> NULL | | |
|
+-----------+ | | |
|
| map_entry | | | |
|
+-----------+ | | |
|
| .key |---> pid = 999 | | |
|
+---------+ +-----------+ | | |
|
| .val |--->| map_elt | | | |
|
+---------+ +-----------+ | | |
|
. | .key |---> full key * | | |
|
. +---------+ +---------------+ | | |
|
. | .fields |--->| .sum (val) |<-+ | |
|
+-----------+ +---------+ | 2345 | | | |
|
| map_entry | +---------------+ | | |
|
+-----------+ | .offset (key) |<----+ |
|
| .key |---> 0 | 0 | | | |
|
+---------+ +---------------+ | | |
|
| .val |---> NULL . | | |
|
+-----------+ . | | |
|
| map_entry | . | | |
|
+-----------+ +---------------+ | | |
|
| .key | | .sum (val) or | | | |
|
+---------+ +---------+ | .offset (key) | | | |
|
| .val |--->| map_elt | +---------------+ | | |
|
+-----------+ +---------+ | .sum (val) or | | | |
|
| map_entry | | .offset (key) | | | |
|
+-----------+ +---------------+ | | |
|
| .key |---> pid = 4444 | | |
|
+---------+ +-----------+ | | |
|
| .val | | map_elt | | | |
|
+---------+ +-----------+ | | |
|
| .key |---> full key * | | |
|
+---------+ +---------------+ | | |
|
| .fields |--->| .sum (val) |<-+ | |
|
+---------+ | 65523 | | |
|
+---------------+ | |
|
| .offset (key) |<----+ |
|
| 0 | |
|
+---------------+ |
|
. |
|
. |
|
. |
|
+---------------+ |
|
| .sum (val) or | |
|
| .offset (key) | |
|
+---------------+ |
|
| .sum (val) or | |
|
| .offset (key) | |
|
+---------------+ |
|
|
|
Abbreviations used in the diagrams:: |
|
|
|
hist_data = struct hist_trigger_data |
|
hist_data.fields = struct hist_field |
|
fn = hist_field_fn_t |
|
map_entry = struct tracing_map_entry |
|
map_elt = struct tracing_map_elt |
|
map_elt.fields = struct tracing_map_field |
|
|
|
Whenever a new event occurs and it has a hist trigger associated with |
|
it, event_hist_trigger() is called. event_hist_trigger() first deals |
|
with the key: for each subkey in the key (in the above example, there |
|
is just one subkey corresponding to pid), the hist_field that |
|
represents that subkey is retrieved from hist_data.fields[] and the |
|
hist_field_fn_t fn() associated with that field, along with the |
|
field's size and offset, is used to grab that subkey's data from the |
|
current trace record. |
|
|
|
Once the complete key has been retrieved, it's used to look that key |
|
up in the tracing_map. If there's no tracing_map_elt associated with |
|
that key, an empty one is claimed and inserted in the map for the new |
|
key. In either case, the tracing_map_elt associated with that key is |
|
returned. |
|
|
|
Once a tracing_map_elt available, hist_trigger_elt_update() is called. |
|
As the name implies, this updates the element, which basically means |
|
updating the element's fields. There's a tracing_map_field associated |
|
with each key and value in the histogram, and each of these correspond |
|
to the key and value hist_fields created when the histogram was |
|
created. hist_trigger_elt_update() goes through each value hist_field |
|
and, as for the keys, uses the hist_field's fn() and size and offset |
|
to grab the field's value from the current trace record. Once it has |
|
that value, it simply adds that value to that field's |
|
continually-updated tracing_map_field.sum member. Some hist_field |
|
fn()s, such as for the hitcount, don't actually grab anything from the |
|
trace record (the hitcount fn() just increments the counter sum by 1), |
|
but the idea is the same. |
|
|
|
Once all the values have been updated, hist_trigger_elt_update() is |
|
done and returns. Note that there are also tracing_map_fields for |
|
each subkey in the key, but hist_trigger_elt_update() doesn't look at |
|
them or update anything - those exist only for sorting, which can |
|
happen later. |
|
|
|
Basic histogram test |
|
-------------------- |
|
|
|
This is a good example to try. It produces 3 value fields and 2 key |
|
fields in the output:: |
|
|
|
# echo 'hist:keys=common_pid,call_site.sym:values=bytes_req,bytes_alloc,hitcount' >> events/kmem/kmalloc/trigger |
|
|
|
To see the debug data, cat the kmem/kmalloc's 'hist_debug' file. It |
|
will show the trigger info of the histogram it corresponds to, along |
|
with the address of the hist_data associated with the histogram, which |
|
will become useful in later examples. It then displays the number of |
|
total hist_fields associated with the histogram along with a count of |
|
how many of those correspond to keys and how many correspond to values. |
|
|
|
It then goes on to display details for each field, including the |
|
field's flags and the position of each field in the hist_data's |
|
fields[] array, which is useful information for verifying that things |
|
internally appear correct or not, and which again will become even |
|
more useful in further examples:: |
|
|
|
# cat events/kmem/kmalloc/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=common_pid,call_site.sym:vals=hitcount,bytes_req,bytes_alloc:sort=hitcount:size=2048 [active] |
|
# |
|
|
|
hist_data: 000000005e48c9a5 |
|
|
|
n_vals: 3 |
|
n_keys: 2 |
|
n_fields: 5 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
VAL: normal u64 value |
|
ftrace_event_field name: bytes_req |
|
type: size_t |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
VAL: normal u64 value |
|
ftrace_event_field name: bytes_alloc |
|
type: size_t |
|
size: 8 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[3]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: common_pid |
|
type: int |
|
size: 8 |
|
is_signed: 1 |
|
|
|
hist_data->fields[4]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: call_site |
|
type: unsigned long |
|
size: 8 |
|
is_signed: 0 |
|
|
|
The commands below can be used to clean things up for the next test:: |
|
|
|
# echo '!hist:keys=common_pid,call_site.sym:values=bytes_req,bytes_alloc,hitcount' >> events/kmem/kmalloc/trigger |
|
|
|
Variables |
|
========= |
|
|
|
Variables allow data from one hist trigger to be saved by one hist |
|
trigger and retrieved by another hist trigger. For example, a trigger |
|
on the sched_waking event can capture a timestamp for a particular |
|
pid, and later a sched_switch event that switches to that pid event |
|
can grab the timestamp and use it to calculate a time delta between |
|
the two events:: |
|
|
|
# echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> |
|
events/sched/sched_waking/trigger |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0' >> |
|
events/sched/sched_switch/trigger |
|
|
|
In terms of the histogram data structures, variables are implemented |
|
as another type of hist_field and for a given hist trigger are added |
|
to the hist_data.fields[] array just after all the val fields. To |
|
distinguish them from the existing key and val fields, they're given a |
|
new flag type, HIST_FIELD_FL_VAR (abbreviated FL_VAR) and they also |
|
make use of a new .var.idx field member in struct hist_field, which |
|
maps them to an index in a new map_elt.vars[] array added to the |
|
map_elt specifically designed to store and retrieve variable values. |
|
The diagram below shows those new elements and adds a new variable |
|
entry, ts0, corresponding to the ts0 variable in the sched_waking |
|
trigger above. |
|
|
|
sched_waking histogram |
|
----------------------:: |
|
|
|
+------------------+ |
|
| hist_data |<-------------------------------------------------------+ |
|
+------------------+ +-------------------+ | |
|
| .fields[] |-->| val = hitcount | | |
|
+----------------+ +-------------------+ | |
|
| .map | | .size | | |
|
+----------------+ +-----------------+ | |
|
| .offset | | |
|
+-----------------+ | |
|
| .fn() | | |
|
+-----------------+ | |
|
| .flags | | |
|
+-----------------+ | |
|
| .var.idx | | |
|
+-------------------+ | |
|
| var = ts0 | | |
|
+-------------------+ | |
|
| .size | | |
|
+-----------------+ | |
|
| .offset | | |
|
+-----------------+ | |
|
| .fn() | | |
|
+-----------------+ | |
|
| .flags & FL_VAR | | |
|
+-----------------+ | |
|
| .var.idx |----------------------------+-+ | |
|
+-----------------+ | | | |
|
. | | | |
|
. | | | |
|
. | | | |
|
+-------------------+ <--- n_vals | | | |
|
| key = pid | | | | |
|
+-------------------+ | | | |
|
| .size | | | | |
|
+-----------------+ | | | |
|
| .offset | | | | |
|
+-----------------+ | | | |
|
| .fn() | | | | |
|
+-----------------+ | | | |
|
| .flags & FL_KEY | | | | |
|
+-----------------+ | | | |
|
| .var.idx | | | | |
|
+-------------------+ <--- n_fields | | | |
|
| unused | | | | |
|
+-------------------+ | | | |
|
| | | | | |
|
+-----------------+ | | | |
|
| | | | | |
|
+-----------------+ | | | |
|
| | | | | |
|
+-----------------+ | | | |
|
| | | | | |
|
+-----------------+ | | | |
|
| | | | | |
|
+-----------------+ | | | |
|
n_keys = n_fields - n_vals | | | |
|
| | | |
|
|
|
This is very similar to the basic case. In the above diagram, we can | | | |
|
see a new .flags member has been added to the struct hist_field | | | |
|
struct, and a new entry added to hist_data.fields representing the ts0 | | | |
|
variable. For a normal val hist_field, .flags is just 0 (modulo | | | |
|
modifier flags), but if the value is defined as a variable, the .flags | | | |
|
contains a set FL_VAR bit. | | | |
|
|
|
As you can see, the ts0 entry's .var.idx member contains the index | | | |
|
into the tracing_map_elts' .vars[] array containing variable values. | | | |
|
This idx is used whenever the value of the variable is set or read. | | | |
|
The map_elt.vars idx assigned to the given variable is assigned and | | | |
|
saved in .var.idx by create_tracing_map_fields() after it calls | | | |
|
tracing_map_add_var(). | | | |
|
|
|
Below is a representation of the histogram at run-time, which | | | |
|
populates the map, along with correspondence to the above hist_data and | | | |
|
hist_field data structures. | | | |
|
|
|
The diagram attempts to show the relationship between the | | | |
|
hist_data.fields[] and the map_elt.fields[] and map_elt.vars[] with | | | |
|
the links drawn between diagrams. For each of the map_elts, you can | | | |
|
see that the .fields[] members point to the .sum or .offset of a key | | | |
|
or val and the .vars[] members point to the value of a variable. The | | | |
|
arrows between the two diagrams show the linkages between those | | | |
|
tracing_map members and the field definitions in the corresponding | | | |
|
hist_data fields[] members.:: |
|
|
|
+-----------+ | | | |
|
| hist_data | | | | |
|
+-----------+ | | | |
|
| .fields | | | | |
|
+---------+ +-----------+ | | | |
|
| .map |---->| map_entry | | | | |
|
+---------+ +-----------+ | | | |
|
| .key |---> 0 | | | |
|
+---------+ | | | |
|
| .val |---> NULL | | | |
|
+-----------+ | | | |
|
| map_entry | | | | |
|
+-----------+ | | | |
|
| .key |---> pid = 999 | | | |
|
+---------+ +-----------+ | | | |
|
| .val |--->| map_elt | | | | |
|
+---------+ +-----------+ | | | |
|
. | .key |---> full key * | | | |
|
. +---------+ +---------------+ | | | |
|
. | .fields |--->| .sum (val) | | | | |
|
. +---------+ | 2345 | | | | |
|
. +--| .vars | +---------------+ | | | |
|
. | +---------+ | .offset (key) | | | | |
|
. | | 0 | | | | |
|
. | +---------------+ | | | |
|
. | . | | | |
|
. | . | | | |
|
. | . | | | |
|
. | +---------------+ | | | |
|
. | | .sum (val) or | | | | |
|
. | | .offset (key) | | | | |
|
. | +---------------+ | | | |
|
. | | .sum (val) or | | | | |
|
. | | .offset (key) | | | | |
|
. | +---------------+ | | | |
|
. | | | | |
|
. +---------------->+---------------+ | | | |
|
. | ts0 |<--+ | | |
|
. | 113345679876 | | | | |
|
. +---------------+ | | | |
|
. | unused | | | | |
|
. | | | | | |
|
. +---------------+ | | | |
|
. . | | | |
|
. . | | | |
|
. . | | | |
|
. +---------------+ | | | |
|
. | unused | | | | |
|
. | | | | | |
|
. +---------------+ | | | |
|
. | unused | | | | |
|
. | | | | | |
|
. +---------------+ | | | |
|
. | | | |
|
+-----------+ | | | |
|
| map_entry | | | | |
|
+-----------+ | | | |
|
| .key |---> pid = 4444 | | | |
|
+---------+ +-----------+ | | | |
|
| .val |--->| map_elt | | | | |
|
+---------+ +-----------+ | | | |
|
. | .key |---> full key * | | | |
|
. +---------+ +---------------+ | | | |
|
. | .fields |--->| .sum (val) | | | | |
|
+---------+ | 2345 | | | | |
|
+--| .vars | +---------------+ | | | |
|
| +---------+ | .offset (key) | | | | |
|
| | 0 | | | | |
|
| +---------------+ | | | |
|
| . | | | |
|
| . | | | |
|
| . | | | |
|
| +---------------+ | | | |
|
| | .sum (val) or | | | | |
|
| | .offset (key) | | | | |
|
| +---------------+ | | | |
|
| | .sum (val) or | | | | |
|
| | .offset (key) | | | | |
|
| +---------------+ | | | |
|
| | | | |
|
| +---------------+ | | | |
|
+---------------->| ts0 |<--+ | | |
|
| 213499240729 | | | |
|
+---------------+ | | |
|
| unused | | | |
|
| | | | |
|
+---------------+ | | |
|
. | | |
|
. | | |
|
. | | |
|
+---------------+ | | |
|
| unused | | | |
|
| | | | |
|
+---------------+ | | |
|
| unused | | | |
|
| | | | |
|
+---------------+ | | |
|
|
|
For each used map entry, there's a map_elt pointing to an array of | | |
|
.vars containing the current value of the variables associated with | | |
|
that histogram entry. So in the above, the timestamp associated with | | |
|
pid 999 is 113345679876, and the timestamp variable in the same | | |
|
.var.idx for pid 4444 is 213499240729. | | |
|
|
|
sched_switch histogram | | |
|
---------------------- | | |
|
|
|
The sched_switch histogram paired with the above sched_waking | | |
|
histogram is shown below. The most important aspect of the | | |
|
sched_switch histogram is that it references a variable on the | | |
|
sched_waking histogram above. | | |
|
|
|
The histogram diagram is very similar to the others so far displayed, | | |
|
but it adds variable references. You can see the normal hitcount and | | |
|
key fields along with a new wakeup_lat variable implemented in the | | |
|
same way as the sched_waking ts0 variable, but in addition there's an | | |
|
entry with the new FL_VAR_REF (short for HIST_FIELD_FL_VAR_REF) flag. | | |
|
|
|
Associated with the new var ref field are a couple of new hist_field | | |
|
members, var.hist_data and var_ref_idx. For a variable reference, the | | |
|
var.hist_data goes with the var.idx, which together uniquely identify | | |
|
a particular variable on a particular histogram. The var_ref_idx is | | |
|
just the index into the var_ref_vals[] array that caches the values of | | |
|
each variable whenever a hist trigger is updated. Those resulting | | |
|
values are then finally accessed by other code such as trace action | | |
|
code that uses the var_ref_idx values to assign param values. | | |
|
|
|
The diagram below describes the situation for the sched_switch | | |
|
histogram referred to before:: |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0' >> | | |
|
events/sched/sched_switch/trigger | | |
|
| | |
|
+------------------+ | | |
|
| hist_data | | | |
|
+------------------+ +-----------------------+ | | |
|
| .fields[] |-->| val = hitcount | | | |
|
+----------------+ +-----------------------+ | | |
|
| .map | | .size | | | |
|
+----------------+ +---------------------+ | | |
|
+--| .var_refs[] | | .offset | | | |
|
| +----------------+ +---------------------+ | | |
|
| | .fn() | | | |
|
| var_ref_vals[] +---------------------+ | | |
|
| +-------------+ | .flags | | | |
|
| | $ts0 |<---+ +---------------------+ | | |
|
| +-------------+ | | .var.idx | | | |
|
| | | | +---------------------+ | | |
|
| +-------------+ | | .var.hist_data | | | |
|
| | | | +---------------------+ | | |
|
| +-------------+ | | .var_ref_idx | | | |
|
| | | | +-----------------------+ | | |
|
| +-------------+ | | var = wakeup_lat | | | |
|
| . | +-----------------------+ | | |
|
| . | | .size | | | |
|
| . | +---------------------+ | | |
|
| +-------------+ | | .offset | | | |
|
| | | | +---------------------+ | | |
|
| +-------------+ | | .fn() | | | |
|
| | | | +---------------------+ | | |
|
| +-------------+ | | .flags & FL_VAR | | | |
|
| | +---------------------+ | | |
|
| | | .var.idx | | | |
|
| | +---------------------+ | | |
|
| | | .var.hist_data | | | |
|
| | +---------------------+ | | |
|
| | | .var_ref_idx | | | |
|
| | +---------------------+ | | |
|
| | . | | |
|
| | . | | |
|
| | . | | |
|
| | +-----------------------+ <--- n_vals | | |
|
| | | key = pid | | | |
|
| | +-----------------------+ | | |
|
| | | .size | | | |
|
| | +---------------------+ | | |
|
| | | .offset | | | |
|
| | +---------------------+ | | |
|
| | | .fn() | | | |
|
| | +---------------------+ | | |
|
| | | .flags | | | |
|
| | +---------------------+ | | |
|
| | | .var.idx | | | |
|
| | +-----------------------+ <--- n_fields | | |
|
| | | unused | | | |
|
| | +-----------------------+ | | |
|
| | | | | | |
|
| | +---------------------+ | | |
|
| | | | | | |
|
| | +---------------------+ | | |
|
| | | | | | |
|
| | +---------------------+ | | |
|
| | | | | | |
|
| | +---------------------+ | | |
|
| | | | | | |
|
| | +---------------------+ | | |
|
| | n_keys = n_fields - n_vals | | |
|
| | | | |
|
| | | | |
|
| | +-----------------------+ | | |
|
+---------------------->| var_ref = $ts0 | | | |
|
| +-----------------------+ | | |
|
| | .size | | | |
|
| +---------------------+ | | |
|
| | .offset | | | |
|
| +---------------------+ | | |
|
| | .fn() | | | |
|
| +---------------------+ | | |
|
| | .flags & FL_VAR_REF | | | |
|
| +---------------------+ | | |
|
| | .var.idx |--------------------------+ | |
|
| +---------------------+ | |
|
| | .var.hist_data |----------------------------+ |
|
| +---------------------+ |
|
+---| .var_ref_idx | |
|
+---------------------+ |
|
|
|
Abbreviations used in the diagrams:: |
|
|
|
hist_data = struct hist_trigger_data |
|
hist_data.fields = struct hist_field |
|
fn = hist_field_fn_t |
|
FL_KEY = HIST_FIELD_FL_KEY |
|
FL_VAR = HIST_FIELD_FL_VAR |
|
FL_VAR_REF = HIST_FIELD_FL_VAR_REF |
|
|
|
When a hist trigger makes use of a variable, a new hist_field is |
|
created with flag HIST_FIELD_FL_VAR_REF. For a VAR_REF field, the |
|
var.idx and var.hist_data take the same values as the referenced |
|
variable, as well as the referenced variable's size, type, and |
|
is_signed values. The VAR_REF field's .name is set to the name of the |
|
variable it references. If a variable reference was created using the |
|
explicit system.event.$var_ref notation, the hist_field's system and |
|
event_name variables are also set. |
|
|
|
So, in order to handle an event for the sched_switch histogram, |
|
because we have a reference to a variable on another histogram, we |
|
need to resolve all variable references first. This is done via the |
|
resolve_var_refs() calls made from event_hist_trigger(). What this |
|
does is grabs the var_refs[] array from the hist_data representing the |
|
sched_switch histogram. For each one of those, the referenced |
|
variable's var.hist_data along with the current key is used to look up |
|
the corresponding tracing_map_elt in that histogram. Once found, the |
|
referenced variable's var.idx is used to look up the variable's value |
|
using tracing_map_read_var(elt, var.idx), which yields the value of |
|
the variable for that element, ts0 in the case above. Note that both |
|
the hist_fields representing both the variable and the variable |
|
reference have the same var.idx, so this is straightforward. |
|
|
|
Variable and variable reference test |
|
------------------------------------ |
|
|
|
This example creates a variable on the sched_waking event, ts0, and |
|
uses it in the sched_switch trigger. The sched_switch trigger also |
|
creates its own variable, wakeup_lat, but nothing yet uses it:: |
|
|
|
# echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0' >> events/sched/sched_switch/trigger |
|
|
|
Looking at the sched_waking 'hist_debug' output, in addition to the |
|
normal key and value hist_fields, in the val fields section we see a |
|
field with the HIST_FIELD_FL_VAR flag, which indicates that that field |
|
represents a variable. Note that in addition to the variable name, |
|
contained in the var.name field, it includes the var.idx, which is the |
|
index into the tracing_map_elt.vars[] array of the actual variable |
|
location. Note also that the output shows that variables live in the |
|
same part of the hist_data->fields[] array as normal values:: |
|
|
|
# cat events/sched/sched_waking/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=pid:vals=hitcount:ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active] |
|
# |
|
|
|
hist_data: 000000009536f554 |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
Moving on to the sched_switch trigger hist_debug output, in addition |
|
to the unused wakeup_lat variable, we see a new section displaying |
|
variable references. Variable references are displayed in a separate |
|
section because in addition to being logically separate from |
|
variables and values, they actually live in a separate hist_data |
|
array, var_refs[]. |
|
|
|
In this example, the sched_switch trigger has a reference to a |
|
variable on the sched_waking trigger, $ts0. Looking at the details, |
|
we can see that the var.hist_data value of the referenced variable |
|
matches the previously displayed sched_waking trigger, and the var.idx |
|
value matches the previously displayed var.idx value for that |
|
variable. Also displayed is the var_ref_idx value for that variable |
|
reference, which is where the value for that variable is cached for |
|
use when the trigger is invoked:: |
|
|
|
# cat events/sched/sched_switch/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=next_pid:vals=hitcount:wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global [active] |
|
# |
|
|
|
hist_data: 00000000f4ee8006 |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: next_pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
variable reference fields: |
|
|
|
hist_data->var_refs[0]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 000000009536f554 |
|
var_ref_idx (into hist_data->var_refs[]): 0 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
The commands below can be used to clean things up for the next test:: |
|
|
|
# echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0' >> events/sched/sched_switch/trigger |
|
|
|
# echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
Actions and Handlers |
|
==================== |
|
|
|
Adding onto the previous example, we will now do something with that |
|
wakeup_lat variable, namely send it and another field as a synthetic |
|
event. |
|
|
|
The onmatch() action below basically says that whenever we have a |
|
sched_switch event, if we have a matching sched_waking event, in this |
|
case if we have a pid in the sched_waking histogram that matches the |
|
next_pid field on this sched_switch event, we retrieve the |
|
variables specified in the wakeup_latency() trace action, and use |
|
them to generate a new wakeup_latency event into the trace stream. |
|
|
|
Note that the way the trace handlers such as wakeup_latency() (which |
|
could equivalently be written trace(wakeup_latency,$wakeup_lat,next_pid) |
|
are implemented, the parameters specified to the trace handler must be |
|
variables. In this case, $wakeup_lat is obviously a variable, but |
|
next_pid isn't, since it's just naming a field in the sched_switch |
|
trace event. Since this is something that almost every trace() and |
|
save() action does, a special shortcut is implemented to allow field |
|
names to be used directly in those cases. How it works is that under |
|
the covers, a temporary variable is created for the named field, and |
|
this variable is what is actually passed to the trace handler. In the |
|
code and documentation, this type of variable is called a 'field |
|
variable'. |
|
|
|
Fields on other trace event's histograms can be used as well. In that |
|
case we have to generate a new histogram and an unfortunately named |
|
'synthetic_field' (the use of synthetic here has nothing to do with |
|
synthetic events) and use that special histogram field as a variable. |
|
|
|
The diagram below illustrates the new elements described above in the |
|
context of the sched_switch histogram using the onmatch() handler and |
|
the trace() action. |
|
|
|
First, we define the wakeup_latency synthetic event:: |
|
|
|
# echo 'wakeup_latency u64 lat; pid_t pid' >> synthetic_events |
|
|
|
Next, the sched_waking hist trigger as before:: |
|
|
|
# echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> |
|
events/sched/sched_waking/trigger |
|
|
|
Finally, we create a hist trigger on the sched_switch event that |
|
generates a wakeup_latency() trace event. In this case we pass |
|
next_pid into the wakeup_latency synthetic event invocation, which |
|
means it will be automatically converted into a field variable:: |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ |
|
onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid)' >> |
|
/sys/kernel/debug/tracing/events/sched/sched_switch/trigger |
|
|
|
The diagram for the sched_switch event is similar to previous examples |
|
but shows the additional field_vars[] array for hist_data and shows |
|
the linkages between the field_vars and the variables and references |
|
created to implement the field variables. The details are discussed |
|
below:: |
|
|
|
+------------------+ |
|
| hist_data | |
|
+------------------+ +-----------------------+ |
|
| .fields[] |-->| val = hitcount | |
|
+----------------+ +-----------------------+ |
|
| .map | | .size | |
|
+----------------+ +---------------------+ |
|
+---| .field_vars[] | | .offset | |
|
| +----------------+ +---------------------+ |
|
|+--| .var_refs[] | | .offset | |
|
|| +----------------+ +---------------------+ |
|
|| | .fn() | |
|
|| var_ref_vals[] +---------------------+ |
|
|| +-------------+ | .flags | |
|
|| | $ts0 |<---+ +---------------------+ |
|
|| +-------------+ | | .var.idx | |
|
|| | $next_pid |<-+ | +---------------------+ |
|
|| +-------------+ | | | .var.hist_data | |
|
||+>| $wakeup_lat | | | +---------------------+ |
|
||| +-------------+ | | | .var_ref_idx | |
|
||| | | | | +-----------------------+ |
|
||| +-------------+ | | | var = wakeup_lat | |
|
||| . | | +-----------------------+ |
|
||| . | | | .size | |
|
||| . | | +---------------------+ |
|
||| +-------------+ | | | .offset | |
|
||| | | | | +---------------------+ |
|
||| +-------------+ | | | .fn() | |
|
||| | | | | +---------------------+ |
|
||| +-------------+ | | | .flags & FL_VAR | |
|
||| | | +---------------------+ |
|
||| | | | .var.idx | |
|
||| | | +---------------------+ |
|
||| | | | .var.hist_data | |
|
||| | | +---------------------+ |
|
||| | | | .var_ref_idx | |
|
||| | | +---------------------+ |
|
||| | | . |
|
||| | | . |
|
||| | | . |
|
||| | | . |
|
||| +--------------+ | | . |
|
+-->| field_var | | | . |
|
|| +--------------+ | | . |
|
|| | var | | | . |
|
|| +------------+ | | . |
|
|| | val | | | . |
|
|| +--------------+ | | . |
|
|| | field_var | | | . |
|
|| +--------------+ | | . |
|
|| | var | | | . |
|
|| +------------+ | | . |
|
|| | val | | | . |
|
|| +------------+ | | . |
|
|| . | | . |
|
|| . | | . |
|
|| . | | +-----------------------+ <--- n_vals |
|
|| +--------------+ | | | key = pid | |
|
|| | field_var | | | +-----------------------+ |
|
|| +--------------+ | | | .size | |
|
|| | var |--+| +---------------------+ |
|
|| +------------+ ||| | .offset | |
|
|| | val |-+|| +---------------------+ |
|
|| +------------+ ||| | .fn() | |
|
|| ||| +---------------------+ |
|
|| ||| | .flags | |
|
|| ||| +---------------------+ |
|
|| ||| | .var.idx | |
|
|| ||| +---------------------+ <--- n_fields |
|
|| ||| |
|
|| ||| n_keys = n_fields - n_vals |
|
|| ||| +-----------------------+ |
|
|| |+->| var = next_pid | |
|
|| | | +-----------------------+ |
|
|| | | | .size | |
|
|| | | +---------------------+ |
|
|| | | | .offset | |
|
|| | | +---------------------+ |
|
|| | | | .flags & FL_VAR | |
|
|| | | +---------------------+ |
|
|| | | | .var.idx | |
|
|| | | +---------------------+ |
|
|| | | | .var.hist_data | |
|
|| | | +-----------------------+ |
|
|| +-->| val for next_pid | |
|
|| | | +-----------------------+ |
|
|| | | | .size | |
|
|| | | +---------------------+ |
|
|| | | | .offset | |
|
|| | | +---------------------+ |
|
|| | | | .fn() | |
|
|| | | +---------------------+ |
|
|| | | | .flags | |
|
|| | | +---------------------+ |
|
|| | | | | |
|
|| | | +---------------------+ |
|
|| | | |
|
|| | | |
|
|| | | +-----------------------+ |
|
+|------------------|-|>| var_ref = $ts0 | |
|
| | | +-----------------------+ |
|
| | | | .size | |
|
| | | +---------------------+ |
|
| | | | .offset | |
|
| | | +---------------------+ |
|
| | | | .fn() | |
|
| | | +---------------------+ |
|
| | | | .flags & FL_VAR_REF | |
|
| | | +---------------------+ |
|
| | +---| .var_ref_idx | |
|
| | +-----------------------+ |
|
| | | var_ref = $next_pid | |
|
| | +-----------------------+ |
|
| | | .size | |
|
| | +---------------------+ |
|
| | | .offset | |
|
| | +---------------------+ |
|
| | | .fn() | |
|
| | +---------------------+ |
|
| | | .flags & FL_VAR_REF | |
|
| | +---------------------+ |
|
| +-----| .var_ref_idx | |
|
| +-----------------------+ |
|
| | var_ref = $wakeup_lat | |
|
| +-----------------------+ |
|
| | .size | |
|
| +---------------------+ |
|
| | .offset | |
|
| +---------------------+ |
|
| | .fn() | |
|
| +---------------------+ |
|
| | .flags & FL_VAR_REF | |
|
| +---------------------+ |
|
+------------------------| .var_ref_idx | |
|
+---------------------+ |
|
|
|
As you can see, for a field variable, two hist_fields are created: one |
|
representing the variable, in this case next_pid, and one to actually |
|
get the value of the field from the trace stream, like a normal val |
|
field does. These are created separately from normal variable |
|
creation and are saved in the hist_data->field_vars[] array. See |
|
below for how these are used. In addition, a reference hist_field is |
|
also created, which is needed to reference the field variables such as |
|
$next_pid variable in the trace() action. |
|
|
|
Note that $wakeup_lat is also a variable reference, referencing the |
|
value of the expression common_timestamp-$ts0, and so also needs to |
|
have a hist field entry representing that reference created. |
|
|
|
When hist_trigger_elt_update() is called to get the normal key and |
|
value fields, it also calls update_field_vars(), which goes through |
|
each field_var created for the histogram, and available from |
|
hist_data->field_vars and calls val->fn() to get the data from the |
|
current trace record, and then uses the var's var.idx to set the |
|
variable at the var.idx offset in the appropriate tracing_map_elt's |
|
variable at elt->vars[var.idx]. |
|
|
|
Once all the variables have been updated, resolve_var_refs() can be |
|
called from event_hist_trigger(), and not only can our $ts0 and |
|
$next_pid references be resolved but the $wakeup_lat reference as |
|
well. At this point, the trace() action can simply access the values |
|
assembled in the var_ref_vals[] array and generate the trace event. |
|
|
|
The same process occurs for the field variables associated with the |
|
save() action. |
|
|
|
Abbreviations used in the diagram:: |
|
|
|
hist_data = struct hist_trigger_data |
|
hist_data.fields = struct hist_field |
|
field_var = struct field_var |
|
fn = hist_field_fn_t |
|
FL_KEY = HIST_FIELD_FL_KEY |
|
FL_VAR = HIST_FIELD_FL_VAR |
|
FL_VAR_REF = HIST_FIELD_FL_VAR_REF |
|
|
|
trace() action field variable test |
|
---------------------------------- |
|
|
|
This example adds to the previous test example by finally making use |
|
of the wakeup_lat variable, but in addition also creates a couple of |
|
field variables that then are all passed to the wakeup_latency() trace |
|
action via the onmatch() handler. |
|
|
|
First, we create the wakeup_latency synthetic event:: |
|
|
|
# echo 'wakeup_latency u64 lat; pid_t pid; char comm[16]' >> synthetic_events |
|
|
|
Next, the sched_waking trigger from previous examples:: |
|
|
|
# echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
Finally, as in the previous test example, we calculate and assign the |
|
wakeup latency using the $ts0 reference from the sched_waking trigger |
|
to the wakeup_lat variable, and finally use it along with a couple |
|
sched_switch event fields, next_pid and next_comm, to generate a |
|
wakeup_latency trace event. The next_pid and next_comm event fields |
|
are automatically converted into field variables for this purpose:: |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm)' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger |
|
|
|
The sched_waking hist_debug output shows the same data as in the |
|
previous test example:: |
|
|
|
# cat events/sched/sched_waking/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=pid:vals=hitcount:ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active] |
|
# |
|
|
|
hist_data: 00000000d60ff61f |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
The sched_switch hist_debug output shows the same key and value fields |
|
as in the previous test example - note that wakeup_lat is still in the |
|
val fields section, but that the new field variables are not there - |
|
although the field variables are variables, they're held separately in |
|
the hist_data's field_vars[] array. Although the field variables and |
|
the normal variables are located in separate places, you can see that |
|
the actual variable locations for those variables in the |
|
tracing_map_elt.vars[] do have increasing indices as expected: |
|
wakeup_lat takes the var.idx = 0 slot, while the field variables for |
|
next_pid and next_comm have values var.idx = 1, and var.idx = 2. Note |
|
also that those are the same values displayed for the variable |
|
references corresponding to those variables in the variable reference |
|
fields section. Since there are two triggers and thus two hist_data |
|
addresses, those addresses also need to be accounted for when doing |
|
the matching - you can see that the first variable refers to the 0 |
|
var.idx on the previous hist trigger (see the hist_data address |
|
associated with that trigger), while the second variable refers to the |
|
0 var.idx on the sched_switch hist trigger, as do all the remaining |
|
variable references. |
|
|
|
Finally, the action tracking variables section just shows the system |
|
and event name for the onmatch() handler:: |
|
|
|
# cat events/sched/sched_switch/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=next_pid:vals=hitcount:wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm) [active] |
|
# |
|
|
|
hist_data: 0000000008f551b7 |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: next_pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
variable reference fields: |
|
|
|
hist_data->var_refs[0]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 00000000d60ff61f |
|
var_ref_idx (into hist_data->var_refs[]): 0 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 0000000008f551b7 |
|
var_ref_idx (into hist_data->var_refs[]): 1 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[2]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: next_pid |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
var.hist_data: 0000000008f551b7 |
|
var_ref_idx (into hist_data->var_refs[]): 2 |
|
type: pid_t |
|
size: 4 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[3]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: next_comm |
|
var.idx (into tracing_map_elt.vars[]): 2 |
|
var.hist_data: 0000000008f551b7 |
|
var_ref_idx (into hist_data->var_refs[]): 3 |
|
type: char[16] |
|
size: 256 |
|
is_signed: 0 |
|
|
|
field variables: |
|
|
|
hist_data->field_vars[0]: |
|
|
|
field_vars[0].var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: next_pid |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
|
|
field_vars[0].val: |
|
ftrace_event_field name: next_pid |
|
type: pid_t |
|
size: 4 |
|
is_signed: 1 |
|
|
|
hist_data->field_vars[1]: |
|
|
|
field_vars[1].var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: next_comm |
|
var.idx (into tracing_map_elt.vars[]): 2 |
|
|
|
field_vars[1].val: |
|
ftrace_event_field name: next_comm |
|
type: char[16] |
|
size: 256 |
|
is_signed: 0 |
|
|
|
action tracking variables (for onmax()/onchange()/onmatch()): |
|
|
|
hist_data->actions[0].match_data.event_system: sched |
|
hist_data->actions[0].match_data.event: sched_waking |
|
|
|
The commands below can be used to clean things up for the next test:: |
|
|
|
# echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm)' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger |
|
|
|
# echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
# echo '!wakeup_latency u64 lat; pid_t pid; char comm[16]' >> synthetic_events |
|
|
|
action_data and the trace() action |
|
---------------------------------- |
|
|
|
As mentioned above, when the trace() action generates a synthetic |
|
event, all the parameters to the synthetic event either already are |
|
variables or are converted into variables (via field variables), and |
|
finally all those variable values are collected via references to them |
|
into a var_ref_vals[] array. |
|
|
|
The values in the var_ref_vals[] array, however, don't necessarily |
|
follow the same ordering as the synthetic event params. To address |
|
that, struct action_data contains another array, var_ref_idx[] that |
|
maps the trace action params to the var_ref_vals[] values. Below is a |
|
diagram illustrating that for the wakeup_latency() synthetic event:: |
|
|
|
+------------------+ wakeup_latency() |
|
| action_data | event params var_ref_vals[] |
|
+------------------+ +-----------------+ +-----------------+ |
|
| .var_ref_idx[] |--->| $wakeup_lat idx |---+ | | |
|
+----------------+ +-----------------+ | +-----------------+ |
|
| .synth_event | | $next_pid idx |---|-+ | $wakeup_lat val | |
|
+----------------+ +-----------------+ | | +-----------------+ |
|
. | +->| $next_pid val | |
|
. | +-----------------+ |
|
. | . |
|
+-----------------+ | . |
|
| | | . |
|
+-----------------+ | +-----------------+ |
|
+--->| $wakeup_lat val | |
|
+-----------------+ |
|
|
|
Basically, how this ends up getting used in the synthetic event probe |
|
function, trace_event_raw_event_synth(), is as follows:: |
|
|
|
for each field i in .synth_event |
|
val_idx = .var_ref_idx[i] |
|
val = var_ref_vals[val_idx] |
|
|
|
action_data and the onXXX() handlers |
|
------------------------------------ |
|
|
|
The hist trigger onXXX() actions other than onmatch(), such as onmax() |
|
and onchange(), also make use of and internally create hidden |
|
variables. This information is contained in the |
|
action_data.track_data struct, and is also visible in the hist_debug |
|
output as will be described in the example below. |
|
|
|
Typically, the onmax() or onchange() handlers are used in conjunction |
|
with the save() and snapshot() actions. For example:: |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ |
|
onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm)' >> |
|
/sys/kernel/debug/tracing/events/sched/sched_switch/trigger |
|
|
|
or:: |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ |
|
onmax($wakeup_lat).snapshot()' >> |
|
/sys/kernel/debug/tracing/events/sched/sched_switch/trigger |
|
|
|
save() action field variable test |
|
--------------------------------- |
|
|
|
For this example, instead of generating a synthetic event, the save() |
|
action is used to save field values whenever an onmax() handler |
|
detects that a new max latency has been hit. As in the previous |
|
example, the values being saved are also field values, but in this |
|
case, are kept in a separate hist_data array named save_vars[]. |
|
|
|
As in previous test examples, we set up the sched_waking trigger:: |
|
|
|
# echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
In this case, however, we set up the sched_switch trigger to save some |
|
sched_switch field values whenever we hit a new maximum latency. For |
|
both the onmax() handler and save() action, variables will be created, |
|
which we can use the hist_debug files to examine:: |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm)' >> events/sched/sched_switch/trigger |
|
|
|
The sched_waking hist_debug output shows the same data as in the |
|
previous test examples:: |
|
|
|
# cat events/sched/sched_waking/hist_debug |
|
|
|
# |
|
# trigger info: hist:keys=pid:vals=hitcount:ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active] |
|
# |
|
|
|
hist_data: 00000000e6290f48 |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
The output of the sched_switch trigger shows the same val and key |
|
values as before, but also shows a couple new sections. |
|
|
|
First, the action tracking variables section now shows the |
|
actions[].track_data information describing the special tracking |
|
variables and references used to track, in this case, the running |
|
maximum value. The actions[].track_data.var_ref member contains the |
|
reference to the variable being tracked, in this case the $wakeup_lat |
|
variable. In order to perform the onmax() handler function, there |
|
also needs to be a variable that tracks the current maximum by getting |
|
updated whenever a new maximum is hit. In this case, we can see that |
|
an auto-generated variable named ' __max' has been created and is |
|
visible in the actions[].track_data.track_var variable. |
|
|
|
Finally, in the new 'save action variables' section, we can see that |
|
the 4 params to the save() function have resulted in 4 field variables |
|
being created for the purposes of saving the values of the named |
|
fields when the max is hit. These variables are kept in a separate |
|
save_vars[] array off of hist_data, so are displayed in a separate |
|
section:: |
|
|
|
# cat events/sched/sched_switch/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=next_pid:vals=hitcount:wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global:onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) [active] |
|
# |
|
|
|
hist_data: 0000000057bcd28d |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: next_pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
variable reference fields: |
|
|
|
hist_data->var_refs[0]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 00000000e6290f48 |
|
var_ref_idx (into hist_data->var_refs[]): 0 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 0000000057bcd28d |
|
var_ref_idx (into hist_data->var_refs[]): 1 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
action tracking variables (for onmax()/onchange()/onmatch()): |
|
|
|
hist_data->actions[0].track_data.var_ref: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 0000000057bcd28d |
|
var_ref_idx (into hist_data->var_refs[]): 1 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
hist_data->actions[0].track_data.track_var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: __max |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
save action variables (save() params): |
|
|
|
hist_data->save_vars[0]: |
|
|
|
save_vars[0].var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: next_comm |
|
var.idx (into tracing_map_elt.vars[]): 2 |
|
|
|
save_vars[0].val: |
|
ftrace_event_field name: next_comm |
|
type: char[16] |
|
size: 256 |
|
is_signed: 0 |
|
|
|
hist_data->save_vars[1]: |
|
|
|
save_vars[1].var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: prev_pid |
|
var.idx (into tracing_map_elt.vars[]): 3 |
|
|
|
save_vars[1].val: |
|
ftrace_event_field name: prev_pid |
|
type: pid_t |
|
size: 4 |
|
is_signed: 1 |
|
|
|
hist_data->save_vars[2]: |
|
|
|
save_vars[2].var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: prev_prio |
|
var.idx (into tracing_map_elt.vars[]): 4 |
|
|
|
save_vars[2].val: |
|
ftrace_event_field name: prev_prio |
|
type: int |
|
size: 4 |
|
is_signed: 1 |
|
|
|
hist_data->save_vars[3]: |
|
|
|
save_vars[3].var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: prev_comm |
|
var.idx (into tracing_map_elt.vars[]): 5 |
|
|
|
save_vars[3].val: |
|
ftrace_event_field name: prev_comm |
|
type: char[16] |
|
size: 256 |
|
is_signed: 0 |
|
|
|
The commands below can be used to clean things up for the next test:: |
|
|
|
# echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm)' >> events/sched/sched_switch/trigger |
|
|
|
# echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
A couple special cases |
|
====================== |
|
|
|
While the above covers the basics of the histogram internals, there |
|
are a couple of special cases that should be discussed, since they |
|
tend to create even more confusion. Those are field variables on other |
|
histograms, and aliases, both described below through example tests |
|
using the hist_debug files. |
|
|
|
Test of field variables on other histograms |
|
------------------------------------------- |
|
|
|
This example is similar to the previous examples, but in this case, |
|
the sched_switch trigger references a hist trigger field on another |
|
event, namely the sched_waking event. In order to accomplish this, a |
|
field variable is created for the other event, but since an existing |
|
histogram can't be used, as existing histograms are immutable, a new |
|
histogram with a matching variable is created and used, and we'll see |
|
that reflected in the hist_debug output shown below. |
|
|
|
First, we create the wakeup_latency synthetic event. Note the |
|
addition of the prio field:: |
|
|
|
# echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> synthetic_events |
|
|
|
As in previous test examples, we set up the sched_waking trigger:: |
|
|
|
# echo 'hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
Here we set up a hist trigger on sched_switch to send a wakeup_latency |
|
event using an onmatch handler naming the sched_waking event. Note |
|
that the third param being passed to the wakeup_latency() is prio, |
|
which is a field name that needs to have a field variable created for |
|
it. There isn't however any prio field on the sched_switch event so |
|
it would seem that it wouldn't be possible to create a field variable |
|
for it. The matching sched_waking event does have a prio field, so it |
|
should be possible to make use of it for this purpose. The problem |
|
with that is that it's not currently possible to define a new variable |
|
on an existing histogram, so it's not possible to add a new prio field |
|
variable to the existing sched_waking histogram. It is however |
|
possible to create an additional new 'matching' sched_waking histogram |
|
for the same event, meaning that it uses the same key and filters, and |
|
define the new prio field variable on that. |
|
|
|
Here's the sched_switch trigger:: |
|
|
|
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,prio)' >> events/sched/sched_switch/trigger |
|
|
|
And here's the output of the hist_debug information for the |
|
sched_waking hist trigger. Note that there are two histograms |
|
displayed in the output: the first is the normal sched_waking |
|
histogram we've seen in the previous examples, and the second is the |
|
special histogram we created to provide the prio field variable. |
|
|
|
Looking at the second histogram below, we see a variable with the name |
|
synthetic_prio. This is the field variable created for the prio field |
|
on that sched_waking histogram:: |
|
|
|
# cat events/sched/sched_waking/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=pid:vals=hitcount:ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active] |
|
# |
|
|
|
hist_data: 00000000349570e4 |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=pid:vals=hitcount:synthetic_prio=prio:sort=hitcount:size=2048 [active] |
|
# |
|
|
|
hist_data: 000000006920cf38 |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
ftrace_event_field name: prio |
|
var.name: synthetic_prio |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: int |
|
size: 4 |
|
is_signed: 1 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
Looking at the sched_switch histogram below, we can see a reference to |
|
the synthetic_prio variable on sched_waking, and looking at the |
|
associated hist_data address we see that it is indeed associated with |
|
the new histogram. Note also that the other references are to a |
|
normal variable, wakeup_lat, and to a normal field variable, next_pid, |
|
the details of which are in the field variables section:: |
|
|
|
# cat events/sched/sched_switch/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=next_pid:vals=hitcount:wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,prio) [active] |
|
# |
|
|
|
hist_data: 00000000a73b67df |
|
|
|
n_vals: 2 |
|
n_keys: 1 |
|
n_fields: 3 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: next_pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
variable reference fields: |
|
|
|
hist_data->var_refs[0]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 00000000349570e4 |
|
var_ref_idx (into hist_data->var_refs[]): 0 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 00000000a73b67df |
|
var_ref_idx (into hist_data->var_refs[]): 1 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[2]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: next_pid |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
var.hist_data: 00000000a73b67df |
|
var_ref_idx (into hist_data->var_refs[]): 2 |
|
type: pid_t |
|
size: 4 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[3]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: synthetic_prio |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 000000006920cf38 |
|
var_ref_idx (into hist_data->var_refs[]): 3 |
|
type: int |
|
size: 4 |
|
is_signed: 1 |
|
|
|
field variables: |
|
|
|
hist_data->field_vars[0]: |
|
|
|
field_vars[0].var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: next_pid |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
|
|
field_vars[0].val: |
|
ftrace_event_field name: next_pid |
|
type: pid_t |
|
size: 4 |
|
is_signed: 1 |
|
|
|
action tracking variables (for onmax()/onchange()/onmatch()): |
|
|
|
hist_data->actions[0].match_data.event_system: sched |
|
hist_data->actions[0].match_data.event: sched_waking |
|
|
|
The commands below can be used to clean things up for the next test:: |
|
|
|
# echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,prio)' >> events/sched/sched_switch/trigger |
|
|
|
# echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
# echo '!wakeup_latency u64 lat; pid_t pid; int prio' >> synthetic_events |
|
|
|
Alias test |
|
---------- |
|
|
|
This example is very similar to previous examples, but demonstrates |
|
the alias flag. |
|
|
|
First, we create the wakeup_latency synthetic event:: |
|
|
|
# echo 'wakeup_latency u64 lat; pid_t pid; char comm[16]' >> synthetic_events |
|
|
|
Next, we create a sched_waking trigger similar to previous examples, |
|
but in this case we save the pid in the waking_pid variable:: |
|
|
|
# echo 'hist:keys=pid:waking_pid=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
For the sched_switch trigger, instead of using $waking_pid directly in |
|
the wakeup_latency synthetic event invocation, we create an alias of |
|
$waking_pid named $woken_pid, and use that in the synthetic event |
|
invocation instead:: |
|
|
|
# echo 'hist:keys=next_pid:woken_pid=$waking_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,$woken_pid,next_comm)' >> events/sched/sched_switch/trigger |
|
|
|
Looking at the sched_waking hist_debug output, in addition to the |
|
normal fields, we can see the waking_pid variable:: |
|
|
|
# cat events/sched/sched_waking/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=pid:vals=hitcount:waking_pid=pid,ts0=common_timestamp.usecs:sort=hitcount:size=2048:clock=global [active] |
|
# |
|
|
|
hist_data: 00000000a250528c |
|
|
|
n_vals: 3 |
|
n_keys: 1 |
|
n_fields: 4 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
ftrace_event_field name: pid |
|
var.name: waking_pid |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
type: pid_t |
|
size: 4 |
|
is_signed: 1 |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[3]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
The sched_switch hist_debug output shows that a variable named |
|
woken_pid has been created but that it also has the |
|
HIST_FIELD_FL_ALIAS flag set. It also has the HIST_FIELD_FL_VAR flag |
|
set, which is why it appears in the val field section. |
|
|
|
Despite that implementation detail, an alias variable is actually more |
|
like a variable reference; in fact it can be thought of as a reference |
|
to a reference. The implementation copies the var_ref->fn() from the |
|
variable reference being referenced, in this case, the waking_pid |
|
fn(), which is hist_field_var_ref() and makes that the fn() of the |
|
alias. The hist_field_var_ref() fn() requires the var_ref_idx of the |
|
variable reference it's using, so waking_pid's var_ref_idx is also |
|
copied to the alias. The end result is that when the value of alias |
|
is retrieved, in the end it just does the same thing the original |
|
reference would have done and retrieves the same value from the |
|
var_ref_vals[] array. You can verify this in the output by noting |
|
that the var_ref_idx of the alias, in this case woken_pid, is the same |
|
as the var_ref_idx of the reference, waking_pid, in the variable |
|
reference fields section. |
|
|
|
Additionally, once it gets that value, since it is also a variable, it |
|
then saves that value into its var.idx. So the var.idx of the |
|
woken_pid alias is 0, which it fills with the value from var_ref_idx 0 |
|
when its fn() is called to update itself. You'll also notice that |
|
there's a woken_pid var_ref in the variable refs section. That is the |
|
reference to the woken_pid alias variable, and you can see that it |
|
retrieves the value from the same var.idx as the woken_pid alias, 0, |
|
and then in turn saves that value in its own var_ref_idx slot, 3, and |
|
the value at this position is finally what gets assigned to the |
|
$woken_pid slot in the trace event invocation:: |
|
|
|
# cat events/sched/sched_switch/hist_debug |
|
|
|
# event histogram |
|
# |
|
# trigger info: hist:keys=next_pid:vals=hitcount:woken_pid=$waking_pid,wakeup_lat=common_timestamp.usecs-$ts0:sort=hitcount:size=2048:clock=global:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,$woken_pid,next_comm) [active] |
|
# |
|
|
|
hist_data: 0000000055d65ed0 |
|
|
|
n_vals: 3 |
|
n_keys: 1 |
|
n_fields: 4 |
|
|
|
val fields: |
|
|
|
hist_data->fields[0]: |
|
flags: |
|
VAL: HIST_FIELD_FL_HITCOUNT |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->fields[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
HIST_FIELD_FL_ALIAS |
|
var.name: woken_pid |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var_ref_idx (into hist_data->var_refs[]): 0 |
|
type: pid_t |
|
size: 4 |
|
is_signed: 1 |
|
|
|
hist_data->fields[2]: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
key fields: |
|
|
|
hist_data->fields[3]: |
|
flags: |
|
HIST_FIELD_FL_KEY |
|
ftrace_event_field name: next_pid |
|
type: pid_t |
|
size: 8 |
|
is_signed: 1 |
|
|
|
variable reference fields: |
|
|
|
hist_data->var_refs[0]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: waking_pid |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 00000000a250528c |
|
var_ref_idx (into hist_data->var_refs[]): 0 |
|
type: pid_t |
|
size: 4 |
|
is_signed: 1 |
|
|
|
hist_data->var_refs[1]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: ts0 |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
var.hist_data: 00000000a250528c |
|
var_ref_idx (into hist_data->var_refs[]): 1 |
|
type: u64 |
|
size: 8 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[2]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: wakeup_lat |
|
var.idx (into tracing_map_elt.vars[]): 1 |
|
var.hist_data: 0000000055d65ed0 |
|
var_ref_idx (into hist_data->var_refs[]): 2 |
|
type: u64 |
|
size: 0 |
|
is_signed: 0 |
|
|
|
hist_data->var_refs[3]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: woken_pid |
|
var.idx (into tracing_map_elt.vars[]): 0 |
|
var.hist_data: 0000000055d65ed0 |
|
var_ref_idx (into hist_data->var_refs[]): 3 |
|
type: pid_t |
|
size: 4 |
|
is_signed: 1 |
|
|
|
hist_data->var_refs[4]: |
|
flags: |
|
HIST_FIELD_FL_VAR_REF |
|
name: next_comm |
|
var.idx (into tracing_map_elt.vars[]): 2 |
|
var.hist_data: 0000000055d65ed0 |
|
var_ref_idx (into hist_data->var_refs[]): 4 |
|
type: char[16] |
|
size: 256 |
|
is_signed: 0 |
|
|
|
field variables: |
|
|
|
hist_data->field_vars[0]: |
|
|
|
field_vars[0].var: |
|
flags: |
|
HIST_FIELD_FL_VAR |
|
var.name: next_comm |
|
var.idx (into tracing_map_elt.vars[]): 2 |
|
|
|
field_vars[0].val: |
|
ftrace_event_field name: next_comm |
|
type: char[16] |
|
size: 256 |
|
is_signed: 0 |
|
|
|
action tracking variables (for onmax()/onchange()/onmatch()): |
|
|
|
hist_data->actions[0].match_data.event_system: sched |
|
hist_data->actions[0].match_data.event: sched_waking |
|
|
|
The commands below can be used to clean things up for the next test:: |
|
|
|
# echo '!hist:keys=next_pid:woken_pid=$waking_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,$woken_pid,next_comm)' >> events/sched/sched_switch/trigger |
|
|
|
# echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger |
|
|
|
# echo '!wakeup_latency u64 lat; pid_t pid; char comm[16]' >> synthetic_events
|
|
|