Overhead calculation
--------------------
The overhead can be shown in two columns as 'Children' and 'Self' when
perf collects callchains.  The 'self' overhead is simply calculated by
adding all period values of the entry - usually a function (symbol).
This is the value that perf shows traditionally, and the sum of all the
'self' overhead values should be 100%.

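As a rough illustration of the 'self' calculation, the following C
sketch adds up the sample periods per symbol and expresses the result
as a percentage of the total period.  The 'struct sample' layout and
the 'self_overhead()' helper are hypothetical and are not taken from
the perf sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sample record: the symbol that was hit and its period. */
struct sample {
    const char *symbol;
    unsigned long period;
};

/*
 * Return the 'self' overhead of `symbol` in percent: the sum of the
 * periods of all samples that hit the symbol, divided by the total
 * period of all samples.
 */
double self_overhead(const struct sample *samples, size_t n,
                     const char *symbol)
{
    unsigned long total = 0, self = 0;

    for (size_t i = 0; i < n; i++) {
        total += samples[i].period;
        if (strcmp(samples[i].symbol, symbol) == 0)
            self += samples[i].period;
    }
    assert(total > 0); /* needs at least one non-zero period */
    return 100.0 * (double)self / (double)total;
}
```

Because every sample is attributed to exactly one symbol, the values
returned for all distinct symbols add up to 100%.
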
The 'children' overhead is calculated by adding all period values of
the child functions so that it can show the total overhead of the
higher level functions even if they don't directly execute much.
'Children' here means functions that are called from another (parent)
function.

It might be confusing that the sum of all the 'children' overhead
values exceeds 100% since each of them is already an accumulation of
'self' overhead of its child functions.  But with this enabled, users
can find which function has the most overhead even if samples are
spread over the children.

Consider the following example; there are three functions as shown below.

-----------------------
void foo(void) {
    /* do something */
}

void bar(void) {
    /* do something */
    foo();
}

int main(void) {
    bar();
    return 0;
}
-----------------------

In this case 'foo' is a child of 'bar', and 'bar' is an immediate
child of 'main' so 'foo' also is a child of 'main'.  In other words,
'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.

Suppose all samples are recorded in 'foo' and 'bar' only.  When it's
recorded with callchains the output will show something like below
in the usual (self-overhead-only) output of perf report:

----------------------------------
Overhead  Symbol
........  .....................
  60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main

  40.00%  bar
          |
          --- bar
              main
              __libc_start_main
----------------------------------

When the --children option is enabled, the 'self' overhead values of
child functions (i.e. 'foo' and 'bar') are added to the parents to
calculate the 'children' overhead.  In this case the report could be
displayed as:

-------------------------------------------
Children      Self  Symbol
........  ........  ....................
 100.00%     0.00%  __libc_start_main
          |
          --- __libc_start_main

 100.00%     0.00%  main
          |
          --- main
              __libc_start_main

 100.00%    40.00%  bar
          |
          --- bar
              main
              __libc_start_main

  60.00%    60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main
-------------------------------------------

In the above output, the 'self' overhead of 'foo' (60%) was added to the
'children' overhead of 'bar', 'main' and '\_\_libc_start_main'.
Likewise, the 'self' overhead of 'bar' (40%) was added to the
'children' overhead of 'main' and '\_\_libc_start_main'.

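The accumulation described above can be sketched in C as follows.
This is only an illustration under stated assumptions, not perf's
implementation: the 'struct chain_sample' layout, the MAX_DEPTH limit
and the 'overhead()' helper are hypothetical.  Each sample carries its
callchain from the sampled function (index 0) up to the outermost
caller, and its period is credited to every symbol in that chain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8 /* hypothetical limit on callchain depth */

/* Hypothetical sample: callchain from leaf (index 0) to root, plus period. */
struct chain_sample {
    const char *chain[MAX_DEPTH]; /* NULL-terminated if shorter than MAX_DEPTH */
    unsigned long period;
};

/*
 * Compute the 'children' and 'self' overhead of `symbol` in percent.
 * A sample counts toward 'self' when the symbol is the leaf (chain[0])
 * and toward 'children' when the symbol appears anywhere in the chain.
 */
void overhead(const struct chain_sample *s, size_t n, const char *symbol,
              double *children, double *self)
{
    unsigned long total = 0, c = 0, sf = 0;

    for (size_t i = 0; i < n; i++) {
        total += s[i].period;
        for (size_t d = 0; d < MAX_DEPTH && s[i].chain[d]; d++) {
            if (strcmp(s[i].chain[d], symbol) != 0)
                continue;
            if (d == 0)
                sf += s[i].period;
            c += s[i].period;
            break; /* credit each sample once, even for recursive calls */
        }
    }
    assert(total > 0); /* needs at least one non-zero period */
    *children = 100.0 * (double)c / (double)total;
    *self = 100.0 * (double)sf / (double)total;
}
```

Fed with one sample of period 60 whose chain is 'foo', 'bar', 'main',
'\_\_libc_start_main' and one of period 40 whose chain is 'bar', 'main',
'\_\_libc_start_main', this yields the same Children and Self values as
the report above.  Crediting a sample only once per symbol is a choice
made in this sketch so that recursion does not push a symbol over 100%.
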
So '\_\_libc_start_main' and 'main' are shown first since they have the
same (100%) 'children' overhead (even though they have zero 'self'
overhead) and they are the parents of 'foo' and 'bar'.

Since v3.16 the 'children' overhead is shown by default and the output
is sorted by its values.  The 'children' overhead can be disabled by
specifying the --no-children option on the command line or by adding
'report.children = false' or 'top.children = false' to the perf config
file.
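
As a sketch, and assuming the usual section/key layout of a perf
config file such as ~/.perfconfig, the two settings named above would
be written as:

```
[report]
    children = false
[top]
    children = false
```

For a single run, passing --no-children to perf report has the same
effect on that report without changing the config file.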