LKML Archive on lore.kernel.org
* [PATCH v4] perf record: collect user registers set jointly with dwarf stacks
@ 2019-05-29 14:30 Alexey Budankov
2019-05-29 19:25 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 8+ messages in thread
From: Alexey Budankov @ 2019-05-29 14:30 UTC
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
Ingo Molnar, Andi Kleen, linux-kernel
When DWARF stacks are collected jointly with a user-specified register
set via the --user-regs option, as below, the full register context is
still captured in each sample:
$ perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- stack_test2.g.O3
188143843893585 0x6b48 [0x4f8]: PERF_RECORD_SAMPLE(IP, 0x4002): 23828/23828: 0x401236 period: 1363819 addr: 0x7ffedbdd51ac
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
.... AX 0x53b
.... BX 0x7ffedbdd3cc0
.... CX 0xffffffff
.... DX 0x33d3a
.... SI 0x7f09b74c38d0
.... DI 0x0
.... BP 0x401260
.... SP 0x7ffedbdd3cc0
.... IP 0x401236
.... FLAGS 0x20a
.... CS 0x33
.... SS 0x2b
.... R8 0x7f09b74c3800
.... R9 0x7f09b74c2da0
.... R10 0xfffffffffffff3ce
.... R11 0x246
.... R12 0x401070
.... R13 0x7ffedbdd5db0
.... R14 0x0
.... R15 0x0
... ustack: size 1024, offset 0xe0
. data_src: 0x5080021
... thread: stack_test2.g.O:23828
...... dso: /root/abudanko/stacks/stack_test2.g.O3
After applying the change in this patch, the sample data contain
only the user-specified register values. The IP and SP registers (dwarf_regs)
are collected regardless of the --user-regs option value given
on the command line:
  -g call-graph dwarf,K                          full_regs
  -g call-graph dwarf,K --user-regs=user_regs    user_regs + dwarf_regs
  --user-regs=user_regs                          user_regs
$ perf record -g --call-graph dwarf,1024 --user-regs=BP -- ls
WARNING: specified --user-regs register set doesn't include registers needed by also specified --call-graph=dwarf, auto adding IP, SP registers.
arch COPYING Documentation include Kbuild lbuild MAINTAINERS modules.builtin Module.symvers perf.data.old scripts System.map virt
block CREDITS drivers init Kconfig lib Makefile modules.builtin.modinfo net README security tools vmlinux
certs crypto fs ipc kernel LICENSES mm modules.order perf.data samples sound usr vmlinux.o
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]
188368474305373 0x5e40 [0x470]: PERF_RECORD_SAMPLE(IP, 0x4002): 23839/23839: 0x401236 period: 1260507 addr: 0x7ffd3d85e96c
... FP chain: nr:0
... user regs: mask 0x1c0 ABI 64-bit
.... BP 0x401260
.... SP 0x7ffd3d85cc20
.... IP 0x401236
... ustack: size 1024, offset 0x58
. data_src: 0x5080021
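For readers comparing the two dumps above: the "mask" value on the "user regs" line is a bitmap indexed by perf's register numbers. The sketch below decodes such a mask into register names. It assumes the x86 index order (AX=0 ... BP=6, SP=7, IP=8 ... R15=23) from the kernel's x86 uapi perf_regs.h; the helper name decode_user_regs_mask() and the name table are illustrative and are not part of perf.

```c
#include <stdint.h>
#include <stdio.h>

/* x86 register names by perf register index (assumed order, see above). */
static const char *const x86_reg_names[] = {
	"AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP",
	"IP", "FLAGS", "CS", "SS", "DS", "ES", "FS", "GS",
	"R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15",
};

/*
 * Hypothetical helper: write a comma separated list of the registers
 * selected by mask into buf and return how many were written.
 */
static int decode_user_regs_mask(uint64_t mask, char *buf, size_t size)
{
	size_t nr_names = sizeof(x86_reg_names) / sizeof(x86_reg_names[0]);
	size_t len = 0;
	int count = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < nr_names; i++) {
		if (!(mask & (1ULL << i)))
			continue;

		int n = snprintf(buf + len, size - len, "%s%s",
				 count ? "," : "", x86_reg_names[i]);
		if (n < 0 || (size_t)n >= size - len) {
			buf[len] = '\0';	/* drop the partial name */
			break;
		}
		len += (size_t)n;
		count++;
	}
	return count;
}
```

Under that assumed ordering, 0x1c0 selects bits 6, 7 and 8 (BP, SP, IP), and 0xff0fff selects bits 0-11 and 16-23, which matches the twenty registers listed in the first dump.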
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
Changes in v4:
- added warning message about dwarf registers unconditionally
included into the collected registers set
Changes in v3:
- avoid changes in platform specific header files
Changes in v2:
- implemented dwarf register set to avoid corrupted trace
when --user-regs option value omits IP,SP
---
tools/perf/util/evsel.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a6f572a40deb..426dfefeecda 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -669,6 +669,9 @@ int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size)
 	return ret;
 }
 
+#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
+			 (1ULL << PERF_REG_SP))
+
 static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
 					   struct record_opts *opts,
 					   struct callchain_param *param)
@@ -702,7 +705,13 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
 		if (!function) {
 			perf_evsel__set_sample_bit(evsel, REGS_USER);
 			perf_evsel__set_sample_bit(evsel, STACK_USER);
-			attr->sample_regs_user |= PERF_REGS_MASK;
+			if (opts->sample_user_regs) {
+				attr->sample_regs_user |= DWARF_REGS_MASK;
+				pr_warning("WARNING: specified --user-regs register set doesn't include registers "
+					   "needed by also specified --call-graph=dwarf, auto adding IP, SP registers.\n");
+			} else {
+				attr->sample_regs_user |= PERF_REGS_MASK;
+			}
 			attr->sample_stack_user = param->dump_size;
 			attr->exclude_callchain_user = 1;
 		} else {
--
2.20.1
* Re: [PATCH v4] perf record: collect user registers set jointly with dwarf stacks
2019-05-29 14:30 [PATCH v4] perf record: collect user registers set jointly with dwarf stacks Alexey Budankov
@ 2019-05-29 19:25 ` Arnaldo Carvalho de Melo
2019-05-30 8:24 ` Alexey Budankov
0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-05-29 19:25 UTC
To: Alexey Budankov
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
Ingo Molnar, Andi Kleen, linux-kernel
Em Wed, May 29, 2019 at 05:30:49PM +0300, Alexey Budankov escreveu:
>
> When dwarf stacks are collected jointly with user specified register
> set using --user-regs option like below the full register context is
> still captured on a sample:
>
> $ perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- stack_test2.g.O3
>
> 188143843893585 0x6b48 [0x4f8]: PERF_RECORD_SAMPLE(IP, 0x4002): 23828/23828: 0x401236 period: 1363819 addr: 0x7ffedbdd51ac
> ... FP chain: nr:0
> ... user regs: mask 0xff0fff ABI 64-bit
> .... AX 0x53b
> .... BX 0x7ffedbdd3cc0
> .... CX 0xffffffff
> .... DX 0x33d3a
> .... SI 0x7f09b74c38d0
> .... DI 0x0
> .... BP 0x401260
> .... SP 0x7ffedbdd3cc0
> .... IP 0x401236
> .... FLAGS 0x20a
> .... CS 0x33
> .... SS 0x2b
> .... R8 0x7f09b74c3800
> .... R9 0x7f09b74c2da0
> .... R10 0xfffffffffffff3ce
> .... R11 0x246
> .... R12 0x401070
> .... R13 0x7ffedbdd5db0
> .... R14 0x0
> .... R15 0x0
> ... ustack: size 1024, offset 0xe0
> . data_src: 0x5080021
> ... thread: stack_test2.g.O:23828
> ...... dso: /root/abudanko/stacks/stack_test2.g.O3
>
> After applying the change suggested in the patch the sample data contain
> only user specified register values. IP and SP registers (dwarf_regs)
> are collected anyways regardless of the --user-regs option value provided
> from the command line:
>
> -g call-graph dwarf,K full_regs
> -g call-graph dwarf,K --user-regs=user_regs user_regs + dwarf_regs
> --user-regs=user_regs user_regs
>
> $ perf record -g --call-graph dwarf,1024 --user-regs=BP -- ls
> WARNING: specified --user-regs register set doesn't include registers needed by also specified --call-graph=dwarf, auto adding IP, SP registers.
> arch COPYING Documentation include Kbuild lbuild MAINTAINERS modules.builtin Module.symvers perf.data.old scripts System.map virt
> block CREDITS drivers init Kconfig lib Makefile modules.builtin.modinfo net README security tools vmlinux
> certs crypto fs ipc kernel LICENSES mm modules.order perf.data samples sound usr vmlinux.o
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]
>
> 188368474305373 0x5e40 [0x470]: PERF_RECORD_SAMPLE(IP, 0x4002): 23839/23839: 0x401236 period: 1260507 addr: 0x7ffd3d85e96c
> ... FP chain: nr:0
> ... user regs: mask 0x1c0 ABI 64-bit
> .... BP 0x401260
> .... SP 0x7ffd3d85cc20
> .... IP 0x401236
> ... ustack: size 1024, offset 0x58
> . data_src: 0x5080021
>
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> ---
> Changes in v4:
> - added warning message about dwarf registers unconditionally
> included into the collected registers set
>
> Changes in v3:
> - avoid changes in platform specific header files
>
> Changes in v2:
> - implemented dwarf register set to avoid corrupted trace
> when --user-regs option value omits IP,SP
>
> ---
> tools/perf/util/evsel.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index a6f572a40deb..426dfefeecda 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -669,6 +669,9 @@ int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size)
> return ret;
> }
>
> +#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
> + (1ULL << PERF_REG_SP))
> +
> static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
> struct record_opts *opts,
> struct callchain_param *param)
> @@ -702,7 +705,13 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
> if (!function) {
> perf_evsel__set_sample_bit(evsel, REGS_USER);
> perf_evsel__set_sample_bit(evsel, STACK_USER);
> - attr->sample_regs_user |= PERF_REGS_MASK;
> + if (opts->sample_user_regs) {
Where are you checking that opts->sample_user_regs doesn't have either
IP or SP?
So, __perf_evsel__config_callchain() is the routine that sets up
attr->sample_regs_user when callchains are asked for, and what was it
doing? Asking for _all_ user regs, right?

I.e. what you're saying is that when --call-graph dwarf is asked for,
then only IP and SP are needed, and we should stop asking for all of
them, so that would be a first patch, if that is the case. I.e. a patch
that doesn't even mention opts->sample_user_regs.

Then, a second patch would fix the opts->sample_user_regs request clash
with --call-graph dwarf, i.e. it would do something like:

	if ((opts->sample_user_regs & DWARF_REGS_MASK) != DWARF_REGS_MASK) {
		/* Empty string when the register was already requested. */
		const char *ip = (opts->sample_user_regs & (1ULL << PERF_REG_IP)) ? "" : "IP",
			   *sp = (opts->sample_user_regs & (1ULL << PERF_REG_SP)) ? "" : "SP",
			   *all = (*ip && *sp) ? "s" : "";

		pr_warning("WARNING: specified --user-regs register set doesn't include register%s "
			   "needed by also specified --call-graph=dwarf, auto adding %s%s%s register%s.\n",
			   all, ip, *all ? ", " : "", sp, all);
	}

This if and only if all the registers that are needed to do DWARF
unwinding are just IP and SP, which doesn't look like it's true, since
when no --user-regs is set (i.e. opts->sample_user_regs is not set) then
we continue asking for PERF_REGS_MASK...

Can you check where I'm missing something?

Jiri, does DWARF unwind use just IP and SP? Looking at
tools/perf/util/unwind-libunwind-local.c's access_reg() I don't think
so, right?
- Arnaldo
> + attr->sample_regs_user |= DWARF_REGS_MASK;
> + pr_warning("WARNING: specified --user-regs register set doesn't include registers "
> + "needed by also specified --call-graph=dwarf, auto adding IP, SP registers.\n");
> + } else {
> + attr->sample_regs_user |= PERF_REGS_MASK;
> + }
> attr->sample_stack_user = param->dump_size;
> attr->exclude_callchain_user = 1;
> } else {
> --
> 2.20.1
--
- Arnaldo
* Re: [PATCH v4] perf record: collect user registers set jointly with dwarf stacks
2019-05-29 19:25 ` Arnaldo Carvalho de Melo
@ 2019-05-30 8:24 ` Alexey Budankov
2019-05-30 13:13 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 8+ messages in thread
From: Alexey Budankov @ 2019-05-30 8:24 UTC
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
Ingo Molnar, Andi Kleen, linux-kernel
On 29.05.2019 22:25, Arnaldo Carvalho de Melo wrote:
> Em Wed, May 29, 2019 at 05:30:49PM +0300, Alexey Budankov escreveu:
>>
<SNIP>
>> ---
>> tools/perf/util/evsel.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index a6f572a40deb..426dfefeecda 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -669,6 +669,9 @@ int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size)
>> return ret;
>> }
>>
>> +#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
>> + (1ULL << PERF_REG_SP))
>> +
>> static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
>> struct record_opts *opts,
>> struct callchain_param *param)
>> @@ -702,7 +705,13 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
>> if (!function) {
>> perf_evsel__set_sample_bit(evsel, REGS_USER);
>> perf_evsel__set_sample_bit(evsel, STACK_USER);
>> - attr->sample_regs_user |= PERF_REGS_MASK;
>> + if (opts->sample_user_regs) {
>
> Where are you checking that opts->sample_user_regs doesn't have either
> IP or SP?
Sure. The intention was to avoid such a complication: merge the two
masks and provide an explicit warning that the resulting mask is extended.
If you still see checking and auto-detection of the exact mask
extension as essential, it can be implemented.
>
> So, __perf_evsel__config_callchain its the routine that sets up the
> attr->sample_regs_user when callchains are asked for, and what was it
> doing? Asking for _all_ user regs, right?
>
> I.e. what you're saying is that when --callgraph-dwarf is asked for,
> then only IP and BP are needed, and we should stop doing that, so that
> would be a first patch, if that is the case. I.e. a patch that doesn't
> even mention opts->sample_user_regs.
>
> Then, a second patch would fix the opt->sample_user_regs request clash
> with --callgraph dwarf, i.e. it would do something like:
>
> if ((opts->sample_regs_user & DWARF_REGS_MASK) != DWARF_REGS_MASK) {
> char * ip = (opts->sample_regs_user & (1ULL << PERF_REG_IP)) ? NULL : "IP",
> * sp = (opts->sample_regs_user & (1ULL << PERF_REG_SP)) ? NULL : "SP",
> * all = (!ip && !sp) ? "s" : "";
>
> pr_warning("WARNING: specified --user-regs register set doesn't include register%s "
> "needed by also specified --call-graph=dwarf, auto adding %s%s%s register%s.\n",
> all, ip, all : ", " : "", sp, all);
> }
>
> This if and only if all the registers that are needed to do DWARF
> unwinding are just IP and BP, which doesn't look like its true, since
> when no --user_regs is set (i.e. opts->user_regs is not set) then we
> continue asking for PERF_REGS_MASK...
>
> Can you check where I'm missing something?
1. -g call-graph dwarf,K                        full_regs
2. --user-regs=user_regs                        user_regs
3. -g call-graph dwarf,K --user-regs=user_regs  user_regs + dwarf_regs

The default behavior stays the same for cases 1 and 2 above.
For case 3 the register set becomes the one requested with the --user-regs
option. If the option value misses IP, SP, or both, they are explicitly
added to the option value and a warning message naming the exact
added registers is provided.
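As an illustration of cases 1-3 above, here is a minimal sketch of the mask selection as a pure function. It is not the perf implementation: effective_user_regs(), REG_SP, REG_IP and FULL_REGS_MASK are hypothetical stand-ins, the register indexes assume x86 (SP=7, IP=8), and the full mask is simply the 0xff0fff value seen in the first dump.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 perf register indexes for SP and IP (assumed values). */
enum { REG_SP = 7, REG_IP = 8 };

#define DWARF_MINIMAL_REGS ((1ULL << REG_IP) | (1ULL << REG_SP))
/* Stand-in for PERF_REGS_MASK: the full set seen in the first dump. */
#define FULL_REGS_MASK 0xff0fffULL

/*
 * Hypothetical helper: return the user register mask to sample.
 * *added receives the bits that were added on top of user_regs; it is
 * non-zero only in case 3, and only when IP and/or SP were missing.
 */
static uint64_t effective_user_regs(bool dwarf, uint64_t user_regs,
				    uint64_t *added)
{
	*added = 0;
	if (!dwarf)
		return user_regs;		/* case 2: exactly what was asked for */
	if (!user_regs)
		return FULL_REGS_MASK;		/* case 1: full register set */

	*added = DWARF_MINIMAL_REGS & ~user_regs;
	return user_regs | DWARF_MINIMAL_REGS;	/* case 3 */
}
```

With --user-regs=BP (bit 6 under the assumed numbering) plus DWARF call graphs, this yields 0x1c0 with *added = 0x180, i.e. SP and IP, which is the set a precise warning would have to name.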
>
> Jiri DWARF unwind uses just IP and SP? Looking at
> tools/perf/util/unwind-libunwind-local.c's access_reg() I don't think
> so, right?
If you ask me, AFAIK, DWARF unwind rules can sometimes refer to additional
general purpose registers for frame boundary calculation.
Thanks,
Alexey
>
> - Arnaldo
>
>> + attr->sample_regs_user |= DWARF_REGS_MASK;
>> + pr_warning("WARNING: specified --user-regs register set doesn't include registers "
>> + "needed by also specified --call-graph=dwarf, auto adding IP, SP registers.\n");
>> + } else {
>> + attr->sample_regs_user |= PERF_REGS_MASK;
>> + }
>> attr->sample_stack_user = param->dump_size;
>> attr->exclude_callchain_user = 1;
>> } else {
>> --
>> 2.20.1
>
* Re: [PATCH v4] perf record: collect user registers set jointly with dwarf stacks
2019-05-30 8:24 ` Alexey Budankov
@ 2019-05-30 13:13 ` Arnaldo Carvalho de Melo
2019-05-30 16:24 ` Alexey Budankov
0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-05-30 13:13 UTC
To: Alexey Budankov
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Peter Zijlstra, Ingo Molnar, Andi Kleen,
linux-kernel
Em Thu, May 30, 2019 at 11:24:49AM +0300, Alexey Budankov escreveu:
> On 29.05.2019 22:25, Arnaldo Carvalho de Melo wrote:
> > Em Wed, May 29, 2019 at 05:30:49PM +0300, Alexey Budankov escreveu:
> <SNIP>
> >> +++ b/tools/perf/util/evsel.c
> >> +#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
> >> + (1ULL << PERF_REG_SP))
> >> +
> >> static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
> >> struct record_opts *opts,
> >> struct callchain_param *param)
> >> @@ -702,7 +705,13 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
> >> if (!function) {
> >> perf_evsel__set_sample_bit(evsel, REGS_USER);
> >> perf_evsel__set_sample_bit(evsel, STACK_USER);
> >> - attr->sample_regs_user |= PERF_REGS_MASK;
> >> + if (opts->sample_user_regs) {
> >
> > Where are you checking that opts->sample_user_regs doesn't have either
> > IP or SP?
>
> Sure. The the intention was to avoid such a complication, merge two
> masks and provide explicit warning that the resulting mask is extended.
s/is/may be/g
> If you still see the checking and auto detection of the exact mask
> extension as essential it can be implemented.
perf, tracing, systems internals, etc. are super complicated and full of
details; the more precise we can make the messages, the better.
> > So, __perf_evsel__config_callchain its the routine that sets up the
> > attr->sample_regs_user when callchains are asked for, and what was it
> > doing? Asking for _all_ user regs, right?
> >
> > I.e. what you're saying is that when --callgraph-dwarf is asked for,
> > then only IP and BP are needed, and we should stop doing that, so that
> > would be a first patch, if that is the case. I.e. a patch that doesn't
> > even mention opts->sample_user_regs.
> >
> > Then, a second patch would fix the opt->sample_user_regs request clash
> > with --callgraph dwarf, i.e. it would do something like:
> >
> > if ((opts->sample_regs_user & DWARF_REGS_MASK) != DWARF_REGS_MASK) {
> > char * ip = (opts->sample_regs_user & (1ULL << PERF_REG_IP)) ? NULL : "IP",
> > * sp = (opts->sample_regs_user & (1ULL << PERF_REG_SP)) ? NULL : "SP",
> > * all = (!ip && !sp) ? "s" : "";
> >
> > pr_warning("WARNING: specified --user-regs register set doesn't include register%s "
> > "needed by also specified --call-graph=dwarf, auto adding %s%s%s register%s.\n",
> > all, ip, all : ", " : "", sp, all);
> > }
> >
> > This if and only if all the registers that are needed to do DWARF
> > unwinding are just IP and BP, which doesn't look like its true, since
> > when no --user_regs is set (i.e. opts->user_regs is not set) then we
> > continue asking for PERF_REGS_MASK...
> >
> > Can you check where I'm missing something?
>
> 1. -g call-graph dwarf,K full_regs
> 2. --user-regs=user_regs user_regs
> 3. -g call-graph dwarf,K --user-regs=user_regs user_regs + dwarf_regs
>
> The default behavior stays the same for cases 1, 2 above.
> For case 3 register set becomes the one asked using --user_regs option.
> If the option value misses IP or SP or the both then they are explicitly
> added to the option value and a warning message mentioning the exact
> added registers is provided.
> > Jiri DWARF unwind uses just IP and SP? Looking at
> > tools/perf/util/unwind-libunwind-local.c's access_reg() I don't think
> > so, right?
> If you ask me, AFAIK, DWARF unwind rules sometimes can refer additional
> general purpose registers for frames boundaries calculation.
:-) So the DWARF_REGS name is misleading; it should be something like
DWARF_MINIMAL_REGS, as we may need other registers. So the original code
was correct, right?
After all, if the user asks for both --call-graph dwarf and --user-regs,
then probably we should require --force? I.e. the message then would be:
"
WARNING: The use of --call-graph=dwarf may require all the user
registers, specifying a subset with --user-regs may render DWARF
unwinding unreliable, please use --force if you're sure that the subset
specified via --user-regs is enough for your specific use case.
"
And then plainly refuse; if the user _really_ wants it, we have
--force/-f for those cases.
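To make the --force idea concrete, a minimal sketch of such a gate follows. It is only an illustration of the proposal, not perf code: check_dwarf_user_regs() is a hypothetical name, the three parameters stand in for the parsed command-line state, and returning -EINVAL on refusal is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical check: refuse a --user-regs subset combined with
 * --call-graph dwarf unless force is set.
 * Returns 0 to proceed, -EINVAL to refuse (error code is an assumption).
 */
static int check_dwarf_user_regs(bool dwarf, uint64_t user_regs, bool force)
{
	/* No clash, or the user explicitly took responsibility. */
	if (!dwarf || !user_regs || force)
		return 0;

	fprintf(stderr,
		"WARNING: The use of --call-graph=dwarf may require all the user\n"
		"registers, specifying a subset with --user-regs may render DWARF\n"
		"unwinding unreliable, please use --force if you're sure that the subset\n"
		"specified via --user-regs is enough for your specific use case.\n");
	return -EINVAL;
}
```

The trade-off is that a refusal makes the risk impossible to miss, at the cost of an extra flag for users who already chose the subset deliberately.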
Does this sound better?
- Arnaldo
* Re: [PATCH v4] perf record: collect user registers set jointly with dwarf stacks
2019-05-30 13:13 ` Arnaldo Carvalho de Melo
@ 2019-05-30 16:24 ` Alexey Budankov
2019-05-30 18:04 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 8+ messages in thread
From: Alexey Budankov @ 2019-05-30 16:24 UTC
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
Ingo Molnar, Andi Kleen, linux-kernel
On 30.05.2019 16:13, Arnaldo Carvalho de Melo wrote:
> Em Thu, May 30, 2019 at 11:24:49AM +0300, Alexey Budankov escreveu:
>> On 29.05.2019 22:25, Arnaldo Carvalho de Melo wrote:
>>> Em Wed, May 29, 2019 at 05:30:49PM +0300, Alexey Budankov escreveu:
>> <SNIP>
>>>> +++ b/tools/perf/util/evsel.c
>>>> +#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
>>>> + (1ULL << PERF_REG_SP))
>>>> +
>>>> static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
>>>> struct record_opts *opts,
>>>> struct callchain_param *param)
>>>> @@ -702,7 +705,13 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
>>>> if (!function) {
>>>> perf_evsel__set_sample_bit(evsel, REGS_USER);
>>>> perf_evsel__set_sample_bit(evsel, STACK_USER);
>>>> - attr->sample_regs_user |= PERF_REGS_MASK;
>>>> + if (opts->sample_user_regs) {
>>>
>>> Where are you checking that opts->sample_user_regs doesn't have either
>>> IP or SP?
>>
>> Sure. The the intention was to avoid such a complication, merge two
>> masks and provide explicit warning that the resulting mask is extended.
>
> s/is/may be/g
>
>> If you still see the checking and auto detection of the exact mask
>> extension as essential it can be implemented.
>
> perf, tracing, systems internals, etc are super complicated, full of
> details, the more precise we can make the messages, the better.
>
>>> So, __perf_evsel__config_callchain its the routine that sets up the
>>> attr->sample_regs_user when callchains are asked for, and what was it
>>> doing? Asking for _all_ user regs, right?
>>>
>>> I.e. what you're saying is that when --callgraph-dwarf is asked for,
>>> then only IP and BP are needed, and we should stop doing that, so that
>>> would be a first patch, if that is the case. I.e. a patch that doesn't
>>> even mention opts->sample_user_regs.
>>>
>>> Then, a second patch would fix the opt->sample_user_regs request clash
>>> with --callgraph dwarf, i.e. it would do something like:
>>>
>>> if ((opts->sample_regs_user & DWARF_REGS_MASK) != DWARF_REGS_MASK) {
>>> char * ip = (opts->sample_regs_user & (1ULL << PERF_REG_IP)) ? NULL : "IP",
>>> * sp = (opts->sample_regs_user & (1ULL << PERF_REG_SP)) ? NULL : "SP",
>>> * all = (!ip && !sp) ? "s" : "";
>>>
>>> pr_warning("WARNING: specified --user-regs register set doesn't include register%s "
>>> "needed by also specified --call-graph=dwarf, auto adding %s%s%s register%s.\n",
>>> all, ip, all : ", " : "", sp, all);
>>> }
>>>
>>> This if and only if all the registers that are needed to do DWARF
>>> unwinding are just IP and BP, which doesn't look like its true, since
>>> when no --user_regs is set (i.e. opts->user_regs is not set) then we
>>> continue asking for PERF_REGS_MASK...
>>>
>>> Can you check where I'm missing something?
>>
>> 1. -g call-graph dwarf,K full_regs
>> 2. --user-regs=user_regs user_regs
>> 3. -g call-graph dwarf,K --user-regs=user_regs user_regs + dwarf_regs
>>
>> The default behavior stays the same for cases 1, 2 above.
>> For case 3 register set becomes the one asked using --user_regs option.
>> If the option value misses IP or SP or the both then they are explicitly
>> added to the option value and a warning message mentioning the exact
>> added registers is provided.
>
>>> Jiri DWARF unwind uses just IP and SP? Looking at
>>> tools/perf/util/unwind-libunwind-local.c's access_reg() I don't think
>>> so, right?
>
>> If you ask me, AFAIK, DWARF unwind rules sometimes can refer additional
>> general purpose registers for frames boundaries calculation.
>
> :-) So that DWARF_REGS is misleading, should be something like
> DWARF_MINIMAL_REGS, as we may need other registers, so the original code
> was correct, right?
Right. Actually, I came to the same conclusion, with the same name for the IP,SP mask :)
>
> After all if the user asks for both --call-graph dwarf and --user-regs,
> then probably we should require --force? I.e. the message then would be:
>
> "
> WARNING: The use of --call-graph=dwarf may require all the user
> registers, specifying a subset with --user-regs may render DWARF
> unwinding unreliable, please use --force if you're sure that the subset
> specified via --user-regs is enough for your specific use case.
> "
>
> And then plain refuse, if the user _really_ wants it, then we have
> --force/-f for those cases.
>
> Does this sound better?
If --user-regs is specified jointly with the --call-graph dwarf option then
--user-regs already serves as the --force and, IMHO, a warning works best.
The ideal solution, I imagine, would also dynamically calculate the register
set extension and report it in the warning, but it is only two registers.
So, if --call-graph dwarf and --user-regs=A,B,C are specified jointly then
"
WARNING: The use of --call-graph=dwarf may require all the user registers,
specifying a subset with --user-regs may render DWARF unwinding unreliable,
so the minimal registers set (IP, SP) is explicitly forced.
"
The message is precise and it would fit the majority of use cases.
The final decision is up to you.
~Alexey
>
> - Arnaldo
>
* Re: [PATCH v4] perf record: collect user registers set jointly with dwarf stacks
2019-05-30 16:24 ` Alexey Budankov
@ 2019-05-30 18:04 ` Arnaldo Carvalho de Melo
2019-05-30 18:15 ` Alexey Budankov
0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-05-30 18:04 UTC
To: Alexey Budankov
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Peter Zijlstra, Ingo Molnar, Andi Kleen,
linux-kernel
Em Thu, May 30, 2019 at 07:24:57PM +0300, Alexey Budankov escreveu:
>
> On 30.05.2019 16:13, Arnaldo Carvalho de Melo wrote:
> > Em Thu, May 30, 2019 at 11:24:49AM +0300, Alexey Budankov escreveu:
> >> On 29.05.2019 22:25, Arnaldo Carvalho de Melo wrote:
> >>> Em Wed, May 29, 2019 at 05:30:49PM +0300, Alexey Budankov escreveu:
> >> <SNIP>
> >>>> +++ b/tools/perf/util/evsel.c
> >>>> +#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
> >>>> + (1ULL << PERF_REG_SP))
> >>>> +
> >>>> static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
> >>>> struct record_opts *opts,
> >>>> struct callchain_param *param)
> >>>> @@ -702,7 +705,13 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
> >>>> if (!function) {
> >>>> perf_evsel__set_sample_bit(evsel, REGS_USER);
> >>>> perf_evsel__set_sample_bit(evsel, STACK_USER);
> >>>> - attr->sample_regs_user |= PERF_REGS_MASK;
> >>>> + if (opts->sample_user_regs) {
> >>>
> >>> Where are you checking that opts->sample_user_regs doesn't have either
> >>> IP or SP?
> >>
> >> Sure. The the intention was to avoid such a complication, merge two
> >> masks and provide explicit warning that the resulting mask is extended.
> >
> > s/is/may be/g
> >
> >> If you still see the checking and auto detection of the exact mask
> >> extension as essential it can be implemented.
> >
> > perf, tracing, systems internals, etc are super complicated, full of
> > details, the more precise we can make the messages, the better.
> >
> >>> So, __perf_evsel__config_callchain its the routine that sets up the
> >>> attr->sample_regs_user when callchains are asked for, and what was it
> >>> doing? Asking for _all_ user regs, right?
> >>>
> >>> I.e. what you're saying is that when --callgraph-dwarf is asked for,
> >>> then only IP and BP are needed, and we should stop doing that, so that
> >>> would be a first patch, if that is the case. I.e. a patch that doesn't
> >>> even mention opts->sample_user_regs.
> >>>
> >>> Then, a second patch would fix the opt->sample_user_regs request clash
> >>> with --callgraph dwarf, i.e. it would do something like:
> >>>
> >>> if ((opts->sample_regs_user & DWARF_REGS_MASK) != DWARF_REGS_MASK) {
> >>> char * ip = (opts->sample_regs_user & (1ULL << PERF_REG_IP)) ? NULL : "IP",
> >>> * sp = (opts->sample_regs_user & (1ULL << PERF_REG_SP)) ? NULL : "SP",
> >>> * all = (!ip && !sp) ? "s" : "";
> >>>
> >>> pr_warning("WARNING: specified --user-regs register set doesn't include register%s "
> >>> "needed by also specified --call-graph=dwarf, auto adding %s%s%s register%s.\n",
> >>> all, ip, all : ", " : "", sp, all);
> >>> }
> >>>
> >>> This if and only if all the registers that are needed to do DWARF
> >>> unwinding are just IP and BP, which doesn't look like its true, since
> >>> when no --user_regs is set (i.e. opts->user_regs is not set) then we
> >>> continue asking for PERF_REGS_MASK...
> >>>
> >>> Can you check where I'm missing something?
> >>
> >> 1. -g call-graph dwarf,K full_regs
> >> 2. --user-regs=user_regs user_regs
> >> 3. -g call-graph dwarf,K --user-regs=user_regs user_regs + dwarf_regs
> >>
> >> The default behavior stays the same for cases 1, 2 above.
> >> For case 3 register set becomes the one asked using --user_regs option.
> >> If the option value misses IP or SP or the both then they are explicitly
> >> added to the option value and a warning message mentioning the exact
> >> added registers is provided.
> >
> >>> Jiri DWARF unwind uses just IP and SP? Looking at
> >>> tools/perf/util/unwind-libunwind-local.c's access_reg() I don't think
> >>> so, right?
> >
> >> If you ask me, AFAIK, DWARF unwind rules sometimes can refer additional
> >> general purpose registers for frames boundaries calculation.
> >
> > :-) So that DWARF_REGS is misleading, should be something like
> > DWARF_MINIMAL_REGS, as we may need other registers, so the original code
> > was correct, right?
>
> Right. Actually came to the same conclusion with the same naming for IP,SP mask :)
>
> >
> > After all if the user asks for both --call-graph dwarf and --user-regs,
> > then probably we should require --force? I.e. the message then would be:
> >
> > "
> > WARNING: The use of --call-graph=dwarf may require all the user
> > registers, specifying a subset with --user-regs may render DWARF
> > unwinding unreliable, please use --force if you're sure that the subset
> > specified via --user-regs is enough for your specific use case.
> > "
> >
> > And then plain refuse, if the user _really_ wants it, then we have
> > --force/-f for those cases.
> >
> > Does this sound better?
>
> If --user-regs is specified jointly with --call-graph dwarf option then
> --user-regs already serves as the --force and, IMHO, a warning does the best.
> The ideal solution, I could imagine, is to also dynamically calculate regs
> set extension and provide it in the warning, but it is only for two registers.
>
> So, if --call-graph dwarf --user-regs=A,B,C are specified jointly then
> "
> WARNING: The use of --call-graph=dwarf may require all the user registers,
> specifying a subset with --user-regs may render DWARF unwinding unreliable,
> so the minimal registers set (IP, SP) is explicitly forced.
> "
I think with this wording and the renaming of DWARF_REGS to
DWARF_MINIMAL_REGS it should be enough.
- Arnaldo
* Re: [PATCH v4] perf record: collect user registers set jointly with dwarf stacks
2019-05-30 18:04 ` Arnaldo Carvalho de Melo
@ 2019-05-30 18:15 ` Alexey Budankov
0 siblings, 0 replies; 8+ messages in thread
From: Alexey Budankov @ 2019-05-30 18:15 UTC
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
Ingo Molnar, Andi Kleen, linux-kernel
On 30.05.2019 21:04, Arnaldo Carvalho de Melo wrote:
> Em Thu, May 30, 2019 at 07:24:57PM +0300, Alexey Budankov escreveu:
>>
>> On 30.05.2019 16:13, Arnaldo Carvalho de Melo wrote:
>>> Em Thu, May 30, 2019 at 11:24:49AM +0300, Alexey Budankov escreveu:
>>>> On 29.05.2019 22:25, Arnaldo Carvalho de Melo wrote:
>>>>> Em Wed, May 29, 2019 at 05:30:49PM +0300, Alexey Budankov escreveu:
>>>> <SNIP>
>>>>>> +++ b/tools/perf/util/evsel.c
>>>>>> +#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
>>>>>> + (1ULL << PERF_REG_SP))
>>>>>> +
>>>>>> static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
>>>>>> struct record_opts *opts,
>>>>>> struct callchain_param *param)
>>>>>> @@ -702,7 +705,13 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
>>>>>> if (!function) {
>>>>>> perf_evsel__set_sample_bit(evsel, REGS_USER);
>>>>>> perf_evsel__set_sample_bit(evsel, STACK_USER);
>>>>>> - attr->sample_regs_user |= PERF_REGS_MASK;
>>>>>> + if (opts->sample_user_regs) {
>>>>>
>>>>> Where are you checking that opts->sample_user_regs doesn't have either
>>>>> IP or SP?
>>>>
>>>> Sure. The intention was to avoid such a complication, merge the two
>>>> masks and provide an explicit warning that the resulting mask is extended.
>>>
>>> s/is/may be/g
>>>
>>>> If you still see the checking and auto detection of the exact mask
>>>> extension as essential it can be implemented.
>>>
>>> perf, tracing, systems internals, etc are super complicated, full of
>>> details, the more precise we can make the messages, the better.
>>>
>>>>> So, __perf_evsel__config_callchain is the routine that sets up the
>>>>> attr->sample_regs_user when callchains are asked for, and what was it
>>>>> doing? Asking for _all_ user regs, right?
>>>>>
>>>>> I.e. what you're saying is that when --call-graph dwarf is asked for,
>>>>> then only IP and SP are needed, and we should stop doing that, so that
>>>>> would be a first patch, if that is the case. I.e. a patch that doesn't
>>>>> even mention opts->sample_user_regs.
>>>>>
>>>>> Then, a second patch would fix the opts->sample_user_regs request clash
>>>>> with --call-graph dwarf, i.e. it would do something like:
>>>>>
>>>>> if ((opts->sample_user_regs & DWARF_REGS_MASK) != DWARF_REGS_MASK) {
>>>>>         /* Empty string when the register was already requested by the user. */
>>>>>         const char *ip = (opts->sample_user_regs & (1ULL << PERF_REG_IP)) ? "" : "IP",
>>>>>                    *sp = (opts->sample_user_regs & (1ULL << PERF_REG_SP)) ? "" : "SP",
>>>>>                    *all = (*ip && *sp) ? "s" : "";
>>>>>
>>>>>         pr_warning("WARNING: specified --user-regs register set doesn't include register%s "
>>>>>                    "needed by also specified --call-graph=dwarf, auto adding %s%s%s register%s.\n",
>>>>>                    all, ip, *all ? ", " : "", sp, all);
>>>>> }
>>>>>
>>>>> This if and only if all the registers that are needed to do DWARF
>>>>> unwinding are just IP and SP, which doesn't look like it's true, since
>>>>> when no --user-regs is set (i.e. opts->sample_user_regs is not set) then we
>>>>> continue asking for PERF_REGS_MASK...
>>>>>
>>>>> Can you check where I'm missing something?
>>>>
>>>> 1. -g call-graph dwarf,K                         full_regs
>>>> 2. --user-regs=user_regs                         user_regs
>>>> 3. -g call-graph dwarf,K --user-regs=user_regs   user_regs + dwarf_regs
>>>>
>>>> The default behavior stays the same for cases 1, 2 above.
>>>> For case 3 the register set becomes the one asked for using the --user-regs option.
>>>> If the option value misses IP or SP or both, then they are explicitly
>>>> added to the option value and a warning message mentioning the exact
>>>> added registers is provided.
>>>
>>>>> Jiri DWARF unwind uses just IP and SP? Looking at
>>>>> tools/perf/util/unwind-libunwind-local.c's access_reg() I don't think
>>>>> so, right?
>>>
>>>> If you ask me, AFAIK, DWARF unwind rules can sometimes refer to additional
>>>> general purpose registers for frame boundary calculation.
>>>
>>> :-) So that DWARF_REGS is misleading, should be something like
>>> DWARF_MINIMAL_REGS, as we may need other registers, so the original code
>>> was correct, right?
>>
>> Right. Actually came to the same conclusion with the same naming for IP,SP mask :)
>>
>>>
>>> After all if the user asks for both --call-graph dwarf and --user-regs,
>>> then probably we should require --force? I.e. the message then would be:
>>>
>>> "
>>> WARNING: The use of --call-graph=dwarf may require all the user
>>> registers, specifying a subset with --user-regs may render DWARF
>>> unwinding unreliable, please use --force if you're sure that the subset
>>> specified via --user-regs is enough for your specific use case.
>>> "
>>>
>>> And then plain refuse, if the user _really_ wants it, then we have
>>> --force/-f for those cases.
>>>
>>> Does this sound better?
>>
>> If --user-regs is specified jointly with the --call-graph dwarf option then
>> --user-regs already serves as the --force and, IMHO, a warning works best.
>
>> The ideal solution, I could imagine, is to also dynamically calculate regs
>> set extension and provide it in the warning, but it is only for two registers.
>>
>> So, if --call-graph dwarf --user-regs=A,B,C are specified jointly then
>> "
>> WARNING: The use of --call-graph=dwarf may require all the user registers,
>> specifying a subset with --user-regs may render DWARF unwinding unreliable,
>> so the minimal registers set (IP, SP) is explicitly forced.
>> "
>
> I think with this wording and the renaming of DWARF_REGS to
> DWARF_MINIMAL_REGS it should be enough.
Well, let's have it like this in v5.
Thanks,
Alexey
>
> - Arnaldo
>
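
The behaviour agreed on for v5 can be summarised with a small stand-alone sketch. This is an assumption-laden illustration rather than the actual perf code: FULL_REGS_MASK reuses the 0xff0fff value printed in the sample dump of the patch description as a stand-in for PERF_REGS_MASK, the bit positions are the x86-64 ones, and sample_regs_user_mask() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: x86-64 perf register bit positions (SP = 7, IP = 8). */
enum { REG_SP = 7, REG_IP = 8 };

/* Assumption: stand-in for PERF_REGS_MASK, taken from the x86-64 dump. */
#define FULL_REGS_MASK     0xff0fffULL
#define DWARF_MINIMAL_REGS ((1ULL << REG_IP) | (1ULL << REG_SP))

/*
 * Hypothetical helper: pick the user register mask for a sample.
 *  - dwarf call graph only:        full register set
 *  - --user-regs only:             exactly what the user asked for
 *  - dwarf call graph + user regs: user set with IP and SP forced in,
 *                                  and *warn set so the caller can warn
 */
static uint64_t sample_regs_user_mask(bool dwarf_callgraph, uint64_t user_regs,
				      bool *warn)
{
	*warn = false;

	if (!dwarf_callgraph)
		return user_regs;
	if (!user_regs)
		return FULL_REGS_MASK;

	*warn = true;
	return user_regs | DWARF_MINIMAL_REGS;
}
```

With --user-regs=BP (bit 6) and a dwarf call graph, this yields 0x1c0, which matches the mask shown in the patch description.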
* [PATCH v4] perf record: collect user registers set jointly with dwarf stacks
@ 2019-05-23 11:20 Alexey Budankov
0 siblings, 0 replies; 8+ messages in thread
From: Alexey Budankov @ 2019-05-23 11:20 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
Ingo Molnar, Andi Kleen, linux-kernel
When dwarf stacks are collected jointly with user specified register
set using --user-regs option like below the full register context is
still captured on a sample:
$ perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- stack_test2.g.O3
188143843893585 0x6b48 [0x4f8]: PERF_RECORD_SAMPLE(IP, 0x4002): 23828/23828: 0x401236 period: 1363819 addr: 0x7ffedbdd51ac
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
.... AX 0x53b
.... BX 0x7ffedbdd3cc0
.... CX 0xffffffff
.... DX 0x33d3a
.... SI 0x7f09b74c38d0
.... DI 0x0
.... BP 0x401260
.... SP 0x7ffedbdd3cc0
.... IP 0x401236
.... FLAGS 0x20a
.... CS 0x33
.... SS 0x2b
.... R8 0x7f09b74c3800
.... R9 0x7f09b74c2da0
.... R10 0xfffffffffffff3ce
.... R11 0x246
.... R12 0x401070
.... R13 0x7ffedbdd5db0
.... R14 0x0
.... R15 0x0
... ustack: size 1024, offset 0xe0
. data_src: 0x5080021
... thread: stack_test2.g.O:23828
...... dso: /root/abudanko/stacks/stack_test2.g.O3
After applying the change suggested in the patch, the sample data contain
only the user-specified register values. The IP and SP registers (dwarf_regs)
are collected anyway, regardless of the --user-regs option value provided
on the command line:
-g call-graph dwarf,K                         full_regs
-g call-graph dwarf,K --user-regs=user_regs   user_regs + dwarf_regs
--user-regs=user_regs                         user_regs
$ perf record -g --call-graph dwarf,1024 --user-regs=BP -- ls
WARNING: specified --user-regs register set doesn't include registers needed by also specified --call-graph=dwarf, auto adding IP, SP registers.
arch COPYING Documentation include Kbuild lbuild MAINTAINERS modules.builtin Module.symvers perf.data.old scripts System.map virt
block CREDITS drivers init Kconfig lib Makefile modules.builtin.modinfo net README security tools vmlinux
certs crypto fs ipc kernel LICENSES mm modules.order perf.data samples sound usr vmlinux.o
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]
188368474305373 0x5e40 [0x470]: PERF_RECORD_SAMPLE(IP, 0x4002): 23839/23839: 0x401236 period: 1260507 addr: 0x7ffd3d85e96c
... FP chain: nr:0
... user regs: mask 0x1c0 ABI 64-bit
.... BP 0x401260
.... SP 0x7ffd3d85cc20
.... IP 0x401236
... ustack: size 1024, offset 0x58
. data_src: 0x5080021
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
Changes in v4:
- added warning message about dwarf registers unconditionally
included into the collected registers set
Changes in v3:
- avoid changes in platform specific header files
Changes in v2:
- implemented dwarf register set to avoid corrupted trace
when --user-regs option value omits IP,SP
---
tools/perf/util/evsel.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a6f572a40deb..05b403ba0ded 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -669,6 +669,9 @@ int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size)
return ret;
}
+#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
+ (1ULL << PERF_REG_SP))
+
static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
struct record_opts *opts,
struct callchain_param *param)
@@ -702,7 +705,14 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
if (!function) {
perf_evsel__set_sample_bit(evsel, REGS_USER);
perf_evsel__set_sample_bit(evsel, STACK_USER);
- attr->sample_regs_user |= PERF_REGS_MASK;
+ if (opts->sample_user_regs) {
+ attr->sample_regs_user |= DWARF_REGS_MASK;
+ pr_warning("WARNING: specified --user-regs register set doesn't include registers "
+ "needed by also specified --call-graph=dwarf, auto adding IP, SP registers.\n");
+
+ } else {
+ attr->sample_regs_user |= PERF_REGS_MASK;
+ }
attr->sample_stack_user = param->dump_size;
attr->exclude_callchain_user = 1;
} else {
--
2.20.1
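
For readers checking the dumps in the patch description, the "user regs: mask" values can be decoded by hand. The sketch below is an illustration only: the name table assumes the x86-64 perf register numbering (AX = bit 0 through R15 = bit 23), and decode_regs_mask() is a hypothetical helper, not a perf function.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: x86-64 perf register numbering, bit N maps to entry N. */
static const char *const x86_64_reg_names[] = {
	"AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP",
	"IP", "FLAGS", "CS", "SS", "DS", "ES", "FS", "GS",
	"R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15",
};

/*
 * Hypothetical helper: write the space-separated names of the registers
 * selected by mask into buf and return how many names were written.
 * Decoding stops early if buf is too small for the next name.
 */
static int decode_regs_mask(uint64_t mask, char *buf, size_t size)
{
	size_t nregs = sizeof(x86_64_reg_names) / sizeof(x86_64_reg_names[0]);
	size_t used = 0;
	int count = 0;

	if (size)
		buf[0] = '\0';

	for (size_t bit = 0; bit < nregs; bit++) {
		int n;

		if (!(mask & (1ULL << bit)))
			continue;

		n = snprintf(buf + used, size - used, "%s%s",
			     count ? " " : "", x86_64_reg_names[bit]);
		if (n < 0 || (size_t)n >= size - used) {
			/* Drop the partially written name. */
			if (used < size)
				buf[used] = '\0';
			break;
		}
		used += (size_t)n;
		count++;
	}
	return count;
}
```

Under these assumptions, 0x1c0 decodes to BP, SP and IP, and 0xff0fff decodes to the twenty registers listed in the first dump.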
end of thread, other threads:[~2019-05-30 18:15 UTC | newest]
Thread overview: 8+ messages
2019-05-29 14:30 [PATCH v4] perf record: collect user registers set jointly with dwarf stacks Alexey Budankov
2019-05-29 19:25 ` Arnaldo Carvalho de Melo
2019-05-30 8:24 ` Alexey Budankov
2019-05-30 13:13 ` Arnaldo Carvalho de Melo
2019-05-30 16:24 ` Alexey Budankov
2019-05-30 18:04 ` Arnaldo Carvalho de Melo
2019-05-30 18:15 ` Alexey Budankov
-- strict thread matches above, loose matches on Subject: below --
2019-05-23 11:20 Alexey Budankov