LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Namhyung Kim <namhyung@kernel.org>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] perf script python: Allow reporting [un]throttle
Date: Wed, 1 Sep 2021 10:31:59 -0300	[thread overview]
Message-ID: <YS+AzzoPci74EmWp@kernel.org> (raw)
In-Reply-To: <3035e557-9752-9277-6a84-f6d65a6a119e@oracle.com>

Em Tue, Aug 31, 2021 at 02:47:47PM -0700, Stephen Brennan escreveu:
> On 8/31/21 11:46 AM, Jiri Olsa wrote:
> > On Mon, Aug 16, 2021 at 05:21:33PM -0700, Stephen Brennan wrote:
> > > perf_events may sometimes throttle an event due to creating too many
> > > samples during a given timer tick. As of now, the perf tool will not
> > > report on throttling, which means this is a silent error. Implement a
> > > callback for the throttle and unthrottle events within the Python
> > > scripting engine, which can allow scripts to detect and report when
> > > events may have been lost due to throttling.

> > > A python script could simply define throttle() and unthrottle()
> > > functions to begin receiving them, e.g.:

> > > ```
> > > from __future__ import print_function

> > > def process_event(param_dict):
> > >      print("event cpu={} time={}".format(
> > >          param_dict["sample"]["cpu"], param_dict["sample"]["time"]))

> > > def throttle(*args):
> > >      print("throttle(time={}, cpu={}, pid={}, tid={})".format(*args))

> > > def unthrottle(*args):
> > >      print("unthrottle(time={}, cpu={}, pid={}, tid={})".format(*args))
> > > ```

> > throttle event has also 'id' and 'stream_id' I guess you don't
> > need it, but maybe we should add it to be complete

> I tried adding these from struct perf_sample.id and struct
> perf_sample.sample_id respectively. I then tested these on a perf.data which
> contains a throttle event. The values which my python script reported were
> 0xFFFFFFFFFFFFFFFF -- is this expected? If so, I'll send my revised patch.

So, perf_sample.X is only defined if PERF_SAMPLE_X is in
perf_event_attr.sample_type, and for the perf.data file below it is:

  sample_type = IP|TID|TIME

So only perf_sample.ip, perf_sample.tid and perf_sample.time are set,
not perf_sample.id nor perf_sample.stream_id.

What Jiri suggests is to use the _fixed_ payload for
PERF_SAMPLE_THROTTLE/UNTHROTTLE, which is, according to
include/uapi/linux/perf_event.h:

         * struct {
         *      struct perf_event_header        header;
         *      u64                             time;          <<<<
         *      u64                             id;            <<<<
         *      u64                             stream_id;     <<<<
         *      struct sample_id                sample_id;
         * };
         */
        PERF_RECORD_THROTTLE                    = 5,
        PERF_RECORD_UNTHROTTLE                  = 6,
 

The 'time', 'id' and 'stream_id' should always be there, while what is
in 'sample_id' is documented in the same file at the beginning of the
'enum perf_event_type' definition:

         * If perf_event_attr.sample_id_all is set then all event types will
         * have the sample_type selected fields related to where/when
         * (identity) an event took place (TID, TIME, ID, STREAM_ID, CPU,
         * IDENTIFIER) described in PERF_RECORD_SAMPLE below, it will be stashed
         * just after the perf_event_header and the fields already present for
         * the existing fields, i.e. at the end of the payload. That way a newer
         * perf.data file will be supported by older perf tools, with these new
         * optional fields being ignored.
         *
         * struct sample_id {
         *      { u32                   pid, tid; } && PERF_SAMPLE_TID
         *      { u64                   time;     } && PERF_SAMPLE_TIME
         *      { u64                   id;       } && PERF_SAMPLE_ID
         *      { u64                   stream_id;} && PERF_SAMPLE_STREAM_ID
         *      { u32                   cpu, res; } && PERF_SAMPLE_CPU
         *      { u64                   id;       } && PERF_SAMPLE_IDENTIFIER
         * } && perf_event_attr::sample_id_all
         *
         * Note that PERF_SAMPLE_IDENTIFIER duplicates PERF_SAMPLE_ID.  The
         * advantage of PERF_SAMPLE_IDENTIFIER is that its position is fixed
         * relative to header.size.

There are artifacts here from the evolution of the perf_event subsystem,
i.e. throttle.id is equal to throttle.sample-id.id when PERF_SAMPLE_ID
is set in perf_event_attr.sample_type, ditto for 'time' and 'stream_id'.
entries.

But since 'time', 'id' and 'stream_id' are _always_ set in the
PERF_RECORD_THROTTLE event, we can as well make them available via perf
script python.

Take a look at ./kernel/events/core.c, function perf_log_throttle(), the
only one in the kernel that emits the PERF_RECORD_THROTTLE meta event
and see that it always set the 'time', 'id' and 'stream_id' fields:

static void perf_log_throttle(struct perf_event *event, int enable)
{
        struct perf_output_handle handle;
        struct perf_sample_data sample;
        int ret;

        struct {
                struct perf_event_header        header;
                u64                             time;
                u64                             id;
                u64                             stream_id;
        } throttle_event = {
                .header = {
                        .type = PERF_RECORD_THROTTLE,
                        .misc = 0,
                        .size = sizeof(throttle_event),
                },
                .time           = perf_event_clock(event),
                .id             = primary_event_id(event),
                .stream_id      = event->id,
        };

        if (enable)
                throttle_event.header.type = PERF_RECORD_UNTHROTTLE;

        perf_event_header__init_id(&throttle_event.header, &sample, event);

        ret = perf_output_begin(&handle, &sample, event,
                                throttle_event.header.size);
        if (ret)
                return;

        perf_output_put(&handle, throttle_event);
        perf_event__output_id_sample(event, &handle, &sample);
        perf_output_end(&handle);
}

So instead of doing:

+       tuple_set_u64(t, 0, sample->time);
+       tuple_set_s32(t, 1, sample->cpu);
+       tuple_set_s32(t, 2, sample->pid);
+       tuple_set_s32(t, 3, sample->tid);

That may not be set at all, you should do:

+       tuple_set_u64(t, 0, event->throttle.time);
+       tuple_set_u64(t, 1, event->throttle.id);
+       tuple_set_u64(t, 2, event->throttle.stream_id);

These can be left as is with 0xFFFFFFFFFFFFFFFF meaning its not set:

+       tuple_set_s32(t, 3, sample->cpu);
+       tuple_set_s32(t, 4, sample->pid);
+       tuple_set_s32(t, 5, sample->tid);

to confirm the user can look at perf_event_attr.sample_type |
PERF_SAMPLE_X.

- Arnaldo

> Thanks,
> Stephen
> 
> `perf report --header-only` from my testing perf.data:
> 
> # ========
> # captured on    : Wed Aug 11 11:55:44 2021
> # header version : 1
> # data offset    : 264
> # data size      : 18240
> # feat offset    : 18504
> # hostname : stepbren-ol7-2
> # os release : 5.4.17-2102.203.6.el7uek.x86_64
> # perf version : 5.4.17-2102.203.6.el7uek.x86_64
> # arch : x86_64
> # nrcpus online : 4
> # nrcpus avail : 4
> # cpudesc : Intel(R) Xeon(R) Platinum 8167M CPU @ 2.00GHz
> # cpuid : GenuineIntel,6,85,4
> # total memory : 30522624 kB
> # cmdline : /usr/libexec/perf.5.4.17-2102.203.6.el7uek.x86_64 record -c
> 100000 -e cycles -- sh -c echo 2000 > /proc/irqoff_sleep_millis
> # event : name = cycles, , id = { 5, 6, 7, 8 }, size = 112, { sample_period,
> sample_freq } = 100000, sample_type = IP|TID|TIME, read_format = ID,
> disabled = 1, inherit = 1, mmap = 1, comm = 1, enable_on_exec = 1, task = 1,
> sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1,
> bpf_event = 1
> # CPU_TOPOLOGY info available, use -I to display
> # NUMA_TOPOLOGY info available, use -I to display
> # pmu mappings: software = 1, uprobe = 7, cpu = 4, breakpoint = 5,
> tracepoint = 2, kprobe = 6, msr = 8
> # CACHE info available, use -I to display
> # time of first sample : 228.321751
> # time of last sample : 230.362698
> # sample duration :   2040.947 ms
> # MEM_TOPOLOGY info available, use -I to display
> # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT
> CLOCKID DIR_FORMAT COMPRESSED CPU_PMU_CAPS CLOCK_DATA HYBRID_TOPOLOGY
> HYBRID_CPU_PMU_CAPS
> # ========

  reply	other threads:[~2021-09-01 13:32 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-17  0:21 Stephen Brennan
2021-08-31 18:46 ` Jiri Olsa
2021-08-31 21:47   ` Stephen Brennan
2021-09-01 13:31     ` Arnaldo Carvalho de Melo [this message]
2021-09-01 20:47       ` Stephen Brennan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YS+AzzoPci74EmWp@kernel.org \
    --to=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=stephen.s.brennan@oracle.com \
    --subject='Re: [RFC PATCH] perf script python: Allow reporting [un]throttle' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).