LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Re: [PATCH] drivers/perf: arm-ccn: stop spamming dmesg in event_init
       [not found]       ` <cee9c043-3d64-56b9-4f23-e8c8370387dd@gmail.com>
@ 2018-05-17  0:25         ` Kim Phillips
  2018-05-17 15:07           ` Matt Sealey
  0 siblings, 1 reply; 2+ messages in thread
From: Kim Phillips @ 2018-05-17  0:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Florian Fainelli, Pawel Moll, Mark Rutland, Will Deacon,
	linux-arm-kernel, linux-kernel, linux-perf-users, Robin Murphy,
	Ingo Molnar, Linus Torvalds, Matt Sealey, mathieu.poirier,
	peterz, alexander.shishkin, suzuki.poulose

[adding acme, LKML, linux-perf-users, and other potentially interested]

On Wed, 9 May 2018 17:22:42 -0700
Florian Fainelli <f.fainelli@gmail.com> wrote:

> On 05/09/2018 04:41 PM, Kim Phillips wrote:
> > On Wed, 9 May 2018 17:06:43 +0100
> > Pawel Moll <pawel.moll@arm.com> wrote:
> > 
> >> On Wed, 2018-05-09 at 00:40 +0100, Kim Phillips wrote:
> >>> On Fri,  4 May 2018 11:41:17 +0100 Mark Rutland <mark.rutland@arm.com> wrote:
> >>>
> >>> [adding Pawel, arm-ccn driver author]
> >>
> >> We had this discussion way too many times for my liking. As I said
> >> before - *I* will be fine with the debug messages in the CCN driver.
> >> Now, if there ever turns out to be other user of this thing and gets
> > 
> > Sure, this isn't just about Pawel using the driver he wrote - we know
> > you know how to use it because you wrote it.  No, it's about all the
> > other potential users out there, esp. first time users, as I once was.
> > 
> >> into problems with event configuration, I'd hope that he/she can count
> >> on support from the knowledgable people here... (just checked and both
> > 
> > I abhor having to suggest our users email this list in order to find out
> > how to use the PMU drivers.  First time users are going to tend to
> > steer completely away if they don't have the patience to debug a silent
> > PMU, rather than email this mailing list - sorry, but that's just
> > adding a huge usage barrier - for what - fuzzer runner's convenience?
> > 
> >> RHEL 7.5 and Ubuntu 16.04.3 kernels on AArch64 come with
> >> CONFIG_DYNAMIC_DEBUG=y, so it's a matter of explaining what has to be
> >> enabled where; sadly this information did not find its way into the
> >> commit description)
> > 
> > I can't think of a more difficult to find, more far-away, alien
> > place for end users to find help with perf, sorry.
> > 
> >>>> The ARM CCN PMU driver uses dev_warn() to complain about parameters
> >>> in
> >>>> the user-provided perf_event_attr. This means that under normal
> >>>> operation (e.g. a single invocation of the perf tool), dmesg may be
> >>>> spammed with multiple messages.
> >>
> >> Surely Mark, in his role as maintainer of drivers/perf/ (and a few
> >> other locations), meant to use much more technical and emotion-free
> >> subject, along the lines of "reduce a number of dmesg warnings at event
> >> init".
> > 
> > 'reduce a number' is the wrong word:  warnings are completely
> > eliminated. Debug-level messages occur at exactly the same
> > frequency/amount.
> > 
> > But I still object to the rationale overall - it seems this is about
> > running fuzzers?  I even offered an alternative for fuzzer runners: is
> > modifying the loglevel prior to fuzzing somehow unacceptable?
> 
> I don't have any dog in this, but maybe if providing information to the
> users is so essential to having a pleasant user experience, then
> rethinking the whole way these messages are funneled is necessary
> because the kernel log + dmesg is by no means appropriate. Take a look
> at what the networking maintainers recently did with netlink extended
> ack. You used to just get an "unsupported" error, and now you know
> exactly what is wrong when extack is available. It seems to me like
> something like this is what you might want here since you want to have
> perf be as user friendly as possible.

Thanks, Florian.

Acme & other perf people, do you foresee a problem adding netlink
extended ack support to the perf subsystem for extended error message
communication?

If not, is struct perf_event_attr amenable to having error reporting
bit(s) added?  Recall my last attempt failed because it couldn't
discriminate between the perf core and the PMU driver returning the
-Evalue:

https://patchwork.kernel.org/patch/10025555/

Thanks,

Kim

^ permalink raw reply	[flat|nested] 2+ messages in thread

* RE: [PATCH] drivers/perf: arm-ccn: stop spamming dmesg in event_init
  2018-05-17  0:25         ` [PATCH] drivers/perf: arm-ccn: stop spamming dmesg in event_init Kim Phillips
@ 2018-05-17 15:07           ` Matt Sealey
  0 siblings, 0 replies; 2+ messages in thread
From: Matt Sealey @ 2018-05-17 15:07 UTC (permalink / raw)
  To: Kim Phillips, Arnaldo Carvalho de Melo
  Cc: Florian Fainelli, Pawel Moll, Mark Rutland, Will Deacon,
	linux-arm-kernel, linux-kernel, linux-perf-users, Robin Murphy,
	Ingo Molnar, Linus Torvalds, mathieu.poirier, peterz,
	alexander.shishkin, Suzuki Poulose

Hi all,

> > I don't have any dog in this, but maybe if providing information to the
> > users is so essential to having a pleasant user experience, then
> > rethinking the whole way these messages are funneled is necessary
> > because the kernel log + dmesg is by no means appropriate. Take a look
> > at what the networking maintainers recently did with netlink extended
> > ack. You used to just get an "unsupported" error, and now you know
> > exactly what is wrong when extack is available. It seems to me like
> > something like this is what you might want here since you want to have
> > perf be as user friendly as possible.
>
> Thanks, Florian.

Florian, I'd love to know if you mean "implement netlink extended ack in
perf" or "do something idiomatic for perf"?

Let us assume we are semantically challenged over here. I'm going to
proceed from the latter.

> Acme & other perf people, do you foresee a problem adding netlink
> extended ack support to the perf subsystem for extended error message
> communication?
>
> If not, is struct perf_event_attr amenable to having error reporting
> bit(s) added?

I did have a think about this when Kim mentioned it in passing, and my
reasoning was that a serialized error record similar to ACPI APEI BERT/ERST
tables (without the NVRAM abstraction) would fit the need.

As soon as I wrote down my thoughts I realized it scratch more itches than
just something useful for perf.

It would conceptually be a buffer passed from userspace with the syscall,
which would be populated in the kernel with an identifying header, each
record denoting its own length. The syscall fills the buffer with records
of particular format for the syscall, and the errno returned to the
application is then the state of the error recording buffer - for instance,
if the kernel ran out of space in the buffer before reporting all the errors
it could (-ENOBUFS or -EAGAIN), if it stopped recording errors on something
that requires being contained and returned at that point (-EFAULT), and so
on (anyone who's got RAS on the brain will see where I'm coming from).

I have a distinct dislike of filling the kernel with const strings, so the
records would be strictly machine-readable and contain information about
the error for the record and the source. If userspace needs to print a
string then it can look up unique identifiers (UUIDv1, give or take, to
remove the need for any authority on numbering) in a database - be that
plaintext, gettext.po, bdb, sqlite, xml, json, C structure embedded in the
tool - one for each error source. That keeps strings, translations, string
formatting entirely outside the kernel, and keeps records from being freeform
typo-laden strings.

That'd give some generic Producer code in the kernel, and imply a companion
Consumer library in userspace (with said database backend), which could also
be responsible for logging the binary records somewhere for future reference
(perhaps bounded by capabilities or container privileges).

Pretty much every syscall that has problems returning 'just an errno'
could benefit from such a system, the only impediment I can see to it is
that it's adding a new subsystem to the kernel to produce these records,
and any syscall that needs it would have to gain either an extra parameter,
or attribute setting addition (like setsockopts) or to shoehorn the buffer
pointer into an existing parameter structure like perf_event_attr. That
would have to be locked down to a consistent method that would be
recommended (like adding a new syscall interface, not everyone has an
attribute interface or parameter structure) although there's no stopping
kernel devs adding in a way for legacy applications to easily receive
the same information through existing ABI.

In the case of perf_event_attr there is space enough to mark a bit to say
that errors could be reported in a buffer, but not enough in the reserved
space that exists to store a pointer for 64-bit systems (or 128-bit ones..)
without increasing its size. But, besides that, it would have the benefit
of simply being serialized with the syscall, and not a supplemental,
potentially non-thread-safe errno/strerror-like kernel-side implementation,
nor extra syscalls to retrieve information or arbitrary formatted or
unformatted strings.

Netlink extended ack could benefit from it simply by having been passed
an nlattr pointing to the buffer and recording extended error information
in it - the extended ack structure can report the status of the buffer
and use the cookie field to reproduce some information if necessary (which
extends the RAS error record concept further when needed).

Thoughts? Does anyone have any objections to a RAS-like error reporting
system for system calls, or any ideas on things that would benefit from
it above and beyond perf? We could always audit all the system calls and
their behaviors so we could do some worked examples but anyone who's got
a good candidate outside perf is welcome to suggest it.

Ta,
Matt




IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2018-05-17 15:07 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20180504104117.8086-1-mark.rutland@arm.com>
     [not found] ` <20180508184050.75c60b82ba538d0ecdeb2d78@arm.com>
     [not found]   ` <1525882003.4199.55.camel@arm.com>
     [not found]     ` <20180509184154.9994df2874e7a08025b56dea@arm.com>
     [not found]       ` <cee9c043-3d64-56b9-4f23-e8c8370387dd@gmail.com>
2018-05-17  0:25         ` [PATCH] drivers/perf: arm-ccn: stop spamming dmesg in event_init Kim Phillips
2018-05-17 15:07           ` Matt Sealey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).