LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] arch/x86/entry/vsyscall/vsyscall_gtod.c: remove __read_mostly from vclocks_used
@ 2018-06-04 17:27 Michael Rodin
  2018-06-22 22:47 ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Michael Rodin @ 2018-06-04 17:27 UTC (permalink / raw)
  To: tglx, mingo, x86; +Cc: hpa, linux-kernel, Michael Rodin

The variable "vclocks_used" doesn't appear to be "read mostly".
Measurements of the access frequency with perf stat [1] and
perf report show, that approximately half of the accesses to
this variable are write accesses and happen in update_vsyscall()
in the file arch/x86/entry/vsyscall/vsyscall_gtod.c.
The measurements were done with the kernel 4.13.0-43-generic used by
ubuntu as well as with the stable kernel 4.16.7 with a custom config.
I've used "perf bench sched" and iperf3 as workloads.

This was discovered during my master thesis in the CADOS project [2].

[1] perf stat --all-cpus --group --event="{mem:addr/4:rw:k,mem:addr/4:w:k}"
[2] https://www.sra.uni-hannover.de/Research/CADOS/

Signed-off-by: Michael Rodin <michael-git@rodin.online>
---
 arch/x86/entry/vsyscall/vsyscall_gtod.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/entry/vsyscall/vsyscall_gtod.c b/arch/x86/entry/vsyscall/vsyscall_gtod.c
index e1216dd95c04..fc39f701fe4a 100644
--- a/arch/x86/entry/vsyscall/vsyscall_gtod.c
+++ b/arch/x86/entry/vsyscall/vsyscall_gtod.c
@@ -17,7 +17,7 @@
 #include <asm/vgtod.h>
 #include <asm/vvar.h>
 
-int vclocks_used __read_mostly;
+int vclocks_used;
 
 DEFINE_VVAR(struct vsyscall_gtod_data, vsyscall_gtod_data);
 
-- 
2.14.1

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] arch/x86/entry/vsyscall/vsyscall_gtod.c: remove __read_mostly from vclocks_used
  2018-06-04 17:27 [PATCH] arch/x86/entry/vsyscall/vsyscall_gtod.c: remove __read_mostly from vclocks_used Michael Rodin
@ 2018-06-22 22:47 ` Thomas Gleixner
  2018-06-22 23:25   ` Andy Lutomirski
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Gleixner @ 2018-06-22 22:47 UTC (permalink / raw)
  To: Michael Rodin
  Cc: x86, H. Peter Anvin, Ingo Molnar, LKML, Peter Zijlstra, Andy Lutomirski

Michael,

On Mon, 4 Jun 2018, Michael Rodin wrote:

> The variable "vclocks_used" doesn't appear to be "read mostly".
> Measurements of the access frequency with perf stat [1] and
> perf report show, that approximately half of the accesses to
> this variable are write accesses and happen in update_vsyscall()
> in the file arch/x86/entry/vsyscall/vsyscall_gtod.c.
> The measurements were done with the kernel 4.13.0-43-generic used by
> ubuntu as well as with the stable kernel 4.16.7 with a custom config.
> I've used "perf bench sched" and iperf3 as workloads.
> 
> This was discovered during my master thesis in the CADOS project [2].

Nice find, but ...

The point is that the content of that variable changes once in a blue moon,
so the intent of marking it read_mostly is almost correct. Almost, because
the unconditional write even with the ever repeating same value is not
free. To the best of my knowledge there is no CPU implementation which
supports optimizing silent stores, but I might be just not up to date or
wrong as usual.

So there is a cost to the write and there are two things which have to be
looked at before just doing the obvious knee jerk reaction
's/read_mostly//'.

1) Does it make sense to avoid the write if the value hasn't changed? That
   might be pretty much a free lunch because the variable has to be read in
   the first place in order to OR the supplied value to it before writing
   it back.

2) The variable is in a totally separate cache line than what the following
   stores target, i.e. vsyscall_gtod_data

So it might make sense just to do #1 but as vsyscall_gtod_data is anyway
going to be dirtied it probably makes even more sense to move that
vclocks_used variable into struct vsyscall_gtod_data right away to reduce
the amount of cache lines touched by update_vsyscall().

Care to have a look at that?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] arch/x86/entry/vsyscall/vsyscall_gtod.c: remove __read_mostly from vclocks_used
  2018-06-22 22:47 ` Thomas Gleixner
@ 2018-06-22 23:25   ` Andy Lutomirski
  2018-06-23  9:04     ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Lutomirski @ 2018-06-22 23:25 UTC (permalink / raw)
  To: Thomas Gleixner, Stephen Boyd, John Stultz
  Cc: michael-git, X86 ML, H. Peter Anvin, Ingo Molnar, LKML,
	Peter Zijlstra, Andrew Lutomirski

On Fri, Jun 22, 2018 at 3:47 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Michael,
>
> On Mon, 4 Jun 2018, Michael Rodin wrote:
>
> > The variable "vclocks_used" doesn't appear to be "read mostly".
> > Measurements of the access frequency with perf stat [1] and
> > perf report show, that approximately half of the accesses to
> > this variable are write accesses and happen in update_vsyscall()
> > in the file arch/x86/entry/vsyscall/vsyscall_gtod.c.
> > The measurements were done with the kernel 4.13.0-43-generic used by
> > ubuntu as well as with the stable kernel 4.16.7 with a custom config.
> > I've used "perf bench sched" and iperf3 as workloads.
> >
> > This was discovered during my master thesis in the CADOS project [2].
>
> Nice find, but ...
>
> The point is that the content of that variable changes once in a blue moon,
> so the intent of marking it read_mostly is almost correct.

I would propose a rather different fix.  Add a an
arch_change_clocksource() function.  Do:

static inline void arch_change_clocksource(struct clocksource
*new_clocksource) { ... }
#define arch_change_clocksource arch_change_clocksource

and

#ifndef arch_change_clocksource
static inline void arch_change_clocksource(struct clocksource
*new_clocksource) {}
#endif

in the generic header.  In change_clocksource(), add a call to
arch_change_clocksource() right after tk_setup_internals().  In x86's
arch_change_clocksource, update vclocks_used.

Now it's genuinely read_mostly, and we don't need to touch that
cacheline at all in the normal clock tick code.  Everyone wins.
(vclocks_used is actually rather rarely read.  It's only used to
prevent user code from accessing a never-used clocksource through the
vvar area, which is a hardening measure.  It's only referenced from
the vvar fault handler code.)

--Andy

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] arch/x86/entry/vsyscall/vsyscall_gtod.c: remove __read_mostly from vclocks_used
  2018-06-22 23:25   ` Andy Lutomirski
@ 2018-06-23  9:04     ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2018-06-23  9:04 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Stephen Boyd, John Stultz, michael-git, X86 ML, H. Peter Anvin,
	Ingo Molnar, LKML, Peter Zijlstra

On Fri, 22 Jun 2018, Andy Lutomirski wrote:
> On Fri, Jun 22, 2018 at 3:47 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Mon, 4 Jun 2018, Michael Rodin wrote:
> >
> > > The variable "vclocks_used" doesn't appear to be "read mostly".
> > > Measurements of the access frequency with perf stat [1] and
> > > perf report show, that approximately half of the accesses to
> > > this variable are write accesses and happen in update_vsyscall()
> > > in the file arch/x86/entry/vsyscall/vsyscall_gtod.c.
> > > The measurements were done with the kernel 4.13.0-43-generic used by
> > > ubuntu as well as with the stable kernel 4.16.7 with a custom config.
> > > I've used "perf bench sched" and iperf3 as workloads.
> > >
> > > This was discovered during my master thesis in the CADOS project [2].
> >
> > Nice find, but ...
> >
> > The point is that the content of that variable changes once in a blue moon,
> > so the intent of marking it read_mostly is almost correct.
> 
> I would propose a rather different fix.  Add a an
> arch_change_clocksource() function.  Do:
> 
> static inline void arch_change_clocksource(struct clocksource
> *new_clocksource) { ... }
> #define arch_change_clocksource arch_change_clocksource
> 
> and
> 
> #ifndef arch_change_clocksource
> static inline void arch_change_clocksource(struct clocksource
> *new_clocksource) {}
> #endif
>
> in the generic header.  In change_clocksource(), add a call to
> arch_change_clocksource() right after tk_setup_internals().  In x86's
> arch_change_clocksource, update vclocks_used.
> 
> Now it's genuinely read_mostly, and we don't need to touch that
> cacheline at all in the normal clock tick code.  Everyone wins.
> (vclocks_used is actually rather rarely read.  It's only used to
> prevent user code from accessing a never-used clocksource through the
> vvar area, which is a hardening measure.  It's only referenced from
> the vvar fault handler code.)

Agreed.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-06-23  9:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-04 17:27 [PATCH] arch/x86/entry/vsyscall/vsyscall_gtod.c: remove __read_mostly from vclocks_used Michael Rodin
2018-06-22 22:47 ` Thomas Gleixner
2018-06-22 23:25   ` Andy Lutomirski
2018-06-23  9:04     ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).