LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: rcu@vger.kernel.org,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Kernel Team" <kernel-team@fb.com>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Lai Jiangshan" <jiangshanlai@gmail.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"David Howells" <dhowells@redhat.com>,
	"Eric Dumazet" <edumazet@google.com>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Joel Fernandes" <joel@joelfernandes.org>
Subject: Re: [PATCH rcu 04/18] rcu: Weaken ->dynticks accesses and updates
Date: Wed, 21 Jul 2021 14:25:15 -0700	[thread overview]
Message-ID: <20210721212515.GV4397@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <CAHk-=wgm_W82CcbiJHZPw45QRwomFbWcHkFoOd5C5hG-5GGoRQ@mail.gmail.com>

On Wed, Jul 21, 2021 at 01:41:46PM -0700, Linus Torvalds wrote:
> Hmm.
> 
> This actually seems to make some of the ordering worse.
> 
> I'm not seeing a lot of weakening or optimization, but it depends a
> bit on what is common and what is not.

Agreed, and I expect that I will be reworking this patch rather
thoroughly.

Something about smp_mb() often being a locked atomic operation on a
stack location.  :-/

But you did ask for this to be sped up some years back (before the
memory model was formalized), so I figured I should at least show what
can be done.  Plus I expect that you know much more about what Intel is
planning than I do.

> On Wed, Jul 21, 2021 at 1:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > +/*
> > + * Increment the current CPU's rcu_data structure's ->dynticks field
> > + * with ordering.  Return the new value.
> > + */
> > +static noinstr unsigned long rcu_dynticks_inc(int incby)
> > +{
> > +       struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > +       int seq;
> > +
> > +       seq = READ_ONCE(rdp->dynticks) + incby;
> > +       smp_store_release(&rdp->dynticks, seq);
> > +       smp_mb();  // Fundamental RCU ordering guarantee.
> > +       return seq;
> > +}
> 
> So this is actually likely *more* expensive than the old code was, at
> least on x86.
> 
> The READ_ONCE/smp_store_release are cheap, but then the smp_mb() is expensive.
> 
> The old code did just arch_atomic_inc_return(), which included the
> memory barrier.
> 
> There *might* be some cache ordering advantage to letting the
> READ_ONCE() float upwards, but from a pure barrier standpoint this is
> more expensive than what we used to have.

No argument here.

> > -       if (atomic_read(&rdp->dynticks) & 0x1)
> > +       if (READ_ONCE(rdp->dynticks) & 0x1)
> >                 return;
> > -       atomic_inc(&rdp->dynticks);
> > +       rcu_dynticks_inc(1);
> 
> And this one seems to not take advantage of the new rule, so we end up
> having two reads, and then that potentially more expensive sequence.

This one only executes when a CPU comes online, so I am not worried
about its overhead.

> >  static int rcu_dynticks_snap(struct rcu_data *rdp)
> >  {
> > -       return atomic_add_return(0, &rdp->dynticks);
> > +       smp_mb();  // Fundamental RCU ordering guarantee.
> > +       return smp_load_acquire(&rdp->dynticks);
> >  }
> 
> This is likely cheaper - not because of barriers, but simply because
> it avoids dirtying the cacheline.
> 
> So which operation do we _care_ about, and do we have numbers for why
> this improves anything? Because looking at the patch, it's not obvious
> that this is an improvement.

It sounds like I should keep this hunk and revert the rest back to
atomic operations, but still in the new rcu_dynticks_inc() function.

Either way, thank you for looking this over!

							Thanx, Paul

  reply	other threads:[~2021-07-21 21:25 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-21 20:20 [PATCH rcu 0/18] Miscellaneous fixes for v5.15 Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 01/18] rcu: Fix to include first blocked task in stall warning Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 02/18] rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock Paul E. McKenney
2021-08-03 14:24   ` Qais Yousef
2021-08-03 15:52     ` Paul E. McKenney
2021-08-03 16:12       ` Qais Yousef
2021-08-03 16:28         ` Paul E. McKenney
2021-08-03 16:33           ` Qais Yousef
2021-08-04 13:50           ` Qais Yousef
2021-08-04 22:33             ` Paul E. McKenney
2021-08-06  9:56               ` Qais Yousef
2021-08-06  9:57   ` Qais Yousef
2021-08-06 11:43     ` Paul E. McKenney
2021-08-06 12:33       ` Qais Yousef
2021-07-21 20:21 ` [PATCH rcu 03/18] rcu: Remove special bit at the bottom of the ->dynticks counter Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 04/18] rcu: Weaken ->dynticks accesses and updates Paul E. McKenney
2021-07-21 20:41   ` Linus Torvalds
2021-07-21 21:25     ` Paul E. McKenney [this message]
2021-07-28 17:37   ` [PATCH v2 " Paul E. McKenney
2021-07-28 17:58     ` Linus Torvalds
2021-07-28 18:12       ` Mathieu Desnoyers
2021-07-28 18:32         ` Linus Torvalds
2021-07-28 18:39           ` Mathieu Desnoyers
2021-07-28 18:46         ` Paul E. McKenney
2021-07-28 18:46       ` Paul E. McKenney
2021-07-28 18:57         ` Linus Torvalds
2021-07-28 18:23     ` Mathieu Desnoyers
2021-07-28 18:58       ` Paul E. McKenney
2021-07-28 19:45         ` Paul E. McKenney
2021-07-28 20:03           ` Mathieu Desnoyers
2021-07-28 20:28             ` Paul E. McKenney
2021-07-29 14:41               ` Mathieu Desnoyers
2021-07-29 15:57                 ` Paul E. McKenney
2021-07-29 17:41                   ` Mathieu Desnoyers
2021-07-29 18:05                     ` Paul E. McKenney
2021-07-29 18:42                       ` Mathieu Desnoyers
2021-07-28 20:37     ` Josh Triplett
2021-07-28 20:47       ` Paul E. McKenney
2021-07-28 22:23         ` Frederic Weisbecker
2021-07-29  1:07           ` Paul E. McKenney
2021-07-29  7:58   ` [PATCH " Boqun Feng
2021-07-29 10:53     ` Frederic Weisbecker
2021-07-30  5:56       ` Boqun Feng
2021-07-30 17:18         ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 05/18] rcu: Mark accesses to ->rcu_read_lock_nesting Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 06/18] rculist: Unify documentation about missing list_empty_rcu() Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 07/18] rcu/tree: Handle VM stoppage in stall detection Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 08/18] rcu: Do not disable GP stall detection in rcu_cpu_stall_reset() Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 09/18] rcu: Start timing stall repetitions after warning complete Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 10/18] srcutiny: Mark read-side data races Paul E. McKenney
2021-07-29  8:23   ` Boqun Feng
2021-07-29 13:36     ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 11/18] rcu: Mark lockless ->qsmask read in rcu_check_boost_fail() Paul E. McKenney
2021-07-29  8:54   ` Boqun Feng
2021-07-29 14:03     ` Paul E. McKenney
2021-07-30  2:28       ` Boqun Feng
2021-07-30  3:26         ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 12/18] rcu: Make rcu_gp_init() and rcu_gp_fqs_loop noinline to conserve stack Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 13/18] rcu: Remove trailing spaces and tabs Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 14/18] rcu: Mark accesses in tree_stall.h Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 15/18] rcu: Remove useless "ret" update in rcu_gp_fqs_loop() Paul E. McKenney
2021-08-03 16:48   ` Joe Perches
2021-08-03 17:10     ` Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 16/18] rcu: Use per_cpu_ptr to get the pointer of per_cpu variable Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 17/18] rcu: Explain why rcu_all_qs() is a stub in preemptible TREE RCU Paul E. McKenney
2021-07-21 20:21 ` [PATCH rcu 18/18] rcu: Print human-readable message for schedule() in RCU reader Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210721212515.GV4397@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --subject='Re: [PATCH rcu 04/18] rcu: Weaken ->dynticks accesses and updates' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).