LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Rik van Riel <riel@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	mtosatti@redhat.com, mingo@kernel.org, ak@linux.intel.com,
	oleg@redhat.com, masami.hiramatsu.pt@hitachi.com,
	fweisbec@gmail.com, lcapitulino@redhat.com, pbonzini@redhat.com
Subject: Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
Date: Thu, 5 Feb 2015 10:56:27 -0800	[thread overview]
Message-ID: <20150205185627.GK5370@linux.vnet.ibm.com> (raw)
In-Reply-To: <54D3B1CF.1000301@redhat.com>

On Thu, Feb 05, 2015 at 01:09:19PM -0500, Rik van Riel wrote:
> On 02/05/2015 12:50 PM, Paul E. McKenney wrote:
> > On Thu, Feb 05, 2015 at 11:52:37AM -0500, Rik van Riel wrote:
> >> On 02/05/2015 11:44 AM, Christian Borntraeger wrote:
> >>> Am 05.02.2015 um 17:35 schrieb riel@redhat.com:
> >>>> From: Rik van Riel <riel@redhat.com>
> >>>>
> >>>> The host kernel is not doing anything while the CPU is executing
> >>>> a KVM guest VCPU, so it can be marked as being in an extended
> >>>> quiescent state, identical to that used when running user space
> >>>> code.
> >>>>
> >>>> The only exception to that rule is when the host handles an
> >>>> interrupt, which is already handled by the irq code, which
> >>>> calls rcu_irq_enter and rcu_irq_exit.
> >>>>
> >>>> The guest_enter and guest_exit functions already switch vtime
> >>>> accounting independent of context tracking, so leave those calls
> >>>> where they are, instead of moving them into the context tracking
> >>>> code.
> >>>>
> >>>> Signed-off-by: Rik van Riel <riel@redhat.com>
> >>>> ---
> >>>>  include/linux/context_tracking.h       | 8 +++++++-
> >>>>  include/linux/context_tracking_state.h | 1 +
> >>>>  2 files changed, 8 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >>>> index bd9f000fc98d..a5d3bb44b897 100644
> >>>> --- a/include/linux/context_tracking.h
> >>>> +++ b/include/linux/context_tracking.h
> >>>> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
> >>>>  static inline void exception_exit(enum ctx_state prev_ctx)
> >>>>  {
> >>>>  	if (context_tracking_is_enabled()) {
> >>>> -		if (prev_ctx == IN_USER)
> >>>> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
> >>>>  			context_tracking_user_enter(prev_ctx);
> >>>>  	}
> >>>>  }
> >>>> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
> >>>>  		vtime_guest_enter(current);
> >>>>  	else
> >>>>  		current->flags |= PF_VCPU;
> >>>> +
> >>>> +	if (context_tracking_is_enabled())
> >>>> +		context_tracking_user_enter(IN_GUEST);
> >>>>  }
> >>>
> >>>
> >>> Couldnt we make
> >>>     rcu_virt_note_context_switch(smp_processor_id());
> >>> conditional in include/linux/kvm_host.h (kvm_guest_enter)
> >>>
> >>> e.g. something like
> >>>     if (!context_tracking_is_enabled())
> >>> 	    rcu_virt_note_context_switch(smp_processor_id());
> >>
> >> Possibly. I considered the same, but I do not know whether
> >> or not just rcu_user_enter / rcu_user_exit is enough.
> >>
> >> I could certainly try it out and see whether anything
> >> explodes, but I am not convinced that is careful enough
> >> when it comes to handling RCU code...
> >>
> >> Paul? :)
> > 
> > That can fail for some odd combinations of Kconfig and boot parameters.
> > As I understand it, you instead want the following:
> > 
> > 	if (!tick_nohz_full_cpu(smp_processor_id()))
> > 		rcu_virt_note_context_switch(smp_processor_id());
> > 
> > Frederic might know of a better approach.
> 
> Interesting, in what cases would that happen?

My concern, perhaps misplaced, is that context_tracking_is_enabled(),
but that the current CPU is not a nohz_full= CPU.  In that case, the
context-tracking code would be within its rights to not tell RCU about
the transition to the guest, and thus RCU would be unaware that the
CPU was in a quiescent state.

In most workloads, you would expect the CPU to get interrupted or
preempted or something at some point, which would eventually inform
RCU, but too much delay would result in the infamous RCU CPU stall
warning message.  So let's code a bit more defensively.

> If context_tracking_is_enabled() we end up eventually
> calling into rcu_user_enter & rcu_user_exit.
> 
> If it is not enabled, we call rcu_virt_note_context_switch.
> 
> In what cases would we need to call both?
> 
> I don't see exit-to-userspace do the things that
> rcu_virt_note_context_switch does, and do not understand
> why userspace is fine with that...

The real danger is doing neither.

On tick_nohz_full_cpu() CPUs, the exit-to-userspace code should invoke
rcu_user_enter(), which sets some per-CPU state telling RCU to ignore
that CPU, since it cannot possibly do host RCU read-side critical sections
while running a guest.

In contrast, a non-tick_nohz_full_cpu() CPU doesn't let RCU
know that it is executing in a guest or in userspace.  So the
rcu_virt_note_context_switch() does the notification in that case.

							Thanx, Paul


  reply	other threads:[~2015-02-05 18:56 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-05 16:35 [PATCH 0/4] rcu,nohz,kvm: " riel
2015-02-05 16:35 ` [PATCH 1/4] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
2015-02-05 16:35 ` [PATCH 2/4] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
2015-02-05 16:35 ` [PATCH 3/4] nohz,kvm: export context_tracking_user_enter/exit riel
2015-02-05 16:35 ` [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest riel
2015-02-05 16:44   ` Christian Borntraeger
2015-02-05 16:52     ` Rik van Riel
2015-02-05 17:50       ` Paul E. McKenney
2015-02-05 18:09         ` Rik van Riel
2015-02-05 18:56           ` Paul E. McKenney [this message]
2015-02-05 18:59             ` Rik van Riel
2015-02-05 19:02             ` Rik van Riel
2015-02-05 19:27               ` Paul E. McKenney
2015-02-05 20:19                 ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150205185627.GK5370@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=ak@linux.intel.com \
    --cc=borntraeger@de.ibm.com \
    --cc=fweisbec@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=lcapitulino@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mingo@kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=oleg@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=riel@redhat.com \
    --subject='Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).