LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/4] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
@ 2015-02-05 16:35 riel
  2015-02-05 16:35 ` [PATCH 1/4] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: riel @ 2015-02-05 16:35 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, mtosatti, mingo, ak, oleg, masami.hiramatsu.pt,
	fweisbec, paulmck, lcapitulino, pbonzini

When running a KVM guest on a system with NOHZ_FULL enabled, and the
KVM guest running with idle=poll mode, we still get wakeups of the
rcuos/N threads.

This problem has already been solved for user space by telling the
RCU subsystem that the CPU is in an extended quiescent state while
running user space code.

This patch series extends that code a little bit to make it usable
to track KVM guest space, too.

I tested the code by booting a KVM guest with idle=poll, on a system
with NOHZ_FULL enabled on most CPUs, and a VCPU thread bound to a
CPU. In a 10 second interval, rcuos/N threads on other CPUs got woken
up several times, while the rcuos thread on the CPU running the bound
and alwasy running VCPU thread never got woken up once.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-05 16:35 [PATCH 0/4] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
@ 2015-02-05 16:35 ` riel
  2015-02-05 16:35 ` [PATCH 2/4] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: riel @ 2015-02-05 16:35 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, mtosatti, mingo, ak, oleg, masami.hiramatsu.pt,
	fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Add the expected ctx_state as a parameter to context_tracking_user_enter
and context_tracking_user_exit, allowing the same functions to not just
track kernel <> user space switching, but also kernel <> guest transitions.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking.h | 12 ++++++------
 kernel/context_tracking.c        | 10 +++++-----
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 37b81bd51ec0..bd9f000fc98d 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -10,21 +10,21 @@
 #ifdef CONFIG_CONTEXT_TRACKING
 extern void context_tracking_cpu_set(int cpu);
 
-extern void context_tracking_user_enter(void);
-extern void context_tracking_user_exit(void);
+extern void context_tracking_user_enter(enum ctx_state state);
+extern void context_tracking_user_exit(enum ctx_state state);
 extern void __context_tracking_task_switch(struct task_struct *prev,
 					   struct task_struct *next);
 
 static inline void user_enter(void)
 {
 	if (context_tracking_is_enabled())
-		context_tracking_user_enter();
+		context_tracking_user_enter(IN_USER);
 
 }
 static inline void user_exit(void)
 {
 	if (context_tracking_is_enabled())
-		context_tracking_user_exit();
+		context_tracking_user_exit(IN_USER);
 }
 
 static inline enum ctx_state exception_enter(void)
@@ -35,7 +35,7 @@ static inline enum ctx_state exception_enter(void)
 		return 0;
 
 	prev_ctx = this_cpu_read(context_tracking.state);
-	context_tracking_user_exit();
+	context_tracking_user_exit(prev_ctx);
 
 	return prev_ctx;
 }
@@ -44,7 +44,7 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 {
 	if (context_tracking_is_enabled()) {
 		if (prev_ctx == IN_USER)
-			context_tracking_user_enter();
+			context_tracking_user_enter(prev_ctx);
 	}
 }
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 937ecdfdf258..4c010787c9ec 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -47,7 +47,7 @@ void context_tracking_cpu_set(int cpu)
  * to execute won't use any RCU read side critical section because this
  * function sets RCU in extended quiescent state.
  */
-void context_tracking_user_enter(void)
+void context_tracking_user_enter(enum ctx_state state)
 {
 	unsigned long flags;
 
@@ -75,7 +75,7 @@ void context_tracking_user_enter(void)
 	WARN_ON_ONCE(!current->mm);
 
 	local_irq_save(flags);
-	if ( __this_cpu_read(context_tracking.state) != IN_USER) {
+	if ( __this_cpu_read(context_tracking.state) != state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			trace_user_enter(0);
 			/*
@@ -101,7 +101,7 @@ void context_tracking_user_enter(void)
 		 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
 		 * is false because we know that CPU is not tickless.
 		 */
-		__this_cpu_write(context_tracking.state, IN_USER);
+		__this_cpu_write(context_tracking.state, state);
 	}
 	local_irq_restore(flags);
 }
@@ -118,7 +118,7 @@ NOKPROBE_SYMBOL(context_tracking_user_enter);
  * This call supports re-entrancy. This way it can be called from any exception
  * handler without needing to know if we came from userspace or not.
  */
-void context_tracking_user_exit(void)
+void context_tracking_user_exit(enum ctx_state state)
 {
 	unsigned long flags;
 
@@ -129,7 +129,7 @@ void context_tracking_user_exit(void)
 		return;
 
 	local_irq_save(flags);
-	if (__this_cpu_read(context_tracking.state) == IN_USER) {
+	if (__this_cpu_read(context_tracking.state) == state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
 			 * We are going to run code that may use RCU. Inform
-- 
1.9.3


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/4] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER
  2015-02-05 16:35 [PATCH 0/4] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
  2015-02-05 16:35 ` [PATCH 1/4] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
@ 2015-02-05 16:35 ` riel
  2015-02-05 16:35 ` [PATCH 3/4] nohz,kvm: export context_tracking_user_enter/exit riel
  2015-02-05 16:35 ` [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest riel
  3 siblings, 0 replies; 14+ messages in thread
From: riel @ 2015-02-05 16:35 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, mtosatti, mingo, ak, oleg, masami.hiramatsu.pt,
	fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Only run vtime_user_enter and vtime_user_exit when we are entering
or exiting user state, respectively.

The RCU code only distinguishes between "idle" and "not idle or kernel".
There should be no need to add an additional (unused) state there.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/context_tracking.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 4c010787c9ec..97806c4deec5 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -85,7 +85,8 @@ void context_tracking_user_enter(enum ctx_state state)
 			 * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency
 			 * on the tick.
 			 */
-			vtime_user_enter(current);
+			if (state == IN_USER)
+				vtime_user_enter(current);
 			rcu_user_enter();
 		}
 		/*
@@ -136,7 +137,8 @@ void context_tracking_user_exit(enum ctx_state state)
 			 * RCU core about that (ie: we may need the tick again).
 			 */
 			rcu_user_exit();
-			vtime_user_exit(current);
+			if (state == IN_USER)
+				vtime_user_exit(current);
 			trace_user_exit(0);
 		}
 		__this_cpu_write(context_tracking.state, IN_KERNEL);
-- 
1.9.3


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/4] nohz,kvm: export context_tracking_user_enter/exit
  2015-02-05 16:35 [PATCH 0/4] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
  2015-02-05 16:35 ` [PATCH 1/4] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
  2015-02-05 16:35 ` [PATCH 2/4] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
@ 2015-02-05 16:35 ` riel
  2015-02-05 16:35 ` [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest riel
  3 siblings, 0 replies; 14+ messages in thread
From: riel @ 2015-02-05 16:35 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, mtosatti, mingo, ak, oleg, masami.hiramatsu.pt,
	fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Export context_tracking_user_enter/exit so it can be used by KVM.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/context_tracking.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 97806c4deec5..a3f4a2840637 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -107,6 +107,7 @@ void context_tracking_user_enter(enum ctx_state state)
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_user_enter);
+EXPORT_SYMBOL_GPL(context_tracking_user_enter);
 
 /**
  * context_tracking_user_exit - Inform the context tracking that the CPU is
@@ -146,6 +147,7 @@ void context_tracking_user_exit(enum ctx_state state)
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_user_exit);
+EXPORT_SYMBOL_GPL(context_tracking_user_exit);
 
 /**
  * __context_tracking_task_switch - context switch the syscall callbacks
-- 
1.9.3


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 16:35 [PATCH 0/4] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
                   ` (2 preceding siblings ...)
  2015-02-05 16:35 ` [PATCH 3/4] nohz,kvm: export context_tracking_user_enter/exit riel
@ 2015-02-05 16:35 ` riel
  2015-02-05 16:44   ` Christian Borntraeger
  3 siblings, 1 reply; 14+ messages in thread
From: riel @ 2015-02-05 16:35 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, mtosatti, mingo, ak, oleg, masami.hiramatsu.pt,
	fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

The host kernel is not doing anything while the CPU is executing
a KVM guest VCPU, so it can be marked as being in an extended
quiescent state, identical to that used when running user space
code.

The only exception to that rule is when the host handles an
interrupt, which is already handled by the irq code, which
calls rcu_irq_enter and rcu_irq_exit.

The guest_enter and guest_exit functions already switch vtime
accounting independent of context tracking, so leave those calls
where they are, instead of moving them into the context tracking
code.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking.h       | 8 +++++++-
 include/linux/context_tracking_state.h | 1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index bd9f000fc98d..a5d3bb44b897 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
 static inline void exception_exit(enum ctx_state prev_ctx)
 {
 	if (context_tracking_is_enabled()) {
-		if (prev_ctx == IN_USER)
+		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
 			context_tracking_user_enter(prev_ctx);
 	}
 }
@@ -78,6 +78,9 @@ static inline void guest_enter(void)
 		vtime_guest_enter(current);
 	else
 		current->flags |= PF_VCPU;
+
+	if (context_tracking_is_enabled())
+		context_tracking_user_enter(IN_GUEST);
 }
 
 static inline void guest_exit(void)
@@ -86,6 +89,9 @@ static inline void guest_exit(void)
 		vtime_guest_exit(current);
 	else
 		current->flags &= ~PF_VCPU;
+
+	if (context_tracking_is_enabled())
+		context_tracking_user_exit(IN_GUEST);
 }
 
 #else
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 97a81225d037..f3ef027af749 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -15,6 +15,7 @@ struct context_tracking {
 	enum ctx_state {
 		IN_KERNEL = 0,
 		IN_USER,
+		IN_GUEST,
 	} state;
 };
 
-- 
1.9.3


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 16:35 ` [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest riel
@ 2015-02-05 16:44   ` Christian Borntraeger
  2015-02-05 16:52     ` Rik van Riel
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Borntraeger @ 2015-02-05 16:44 UTC (permalink / raw)
  To: riel, kvm
  Cc: linux-kernel, mtosatti, mingo, ak, oleg, masami.hiramatsu.pt,
	fweisbec, paulmck, lcapitulino, pbonzini

Am 05.02.2015 um 17:35 schrieb riel@redhat.com:
> From: Rik van Riel <riel@redhat.com>
> 
> The host kernel is not doing anything while the CPU is executing
> a KVM guest VCPU, so it can be marked as being in an extended
> quiescent state, identical to that used when running user space
> code.
> 
> The only exception to that rule is when the host handles an
> interrupt, which is already handled by the irq code, which
> calls rcu_irq_enter and rcu_irq_exit.
> 
> The guest_enter and guest_exit functions already switch vtime
> accounting independent of context tracking, so leave those calls
> where they are, instead of moving them into the context tracking
> code.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>  include/linux/context_tracking.h       | 8 +++++++-
>  include/linux/context_tracking_state.h | 1 +
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index bd9f000fc98d..a5d3bb44b897 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
>  static inline void exception_exit(enum ctx_state prev_ctx)
>  {
>  	if (context_tracking_is_enabled()) {
> -		if (prev_ctx == IN_USER)
> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
>  			context_tracking_user_enter(prev_ctx);
>  	}
>  }
> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
>  		vtime_guest_enter(current);
>  	else
>  		current->flags |= PF_VCPU;
> +
> +	if (context_tracking_is_enabled())
> +		context_tracking_user_enter(IN_GUEST);
>  }


Couldnt we make
    rcu_virt_note_context_switch(smp_processor_id());
conditional in include/linux/kvm_host.h (kvm_guest_enter)

e.g. something like
    if (!context_tracking_is_enabled())
	    rcu_virt_note_context_switch(smp_processor_id());



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 16:44   ` Christian Borntraeger
@ 2015-02-05 16:52     ` Rik van Riel
  2015-02-05 17:50       ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Rik van Riel @ 2015-02-05 16:52 UTC (permalink / raw)
  To: Christian Borntraeger, kvm
  Cc: linux-kernel, mtosatti, mingo, ak, oleg, masami.hiramatsu.pt,
	fweisbec, paulmck, lcapitulino, pbonzini

On 02/05/2015 11:44 AM, Christian Borntraeger wrote:
> Am 05.02.2015 um 17:35 schrieb riel@redhat.com:
>> From: Rik van Riel <riel@redhat.com>
>>
>> The host kernel is not doing anything while the CPU is executing
>> a KVM guest VCPU, so it can be marked as being in an extended
>> quiescent state, identical to that used when running user space
>> code.
>>
>> The only exception to that rule is when the host handles an
>> interrupt, which is already handled by the irq code, which
>> calls rcu_irq_enter and rcu_irq_exit.
>>
>> The guest_enter and guest_exit functions already switch vtime
>> accounting independent of context tracking, so leave those calls
>> where they are, instead of moving them into the context tracking
>> code.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
>> ---
>>  include/linux/context_tracking.h       | 8 +++++++-
>>  include/linux/context_tracking_state.h | 1 +
>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>> index bd9f000fc98d..a5d3bb44b897 100644
>> --- a/include/linux/context_tracking.h
>> +++ b/include/linux/context_tracking.h
>> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
>>  static inline void exception_exit(enum ctx_state prev_ctx)
>>  {
>>  	if (context_tracking_is_enabled()) {
>> -		if (prev_ctx == IN_USER)
>> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
>>  			context_tracking_user_enter(prev_ctx);
>>  	}
>>  }
>> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
>>  		vtime_guest_enter(current);
>>  	else
>>  		current->flags |= PF_VCPU;
>> +
>> +	if (context_tracking_is_enabled())
>> +		context_tracking_user_enter(IN_GUEST);
>>  }
> 
> 
> Couldnt we make
>     rcu_virt_note_context_switch(smp_processor_id());
> conditional in include/linux/kvm_host.h (kvm_guest_enter)
> 
> e.g. something like
>     if (!context_tracking_is_enabled())
> 	    rcu_virt_note_context_switch(smp_processor_id());

Possibly. I considered the same, but I do not know whether
or not just rcu_user_enter / rcu_user_exit is enough.

I could certainly try it out and see whether anything
explodes, but I am not convinced that is careful enough
when it comes to handling RCU code...

Paul? :)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 16:52     ` Rik van Riel
@ 2015-02-05 17:50       ` Paul E. McKenney
  2015-02-05 18:09         ` Rik van Riel
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2015-02-05 17:50 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Christian Borntraeger, kvm, linux-kernel, mtosatti, mingo, ak,
	oleg, masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 11:52:37AM -0500, Rik van Riel wrote:
> On 02/05/2015 11:44 AM, Christian Borntraeger wrote:
> > Am 05.02.2015 um 17:35 schrieb riel@redhat.com:
> >> From: Rik van Riel <riel@redhat.com>
> >>
> >> The host kernel is not doing anything while the CPU is executing
> >> a KVM guest VCPU, so it can be marked as being in an extended
> >> quiescent state, identical to that used when running user space
> >> code.
> >>
> >> The only exception to that rule is when the host handles an
> >> interrupt, which is already handled by the irq code, which
> >> calls rcu_irq_enter and rcu_irq_exit.
> >>
> >> The guest_enter and guest_exit functions already switch vtime
> >> accounting independent of context tracking, so leave those calls
> >> where they are, instead of moving them into the context tracking
> >> code.
> >>
> >> Signed-off-by: Rik van Riel <riel@redhat.com>
> >> ---
> >>  include/linux/context_tracking.h       | 8 +++++++-
> >>  include/linux/context_tracking_state.h | 1 +
> >>  2 files changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >> index bd9f000fc98d..a5d3bb44b897 100644
> >> --- a/include/linux/context_tracking.h
> >> +++ b/include/linux/context_tracking.h
> >> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
> >>  static inline void exception_exit(enum ctx_state prev_ctx)
> >>  {
> >>  	if (context_tracking_is_enabled()) {
> >> -		if (prev_ctx == IN_USER)
> >> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
> >>  			context_tracking_user_enter(prev_ctx);
> >>  	}
> >>  }
> >> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
> >>  		vtime_guest_enter(current);
> >>  	else
> >>  		current->flags |= PF_VCPU;
> >> +
> >> +	if (context_tracking_is_enabled())
> >> +		context_tracking_user_enter(IN_GUEST);
> >>  }
> > 
> > 
> > Couldnt we make
> >     rcu_virt_note_context_switch(smp_processor_id());
> > conditional in include/linux/kvm_host.h (kvm_guest_enter)
> > 
> > e.g. something like
> >     if (!context_tracking_is_enabled())
> > 	    rcu_virt_note_context_switch(smp_processor_id());
> 
> Possibly. I considered the same, but I do not know whether
> or not just rcu_user_enter / rcu_user_exit is enough.
> 
> I could certainly try it out and see whether anything
> explodes, but I am not convinced that is careful enough
> when it comes to handling RCU code...
> 
> Paul? :)

That can fail for some odd combinations of Kconfig and boot parameters.
As I understand it, you instead want the following:

	if (!tick_nohz_full_cpu(smp_processor_id()))
		rcu_virt_note_context_switch(smp_processor_id());

Frederic might know of a better approach.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 17:50       ` Paul E. McKenney
@ 2015-02-05 18:09         ` Rik van Riel
  2015-02-05 18:56           ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Rik van Riel @ 2015-02-05 18:09 UTC (permalink / raw)
  To: paulmck
  Cc: Christian Borntraeger, kvm, linux-kernel, mtosatti, mingo, ak,
	oleg, masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On 02/05/2015 12:50 PM, Paul E. McKenney wrote:
> On Thu, Feb 05, 2015 at 11:52:37AM -0500, Rik van Riel wrote:
>> On 02/05/2015 11:44 AM, Christian Borntraeger wrote:
>>> Am 05.02.2015 um 17:35 schrieb riel@redhat.com:
>>>> From: Rik van Riel <riel@redhat.com>
>>>>
>>>> The host kernel is not doing anything while the CPU is executing
>>>> a KVM guest VCPU, so it can be marked as being in an extended
>>>> quiescent state, identical to that used when running user space
>>>> code.
>>>>
>>>> The only exception to that rule is when the host handles an
>>>> interrupt, which is already handled by the irq code, which
>>>> calls rcu_irq_enter and rcu_irq_exit.
>>>>
>>>> The guest_enter and guest_exit functions already switch vtime
>>>> accounting independent of context tracking, so leave those calls
>>>> where they are, instead of moving them into the context tracking
>>>> code.
>>>>
>>>> Signed-off-by: Rik van Riel <riel@redhat.com>
>>>> ---
>>>>  include/linux/context_tracking.h       | 8 +++++++-
>>>>  include/linux/context_tracking_state.h | 1 +
>>>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>>>> index bd9f000fc98d..a5d3bb44b897 100644
>>>> --- a/include/linux/context_tracking.h
>>>> +++ b/include/linux/context_tracking.h
>>>> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
>>>>  static inline void exception_exit(enum ctx_state prev_ctx)
>>>>  {
>>>>  	if (context_tracking_is_enabled()) {
>>>> -		if (prev_ctx == IN_USER)
>>>> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
>>>>  			context_tracking_user_enter(prev_ctx);
>>>>  	}
>>>>  }
>>>> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
>>>>  		vtime_guest_enter(current);
>>>>  	else
>>>>  		current->flags |= PF_VCPU;
>>>> +
>>>> +	if (context_tracking_is_enabled())
>>>> +		context_tracking_user_enter(IN_GUEST);
>>>>  }
>>>
>>>
>>> Couldnt we make
>>>     rcu_virt_note_context_switch(smp_processor_id());
>>> conditional in include/linux/kvm_host.h (kvm_guest_enter)
>>>
>>> e.g. something like
>>>     if (!context_tracking_is_enabled())
>>> 	    rcu_virt_note_context_switch(smp_processor_id());
>>
>> Possibly. I considered the same, but I do not know whether
>> or not just rcu_user_enter / rcu_user_exit is enough.
>>
>> I could certainly try it out and see whether anything
>> explodes, but I am not convinced that is careful enough
>> when it comes to handling RCU code...
>>
>> Paul? :)
> 
> That can fail for some odd combinations of Kconfig and boot parameters.
> As I understand it, you instead want the following:
> 
> 	if (!tick_nohz_full_cpu(smp_processor_id()))
> 		rcu_virt_note_context_switch(smp_processor_id());
> 
> Frederic might know of a better approach.

Interesting, in what cases would that happen?

If context_tracking_is_enabled() we end up eventually
calling into rcu_user_enter & rcu_user_exit.

If it is not enabled, we call rcu_virt_note_context_switch.

In what cases would we need to call both?

I don't see exit-to-userspace do the things that
rcu_virt_note_context_switch does, and do not understand
why userspace is fine with that...


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 18:09         ` Rik van Riel
@ 2015-02-05 18:56           ` Paul E. McKenney
  2015-02-05 18:59             ` Rik van Riel
  2015-02-05 19:02             ` Rik van Riel
  0 siblings, 2 replies; 14+ messages in thread
From: Paul E. McKenney @ 2015-02-05 18:56 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Christian Borntraeger, kvm, linux-kernel, mtosatti, mingo, ak,
	oleg, masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 01:09:19PM -0500, Rik van Riel wrote:
> On 02/05/2015 12:50 PM, Paul E. McKenney wrote:
> > On Thu, Feb 05, 2015 at 11:52:37AM -0500, Rik van Riel wrote:
> >> On 02/05/2015 11:44 AM, Christian Borntraeger wrote:
> >>> Am 05.02.2015 um 17:35 schrieb riel@redhat.com:
> >>>> From: Rik van Riel <riel@redhat.com>
> >>>>
> >>>> The host kernel is not doing anything while the CPU is executing
> >>>> a KVM guest VCPU, so it can be marked as being in an extended
> >>>> quiescent state, identical to that used when running user space
> >>>> code.
> >>>>
> >>>> The only exception to that rule is when the host handles an
> >>>> interrupt, which is already handled by the irq code, which
> >>>> calls rcu_irq_enter and rcu_irq_exit.
> >>>>
> >>>> The guest_enter and guest_exit functions already switch vtime
> >>>> accounting independent of context tracking, so leave those calls
> >>>> where they are, instead of moving them into the context tracking
> >>>> code.
> >>>>
> >>>> Signed-off-by: Rik van Riel <riel@redhat.com>
> >>>> ---
> >>>>  include/linux/context_tracking.h       | 8 +++++++-
> >>>>  include/linux/context_tracking_state.h | 1 +
> >>>>  2 files changed, 8 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >>>> index bd9f000fc98d..a5d3bb44b897 100644
> >>>> --- a/include/linux/context_tracking.h
> >>>> +++ b/include/linux/context_tracking.h
> >>>> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
> >>>>  static inline void exception_exit(enum ctx_state prev_ctx)
> >>>>  {
> >>>>  	if (context_tracking_is_enabled()) {
> >>>> -		if (prev_ctx == IN_USER)
> >>>> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
> >>>>  			context_tracking_user_enter(prev_ctx);
> >>>>  	}
> >>>>  }
> >>>> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
> >>>>  		vtime_guest_enter(current);
> >>>>  	else
> >>>>  		current->flags |= PF_VCPU;
> >>>> +
> >>>> +	if (context_tracking_is_enabled())
> >>>> +		context_tracking_user_enter(IN_GUEST);
> >>>>  }
> >>>
> >>>
> >>> Couldnt we make
> >>>     rcu_virt_note_context_switch(smp_processor_id());
> >>> conditional in include/linux/kvm_host.h (kvm_guest_enter)
> >>>
> >>> e.g. something like
> >>>     if (!context_tracking_is_enabled())
> >>> 	    rcu_virt_note_context_switch(smp_processor_id());
> >>
> >> Possibly. I considered the same, but I do not know whether
> >> or not just rcu_user_enter / rcu_user_exit is enough.
> >>
> >> I could certainly try it out and see whether anything
> >> explodes, but I am not convinced that is careful enough
> >> when it comes to handling RCU code...
> >>
> >> Paul? :)
> > 
> > That can fail for some odd combinations of Kconfig and boot parameters.
> > As I understand it, you instead want the following:
> > 
> > 	if (!tick_nohz_full_cpu(smp_processor_id()))
> > 		rcu_virt_note_context_switch(smp_processor_id());
> > 
> > Frederic might know of a better approach.
> 
> Interesting, in what cases would that happen?

My concern, perhaps misplaced, is that context_tracking_is_enabled(),
but that the current CPU is not a nohz_full= CPU.  In that case, the
context-tracking code would be within its rights to not tell RCU about
the transition to the guest, and thus RCU would be unaware that the
CPU was in a quiescent state.

In most workloads, you would expect the CPU to get interrupted or
preempted or something at some point, which would eventually inform
RCU, but too much delay would result in the infamous RCU CPU stall
warning message.  So let's code a bit more defensively.

> If context_tracking_is_enabled() we end up eventually
> calling into rcu_user_enter & rcu_user_exit.
> 
> If it is not enabled, we call rcu_virt_note_context_switch.
> 
> In what cases would we need to call both?
> 
> I don't see exit-to-userspace do the things that
> rcu_virt_note_context_switch does, and do not understand
> why userspace is fine with that...

The real danger is doing neither.

On tick_nohz_full_cpu() CPUs, the exit-to-userspace code should invoke
rcu_user_enter(), which sets some per-CPU state telling RCU to ignore
that CPU, since it cannot possibly do host RCU read-side critical sections
while running a guest.

In contrast, a non-tick_nohz_full_cpu() CPU doesn't let RCU
know that it is executing in a guest or in userspace.  So the
rcu_virt_note_context_switch() does the notification in that case.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 18:56           ` Paul E. McKenney
@ 2015-02-05 18:59             ` Rik van Riel
  2015-02-05 19:02             ` Rik van Riel
  1 sibling, 0 replies; 14+ messages in thread
From: Rik van Riel @ 2015-02-05 18:59 UTC (permalink / raw)
  To: paulmck
  Cc: Christian Borntraeger, kvm, linux-kernel, mtosatti, mingo, ak,
	oleg, masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On 02/05/2015 01:56 PM, Paul E. McKenney wrote:
> On Thu, Feb 05, 2015 at 01:09:19PM -0500, Rik van Riel wrote:
>> On 02/05/2015 12:50 PM, Paul E. McKenney wrote:
>>> On Thu, Feb 05, 2015 at 11:52:37AM -0500, Rik van Riel wrote:
>>>> On 02/05/2015 11:44 AM, Christian Borntraeger wrote:
>>>>> Am 05.02.2015 um 17:35 schrieb riel@redhat.com:
>>>>>> From: Rik van Riel <riel@redhat.com>
>>>>>>
>>>>>> The host kernel is not doing anything while the CPU is executing
>>>>>> a KVM guest VCPU, so it can be marked as being in an extended
>>>>>> quiescent state, identical to that used when running user space
>>>>>> code.
>>>>>>
>>>>>> The only exception to that rule is when the host handles an
>>>>>> interrupt, which is already handled by the irq code, which
>>>>>> calls rcu_irq_enter and rcu_irq_exit.
>>>>>>
>>>>>> The guest_enter and guest_exit functions already switch vtime
>>>>>> accounting independent of context tracking, so leave those calls
>>>>>> where they are, instead of moving them into the context tracking
>>>>>> code.
>>>>>>
>>>>>> Signed-off-by: Rik van Riel <riel@redhat.com>
>>>>>> ---
>>>>>>  include/linux/context_tracking.h       | 8 +++++++-
>>>>>>  include/linux/context_tracking_state.h | 1 +
>>>>>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>>>>>> index bd9f000fc98d..a5d3bb44b897 100644
>>>>>> --- a/include/linux/context_tracking.h
>>>>>> +++ b/include/linux/context_tracking.h
>>>>>> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
>>>>>>  static inline void exception_exit(enum ctx_state prev_ctx)
>>>>>>  {
>>>>>>  	if (context_tracking_is_enabled()) {
>>>>>> -		if (prev_ctx == IN_USER)
>>>>>> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
>>>>>>  			context_tracking_user_enter(prev_ctx);
>>>>>>  	}
>>>>>>  }
>>>>>> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
>>>>>>  		vtime_guest_enter(current);
>>>>>>  	else
>>>>>>  		current->flags |= PF_VCPU;
>>>>>> +
>>>>>> +	if (context_tracking_is_enabled())
>>>>>> +		context_tracking_user_enter(IN_GUEST);
>>>>>>  }
>>>>>
>>>>>
>>>>> Couldnt we make
>>>>>     rcu_virt_note_context_switch(smp_processor_id());
>>>>> conditional in include/linux/kvm_host.h (kvm_guest_enter)
>>>>>
>>>>> e.g. something like
>>>>>     if (!context_tracking_is_enabled())
>>>>> 	    rcu_virt_note_context_switch(smp_processor_id());
>>>>
>>>> Possibly. I considered the same, but I do not know whether
>>>> or not just rcu_user_enter / rcu_user_exit is enough.
>>>>
>>>> I could certainly try it out and see whether anything
>>>> explodes, but I am not convinced that is careful enough
>>>> when it comes to handling RCU code...
>>>>
>>>> Paul? :)
>>>
>>> That can fail for some odd combinations of Kconfig and boot parameters.
>>> As I understand it, you instead want the following:
>>>
>>> 	if (!tick_nohz_full_cpu(smp_processor_id()))
>>> 		rcu_virt_note_context_switch(smp_processor_id());
>>>
>>> Frederic might know of a better approach.
>>
>> Interesting, in what cases would that happen?
> 
> My concern, perhaps misplaced, is that context_tracking_is_enabled(),
> but that the current CPU is not a nohz_full= CPU.  In that case, the
> context-tracking code would be within its rights to not tell RCU about
> the transition to the guest, and thus RCU would be unaware that the
> CPU was in a quiescent state.
> 
> In most workloads, you would expect the CPU to get interrupted or
> preempted or something at some point, which would eventually inform
> RCU, but too much delay would result in the infamous RCU CPU stall
> warning message.  So let's code a bit more defensively.
> 
>> If context_tracking_is_enabled() we end up eventually
>> calling into rcu_user_enter & rcu_user_exit.
>>
>> If it is not enabled, we call rcu_virt_note_context_switch.
>>
>> In what cases would we need to call both?
>>
>> I don't see exit-to-userspace do the things that
>> rcu_virt_note_context_switch does, and do not understand
>> why userspace is fine with that...
> 
> The real danger is doing neither.
> 
> On tick_nohz_full_cpu() CPUs, the exit-to-userspace code should invoke
> rcu_user_enter(), which sets some per-CPU state telling RCU to ignore
> that CPU, since it cannot possibly do host RCU read-side critical sections
> while running a guest.

Ahhh, I understand now.  Thank you for your
explanation, Paul.

I will make sure your suggestion is in version 2 of
this series.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 18:56           ` Paul E. McKenney
  2015-02-05 18:59             ` Rik van Riel
@ 2015-02-05 19:02             ` Rik van Riel
  2015-02-05 19:27               ` Paul E. McKenney
  1 sibling, 1 reply; 14+ messages in thread
From: Rik van Riel @ 2015-02-05 19:02 UTC (permalink / raw)
  To: paulmck
  Cc: Christian Borntraeger, kvm, linux-kernel, mtosatti, mingo, ak,
	oleg, masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On 02/05/2015 01:56 PM, Paul E. McKenney wrote:

> The real danger is doing neither.
> 
> On tick_nohz_full_cpu() CPUs, the exit-to-userspace code should invoke
> rcu_user_enter(), which sets some per-CPU state telling RCU to ignore
> that CPU, since it cannot possibly do host RCU read-side critical sections
> while running a guest.
> 
> In contrast, a non-tick_nohz_full_cpu() CPU doesn't let RCU
> know that it is executing in a guest or in userspace.  So the
> rcu_virt_note_context_switch() does the notification in that case.

Looking at context_tracking.h, I see the
function context_tracking_cpu_is_enabled().

That looks like it should do the right thing
in this case.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 19:02             ` Rik van Riel
@ 2015-02-05 19:27               ` Paul E. McKenney
  2015-02-05 20:19                 ` Rik van Riel
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2015-02-05 19:27 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Christian Borntraeger, kvm, linux-kernel, mtosatti, mingo, ak,
	oleg, masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 02:02:35PM -0500, Rik van Riel wrote:
> On 02/05/2015 01:56 PM, Paul E. McKenney wrote:
> 
> > The real danger is doing neither.
> > 
> > On tick_nohz_full_cpu() CPUs, the exit-to-userspace code should invoke
> > rcu_user_enter(), which sets some per-CPU state telling RCU to ignore
> > that CPU, since it cannot possibly do host RCU read-side critical sections
> > while running a guest.
> > 
> > In contrast, a non-tick_nohz_full_cpu() CPU doesn't let RCU
> > know that it is executing in a guest or in userspace.  So the
> > rcu_virt_note_context_switch() does the notification in that case.
> 
> Looking at context_tracking.h, I see the
> function context_tracking_cpu_is_enabled().
> 
> That looks like it should do the right thing
> in this case.

Right you are -- that same check is used to guard the
context_tracking_user_enter() function's call to rcu_user_enter().

Not sure why it open-codes the check rather than invoking
context_tracking_cpu_is_enabled().  Hmmm....  One reason is that
the context_tracking_cpu_is_enabled() function isn't available in
that context, according to my compiler.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest
  2015-02-05 19:27               ` Paul E. McKenney
@ 2015-02-05 20:19                 ` Rik van Riel
  0 siblings, 0 replies; 14+ messages in thread
From: Rik van Riel @ 2015-02-05 20:19 UTC (permalink / raw)
  To: paulmck
  Cc: Christian Borntraeger, kvm, linux-kernel, mtosatti, mingo, ak,
	oleg, masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On 02/05/2015 02:27 PM, Paul E. McKenney wrote:
> On Thu, Feb 05, 2015 at 02:02:35PM -0500, Rik van Riel wrote:

>> Looking at context_tracking.h, I see the
>> function context_tracking_cpu_is_enabled().
>>
>> That looks like it should do the right thing
>> in this case.
> 
> Right you are -- that same check is used to guard the
> context_tracking_user_enter() function's call to rcu_user_enter().

My test suggests it is working.

v2 incoming :)


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-02-05 20:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-05 16:35 [PATCH 0/4] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
2015-02-05 16:35 ` [PATCH 1/4] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
2015-02-05 16:35 ` [PATCH 2/4] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
2015-02-05 16:35 ` [PATCH 3/4] nohz,kvm: export context_tracking_user_enter/exit riel
2015-02-05 16:35 ` [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running KVM guest riel
2015-02-05 16:44   ` Christian Borntraeger
2015-02-05 16:52     ` Rik van Riel
2015-02-05 17:50       ` Paul E. McKenney
2015-02-05 18:09         ` Rik van Riel
2015-02-05 18:56           ` Paul E. McKenney
2015-02-05 18:59             ` Rik van Riel
2015-02-05 19:02             ` Rik van Riel
2015-02-05 19:27               ` Paul E. McKenney
2015-02-05 20:19                 ` Rik van Riel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).