LKML Archive on lore.kernel.org
* [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
@ 2015-02-05 20:23 riel
  2015-02-05 20:23 ` [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
                   ` (6 more replies)
  0 siblings, 7 replies; 34+ messages in thread
From: riel @ 2015-02-05 20:23 UTC (permalink / raw)
  To: kvm
  Cc: borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, paulmck, lcapitulino, pbonzini

When a KVM guest runs with idle=poll on a host with NOHZ_FULL enabled,
the host still sees wakeups of the rcuos/N threads.

This problem has already been solved for user space by telling the
RCU subsystem that the CPU is in an extended quiescent state while
running user space code.

This patch series extends that code a little bit to make it usable
to track KVM guest space, too.

I tested the code by booting a KVM guest with idle=poll, on a system
with NOHZ_FULL enabled on most CPUs, and a VCPU thread bound to a
CPU. In a 10 second interval, rcuos/N threads on other CPUs got woken
up several times, while the rcuos thread on the CPU running the bound
and always running VCPU thread was never woken up.

Thanks to Christian Borntraeger and Paul McKenney for reviewing the
first version of this patch series, and helping optimize patch 4/5.



* [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-05 20:23 [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
@ 2015-02-05 20:23 ` riel
  2015-02-05 23:55   ` Paul E. McKenney
  2015-02-06 17:22   ` Frederic Weisbecker
  2015-02-05 20:23 ` [PATCH 2/5] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 34+ messages in thread
From: riel @ 2015-02-05 20:23 UTC (permalink / raw)
  To: kvm
  Cc: borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Add the expected ctx_state as a parameter to context_tracking_user_enter
and context_tracking_user_exit, allowing the same functions to not just
track kernel <> user space switching, but also kernel <> guest transitions.
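
The idea can be sketched outside the kernel as a tiny state machine.
This is a userspace model under stated assumptions, not the kernel
implementation: model_state, model_ctx_enter, model_ctx_exit and the
two counters are hypothetical names, the per-CPU variable is reduced to
one global, and the context_tracking.active check is left out.

```c
#include <assert.h>

/* Userspace model only; names echo the patch but nothing here is kernel code. */
enum ctx_state { IN_KERNEL = 0, IN_USER, IN_GUEST };

/* Stands in for the per-CPU context_tracking.state (hypothetical). */
static enum ctx_state model_state = IN_KERNEL;
/* Count the points where rcu_user_enter()/rcu_user_exit() would run. */
static int model_eqs_enters, model_eqs_exits;

static void model_ctx_enter(enum ctx_state state)
{
	assert(state != IN_KERNEL);	/* callers pass IN_USER or IN_GUEST */
	if (model_state != state) {
		model_eqs_enters++;
		model_state = state;	/* was hardcoded to IN_USER before */
	}
}

static void model_ctx_exit(enum ctx_state state)
{
	/* Only leave the state the caller expects to be in. */
	if (model_state == state) {
		model_eqs_exits++;
		model_state = IN_KERNEL;
	}
}
```

In this model, an exit call whose expected state does not match the
current one is a no-op, which is the behaviour the parameterized
comparison in the diff below gives.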

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking.h | 12 ++++++------
 kernel/context_tracking.c        | 10 +++++-----
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 37b81bd51ec0..bd9f000fc98d 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -10,21 +10,21 @@
 #ifdef CONFIG_CONTEXT_TRACKING
 extern void context_tracking_cpu_set(int cpu);
 
-extern void context_tracking_user_enter(void);
-extern void context_tracking_user_exit(void);
+extern void context_tracking_user_enter(enum ctx_state state);
+extern void context_tracking_user_exit(enum ctx_state state);
 extern void __context_tracking_task_switch(struct task_struct *prev,
 					   struct task_struct *next);
 
 static inline void user_enter(void)
 {
 	if (context_tracking_is_enabled())
-		context_tracking_user_enter();
+		context_tracking_user_enter(IN_USER);
 
 }
 static inline void user_exit(void)
 {
 	if (context_tracking_is_enabled())
-		context_tracking_user_exit();
+		context_tracking_user_exit(IN_USER);
 }
 
 static inline enum ctx_state exception_enter(void)
@@ -35,7 +35,7 @@ static inline enum ctx_state exception_enter(void)
 		return 0;
 
 	prev_ctx = this_cpu_read(context_tracking.state);
-	context_tracking_user_exit();
+	context_tracking_user_exit(prev_ctx);
 
 	return prev_ctx;
 }
@@ -44,7 +44,7 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 {
 	if (context_tracking_is_enabled()) {
 		if (prev_ctx == IN_USER)
-			context_tracking_user_enter();
+			context_tracking_user_enter(prev_ctx);
 	}
 }
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 937ecdfdf258..4c010787c9ec 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -47,7 +47,7 @@ void context_tracking_cpu_set(int cpu)
  * to execute won't use any RCU read side critical section because this
  * function sets RCU in extended quiescent state.
  */
-void context_tracking_user_enter(void)
+void context_tracking_user_enter(enum ctx_state state)
 {
 	unsigned long flags;
 
@@ -75,7 +75,7 @@ void context_tracking_user_enter(void)
 	WARN_ON_ONCE(!current->mm);
 
 	local_irq_save(flags);
-	if ( __this_cpu_read(context_tracking.state) != IN_USER) {
+	if ( __this_cpu_read(context_tracking.state) != state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			trace_user_enter(0);
 			/*
@@ -101,7 +101,7 @@ void context_tracking_user_enter(void)
 		 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
 		 * is false because we know that CPU is not tickless.
 		 */
-		__this_cpu_write(context_tracking.state, IN_USER);
+		__this_cpu_write(context_tracking.state, state);
 	}
 	local_irq_restore(flags);
 }
@@ -118,7 +118,7 @@ NOKPROBE_SYMBOL(context_tracking_user_enter);
  * This call supports re-entrancy. This way it can be called from any exception
  * handler without needing to know if we came from userspace or not.
  */
-void context_tracking_user_exit(void)
+void context_tracking_user_exit(enum ctx_state state)
 {
 	unsigned long flags;
 
@@ -129,7 +129,7 @@ void context_tracking_user_exit(void)
 		return;
 
 	local_irq_save(flags);
-	if (__this_cpu_read(context_tracking.state) == IN_USER) {
+	if (__this_cpu_read(context_tracking.state) == state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
 			 * We are going to run code that may use RCU. Inform
-- 
1.9.3



* [PATCH 2/5] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER
  2015-02-05 20:23 [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
  2015-02-05 20:23 ` [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
@ 2015-02-05 20:23 ` riel
  2015-02-05 20:23 ` [PATCH 3/5] nohz,kvm: export context_tracking_user_enter/exit riel
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 34+ messages in thread
From: riel @ 2015-02-05 20:23 UTC (permalink / raw)
  To: kvm
  Cc: borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Only run vtime_user_enter and vtime_user_exit when we are entering
or exiting user state, respectively.

The RCU code only distinguishes between "idle" and "not idle or kernel".
There should be no need to add an additional (unused) state there.
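
A minimal sketch of that split, assuming hypothetical counters in place
of the real hooks (struct model_hooks and model_enter_hooks are made-up
names; the kernel calls vtime_user_enter(current) and rcu_user_enter()
directly):

```c
#include <assert.h>

enum ctx_state { IN_KERNEL = 0, IN_USER, IN_GUEST };

/* Hypothetical counters standing in for the real vtime and RCU hooks. */
struct model_hooks {
	int vtime_user_enter;	/* vtime_user_enter(current) in the kernel */
	int rcu_user_enter;	/* rcu_user_enter() in the kernel */
};

static void model_enter_hooks(struct model_hooks *h, enum ctx_state state)
{
	assert(state == IN_USER || state == IN_GUEST);
	if (state == IN_USER)
		h->vtime_user_enter++;	/* user-time accounting only */
	h->rcu_user_enter++;		/* extended quiescent state for both */
}
```

Entering with IN_GUEST bumps only the RCU counter, while IN_USER bumps
both; guest time accounting is assumed to be handled elsewhere.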

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/context_tracking.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 4c010787c9ec..97806c4deec5 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -85,7 +85,8 @@ void context_tracking_user_enter(enum ctx_state state)
 			 * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency
 			 * on the tick.
 			 */
-			vtime_user_enter(current);
+			if (state == IN_USER)
+				vtime_user_enter(current);
 			rcu_user_enter();
 		}
 		/*
@@ -136,7 +137,8 @@ void context_tracking_user_exit(enum ctx_state state)
 			 * RCU core about that (ie: we may need the tick again).
 			 */
 			rcu_user_exit();
-			vtime_user_exit(current);
+			if (state == IN_USER)
+				vtime_user_exit(current);
 			trace_user_exit(0);
 		}
 		__this_cpu_write(context_tracking.state, IN_KERNEL);
-- 
1.9.3



* [PATCH 3/5] nohz,kvm: export context_tracking_user_enter/exit
  2015-02-05 20:23 [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
  2015-02-05 20:23 ` [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
  2015-02-05 20:23 ` [PATCH 2/5] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
@ 2015-02-05 20:23 ` riel
  2015-02-05 23:55   ` Paul E. McKenney
  2015-02-05 20:23 ` [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 34+ messages in thread
From: riel @ 2015-02-05 20:23 UTC (permalink / raw)
  To: kvm
  Cc: borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Export context_tracking_user_enter/exit so they can be used by KVM.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/context_tracking.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 97806c4deec5..a3f4a2840637 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -107,6 +107,7 @@ void context_tracking_user_enter(enum ctx_state state)
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_user_enter);
+EXPORT_SYMBOL_GPL(context_tracking_user_enter);
 
 /**
  * context_tracking_user_exit - Inform the context tracking that the CPU is
@@ -146,6 +147,7 @@ void context_tracking_user_exit(enum ctx_state state)
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_user_exit);
+EXPORT_SYMBOL_GPL(context_tracking_user_exit);
 
 /**
  * __context_tracking_task_switch - context switch the syscall callbacks
-- 
1.9.3



* [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-05 20:23 [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
                   ` (2 preceding siblings ...)
  2015-02-05 20:23 ` [PATCH 3/5] nohz,kvm: export context_tracking_user_enter/exit riel
@ 2015-02-05 20:23 ` riel
  2015-02-05 23:56   ` Paul E. McKenney
                     ` (2 more replies)
  2015-02-05 20:23 ` [PATCH 5/5] nohz: add stub context_tracking_is_enabled riel
                   ` (2 subsequent siblings)
  6 siblings, 3 replies; 34+ messages in thread
From: riel @ 2015-02-05 20:23 UTC (permalink / raw)
  To: kvm
  Cc: borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

The host kernel is not doing anything while the CPU is executing
a KVM guest VCPU, so it can be marked as being in an extended
quiescent state, identical to that used when running user space
code.

The only exception to that rule is when the host handles an
interrupt, which is already handled by the irq code, which
calls rcu_irq_enter and rcu_irq_exit.

The guest_enter and guest_exit functions already switch vtime
accounting independent of context tracking, so leave those calls
where they are, instead of moving them into the context tracking
code.
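
As a rough userspace model of the resulting guest entry path (all names
are hypothetical, and the single "tracking" flag is a simplification:
the patch checks context_tracking_is_enabled() in guest_enter and
context_tracking_cpu_is_enabled() in kvm_guest_enter):

```c
#include <assert.h>
#include <stdbool.h>

enum ctx_state { IN_KERNEL = 0, IN_USER, IN_GUEST };

struct model_cpu {
	bool tracking;		/* one flag for both "enabled" checks */
	enum ctx_state state;
	int eqs_enters;		/* stands in for rcu_user_enter() */
	int qs_notes;		/* stands in for rcu_virt_note_context_switch() */
};

static void model_kvm_guest_enter(struct model_cpu *cpu)
{
	if (cpu->tracking) {
		/* Context tracking puts RCU in an extended quiescent state. */
		if (cpu->state != IN_GUEST) {
			cpu->eqs_enters++;
			cpu->state = IN_GUEST;
		}
	} else {
		/* Without tracking, keep the explicit quiescent-state note. */
		cpu->qs_notes++;
	}
}

static void model_kvm_guest_exit(struct model_cpu *cpu)
{
	if (cpu->tracking && cpu->state == IN_GUEST)
		cpu->state = IN_KERNEL;
}
```

With tracking on, the model never makes the explicit note; with
tracking off, it never changes state. Host interrupts taken while in
guest mode are outside this model.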

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking.h       | 8 +++++++-
 include/linux/context_tracking_state.h | 1 +
 include/linux/kvm_host.h               | 3 ++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index bd9f000fc98d..a5d3bb44b897 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
 static inline void exception_exit(enum ctx_state prev_ctx)
 {
 	if (context_tracking_is_enabled()) {
-		if (prev_ctx == IN_USER)
+		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
 			context_tracking_user_enter(prev_ctx);
 	}
 }
@@ -78,6 +78,9 @@ static inline void guest_enter(void)
 		vtime_guest_enter(current);
 	else
 		current->flags |= PF_VCPU;
+
+	if (context_tracking_is_enabled())
+		context_tracking_user_enter(IN_GUEST);
 }
 
 static inline void guest_exit(void)
@@ -86,6 +89,9 @@ static inline void guest_exit(void)
 		vtime_guest_exit(current);
 	else
 		current->flags &= ~PF_VCPU;
+
+	if (context_tracking_is_enabled())
+		context_tracking_user_exit(IN_GUEST);
 }
 
 #else
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 97a81225d037..f3ef027af749 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -15,6 +15,7 @@ struct context_tracking {
 	enum ctx_state {
 		IN_KERNEL = 0,
 		IN_USER,
+		IN_GUEST,
 	} state;
 };
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 26f106022c88..c7828a6a9614 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -772,7 +772,8 @@ static inline void kvm_guest_enter(void)
 	 * one time slice). Lets treat guest mode as quiescent state, just like
 	 * we do with user-mode execution.
 	 */
-	rcu_virt_note_context_switch(smp_processor_id());
+	if (!context_tracking_cpu_is_enabled())
+		rcu_virt_note_context_switch(smp_processor_id());
 }
 
 static inline void kvm_guest_exit(void)
-- 
1.9.3



* [PATCH 5/5] nohz: add stub context_tracking_is_enabled
  2015-02-05 20:23 [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
                   ` (3 preceding siblings ...)
  2015-02-05 20:23 ` [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
@ 2015-02-05 20:23 ` riel
  2015-02-05 23:56   ` Paul E. McKenney
  2015-02-06 13:46 ` [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest Frederic Weisbecker
  2015-02-06 15:00 ` Christian Borntraeger
  6 siblings, 1 reply; 34+ messages in thread
From: riel @ 2015-02-05 20:23 UTC (permalink / raw)
  To: kvm
  Cc: borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, paulmck, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

With code elsewhere doing something conditional on whether or not
context tracking is enabled, we want a stub function that tells us
context tracking is not enabled, when CONFIG_CONTEXT_TRACKING is
not set.
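
The stub pattern itself, sketched standalone. MODEL_CONTEXT_TRACKING is
a made-up macro standing in for CONFIG_CONTEXT_TRACKING, and the
function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* MODEL_CONTEXT_TRACKING is a made-up stand-in for CONFIG_CONTEXT_TRACKING. */
#ifdef MODEL_CONTEXT_TRACKING
static bool model_tracking_flag = true;
static inline bool model_tracking_is_enabled(void) { return model_tracking_flag; }
#else
/* Stub: the feature is compiled out, so callers always see "disabled". */
static inline bool model_tracking_is_enabled(void) { return false; }
#endif

/* The caller needs no #ifdef of its own; returns the number of notes made. */
static int model_note_quiescent_state(void)
{
	if (!model_tracking_is_enabled())
		return 1;	/* fall back to the explicit quiescent-state note */
	return 0;
}
```

Built without the macro, the stub returns a constant false, so the
compiler can drop the check and the caller always takes the fallback
path.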

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking_state.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index f3ef027af749..90a7bab8779e 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -40,6 +40,8 @@ static inline bool context_tracking_in_user(void)
 #else
 static inline bool context_tracking_in_user(void) { return false; }
 static inline bool context_tracking_active(void) { return false; }
+static inline bool context_tracking_is_enabled(void) { return false; }
+static inline bool context_tracking_cpu_is_enabled(void) { return false; }
 #endif /* CONFIG_CONTEXT_TRACKING */
 
 #endif
-- 
1.9.3



* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-05 20:23 ` [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
@ 2015-02-05 23:55   ` Paul E. McKenney
  2015-02-06 10:15     ` Paolo Bonzini
  2015-02-06 17:22   ` Frederic Weisbecker
  1 sibling, 1 reply; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-05 23:55 UTC (permalink / raw)
  To: riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 03:23:48PM -0500, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
> 
> Add the expected ctx_state as a parameter to context_tracking_user_enter
> and context_tracking_user_exit, allowing the same functions to not just
> track kernel <> user space switching, but also kernel <> guest transitions.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>




* Re: [PATCH 3/5] nohz,kvm: export context_tracking_user_enter/exit
  2015-02-05 20:23 ` [PATCH 3/5] nohz,kvm: export context_tracking_user_enter/exit riel
@ 2015-02-05 23:55   ` Paul E. McKenney
  0 siblings, 0 replies; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-05 23:55 UTC (permalink / raw)
  To: riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 03:23:50PM -0500, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
> 
> Export context_tracking_user_enter/exit so it can be used by KVM.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>




* Re: [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-05 20:23 ` [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
@ 2015-02-05 23:56   ` Paul E. McKenney
  2015-02-06 18:01   ` Frederic Weisbecker
  2015-02-06 23:24   ` Frederic Weisbecker
  2 siblings, 0 replies; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-05 23:56 UTC (permalink / raw)
  To: riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 03:23:51PM -0500, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
> 
> The host kernel is not doing anything while the CPU is executing
> a KVM guest VCPU, so it can be marked as being in an extended
> quiescent state, identical to that used when running user space
> code.
> 
> The only exception to that rule is when the host handles an
> interrupt, which is already handled by the irq code, which
> calls rcu_irq_enter and rcu_irq_exit.
> 
> The guest_enter and guest_exit functions already switch vtime
> accounting independent of context tracking, so leave those calls
> where they are, instead of moving them into the context tracking
> code.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>




* Re: [PATCH 5/5] nohz: add stub context_tracking_is_enabled
  2015-02-05 20:23 ` [PATCH 5/5] nohz: add stub context_tracking_is_enabled riel
@ 2015-02-05 23:56   ` Paul E. McKenney
  0 siblings, 0 replies; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-05 23:56 UTC (permalink / raw)
  To: riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 03:23:52PM -0500, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
> 
> With code elsewhere doing something conditional on whether or not
> context tracking is enabled, we want a stub function that tells us
> context tracking is not enabled, when CONFIG_CONTEXT_TRACKING is
> not set.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>




* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-05 23:55   ` Paul E. McKenney
@ 2015-02-06 10:15     ` Paolo Bonzini
  2015-02-06 13:41       ` Paul E. McKenney
  0 siblings, 1 reply; 34+ messages in thread
From: Paolo Bonzini @ 2015-02-06 10:15 UTC (permalink / raw)
  To: paulmck, riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, lcapitulino



On 06/02/2015 00:55, Paul E. McKenney wrote:
> On Thu, Feb 05, 2015 at 03:23:48PM -0500, riel@redhat.com wrote:
>> From: Rik van Riel <riel@redhat.com>
>>
>> Add the expected ctx_state as a parameter to context_tracking_user_enter
>> and context_tracking_user_exit, allowing the same functions to not just
>> track kernel <> user space switching, but also kernel <> guest transitions.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
> 
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

/me wonders: whose tree is supposed to carry these patches?

Paolo



* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-06 10:15     ` Paolo Bonzini
@ 2015-02-06 13:41       ` Paul E. McKenney
  0 siblings, 0 replies; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-06 13:41 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: riel, kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, fweisbec, lcapitulino

On Fri, Feb 06, 2015 at 11:15:57AM +0100, Paolo Bonzini wrote:
> 
> 
> On 06/02/2015 00:55, Paul E. McKenney wrote:
> > On Thu, Feb 05, 2015 at 03:23:48PM -0500, riel@redhat.com wrote:
> >> From: Rik van Riel <riel@redhat.com>
> >>
> >> Add the expected ctx_state as a parameter to context_tracking_user_enter
> >> and context_tracking_user_exit, allowing the same functions to not just
> >> track kernel <> user space switching, but also kernel <> guest transitions.
> >>
> >> Signed-off-by: Rik van Riel <riel@redhat.com>
> > 
> > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> /me wonders: whose tree is supposed to carry these patches?

If no one else does, I would be happy to.  I would be thinking in terms
of 3.21, in other words, not the merge window starting in three days,
but the one after that.

							Thanx, Paul

> Paolo
> 
> >> ---
> >>  include/linux/context_tracking.h | 12 ++++++------
> >>  kernel/context_tracking.c        | 10 +++++-----
> >>  2 files changed, 11 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >> index 37b81bd51ec0..bd9f000fc98d 100644
> >> --- a/include/linux/context_tracking.h
> >> +++ b/include/linux/context_tracking.h
> >> @@ -10,21 +10,21 @@
> >>  #ifdef CONFIG_CONTEXT_TRACKING
> >>  extern void context_tracking_cpu_set(int cpu);
> >>
> >> -extern void context_tracking_user_enter(void);
> >> -extern void context_tracking_user_exit(void);
> >> +extern void context_tracking_user_enter(enum ctx_state state);
> >> +extern void context_tracking_user_exit(enum ctx_state state);
> >>  extern void __context_tracking_task_switch(struct task_struct *prev,
> >>  					   struct task_struct *next);
> >>
> >>  static inline void user_enter(void)
> >>  {
> >>  	if (context_tracking_is_enabled())
> >> -		context_tracking_user_enter();
> >> +		context_tracking_user_enter(IN_USER);
> >>
> >>  }
> >>  static inline void user_exit(void)
> >>  {
> >>  	if (context_tracking_is_enabled())
> >> -		context_tracking_user_exit();
> >> +		context_tracking_user_exit(IN_USER);
> >>  }
> >>
> >>  static inline enum ctx_state exception_enter(void)
> >> @@ -35,7 +35,7 @@ static inline enum ctx_state exception_enter(void)
> >>  		return 0;
> >>
> >>  	prev_ctx = this_cpu_read(context_tracking.state);
> >> -	context_tracking_user_exit();
> >> +	context_tracking_user_exit(prev_ctx);
> >>
> >>  	return prev_ctx;
> >>  }
> >> @@ -44,7 +44,7 @@ static inline void exception_exit(enum ctx_state prev_ctx)
> >>  {
> >>  	if (context_tracking_is_enabled()) {
> >>  		if (prev_ctx == IN_USER)
> >> -			context_tracking_user_enter();
> >> +			context_tracking_user_enter(prev_ctx);
> >>  	}
> >>  }
> >>
> >> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> >> index 937ecdfdf258..4c010787c9ec 100644
> >> --- a/kernel/context_tracking.c
> >> +++ b/kernel/context_tracking.c
> >> @@ -47,7 +47,7 @@ void context_tracking_cpu_set(int cpu)
> >>   * to execute won't use any RCU read side critical section because this
> >>   * function sets RCU in extended quiescent state.
> >>   */
> >> -void context_tracking_user_enter(void)
> >> +void context_tracking_user_enter(enum ctx_state state)
> >>  {
> >>  	unsigned long flags;
> >>
> >> @@ -75,7 +75,7 @@ void context_tracking_user_enter(void)
> >>  	WARN_ON_ONCE(!current->mm);
> >>
> >>  	local_irq_save(flags);
> >> -	if ( __this_cpu_read(context_tracking.state) != IN_USER) {
> >> +	if ( __this_cpu_read(context_tracking.state) != state) {
> >>  		if (__this_cpu_read(context_tracking.active)) {
> >>  			trace_user_enter(0);
> >>  			/*
> >> @@ -101,7 +101,7 @@ void context_tracking_user_enter(void)
> >>  		 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
> >>  		 * is false because we know that CPU is not tickless.
> >>  		 */
> >> -		__this_cpu_write(context_tracking.state, IN_USER);
> >> +		__this_cpu_write(context_tracking.state, state);
> >>  	}
> >>  	local_irq_restore(flags);
> >>  }
> >> @@ -118,7 +118,7 @@ NOKPROBE_SYMBOL(context_tracking_user_enter);
> >>   * This call supports re-entrancy. This way it can be called from any exception
> >>   * handler without needing to know if we came from userspace or not.
> >>   */
> >> -void context_tracking_user_exit(void)
> >> +void context_tracking_user_exit(enum ctx_state state)
> >>  {
> >>  	unsigned long flags;
> >>
> >> @@ -129,7 +129,7 @@ void context_tracking_user_exit(void)
> >>  		return;
> >>
> >>  	local_irq_save(flags);
> >> -	if (__this_cpu_read(context_tracking.state) == IN_USER) {
> >> +	if (__this_cpu_read(context_tracking.state) == state) {
> >>  		if (__this_cpu_read(context_tracking.active)) {
> >>  			/*
> >>  			 * We are going to run code that may use RCU. Inform
> >> -- 
> >> 1.9.3
> >>
> > 
> 



* Re: [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
  2015-02-05 20:23 [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
                   ` (4 preceding siblings ...)
  2015-02-05 20:23 ` [PATCH 5/5] nohz: add stub context_tracking_is_enabled riel
@ 2015-02-06 13:46 ` Frederic Weisbecker
  2015-02-06 13:50   ` Paolo Bonzini
  2015-02-06 14:56   ` Rik van Riel
  2015-02-06 15:00 ` Christian Borntraeger
  6 siblings, 2 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 13:46 UTC (permalink / raw)
  To: riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 03:23:47PM -0500, riel@redhat.com wrote:
> When running a KVM guest on a system with NOHZ_FULL enabled

I just need to clarify the motivation first: does the above situation
really happen? OK, some distros enable NOHZ_FULL to let the user stop
the tick in userspace. So most of the time, CONFIG_NOHZ_FULL=y but
nohz full is runtime disabled (we need to pass a nohz_full= boot
parameter to enable it). And when it is runtime disabled, there should
be no rcu nocb CPU.

(Although not setting CPUs in nocb mode when nohz full is runtime disabled
is perhaps a recent change.)

So for the problem to arise, one needs to enable nohz_full and run a KVM
guest, and I have never heard of such workloads. That said, it's potentially
interesting to turn off the tick on the host when the guest runs.

>, and the
> KVM guest running with idle=poll mode, we still get wakeups of the
> rcuos/N threads.

So we need nohz_full on the host and idle=poll mode on the guest. Is it
likely to happen? (sorry, again I'm just trying to make sure we agree on
why we do this change).
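
Whether a host really has nohz_full requested at boot, as opposed to
merely being built with CONFIG_NOHZ_FULL=y, can be checked from user
space. A minimal sketch, assuming the caller has already read the kernel
command line (for example from /proc/cmdline) into a string. The helper
name cmdline_has_nohz_full is made up for illustration, and the parsing
is deliberately naive (space-separated tokens, no quoting):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the space-separated command line contains a
 * "nohz_full=" token with a non-empty value.
 * Hypothetical helper, not a kernel or libc API.
 */
static bool cmdline_has_nohz_full(const char *cmdline)
{
	static const char key[] = "nohz_full=";
	const size_t keylen = sizeof(key) - 1;
	const char *p = cmdline;

	assert(cmdline != NULL);

	while ((p = strstr(p, key)) != NULL) {
		/* Only accept the key at the start of a token. */
		bool at_token_start = (p == cmdline) || p[-1] == ' ';
		char next = p[keylen];

		/* Require at least one character of value after '='. */
		if (at_token_start && next != '\0' && next != ' ' && next != '\n')
			return true;
		p += keylen;
	}
	return false;
}
```

This only reports that the parameter was passed. It says nothing about
whether the running kernel was built with CONFIG_NOHZ_FULL and honours it.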

> 
> This problem has already been solved for user space by telling the
> RCU subsystem that the CPU is in an extended quiescent state while
> running user space code.
> 
> This patch series extends that code a little bit to make it usable
> to track KVM guest space, too.
> 
> I tested the code by booting a KVM guest with idle=poll, on a system
> with NOHZ_FULL enabled on most CPUs, and a VCPU thread bound to a
> CPU. In a 10 second interval, rcuos/N threads on other CPUs got woken
> up several times, while the rcuos thread on the CPU running the bound
> and alwasy running VCPU thread never got woken up once.

So what you're describing is to set RCU in extended quiescent state, right?
This doesn't include stopping the tick while running in guest mode? Those
are indeed two different things, although stopping the tick most often
requires setting RCU in an extended quiescent state.

> 
> Thanks to Christian Borntraeger and Paul McKenney for reviewing the
> first version of this patch series, and helping optimize patch 4/5.
> 


* Re: [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
  2015-02-06 13:46 ` [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest Frederic Weisbecker
@ 2015-02-06 13:50   ` Paolo Bonzini
  2015-02-06 16:19     ` Paul E. McKenney
  2015-02-06 18:09     ` Frederic Weisbecker
  2015-02-06 14:56   ` Rik van Riel
  1 sibling, 2 replies; 34+ messages in thread
From: Paolo Bonzini @ 2015-02-06 13:50 UTC (permalink / raw)
  To: Frederic Weisbecker, riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino



On 06/02/2015 14:46, Frederic Weisbecker wrote:
> > When running a KVM guest on a system with NOHZ_FULL enabled
> 
> I just need to clarify the motivation first, does the above situation
> really happen? Ok some distros enable NOHZ_FULL to let the user stop
> the tick in userspace. So most of the time, CONFIG_NOHZ_FULL=y but
> nohz full is runtime disabled (we need to pass a nohz_full= boot
> parameter to enable it). And when it is runtime disabled, there should
> be no rcu nocb CPU.
> 
> (Although not setting CPUs in nocb mode when nohz full is runtime disabled
> is perhaps a recent change.)
> 
> So for the problem to arise, one need to enable nohz_full and run KVM
> guest. And I never heard about such workloads.

Yeah, it's a new thing but Marcelo, Luiz and Rik have been having a lot
of fun with them (with PREEMPT_RT too).  They're getting pretty good
results given the right tuning.

I'll let Paul queue the patches for 3.21 then!

Paolo

> That said it's potentially
> interesting to turn off the tick on the host when the guest runs.


* Re: [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
  2015-02-06 13:46 ` [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest Frederic Weisbecker
  2015-02-06 13:50   ` Paolo Bonzini
@ 2015-02-06 14:56   ` Rik van Riel
  2015-02-06 18:05     ` Frederic Weisbecker
  1 sibling, 1 reply; 34+ messages in thread
From: Rik van Riel @ 2015-02-06 14:56 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/06/2015 08:46 AM, Frederic Weisbecker wrote:
> On Thu, Feb 05, 2015 at 03:23:47PM -0500, riel@redhat.com wrote:
>> When running a KVM guest on a system with NOHZ_FULL enabled
> 
> I just need to clarify the motivation first, does the above
> situation really happen? Ok some distros enable NOHZ_FULL to let
> the user stop the tick in userspace. So most of the time,
> CONFIG_NOHZ_FULL=y but nohz full is runtime disabled (we need to
> pass a nohz_full= boot parameter to enable it). And when it is
> runtime disabled, there should be no rcu nocb CPU.
> 
> (Although not setting CPUs in nocb mode when nohz full is runtime
> disabled is perhaps a recent change.)
> 
> So for the problem to arise, one need to enable nohz_full and run
> KVM guest. And I never heard about such workloads. That said it's
> potentially interesting to turn off the tick on the host when the
> guest runs.
> 
>> , and the KVM guest running with idle=poll mode, we still get
>> wakeups of the rcuos/N threads.
> 
> So we need nohz_full on the host and idle=poll mode on the guest.
> Is it likely to happen? (sorry, again I'm just trying to make sure
> we agree on why we do this change).

We have users interested in doing just that, in order to run
KVM guests with the least amount of perturbation to the guest.


- -- 
All rights reversed


* Re: [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
  2015-02-05 20:23 [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
                   ` (5 preceding siblings ...)
  2015-02-06 13:46 ` [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest Frederic Weisbecker
@ 2015-02-06 15:00 ` Christian Borntraeger
  2015-02-06 18:20   ` Frederic Weisbecker
  6 siblings, 1 reply; 34+ messages in thread
From: Christian Borntraeger @ 2015-02-06 15:00 UTC (permalink / raw)
  To: riel, kvm
  Cc: linux-kernel, mtosatti, mingo, ak, oleg, masami.hiramatsu.pt,
	fweisbec, paulmck, lcapitulino, pbonzini, linux-s390

Am 05.02.2015 um 21:23 schrieb riel@redhat.com:
> When running a KVM guest on a system with NOHZ_FULL enabled, and the
> KVM guest running with idle=poll mode, we still get wakeups of the
> rcuos/N threads.
> 
> This problem has already been solved for user space by telling the
> RCU subsystem that the CPU is in an extended quiescent state while
> running user space code.
> 
> This patch series extends that code a little bit to make it usable
> to track KVM guest space, too.
> 
> I tested the code by booting a KVM guest with idle=poll, on a system
> with NOHZ_FULL enabled on most CPUs, and a VCPU thread bound to a
> CPU. In a 10 second interval, rcuos/N threads on other CPUs got woken
> up several times, while the rcuos thread on the CPU running the bound
> and alwasy running VCPU thread never got woken up once.
> 
> Thanks to Christian Borntraeger and Paul McKenney for reviewing the
> first version of this patch series, and helping optimize patch 4/5.

I gave it a quick run on s390/kvm and everything still seems to be
running fine. Also, I like the idea of this patch set.

We have seen several cases where the fact that we stay in guest context
for a full tick with CPU-bound guests (10ms on s390) caused significant
latencies for host workloads that are heavy on synchronize_rcu(), e.g.
getting rid of macvtap devices on guest shutdown, or adding hundreds of
irq routes for many guest devices.

s390 has no context tracking infrastructure yet (no nohz_full), but
with this series the current case (nohz_idle) still appears to work.
With this in place, having nohz_full on s390 makes even more
sense, as KVM hosts with CPU-bound guests should get much quicker
RCU response times when most host CPUs are in an extended quiescent
state.

Christian



* Re: [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
  2015-02-06 13:50   ` Paolo Bonzini
@ 2015-02-06 16:19     ` Paul E. McKenney
  2015-02-06 18:09     ` Frederic Weisbecker
  1 sibling, 0 replies; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-06 16:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Frederic Weisbecker, riel, kvm, borntraeger, linux-kernel,
	mtosatti, mingo, ak, oleg, masami.hiramatsu.pt, lcapitulino

On Fri, Feb 06, 2015 at 02:50:44PM +0100, Paolo Bonzini wrote:
> 
> 
> On 06/02/2015 14:46, Frederic Weisbecker wrote:
> > > When running a KVM guest on a system with NOHZ_FULL enabled
> > 
> > I just need to clarify the motivation first, does the above situation
> > really happen? Ok some distros enable NOHZ_FULL to let the user stop
> > the tick in userspace. So most of the time, CONFIG_NOHZ_FULL=y but
> > nohz full is runtime disabled (we need to pass a nohz_full= boot
> > parameter to enable it). And when it is runtime disabled, there should
> > be no rcu nocb CPU.
> > 
> > (Although not setting CPUs in nocb mode when nohz full is runtime disabled
> > is perhaps a recent change.)
> > 
> > So for the problem to arise, one need to enable nohz_full and run KVM
> > guest. And I never heard about such workloads.
> 
> Yeah, it's a new thing but Marcelo, Luiz and Rik have been having a lot
> of fun with them (with PREEMPT_RT too).  They're getting pretty good
> results given the right tuning.
> 
> I'll let Paul queue the patches for 3.21 then!

Frederic, given the background from Paolo, Rik, and Christian, are you
OK with these patches?

							Thanx, Paul

> Paolo
> 
> > That said it's potentially
> > interesting to turn off the tick on the host when the guest runs.
> 



* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-05 20:23 ` [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
  2015-02-05 23:55   ` Paul E. McKenney
@ 2015-02-06 17:22   ` Frederic Weisbecker
  2015-02-06 18:20     ` Rik van Riel
  1 sibling, 1 reply; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 17:22 UTC (permalink / raw)
  To: riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 03:23:48PM -0500, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
> 
> Add the expected ctx_state as a parameter to context_tracking_user_enter
> and context_tracking_user_exit, allowing the same functions to not just
> track kernel <> user space switching, but also kernel <> guest transitions.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>

You should consider using guest_enter() and guest_exit() instead. These are
context tracking APIs too but specifically for guest.

These can be uninlined if needed.


* Re: [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-05 20:23 ` [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
  2015-02-05 23:56   ` Paul E. McKenney
@ 2015-02-06 18:01   ` Frederic Weisbecker
  2015-02-06 23:24   ` Frederic Weisbecker
  2 siblings, 0 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 18:01 UTC (permalink / raw)
  To: riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 03:23:51PM -0500, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
> 
> The host kernel is not doing anything while the CPU is executing
> a KVM guest VCPU, so it can be marked as being in an extended
> quiescent state, identical to that used when running user space
> code.
> 
> The only exception to that rule is when the host handles an
> interrupt, which is already handled by the irq code, which
> calls rcu_irq_enter and rcu_irq_exit.
> 
> The guest_enter and guest_exit functions already switch vtime
> accounting independent of context tracking, so leave those calls
> where they are, instead of moving them into the context tracking
> code.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>  include/linux/context_tracking.h       | 8 +++++++-
>  include/linux/context_tracking_state.h | 1 +
>  include/linux/kvm_host.h               | 3 ++-
>  3 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index bd9f000fc98d..a5d3bb44b897 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
>  static inline void exception_exit(enum ctx_state prev_ctx)
>  {
>  	if (context_tracking_is_enabled()) {
> -		if (prev_ctx == IN_USER)
> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
>  			context_tracking_user_enter(prev_ctx);
>  	}
>  }
> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
>  		vtime_guest_enter(current);
>  	else
>  		current->flags |= PF_VCPU;
> +
> +	if (context_tracking_is_enabled())
> +		context_tracking_user_enter(IN_GUEST);

So you should probably just call rcu_user_enter() directly from
there. context_tracking_user_enter() is really about userspace
boundaries.

>  }
>  
>  static inline void guest_exit(void)
> @@ -86,6 +89,9 @@ static inline void guest_exit(void)
>  		vtime_guest_exit(current);
>  	else
>  		current->flags &= ~PF_VCPU;
> +
> +	if (context_tracking_is_enabled())
> +		context_tracking_user_exit(IN_GUEST);
>  }
>  
>  #else
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index 97a81225d037..f3ef027af749 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -15,6 +15,7 @@ struct context_tracking {
>  	enum ctx_state {
>  		IN_KERNEL = 0,
>  		IN_USER,
> +		IN_GUEST,
>  	} state;
>  };
>  
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 26f106022c88..c7828a6a9614 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -772,7 +772,8 @@ static inline void kvm_guest_enter(void)
>  	 * one time slice). Lets treat guest mode as quiescent state, just like
>  	 * we do with user-mode execution.
>  	 */
> -	rcu_virt_note_context_switch(smp_processor_id());
> +	if (!context_tracking_cpu_is_enabled())
> +		rcu_virt_note_context_switch(smp_processor_id());

Should we have a specific CONFIG for this feature? Or is relying on full
dynticks being enabled (and thus context tracking being enabled) enough?

Thanks.
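
The state handling that patches 1/5 and 4/5 introduce can be modelled in
user space. The following is a toy sketch, not kernel code: struct
ctx_model and the model_* helpers are hypothetical names, a plain bool
stands in for the rcu_user_enter()/rcu_user_exit() calls, and the
context_tracking.active check, vtime accounting and IRQ masking are all
left out.

```c
#include <assert.h>
#include <stdbool.h>

enum ctx_state { IN_KERNEL = 0, IN_USER, IN_GUEST };

/* Toy stand-in for the per-CPU context tracking data (hypothetical). */
struct ctx_model {
	enum ctx_state state;
	bool rcu_eqs;	/* true while RCU would see an extended quiescent state */
};

/* Models context_tracking_user_enter(state) as changed by patch 1/5. */
static void model_enter(struct ctx_model *m, enum ctx_state state)
{
	assert(state != IN_KERNEL);
	if (m->state != state) {
		m->rcu_eqs = true;	/* stands in for rcu_user_enter() */
		m->state = state;
	}
}

/* Models context_tracking_user_exit(state): acts only on a matching state. */
static void model_exit(struct ctx_model *m, enum ctx_state state)
{
	if (m->state == state) {
		m->rcu_eqs = false;	/* stands in for rcu_user_exit() */
		m->state = IN_KERNEL;
	}
}

/* Models exception_enter(): leave whatever state we were in, remember it. */
static enum ctx_state model_exception_enter(struct ctx_model *m)
{
	enum ctx_state prev = m->state;

	model_exit(m, prev);
	return prev;
}

/* Models exception_exit() with the IN_GUEST case added by patch 4/5. */
static void model_exception_exit(struct ctx_model *m, enum ctx_state prev)
{
	if (prev == IN_USER || prev == IN_GUEST)
		model_enter(m, prev);
}
```

In this model an exception taken while the state is IN_GUEST drops back
to IN_KERNEL with RCU watching, and returns to IN_GUEST afterwards, which
is the behaviour the extra state value is meant to allow.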

>  }
>  
>  static inline void kvm_guest_exit(void)
> -- 
> 1.9.3
> 


* Re: [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
  2015-02-06 14:56   ` Rik van Riel
@ 2015-02-06 18:05     ` Frederic Weisbecker
  0 siblings, 0 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 18:05 UTC (permalink / raw)
  To: Rik van Riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

On Fri, Feb 06, 2015 at 09:56:43AM -0500, Rik van Riel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 02/06/2015 08:46 AM, Frederic Weisbecker wrote:
> > On Thu, Feb 05, 2015 at 03:23:47PM -0500, riel@redhat.com wrote:
> >> When running a KVM guest on a system with NOHZ_FULL enabled
> > 
> > I just need to clarify the motivation first, does the above
> > situation really happen? Ok some distros enable NOHZ_FULL to let
> > the user stop the tick in userspace. So most of the time,
> > CONFIG_NOHZ_FULL=y but nohz full is runtime disabled (we need to
> > pass a nohz_full= boot parameter to enable it). And when it is
> > runtime disabled, there should be no rcu nocb CPU.
> > 
> > (Although not setting CPUs in nocb mode when nohz full is runtime
> > disabled is perhaps a recent change.)
> > 
> > So for the problem to arise, one need to enable nohz_full and run
> > KVM guest. And I never heard about such workloads. That said it's
> > potentially interesting to turn off the tick on the host when the
> > guest runs.
> > 
> >> , and the KVM guest running with idle=poll mode, we still get
> >> wakeups of the rcuos/N threads.
> > 
> > So we need nohz_full on the host and idle=poll mode on the guest.
> > Is it likely to happen? (sorry, again I'm just trying to make sure
> > we agree on why we do this change).
> 
> We have users interested in doing just that, in order to run
> KVM guests with the least amount of perturbation to the guest.

So what are you interested in exactly? Only RCU extended quiescent state
or also full dynticks? Because your cover letter only points to disturbing
RCU nocb and quiescent states.

Also I'm curious why you run guests in idle=poll. Maybe that avoids host/guest
context switches?

Thanks.


* Re: [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
  2015-02-06 13:50   ` Paolo Bonzini
  2015-02-06 16:19     ` Paul E. McKenney
@ 2015-02-06 18:09     ` Frederic Weisbecker
  1 sibling, 0 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 18:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: riel, kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino

On Fri, Feb 06, 2015 at 02:50:44PM +0100, Paolo Bonzini wrote:
> 
> 
> On 06/02/2015 14:46, Frederic Weisbecker wrote:
> > > When running a KVM guest on a system with NOHZ_FULL enabled
> > 
> > I just need to clarify the motivation first, does the above situation
> > really happen? Ok some distros enable NOHZ_FULL to let the user stop
> > the tick in userspace. So most of the time, CONFIG_NOHZ_FULL=y but
> > nohz full is runtime disabled (we need to pass a nohz_full= boot
> > parameter to enable it). And when it is runtime disabled, there should
> > be no rcu nocb CPU.
> > 
> > (Although not setting CPUs in nocb mode when nohz full is runtime disabled
> > is perhaps a recent change.)
> > 
> > So for the problem to arise, one need to enable nohz_full and run KVM
> > guest. And I never heard about such workloads.
> 
> Yeah, it's a new thing but Marcelo, Luiz and Rik have been having a lot
> of fun with them (with PREEMPT_RT too).  They're getting pretty good
> results given the right tuning.

OK, but I'm still not sure about the details of what you're trying to do:
is it only about RCU, or does it also involve ticks? What kind of tuning
are you doing, and what kind of performance gain do you get?

Thanks.

> 
> I'll let Paul queue the patches for 3.21 then!
> 
> Paolo
> 
> > That said it's potentially
> > interesting to turn off the tick on the host when the guest runs.


* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-06 17:22   ` Frederic Weisbecker
@ 2015-02-06 18:20     ` Rik van Riel
  2015-02-06 18:23       ` Frederic Weisbecker
  0 siblings, 1 reply; 34+ messages in thread
From: Rik van Riel @ 2015-02-06 18:20 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/06/2015 12:22 PM, Frederic Weisbecker wrote:
> On Thu, Feb 05, 2015 at 03:23:48PM -0500, riel@redhat.com wrote:
>> From: Rik van Riel <riel@redhat.com>
>> 
>> Add the expected ctx_state as a parameter to
>> context_tracking_user_enter and context_tracking_user_exit,
>> allowing the same functions to not just track kernel <> user
>> space switching, but also kernel <> guest transitions.
>> 
>> Signed-off-by: Rik van Riel <riel@redhat.com>
> 
> You should consider using guest_enter() and guest_exit() instead.
> These are context tracking APIs too but specifically for guest.

What do you mean instead?  KVM already uses those.

I just wanted to avoid duplicating the code...

- -- 
All rights reversed


* Re: [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
  2015-02-06 15:00 ` Christian Borntraeger
@ 2015-02-06 18:20   ` Frederic Weisbecker
  0 siblings, 0 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 18:20 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: riel, kvm, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini, linux-s390

On Fri, Feb 06, 2015 at 04:00:27PM +0100, Christian Borntraeger wrote:
> Am 05.02.2015 um 21:23 schrieb riel@redhat.com:
> > When running a KVM guest on a system with NOHZ_FULL enabled, and the
> > KVM guest running with idle=poll mode, we still get wakeups of the
> > rcuos/N threads.
> > 
> > This problem has already been solved for user space by telling the
> > RCU subsystem that the CPU is in an extended quiescent state while
> > running user space code.
> > 
> > This patch series extends that code a little bit to make it usable
> > to track KVM guest space, too.
> > 
> > I tested the code by booting a KVM guest with idle=poll, on a system
> > with NOHZ_FULL enabled on most CPUs, and a VCPU thread bound to a
> > CPU. In a 10 second interval, rcuos/N threads on other CPUs got woken
> > up several times, while the rcuos thread on the CPU running the bound
> > and alwasy running VCPU thread never got woken up once.
> > 
> > Thanks to Christian Borntraeger and Paul McKenney for reviewing the
> > first version of this patch series, and helping optimize patch 4/5.
> 
> I gave it a quick run on s390/kvm and everything still seem to be 
> running fine. A also I like the idea of this patch set.
> 
> We have seen several cases were the fact that we are in guest context
> a full tick for cpu bound guests (10ms on s390) caused significant
> latencies for host synchronize-rcu heavy workload - e.g. getting rid
> of macvtap devices on guest shutdown, adding hundreds of irq routes
> for many guest devices....
> 
> s390 has no context tracking infrastructure yet (no nohz_full), but
> this series looks like that the current case (nohz_idle) still works.
> With this in place, having hohz==full on s390 now even makes more
> sense, as KVM hosts with cpu bound guests should have get much quicker
> rcu response times when most host CPUs are in an extended quiescant
> state.

Sure, if you need any help for context tracking, don't hesitate to ask,
it can be a bit tricky to implement sometimes. Perhaps x86 isn't the
best example because it does quite some weird dances to minimize fast path
overhead. ARM is perhaps clearer.

> 
> Christian
> 


* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-06 18:20     ` Rik van Riel
@ 2015-02-06 18:23       ` Frederic Weisbecker
  2015-02-06 18:51         ` Rik van Riel
  0 siblings, 1 reply; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 18:23 UTC (permalink / raw)
  To: Rik van Riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

On Fri, Feb 06, 2015 at 01:20:21PM -0500, Rik van Riel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 02/06/2015 12:22 PM, Frederic Weisbecker wrote:
> > On Thu, Feb 05, 2015 at 03:23:48PM -0500, riel@redhat.com wrote:
> >> From: Rik van Riel <riel@redhat.com>
> >> 
> >> Add the expected ctx_state as a parameter to
> >> context_tracking_user_enter and context_tracking_user_exit,
> >> allowing the same functions to not just track kernel <> user
> >> space switching, but also kernel <> guest transitions.
> >> 
> >> Signed-off-by: Rik van Riel <riel@redhat.com>
> > 
> > You should consider using guest_enter() and guest_exit() instead.
> > These are context tracking APIs too but specifically for guest.
> 
> What do you mean instead?  KVM already uses those.
> 
> I just wanted to avoid duplicating the code...

I mean you can call rcu_user APIs directly from guest_enter/exit.
You don't really need to call the context_tracking_user functions
since guest_enter/guest_exit already handle the vtime accounting.
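
The alternative described here can be modelled in user space as well.
This is a toy sketch under stated assumptions, not kernel code: struct
cpu_model and the model_* helpers are hypothetical, a bool stands in for
the rcu_user_enter()/rcu_user_exit() calls, and it is assumed that guest
entry leaves the context tracking state variable untouched.

```c
#include <assert.h>
#include <stdbool.h>

enum ctx_state { IN_KERNEL = 0, IN_USER };

/* Toy per-CPU model (hypothetical names throughout). */
struct cpu_model {
	enum ctx_state state;	/* context tracking state, not touched by guest entry */
	bool rcu_eqs;		/* true while RCU would see an extended quiescent state */
	bool vcpu_flag;		/* stands in for PF_VCPU / vtime guest accounting */
};

/* guest_enter() doing the accounting and calling the RCU API directly. */
static void model_guest_enter(struct cpu_model *c)
{
	c->vcpu_flag = true;
	c->rcu_eqs = true;	/* stands in for a direct rcu_user_enter() */
}

/* guest_exit() undoing both steps. */
static void model_guest_exit(struct cpu_model *c)
{
	assert(c->vcpu_flag);	/* exit must pair with a previous enter */
	c->rcu_eqs = false;	/* stands in for a direct rcu_user_exit() */
	c->vcpu_flag = false;
}

/* An exception entry path that consults only the context tracking state. */
static enum ctx_state model_exception_enter(struct cpu_model *c)
{
	enum ctx_state prev = c->state;

	if (prev == IN_USER) {
		c->rcu_eqs = false;
		c->state = IN_KERNEL;
	}
	return prev;
}
```

In this model any path that looks only at the state variable cannot tell
that the CPU is in guest mode, so it leaves the quiescent state in place.
Whether that matters in practice depends on whether such a path can run
between guest entry and guest exit.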

> 
> - -- 
> All rights reversed
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
> 
> iQEcBAEBAgAGBQJU1QXlAAoJEM553pKExN6D+l4H/1PPmFioxed9XyL+rJZf0XSt
> mATl5JcWGlNybL5c4Tnld/3FX5/vYwBXmgw2Rh5a84F+TJi8B+Hu2Uwetl6C6vUF
> EK2+ExJ1rla4lpiO3frxPDdfdOHJFw2bR0fhEb4GHqcN2ecfSdXtL4hKwFru5h5s
> IJ8dzNIW52vzqzmulkcvI1y+VkgQBwUXYbkiGyy/MPf4F0WvGC9g44eXHZNPRXoT
> V34/nMJCpFHlZ7FVuHqGGstmPjv19VUAYNhUkrlU8DOpZMKxT58Sb1CGLfwsGqvZ
> y0+pRca8eT+gX0vqg9YUBfoEBNy4MnHdQEwQ0EPZwPJkcQ3Leco3/1JLHyDogCg=
> =3AJV
> -----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-06 18:23       ` Frederic Weisbecker
@ 2015-02-06 18:51         ` Rik van Riel
  2015-02-06 23:15           ` Frederic Weisbecker
  0 siblings, 1 reply; 34+ messages in thread
From: Rik van Riel @ 2015-02-06 18:51 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/06/2015 01:23 PM, Frederic Weisbecker wrote:
> On Fri, Feb 06, 2015 at 01:20:21PM -0500, Rik van Riel wrote:
>> On 02/06/2015 12:22 PM, Frederic Weisbecker wrote:
>>> On Thu, Feb 05, 2015 at 03:23:48PM -0500, riel@redhat.com wrote:
>>>> From: Rik van Riel <riel@redhat.com>
>>>> 
>>>> Add the expected ctx_state as a parameter to
>>>> context_tracking_user_enter and context_tracking_user_exit,
>>>> allowing the same functions to not just track kernel <> user
>>>> space switching, but also kernel <> guest transitions.
>>>> 
>>>> Signed-off-by: Rik van Riel <riel@redhat.com>
>>> 
>>> You should consider using guest_enter() and guest_exit() instead.
>>> These are context tracking APIs too but specifically for guest.
>> 
>> What do you mean instead?  KVM already uses those.
>> 
>> I just wanted to avoid duplicating the code...
> 
> I mean you can call rcu_user APIs directly from guest_enter/exit.
> You don't really need to call the context_tracking_user functions
> since guest_enter/guest_exit already handle the vtime accounting.

I would still have to modify exception_enter and exception_exit,
and with them context_tracking_user_enter and
context_tracking_user_exit.

We have to re-enable RCU when an exception happens.

I suspect exceptions in a guest just trigger VMEXIT, and we
figure out later why the exception happened. However, if we were
to get an exception during the code where we transition into
or out of guest mode, we would still need exception_enter
and exception_exit...

- -- 
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJU1Q1MAAoJEM553pKExN6DB90H/iVCnfrooAB15E5Qioa3Ty+X
hNMaIMX6zjYg++IFR5BhYLp9hp36o/98sv8RLTjZQix2q1ljivobmbABvx2MBNhx
NiPfU9DyBkhz3gwI4oTkggb383Wrcyt+RgvclI/96AbwkhrdzxmT1nnUc0kA98xC
6NTW2+imkYX31sY/2SFmYWnJMVZOjOIep3LCVh/hrWnQARd6mdyzzFr+v6Z/vyFe
8P2rbqlkN0nf1pGYz3VF6zqF8wVmOi1mx4mo0qy80Sax7jsZv9+gGfbF1HkHJnjg
FLsj/q/mcrH1GBK54a3s3P6ghpcFXfwibjhkGmrmA07XNHqLiNgKgmgPtArhU+s=
=9Ln1
-----END PGP SIGNATURE-----
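
An illustrative sketch, not part of the original message, of the point
made above: an exception taken while the CPU is marked as being in guest
context has to make RCU watch again, and then put the previous context
back on the way out. This is a simplified user-space model, not kernel
code. The global cur_state, the rcu_watching flag, and the model_ctx_*
helpers are hypothetical stand-ins for the per-CPU context tracking state
and for RCU leaving and re-entering its extended quiescent state. The
IN_GUEST test follows the form used in patch 4/5.

```c
#include <assert.h>
#include <stdbool.h>

/* Same values as the enum in context_tracking_state.h after patch 4/5. */
enum ctx_state { IN_KERNEL = 0, IN_USER, IN_GUEST };

/* Hypothetical stand-ins for per-CPU state (assumption, not kernel API). */
static enum ctx_state cur_state = IN_KERNEL;
static bool rcu_watching = true;	/* false while in an extended quiescent state */

/* Model of entering user or guest context: RCU stops watching this CPU. */
static void model_ctx_enter(enum ctx_state state)
{
	assert(state != IN_KERNEL);
	cur_state = state;
	rcu_watching = false;
}

/* Model of returning to kernel context: RCU watches this CPU again. */
static void model_ctx_exit(void)
{
	cur_state = IN_KERNEL;
	rcu_watching = true;
}

/* Remember the interrupted context and make RCU usable in the handler. */
static enum ctx_state exception_enter(void)
{
	enum ctx_state prev_ctx = cur_state;

	if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
		model_ctx_exit();
	return prev_ctx;
}

/* Restore the interrupted context once the handler is done. */
static void exception_exit(enum ctx_state prev_ctx)
{
	if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
		model_ctx_enter(prev_ctx);
}
```

In this model, calling exception_enter() while cur_state is IN_GUEST
returns IN_GUEST and leaves rcu_watching set; passing that value to
exception_exit() puts the CPU back into the guest state.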

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-06 18:51         ` Rik van Riel
@ 2015-02-06 23:15           ` Frederic Weisbecker
  2015-02-07  3:53             ` Rik van Riel
  0 siblings, 1 reply; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 23:15 UTC (permalink / raw)
  To: Rik van Riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

On Fri, Feb 06, 2015 at 01:51:56PM -0500, Rik van Riel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 02/06/2015 01:23 PM, Frederic Weisbecker wrote:
> > On Fri, Feb 06, 2015 at 01:20:21PM -0500, Rik van Riel wrote:
> >> On 02/06/2015 12:22 PM, Frederic Weisbecker wrote:
> >>> On Thu, Feb 05, 2015 at 03:23:48PM -0500, riel@redhat.com wrote:
> >>>> From: Rik van Riel <riel@redhat.com>
> >>>> 
> >>>> Add the expected ctx_state as a parameter to
> >>>> context_tracking_user_enter and context_tracking_user_exit,
> >>>> allowing the same functions to not just track kernel <> user
> >>>> space switching, but also kernel <> guest transitions.
> >>>> 
> >>>> Signed-off-by: Rik van Riel <riel@redhat.com>
> >>> 
> >>> You should consider using guest_enter() and guest_exit() instead.
> >>> These are context tracking APIs too but specifically for guest.
> >> 
> >> What do you mean instead?  KVM already uses those.
> >> 
> >> I just wanted to avoid duplicating the code...
> > 
> > I mean you can call rcu_user APIs directly from guest_enter/exit.
> > You don't really need to call the context_tracking_user functions
> > since guest_enter/guest_exit already handle the vtime accounting.
> 
> I would still have to modify exception_enter and exception_exit,
> and with them context_tracking_user_enter and
> context_tracking_user_exit.
> 
> We have to re-enable RCU when an exception happens.
> 
> I suspect exceptions in a guest just trigger VMEXIT, and we
> figure later why the exception happened. However, if we were
> to get an exception during the code where we transition into
> or out of guest mode, we would still need exception_enter
> and exception_exit...

Ah that's a fair point. I didn't think about that. Ok then a real
IN_GUEST mode makes sense. And context_tracking_user_enter/exit() can
be reused as is indeed.

Just a few things then:

1) In this case rename context_tracking_user_enter/exit() to
context_tracking_enter() and context_tracking_exit(), since it's no
longer only about user but about any generic context.

2) We have the "WARN_ON_ONCE(!current->mm);" condition that is a debug
check specific to userspace transitions because kernel threads aren't
expected to resume to userspace. Can we also expect that we never switch
to/from guest from a kernel thread? AFAICS this happens from an ioctl (thus
user task) in x86 for kvm. But I only know this case.

3) You might want to update a few comments that assume we only deal with
userspace transitions.

4) trace_user_enter/exit() should stay user-transitions specific.

Thanks.
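
The four points above can be sketched roughly as follows. This is an
illustrative user-space model, not part of the original message and not
the code that was eventually merged. The counters user_trace_events and
mm_warnings, the rcu_watching flag, and the task_has_mm parameter are
hypothetical stand-ins for trace_user_enter/exit(), for
WARN_ON_ONCE(!current->mm), for the RCU extended quiescent state, and for
current->mm. Keeping the mm check user-only is an assumption here; point 2
leaves open whether it could also cover guest transitions.

```c
#include <assert.h>
#include <stdbool.h>

enum ctx_state { IN_KERNEL = 0, IN_USER, IN_GUEST };

/* Hypothetical stand-ins for per-CPU state and kernel hooks (assumptions). */
static enum ctx_state cur_state = IN_KERNEL;
static bool rcu_watching = true;	/* false while in an extended quiescent state */
static int user_trace_events;		/* models trace_user_enter/exit() firing */
static int mm_warnings;			/* models WARN_ON_ONCE(!current->mm) firing */

/* Point 1: generic name, the target context is passed by the caller. */
static void context_tracking_enter(enum ctx_state state, bool task_has_mm)
{
	assert(state != IN_KERNEL);
	if (cur_state == state)
		return;

	if (state == IN_USER) {
		/* Point 2: debug check kept specific to user transitions. */
		if (!task_has_mm)
			mm_warnings++;
		/* Point 4: tracepoint stays user-transition specific. */
		user_trace_events++;
	}
	rcu_watching = false;
	cur_state = state;
}

static void context_tracking_exit(enum ctx_state state)
{
	if (cur_state != state)
		return;

	rcu_watching = true;
	if (state == IN_USER)
		user_trace_events++;
	cur_state = IN_KERNEL;
}
```

With this shape a guest round trip toggles rcu_watching without touching
the user tracepoint counter, while a user round trip bumps it twice.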

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-05 20:23 ` [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
  2015-02-05 23:56   ` Paul E. McKenney
  2015-02-06 18:01   ` Frederic Weisbecker
@ 2015-02-06 23:24   ` Frederic Weisbecker
  2 siblings, 0 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-06 23:24 UTC (permalink / raw)
  To: riel
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

On Thu, Feb 05, 2015 at 03:23:51PM -0500, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
> 
> The host kernel is not doing anything while the CPU is executing
> a KVM guest VCPU, so it can be marked as being in an extended
> quiescent state, identical to that used when running user space
> code.
> 
> The only exception to that rule is when the host handles an
> interrupt, which is already handled by the irq code, which
> calls rcu_irq_enter and rcu_irq_exit.
> 
> The guest_enter and guest_exit functions already switch vtime
> accounting independent of context tracking, so leave those calls
> where they are, instead of moving them into the context tracking
> code.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>  include/linux/context_tracking.h       | 8 +++++++-
>  include/linux/context_tracking_state.h | 1 +
>  include/linux/kvm_host.h               | 3 ++-
>  3 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index bd9f000fc98d..a5d3bb44b897 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
>  static inline void exception_exit(enum ctx_state prev_ctx)
>  {
>  	if (context_tracking_is_enabled()) {
> -		if (prev_ctx == IN_USER)
> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)

That's nitpicking, but != IN_KERNEL would be more generic. We are exiting an exception
and we know that the exception executes IN_KERNEL, so we want to restore whatever context
(IN_USER, IN_GUEST, or anything added in the future) was active prior to the exception,
if that was anything other than IN_KERNEL.

>  			context_tracking_user_enter(prev_ctx);
>  	}
>  }
> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
>  		vtime_guest_enter(current);
>  	else
>  		current->flags |= PF_VCPU;
> +
> +	if (context_tracking_is_enabled())
> +		context_tracking_user_enter(IN_GUEST);
>  }
>  
>  static inline void guest_exit(void)
> @@ -86,6 +89,9 @@ static inline void guest_exit(void)
>  		vtime_guest_exit(current);
>  	else
>  		current->flags &= ~PF_VCPU;
> +
> +	if (context_tracking_is_enabled())
> +		context_tracking_user_exit(IN_GUEST);

I suggest restoring RCU before anything else. I believe cputime
accounting doesn't use RCU, but we never know with all the debug/tracing
code behind it, the acct accounting...

Thanks.

>  }
>  
>  #else
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index 97a81225d037..f3ef027af749 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -15,6 +15,7 @@ struct context_tracking {
>  	enum ctx_state {
>  		IN_KERNEL = 0,
>  		IN_USER,
> +		IN_GUEST,
>  	} state;
>  };
>  
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 26f106022c88..c7828a6a9614 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -772,7 +772,8 @@ static inline void kvm_guest_enter(void)
>  	 * one time slice). Lets treat guest mode as quiescent state, just like
>  	 * we do with user-mode execution.
>  	 */
> -	rcu_virt_note_context_switch(smp_processor_id());
> +	if (!context_tracking_cpu_is_enabled())
> +		rcu_virt_note_context_switch(smp_processor_id());
>  }
>  
>  static inline void kvm_guest_exit(void)
> -- 
> 1.9.3
> 
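
The two review comments in this message (test for != IN_KERNEL in
exception_exit, and restore RCU first in guest_exit) can be illustrated
with a small user-space model. It is a sketch added for illustration, not
part of the original message. All model_* helpers and both flags are
hypothetical. The flag vtime_ran_without_rcu only records the ordering,
it does not claim that the real vtime accounting uses RCU, which the
review itself only raises as a possibility.

```c
#include <assert.h>
#include <stdbool.h>

enum ctx_state { IN_KERNEL = 0, IN_USER, IN_GUEST };

/* Hypothetical model state (assumptions, not kernel API). */
static bool rcu_watching = true;	/* false while in an extended quiescent state */
static bool vtime_ran_without_rcu;	/* set if accounting ran before RCU was back */

static void model_rcu_enter_eqs(void) { rcu_watching = false; }
static void model_rcu_exit_eqs(void)  { rcu_watching = true; }

/* Stand-in for vtime_guest_exit(): only notes whether RCU was watching. */
static void model_vtime_guest_exit(void)
{
	if (!rcu_watching)
		vtime_ran_without_rcu = true;
}

/* Order as posted in the patch: accounting first, context tracking second. */
static void guest_exit_as_posted(void)
{
	model_vtime_guest_exit();
	model_rcu_exit_eqs();
}

/* Order suggested in the review: make RCU watch again before anything else. */
static void guest_exit_rcu_first(void)
{
	model_rcu_exit_eqs();
	assert(rcu_watching);
	model_vtime_guest_exit();
}

/* Generic form of the exception_exit() test: any non-kernel context is restored. */
static bool should_restore_ctx(enum ctx_state prev_ctx)
{
	return prev_ctx != IN_KERNEL;
}
```

In the model, guest_exit_as_posted() sets vtime_ran_without_rcu when
called from the guest state, while guest_exit_rcu_first() does not, and
should_restore_ctx() stays correct if further states are added to the
enum later.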

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-06 23:15           ` Frederic Weisbecker
@ 2015-02-07  3:53             ` Rik van Riel
  2015-02-07  6:34               ` Paul E. McKenney
  0 siblings, 1 reply; 34+ messages in thread
From: Rik van Riel @ 2015-02-07  3:53 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, paulmck, lcapitulino, pbonzini

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/06/2015 06:15 PM, Frederic Weisbecker wrote:

> Just a few things then:
> 
> 1) In this case rename context_tracking_user_enter/exit() to 
> context_tracking_enter() and context_tracking_exit(), since it's
> not anymore about user only but about any generic context.
> 
> 2) We have the "WARN_ON_ONCE(!current->mm);" condition that is a
> debug check specific to userspace transitions because kernel
> threads aren't expected to resume to userspace. Can we also expect
> that we never switch to/from guest from a kernel thread? AFAICS
> this happens from an ioctl (thus user task) in x86 for kvm. But I
> only know this case.
> 
> 3) You might want to update a few comments that assume we only deal
> with userspace transitions.
> 
> 4) trace_user_enter/exit() should stay user-transitions specific.

Paul, would you like me to send follow-up patches with the cleanups
suggested by Frederic, or would you prefer me to send a new series
with the cleanups integrated?

Frederic, I will also add the cleanup you suggested for patch 4/5.

- -- 
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJU1Yw+AAoJEM553pKExN6DhycH/ifPeaRaFcj/BBKaDf7BmKAJ
cGMplf/vMtJA5DCFfZTmRp5Yb/9f3XBk8MU4Z+oWZFPB/msA8WkibhZtRGXpXXl9
7XgDXaXUuo++Axhb3SYHXEDhkPkhmfdjlctyr5ZUu3gHqkeWl6utv0t4anIBfo3Z
NdWG8yEhgKU6OyFppf3CH0Cm46xPN+pUyAFMgK9HbSfDkR3a9rMZ32aQq8fyV15e
LV4qE+/lPi7lyoLqbHmmU+pqp6iBqyQ9uFIDCRAoBBXF5jh0jxEynRubBn2D1HZJ
FBi+dBWGhAjRN05tuurvkwbJtcmTpnsHyNrmzNlAeop0Upc/5Vta43zN/nu1AFA=
=Z9mE
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-07  3:53             ` Rik van Riel
@ 2015-02-07  6:34               ` Paul E. McKenney
  2015-02-07  7:14                 ` Paul E. McKenney
  0 siblings, 1 reply; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-07  6:34 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Frederic Weisbecker, kvm, borntraeger, linux-kernel, mtosatti,
	mingo, ak, oleg, masami.hiramatsu.pt, lcapitulino, pbonzini

On Fri, Feb 06, 2015 at 10:53:34PM -0500, Rik van Riel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 02/06/2015 06:15 PM, Frederic Weisbecker wrote:
> 
> > Just a few things then:
> > 
> > 1) In this case rename context_tracking_user_enter/exit() to 
> > context_tracking_enter() and context_tracking_exit(), since it's
> > not anymore about user only but about any generic context.
> > 
> > 2) We have the "WARN_ON_ONCE(!current->mm);" condition that is a
> > debug check specific to userspace transitions because kernel
> > threads aren't expected to resume to userspace. Can we also expect
> > that we never switch to/from guest from a kernel thread? AFAICS
> > this happens from an ioctl (thus user task) in x86 for kvm. But I
> > only know this case.
> > 
> > 3) You might want to update a few comments that assume we only deal
> > with userspace transitions.
> > 
> > 4) trace_user_enter/exit() should stay user-transitions specific.
> 
> Paul, would you like me to send follow-up patches with the cleanups
> suggested by Frederic, or would you prefer me to send a new series
> with the cleanups integrated?

I would prefer a new series, in order to prevent possible future
confusion.

							Thanx, Paul

> Frederic, I will also add the cleanup you suggested for patch 4/5.
> 
> - -- 
> All rights reversed
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
> 
> iQEcBAEBAgAGBQJU1Yw+AAoJEM553pKExN6DhycH/ifPeaRaFcj/BBKaDf7BmKAJ
> cGMplf/vMtJA5DCFfZTmRp5Yb/9f3XBk8MU4Z+oWZFPB/msA8WkibhZtRGXpXXl9
> 7XgDXaXUuo++Axhb3SYHXEDhkPkhmfdjlctyr5ZUu3gHqkeWl6utv0t4anIBfo3Z
> NdWG8yEhgKU6OyFppf3CH0Cm46xPN+pUyAFMgK9HbSfDkR3a9rMZ32aQq8fyV15e
> LV4qE+/lPi7lyoLqbHmmU+pqp6iBqyQ9uFIDCRAoBBXF5jh0jxEynRubBn2D1HZJ
> FBi+dBWGhAjRN05tuurvkwbJtcmTpnsHyNrmzNlAeop0Upc/5Vta43zN/nu1AFA=
> =Z9mE
> -----END PGP SIGNATURE-----
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-07  6:34               ` Paul E. McKenney
@ 2015-02-07  7:14                 ` Paul E. McKenney
  2015-02-07  8:30                   ` Frederic Weisbecker
  0 siblings, 1 reply; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-07  7:14 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Frederic Weisbecker, kvm, borntraeger, linux-kernel, mtosatti,
	mingo, ak, oleg, masami.hiramatsu.pt, lcapitulino, pbonzini

On Fri, Feb 06, 2015 at 10:34:21PM -0800, Paul E. McKenney wrote:
> On Fri, Feb 06, 2015 at 10:53:34PM -0500, Rik van Riel wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > On 02/06/2015 06:15 PM, Frederic Weisbecker wrote:
> > 
> > > Just a few things then:
> > > 
> > > 1) In this case rename context_tracking_user_enter/exit() to 
> > > context_tracking_enter() and context_tracking_exit(), since it's
> > > not anymore about user only but about any generic context.
> > > 
> > > 2) We have the "WARN_ON_ONCE(!current->mm);" condition that is a
> > > debug check specific to userspace transitions because kernel
> > > threads aren't expected to resume to userspace. Can we also expect
> > > that we never switch to/from guest from a kernel thread? AFAICS
> > > this happens from an ioctl (thus user task) in x86 for kvm. But I
> > > only know this case.
> > > 
> > > 3) You might want to update a few comments that assume we only deal
> > > with userspace transitions.
> > > 
> > > 4) trace_user_enter/exit() should stay user-transitions specific.
> > 
> > Paul, would you like me to send follow-up patches with the cleanups
> > suggested by Frederic, or would you prefer me to send a new series
> > with the cleanups integrated?
> 
> I would prefer a new series, in order to prevent possible future
> confusion.

Of course, if Frederic would rather push them himself, I am fine with
that.  And in that case, you should ask him for his preferences, which
just might differ from mine.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-07  7:14                 ` Paul E. McKenney
@ 2015-02-07  8:30                   ` Frederic Weisbecker
  2015-02-07 11:29                     ` Rik van Riel
  2015-02-07 20:06                     ` Paul E. McKenney
  0 siblings, 2 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-02-07  8:30 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Rik van Riel, kvm, borntraeger, linux-kernel, mtosatti, mingo,
	ak, oleg, masami.hiramatsu.pt, lcapitulino, pbonzini

On Fri, Feb 06, 2015 at 11:14:53PM -0800, Paul E. McKenney wrote:
> On Fri, Feb 06, 2015 at 10:34:21PM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 06, 2015 at 10:53:34PM -0500, Rik van Riel wrote:
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > > 
> > > On 02/06/2015 06:15 PM, Frederic Weisbecker wrote:
> > > 
> > > > Just a few things then:
> > > > 
> > > > 1) In this case rename context_tracking_user_enter/exit() to 
> > > > context_tracking_enter() and context_tracking_exit(), since it's
> > > > not anymore about user only but about any generic context.
> > > > 
> > > > 2) We have the "WARN_ON_ONCE(!current->mm);" condition that is a
> > > > debug check specific to userspace transitions because kernel
> > > > threads aren't expected to resume to userspace. Can we also expect
> > > > that we never switch to/from guest from a kernel thread? AFAICS
> > > > this happens from an ioctl (thus user task) in x86 for kvm. But I
> > > > only know this case.
> > > > 
> > > > 3) You might want to update a few comments that assume we only deal
> > > > with userspace transitions.
> > > > 
> > > > 4) trace_user_enter/exit() should stay user-transitions specific.
> > > 
> > > Paul, would you like me to send follow-up patches with the cleanups
> > > suggested by Frederic, or would you prefer me to send a new series
> > > with the cleanups integrated?
> > 
> > I would prefer a new series, in order to prevent possible future
> > confusion.
> 
> Of course, if Frederic would rather push them himself, I am fine with
> that.  And in that case, you should ask him for his preferences, which
> just might differ from mine.  ;-)

I prefer a new series too. Now whether you or I take the patches, I don't mind
either way :-)

Also I wonder how this feature is going to be enabled. Will it be enabled on
full dynticks or should it be a separate feature depending on full dynticks?
Or even just CONFIG_RCU_USER_EQS? Because I'm still unclear about how and for what
this is used, and whether it involves full dynticks or only RCU extended quiescent states.

Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-07  8:30                   ` Frederic Weisbecker
@ 2015-02-07 11:29                     ` Rik van Riel
  2015-02-07 20:06                     ` Paul E. McKenney
  1 sibling, 0 replies; 34+ messages in thread
From: Rik van Riel @ 2015-02-07 11:29 UTC (permalink / raw)
  To: Frederic Weisbecker, Paul E. McKenney
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, lcapitulino, pbonzini

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/07/2015 03:30 AM, Frederic Weisbecker wrote:

> I prefer a new series too. Now whether you or me take the patches,
> I don't mind either way :-)

I'll make it, no problem.

> Also I wonder how this feature is going to be enabled. Will it be
> enabled on full dynticks or should it be a separate feature
> depending on full dynticks? Or even just CONFIG_RCU_USER_EQS?
> Because I'm still unclear about how and what this is used, if it
> involves full dynticks or only RCU extended quiescent states.

It involves full dynticks and CONFIG_RCU_USER_EQS.

- -- 
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJU1fcsAAoJEM553pKExN6D7NEH/RZM/gqZ7CCACr4T3/Esd8GL
IeYmZui+GDyrzj63/xX7ZgU+aqPbkfbEJ3ueQkabjtzIHhkurBM19XZ8CwWb42S9
5kAi51MjLrNnLPdvYCcu2q15TKSygU+V5wvxVohxHC9fi+tE/1+FOrVATky68uO4
6izXTm8EXbDLRg0tB5Mq/sRqBXGHfDw19vVQqMkQ47vzIw4oNHpLBSTv7GXHhN7u
GH0QMzcDUUZ8IcyOSxLhRPOUX3XrV7C4U8ilP0ZJQ287sqtsMpQWtNZK6jmJN1tv
niCrHQAOH++MuuF3x2fulpO3fSTbgwW3bGeMKh2ITHk0ODG6iIh1htmg4EFA2Bg=
=AyIN
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-07  8:30                   ` Frederic Weisbecker
  2015-02-07 11:29                     ` Rik van Riel
@ 2015-02-07 20:06                     ` Paul E. McKenney
  2015-02-09 15:42                       ` Rik van Riel
  1 sibling, 1 reply; 34+ messages in thread
From: Paul E. McKenney @ 2015-02-07 20:06 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Rik van Riel, kvm, borntraeger, linux-kernel, mtosatti, mingo,
	ak, oleg, masami.hiramatsu.pt, lcapitulino, pbonzini

On Sat, Feb 07, 2015 at 09:30:41AM +0100, Frederic Weisbecker wrote:
> On Fri, Feb 06, 2015 at 11:14:53PM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 06, 2015 at 10:34:21PM -0800, Paul E. McKenney wrote:
> > > On Fri, Feb 06, 2015 at 10:53:34PM -0500, Rik van Riel wrote:
> > > > -----BEGIN PGP SIGNED MESSAGE-----
> > > > Hash: SHA1
> > > > 
> > > > On 02/06/2015 06:15 PM, Frederic Weisbecker wrote:
> > > > 
> > > > > Just a few things then:
> > > > > 
> > > > > 1) In this case rename context_tracking_user_enter/exit() to 
> > > > > context_tracking_enter() and context_tracking_exit(), since it's
> > > > > not anymore about user only but about any generic context.
> > > > > 
> > > > > 2) We have the "WARN_ON_ONCE(!current->mm);" condition that is a
> > > > > debug check specific to userspace transitions because kernel
> > > > > threads aren't expected to resume to userspace. Can we also expect
> > > > > that we never switch to/from guest from a kernel thread? AFAICS
> > > > > this happens from an ioctl (thus user task) in x86 for kvm. But I
> > > > > only know this case.
> > > > > 
> > > > > 3) You might want to update a few comments that assume we only deal
> > > > > with userspace transitions.
> > > > > 
> > > > > 4) trace_user_enter/exit() should stay user-transitions specific.
> > > > 
> > > > Paul, would you like me to send follow-up patches with the cleanups
> > > > suggested by Frederic, or would you prefer me to send a new series
> > > > with the cleanups integrated?
> > > 
> > > I would prefer a new series, in order to prevent possible future
> > > confusion.
> > 
> > Of course, if Frederic would rather push them himself, I am fine with
> > that.  And in that case, you should ask him for his preferences, which
> > just might differ from mine.  ;-)
> 
> I prefer a new series too. Now whether you or me take the patches, I don't mind
> either way :-)
> 
> Also I wonder how this feature is going to be enabled. Will it be enabled on
> full dynticks or should it be a separate feature depending on full dynticks?
> Or even just CONFIG_RCU_USER_EQS? Because I'm still unclear about how and what
> this is used, if it involves full dynticks or only RCU extended quiescent states.

Well, we certainly need it documented.  And validation considerations
would push for keeping the number of possible combinations low, while
paranoia about the added feature would push for having it be separately
enabled.  And if distros are going to enable this at build time, we
either need -serious- validation or a way to disable at boot time.

On the desired/required combinations of features, let's see...

If I understand this completely, which I probably don't, we have the
following considerations:

o	NO_HZ_FULL: Needed to get rid of the scheduling-clock interrupt
	during guest execution, though I am not sure whether we really
	have that completely wired up with this patch set.  Regardless,
	Rik, for your use case, do you care about whether or not the
	guest gets interrupted by the host's scheduling-clock interrupts?
	(Based on discussion in this thread, my guess is "yes".)

o	RCU_NOCB_CPUS: Implied by NO_HZ_FULL, but only on CPUs actually
	enabled for NO_HZ_FULL operation, either by NO_HZ_FULL_ALL
	at build time or by nohz_full= at boot time.  Needed to avoid
	interrupting the guest with host RCU callback invocation.
	Rik, does your use case care about guests being interrupted
	by RCU callback invocation?  (Based on discussion in this thread,
	my guess is "yes".)

o	RCU_USER_EQS: Implied by NO_HZ_FULL, and I would have to go look
	to see what relation this has to nohz_full=.  Needed for RCU to be
	able to recognize userspace-execution quiescent states on a given
	CPU without disturbing that CPU.  Unless I am missing something
	subtle, you have to have this for this patch series to make sense.

If my guesses are correct, the best approach would be to have this
new mode of operation implied by NO_HZ_FULL.  The patches seem simple
enough that killer validation should be practical, which would avoid
further complication of the Kconfig combinatorial space.

So, are my guesses correct?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit
  2015-02-07 20:06                     ` Paul E. McKenney
@ 2015-02-09 15:42                       ` Rik van Riel
  0 siblings, 0 replies; 34+ messages in thread
From: Rik van Riel @ 2015-02-09 15:42 UTC (permalink / raw)
  To: paulmck, Frederic Weisbecker
  Cc: kvm, borntraeger, linux-kernel, mtosatti, mingo, ak, oleg,
	masami.hiramatsu.pt, lcapitulino, pbonzini

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/07/2015 03:06 PM, Paul E. McKenney wrote:
> On Sat, Feb 07, 2015 at 09:30:41AM +0100, Frederic Weisbecker
> wrote:
>> On Fri, Feb 06, 2015 at 11:14:53PM -0800, Paul E. McKenney
>> wrote:
>>> On Fri, Feb 06, 2015 at 10:34:21PM -0800, Paul E. McKenney
>>> wrote:
>>>> On Fri, Feb 06, 2015 at 10:53:34PM -0500, Rik van Riel
>>>> wrote:
>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>> Hash: SHA1
>>>>> 
>>>>> On 02/06/2015 06:15 PM, Frederic Weisbecker wrote:
>>>>> 
>>>>>> Just a few things then:
>>>>>> 
>>>>>> 1) In this case rename context_tracking_user_enter/exit()
>>>>>> to context_tracking_enter() and context_tracking_exit(),
>>>>>> since it's not anymore about user only but about any
>>>>>> generic context.
>>>>>> 
>>>>>> 2) We have the "WARN_ON_ONCE(!current->mm);" condition
>>>>>> that is a debug check specific to userspace transitions
>>>>>> because kernel threads aren't expected to resume to
>>>>>> userspace. Can we also expect that we never switch
>>>>>> to/from guest from a kernel thread? AFAICS this happens
>>>>>> from an ioctl (thus user task) in x86 for kvm. But I only
>>>>>> know this case.
>>>>>> 
>>>>>> 3) You might want to update a few comments that assume we
>>>>>> only deal with userspace transitions.
>>>>>> 
>>>>>> 4) trace_user_enter/exit() should stay user-transitions
>>>>>> specific.
>>>>> 
>>>>> Paul, would you like me to send follow-up patches with the
>>>>> cleanups suggested by Frederic, or would you prefer me to
>>>>> send a new series with the cleanups integrated?
>>>> 
>>>> I would prefer a new series, in order to prevent possible
>>>> future confusion.
>>> 
>>> Of course, if Frederic would rather push them himself, I am
>>> fine with that.  And in that case, you should ask him for his
>>> preferences, which just might differ from mine.  ;-)
>> 
>> I prefer a new series too. Now whether you or me take the
>> patches, I don't mind either way :-)
>> 
>> Also I wonder how this feature is going to be enabled. Will it be
>> enabled on full dynticks or should it be a separate feature
>> depending on full dynticks? Or even just CONFIG_RCU_USER_EQS?
>> Because I'm still unclear about how and what this is used, if it
>> involves full dynticks or only RCU extended quiescent states.
> 
> Well, we certainly need it documented.  And validation
> considerations would push for keeping the number of possible
> combinations low, while paranoia about added feature would push for
> having it be separately enabled.  And if distros are going to
> enable this at build time, we either need -serious- validation or a
> way to disable at boot time.
> 
> On the desired/required combinations of features, let's see...
> 
> If I understand this completely, which I probably don't, we have
> the following considerations:
> 
> o	NO_HZ_FULL: Needed to get rid of the scheduling-clock interrupt
> 	during guest execution, though I am not sure whether we really
> 	have that completely wired up with this patch set.  Regardless,
> 	Rik, for your use case, do you care about whether or not the
> 	guest gets interrupted by the host's scheduling-clock interrupts?
> 	(Based on discussion in this thread, my guess is "yes".)
> 
> o	RCU_NOCB_CPUS: Implied by NO_HZ_FULL, but only on CPUs actually
> 	enabled for NO_HZ_FULL operation, either by NO_HZ_FULL_ALL
> 	at build time or by nohz_full= at boot time.  Needed to avoid
> 	interrupting the guest with host RCU callback invocation.
> 	Rik, does your use case care about guests being interrupted
> 	by RCU callback invocation?  (Based on discussion in this thread,
> 	my guess is "yes".)
> 
> o	RCU_USER_EQS: Implied by NO_HZ_FULL, and I would have to go look
> 	to see what relation this has to nohz_full=.  Needed for RCU to be
> 	able to recognize userspace-execution quiescent states on a given
> 	CPU without disturbing that CPU.  Unless I am missing something
> 	subtle, you have to have this for this patch series to make sense.
> 
> If my guesses are correct, the best approach would be to have this 
> new mode of operation implied by NO_HZ_FULL.

I agree. It makes sense to have all three, and all three are enabled
in the configuration we use. I cannot think of a case where someone
would significantly benefit from just one or two of the above, except
maybe for debugging reasons.

Having NO_HZ_FULL enable all the above, either through a boot time
commandline option, or just by default, would make sense.

- -- 
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJU2NVpAAoJEM553pKExN6DxxUH/RwpZI6dRYvIQbtY2y93ax5/
Lba4QbmZ6n6AnGXrtlpwEQMSMvLawKqT9ZFSwzKeSarX6Uu4aRCdi8td34ruu9rg
hfhv8hD1z15deYc0UPKUCbZrYrIi9uaG/FpioafDmPH+P4T2bFdvn7d/bKIoiaBM
T1QA+LNddRxOhtayrIEDH1BnPKgXw9V8f7/mGQPmRf+oRz+Hgn6DPpEm0kTbqn+L
RkhHNPemJ8bMaIwntAwzEklgnhkON9zOBe/XFof0lC+SdhtlAVkXPvX+cXiZMQZt
1rEqxK1+S9beeKVX65mLtxZg2omz46qz7SQRUGf3If2wHZXQtIRnvtlyCsDu/AI=
=gj2E
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2015-02-09 15:42 UTC | newest]

Thread overview: 34+ messages
2015-02-05 20:23 [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
2015-02-05 20:23 ` [PATCH 1/5] rcu,nohz: add state parameter to context_tracking_user_enter/exit riel
2015-02-05 23:55   ` Paul E. McKenney
2015-02-06 10:15     ` Paolo Bonzini
2015-02-06 13:41       ` Paul E. McKenney
2015-02-06 17:22   ` Frederic Weisbecker
2015-02-06 18:20     ` Rik van Riel
2015-02-06 18:23       ` Frederic Weisbecker
2015-02-06 18:51         ` Rik van Riel
2015-02-06 23:15           ` Frederic Weisbecker
2015-02-07  3:53             ` Rik van Riel
2015-02-07  6:34               ` Paul E. McKenney
2015-02-07  7:14                 ` Paul E. McKenney
2015-02-07  8:30                   ` Frederic Weisbecker
2015-02-07 11:29                     ` Rik van Riel
2015-02-07 20:06                     ` Paul E. McKenney
2015-02-09 15:42                       ` Rik van Riel
2015-02-05 20:23 ` [PATCH 2/5] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
2015-02-05 20:23 ` [PATCH 3/5] nohz,kvm: export context_tracking_user_enter/exit riel
2015-02-05 23:55   ` Paul E. McKenney
2015-02-05 20:23 ` [PATCH 4/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
2015-02-05 23:56   ` Paul E. McKenney
2015-02-06 18:01   ` Frederic Weisbecker
2015-02-06 23:24   ` Frederic Weisbecker
2015-02-05 20:23 ` [PATCH 5/5] nohz: add stub context_tracking_is_enabled riel
2015-02-05 23:56   ` Paul E. McKenney
2015-02-06 13:46 ` [PATCH v2 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest Frederic Weisbecker
2015-02-06 13:50   ` Paolo Bonzini
2015-02-06 16:19     ` Paul E. McKenney
2015-02-06 18:09     ` Frederic Weisbecker
2015-02-06 14:56   ` Rik van Riel
2015-02-06 18:05     ` Frederic Weisbecker
2015-02-06 15:00 ` Christian Borntraeger
2015-02-06 18:20   ` Frederic Weisbecker
