LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH RESEND] PM / sleep: Fix racing timers
@ 2014-09-22 17:07 Soren Brinkmann
  2014-09-22 23:01 ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Soren Brinkmann @ 2014-09-22 17:07 UTC (permalink / raw)
  To: Thomas Gleixner, Rafael J. Wysocki
  Cc: John Stultz, Pavel Machek, Len Brown, linux-kernel, linux-pm,
	Soren Brinkmann

On platforms that do not power off during suspend, successfully entering
suspend races with timers.

The race happening in a couple of location is:

  1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
     ...
  2. syscore_suspend()
      -> timekeeping_suspend()
       -> clockevents_notify(SUSPEND)
        -> tick_suspend()   	(timers are turned off here)
     ...
  3. wfi            		(wait for wake-IRQ here)

Between steps 1 and 2 the timers can still generate interrupts that are
not handled and stay pending until step 3. That pending IRQ causes an
immediate - spurious - wake.

The solution is to move the clockevents suspend/resume notification
out of the syscore_suspend step and explictly call them at the appropriate
time in the suspend/hibernation paths. I.e. timers are suspend _before_
IRQs get disabled. And accordingly in the resume path.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---
Hi,

there was not a lot of discussion on the last submission. Just one comment from
Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
response, does not apply, IMHO, since the platform does not re-enable
interrupts. Hence, I still believe fixing this in the core is the appropriate
way.

	Thanks,
	Sören

Changes since RFC:
 - remove resume notification from syscore_resume path.
---
 kernel/power/hibernate.c  | 9 +++++++++
 kernel/power/suspend.c    | 5 +++++
 kernel/time/timekeeping.c | 3 ---
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index a9dfa79b6bab..76f15644aec8 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -285,6 +285,8 @@ static int create_image(int platform_mode)
 	if (error || hibernation_test(TEST_CPUS))
 		goto Enable_cpus;
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+
 	local_irq_disable();
 
 	error = syscore_suspend();
@@ -316,6 +318,7 @@ static int create_image(int platform_mode)
 	syscore_resume();
 
  Enable_irqs:
+	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
 	local_irq_enable();
 
  Enable_cpus:
@@ -438,6 +441,8 @@ static int resume_target_kernel(bool platform_mode)
 	if (error)
 		goto Enable_cpus;
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+
 	local_irq_disable();
 
 	error = syscore_suspend();
@@ -472,6 +477,7 @@ static int resume_target_kernel(bool platform_mode)
 	syscore_resume();
 
  Enable_irqs:
+	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
 	local_irq_enable();
 
  Enable_cpus:
@@ -550,6 +556,8 @@ int hibernation_platform_enter(void)
 	if (error)
 		goto Platform_finish;
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+
 	local_irq_disable();
 	syscore_suspend();
 	if (pm_wakeup_pending()) {
@@ -563,6 +571,7 @@ int hibernation_platform_enter(void)
 
  Power_up:
 	syscore_resume();
+	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
 	local_irq_enable();
 	enable_nonboot_cpus();
 
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 18c62195660f..ded3c7d6731f 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -12,6 +12,7 @@
 #include <linux/delay.h>
 #include <linux/errno.h>
 #include <linux/init.h>
+#include <linux/clockchips.h>
 #include <linux/console.h>
 #include <linux/cpu.h>
 #include <linux/cpuidle.h>
@@ -294,6 +295,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 	if (error || suspend_test(TEST_CPUS))
 		goto Enable_cpus;
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+
 	arch_suspend_disable_irqs();
 	BUG_ON(!irqs_disabled());
 
@@ -311,6 +314,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 		syscore_resume();
 	}
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
+
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
 
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index ec1791fae965..5b6696772ffc 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1189,8 +1189,6 @@ static void timekeeping_resume(void)
 
 	touch_softlockup_watchdog();
 
-	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
-
 	/* Resume hrtimers */
 	hrtimers_resume();
 }
@@ -1243,7 +1241,6 @@ static int timekeeping_suspend(void)
 	write_seqcount_end(&tk_core.seq);
 	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
 
-	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
 	clocksource_suspend();
 	clockevents_suspend();
 
-- 
2.1.0.1.g27b9230


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2014-09-22 17:07 [PATCH RESEND] PM / sleep: Fix racing timers Soren Brinkmann
@ 2014-09-22 23:01 ` Rafael J. Wysocki
  2014-10-02 16:01   ` Sören Brinkmann
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2014-09-22 23:01 UTC (permalink / raw)
  To: Soren Brinkmann
  Cc: Thomas Gleixner, John Stultz, Pavel Machek, Len Brown,
	linux-kernel, linux-pm

On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> On platforms that do not power off during suspend, successfully entering
> suspend races with timers.
> 
> The race happening in a couple of location is:
> 
>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
>      ...
>   2. syscore_suspend()
>       -> timekeeping_suspend()
>        -> clockevents_notify(SUSPEND)
>         -> tick_suspend()   	(timers are turned off here)
>      ...
>   3. wfi            		(wait for wake-IRQ here)
> 
> Between steps 1 and 2 the timers can still generate interrupts that are
> not handled and stay pending until step 3. That pending IRQ causes an
> immediate - spurious - wake.
> 
> The solution is to move the clockevents suspend/resume notification
> out of the syscore_suspend step and explictly call them at the appropriate
> time in the suspend/hibernation paths. I.e. timers are suspend _before_
> IRQs get disabled. And accordingly in the resume path.
> 
> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> ---
> Hi,
> 
> there was not a lot of discussion on the last submission. Just one comment from
> Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> response, does not apply, IMHO, since the platform does not re-enable
> interrupts.

Well, you just don't agree with it.

The problem with your approach is that timer interrupts aren't actually as
special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
similar issues to appear under specific conditions.

The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
interrupts would be to use a wait_event() loop like the one in freeze_enter()
(on top of the current linux-next or the pm-genirq branch of linux-pm.git),
but wait for pm_abort_suspend to become true, to implement system suspend.

Rafael


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2014-09-22 23:01 ` Rafael J. Wysocki
@ 2014-10-02 16:01   ` Sören Brinkmann
  2014-10-28 20:40     ` Sören Brinkmann
  2014-11-06  0:33     ` Rafael J. Wysocki
  0 siblings, 2 replies; 14+ messages in thread
From: Sören Brinkmann @ 2014-10-02 16:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, John Stultz, Pavel Machek, Len Brown,
	linux-kernel, linux-pm

Hi Rafael,

On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > On platforms that do not power off during suspend, successfully entering
> > suspend races with timers.
> > 
> > The race happening in a couple of location is:
> > 
> >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> >      ...
> >   2. syscore_suspend()
> >       -> timekeeping_suspend()
> >        -> clockevents_notify(SUSPEND)
> >         -> tick_suspend()   	(timers are turned off here)
> >      ...
> >   3. wfi            		(wait for wake-IRQ here)
> > 
> > Between steps 1 and 2 the timers can still generate interrupts that are
> > not handled and stay pending until step 3. That pending IRQ causes an
> > immediate - spurious - wake.
> > 
> > The solution is to move the clockevents suspend/resume notification
> > out of the syscore_suspend step and explictly call them at the appropriate
> > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > IRQs get disabled. And accordingly in the resume path.
> > 
> > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > ---
> > Hi,
> > 
> > there was not a lot of discussion on the last submission. Just one comment from
> > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > response, does not apply, IMHO, since the platform does not re-enable
> > interrupts.
> 
> Well, you just don't agree with it.
> 
> The problem with your approach is that timer interrupts aren't actually as
> special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> similar issues to appear under specific conditions.
> 
> The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> interrupts would be to use a wait_event() loop like the one in freeze_enter()
> (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> but wait for pm_abort_suspend to become true, to implement system suspend.

sorry, it took me a while since I needed to get some dependencies ported
to the pm-genirq base. Once I had that, it reproduced my original issue.
So far so good. I then looked into finding a solution following your
guidance. I'm not sure I really found what you had in mind, but below is
what I came up with, which seems to do it.
Please let me know how far off I am.

	Thanks,
	Sören

-------8<------------------8<----------------8<----------------8<---------------
diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index c2744b30d5d9..a4f9914571f1 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -25,7 +25,7 @@
 bool events_check_enabled __read_mostly;
 
 /* If set and the system is suspending, terminate the suspend. */
-static bool pm_abort_suspend __read_mostly;
+bool pm_abort_suspend __read_mostly;
 
 /*
  * Combined counters of registered wakeup events and wakeup events in progress.
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 6dadb25cb0d8..e6a6de8f76d0 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -33,6 +33,7 @@
 
 static const char *pm_labels[] = { "mem", "standby", "freeze", };
 const char *pm_states[PM_SUSPEND_MAX];
+extern bool pm_abort_suspend;
 
 static const struct platform_suspend_ops *suspend_ops;
 static const struct platform_freeze_ops *freeze_ops;
@@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 	if (error || suspend_test(TEST_CPUS))
 		goto Enable_cpus;
 
-	arch_suspend_disable_irqs();
-	BUG_ON(!irqs_disabled());
-
-	error = syscore_suspend();
-	if (!error) {
-		*wakeup = pm_wakeup_pending();
-		if (!(suspend_test(TEST_CORE) || *wakeup)) {
-			trace_suspend_resume(TPS("machine_suspend"),
-				state, true);
-			error = suspend_ops->enter(state);
-			trace_suspend_resume(TPS("machine_suspend"),
-				state, false);
-			events_check_enabled = false;
+	while (!pm_abort_suspend) {
+		arch_suspend_disable_irqs();
+		BUG_ON(!irqs_disabled());
+
+		error = syscore_suspend();
+		if (!error) {
+			*wakeup = pm_wakeup_pending();
+			if (!(suspend_test(TEST_CORE) || *wakeup)) {
+				trace_suspend_resume(TPS("machine_suspend"),
+					state, true);
+				error = suspend_ops->enter(state);
+				trace_suspend_resume(TPS("machine_suspend"),
+					state, false);
+				events_check_enabled = false;
+			}
+			syscore_resume();
 		}
-		syscore_resume();
-	}
 
-	arch_suspend_enable_irqs();
-	BUG_ON(irqs_disabled());
+		arch_suspend_enable_irqs();
+		BUG_ON(irqs_disabled());
+	}
 
  Enable_cpus:
 	enable_nonboot_cpus();

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2014-10-02 16:01   ` Sören Brinkmann
@ 2014-10-28 20:40     ` Sören Brinkmann
  2014-11-06  0:33     ` Rafael J. Wysocki
  1 sibling, 0 replies; 14+ messages in thread
From: Sören Brinkmann @ 2014-10-28 20:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, John Stultz, Pavel Machek, Len Brown,
	linux-kernel, linux-pm

Hi Rafael,

any opinion on this?

	Thanks,
	Sören

On Thu, 2014-10-02 at 09:01AM -0700, Sören Brinkmann wrote:
> Hi Rafael,
> 
> On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > On platforms that do not power off during suspend, successfully entering
> > > suspend races with timers.
> > > 
> > > The race happening in a couple of location is:
> > > 
> > >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> > >      ...
> > >   2. syscore_suspend()
> > >       -> timekeeping_suspend()
> > >        -> clockevents_notify(SUSPEND)
> > >         -> tick_suspend()   	(timers are turned off here)
> > >      ...
> > >   3. wfi            		(wait for wake-IRQ here)
> > > 
> > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > not handled and stay pending until step 3. That pending IRQ causes an
> > > immediate - spurious - wake.
> > > 
> > > The solution is to move the clockevents suspend/resume notification
> > > out of the syscore_suspend step and explictly call them at the appropriate
> > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > IRQs get disabled. And accordingly in the resume path.
> > > 
> > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > ---
> > > Hi,
> > > 
> > > there was not a lot of discussion on the last submission. Just one comment from
> > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > response, does not apply, IMHO, since the platform does not re-enable
> > > interrupts.
> > 
> > Well, you just don't agree with it.
> > 
> > The problem with your approach is that timer interrupts aren't actually as
> > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > similar issues to appear under specific conditions.
> > 
> > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > but wait for pm_abort_suspend to become true, to implement system suspend.
> 
> sorry, it took me a while since I needed to get some dependencies ported
> to the pm-genirq base. Once I had that, it reproduced my original issue.
> So far so good. I then looked into finding a solution following your
> guidance. I'm not sure I really found what you had in mind, but below is
> what I came up with, which seems to do it.
> Please let me know how far off I am.
> 
> 	Thanks,
> 	Sören
> 
> -------8<------------------8<----------------8<----------------8<---------------
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index c2744b30d5d9..a4f9914571f1 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -25,7 +25,7 @@
>  bool events_check_enabled __read_mostly;
>  
>  /* If set and the system is suspending, terminate the suspend. */
> -static bool pm_abort_suspend __read_mostly;
> +bool pm_abort_suspend __read_mostly;
>  
>  /*
>   * Combined counters of registered wakeup events and wakeup events in progress.
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 6dadb25cb0d8..e6a6de8f76d0 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -33,6 +33,7 @@
>  
>  static const char *pm_labels[] = { "mem", "standby", "freeze", };
>  const char *pm_states[PM_SUSPEND_MAX];
> +extern bool pm_abort_suspend;
>  
>  static const struct platform_suspend_ops *suspend_ops;
>  static const struct platform_freeze_ops *freeze_ops;
> @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
>  	if (error || suspend_test(TEST_CPUS))
>  		goto Enable_cpus;
>  
> -	arch_suspend_disable_irqs();
> -	BUG_ON(!irqs_disabled());
> -
> -	error = syscore_suspend();
> -	if (!error) {
> -		*wakeup = pm_wakeup_pending();
> -		if (!(suspend_test(TEST_CORE) || *wakeup)) {
> -			trace_suspend_resume(TPS("machine_suspend"),
> -				state, true);
> -			error = suspend_ops->enter(state);
> -			trace_suspend_resume(TPS("machine_suspend"),
> -				state, false);
> -			events_check_enabled = false;
> +	while (!pm_abort_suspend) {
> +		arch_suspend_disable_irqs();
> +		BUG_ON(!irqs_disabled());
> +
> +		error = syscore_suspend();
> +		if (!error) {
> +			*wakeup = pm_wakeup_pending();
> +			if (!(suspend_test(TEST_CORE) || *wakeup)) {
> +				trace_suspend_resume(TPS("machine_suspend"),
> +					state, true);
> +				error = suspend_ops->enter(state);
> +				trace_suspend_resume(TPS("machine_suspend"),
> +					state, false);
> +				events_check_enabled = false;
> +			}
> +			syscore_resume();
>  		}
> -		syscore_resume();
> -	}
>  
> -	arch_suspend_enable_irqs();
> -	BUG_ON(irqs_disabled());
> +		arch_suspend_enable_irqs();
> +		BUG_ON(irqs_disabled());
> +	}
>  
>   Enable_cpus:
>  	enable_nonboot_cpus();
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2014-10-02 16:01   ` Sören Brinkmann
  2014-10-28 20:40     ` Sören Brinkmann
@ 2014-11-06  0:33     ` Rafael J. Wysocki
  2014-11-08 23:56       ` Sören Brinkmann
  1 sibling, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2014-11-06  0:33 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Thomas Gleixner, John Stultz, Pavel Machek, Len Brown,
	linux-kernel, linux-pm

On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> Hi Rafael,

Hi,

Sorry for the huge delay.

> On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > On platforms that do not power off during suspend, successfully entering
> > > suspend races with timers.
> > > 
> > > The race happening in a couple of location is:
> > > 
> > >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> > >      ...
> > >   2. syscore_suspend()
> > >       -> timekeeping_suspend()
> > >        -> clockevents_notify(SUSPEND)
> > >         -> tick_suspend()   	(timers are turned off here)
> > >      ...
> > >   3. wfi            		(wait for wake-IRQ here)
> > > 
> > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > not handled and stay pending until step 3. That pending IRQ causes an
> > > immediate - spurious - wake.
> > > 
> > > The solution is to move the clockevents suspend/resume notification
> > > out of the syscore_suspend step and explictly call them at the appropriate
> > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > IRQs get disabled. And accordingly in the resume path.
> > > 
> > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > ---
> > > Hi,
> > > 
> > > there was not a lot of discussion on the last submission. Just one comment from
> > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > response, does not apply, IMHO, since the platform does not re-enable
> > > interrupts.
> > 
> > Well, you just don't agree with it.
> > 
> > The problem with your approach is that timer interrupts aren't actually as
> > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > similar issues to appear under specific conditions.
> > 
> > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > but wait for pm_abort_suspend to become true, to implement system suspend.
> 
> sorry, it took me a while since I needed to get some dependencies ported
> to the pm-genirq base. Once I had that, it reproduced my original issue.
> So far so good. I then looked into finding a solution following your
> guidance. I'm not sure I really found what you had in mind, but below is
> what I came up with, which seems to do it.
> Please let me know how far off I am.
> 
> 	Thanks,
> 	Sören
> 
> -------8<------------------8<----------------8<----------------8<---------------
> diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> index c2744b30d5d9..a4f9914571f1 100644
> --- a/drivers/base/power/wakeup.c
> +++ b/drivers/base/power/wakeup.c
> @@ -25,7 +25,7 @@
>  bool events_check_enabled __read_mostly;
>  
>  /* If set and the system is suspending, terminate the suspend. */
> -static bool pm_abort_suspend __read_mostly;
> +bool pm_abort_suspend __read_mostly;
>  
>  /*
>   * Combined counters of registered wakeup events and wakeup events in progress.
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 6dadb25cb0d8..e6a6de8f76d0 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -33,6 +33,7 @@
>  
>  static const char *pm_labels[] = { "mem", "standby", "freeze", };
>  const char *pm_states[PM_SUSPEND_MAX];
> +extern bool pm_abort_suspend;
>  
>  static const struct platform_suspend_ops *suspend_ops;
>  static const struct platform_freeze_ops *freeze_ops;
> @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
>  	if (error || suspend_test(TEST_CPUS))
>  		goto Enable_cpus;
>  
> -	arch_suspend_disable_irqs();
> -	BUG_ON(!irqs_disabled());
> -
> -	error = syscore_suspend();
> -	if (!error) {
> -		*wakeup = pm_wakeup_pending();
> -		if (!(suspend_test(TEST_CORE) || *wakeup)) {
> -			trace_suspend_resume(TPS("machine_suspend"),
> -				state, true);
> -			error = suspend_ops->enter(state);
> -			trace_suspend_resume(TPS("machine_suspend"),
> -				state, false);
> -			events_check_enabled = false;
> +	while (!pm_abort_suspend) {

That won't work in general, because pm_abort_suspend may not be set on some
platforms on wakeup.  It is only set if a wakeup interrupt triggers which
may not be the case on ACPI systems if the BIOS has woken up the system.

But that could be addressed by making those platforms simply set pm_wakeup_pending
in their BIOS exit path.

> +		arch_suspend_disable_irqs();
> +		BUG_ON(!irqs_disabled());
> +
> +		error = syscore_suspend();

Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
every iteration of the loop.

> +		if (!error) {
> +			*wakeup = pm_wakeup_pending();

Plus pm_wakeup_pending() returns true if pm_abort_suspend is set

> +			if (!(suspend_test(TEST_CORE) || *wakeup)) {
> +				trace_suspend_resume(TPS("machine_suspend"),
> +					state, true);

Did you try to add the loop here instead of above?  Like:

			for (;;) {
				*wakeup = pm_wakeup_pending();
				if (*wakeup)
					break;

> +				error = suspend_ops->enter(state);

			}

> +				trace_suspend_resume(TPS("machine_suspend"),
> +					state, false);
> +				events_check_enabled = false;
> +			}
> +			syscore_resume();
>  		}
> -		syscore_resume();
> -	}
>  
> -	arch_suspend_enable_irqs();
> -	BUG_ON(irqs_disabled());
> +		arch_suspend_enable_irqs();
> +		BUG_ON(irqs_disabled());
> +	}
>  
>   Enable_cpus:
>  	enable_nonboot_cpus();
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2014-11-06  0:33     ` Rafael J. Wysocki
@ 2014-11-08 23:56       ` Sören Brinkmann
  2015-01-09 21:50         ` Sören Brinkmann
  0 siblings, 1 reply; 14+ messages in thread
From: Sören Brinkmann @ 2014-11-08 23:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, John Stultz, Pavel Machek, Len Brown,
	linux-kernel, linux-pm

Hi Rafael,

On Thu, 2014-11-06 at 01:33AM +0100, Rafael J. Wysocki wrote:
> On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> > Hi Rafael,
> 
> Hi,
> 
> Sorry for the huge delay.
> 
> > On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > > On platforms that do not power off during suspend, successfully entering
> > > > suspend races with timers.
> > > > 
> > > > The race happening in a couple of location is:
> > > > 
> > > >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> > > >      ...
> > > >   2. syscore_suspend()
> > > >       -> timekeeping_suspend()
> > > >        -> clockevents_notify(SUSPEND)
> > > >         -> tick_suspend()   	(timers are turned off here)
> > > >      ...
> > > >   3. wfi            		(wait for wake-IRQ here)
> > > > 
> > > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > > not handled and stay pending until step 3. That pending IRQ causes an
> > > > immediate - spurious - wake.
> > > > 
> > > > The solution is to move the clockevents suspend/resume notification
> > > > out of the syscore_suspend step and explictly call them at the appropriate
> > > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > > IRQs get disabled. And accordingly in the resume path.
> > > > 
> > > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > > ---
> > > > Hi,
> > > > 
> > > > there was not a lot of discussion on the last submission. Just one comment from
> > > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > > response, does not apply, IMHO, since the platform does not re-enable
> > > > interrupts.
> > > 
> > > Well, you just don't agree with it.
> > > 
> > > The problem with your approach is that timer interrupts aren't actually as
> > > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > > similar issues to appear under specific conditions.
> > > 
> > > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > > but wait for pm_abort_suspend to become true, to implement system suspend.
> > 
> > sorry, it took me a while since I needed to get some dependencies ported
> > to the pm-genirq base. Once I had that, it reproduced my original issue.
> > So far so good. I then looked into finding a solution following your
> > guidance. I'm not sure I really found what you had in mind, but below is
> > what I came up with, which seems to do it.
> > Please let me know how far off I am.
> > 
> > 	Thanks,
> > 	Sören
> > 
> > -------8<------------------8<----------------8<----------------8<---------------
> > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > index c2744b30d5d9..a4f9914571f1 100644
> > --- a/drivers/base/power/wakeup.c
> > +++ b/drivers/base/power/wakeup.c
> > @@ -25,7 +25,7 @@
> >  bool events_check_enabled __read_mostly;
> >  
> >  /* If set and the system is suspending, terminate the suspend. */
> > -static bool pm_abort_suspend __read_mostly;
> > +bool pm_abort_suspend __read_mostly;
> >  
> >  /*
> >   * Combined counters of registered wakeup events and wakeup events in progress.
> > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > index 6dadb25cb0d8..e6a6de8f76d0 100644
> > --- a/kernel/power/suspend.c
> > +++ b/kernel/power/suspend.c
> > @@ -33,6 +33,7 @@
> >  
> >  static const char *pm_labels[] = { "mem", "standby", "freeze", };
> >  const char *pm_states[PM_SUSPEND_MAX];
> > +extern bool pm_abort_suspend;
> >  
> >  static const struct platform_suspend_ops *suspend_ops;
> >  static const struct platform_freeze_ops *freeze_ops;
> > @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> >  	if (error || suspend_test(TEST_CPUS))
> >  		goto Enable_cpus;
> >  
> > -	arch_suspend_disable_irqs();
> > -	BUG_ON(!irqs_disabled());
> > -
> > -	error = syscore_suspend();
> > -	if (!error) {
> > -		*wakeup = pm_wakeup_pending();
> > -		if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > -			trace_suspend_resume(TPS("machine_suspend"),
> > -				state, true);
> > -			error = suspend_ops->enter(state);
> > -			trace_suspend_resume(TPS("machine_suspend"),
> > -				state, false);
> > -			events_check_enabled = false;
> > +	while (!pm_abort_suspend) {
> 
> That won't work in general, because pm_abort_suspend may not be set on some
> platforms on wakeup.  It is only set if a wakeup interrupt triggers which
> may not be the case on ACPI systems if the BIOS has woken up the system.
> 
> But that could be addressed by making those platforms simply set pm_wakeup_pending
> in their BIOS exit path.
> 
> > +		arch_suspend_disable_irqs();
> > +		BUG_ON(!irqs_disabled());
> > +
> > +		error = syscore_suspend();
> 
> Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
> every iteration of the loop.
> 
> > +		if (!error) {
> > +			*wakeup = pm_wakeup_pending();
> 
> Plus pm_wakeup_pending() returns true if pm_abort_suspend is set
> 
> > +			if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > +				trace_suspend_resume(TPS("machine_suspend"),
> > +					state, true);
> 
> Did you try to add the loop here instead of above?  Like:
> 
> 			for (;;) {
> 				*wakeup = pm_wakeup_pending();
> 				if (*wakeup)
> 					break;

I think, that doesn't work. I chose the start/end points of the loop
to include the IRQ enable/disable calls. AFAICT, pm_abort_suspend is
set in an ISR. Without enabling interrupts the abort condition of
this loop never becomes true.

	Thanks,
	Sören

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2014-11-08 23:56       ` Sören Brinkmann
@ 2015-01-09 21:50         ` Sören Brinkmann
  2015-01-11 23:20           ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Sören Brinkmann @ 2015-01-09 21:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, John Stultz, Pavel Machek, Len Brown,
	linux-kernel, linux-pm

On Sat, 2014-11-08 at 03:56PM -0800, Sören Brinkmann wrote:
> Hi Rafael,
> 
> On Thu, 2014-11-06 at 01:33AM +0100, Rafael J. Wysocki wrote:
> > On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> > > Hi Rafael,
> > 
> > Hi,
> > 
> > Sorry for the huge delay.
> > 
> > > On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > > > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > > > On platforms that do not power off during suspend, successfully entering
> > > > > suspend races with timers.
> > > > > 
> > > > > The race happening in a couple of location is:
> > > > > 
> > > > >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> > > > >      ...
> > > > >   2. syscore_suspend()
> > > > >       -> timekeeping_suspend()
> > > > >        -> clockevents_notify(SUSPEND)
> > > > >         -> tick_suspend()   	(timers are turned off here)
> > > > >      ...
> > > > >   3. wfi            		(wait for wake-IRQ here)
> > > > > 
> > > > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > > > not handled and stay pending until step 3. That pending IRQ causes an
> > > > > immediate - spurious - wake.
> > > > > 
> > > > > The solution is to move the clockevents suspend/resume notification
> > > > > out of the syscore_suspend step and explictly call them at the appropriate
> > > > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > > > IRQs get disabled. And accordingly in the resume path.
> > > > > 
> > > > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > > > ---
> > > > > Hi,
> > > > > 
> > > > > there was not a lot of discussion on the last submission. Just one comment from
> > > > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > > > response, does not apply, IMHO, since the platform does not re-enable
> > > > > interrupts.
> > > > 
> > > > Well, you just don't agree with it.
> > > > 
> > > > The problem with your approach is that timer interrupts aren't actually as
> > > > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > > > similar issues to appear under specific conditions.
> > > > 
> > > > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > > > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > > > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > > > but wait for pm_abort_suspend to become true, to implement system suspend.
> > > 
> > > sorry, it took me a while since I needed to get some dependencies ported
> > > to the pm-genirq base. Once I had that, it reproduced my original issue.
> > > So far so good. I then looked into finding a solution following your
> > > guidance. I'm not sure I really found what you had in mind, but below is
> > > what I came up with, which seems to do it.
> > > Please let me know how far off I am.
> > > 
> > > 	Thanks,
> > > 	Sören
> > > 
> > > -------8<------------------8<----------------8<----------------8<---------------
> > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > > index c2744b30d5d9..a4f9914571f1 100644
> > > --- a/drivers/base/power/wakeup.c
> > > +++ b/drivers/base/power/wakeup.c
> > > @@ -25,7 +25,7 @@
> > >  bool events_check_enabled __read_mostly;
> > >  
> > >  /* If set and the system is suspending, terminate the suspend. */
> > > -static bool pm_abort_suspend __read_mostly;
> > > +bool pm_abort_suspend __read_mostly;
> > >  
> > >  /*
> > >   * Combined counters of registered wakeup events and wakeup events in progress.
> > > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > > index 6dadb25cb0d8..e6a6de8f76d0 100644
> > > --- a/kernel/power/suspend.c
> > > +++ b/kernel/power/suspend.c
> > > @@ -33,6 +33,7 @@
> > >  
> > >  static const char *pm_labels[] = { "mem", "standby", "freeze", };
> > >  const char *pm_states[PM_SUSPEND_MAX];
> > > +extern bool pm_abort_suspend;
> > >  
> > >  static const struct platform_suspend_ops *suspend_ops;
> > >  static const struct platform_freeze_ops *freeze_ops;
> > > @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> > >  	if (error || suspend_test(TEST_CPUS))
> > >  		goto Enable_cpus;
> > >  
> > > -	arch_suspend_disable_irqs();
> > > -	BUG_ON(!irqs_disabled());
> > > -
> > > -	error = syscore_suspend();
> > > -	if (!error) {
> > > -		*wakeup = pm_wakeup_pending();
> > > -		if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > -				state, true);
> > > -			error = suspend_ops->enter(state);
> > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > -				state, false);
> > > -			events_check_enabled = false;
> > > +	while (!pm_abort_suspend) {
> > 
> > That won't work in general, because pm_abort_suspend may not be set on some
> > platforms on wakeup.  It is only set if a wakeup interrupt triggers which
> > may not be the case on ACPI systems if the BIOS has woken up the system.
> > 
> > But that could be addressed by making those platforms simply set pm_wakeup_pending
> > in their BIOS exit path.
> > 
> > > +		arch_suspend_disable_irqs();
> > > +		BUG_ON(!irqs_disabled());
> > > +
> > > +		error = syscore_suspend();
> > 
> > Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
> > every iteration of the loop.
> > 
> > > +		if (!error) {
> > > +			*wakeup = pm_wakeup_pending();
> > 
> > Plus pm_wakeup_pending() returns true if pm_abort_suspend is set
> > 
> > > +			if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > +				trace_suspend_resume(TPS("machine_suspend"),
> > > +					state, true);
> > 
> > Did you try to add the loop here instead of above?  Like:
> > 
> > 			for (;;) {
> > 				*wakeup = pm_wakeup_pending();
> > 				if (*wakeup)
> > 					break;
> 
> I think, that doesn't work. I chose the start/end points of the loop
> to include the IRQ enable/disable calls. AFAICT, pm_abort_suspend is
> set in an ISR. Without enabling interrupts the abort condition of
> this loop never becomes true.

Any further ideas how to resolve this?

	Sören

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2015-01-09 21:50         ` Sören Brinkmann
@ 2015-01-11 23:20           ` Rafael J. Wysocki
  2015-01-12 15:43             ` Lorenzo Pieralisi
  2015-01-23 17:46             ` Sören Brinkmann
  0 siblings, 2 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2015-01-11 23:20 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Thomas Gleixner, John Stultz, Pavel Machek, Len Brown,
	linux-kernel, linux-pm

On Friday, January 09, 2015 01:50:59 PM Sören Brinkmann wrote:
> On Sat, 2014-11-08 at 03:56PM -0800, Sören Brinkmann wrote:
> > Hi Rafael,
> > 
> > On Thu, 2014-11-06 at 01:33AM +0100, Rafael J. Wysocki wrote:
> > > On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> > > > Hi Rafael,
> > > 
> > > Hi,
> > > 
> > > Sorry for the huge delay.
> > > 
> > > > On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > > > > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > > > > On platforms that do not power off during suspend, successfully entering
> > > > > > suspend races with timers.
> > > > > > 
> > > > > > The race happening in a couple of location is:
> > > > > > 
> > > > > >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> > > > > >      ...
> > > > > >   2. syscore_suspend()
> > > > > >       -> timekeeping_suspend()
> > > > > >        -> clockevents_notify(SUSPEND)
> > > > > >         -> tick_suspend()   	(timers are turned off here)
> > > > > >      ...
> > > > > >   3. wfi            		(wait for wake-IRQ here)
> > > > > > 
> > > > > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > > > > not handled and stay pending until step 3. That pending IRQ causes an
> > > > > > immediate - spurious - wake.
> > > > > > 
> > > > > > The solution is to move the clockevents suspend/resume notification
> > > > > > out of the syscore_suspend step and explictly call them at the appropriate
> > > > > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > > > > IRQs get disabled. And accordingly in the resume path.
> > > > > > 
> > > > > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > > > > ---
> > > > > > Hi,
> > > > > > 
> > > > > > there was not a lot of discussion on the last submission. Just one comment from
> > > > > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > > > > response, does not apply, IMHO, since the platform does not re-enable
> > > > > > interrupts.
> > > > > 
> > > > > Well, you just don't agree with it.
> > > > > 
> > > > > The problem with your approach is that timer interrupts aren't actually as
> > > > > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > > > > similar issues to appear under specific conditions.
> > > > > 
> > > > > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > > > > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > > > > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > > > > but wait for pm_abort_suspend to become true, to implement system suspend.
> > > > 
> > > > sorry, it took me a while since I needed to get some dependencies ported
> > > > to the pm-genirq base. Once I had that, it reproduced my original issue.
> > > > So far so good. I then looked into finding a solution following your
> > > > guidance. I'm not sure I really found what you had in mind, but below is
> > > > what I came up with, which seems to do it.
> > > > Please let me know how far off I am.
> > > > 
> > > > 	Thanks,
> > > > 	Sören
> > > > 
> > > > -------8<------------------8<----------------8<----------------8<---------------
> > > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > > > index c2744b30d5d9..a4f9914571f1 100644
> > > > --- a/drivers/base/power/wakeup.c
> > > > +++ b/drivers/base/power/wakeup.c
> > > > @@ -25,7 +25,7 @@
> > > >  bool events_check_enabled __read_mostly;
> > > >  
> > > >  /* If set and the system is suspending, terminate the suspend. */
> > > > -static bool pm_abort_suspend __read_mostly;
> > > > +bool pm_abort_suspend __read_mostly;
> > > >  
> > > >  /*
> > > >   * Combined counters of registered wakeup events and wakeup events in progress.
> > > > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > > > index 6dadb25cb0d8..e6a6de8f76d0 100644
> > > > --- a/kernel/power/suspend.c
> > > > +++ b/kernel/power/suspend.c
> > > > @@ -33,6 +33,7 @@
> > > >  
> > > >  static const char *pm_labels[] = { "mem", "standby", "freeze", };
> > > >  const char *pm_states[PM_SUSPEND_MAX];
> > > > +extern bool pm_abort_suspend;
> > > >  
> > > >  static const struct platform_suspend_ops *suspend_ops;
> > > >  static const struct platform_freeze_ops *freeze_ops;
> > > > @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> > > >  	if (error || suspend_test(TEST_CPUS))
> > > >  		goto Enable_cpus;
> > > >  
> > > > -	arch_suspend_disable_irqs();
> > > > -	BUG_ON(!irqs_disabled());
> > > > -
> > > > -	error = syscore_suspend();
> > > > -	if (!error) {
> > > > -		*wakeup = pm_wakeup_pending();
> > > > -		if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > > -				state, true);
> > > > -			error = suspend_ops->enter(state);
> > > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > > -				state, false);
> > > > -			events_check_enabled = false;
> > > > +	while (!pm_abort_suspend) {
> > > 
> > > That won't work in general, because pm_abort_suspend may not be set on some
> > > platforms on wakeup.  It is only set if a wakeup interrupt triggers which
> > > may not be the case on ACPI systems if the BIOS has woken up the system.
> > > 
> > > But that could be addressed by making those platforms simply set pm_wakeup_pending
> > > in their BIOS exit path.
> > > 
> > > > +		arch_suspend_disable_irqs();
> > > > +		BUG_ON(!irqs_disabled());
> > > > +
> > > > +		error = syscore_suspend();
> > > 
> > > Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
> > > every iteration of the loop.
> > > 
> > > > +		if (!error) {
> > > > +			*wakeup = pm_wakeup_pending();
> > > 
> > > Plus pm_wakeup_pending() returns true if pm_abort_suspend is set
> > > 
> > > > +			if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > +				trace_suspend_resume(TPS("machine_suspend"),
> > > > +					state, true);
> > > 
> > > Did you try to add the loop here instead of above?  Like:
> > > 
> > > 			for (;;) {
> > > 				*wakeup = pm_wakeup_pending();
> > > 				if (*wakeup)
> > > 					break;
> > 
> > I think, that doesn't work. I chose the start/end points of the loop
> > to include the IRQ enable/disable calls. AFAICT, pm_abort_suspend is
> > set in an ISR. Without enabling interrupts the abort condition of
> > this loop never becomes true.
> 
> Any further ideas how to resolve this?

Sorry about the delay, lost track of this.

You're right, the IRQ disabling/enabling needs to happen in the loop.

So the direction of the patch looks OK, but it needs to ensure that pm_wakeup_pending
is set properly by all platforms.  Also it should be sufficient to check
pm_wakeup_pending() to detect wakeup.

Have you tested the patch? 


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2015-01-11 23:20           ` Rafael J. Wysocki
@ 2015-01-12 15:43             ` Lorenzo Pieralisi
  2015-01-12 15:55               ` Sören Brinkmann
  2015-01-23 17:46             ` Sören Brinkmann
  1 sibling, 1 reply; 14+ messages in thread
From: Lorenzo Pieralisi @ 2015-01-12 15:43 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sören Brinkmann, Thomas Gleixner, John Stultz, Pavel Machek,
	Len Brown, linux-kernel, linux-pm

Hi Rafael, Soren,

On Sun, Jan 11, 2015 at 11:20:36PM +0000, Rafael J. Wysocki wrote:
> On Friday, January 09, 2015 01:50:59 PM Sören Brinkmann wrote:
> > On Sat, 2014-11-08 at 03:56PM -0800, Sören Brinkmann wrote:
> > > Hi Rafael,
> > > 
> > > On Thu, 2014-11-06 at 01:33AM +0100, Rafael J. Wysocki wrote:
> > > > On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> > > > > Hi Rafael,
> > > > 
> > > > Hi,
> > > > 
> > > > Sorry for the huge delay.
> > > > 
> > > > > On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > > > > > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > > > > > On platforms that do not power off during suspend, successfully entering
> > > > > > > suspend races with timers.
> > > > > > > 
> > > > > > > The race happening in a couple of location is:
> > > > > > > 
> > > > > > >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> > > > > > >      ...
> > > > > > >   2. syscore_suspend()
> > > > > > >       -> timekeeping_suspend()
> > > > > > >        -> clockevents_notify(SUSPEND)
> > > > > > >         -> tick_suspend()   	(timers are turned off here)
> > > > > > >      ...
> > > > > > >   3. wfi            		(wait for wake-IRQ here)
> > > > > > > 
> > > > > > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > > > > > not handled and stay pending until step 3. That pending IRQ causes an
> > > > > > > immediate - spurious - wake.
> > > > > > > 
> > > > > > > The solution is to move the clockevents suspend/resume notification
> > > > > > > out of the syscore_suspend step and explictly call them at the appropriate
> > > > > > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > > > > > IRQs get disabled. And accordingly in the resume path.
> > > > > > > 
> > > > > > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > > > > > ---
> > > > > > > Hi,
> > > > > > > 
> > > > > > > there was not a lot of discussion on the last submission. Just one comment from
> > > > > > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > > > > > response, does not apply, IMHO, since the platform does not re-enable
> > > > > > > interrupts.
> > > > > > 
> > > > > > Well, you just don't agree with it.
> > > > > > 
> > > > > > The problem with your approach is that timer interrupts aren't actually as
> > > > > > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > > > > > similar issues to appear under specific conditions.
> > > > > > 
> > > > > > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > > > > > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > > > > > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > > > > > but wait for pm_abort_suspend to become true, to implement system suspend.
> > > > > 
> > > > > sorry, it took me a while since I needed to get some dependencies ported
> > > > > to the pm-genirq base. Once I had that, it reproduced my original issue.
> > > > > So far so good. I then looked into finding a solution following your
> > > > > guidance. I'm not sure I really found what you had in mind, but below is
> > > > > what I came up with, which seems to do it.
> > > > > Please let me know how far off I am.
> > > > > 
> > > > > 	Thanks,
> > > > > 	Sören
> > > > > 
> > > > > -------8<------------------8<----------------8<----------------8<---------------
> > > > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > > > > index c2744b30d5d9..a4f9914571f1 100644
> > > > > --- a/drivers/base/power/wakeup.c
> > > > > +++ b/drivers/base/power/wakeup.c
> > > > > @@ -25,7 +25,7 @@
> > > > >  bool events_check_enabled __read_mostly;
> > > > >  
> > > > >  /* If set and the system is suspending, terminate the suspend. */
> > > > > -static bool pm_abort_suspend __read_mostly;
> > > > > +bool pm_abort_suspend __read_mostly;
> > > > >  
> > > > >  /*
> > > > >   * Combined counters of registered wakeup events and wakeup events in progress.
> > > > > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > > > > index 6dadb25cb0d8..e6a6de8f76d0 100644
> > > > > --- a/kernel/power/suspend.c
> > > > > +++ b/kernel/power/suspend.c
> > > > > @@ -33,6 +33,7 @@
> > > > >  
> > > > >  static const char *pm_labels[] = { "mem", "standby", "freeze", };
> > > > >  const char *pm_states[PM_SUSPEND_MAX];
> > > > > +extern bool pm_abort_suspend;
> > > > >  
> > > > >  static const struct platform_suspend_ops *suspend_ops;
> > > > >  static const struct platform_freeze_ops *freeze_ops;
> > > > > @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> > > > >  	if (error || suspend_test(TEST_CPUS))
> > > > >  		goto Enable_cpus;
> > > > >  
> > > > > -	arch_suspend_disable_irqs();
> > > > > -	BUG_ON(!irqs_disabled());
> > > > > -
> > > > > -	error = syscore_suspend();
> > > > > -	if (!error) {
> > > > > -		*wakeup = pm_wakeup_pending();
> > > > > -		if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > > > -				state, true);
> > > > > -			error = suspend_ops->enter(state);
> > > > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > > > -				state, false);
> > > > > -			events_check_enabled = false;
> > > > > +	while (!pm_abort_suspend) {
> > > > 
> > > > That won't work in general, because pm_abort_suspend may not be set on some
> > > > platforms on wakeup.  It is only set if a wakeup interrupt triggers which
> > > > may not be the case on ACPI systems if the BIOS has woken up the system.
> > > > 
> > > > But that could be addressed by making those platforms simply set pm_wakeup_pending
> > > > in their BIOS exit path.
> > > > 
> > > > > +		arch_suspend_disable_irqs();
> > > > > +		BUG_ON(!irqs_disabled());
> > > > > +
> > > > > +		error = syscore_suspend();
> > > > 
> > > > Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
> > > > every iteration of the loop.
> > > > 
> > > > > +		if (!error) {
> > > > > +			*wakeup = pm_wakeup_pending();
> > > > 
> > > > Plus pm_wakeup_pending() returns true if pm_abort_suspend is set
> > > > 
> > > > > +			if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > +				trace_suspend_resume(TPS("machine_suspend"),
> > > > > +					state, true);
> > > > 
> > > > Did you try to add the loop here instead of above?  Like:
> > > > 
> > > > 			for (;;) {
> > > > 				*wakeup = pm_wakeup_pending();
> > > > 				if (*wakeup)
> > > > 					break;
> > > 
> > > I think, that doesn't work. I chose the start/end points of the loop
> > > to include the IRQ enable/disable calls. AFAICT, pm_abort_suspend is
> > > set in an ISR. Without enabling interrupts the abort condition of
> > > this loop never becomes true.
> > 
> > Any further ideas how to resolve this?
> 
> Sorry about the delay, lost track of this.
> 
> You're right, the IRQ disabling/enabling needs to happen in the loop.
> 
> So the direction of the patch looks OK, but it needs to ensure that pm_wakeup_pending
> is set properly by all platforms.  Also it should be sufficient to check
> pm_wakeup_pending() to detect wakeup.
> 
> Have you tested the patch? 

Before considering this patch a solution, can I ask you to rewind
the discussion a bit since I have a question.

I thought that "suspending" the tick through syscore meant shutting down
the respective clock_event_device, and that's how it is implemented in the
kernel.

Now, do we expect a shutdown clock_event_device to still signal pending
IRQs ? I do not think that should be the case, at least that's not what
happens for eg arm arch timers - ie disabling them implicitly gates
the signal connected to the IRQ line.

So the question is more related to the zynq platform and how their clock
event device (which is ?) is shutdown, and what's the correct behaviour we
are expecting.

FWIW, the problem here is not related to the simple wfi state on zynq,
even some other ARM platforms with power management capabilities would wake
up from the state entered by executing wfi (ie possibly through reset) if
there is a pending timer IRQ, the question is more "should the IRQ be
allowed to be there" instead IMHO.

I still think that Stephen's query related to what timer is causing
the wake-up is worth investigating.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2015-01-12 15:43             ` Lorenzo Pieralisi
@ 2015-01-12 15:55               ` Sören Brinkmann
  2015-01-12 16:14                 ` Lorenzo Pieralisi
  0 siblings, 1 reply; 14+ messages in thread
From: Sören Brinkmann @ 2015-01-12 15:55 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Rafael J. Wysocki, Thomas Gleixner, John Stultz, Pavel Machek,
	Len Brown, linux-kernel, linux-pm

On Mon, 2015-01-12 at 03:43PM +0000, Lorenzo Pieralisi wrote:
> Hi Rafael, Soren,
> 
> On Sun, Jan 11, 2015 at 11:20:36PM +0000, Rafael J. Wysocki wrote:
> > On Friday, January 09, 2015 01:50:59 PM Sören Brinkmann wrote:
> > > On Sat, 2014-11-08 at 03:56PM -0800, Sören Brinkmann wrote:
> > > > Hi Rafael,
> > > > 
> > > > On Thu, 2014-11-06 at 01:33AM +0100, Rafael J. Wysocki wrote:
> > > > > On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> > > > > > Hi Rafael,
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > Sorry for the huge delay.
> > > > > 
> > > > > > On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > > > > > > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > > > > > > On platforms that do not power off during suspend, successfully entering
> > > > > > > > suspend races with timers.
> > > > > > > > 
> > > > > > > > The race happening in a couple of location is:
> > > > > > > > 
> > > > > > > >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> > > > > > > >      ...
> > > > > > > >   2. syscore_suspend()
> > > > > > > >       -> timekeeping_suspend()
> > > > > > > >        -> clockevents_notify(SUSPEND)
> > > > > > > >         -> tick_suspend()   	(timers are turned off here)
> > > > > > > >      ...
> > > > > > > >   3. wfi            		(wait for wake-IRQ here)
> > > > > > > > 
> > > > > > > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > > > > > > not handled and stay pending until step 3. That pending IRQ causes an
> > > > > > > > immediate - spurious - wake.
> > > > > > > > 
> > > > > > > > The solution is to move the clockevents suspend/resume notification
> > > > > > > > out of the syscore_suspend step and explictly call them at the appropriate
> > > > > > > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > > > > > > IRQs get disabled. And accordingly in the resume path.
> > > > > > > > 
> > > > > > > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > > > > > > ---
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > there was not a lot of discussion on the last submission. Just one comment from
> > > > > > > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > > > > > > response, does not apply, IMHO, since the platform does not re-enable
> > > > > > > > interrupts.
> > > > > > > 
> > > > > > > Well, you just don't agree with it.
> > > > > > > 
> > > > > > > The problem with your approach is that timer interrupts aren't actually as
> > > > > > > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > > > > > > similar issues to appear under specific conditions.
> > > > > > > 
> > > > > > > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > > > > > > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > > > > > > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > > > > > > but wait for pm_abort_suspend to become true, to implement system suspend.
> > > > > > 
> > > > > > sorry, it took me a while since I needed to get some dependencies ported
> > > > > > to the pm-genirq base. Once I had that, it reproduced my original issue.
> > > > > > So far so good. I then looked into finding a solution following your
> > > > > > guidance. I'm not sure I really found what you had in mind, but below is
> > > > > > what I came up with, which seems to do it.
> > > > > > Please let me know how far off I am.
> > > > > > 
> > > > > > 	Thanks,
> > > > > > 	Sören
> > > > > > 
> > > > > > -------8<------------------8<----------------8<----------------8<---------------
> > > > > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > > > > > index c2744b30d5d9..a4f9914571f1 100644
> > > > > > --- a/drivers/base/power/wakeup.c
> > > > > > +++ b/drivers/base/power/wakeup.c
> > > > > > @@ -25,7 +25,7 @@
> > > > > >  bool events_check_enabled __read_mostly;
> > > > > >  
> > > > > >  /* If set and the system is suspending, terminate the suspend. */
> > > > > > -static bool pm_abort_suspend __read_mostly;
> > > > > > +bool pm_abort_suspend __read_mostly;
> > > > > >  
> > > > > >  /*
> > > > > >   * Combined counters of registered wakeup events and wakeup events in progress.
> > > > > > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > > > > > index 6dadb25cb0d8..e6a6de8f76d0 100644
> > > > > > --- a/kernel/power/suspend.c
> > > > > > +++ b/kernel/power/suspend.c
> > > > > > @@ -33,6 +33,7 @@
> > > > > >  
> > > > > >  static const char *pm_labels[] = { "mem", "standby", "freeze", };
> > > > > >  const char *pm_states[PM_SUSPEND_MAX];
> > > > > > +extern bool pm_abort_suspend;
> > > > > >  
> > > > > >  static const struct platform_suspend_ops *suspend_ops;
> > > > > >  static const struct platform_freeze_ops *freeze_ops;
> > > > > > @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> > > > > >  	if (error || suspend_test(TEST_CPUS))
> > > > > >  		goto Enable_cpus;
> > > > > >  
> > > > > > -	arch_suspend_disable_irqs();
> > > > > > -	BUG_ON(!irqs_disabled());
> > > > > > -
> > > > > > -	error = syscore_suspend();
> > > > > > -	if (!error) {
> > > > > > -		*wakeup = pm_wakeup_pending();
> > > > > > -		if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > > > > -				state, true);
> > > > > > -			error = suspend_ops->enter(state);
> > > > > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > > > > -				state, false);
> > > > > > -			events_check_enabled = false;
> > > > > > +	while (!pm_abort_suspend) {
> > > > > 
> > > > > That won't work in general, because pm_abort_suspend may not be set on some
> > > > > platforms on wakeup.  It is only set if a wakeup interrupt triggers which
> > > > > may not be the case on ACPI systems if the BIOS has woken up the system.
> > > > > 
> > > > > But that could be addressed by making those platforms simply set pm_wakeup_pending
> > > > > in their BIOS exit path.
> > > > > 
> > > > > > +		arch_suspend_disable_irqs();
> > > > > > +		BUG_ON(!irqs_disabled());
> > > > > > +
> > > > > > +		error = syscore_suspend();
> > > > > 
> > > > > Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
> > > > > every iteration of the loop.
> > > > > 
> > > > > > +		if (!error) {
> > > > > > +			*wakeup = pm_wakeup_pending();
> > > > > 
> > > > > Plus pm_wakeup_pending() returns true if pm_abort_suspend is set
> > > > > 
> > > > > > +			if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > > +				trace_suspend_resume(TPS("machine_suspend"),
> > > > > > +					state, true);
> > > > > 
> > > > > Did you try to add the loop here instead of above?  Like:
> > > > > 
> > > > > 			for (;;) {
> > > > > 				*wakeup = pm_wakeup_pending();
> > > > > 				if (*wakeup)
> > > > > 					break;
> > > > 
> > > > I think, that doesn't work. I chose the start/end points of the loop
> > > > to include the IRQ enable/disable calls. AFAICT, pm_abort_suspend is
> > > > set in an ISR. Without enabling interrupts the abort condition of
> > > > this loop never becomes true.
> > > 
> > > Any further ideas how to resolve this?
> > 
> > Sorry about the delay, lost track of this.
> > 
> > You're right, the IRQ disabling/enabling needs to happen in the loop.
> > 
> > So the direction of the patch looks OK, but it needs to ensure that pm_wakeup_pending
> > is set properly by all platforms.  Also it should be sufficient to check
> > pm_wakeup_pending() to detect wakeup.
> > 
> > Have you tested the patch? 
> 
> Before considering this patch a solution, can I ask you to rewind
> the discussion a bit since I have a question.
> 
> I thought that "suspending" the tick through syscore meant shutting down
> the respective clock_event_device, and that's how it is implemented in the
> kernel.
> 
> Now, do we expect a shutdown clock_event_device to still signal pending
> IRQs ? I do not think that should be the case, at least that's not what
> happens for eg arm arch timers - ie disabling them implicitly gates
> the signal connected to the IRQ line.
> 
> So the question is more related to the zynq platform and how their clock
> event device (which is ?) is shutdown, and what's the correct behaviour we
> are expecting.

As outlined in the commit message, there is a race condition in the core
code. Looking at the timers is just fighting the symptoms.

> FWIW, the problem here is not related to the simple wfi state on zynq,
> even some other ARM platforms with power management capabilities would wake
> up from the state entered by executing wfi (ie possibly through reset) if
> there is a pending timer IRQ, the question is more "should the IRQ be
> allowed to be there" instead IMHO.
> 
> I still think that Stephen's query related to what timer is causing
> the wake-up is worth investigating.

As I reported earlier, I see these spurious wakes with the cadence_ttc
as well as the ARM twd timers.
	
	Soren

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2015-01-12 15:55               ` Sören Brinkmann
@ 2015-01-12 16:14                 ` Lorenzo Pieralisi
  2015-01-23 17:44                   ` Sören Brinkmann
  0 siblings, 1 reply; 14+ messages in thread
From: Lorenzo Pieralisi @ 2015-01-12 16:14 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Rafael J. Wysocki, Thomas Gleixner, John Stultz, Pavel Machek,
	Len Brown, linux-kernel, linux-pm

On Mon, Jan 12, 2015 at 03:55:05PM +0000, Sören Brinkmann wrote:
> On Mon, 2015-01-12 at 03:43PM +0000, Lorenzo Pieralisi wrote:
> > Hi Rafael, Soren,
> >
> > On Sun, Jan 11, 2015 at 11:20:36PM +0000, Rafael J. Wysocki wrote:
> > > On Friday, January 09, 2015 01:50:59 PM Sören Brinkmann wrote:
> > > > On Sat, 2014-11-08 at 03:56PM -0800, Sören Brinkmann wrote:
> > > > > Hi Rafael,
> > > > >
> > > > > On Thu, 2014-11-06 at 01:33AM +0100, Rafael J. Wysocki wrote:
> > > > > > On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> > > > > > > Hi Rafael,
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Sorry for the huge delay.
> > > > > >
> > > > > > > On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > > > > > > > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > > > > > > > On platforms that do not power off during suspend, successfully entering
> > > > > > > > > suspend races with timers.
> > > > > > > > >
> > > > > > > > > The race happening in a couple of location is:
> > > > > > > > >
> > > > > > > > >   1. disable IRQs               (e.g. arch_suspend_disable_irqs())
> > > > > > > > >      ...
> > > > > > > > >   2. syscore_suspend()
> > > > > > > > >       -> timekeeping_suspend()
> > > > > > > > >        -> clockevents_notify(SUSPEND)
> > > > > > > > >         -> tick_suspend()       (timers are turned off here)
> > > > > > > > >      ...
> > > > > > > > >   3. wfi                        (wait for wake-IRQ here)
> > > > > > > > >
> > > > > > > > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > > > > > > > not handled and stay pending until step 3. That pending IRQ causes an
> > > > > > > > > immediate - spurious - wake.
> > > > > > > > >
> > > > > > > > > The solution is to move the clockevents suspend/resume notification
> > > > > > > > > out of the syscore_suspend step and explictly call them at the appropriate
> > > > > > > > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > > > > > > > IRQs get disabled. And accordingly in the resume path.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > > > > > > > ---
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > there was not a lot of discussion on the last submission. Just one comment from
> > > > > > > > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > > > > > > > response, does not apply, IMHO, since the platform does not re-enable
> > > > > > > > > interrupts.
> > > > > > > >
> > > > > > > > Well, you just don't agree with it.
> > > > > > > >
> > > > > > > > The problem with your approach is that timer interrupts aren't actually as
> > > > > > > > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > > > > > > > similar issues to appear under specific conditions.
> > > > > > > >
> > > > > > > > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > > > > > > > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > > > > > > > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > > > > > > > but wait for pm_abort_suspend to become true, to implement system suspend.
> > > > > > >
> > > > > > > sorry, it took me a while since I needed to get some dependencies ported
> > > > > > > to the pm-genirq base. Once I had that, it reproduced my original issue.
> > > > > > > So far so good. I then looked into finding a solution following your
> > > > > > > guidance. I'm not sure I really found what you had in mind, but below is
> > > > > > > what I came up with, which seems to do it.
> > > > > > > Please let me know how far off I am.
> > > > > > >
> > > > > > >     Thanks,
> > > > > > >     Sören
> > > > > > >
> > > > > > > -------8<------------------8<----------------8<----------------8<---------------
> > > > > > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > > > > > > index c2744b30d5d9..a4f9914571f1 100644
> > > > > > > --- a/drivers/base/power/wakeup.c
> > > > > > > +++ b/drivers/base/power/wakeup.c
> > > > > > > @@ -25,7 +25,7 @@
> > > > > > >  bool events_check_enabled __read_mostly;
> > > > > > >
> > > > > > >  /* If set and the system is suspending, terminate the suspend. */
> > > > > > > -static bool pm_abort_suspend __read_mostly;
> > > > > > > +bool pm_abort_suspend __read_mostly;
> > > > > > >
> > > > > > >  /*
> > > > > > >   * Combined counters of registered wakeup events and wakeup events in progress.
> > > > > > > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > > > > > > index 6dadb25cb0d8..e6a6de8f76d0 100644
> > > > > > > --- a/kernel/power/suspend.c
> > > > > > > +++ b/kernel/power/suspend.c
> > > > > > > @@ -33,6 +33,7 @@
> > > > > > >
> > > > > > >  static const char *pm_labels[] = { "mem", "standby", "freeze", };
> > > > > > >  const char *pm_states[PM_SUSPEND_MAX];
> > > > > > > +extern bool pm_abort_suspend;
> > > > > > >
> > > > > > >  static const struct platform_suspend_ops *suspend_ops;
> > > > > > >  static const struct platform_freeze_ops *freeze_ops;
> > > > > > > @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> > > > > > >     if (error || suspend_test(TEST_CPUS))
> > > > > > >             goto Enable_cpus;
> > > > > > >
> > > > > > > -   arch_suspend_disable_irqs();
> > > > > > > -   BUG_ON(!irqs_disabled());
> > > > > > > -
> > > > > > > -   error = syscore_suspend();
> > > > > > > -   if (!error) {
> > > > > > > -           *wakeup = pm_wakeup_pending();
> > > > > > > -           if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > > > -                   trace_suspend_resume(TPS("machine_suspend"),
> > > > > > > -                           state, true);
> > > > > > > -                   error = suspend_ops->enter(state);
> > > > > > > -                   trace_suspend_resume(TPS("machine_suspend"),
> > > > > > > -                           state, false);
> > > > > > > -                   events_check_enabled = false;
> > > > > > > +   while (!pm_abort_suspend) {
> > > > > >
> > > > > > That won't work in general, because pm_abort_suspend may not be set on some
> > > > > > platforms on wakeup.  It is only set if a wakeup interrupt triggers which
> > > > > > may not be the case on ACPI systems if the BIOS has woken up the system.
> > > > > >
> > > > > > But that could be addressed by making those platforms simply set pm_wakeup_pending
> > > > > > in their BIOS exit path.
> > > > > >
> > > > > > > +           arch_suspend_disable_irqs();
> > > > > > > +           BUG_ON(!irqs_disabled());
> > > > > > > +
> > > > > > > +           error = syscore_suspend();
> > > > > >
> > > > > > Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
> > > > > > every iteration of the loop.
> > > > > >
> > > > > > > +           if (!error) {
> > > > > > > +                   *wakeup = pm_wakeup_pending();
> > > > > >
> > > > > > Plus pm_wakeup_pending() returns true if pm_abort_suspend is set
> > > > > >
> > > > > > > +                   if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > > > +                           trace_suspend_resume(TPS("machine_suspend"),
> > > > > > > +                                   state, true);
> > > > > >
> > > > > > Did you try to add the loop here instead of above?  Like:
> > > > > >
> > > > > >                       for (;;) {
> > > > > >                               *wakeup = pm_wakeup_pending();
> > > > > >                               if (*wakeup)
> > > > > >                                       break;
> > > > >
> > > > > I think, that doesn't work. I chose the start/end points of the loop
> > > > > to include the IRQ enable/disable calls. AFAICT, pm_abort_suspend is
> > > > > set in an ISR. Without enabling interrupts the abort condition of
> > > > > this loop never becomes true.
> > > >
> > > > Any further ideas how to resolve this?
> > >
> > > Sorry about the delay, lost track of this.
> > >
> > > You're right, the IRQ disabling/enabling needs to happen in the loop.
> > >
> > > So the direction of the patch looks OK, but it needs to ensure that pm_wakeup_pending
> > > is set properly by all platforms.  Also it should be sufficient to check
> > > pm_wakeup_pending() to detect wakeup.
> > >
> > > Have you tested the patch?
> >
> > Before considering this patch a solution, can I ask you to rewind
> > the discussion a bit since I have a question.
> >
> > I thought that "suspending" the tick through syscore meant shutting down
> > the respective clock_event_device, and that's how it is implemented in the
> > kernel.
> >
> > Now, do we expect a shutdown clock_event_device to still signal pending
> > IRQs ? I do not think that should be the case, at least that's not what
> > happens for eg arm arch timers - ie disabling them implicitly gates
> > the signal connected to the IRQ line.
> >
> > So the question is more related to the zynq platform and how their clock
> > event device (which is ?) is shutdown, and what's the correct behaviour we
> > are expecting.
> 
> As outlined in the commit message, there is a race condition in the core
> code. Looking at the timers is just fighting the symptoms.

I gathered there is a race condition between 1 and 2 in your code path.
What I am asking you is why are we getting a pending IRQ at step 3 when the
clock event device is supposed to be shutdown. My question is:

Should a clock event device in shutdown mode (ie disabled) still signal
IRQs to the interrupt controller (and consequently to the core) ?

It is for me to understand if that's the behaviour we are expecting.

> > FWIW, the problem here is not related to the simple wfi state on zynq,
> > even some other ARM platforms with power management capabilities would wake
> > up from the state entered by executing wfi (ie possibly through reset) if
> > there is a pending timer IRQ, the question is more "should the IRQ be
> > allowed to be there" instead IMHO.
> >
> > I still think that Stephen's query related to what timer is causing
> > the wake-up is worth investigating.
> 
> As I reported earlier, I see these spurious wakes with the cadence_ttc
> as well as the ARM twd timers.

I thought that a shutdown clock event device explicitly disables IRQ
assertion, that's why I am inquiring, I do not understand how this
can happen - how can you have a pending timer IRQ at step 3 above.

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2015-01-12 16:14                 ` Lorenzo Pieralisi
@ 2015-01-23 17:44                   ` Sören Brinkmann
  2015-01-24 16:15                     ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Sören Brinkmann @ 2015-01-23 17:44 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Rafael J. Wysocki, Thomas Gleixner, John Stultz, Pavel Machek,
	Len Brown, linux-kernel, linux-pm

Sorry for the delay, but a lot of stuff came in between and I need to
rebase a couple of branches and re-test things.
Some comments inline below:

On Mon, 2015-01-12 at 04:14PM +0000, Lorenzo Pieralisi wrote:
> On Mon, Jan 12, 2015 at 03:55:05PM +0000, Sören Brinkmann wrote:
> > On Mon, 2015-01-12 at 03:43PM +0000, Lorenzo Pieralisi wrote:
> > > Hi Rafael, Soren,
> > >
> > > On Sun, Jan 11, 2015 at 11:20:36PM +0000, Rafael J. Wysocki wrote:
> > > > On Friday, January 09, 2015 01:50:59 PM Sören Brinkmann wrote:
> > > > > On Sat, 2014-11-08 at 03:56PM -0800, Sören Brinkmann wrote:
> > > > > > Hi Rafael,
> > > > > >
> > > > > > On Thu, 2014-11-06 at 01:33AM +0100, Rafael J. Wysocki wrote:
> > > > > > > On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> > > > > > > > Hi Rafael,
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Sorry for the huge delay.
> > > > > > >
> > > > > > > > On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > > > > > > > > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > > > > > > > > On platforms that do not power off during suspend, successfully entering
> > > > > > > > > > suspend races with timers.
> > > > > > > > > >
> > > > > > > > > > The race happening in a couple of location is:
> > > > > > > > > >
> > > > > > > > > >   1. disable IRQs               (e.g. arch_suspend_disable_irqs())
> > > > > > > > > >      ...
> > > > > > > > > >   2. syscore_suspend()
> > > > > > > > > >       -> timekeeping_suspend()
> > > > > > > > > >        -> clockevents_notify(SUSPEND)
> > > > > > > > > >         -> tick_suspend()       (timers are turned off here)
> > > > > > > > > >      ...
> > > > > > > > > >   3. wfi                        (wait for wake-IRQ here)
> > > > > > > > > >
> > > > > > > > > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > > > > > > > > not handled and stay pending until step 3. That pending IRQ causes an
> > > > > > > > > > immediate - spurious - wake.
> > > > > > > > > >
> > > > > > > > > > The solution is to move the clockevents suspend/resume notification
> > > > > > > > > > out of the syscore_suspend step and explictly call them at the appropriate
> > > > > > > > > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > > > > > > > > IRQs get disabled. And accordingly in the resume path.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > > > > > > > > ---
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > there was not a lot of discussion on the last submission. Just one comment from
> > > > > > > > > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > > > > > > > > response, does not apply, IMHO, since the platform does not re-enable
> > > > > > > > > > interrupts.
> > > > > > > > >
> > > > > > > > > Well, you just don't agree with it.
> > > > > > > > >
> > > > > > > > > The problem with your approach is that timer interrupts aren't actually as
> > > > > > > > > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > > > > > > > > similar issues to appear under specific conditions.
> > > > > > > > >
> > > > > > > > > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > > > > > > > > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > > > > > > > > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > > > > > > > > but wait for pm_abort_suspend to become true, to implement system suspend.
> > > > > > > >
> > > > > > > > sorry, it took me a while since I needed to get some dependencies ported
> > > > > > > > to the pm-genirq base. Once I had that, it reproduced my original issue.
> > > > > > > > So far so good. I then looked into finding a solution following your
> > > > > > > > guidance. I'm not sure I really found what you had in mind, but below is
> > > > > > > > what I came up with, which seems to do it.
> > > > > > > > Please let me know how far off I am.
> > > > > > > >
> > > > > > > >     Thanks,
> > > > > > > >     Sören
> > > > > > > >
> > > > > > > > -------8<------------------8<----------------8<----------------8<---------------
> > > > > > > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > > > > > > > index c2744b30d5d9..a4f9914571f1 100644
> > > > > > > > --- a/drivers/base/power/wakeup.c
> > > > > > > > +++ b/drivers/base/power/wakeup.c
> > > > > > > > @@ -25,7 +25,7 @@
> > > > > > > >  bool events_check_enabled __read_mostly;
> > > > > > > >
> > > > > > > >  /* If set and the system is suspending, terminate the suspend. */
> > > > > > > > -static bool pm_abort_suspend __read_mostly;
> > > > > > > > +bool pm_abort_suspend __read_mostly;
> > > > > > > >
> > > > > > > >  /*
> > > > > > > >   * Combined counters of registered wakeup events and wakeup events in progress.
> > > > > > > > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > > > > > > > index 6dadb25cb0d8..e6a6de8f76d0 100644
> > > > > > > > --- a/kernel/power/suspend.c
> > > > > > > > +++ b/kernel/power/suspend.c
> > > > > > > > @@ -33,6 +33,7 @@
> > > > > > > >
> > > > > > > >  static const char *pm_labels[] = { "mem", "standby", "freeze", };
> > > > > > > >  const char *pm_states[PM_SUSPEND_MAX];
> > > > > > > > +extern bool pm_abort_suspend;
> > > > > > > >
> > > > > > > >  static const struct platform_suspend_ops *suspend_ops;
> > > > > > > >  static const struct platform_freeze_ops *freeze_ops;
> > > > > > > > @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> > > > > > > >     if (error || suspend_test(TEST_CPUS))
> > > > > > > >             goto Enable_cpus;
> > > > > > > >
> > > > > > > > -   arch_suspend_disable_irqs();
> > > > > > > > -   BUG_ON(!irqs_disabled());
> > > > > > > > -
> > > > > > > > -   error = syscore_suspend();
> > > > > > > > -   if (!error) {
> > > > > > > > -           *wakeup = pm_wakeup_pending();
> > > > > > > > -           if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > > > > -                   trace_suspend_resume(TPS("machine_suspend"),
> > > > > > > > -                           state, true);
> > > > > > > > -                   error = suspend_ops->enter(state);
> > > > > > > > -                   trace_suspend_resume(TPS("machine_suspend"),
> > > > > > > > -                           state, false);
> > > > > > > > -                   events_check_enabled = false;
> > > > > > > > +   while (!pm_abort_suspend) {
> > > > > > >
> > > > > > > That won't work in general, because pm_abort_suspend may not be set on some
> > > > > > > platforms on wakeup.  It is only set if a wakeup interrupt triggers which
> > > > > > > may not be the case on ACPI systems if the BIOS has woken up the system.
> > > > > > >
> > > > > > > But that could be addressed by making those platforms simply set pm_wakeup_pending
> > > > > > > in their BIOS exit path.
> > > > > > >
> > > > > > > > +           arch_suspend_disable_irqs();
> > > > > > > > +           BUG_ON(!irqs_disabled());
> > > > > > > > +
> > > > > > > > +           error = syscore_suspend();
> > > > > > >
> > > > > > > Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
> > > > > > > every iteration of the loop.
> > > > > > >
> > > > > > > > +           if (!error) {
> > > > > > > > +                   *wakeup = pm_wakeup_pending();
> > > > > > >
> > > > > > > Plus pm_wakeup_pending() returns true if pm_abort_suspend is set
> > > > > > >
> > > > > > > > +                   if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > > > > +                           trace_suspend_resume(TPS("machine_suspend"),
> > > > > > > > +                                   state, true);
> > > > > > >
> > > > > > > Did you try to add the loop here instead of above?  Like:
> > > > > > >
> > > > > > >                       for (;;) {
> > > > > > >                               *wakeup = pm_wakeup_pending();
> > > > > > >                               if (*wakeup)
> > > > > > >                                       break;
> > > > > >
> > > > > > I think, that doesn't work. I chose the start/end points of the loop
> > > > > > to include the IRQ enable/disable calls. AFAICT, pm_abort_suspend is
> > > > > > set in an ISR. Without enabling interrupts the abort condition of
> > > > > > this loop never becomes true.
> > > > >
> > > > > Any further ideas how to resolve this?
> > > >
> > > > Sorry about the delay, lost track of this.
> > > >
> > > > You're right, the IRQ disabling/enabling needs to happen in the loop.
> > > >
> > > > So the direction of the patch looks OK, but it needs to ensure that pm_wakeup_pending
> > > > is set properly by all platforms.  Also it should be sufficient to check
> > > > pm_wakeup_pending() to detect wakeup.
> > > >
> > > > Have you tested the patch?
> > >
> > > Before considering this patch a solution, can I ask you to rewind
> > > the discussion a bit since I have a question.
> > >
> > > I thought that "suspending" the tick through syscore meant shutting down
> > > the respective clock_event_device, and that's how it is implemented in the
> > > kernel.
> > >
> > > Now, do we expect a shutdown clock_event_device to still signal pending
> > > IRQs ? I do not think that should be the case, at least that's not what
> > > happens for eg arm arch timers - ie disabling them implicitly gates
> > > the signal connected to the IRQ line.
> > >
> > > So the question is more related to the zynq platform and how their clock
> > > event device (which is ?) is shutdown, and what's the correct behaviour we
> > > are expecting.
> > 
> > As outlined in the commit message, there is a race condition in the core
> > code. Looking at the timers is just fighting the symptoms.
> 
> I gathered there is a race condition between 1 and 2 in your code path.
> What I am asking you is why are we getting a pending IRQ at step 3 when the
> clock event device is supposed to be shutdown. My question is:
> 
> Should a clock event device in shutdown mode (ie disabled) still signal
> IRQs to the interrupt controller (and consequently to the core) ?
> 
> It is for me to understand if that's the behaviour we are expecting.

Yeah, I think that would be the other way to resolve this. But since I
see this with already two different clock event drivers, it seems much
harder to enforce. I think pretty much no driver expects it has to clear
any interrupts unless its ISR is triggered.

> 
> > > FWIW, the problem here is not related to the simple wfi state on zynq,
> > > even some other ARM platforms with power management capabilities would wake
> > > up from the state entered by executing wfi (ie possibly through reset) if
> > > there is a pending timer IRQ, the question is more "should the IRQ be
> > > allowed to be there" instead IMHO.
> > >
> > > I still think that Stephen's query related to what timer is causing
> > > the wake-up is worth investigating.
> > 
> > As I reported earlier, I see these spurious wakes with the cadence_ttc
> > as well as the ARM twd timers.
> 
> I thought that a shutdown clock event device explicitly disables IRQ
> assertion, that's why I am inquiring, I do not understand how this
> can happen - how can you have a pending timer IRQ at step 3 above.

It does I guess, but the shutdown of the IRQ happens after disabling
IRQs. During that window the IRQ can become pening without anybody
handling it. Shutting the timer down usually just ensures that no new
interrupts are signaled but it apparently doesn't check whether any are
still pending (which isn't necessary in almost all cases since the ISR
would take care of it).

	Sören

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2015-01-11 23:20           ` Rafael J. Wysocki
  2015-01-12 15:43             ` Lorenzo Pieralisi
@ 2015-01-23 17:46             ` Sören Brinkmann
  1 sibling, 0 replies; 14+ messages in thread
From: Sören Brinkmann @ 2015-01-23 17:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, John Stultz, Pavel Machek, Len Brown,
	linux-kernel, linux-pm

On Mon, 2015-01-12 at 12:20AM +0100, Rafael J. Wysocki wrote:
> On Friday, January 09, 2015 01:50:59 PM Sören Brinkmann wrote:
> > On Sat, 2014-11-08 at 03:56PM -0800, Sören Brinkmann wrote:
> > > Hi Rafael,
> > > 
> > > On Thu, 2014-11-06 at 01:33AM +0100, Rafael J. Wysocki wrote:
> > > > On Thursday, October 02, 2014 09:01:15 AM Sören Brinkmann wrote:
> > > > > Hi Rafael,
> > > > 
> > > > Hi,
> > > > 
> > > > Sorry for the huge delay.
> > > > 
> > > > > On Tue, 2014-09-23 at 01:01AM +0200, Rafael J. Wysocki wrote:
> > > > > > On Monday, September 22, 2014 10:07:03 AM Soren Brinkmann wrote:
> > > > > > > On platforms that do not power off during suspend, successfully entering
> > > > > > > suspend races with timers.
> > > > > > > 
> > > > > > > The race happening in a couple of location is:
> > > > > > > 
> > > > > > >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> > > > > > >      ...
> > > > > > >   2. syscore_suspend()
> > > > > > >       -> timekeeping_suspend()
> > > > > > >        -> clockevents_notify(SUSPEND)
> > > > > > >         -> tick_suspend()   	(timers are turned off here)
> > > > > > >      ...
> > > > > > >   3. wfi            		(wait for wake-IRQ here)
> > > > > > > 
> > > > > > > Between steps 1 and 2 the timers can still generate interrupts that are
> > > > > > > not handled and stay pending until step 3. That pending IRQ causes an
> > > > > > > immediate - spurious - wake.
> > > > > > > 
> > > > > > > The solution is to move the clockevents suspend/resume notification
> > > > > > > out of the syscore_suspend step and explictly call them at the appropriate
> > > > > > > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > > > > > > IRQs get disabled. And accordingly in the resume path.
> > > > > > > 
> > > > > > > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> > > > > > > ---
> > > > > > > Hi,
> > > > > > > 
> > > > > > > there was not a lot of discussion on the last submission. Just one comment from
> > > > > > > Rafael (https://lkml.org/lkml/2014/8/26/780), which - as I outlined in my
> > > > > > > response, does not apply, IMHO, since the platform does not re-enable
> > > > > > > interrupts.
> > > > > > 
> > > > > > Well, you just don't agree with it.
> > > > > > 
> > > > > > The problem with your approach is that timer interrupts aren't actually as
> > > > > > special as you think and any other IRQF_NO_SUSPEND interrupts would have caused
> > > > > > similar issues to appear under specific conditions.
> > > > > > 
> > > > > > The solution I would suggest and that actually covers all IRQF_NO_SUSPEND
> > > > > > interrupts would be to use a wait_event() loop like the one in freeze_enter()
> > > > > > (on top of the current linux-next or the pm-genirq branch of linux-pm.git),
> > > > > > but wait for pm_abort_suspend to become true, to implement system suspend.
> > > > > 
> > > > > sorry, it took me a while since I needed to get some dependencies ported
> > > > > to the pm-genirq base. Once I had that, it reproduced my original issue.
> > > > > So far so good. I then looked into finding a solution following your
> > > > > guidance. I'm not sure I really found what you had in mind, but below is
> > > > > what I came up with, which seems to do it.
> > > > > Please let me know how far off I am.
> > > > > 
> > > > > 	Thanks,
> > > > > 	Sören
> > > > > 
> > > > > -------8<------------------8<----------------8<----------------8<---------------
> > > > > diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
> > > > > index c2744b30d5d9..a4f9914571f1 100644
> > > > > --- a/drivers/base/power/wakeup.c
> > > > > +++ b/drivers/base/power/wakeup.c
> > > > > @@ -25,7 +25,7 @@
> > > > >  bool events_check_enabled __read_mostly;
> > > > >  
> > > > >  /* If set and the system is suspending, terminate the suspend. */
> > > > > -static bool pm_abort_suspend __read_mostly;
> > > > > +bool pm_abort_suspend __read_mostly;
> > > > >  
> > > > >  /*
> > > > >   * Combined counters of registered wakeup events and wakeup events in progress.
> > > > > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > > > > index 6dadb25cb0d8..e6a6de8f76d0 100644
> > > > > --- a/kernel/power/suspend.c
> > > > > +++ b/kernel/power/suspend.c
> > > > > @@ -33,6 +33,7 @@
> > > > >  
> > > > >  static const char *pm_labels[] = { "mem", "standby", "freeze", };
> > > > >  const char *pm_states[PM_SUSPEND_MAX];
> > > > > +extern bool pm_abort_suspend;
> > > > >  
> > > > >  static const struct platform_suspend_ops *suspend_ops;
> > > > >  static const struct platform_freeze_ops *freeze_ops;
> > > > > @@ -294,25 +295,27 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> > > > >  	if (error || suspend_test(TEST_CPUS))
> > > > >  		goto Enable_cpus;
> > > > >  
> > > > > -	arch_suspend_disable_irqs();
> > > > > -	BUG_ON(!irqs_disabled());
> > > > > -
> > > > > -	error = syscore_suspend();
> > > > > -	if (!error) {
> > > > > -		*wakeup = pm_wakeup_pending();
> > > > > -		if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > > > -				state, true);
> > > > > -			error = suspend_ops->enter(state);
> > > > > -			trace_suspend_resume(TPS("machine_suspend"),
> > > > > -				state, false);
> > > > > -			events_check_enabled = false;
> > > > > +	while (!pm_abort_suspend) {
> > > > 
> > > > That won't work in general, because pm_abort_suspend may not be set on some
> > > > platforms on wakeup.  It is only set if a wakeup interrupt triggers which
> > > > may not be the case on ACPI systems if the BIOS has woken up the system.
> > > > 
> > > > But that could be addressed by making those platforms simply set pm_wakeup_pending
> > > > in their BIOS exit path.
> > > > 
> > > > > +		arch_suspend_disable_irqs();
> > > > > +		BUG_ON(!irqs_disabled());
> > > > > +
> > > > > +		error = syscore_suspend();
> > > > 
> > > > Also it shouldn't be necessary to do syscore_suspend()/syscore_resume() in
> > > > every iteration of the loop.
> > > > 
> > > > > +		if (!error) {
> > > > > +			*wakeup = pm_wakeup_pending();
> > > > 
> > > > Plus pm_wakeup_pending() returns true if pm_abort_suspend is set
> > > > 
> > > > > +			if (!(suspend_test(TEST_CORE) || *wakeup)) {
> > > > > +				trace_suspend_resume(TPS("machine_suspend"),
> > > > > +					state, true);
> > > > 
> > > > Did you try to add the loop here instead of above?  Like:
> > > > 
> > > > 			for (;;) {
> > > > 				*wakeup = pm_wakeup_pending();
> > > > 				if (*wakeup)
> > > > 					break;
> > > 
> > > I think, that doesn't work. I chose the start/end points of the loop
> > > to include the IRQ enable/disable calls. AFAICT, pm_abort_suspend is
> > > set in an ISR. Without enabling interrupts the abort condition of
> > > this loop never becomes true.
> > 
> > Any further ideas how to resolve this?
> 
> Sorry about the delay, lost track of this.
> 
> You're right, the IRQ disabling/enabling needs to happen in the loop.
> 
> So the direction of the patch looks OK, but it needs to ensure that pm_wakeup_pending
> is set properly by all platforms.  Also it should be sufficient to check
> pm_wakeup_pending() to detect wakeup.
> 
> Have you tested the patch? 

I have to look into this, rebase my stuff and test again. May take a
while, but I'll stay on it.
Regarding testing: I don't have platforms to really test this on other
than my Zynq platforms where I see this issue on. It worked for me
there, but rebasing just broke something.

	Thanks,
	Sören

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RESEND] PM / sleep: Fix racing timers
  2015-01-23 17:44                   ` Sören Brinkmann
@ 2015-01-24 16:15                     ` Thomas Gleixner
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2015-01-24 16:15 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: Lorenzo Pieralisi, Rafael J. Wysocki, John Stultz, Pavel Machek,
	Len Brown, linux-kernel, linux-pm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3564 bytes --]

On Fri, 23 Jan 2015, Sören Brinkmann wrote:
> On Mon, 2015-01-12 at 04:14PM +0000, Lorenzo Pieralisi wrote:
> > I thought that a shutdown clock event device explicitly disables IRQ
> > assertion, that's why I am inquiring, I do not understand how this
> > can happen - how can you have a pending timer IRQ at step 3 above.
> 
> It does I guess, but the shutdown of the IRQ happens after disabling
> IRQs. During that window the IRQ can become pening without anybody
> handling it. Shutting the timer down usually just ensures that no new
> interrupts are signaled but it apparently doesn't check whether any are
> still pending (which isn't necessary in almost all cases since the ISR
> would take care of it).

Well, the issue is that lots of platforms are just not affected by
this because they either power off or simply do not propagate
interrupts which are not explicitely marked as wakeup sources.

After thinking more about it, we really should try to handle it at the
core code, especially because the freezer folks who try to shutdown
the tick completely run into similar issues.

But, I don't think that just moving the suspend() call before
disabling interrupts is sufficient. I rather think it's outright
dangerous.

suspend_ce()
   local_irq_disable();

---> Interrupt becomes pending

   shutdown(ce);
   local_irq_enable();

---> Interrupt fires

There are implementations which actually switch of clocks and such at
shutdown. So depending on the code in the actual interrupt handler (not the
clockevent->handler, though that might have issues as well) we might
touch a disfunctional device with possibly fatal consequences.

Right now this cannot happen as the invocation of the interrupt
handler is guaranteed to happen _after_ the device has been
resumed. Just because it does not explode in your implementation does
not mean its safe to inflict on everyone.

So if we try to handle this issue at the core level, then we really
must deal with the interrupt itself.

Timer interrupts are always marked IRQF_NO_SUSPEND, because we still
need them until syscore_suspend(). So in case we really shut down the
clockevent device via tick_suspend() then we must disable the
interrupt there as well and reenable it in tick_resume().

But that's not really possible at the core level today. Timer
interrupts are delivered by various mechanisms:

      - plain interrupts
      - percpu interrupts
      - per cpu direct vectors
      - more complex stuff

and we have no idea which type we are dealing with. So the only option
would be to have a callback and let the suspend code do:

   if (cedev->disable_irq)
      cedev->disable_irq(cddev);

and the reverse operation for resume. So we can provide default
implementations for the callbacks:

   void ce_[dis|en]able_irq(...)
   void ce_[dis|en]able_percpu_irq(...)

which would rely on cedev->irq. Those two would cover most of the
implementations. The ones with special requirements can implement
their own.

Now at the irq level, we'd need a new function for the device irqs:

    disable_and_mask_irq_nosync(irq)

which makes sure that the line is masked at the hardware level. That's
necessary because the lazy interrupt disable is not sufficient. The
percpu irqs are already masking at the hardware level.

One might argue that we could handle this in set_mode() as well, but
we are thinking about going away from that multiplex call. Aside of
that, ensuring the symmetry of disable/enable invocations would be
problematic with the existing set_mode() implementations and usage.

Thanks,

	tglx


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-01-24 16:15 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-22 17:07 [PATCH RESEND] PM / sleep: Fix racing timers Soren Brinkmann
2014-09-22 23:01 ` Rafael J. Wysocki
2014-10-02 16:01   ` Sören Brinkmann
2014-10-28 20:40     ` Sören Brinkmann
2014-11-06  0:33     ` Rafael J. Wysocki
2014-11-08 23:56       ` Sören Brinkmann
2015-01-09 21:50         ` Sören Brinkmann
2015-01-11 23:20           ` Rafael J. Wysocki
2015-01-12 15:43             ` Lorenzo Pieralisi
2015-01-12 15:55               ` Sören Brinkmann
2015-01-12 16:14                 ` Lorenzo Pieralisi
2015-01-23 17:44                   ` Sören Brinkmann
2015-01-24 16:15                     ` Thomas Gleixner
2015-01-23 17:46             ` Sören Brinkmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).