LKML Archive on lore.kernel.org
* [PATCH] PM: sleep: core: Avoid setting power.must_resume to false
@ 2021-07-26 15:24 Prasad Sodagudi
  2021-08-03 17:16 ` Greg KH
  0 siblings, 1 reply; 13+ messages in thread
From: Prasad Sodagudi @ 2021-07-26 15:24 UTC (permalink / raw)
  To: rjw, len.brown, pavel, gregkh; +Cc: linux-pm, linux-kernel, Prasad Sodagudi

The variables power.may_skip_resume and power.must_resume, together with
the DPM_FLAG_MAY_SKIP_RESUME driver flag, control whether devices are
resumed after a system-wide suspend transition.

Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
its "noirq" and "early" resume callbacks to be skipped if the device
can be left in suspend after a system-wide transition into the working
state. The PM core decides whether the driver's "noirq" and "early"
resume callbacks should be skipped by calling dev_pm_skip_resume(),
which checks the power.must_resume variable.

__device_suspend() sets power.must_resume to false without checking the
device's DPM_FLAG_MAY_SKIP_RESUME flag or dev->power.usage_count. As a
result, the resume handler is not called for some devices that were
suspended in an earlier suspend phase. So check the device's
DPM_FLAG_MAY_SKIP_RESUME flag and usage count before setting
power.must_resume.

Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
---
 drivers/base/power/main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index d568772..8eebc4d 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
 	}
 
 	dev->power.may_skip_resume = true;
-	dev->power.must_resume = false;
+	if ((atomic_read(&dev->power.usage_count) <= 1) &&
+			(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
+		dev->power.must_resume = false;
+	else
+		dev->power.must_resume = true;
 
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



* Re: [PATCH] PM: sleep: core: Avoid setting power.must_resume to false
  2021-07-26 15:24 [PATCH] PM: sleep: core: Avoid setting power.must_resume to false Prasad Sodagudi
@ 2021-08-03 17:16 ` Greg KH
  2021-08-06 15:07   ` psodagud
  0 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2021-08-03 17:16 UTC (permalink / raw)
  To: Prasad Sodagudi; +Cc: rjw, len.brown, pavel, linux-pm, linux-kernel

On Mon, Jul 26, 2021 at 08:24:34AM -0700, Prasad Sodagudi wrote:
> There are variables(power.may_skip_resume and dev->power.must_resume)
> and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices after
> a system wide suspend transition.
> 
> Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
> its "noirq" and "early" resume callbacks to be skipped if the device
> can be left in suspend after a system-wide transition into the working
> state. PM core determines that the driver's "noirq" and "early" resume
> callbacks should be skipped or not with dev_pm_skip_resume() function by
> checking power.may_skip_resume variable.
> 
> power.must_resume variable is getting set to false in __device_suspend()
> function without checking device's DPM_FLAG_MAY_SKIP_RESUME and
> dev->power.usage_count variables. This is leading to failure to call
> resume handler for some of the devices suspended in early suspend phase.
> So check device's DPM_FLAG_MAY_SKIP_RESUME flag before
> setting power.must_resume variable.
> 
> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
> ---
>  drivers/base/power/main.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index d568772..8eebc4d 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
>  	}
>  
>  	dev->power.may_skip_resume = true;
> -	dev->power.must_resume = false;
> +	if ((atomic_read(&dev->power.usage_count) <= 1) &&
> +			(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))

What is preventing that atomic value from changing _right_ after you
just read this?

and very odd indentation, checkpatch didn't complain about this?

What commit does this fix?  Does it need to be backported to older
kernels?

Wait, how is your "noirq" device even getting called here?  Shouldn't
__device_suspend_noirq() be called instead?  Why isn't that the path for
your device?
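
The atomic_read() concern raised above can be sketched in userspace C.
This is only an illustration of a check-then-act pattern, not kernel
code: struct model_power, model_decide() and model_get() are made-up
names, and C11 atomics stand in for the kernel's atomic_t. Whether such
an interleaving can really happen at this point in the suspend sequence
is exactly what the question asks.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two dev->power fields involved. */
struct model_power {
	atomic_int usage_count;	/* models power.usage_count */
	bool must_resume;	/* models power.must_resume */
};

/*
 * Check-then-act, shaped like the proposed hunk: the decision is taken
 * from a snapshot of the counter and is not revisited afterwards.
 */
static void model_decide(struct model_power *p, bool may_skip_flag)
{
	int snapshot = atomic_load(&p->usage_count);

	p->must_resume = !(snapshot <= 1 && may_skip_flag);
}

/* Stand-in for another context taking a reference after the read. */
static void model_get(struct model_power *p)
{
	atomic_fetch_add(&p->usage_count, 1);
}
```

If model_get() runs right after model_decide(), the counter is above 1
while must_resume still reflects the old value, so the stored decision
is stale.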

thanks,

greg k-h


* Re: [PATCH] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-03 17:16 ` Greg KH
@ 2021-08-06 15:07   ` psodagud
  2021-08-06 15:16     ` Greg KH
  2021-08-07  6:00     ` Greg KH
  0 siblings, 2 replies; 13+ messages in thread
From: psodagud @ 2021-08-06 15:07 UTC (permalink / raw)
  To: Greg KH; +Cc: rjw, len.brown, pavel, linux-pm, linux-kernel

On 2021-08-03 10:16, Greg KH wrote:
> On Mon, Jul 26, 2021 at 08:24:34AM -0700, Prasad Sodagudi wrote:
>> There are variables(power.may_skip_resume and dev->power.must_resume)
>> and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices 
>> after
>> a system wide suspend transition.
>> 
>> Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
>> its "noirq" and "early" resume callbacks to be skipped if the device
>> can be left in suspend after a system-wide transition into the working
>> state. PM core determines that the driver's "noirq" and "early" resume
>> callbacks should be skipped or not with dev_pm_skip_resume() function 
>> by
>> checking power.may_skip_resume variable.
>> 
>> power.must_resume variable is getting set to false in 
>> __device_suspend()
>> function without checking device's DPM_FLAG_MAY_SKIP_RESUME and
>> dev->power.usage_count variables. This is leading to failure to call
>> resume handler for some of the devices suspended in early suspend 
>> phase.
>> So check device's DPM_FLAG_MAY_SKIP_RESUME flag before
>> setting power.must_resume variable.
>> 
>> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
>> ---
>>  drivers/base/power/main.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
>> index d568772..8eebc4d 100644
>> --- a/drivers/base/power/main.c
>> +++ b/drivers/base/power/main.c
>> @@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, 
>> pm_message_t state, bool async)
>>  	}
>> 
>>  	dev->power.may_skip_resume = true;
>> -	dev->power.must_resume = false;
>> +	if ((atomic_read(&dev->power.usage_count) <= 1) &&
>> +			(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
> 
> What is preventing that atomic value from changing _right_ after you
> just read this?
> 
> and very odd indentation, checkpatch didn't complain about this?
Sure. I will fix the indentation problem once Rafael has reviewed this
patch.

> What commit does this fix?  Does it need to be backported to older
> kernels?

No. The 5.4 LTS kernel does not have this problem.

> Wait, how is your "noirq" device even getting called here?  Shouldn't
> __device_suspend_noirq() be called instead?  Why isn't that the path 
> for
> your device?

Hi Greg and Rafael,

This concerns the s2idle suspend/resume path and a difference between
the 5.4 and 5.10 LTS kernels in how devices are suspended and resumed.
We observe that devices suspended in the suspend_late stage are not
resumed in the resume_early stage.
1)	The 5.4 LTS kernel does not have this problem, but the 5.10 kernel
does.
2)	Commit 6e176bf8d46194353163c2cb660808bc633b45d9 ("PM: sleep: core:
Do not skip callbacks in the resume phase") causes the driver's
early resume callbacks to be skipped.
@@ -804,15 +793,25 @@ static int device_resume_early(struct device *dev, pm_message_t state, bool asyn
         } else if (dev->bus && dev->bus->pm) {
                 info = "early bus ";
                 callback = pm_late_early_op(dev->bus->pm, state);
-       } else if (dev->driver && dev->driver->pm) {
+       }
+       if (callback)
+               goto Run;
+
+       if (dev_pm_may_skip_resume(dev))
+               goto Skip;

In device_resume_early(), dev->power.must_resume is used to decide
whether to skip the resume callback. It looks like this function
expects __device_suspend_noirq() to have set dev->power.must_resume =
true for devices that do not have the DPM_FLAG_MAY_SKIP_RESUME flag
set.

static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool async)
{
…
…
         /*
          * Skipping the resume of devices that were in use right before the
          * system suspend (as indicated by their PM-runtime usage counters)
          * would be suboptimal.  Also resume them if doing that is not allowed
          * to be skipped.
          */
         if (atomic_read(&dev->power.usage_count) > 1 ||
             !(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME) &&
               dev->power.may_skip_resume))
                 dev->power.must_resume = true;


3)	The problematic scenario is as follows: during a suspend/resume
cycle, all devices suspend successfully in the suspend_late stage, but
some device fails to suspend in the suspend_noirq phase
(device_suspend_noirq() -> __device_suspend_noirq()).
Because a device failed in dpm_noirq_suspend_devices(),
dpm_resume_noirq() is called to resume devices in dpm_late_early_list
in the noirq phase.
4)	In the early resume stage, dpm_resume_early() ->
device_resume_early() skips the devices' early resume callbacks:
        if (dev_pm_skip_resume(dev))
                goto Skip;

5)	Devices suspended in the suspend_late stage are not resumed in the
early resume stage because commit
6e176bf8d46194353163c2cb660808bc633b45d9 ("PM: sleep: core: Do not skip
callbacks in the resume phase") skips the driver's early resume
callbacks when dev->power.must_resume is false.

6)	The portion of __device_suspend_noirq() shown below is never
executed for some devices that were successfully suspended in the
suspend_late stage, so there is no chance to set must_resume to true.
These devices are left with dev->power.must_resume = false.
For example:
i)	Devices A, B, and C have suspend_late and resume_early handlers.
ii)	Devices X, Y, and Z have suspend_noirq and resume_noirq handlers.
The devices are suspended in the order A, B, X, C, Y, Z, and device X
returns failure from its suspend_noirq callback. In this scenario,
device C never executes the code below that sets
dev->power.must_resume = true, so device C is not resumed in the
resume_early stage.

static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool async)
{
….
….
        /*
         * Skipping the resume of devices that were in use right before the
         * system suspend (as indicated by their PM-runtime usage counters)
         * would be suboptimal.  Also resume them if doing that is not allowed
         * to be skipped.
         */
        if (atomic_read(&dev->power.usage_count) > 1 ||
            !(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME) &&
              dev->power.may_skip_resume))
                dev->power.must_resume = true;

        if (dev->power.must_resume)
                dpm_superior_set_must_resume(dev);
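
The A, B, X, C, Y, Z example above can be modelled in a few lines of
userspace C. This is a rough sketch under simplifying assumptions, not
the kernel implementation: struct model_dev and the model_*() helpers
are hypothetical names, every device is walked in one fixed order in
each phase, and only the must_resume bookkeeping is reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct device fields discussed here. */
struct model_dev {
	const char *name;
	bool may_skip_flag;	/* models DPM_FLAG_MAY_SKIP_RESUME */
	int usage_count;	/* models power.usage_count */
	bool fails_noirq;	/* suspend_noirq callback returns an error */
	bool may_skip_resume;	/* models power.may_skip_resume */
	bool must_resume;	/* models power.must_resume */
	bool early_resumed;	/* set when the early resume callback runs */
};

/* Models __device_suspend(): must_resume is cleared unconditionally. */
static void model_suspend(struct model_dev *d)
{
	d->may_skip_resume = true;
	d->must_resume = false;
	d->early_resumed = false;
}

/* Models __device_suspend_noirq(); returns false if the callback fails. */
static bool model_suspend_noirq(struct model_dev *d)
{
	if (d->fails_noirq)
		return false;	/* the must_resume check is never reached */
	if (d->usage_count > 1 ||
	    !(d->may_skip_flag && d->may_skip_resume))
		d->must_resume = true;
	return true;
}

/* Models device_resume_early(): the callback is skipped if !must_resume. */
static void model_resume_early(struct model_dev *d)
{
	if (!d->must_resume)
		return;
	d->early_resumed = true;
}

/* Suspend all devices, stop the noirq phase at the first failure, resume. */
static void model_cycle(struct model_dev *devs, size_t n)
{
	size_t i;

	assert(devs != NULL);
	for (i = 0; i < n; i++)
		model_suspend(&devs[i]);
	for (i = 0; i < n; i++)
		if (!model_suspend_noirq(&devs[i]))
			break;	/* later devices keep must_resume == false */
	for (i = 0; i < n; i++)
		model_resume_early(&devs[i]);
}
```

With six devices in the order A, B, X, C, Y, Z, none of them setting the
flag, and X failing, A and B end up with early_resumed set while C does
not, which matches the behaviour described above.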

> thanks,
> 
> greg k-h


* Re: [PATCH] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-06 15:07   ` psodagud
@ 2021-08-06 15:16     ` Greg KH
  2021-08-07  6:00     ` Greg KH
  1 sibling, 0 replies; 13+ messages in thread
From: Greg KH @ 2021-08-06 15:16 UTC (permalink / raw)
  To: psodagud; +Cc: rjw, len.brown, pavel, linux-pm, linux-kernel

On Fri, Aug 06, 2021 at 08:07:08AM -0700, psodagud@codeaurora.org wrote:
> On 2021-08-03 10:16, Greg KH wrote:
> > On Mon, Jul 26, 2021 at 08:24:34AM -0700, Prasad Sodagudi wrote:
> > > There are variables(power.may_skip_resume and dev->power.must_resume)
> > > and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices
> > > after
> > > a system wide suspend transition.
> > > 
> > > Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
> > > its "noirq" and "early" resume callbacks to be skipped if the device
> > > can be left in suspend after a system-wide transition into the working
> > > state. PM core determines that the driver's "noirq" and "early" resume
> > > callbacks should be skipped or not with dev_pm_skip_resume()
> > > function by
> > > checking power.may_skip_resume variable.
> > > 
> > > power.must_resume variable is getting set to false in
> > > __device_suspend()
> > > function without checking device's DPM_FLAG_MAY_SKIP_RESUME and
> > > dev->power.usage_count variables. This is leading to failure to call
> > > resume handler for some of the devices suspended in early suspend
> > > phase.
> > > So check device's DPM_FLAG_MAY_SKIP_RESUME flag before
> > > setting power.must_resume variable.
> > > 
> > > Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
> > > ---
> > >  drivers/base/power/main.c | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > index d568772..8eebc4d 100644
> > > --- a/drivers/base/power/main.c
> > > +++ b/drivers/base/power/main.c
> > > @@ -1642,7 +1642,11 @@ static int __device_suspend(struct device
> > > *dev, pm_message_t state, bool async)
> > >  	}
> > > 
> > >  	dev->power.may_skip_resume = true;
> > > -	dev->power.must_resume = false;
> > > +	if ((atomic_read(&dev->power.usage_count) <= 1) &&
> > > +			(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
> > 
> > What is preventing that atomic value from changing _right_ after you
> > just read this?
> > 
> > and very odd indentation, checkpatch didn't complain about this?
> Sure. I will fix indentation problem once Rafael reviewed this patch.

Neither of us can take this as-is, so why wait?



* Re: [PATCH] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-06 15:07   ` psodagud
  2021-08-06 15:16     ` Greg KH
@ 2021-08-07  6:00     ` Greg KH
  2021-08-08 16:10       ` [PATCH v2] " Prasad Sodagudi
  1 sibling, 1 reply; 13+ messages in thread
From: Greg KH @ 2021-08-07  6:00 UTC (permalink / raw)
  To: psodagud; +Cc: rjw, len.brown, pavel, linux-pm, linux-kernel

On Fri, Aug 06, 2021 at 08:07:08AM -0700, psodagud@codeaurora.org wrote:
> On 2021-08-03 10:16, Greg KH wrote:
> > On Mon, Jul 26, 2021 at 08:24:34AM -0700, Prasad Sodagudi wrote:
> > > There are variables(power.may_skip_resume and dev->power.must_resume)
> > > and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices
> > > after
> > > a system wide suspend transition.
> > > 
> > > Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
> > > its "noirq" and "early" resume callbacks to be skipped if the device
> > > can be left in suspend after a system-wide transition into the working
> > > state. PM core determines that the driver's "noirq" and "early" resume
> > > callbacks should be skipped or not with dev_pm_skip_resume()
> > > function by
> > > checking power.may_skip_resume variable.
> > > 
> > > power.must_resume variable is getting set to false in
> > > __device_suspend()
> > > function without checking device's DPM_FLAG_MAY_SKIP_RESUME and
> > > dev->power.usage_count variables. This is leading to failure to call
> > > resume handler for some of the devices suspended in early suspend
> > > phase.
> > > So check device's DPM_FLAG_MAY_SKIP_RESUME flag before
> > > setting power.must_resume variable.
> > > 
> > > Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
> > > ---
> > >  drivers/base/power/main.c | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > > index d568772..8eebc4d 100644
> > > --- a/drivers/base/power/main.c
> > > +++ b/drivers/base/power/main.c
> > > @@ -1642,7 +1642,11 @@ static int __device_suspend(struct device
> > > *dev, pm_message_t state, bool async)
> > >  	}
> > > 
> > >  	dev->power.may_skip_resume = true;
> > > -	dev->power.must_resume = false;
> > > +	if ((atomic_read(&dev->power.usage_count) <= 1) &&
> > > +			(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
> > 
> > What is preventing that atomic value from changing _right_ after you
> > just read this?
> > 
> > and very odd indentation, checkpatch didn't complain about this?
> Sure. I will fix indentation problem once Rafael reviewed this patch.
> 
> > What commit does this fix?  Does it need to be backported to older
> > kernels?
> 
> No. LTS - 5.4 do not have this problem.
> 
> > Wait, how is your "noirq" device even getting called here?  Shouldn't
> > __device_suspend_noirq() be called instead?  Why isn't that the path for
> > your device?
> 
> Hi Gregh and Rafael,
> 
> This is regarding suspend/resume(s2idle) scenario of devices and differences
> between the LTS kernel 5.4 and 5.10 with respect to devices suspend and
> resume. Observing that devices suspended in suspend_late stage are not
> getting resumed in resume_early stage.
> 1)	LTS kernel 5.4 kernel do not have this problem but 5.10 kernel shows this
> problem.
> 2)	Commit - 6e176bf8d46194353163c2cb660808bc633b45d9 (PM: sleep: core: Do
> not skip callbacks in the resume phase) is skipping the driver early_resume
> callbacks.
> @@ -804,15 +793,25 @@ static int device_resume_early(struct device *dev,
> pm_message_t state, bool asyn
>         } else if (dev->bus && dev->bus->pm) {
>                 info = "early bus ";
>                 callback = pm_late_early_op(dev->bus->pm, state);
> -       } else if (dev->driver && dev->driver->pm) {
> +       }
> +       if (callback)
> +               goto Run;
> +
> +       if (dev_pm_may_skip_resume(dev))
> +               goto Skip;
> In device_resume_early function dev->power.must_resume is used to skip the
> resume call back. It looks this function is expecting that,
> __device_suspend_noirq() would set dev->power.must_resume = true for the
> devices which does not have DPM_FLAG_MAY_SKIP_RESUME flag set.
> 
> static int __device_suspend_noirq(struct device *dev, pm_message_t state,
> bool async)
> {
> …
> …
>         /*
>          * Skipping the resume of devices that were in use right before the
>          * system suspend (as indicated by their PM-runtime usage counters)
>          * would be suboptimal.  Also resume them if doing that is not
> allowed
>          * to be skipped.
>          */
>         if (atomic_read(&dev->power.usage_count) > 1 ||
>             !(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME) &&
>               dev->power.may_skip_resume))
>                 dev->power.must_resume = true;
> 
> 
> 3)	Problematic scenario is as follows -  During the device suspend/resume
> scenario all the devices in  the suspend_late stage are successful and some
> device can fail to suspend in suspend_noirq(device_suspend_noirq->
> __device_suspend_noirq) phase.
> As a device failed in dpm_noirq_suspend_devices phase, dpm_resume_noirq is
> getting called to resume devices in dpm_late_early_list in the noirq phase.
> 4)	During the Devices_early_resume stage
> dpm_resume_early()-->device_resume_early() functions skipping the devices
> early resume callbacks.
>  799         if (dev_pm_skip_resume(dev))
> 800	              goto Skip;
> 
> 5)	Devices suspended in suspend_late stage are not getting resumed in
> Devices_early_resume stage because of Commit -
> 6e176bf8d46194353163c2cb660808bc633b45d9 (PM: sleep: core: Do not skip
> callbacks in the resume phase) is skipping the driver early_resume callbacks
> when dev->power.must_resume is false.
> 
> 6)	Below portion of the code in __device_suspend_noirq is not getting
> executed for some drivers successfully suspended in suspend_late stage and
> there is no chance to set must_resume to true.  So these devices are always
> having dev->power.must_resume=false.
> For example -
> i)	Devices A, B, C have suspend_late and resume_early handlers.
> ii)	Devices X, Y, Z  have suspend_noirq and resume_noirq handlers.
> Devices are getting suspended in this order – A, B, X , C , Y and Z and
> device X return failure for suspend_noirq callback. In this scenario, device
> C would never execute below portion of the code to set
> dev->power.must_resume = true and device – C would not get resumed in
> resume_early  stage.
> 
> 1192 static int __device_suspend_noirq(struct device *dev, pm_message_t
> state, bool async)
> 1193 {
> ….
> ….
> 1245         /*
> 1246          * Skipping the resume of devices that were in use right before
> the
> 1247          * system suspend (as indicated by their PM-runtime usage
> counters)
> 1248          * would be suboptimal.  Also resume them if doing that is not
> allowed
> 1249          * to be skipped.
> 1250          */
> 1251         if (atomic_read(&dev->power.usage_count) > 1 ||
> 1252             !(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)
> &&
> 1253               dev->power.may_skip_resume))
> 1254                 dev->power.must_resume = true;
> 1255
> 1256         if (dev->power.must_resume)
> 1257                 dpm_superior_set_must_resume(dev);
> 1258

Ok, that explains it a bit better, thank you.  Can you please try to
expand on your changelog text when you resubmit this to include this
information and properly identify what commit caused this problem to
happen by adding a Fixes: tag?

thanks,

greg k-h


* [PATCH v2] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-07  6:00     ` Greg KH
@ 2021-08-08 16:10       ` Prasad Sodagudi
  2021-08-08 17:36         ` Greg KH
  0 siblings, 1 reply; 13+ messages in thread
From: Prasad Sodagudi @ 2021-08-08 16:10 UTC (permalink / raw)
  To: gregkh, rjw; +Cc: len.brown, linux-kernel, linux-pm, pavel, psodagud

The variables power.may_skip_resume and power.must_resume, together with
the DPM_FLAG_MAY_SKIP_RESUME driver flag, control whether devices are
resumed after a system-wide suspend transition.

Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
its "noirq" and "early" resume callbacks to be skipped if the device
can be left in suspend after a system-wide transition into the working
state. The PM core decides whether the driver's "noirq" and "early"
resume callbacks should be skipped by calling dev_pm_skip_resume(),
which checks the power.must_resume variable.

__device_suspend() sets power.must_resume to false without checking the
device's DPM_FLAG_MAY_SKIP_RESUME flag or dev->power.usage_count. In the
problematic scenario, all devices suspend successfully in the
suspend_late stage, but some device fails to suspend in the
suspend_noirq phase. Devices that suspended successfully in the
suspend_late stage but had not yet reached __device_suspend_noirq()
never get a chance to have dev->power.must_resume set to true, so they
are not resumed in the early_resume phase.

Add a check for the device's DPM_FLAG_MAY_SKIP_RESUME flag and usage
count before setting power.must_resume in __device_suspend().

Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
---
 drivers/base/power/main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index d568772..9ee6987 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
 	}
 
 	dev->power.may_skip_resume = true;
-	dev->power.must_resume = false;
+	if ((atomic_read(&dev->power.usage_count) <= 1) &&
+	     (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
+		dev->power.must_resume = false;
+	else
+		dev->power.must_resume = true;
 
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
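
As a rough userspace illustration of what the hunk above changes,
the sketch below initialises must_resume the way the v2 patch does and
then replays an aborted noirq phase. It is not kernel code and makes
simplifying assumptions: struct sketch_dev and the sketch_*() helpers
are hypothetical names, power.may_skip_resume is treated as always
true, and all devices are walked in one fixed order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct device fields involved. */
struct sketch_dev {
	bool may_skip_flag;	/* models DPM_FLAG_MAY_SKIP_RESUME */
	int usage_count;	/* models power.usage_count */
	bool fails_noirq;	/* suspend_noirq callback returns an error */
	bool must_resume;	/* models power.must_resume */
	bool early_resumed;	/* set when the early resume callback runs */
};

/* v2 hunk: must_resume starts false only if skipping is permitted. */
static void sketch_suspend_v2(struct sketch_dev *d)
{
	if (d->usage_count <= 1 && d->may_skip_flag)
		d->must_resume = false;
	else
		d->must_resume = true;
	d->early_resumed = false;
}

/* Same rule as the unchanged check in __device_suspend_noirq(). */
static bool sketch_suspend_noirq(struct sketch_dev *d)
{
	if (d->fails_noirq)
		return false;
	if (d->usage_count > 1 || !d->may_skip_flag)
		d->must_resume = true;
	return true;
}

/* Runs one aborted cycle; returns how many devices were early-resumed. */
static size_t sketch_cycle_v2(struct sketch_dev *devs, size_t n)
{
	size_t i, resumed = 0;

	assert(devs != NULL);
	for (i = 0; i < n; i++)
		sketch_suspend_v2(&devs[i]);
	for (i = 0; i < n; i++)
		if (!sketch_suspend_noirq(&devs[i]))
			break;	/* noirq phase aborted at this device */
	for (i = 0; i < n; i++) {
		if (!devs[i].must_resume)
			continue;	/* early resume callback skipped */
		devs[i].early_resumed = true;
		resumed++;
	}
	return resumed;
}
```

In this model a device without the flag is early-resumed even when the
noirq phase aborts before reaching it, while a device that sets the
flag and has a usage count of 1 can still be skipped.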



* Re: [PATCH v2] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-08 16:10       ` [PATCH v2] " Prasad Sodagudi
@ 2021-08-08 17:36         ` Greg KH
  0 siblings, 0 replies; 13+ messages in thread
From: Greg KH @ 2021-08-08 17:36 UTC (permalink / raw)
  To: Prasad Sodagudi; +Cc: rjw, len.brown, linux-kernel, linux-pm, pavel

On Sun, Aug 08, 2021 at 09:10:27AM -0700, Prasad Sodagudi wrote:
> There are variables(power.may_skip_resume and dev->power.must_resume)
> and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices after
> a system wide suspend transition.
> 
> Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
> its "noirq" and "early" resume callbacks to be skipped if the device
> can be left in suspend after a system-wide transition into the working
> state. PM core determines that the driver's "noirq" and "early" resume
> callbacks should be skipped or not with dev_pm_skip_resume() function by
> checking power.may_skip_resume variable.
> 
> power.must_resume variable is getting set to false in __device_suspend()
> function without checking device's DPM_FLAG_MAY_SKIP_RESUME and
> dev->power.usage_count variables. In problematic scenario, where
> all the devices in the suspend_late stage are successful and some
> device can fail to suspend in suspend_noirq phase. So some devices
> successfully suspended in suspend_late stage are not getting chance
> to execute __device_suspend_noirq() to set dev->power.must_resume
> variable to true and not getting resumed in early_resume phase.
> 
> Add a check for device's DPM_FLAG_MAY_SKIP_RESUME flag before
> setting power.must_resume variable in __device_suspend function.
> 
> Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
> ---
>  drivers/base/power/main.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index d568772..9ee6987 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
>  	}
>  
>  	dev->power.may_skip_resume = true;
> -	dev->power.must_resume = false;
> +	if ((atomic_read(&dev->power.usage_count) <= 1) &&
> +	     (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
> +		dev->power.must_resume = false;
> +	else
> +		dev->power.must_resume = true;
>  
>  	dpm_watchdog_set(&wd, dev);
>  	device_lock(dev);
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- This looks like a new version of a previously submitted patch, but you
  did not list below the --- line any changes from the previous version.
  Please read the section entitled "The canonical patch format" in the
  kernel file, Documentation/SubmittingPatches for what needs to be done
  here to properly describe this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot


* Re: [PATCH v2] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-10 13:42 Prasad Sodagudi
  2021-08-10 13:42 ` Prasad Sodagudi
@ 2021-08-10 15:06 ` Greg KH
  1 sibling, 0 replies; 13+ messages in thread
From: Greg KH @ 2021-08-10 15:06 UTC (permalink / raw)
  To: Prasad Sodagudi; +Cc: rjw, len.brown, linux-kernel, linux-pm, pavel

On Tue, Aug 10, 2021 at 06:42:11AM -0700, Prasad Sodagudi wrote:
> This is regarding suspend/resume(s2idle) scenario of devices and difference
> between the LTS kernels 5.4 and 5.10 with respect to devices suspend and
> resume. Observing that devices suspended in suspend_late stage are not
> getting resumed in resume_early stage.
> 1) LTS kernel 5.4 kernel do not have this problem but 5.10 kernel
> shows this problem.
> 2) 'commit 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")'
> is skipping the driver early_resume callbacks.
> 
> In device_resume_early function dev->power.must_resume is used to skip the
> resume call back. It looks this function is expecting that,
> __device_suspend_noirq() would set dev->power.must_resume = true for the
> devices which does not have DPM_FLAG_MAY_SKIP_RESUME flag set.
> 
> 3) Problematic scenario is as follows -  During the device suspend/resume
> scenario all the devices in  the suspend_late stage are successful and some
> device can fail to suspend in suspend_noirq(device_suspend_noirq->
> __device_suspend_noirq) phase.
> As a device failed in dpm_noirq_suspend_devices phase, dpm_resume_noirq is
> getting called to resume devices in dpm_late_early_list in the noirq phase.
> 
> 4) During the Devices_early_resume stage
> dpm_resume_early()-->device_resume_early() functions skipping the devices
> early resume callbacks.
> 799         if (dev_pm_skip_resume(dev))
> 800                  goto Skip;
> 
> 5) Devices suspended in suspend_late stage are not getting resumed in
> Devices_early_resume stage because of
> 'commit 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")'
> is skipping the driver early_resume callbacks when dev->power.must_resume is false.
> 
> 
> Changelog:
> v1 -> v2:
>  - Fixed indentation comments.
>  - Commit text updated to include scenario.
> 
> Prasad Sodagudi (1):
>   PM: sleep: core: Avoid setting power.must_resume to false
> 
>  drivers/base/power/main.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 

I do not see a patch here, what happened?

:(


* Re: [PATCH v2] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-10 13:42 ` Prasad Sodagudi
@ 2021-08-10 15:05   ` Greg KH
  0 siblings, 0 replies; 13+ messages in thread
From: Greg KH @ 2021-08-10 15:05 UTC (permalink / raw)
  To: Prasad Sodagudi; +Cc: rjw, len.brown, linux-kernel, linux-pm, pavel

On Tue, Aug 10, 2021 at 06:42:12AM -0700, Prasad Sodagudi wrote:
> There are variables (power.may_skip_resume and power.must_resume) and the
> DPM_FLAG_MAY_SKIP_RESUME driver flag to control the resume of devices
> after a system-wide suspend transition.
> 
> Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
> its "noirq" and "early" resume callbacks to be skipped if the device
> can be left in suspend after a system-wide transition into the working
> state. The PM core determines whether the driver's "noirq" and "early"
> resume callbacks should be skipped with the dev_pm_skip_resume() function,
> by checking the power.may_skip_resume variable.
> 
> The power.must_resume variable is set to false in __device_suspend()
> without checking the device's DPM_FLAG_MAY_SKIP_RESUME flag or its
> dev->power.usage_count. In the problematic scenario, all the devices
> suspend successfully in the suspend_late stage and then some device
> fails to suspend in the suspend_noirq phase. As a result, some devices
> that were successfully suspended in the suspend_late stage never get a
> chance to execute __device_suspend_noirq(), which would set
> dev->power.must_resume to true, and so they are not resumed in the
> early_resume phase.
> 
> Add a check for the device's DPM_FLAG_MAY_SKIP_RESUME flag before
> setting power.must_resume in __device_suspend().
> 
> Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
> ---
>  drivers/base/power/main.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index d568772..9ee6987 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
>  	}
>  
>  	dev->power.may_skip_resume = true;
> -	dev->power.must_resume = false;
> +	if ((atomic_read(&dev->power.usage_count) <= 1) &&
> +	     (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
> +		dev->power.must_resume = false;
> +	else
> +		dev->power.must_resume = true;
>  
>  	dpm_watchdog_set(&wd, dev);
>  	device_lock(dev);
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- This looks like a new version of a previously submitted patch, but you
  did not list below the --- line any changes from the previous version.
  Please read the section entitled "The canonical patch format" in the
  kernel file, Documentation/SubmittingPatches for what needs to be done
  here to properly describe this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot


* [PATCH v2] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-10 13:54 Prasad Sodagudi
@ 2021-08-10 13:54 ` Prasad Sodagudi
  0 siblings, 0 replies; 13+ messages in thread
From: Prasad Sodagudi @ 2021-08-10 13:54 UTC (permalink / raw)
  To: gregkh, rjw; +Cc: len.brown, linux-kernel, linux-pm, pavel, psodagud

There are variables (power.may_skip_resume and power.must_resume) and the
DPM_FLAG_MAY_SKIP_RESUME driver flag to control the resume of devices
after a system-wide suspend transition.

Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
its "noirq" and "early" resume callbacks to be skipped if the device
can be left in suspend after a system-wide transition into the working
state. The PM core determines whether the driver's "noirq" and "early"
resume callbacks should be skipped with the dev_pm_skip_resume() function,
by checking the power.may_skip_resume variable.

The power.must_resume variable is set to false in __device_suspend()
without checking the device's DPM_FLAG_MAY_SKIP_RESUME flag or its
dev->power.usage_count. In the problematic scenario, all the devices
suspend successfully in the suspend_late stage and then some device
fails to suspend in the suspend_noirq phase. As a result, some devices
that were successfully suspended in the suspend_late stage never get a
chance to execute __device_suspend_noirq(), which would set
dev->power.must_resume to true, and so they are not resumed in the
early_resume phase.

Add a check for the device's DPM_FLAG_MAY_SKIP_RESUME flag before
setting power.must_resume in __device_suspend().

Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
---
 V1 -> V2: Fixed indentation and commit text to include scenario

 drivers/base/power/main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index d568772..9ee6987 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
 	}
 
 	dev->power.may_skip_resume = true;
-	dev->power.must_resume = false;
+	if ((atomic_read(&dev->power.usage_count) <= 1) &&
+	     (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
+		dev->power.must_resume = false;
+	else
+		dev->power.must_resume = true;
 
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



* [PATCH v2] PM: sleep: core: Avoid setting power.must_resume to false
@ 2021-08-10 13:54 Prasad Sodagudi
  2021-08-10 13:54 ` Prasad Sodagudi
  0 siblings, 1 reply; 13+ messages in thread
From: Prasad Sodagudi @ 2021-08-10 13:54 UTC (permalink / raw)
  To: gregkh, rjw; +Cc: len.brown, linux-kernel, linux-pm, pavel, psodagud

This is regarding the suspend/resume (s2idle) handling of devices and a
difference between the 5.4 and 5.10 LTS kernels. We observe that devices
suspended in the suspend_late stage are not getting resumed in the
resume_early stage.
1) The 5.4 LTS kernel does not have this problem, but the 5.10 kernel
shows it.
2) 'commit 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")'
is skipping the driver early_resume callbacks.

In device_resume_early(), dev->power.must_resume is used to decide whether
to skip the resume callback. It looks like this function expects
__device_suspend_noirq() to set dev->power.must_resume = true for
devices which do not have the DPM_FLAG_MAY_SKIP_RESUME flag set.

3) The problematic scenario is as follows: during a suspend/resume cycle,
all the devices succeed in the suspend_late stage and then some device
fails to suspend in the suspend_noirq phase (device_suspend_noirq() ->
__device_suspend_noirq()).
As a device failed in dpm_noirq_suspend_devices(), dpm_resume_noirq() is
called to resume the devices in dpm_late_early_list in the noirq phase.

4) During the early resume stage, dpm_resume_early() ->
device_resume_early() skips the devices' early resume callbacks:
	if (dev_pm_skip_resume(dev))
		goto Skip;

5) Devices suspended in the suspend_late stage are not getting resumed in
the early resume stage because
'commit 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")'
skips the driver early_resume callbacks when dev->power.must_resume is false.


Changelog:
v1 -> v2:
 - Fixed indentation comments.
 - Commit text updated to include scenario.

Prasad Sodagudi (1):
  PM: sleep: core: Avoid setting power.must_resume to false

 drivers/base/power/main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



* [PATCH v2] PM: sleep: core: Avoid setting power.must_resume to false
  2021-08-10 13:42 Prasad Sodagudi
@ 2021-08-10 13:42 ` Prasad Sodagudi
  2021-08-10 15:05   ` Greg KH
  2021-08-10 15:06 ` Greg KH
  1 sibling, 1 reply; 13+ messages in thread
From: Prasad Sodagudi @ 2021-08-10 13:42 UTC (permalink / raw)
  To: gregkh, rjw; +Cc: len.brown, linux-kernel, linux-pm, pavel, psodagud

There are variables (power.may_skip_resume and power.must_resume) and the
DPM_FLAG_MAY_SKIP_RESUME driver flag to control the resume of devices
after a system-wide suspend transition.

Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
its "noirq" and "early" resume callbacks to be skipped if the device
can be left in suspend after a system-wide transition into the working
state. The PM core determines whether the driver's "noirq" and "early"
resume callbacks should be skipped with the dev_pm_skip_resume() function,
by checking the power.may_skip_resume variable.

The power.must_resume variable is set to false in __device_suspend()
without checking the device's DPM_FLAG_MAY_SKIP_RESUME flag or its
dev->power.usage_count. In the problematic scenario, all the devices
suspend successfully in the suspend_late stage and then some device
fails to suspend in the suspend_noirq phase. As a result, some devices
that were successfully suspended in the suspend_late stage never get a
chance to execute __device_suspend_noirq(), which would set
dev->power.must_resume to true, and so they are not resumed in the
early_resume phase.

Add a check for the device's DPM_FLAG_MAY_SKIP_RESUME flag before
setting power.must_resume in __device_suspend().

Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
---
 drivers/base/power/main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index d568772..9ee6987 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
 	}
 
 	dev->power.may_skip_resume = true;
-	dev->power.must_resume = false;
+	if ((atomic_read(&dev->power.usage_count) <= 1) &&
+	     (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
+		dev->power.must_resume = false;
+	else
+		dev->power.must_resume = true;
 
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



* [PATCH v2] PM: sleep: core: Avoid setting power.must_resume to false
@ 2021-08-10 13:42 Prasad Sodagudi
  2021-08-10 13:42 ` Prasad Sodagudi
  2021-08-10 15:06 ` Greg KH
  0 siblings, 2 replies; 13+ messages in thread
From: Prasad Sodagudi @ 2021-08-10 13:42 UTC (permalink / raw)
  To: gregkh, rjw; +Cc: len.brown, linux-kernel, linux-pm, pavel, psodagud

This is regarding the suspend/resume (s2idle) handling of devices and a
difference between the 5.4 and 5.10 LTS kernels. We observe that devices
suspended in the suspend_late stage are not getting resumed in the
resume_early stage.
1) The 5.4 LTS kernel does not have this problem, but the 5.10 kernel
shows it.
2) 'commit 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")'
is skipping the driver early_resume callbacks.

In device_resume_early(), dev->power.must_resume is used to decide whether
to skip the resume callback. It looks like this function expects
__device_suspend_noirq() to set dev->power.must_resume = true for
devices which do not have the DPM_FLAG_MAY_SKIP_RESUME flag set.

3) The problematic scenario is as follows: during a suspend/resume cycle,
all the devices succeed in the suspend_late stage and then some device
fails to suspend in the suspend_noirq phase (device_suspend_noirq() ->
__device_suspend_noirq()).
As a device failed in dpm_noirq_suspend_devices(), dpm_resume_noirq() is
called to resume the devices in dpm_late_early_list in the noirq phase.

4) During the early resume stage, dpm_resume_early() ->
device_resume_early() skips the devices' early resume callbacks:
	if (dev_pm_skip_resume(dev))
		goto Skip;

5) Devices suspended in the suspend_late stage are not getting resumed in
the early resume stage because
'commit 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")'
skips the driver early_resume callbacks when dev->power.must_resume is false.


Changelog:
v1 -> v2:
 - Fixed indentation comments.
 - Commit text updated to include scenario.

Prasad Sodagudi (1):
  PM: sleep: core: Avoid setting power.must_resume to false

 drivers/base/power/main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



end of thread, other threads:[~2021-08-10 15:06 UTC | newest]

Thread overview: 13+ messages
2021-07-26 15:24 [PATCH] PM: sleep: core: Avoid setting power.must_resume to false Prasad Sodagudi
2021-08-03 17:16 ` Greg KH
2021-08-06 15:07   ` psodagud
2021-08-06 15:16     ` Greg KH
2021-08-07  6:00     ` Greg KH
2021-08-08 16:10       ` [PATCH v2] " Prasad Sodagudi
2021-08-08 17:36         ` Greg KH
2021-08-10 13:42 Prasad Sodagudi
2021-08-10 13:42 ` Prasad Sodagudi
2021-08-10 15:05   ` Greg KH
2021-08-10 15:06 ` Greg KH
2021-08-10 13:54 Prasad Sodagudi
2021-08-10 13:54 ` Prasad Sodagudi
