LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
@ 2020-02-22 16:24 Srinivas Pandruvada
  2020-02-22 16:53 ` Chris Wilson
  2020-02-22 17:51 ` Borislav Petkov
  0 siblings, 2 replies; 9+ messages in thread
From: Srinivas Pandruvada @ 2020-02-22 16:24 UTC (permalink / raw)
  To: tony.luck, bp, tglx, mingo, hpa
  Cc: x86, linux-edac, linux-kernel, Srinivas Pandruvada, Chris Wilson

During cpu-hotplug test with CONFIG_PREEMPTION and CONFIG_DEBUG_PREEMPT
enabled, Chris reported error:

BUG: using smp_processor_id() in preemptible [00000000] code: kworker/1:0/17
caller is throttle_active_work+0x12/0x280

Here throttle_active_work() is a work queue callback scheduled with
schedule_delayed_work_on(). This will not cause this error for the use
of smp_processor_id() under normal conditions as there is a check for
"current->nr_cpus_allowed == 1".
But when the target CPU is offline the workqueue becomes unbound.
Then the work queue callback can be scheduled on another CPU and the
error is printed for the use of smp_processor_id() in preemptible context.

When the workqueue is not getting called on the target CPU, simply return.
This is done by adding a cpu field in the _thermal_state struct and match
the current CPU id.

Once workqueue is scheduled, prevent CPU offline. In this way, the log
bits are checked and cleared on the correct CPU. Also use get_cpu() to
get current CPU id and prevent preemption before we finish processing.

Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/mce/therm_throt.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/therm_throt.c b/arch/x86/kernel/cpu/mce/therm_throt.c
index 58b4ee3cda77..4dab8a4558f9 100644
--- a/arch/x86/kernel/cpu/mce/therm_throt.c
+++ b/arch/x86/kernel/cpu/mce/therm_throt.c
@@ -61,6 +61,7 @@
  * @new_event:			Stores the last high/low status of the
  *				THERM_STATUS_PROCHOT or
  *				THERM_STATUS_POWER_LIMIT.
+ * @cpu:			CPU id for this instance.
  * @level:			Stores whether this _thermal_state instance is
  *				for a CORE level or for PACKAGE level.
  * @sample_index:		Index for storing the next sample in the buffer
@@ -86,6 +87,7 @@ struct _thermal_state {
 	unsigned long		total_time_ms;
 	bool			rate_control_active;
 	bool			new_event;
+	int			cpu;
 	u8			level;
 	u8			sample_index;
 	u8			sample_count;
@@ -239,11 +241,19 @@ static void __maybe_unused throttle_active_work(struct work_struct *work)
 {
 	struct _thermal_state *state = container_of(to_delayed_work(work),
 						struct _thermal_state, therm_work);
-	unsigned int i, avg, this_cpu = smp_processor_id();
+	unsigned int i, avg, this_cpu;
 	u64 now = get_jiffies_64();
 	bool hot;
 	u8 temp;
 
+	get_online_cpus();
+	this_cpu = get_cpu();
+
+	if (state->cpu != this_cpu) {
+		state->rate_control_active = false;
+		goto end;
+	}
+
 	get_therm_status(state->level, &hot, &temp);
 	/* temperature value is offset from the max so lesser means hotter */
 	if (!hot && temp > state->baseline_temp) {
@@ -254,7 +264,7 @@ static void __maybe_unused throttle_active_work(struct work_struct *work)
 				state->count);
 
 		state->rate_control_active = false;
-		return;
+		goto end;
 	}
 
 	if (time_before64(now, state->next_check) &&
@@ -296,6 +306,10 @@ static void __maybe_unused throttle_active_work(struct work_struct *work)
 re_arm:
 	clear_therm_status_log(state->level);
 	schedule_delayed_work_on(this_cpu, &state->therm_work, THERM_THROT_POLL_INTERVAL);
+
+end:
+	put_cpu();
+	put_online_cpus();
 }
 
 /***
@@ -359,6 +373,7 @@ static void therm_throt_process(bool new_event, int event, int level)
 
 		state->baseline_temp = temp;
 		state->last_interrupt_time = now;
+		state->cpu = this_cpu;
 		schedule_delayed_work_on(this_cpu, &state->therm_work, THERM_THROT_POLL_INTERVAL);
 	} else if (old_event && state->last_interrupt_time) {
 		unsigned long throttle_time;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
  2020-02-22 16:24 [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU Srinivas Pandruvada
@ 2020-02-22 16:53 ` Chris Wilson
  2020-02-22 17:51 ` Borislav Petkov
  1 sibling, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2020-02-22 16:53 UTC (permalink / raw)
  To: Srinivas Pandruvada, bp, hpa, mingo, tglx, tony.luck
  Cc: x86, linux-edac, linux-kernel, Srinivas Pandruvada

Quoting Srinivas Pandruvada (2020-02-22 16:24:32)
> During cpu-hotplug test with CONFIG_PREEMPTION and CONFIG_DEBUG_PREEMPT
> enabled, Chris reported error:
> 
> BUG: using smp_processor_id() in preemptible [00000000] code: kworker/1:0/17
> caller is throttle_active_work+0x12/0x280
> 
> Here throttle_active_work() is a work queue callback scheduled with
> schedule_delayed_work_on(). This will not cause this error for the use
> of smp_processor_id() under normal conditions as there is a check for
> "current->nr_cpus_allowed == 1".
> But when the target CPU is offline the workqueue becomes unbound.
> Then the work queue callback can be scheduled on another CPU and the
> error is printed for the use of smp_processor_id() in preemptible context.
> 
> When the workqueue is not getting called on the target CPU, simply return.
> This is done by adding a cpu field in the _thermal_state struct and match
> the current CPU id.
> 
> Once workqueue is scheduled, prevent CPU offline. In this way, the log
> bits are checked and cleared on the correct CPU. Also use get_cpu() to
> get current CPU id and prevent preemption before we finish processing.
> 
> Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
> Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>

I've pushed the patch to our CI, but it's not a frequent occurrence, so
it may be some time before I can state a t-b with any confidence.
-Chris

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
  2020-02-22 16:24 [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU Srinivas Pandruvada
  2020-02-22 16:53 ` Chris Wilson
@ 2020-02-22 17:51 ` Borislav Petkov
  2020-02-23  0:25   ` Srinivas Pandruvada
  1 sibling, 1 reply; 9+ messages in thread
From: Borislav Petkov @ 2020-02-22 17:51 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: tony.luck, tglx, mingo, hpa, x86, linux-edac, linux-kernel, Chris Wilson

On Sat, Feb 22, 2020 at 08:24:32AM -0800, Srinivas Pandruvada wrote:
> During cpu-hotplug test with CONFIG_PREEMPTION and CONFIG_DEBUG_PREEMPT
> enabled, Chris reported error:
> 
> BUG: using smp_processor_id() in preemptible [00000000] code: kworker/1:0/17
> caller is throttle_active_work+0x12/0x280
> 
> Here throttle_active_work() is a work queue callback scheduled with
> schedule_delayed_work_on(). This will not cause this error for the use
> of smp_processor_id() under normal conditions as there is a check for
> "current->nr_cpus_allowed == 1".
> But when the target CPU is offline the workqueue becomes unbound.
> Then the work queue callback can be scheduled on another CPU and the
> error is printed for the use of smp_processor_id() in preemptible context.

So what's wrong with simply doing:

	if (cpu_is_offline(this_cpu))
		return;

?

You don't need to run the callback on an offlined CPU anyway...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
  2020-02-22 17:51 ` Borislav Petkov
@ 2020-02-23  0:25   ` Srinivas Pandruvada
  2020-02-24 12:55     ` Borislav Petkov
  0 siblings, 1 reply; 9+ messages in thread
From: Srinivas Pandruvada @ 2020-02-23  0:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tony.luck, tglx, mingo, hpa, x86, linux-edac, linux-kernel, Chris Wilson

On Sat, 2020-02-22 at 18:51 +0100, Borislav Petkov wrote:
> On Sat, Feb 22, 2020 at 08:24:32AM -0800, Srinivas Pandruvada wrote:
> > During cpu-hotplug test with CONFIG_PREEMPTION and
> > CONFIG_DEBUG_PREEMPT
> > enabled, Chris reported error:
> > 
> > BUG: using smp_processor_id() in preemptible [00000000] code:
> > kworker/1:0/17
> > caller is throttle_active_work+0x12/0x280
> > 
> > Here throttle_active_work() is a work queue callback scheduled with
> > schedule_delayed_work_on(). This will not cause this error for the
> > use
> > of smp_processor_id() under normal conditions as there is a check
> > for
> > "current->nr_cpus_allowed == 1".
> > But when the target CPU is offline the workqueue becomes unbound.
> > Then the work queue callback can be scheduled on another CPU and
> > the
> > error is printed for the use of smp_processor_id() in preemptible
> > context.
> 
> So what's wrong with simply doing:
> 
> 	if (cpu_is_offline(this_cpu))
> 		return;
> 
> ?
> 
If the condition is false, will it prevent offline CPU before executing
next statement and reschedule on another CPU? Although It will not
cause any error or crash but in rare circumstance may print premature
warning/normal message based on the current CPU's state.

So I can submit something like this:

diff --git a/arch/x86/kernel/cpu/mce/therm_throt.c
b/arch/x86/kernel/cpu/mce/therm_throt.c
index b38010b541d6..7aa7c9d1df2a 100644
--- a/arch/x86/kernel/cpu/mce/therm_throt.c
+++ b/arch/x86/kernel/cpu/mce/therm_throt.c
@@ -239,11 +239,14 @@ static void throttle_active_work(struct
work_struct *work)
 {
        struct _thermal_state *state =
container_of(to_delayed_work(work),
                                                struct _thermal_state,
therm_work);
-       unsigned int i, avg, this_cpu = smp_processor_id();
+       unsigned int i, avg, this_cpu = state->cpu;
        u64 now = get_jiffies_64();
        bool hot;
        u8 temp;
 
+       if (cpu_is_offline(this_cpu))
+               return;
+
        get_therm_status(state->level, &hot, &temp);
        /* temperature value is offset from the max so lesser means
hotter */
        if (!hot && temp > state->baseline_temp) {

Thanks,
Srinivas

> You don't need to run the callback on an offlined CPU anyway...
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
  2020-02-23  0:25   ` Srinivas Pandruvada
@ 2020-02-24 12:55     ` Borislav Petkov
  2020-02-24 16:01       ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Borislav Petkov @ 2020-02-24 12:55 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: tony.luck, tglx, mingo, hpa, x86, linux-edac, linux-kernel, Chris Wilson

On Sat, Feb 22, 2020 at 04:25:59PM -0800, Srinivas Pandruvada wrote:
> If the condition is false, will it prevent offline CPU before executing
> next statement and reschedule on another CPU? Although It will not
> cause any error or crash but in rare circumstance may print premature
> warning/normal message based on the current CPU's state.

Why, offline CPU is offline CPU?

Btw, I'm asking whether you can do the simpler thing *instead* of your
patch. You basically don't run the workqueue callback on offlined CPUs:

	get_online_cpus();

	if (cpu_is_offline(smp_processor_id()))
		goto out;

	...


out:
	put_online_cpus();

Hmm?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
  2020-02-24 12:55     ` Borislav Petkov
@ 2020-02-24 16:01       ` Thomas Gleixner
  2020-02-24 19:25         ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2020-02-24 16:01 UTC (permalink / raw)
  To: Borislav Petkov, Srinivas Pandruvada
  Cc: tony.luck, mingo, hpa, x86, linux-edac, linux-kernel, Chris Wilson

Borislav Petkov <bp@alien8.de> writes:

> On Sat, Feb 22, 2020 at 04:25:59PM -0800, Srinivas Pandruvada wrote:
>> If the condition is false, will it prevent offline CPU before executing
>> next statement and reschedule on another CPU? Although It will not
>> cause any error or crash but in rare circumstance may print premature
>> warning/normal message based on the current CPU's state.
>
> Why, offline CPU is offline CPU?
>
> Btw, I'm asking whether you can do the simpler thing *instead* of your
> patch. You basically don't run the workqueue callback on offlined CPUs:
>
> 	get_online_cpus();
>
> 	if (cpu_is_offline(smp_processor_id()))
> 		goto out;
>
> 	...
>
>
> out:
> 	put_online_cpus();

Which is wrong as well. Trying to "fix" it in the work queue callback is
papering over the root cause.

Why is any work scheduled on an outgoing CPU after this CPU executed
thermal_throttle_offline()?

When thermal_throttle_offline() is invoked the cpu bound work queues are
still functional and thermal_throttle_offline() cancels outstanding
work.

So no, please fix the root cause not the symptom.

Thanks,

        tglx




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
  2020-02-24 16:01       ` Thomas Gleixner
@ 2020-02-24 19:25         ` Thomas Gleixner
  2020-02-24 21:05           ` Pandruvada, Srinivas
  2020-02-25  9:46           ` Srinivas Pandruvada
  0 siblings, 2 replies; 9+ messages in thread
From: Thomas Gleixner @ 2020-02-24 19:25 UTC (permalink / raw)
  To: Borislav Petkov, Srinivas Pandruvada
  Cc: tony.luck, mingo, hpa, x86, linux-edac, linux-kernel, Chris Wilson

Thomas Gleixner <tglx@linutronix.de> writes:
> Which is wrong as well. Trying to "fix" it in the work queue callback is
> papering over the root cause.
>
> Why is any work scheduled on an outgoing CPU after this CPU executed
> thermal_throttle_offline()?
>
> When thermal_throttle_offline() is invoked the cpu bound work queues are
> still functional and thermal_throttle_offline() cancels outstanding
> work.
>
> So no, please fix the root cause not the symptom.

And if you look at thermal_throttle_online() then you'll notice that it
is asymetric vs. thermal_throttle_offline().

Also you want to do cancel_delayed_work_sync() and not just
cancel_delayed_work() because only the latter guarantees that the work
is not enqueued anymore while the former does not take running or self
requeueing work into account.

Something like the untested patch below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/mce/therm_throt.c
+++ b/arch/x86/kernel/cpu/mce/therm_throt.c
@@ -487,8 +487,12 @@ static int thermal_throttle_offline(unsi
 	struct thermal_state *state = &per_cpu(thermal_state, cpu);
 	struct device *dev = get_cpu_device(cpu);
 
-	cancel_delayed_work(&state->package_throttle.therm_work);
-	cancel_delayed_work(&state->core_throttle.therm_work);
+	/* Mask the thermal vector before draining evtl. pending work */
+	l = apic_read(APIC_LVTTHMR);
+	apic_write(APIC_LVTTHMR, l | APIC_LVT_MASKED);
+
+	cancel_delayed_work_sync(&state->package_throttle.therm_work);
+	cancel_delayed_work_sync(&state->core_throttle.therm_work);
 
 	state->package_throttle.rate_control_active = false;
 	state->core_throttle.rate_control_active = false;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
  2020-02-24 19:25         ` Thomas Gleixner
@ 2020-02-24 21:05           ` Pandruvada, Srinivas
  2020-02-25  9:46           ` Srinivas Pandruvada
  1 sibling, 0 replies; 9+ messages in thread
From: Pandruvada, Srinivas @ 2020-02-24 21:05 UTC (permalink / raw)
  To: tglx, bp; +Cc: hpa, Luck, Tony, mingo, linux-kernel, chris, x86, linux-edac

On Mon, 2020-02-24 at 20:25 +0100, Thomas Gleixner wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
> > Which is wrong as well. Trying to "fix" it in the work queue
> > callback is
> > papering over the root cause.
> > 
> > Why is any work scheduled on an outgoing CPU after this CPU
> > executed
> > thermal_throttle_offline()?
> > 
> > When thermal_throttle_offline() is invoked the cpu bound work
> > queues are
> > still functional and thermal_throttle_offline() cancels outstanding
> > work.
> > 
> > So no, please fix the root cause not the symptom.
> 
> And if you look at thermal_throttle_online() then you'll notice that
> it
> is asymetric vs. thermal_throttle_offline().
> 
> Also you want to do cancel_delayed_work_sync() and not just
> cancel_delayed_work() because only the latter guarantees that the
> work
> is not enqueued anymore while the former does not take running or
> self
> requeueing work into account.
> 
> Something like the untested patch below.
Thanks Thomas. I am testing this patch.

-Srinivas

> 
> Thanks,
> 
>         tglx
> ---
> --- a/arch/x86/kernel/cpu/mce/therm_throt.c
> +++ b/arch/x86/kernel/cpu/mce/therm_throt.c
> @@ -487,8 +487,12 @@ static int thermal_throttle_offline(unsi
>  	struct thermal_state *state = &per_cpu(thermal_state, cpu);
>  	struct device *dev = get_cpu_device(cpu);
>  
> -	cancel_delayed_work(&state->package_throttle.therm_work);
> -	cancel_delayed_work(&state->core_throttle.therm_work);
> +	/* Mask the thermal vector before draining evtl. pending work
> */
> +	l = apic_read(APIC_LVTTHMR);
> +	apic_write(APIC_LVTTHMR, l | APIC_LVT_MASKED);
> +
> +	cancel_delayed_work_sync(&state->package_throttle.therm_work);
> +	cancel_delayed_work_sync(&state->core_throttle.therm_work);
>  
>  	state->package_throttle.rate_control_active = false;
>  	state->core_throttle.rate_control_active = false;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU
  2020-02-24 19:25         ` Thomas Gleixner
  2020-02-24 21:05           ` Pandruvada, Srinivas
@ 2020-02-25  9:46           ` Srinivas Pandruvada
  1 sibling, 0 replies; 9+ messages in thread
From: Srinivas Pandruvada @ 2020-02-25  9:46 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov
  Cc: tony.luck, mingo, hpa, x86, linux-edac, linux-kernel, Chris Wilson

On Mon, 2020-02-24 at 20:25 +0100, Thomas Gleixner wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
> > Which is wrong as well. Trying to "fix" it in the work queue
> > callback is
> > papering over the root cause.
> > 
> > Why is any work scheduled on an outgoing CPU after this CPU
> > executed
> > thermal_throttle_offline()?
> > 
> > When thermal_throttle_offline() is invoked the cpu bound work
> > queues are
> > still functional and thermal_throttle_offline() cancels outstanding
> > work.
> > 
> > So no, please fix the root cause not the symptom.
> 
> And if you look at thermal_throttle_online() then you'll notice that
> it
> is asymetric vs. thermal_throttle_offline().
> 
> Also you want to do cancel_delayed_work_sync() and not just
> cancel_delayed_work() because only the latter guarantees that the
> work
> is not enqueued anymore while the former does not take running or
> self
> requeueing work into account.
> 
> Something like the untested patch below.
I tested this patch.
After simulating 20 million thermal interrupts and online/offline test
for 12+ hours,  don't see the issue.

So this change fixed the issue.

I can send change on your behalf or you can add
Tested-by: Pandruvada, Srinivas <srinivas.pandruvada@linux.intel.com>

Thanks,
Srinivas

> 
> Thanks,
> 
>         tglx
> ---
> --- a/arch/x86/kernel/cpu/mce/therm_throt.c
> +++ b/arch/x86/kernel/cpu/mce/therm_throt.c
> @@ -487,8 +487,12 @@ static int thermal_throttle_offline(unsi
>  	struct thermal_state *state = &per_cpu(thermal_state, cpu);
>  	struct device *dev = get_cpu_device(cpu);
>  
> -	cancel_delayed_work(&state->package_throttle.therm_work);
> -	cancel_delayed_work(&state->core_throttle.therm_work);
> +	/* Mask the thermal vector before draining evtl. pending work
> */
> +	l = apic_read(APIC_LVTTHMR);
> +	apic_write(APIC_LVTTHMR, l | APIC_LVT_MASKED);
> +
> +	cancel_delayed_work_sync(&state->package_throttle.therm_work);
> +	cancel_delayed_work_sync(&state->core_throttle.therm_work);
>  
>  	state->package_throttle.rate_control_active = false;
>  	state->core_throttle.rate_control_active = false;


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-02-25  9:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-22 16:24 [PATCH] x86/mce/therm_throt: Handle case where throttle_active_work() is called on behalf of an offline CPU Srinivas Pandruvada
2020-02-22 16:53 ` Chris Wilson
2020-02-22 17:51 ` Borislav Petkov
2020-02-23  0:25   ` Srinivas Pandruvada
2020-02-24 12:55     ` Borislav Petkov
2020-02-24 16:01       ` Thomas Gleixner
2020-02-24 19:25         ` Thomas Gleixner
2020-02-24 21:05           ` Pandruvada, Srinivas
2020-02-25  9:46           ` Srinivas Pandruvada

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).